🚀 FreeLeaps DevOps Learning Path for Junior Engineers

Production-Ready Kubernetes & DevOps Documentation
Your gateway to understanding our actual infrastructure and becoming a DevOps expert


📋 Table of Contents

  1. 🎯 Quick Start Guide
  2. 🏗️ Your Production Infrastructure
  3. 📚 Core Learning Materials
  4. 🔧 Practical Exercises
  5. Essential Commands
  6. 🎓 Learning Path
  7. 🔍 Production Troubleshooting
  8. 📖 Additional Resources

🎯 Quick Start Guide

🚀 First Day Checklist

  • Access your production cluster: kubectl config use-context your-cluster
  • Explore the management UI: RabbitMQ Management UI
  • Check ArgoCD: Visit https://argo.mathmast.com
  • Review monitoring: Access Grafana dashboards
  • Understand your apps: Check freeleaps-devops-reconciler status

🔑 Essential Access Points

# Your production cluster access
kubectl config get-contexts
kubectl get nodes -o wide

# Your actual services
kubectl get svc -A | grep -E "(rabbitmq|argocd|jenkins|gitea)"

# Your actual namespaces
kubectl get namespaces | grep freeleaps
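
The access checks above can be bundled into one helper so that a new engineer sees at a glance which step fails. This is a minimal sketch, not an official onboarding script: the function names `check` and `first_day_checks` are hypothetical, and it only assumes that `kubectl` is installed and a context has been selected.

```shell
# Run one command quietly and report whether it succeeded.
check() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "OK   $label"
  else
    echo "FAIL $label"
    return 1
  fi
}

# Run the basic access checks and count how many of them failed.
first_day_checks() {
  local failures=0
  check "kubectl context is set" kubectl config current-context || failures=$((failures + 1))
  check "cluster nodes are listable" kubectl get nodes || failures=$((failures + 1))
  check "namespaces are listable" kubectl get namespaces || failures=$((failures + 1))
  echo "$failures check(s) failed"
  [ "$failures" -eq 0 ]
}
```

Call `first_day_checks` after switching context. A non-zero exit status means at least one check failed, which usually points to a missing kubeconfig entry or missing permissions.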

🏗️ Your Production Infrastructure

🌐 Production Domains & Services

| Service | Production URL | Purpose | Access |
| --- | --- | --- | --- |
| ArgoCD | https://argo.mathmast.com | GitOps deployment | Web UI |
| Gitea | https://gitea.freeleaps.mathmast.com | Git repository | Web UI |
| Jenkins | http://jenkins.freeleaps.mathmast.com | CI/CD pipelines | Web UI (Internal access only) |
| RabbitMQ | http://rabbitmq:15672 | Message broker | Management UI |
| Grafana | https://grafana.mathmast.com | Monitoring | Dashboards |
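
A quick way to confirm that the public endpoints listed above answer at all is to ask `curl` for the HTTP status code only. The sketch below covers the three HTTPS URLs from this guide; Jenkins and RabbitMQ are left out because they are internal. The function names are hypothetical, and a redirect or login page (for example 302 or 401) still counts as "reachable" here.

```shell
# Print the HTTP status code returned by a URL ("000" when the request fails).
probe_url() {
  local url="$1" code
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url") || code=000
  echo "$url -> $code"
}

# Probe the public HTTPS endpoints listed in this guide.
probe_public_services() {
  local url
  for url in \
    https://argo.mathmast.com \
    https://gitea.freeleaps.mathmast.com \
    https://grafana.mathmast.com
  do
    probe_url "$url"
  done
}
```

Run `probe_public_services` from a machine with internet access. A `000` result means the connection itself failed (DNS, TLS, or timeout), which is a different problem from an application-level error code.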

🔧 Production Architecture

┌─────────────────────────────────────────────────────────────┐
│                    PRODUCTION INFRASTRUCTURE                │
├─────────────────────────────────────────────────────────────┤
│  Azure Load Balancer (4.155.160.32)                         │
│  ┌─────────────────┐  ┌─────────────────┐  ┌──────────────┐ │
│  │   Ingress-NGINX │  │   cert-manager  │  │   ArgoCD     │ │
│  │   Controller    │  │  (Let's Encrypt)│  │   (GitOps)   │ │
│  └─────────────────┘  └─────────────────┘  └──────────────┘ │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────────┐  ┌─────────────────┐  ┌──────────────┐ │
│  │   RabbitMQ      │  │   Jenkins       │  │   Gitea      │ │
│  │   (Message Q)   │  │   (CI/CD)       │  │   (Git Repo) │ │
│  └─────────────────┘  └─────────────────┘  └──────────────┘ │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────────┐  ┌─────────────────┐  ┌──────────────┐ │
│  │   freeleaps-    │  │   freeleaps-    │  │   freeleaps- │ │
│  │   devops-       │  │   apps          │  │   monitoring │ │
│  │   reconciler    │  │   (Your Apps)   │  │   (Metrics)  │ │
│  └─────────────────┘  └─────────────────┘  └──────────────┘ │
└─────────────────────────────────────────────────────────────┘

📊 Production Namespaces

# Your actual namespaces
freeleaps-alpha              # Alpha environment
freeleaps-prod               # Production environment
freeleaps-devops-system      # DevOps tools
freeleaps-controls-system    # Control plane
freeleaps-monitoring-system  # Monitoring stack
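
To confirm that your context really points at a cluster containing these namespaces, you can ask for each one by name. The following is a small sketch under the assumption that the five names above are complete and current; `EXPECTED_NAMESPACES` and `missing_namespaces` are hypothetical names introduced for illustration.

```shell
# Namespaces listed in this guide.
EXPECTED_NAMESPACES="freeleaps-alpha freeleaps-prod freeleaps-devops-system freeleaps-controls-system freeleaps-monitoring-system"

# Print every expected namespace that the cluster does not report.
missing_namespaces() {
  local ns
  for ns in $EXPECTED_NAMESPACES; do
    kubectl get namespace "$ns" >/dev/null 2>&1 || echo "$ns"
  done
}
```

An empty result from `missing_namespaces` means all five exist. Any name it prints is either absent or not visible to your account.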

📚 Core Learning Materials

🎓 Phase 1: Kubernetes Fundamentals

  • Kubernetes Core Concepts Guide - Start here!

    • Production Connection: Your actual pods, services, and deployments
    • Real Examples: Based on your freeleaps-devops-reconciler deployment
    • Hands-on: Practice with your actual cluster
  • PVC Deep Dive Guide - Storage fundamentals

    • Production Connection: Your Azure disk storage classes
    • Real Examples: How your apps use persistent storage
    • Troubleshooting: Common storage issues in your environment

🔧 Phase 2: DevOps Infrastructure

  • Custom Resources & Operators Guide - Advanced concepts

    • Production Connection: Your freeleaps-devops-reconciler operator
    • Real Examples: How your CRDs work in production
    • Architecture: Understanding your operator pattern
  • Reconciler Architecture Deep Dive - Your core system

    • Production Connection: Your actual reconciler deployment
    • Real Examples: How your DevOps automation works
    • Troubleshooting: Common reconciler issues
  • Reconciler Framework Analysis - Technical deep dive

    • Production Connection: Your Python/Kopf operator framework
    • Real Examples: Code analysis from your actual implementation
    • Best Practices: How to improve your reconciler

🌐 Phase 3: Networking & Ingress

  • Ingress Setup & Redirects Guide - Web traffic management

    • Production Connection: Your actual ingress controllers
    • Real Examples: How your domains are configured
    • Troubleshooting: Common ingress issues
  • Current Ingress Analysis - Your actual setup

    • Production Connection: Your real ingress configurations
    • Real Examples: Your actual domain routing
    • Monitoring: How to check ingress health

📨 Phase 4: Messaging & Communication

  • RabbitMQ Management Analysis - Message broker
    • Production Connection: Your actual RabbitMQ deployment
    • Real Examples: Your message queues and exchanges
    • Management UI: How to use the built-in management interface

🗄️ Phase 4.5: Database Management

  • PostgreSQL & Gitea Integration Guide - Database operations
    • Production Connection: Your actual PostgreSQL deployments (Alpha vs Production)
    • Real Examples: How Gitea connects to PostgreSQL in your environments
    • Data Access: How to access and manage your Gitea database
    • Monitoring: Database health checks and performance monitoring

🚀 Phase 5: Operations & Deployment

  • Kubernetes Bootstrap Guide - Cluster setup

    • Production Connection: How your cluster was built
    • Real Examples: Your actual bootstrap process
    • Maintenance: How to maintain your cluster
  • Azure K8s Node Addition Runbook - Scaling

    • Production Connection: How to add nodes to your cluster
    • Real Examples: Your actual node addition process
    • Automation: Scripts for node management

🔧 Practical Exercises

🎯 Exercise 1: Explore Your Production Cluster

# 1. Connect to your cluster
kubectl config use-context your-production-cluster

# 2. Explore your namespaces
kubectl get namespaces | grep freeleaps

# 3. Check your actual deployments
kubectl get deployments -A | grep freeleaps

# 4. Monitor your reconciler
kubectl logs -f deployment/freeleaps-devops-reconciler -n freeleaps-devops-system

🎯 Exercise 2: RabbitMQ Management UI

# 1. Port forward to RabbitMQ management UI
kubectl port-forward svc/rabbitmq-headless -n freeleaps-alpha 15672:15672

# 2. Access the UI: http://localhost:15672
# Username: user
# Password: NjlhHFvnDuC7K0ir

# 3. Explore your queues:
# - freeleaps.devops.reconciler.queue
# - freeleaps.devops.reconciler.input
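
The same queues can be checked without the browser, because the management plugin also exposes an HTTP API on port 15672. The sketch below assumes the port-forward from step 1 is still running, that the queues live in the default vhost `/` (written as `%2F` in the URL), and that credentials are supplied through the hypothetical environment variables `RABBITMQ_USER` and `RABBITMQ_PASS` rather than typed into the script.

```shell
RABBITMQ_API="http://localhost:15672/api"

# Succeed when the named queue exists in the default vhost ("/" is URL-encoded as %2F).
queue_exists() {
  curl -sf -o /dev/null \
    -u "${RABBITMQ_USER:?set RABBITMQ_USER}:${RABBITMQ_PASS:?set RABBITMQ_PASS}" \
    "$RABBITMQ_API/queues/%2F/$1"
}

# Report whether the two reconciler queues named in this guide are present.
check_reconciler_queues() {
  local q
  for q in freeleaps.devops.reconciler.queue freeleaps.devops.reconciler.input; do
    if queue_exists "$q"; then
      echo "found   $q"
    else
      echo "missing $q"
    fi
  done
}
```

Export the two variables in your shell, then run `check_reconciler_queues`. A "missing" line can also mean wrong credentials or a different vhost, so verify those before concluding that the queue is gone.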

🎯 Exercise 3: ArgoCD GitOps

# 1. Access ArgoCD: https://argo.mathmast.com

# 2. Explore your applications:
# - freeleaps-devops-reconciler
# - freeleaps-apps
# - monitoring stack

# 3. Check deployment status
kubectl get applications -n argocd

🎯 Exercise 4: Monitor Your Infrastructure

# 1. Check cluster health
kubectl get nodes -o wide

# 2. Monitor resource usage
kubectl top nodes
kubectl top pods -A

# 3. Check ingress status
kubectl get ingress -A

Essential Commands

🔍 Production Monitoring

# Your cluster health
kubectl get nodes -o wide
kubectl get pods -A --field-selector=status.phase!=Running

# Your services
kubectl get svc -A | grep -E "(rabbitmq|argocd|jenkins|gitea)"

# Your reconciler status
kubectl get deployment freeleaps-devops-reconciler -n freeleaps-devops-system
kubectl logs -f deployment/freeleaps-devops-reconciler -n freeleaps-devops-system
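
One caveat with the `status.phase!=Running` selector: a pod in `CrashLoopBackOff` still has phase `Running`, while finished Jobs show up because their phase is `Succeeded`. Filtering on the STATUS column that `kubectl` prints is a rough alternative. The sketch below assumes the default column order of `kubectl get pods -A` (namespace, name, ready, status); `unhealthy_pods` is a hypothetical helper name.

```shell
# Print "namespace/pod status" for every pod whose STATUS column
# is neither Running nor Completed.
unhealthy_pods() {
  kubectl get pods -A --no-headers |
    awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

This still does not catch pods that are `Running` but not ready (for example `0/1`), so treat an empty result as a first signal, not as proof of health.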

🔧 Troubleshooting

# Check reconciler health
kubectl describe deployment freeleaps-devops-reconciler -n freeleaps-devops-system

# Check RabbitMQ status
kubectl get pods -n freeleaps-alpha | grep rabbitmq
kubectl logs -f deployment/rabbitmq -n freeleaps-alpha

# Check ingress issues
kubectl describe ingress -A
kubectl get events -A --sort-by='.lastTimestamp'

📊 Resource Management

# Monitor resource usage
kubectl top nodes
kubectl top pods -A

# Check storage
kubectl get pvc -A
kubectl get pv

# Check networking
kubectl get svc -A
kubectl get endpoints -A

🎓 Learning Path

📅 Week 1: Foundations

📅 Week 2: DevOps Infrastructure

📅 Week 3: Networking & Communication

📅 Week 4: Operations & Production


🔍 Production Troubleshooting

🚨 Common Issues & Solutions

1. Reconciler Not Working

# Check reconciler status
kubectl get deployment freeleaps-devops-reconciler -n freeleaps-devops-system
kubectl logs -f deployment/freeleaps-devops-reconciler -n freeleaps-devops-system

# Check RabbitMQ connection
kubectl exec -it deployment/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_connections
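
Before reading logs, it often helps to know whether the Deployment has all of its replicas ready. The sketch below reads both numbers with a JSONPath query and compares them. `deployment_ready` is a hypothetical helper, and it treats a missing `readyReplicas` field as zero, which is how the API reports a Deployment with no ready pods.

```shell
# Print "name: ready/desired ready" and succeed only when the two numbers match.
deployment_ready() {
  local name="$1" ns="$2" counts ready desired
  counts=$(kubectl get deployment "$name" -n "$ns" \
    -o jsonpath='{.status.readyReplicas}/{.spec.replicas}') || return 2
  ready="${counts%/*}"     # text before the slash; empty when no replica is ready
  desired="${counts#*/}"   # text after the slash
  echo "$name: ${ready:-0}/$desired ready"
  [ "${ready:-0}" = "$desired" ]
}
```

For the reconciler, that would be `deployment_ready freeleaps-devops-reconciler freeleaps-devops-system`. Exit status 1 means some replicas are not ready, and 2 means the Deployment could not be read at all.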

2. Ingress Issues

# Check ingress controller
kubectl get pods -n ingress-nginx
kubectl logs -f deployment/ingress-nginx-controller -n ingress-nginx

# Check certificates
kubectl get certificates -A
kubectl describe certificate -n your-namespace

3. Storage Problems

# Check PVC status
kubectl get pvc -A
kubectl describe pvc your-pvc-name -n your-namespace

# Check storage classes
kubectl get storageclass
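
When a PVC stays in `Pending`, the reason is usually recorded as an event on the claim. Events can be narrowed to a single object with field selectors, as in this sketch. `pvc_events` is a hypothetical helper name, and the namespace and claim name are whatever your own `kubectl get pvc -A` output shows.

```shell
# Show the events recorded for one PVC, oldest first.
pvc_events() {
  local ns="$1" name="$2"
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.kind=PersistentVolumeClaim,involvedObject.name=$name" \
    --sort-by=.lastTimestamp
}
```

Usage: `pvc_events your-namespace your-pvc-name`. Events expire after a while, so an empty list on an old claim does not mean nothing went wrong.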

📊 Monitoring & Alerts

Key Metrics to Watch

  • Cluster health: Node status, pod restarts
  • Resource usage: CPU, memory, disk
  • Network: Ingress traffic, service connectivity
  • Applications: Reconciler health, RabbitMQ queues

Alerting Setup

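# Check Prometheus scrape configuration ("targets" is not a kubectl resource;
# ServiceMonitors are the objects the Prometheus Operator turns into targets)
kubectl get servicemonitors -n freeleaps-monitoring-system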

# Check Grafana dashboards
# Access: https://grafana.mathmast.com
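
Live scrape-target health is available from the Prometheus HTTP API at `/api/v1/targets`. The sketch below counts active targets that report `"health":"down"`. It assumes Prometheus is reachable on `localhost:9090`, for example through a port-forward such as `kubectl port-forward svc/prometheus-operated -n freeleaps-monitoring-system 9090:9090`; that service name is the Prometheus Operator default and is an assumption about this cluster. `count_down_targets` is a hypothetical helper name.

```shell
# Count active scrape targets whose health is "down".
# Matches on the compact JSON that Prometheus returns, so no JSON parser is required.
count_down_targets() {
  curl -sf "http://localhost:9090/api/v1/targets?state=active" |
    grep -o '"health":"down"' |
    wc -l |
    tr -d ' '
}
```

A result of `0` means every active target was scraped successfully on the last attempt. Anything higher is worth a look at the Targets page in the Prometheus UI, which shows the scrape error for each one.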

📖 Additional Resources

🔗 Official Documentation

🎥 Video Resources

📚 Books

  • "Kubernetes in Action" by Marko Lukša
  • "GitOps and Kubernetes" by Billy Yuen
  • "RabbitMQ in Depth" by Gavin M. Roy

🛠️ Tools & Utilities

  • k9s - Terminal UI for K8s
  • Lens - Desktop IDE for K8s
  • kubectx - Context switching

🎯 Next Steps

🚀 Immediate Actions

  1. Set up your development environment with kubectl and helm
  2. Access your production cluster and explore the resources
  3. Complete the practical exercises in this guide
  4. Open the monitoring dashboards and learn what each metric means

📈 Career Development

  1. Get certified: CKA (Certified Kubernetes Administrator)
  2. Contribute: Help improve the reconciler and infrastructure
  3. Learn: Stay updated with latest K8s and DevOps practices
  4. Share: Document your learnings and share with the team

🤝 Team Collaboration

  • Code reviews: Review reconciler changes
  • Documentation: Improve this guide based on your experience
  • Mentoring: Help other junior engineers
  • Innovation: Suggest improvements to the infrastructure

📞 Support & Contact

🆘 Getting Help

  • Team Slack: #devops-support channel
  • Documentation: This guide and linked resources
  • Code Reviews: GitHub pull requests
  • Pair Programming: Schedule sessions with senior engineers

📝 Feedback

  • Documentation: Create issues for improvements
  • Process: Suggest workflow optimizations
  • Tools: Recommend new tools or improvements

🎉 Welcome to the FreeLeaps DevOps team! You're now part of a production infrastructure that serves real users. Take ownership, learn continuously, and help us build amazing things!


Last updated: 2025-09-04
Maintained by: FreeLeaps DevOps Team