# 🚀 FreeLeaps DevOps Learning Path for Junior Engineers

> **Production-Ready Kubernetes & DevOps Documentation**
> *Your gateway to understanding our actual infrastructure and becoming a DevOps expert*

---

## 📋 **Table of Contents**

1. [🎯 **Quick Start Guide**](#-quick-start-guide)
2. [🏗️ **Your Production Infrastructure**](#️-your-production-infrastructure)
3. [📚 **Core Learning Materials**](#-core-learning-materials)
4. [🔧 **Practical Exercises**](#-practical-exercises)
5. [⚡ **Essential Commands**](#-essential-commands)
6. [🎓 **Learning Path**](#-learning-path)
7. [🔍 **Production Troubleshooting**](#-production-troubleshooting)
8. [📖 **Additional Resources**](#-additional-resources)

---

## 🎯 **Quick Start Guide**

### **🚀 First Day Checklist**
- [ ] **Access your production cluster**: `kubectl config use-context your-cluster`
- [ ] **Explore the management UI**: [RabbitMQ Management UI](#-exercise-2-rabbitmq-management-ui)
- [ ] **Check ArgoCD**: Visit `https://argo.mathmast.com`
- [ ] **Review monitoring**: Open the Grafana dashboards at `https://grafana.mathmast.com`
- [ ] **Understand your apps**: Check `freeleaps-devops-reconciler` status

### **🔑 Essential Access Points**
```bash
# Your production cluster access
kubectl config get-contexts
kubectl get nodes -o wide

# Your actual services
kubectl get svc -A | grep -E "(rabbitmq|argocd|jenkins|gitea)"

# Your actual namespaces
kubectl get namespaces | grep freeleaps
```
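
Because every command in this guide runs against a live cluster, it can help to confirm which context is active before doing anything else. The helper below is a minimal sketch, not part of the FreeLeaps tooling: `require_context` is a hypothetical function name, and `your-cluster` is the same placeholder used in the checklist above.

```bash
# Hypothetical helper: refuse to continue unless kubectl points at the expected context.
require_context() {
  expected="$1"
  # `kubectl config current-context` prints the name of the active context.
  current="$(kubectl config current-context 2>/dev/null)"
  if [ "$current" != "$expected" ]; then
    echo "Active context is '${current:-none}', expected '$expected'" >&2
    return 1
  fi
  echo "Using context: $current"
}

# Example (placeholder context name):
#   require_context your-cluster && kubectl get nodes -o wide
```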

---

## 🏗️ **Your Production Infrastructure**

### **🌐 Production Domains & Services**

| **Service** | **Production URL** | **Purpose** | **Access** |
|-------------|-------------------|-------------|------------|
| **ArgoCD** | `https://argo.mathmast.com` | GitOps deployment | Web UI |
| **Gitea** | `https://gitea.freeleaps.mathmast.com` | Git repository | Web UI |
| **Jenkins** | `http://jenkins.freeleaps.mathmast.com` | CI/CD pipelines | Web UI |
| **RabbitMQ** | `http://rabbitmq:15672` | Message broker | Management UI (in-cluster address; use the port-forward from Exercise 2) |
| **Grafana** | `https://grafana.mathmast.com` | Monitoring | Dashboards |
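
A quick way to confirm that the public endpoints in the table respond is to request each one and look at the HTTP status code. The following is a minimal sketch; `check_urls` is a hypothetical helper name, and the 10-second timeout is an arbitrary choice.

```bash
# Hypothetical helper: print the HTTP status code returned by each URL.
# A code of 000 means curl could not connect at all.
check_urls() {
  for url in "$@"; do
    # -s: silent, -o /dev/null: discard the body,
    # --max-time: give up after 10 seconds, -w: print only the status code
    code="$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$url")"
    printf '%s %s\n' "$code" "$url"
  done
}
```

For example, `check_urls https://argo.mathmast.com https://grafana.mathmast.com` prints one line per URL. The RabbitMQ entry is left out here because `rabbitmq` looks like a hostname that resolves only inside the cluster.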

### **🔧 Production Architecture**

```
┌─────────────────────────────────────────────────────────────┐
│                  PRODUCTION INFRASTRUCTURE                  │
├─────────────────────────────────────────────────────────────┤
│  Azure Load Balancer (4.155.160.32)                         │
│  ┌─────────────────┐  ┌─────────────────┐  ┌──────────────┐ │
│  │  Ingress-NGINX  │  │  cert-manager   │  │  ArgoCD      │ │
│  │  Controller     │  │  (Let's Encrypt)│  │  (GitOps)    │ │
│  └─────────────────┘  └─────────────────┘  └──────────────┘ │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────────┐  ┌─────────────────┐  ┌──────────────┐ │
│  │  RabbitMQ       │  │  Jenkins        │  │  Gitea       │ │
│  │  (Message Q)    │  │  (CI/CD)        │  │  (Git Repo)  │ │
│  └─────────────────┘  └─────────────────┘  └──────────────┘ │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────────┐  ┌─────────────────┐  ┌──────────────┐ │
│  │  freeleaps-     │  │  freeleaps-     │  │  freeleaps-  │ │
│  │  devops-        │  │  apps           │  │  monitoring  │ │
│  │  reconciler     │  │  (Your Apps)    │  │  (Metrics)   │ │
│  └─────────────────┘  └─────────────────┘  └──────────────┘ │
└─────────────────────────────────────────────────────────────┘
```

### **📊 Production Namespaces**

```bash
# Your actual namespaces
freeleaps-alpha              # Alpha environment
freeleaps-prod               # Production environment
freeleaps-devops-system      # DevOps tools
freeleaps-controls-system    # Control plane
freeleaps-monitoring-system  # Monitoring stack
```
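
To check that the cluster you are connected to actually contains the namespaces listed above, you can compare the list against what `kubectl` reports. This is a sketch only; `missing_namespaces` is a hypothetical helper name.

```bash
# Hypothetical helper: print each expected namespace that the cluster does not have.
missing_namespaces() {
  # `-o name` prints one resource per line in the form "namespace/<name>".
  existing="$(kubectl get namespaces -o name)"
  for ns in "$@"; do
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if ! printf '%s\n' "$existing" | grep -qxF "namespace/$ns"; then
      echo "$ns"
    fi
  done
}

# Example, using the namespaces listed above (no output means none are missing):
#   missing_namespaces freeleaps-alpha freeleaps-prod freeleaps-devops-system \
#     freeleaps-controls-system freeleaps-monitoring-system
```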

---

## 📚 **Core Learning Materials**

### **🎓 Phase 1: Kubernetes Fundamentals**
- **[Kubernetes Core Concepts Guide](Kubernetes_Core_Concepts_Guide.md)** - *Start here!*
  - **Production Connection**: Your actual pods, services, and deployments
  - **Real Examples**: Based on your `freeleaps-devops-reconciler` deployment
  - **Hands-on**: Practice with your actual cluster

- **[PVC Deep Dive Guide](PVC_Deep_Dive_Guide.md)** - *Storage fundamentals*
  - **Production Connection**: Your Azure disk storage classes
  - **Real Examples**: How your apps use persistent storage
  - **Troubleshooting**: Common storage issues in your environment

### **🔧 Phase 2: DevOps Infrastructure**
- **[Custom Resources & Operators Guide](Custom_Resources_And_Operators_Guide.md)** - *Advanced concepts*
  - **Production Connection**: Your `freeleaps-devops-reconciler` operator
  - **Real Examples**: How your CRDs work in production
  - **Architecture**: Understanding your operator pattern

- **[Reconciler Architecture Deep Dive](Reconciler_Architecture_Deep_Dive.md)** - *Your core system*
  - **Production Connection**: Your actual reconciler deployment
  - **Real Examples**: How your DevOps automation works
  - **Troubleshooting**: Common reconciler issues

- **[Reconciler Framework Analysis](Reconciler_Framework_Analysis.md)** - *Technical deep dive*
  - **Production Connection**: Your Python/Kopf operator framework
  - **Real Examples**: Code analysis from your actual implementation
  - **Best Practices**: How to improve your reconciler

### **🌐 Phase 3: Networking & Ingress**
- **[Ingress Setup & Redirects Guide](Ingress_Setup_And_Redirects_Guide.md)** - *Web traffic management*
  - **Production Connection**: Your actual ingress controllers
  - **Real Examples**: How your domains are configured
  - **Troubleshooting**: Common ingress issues

- **[Current Ingress Analysis](Current_Ingress_Analysis.md)** - *Your actual setup*
  - **Production Connection**: Your real ingress configurations
  - **Real Examples**: Your actual domain routing
  - **Monitoring**: How to check ingress health

### **📨 Phase 4: Messaging & Communication**
- **[RabbitMQ Management Analysis](RabbitMQ_Management_Analysis.md)** - *Message broker*
  - **Production Connection**: Your actual RabbitMQ deployment
  - **Real Examples**: Your message queues and exchanges
  - **Management UI**: How to use the built-in management interface

### **🚀 Phase 5: Operations & Deployment**
- **[Kubernetes Bootstrap Guide](Kubernetes_Bootstrap_Guide.md)** - *Cluster setup*
  - **Production Connection**: How your cluster was built
  - **Real Examples**: Your actual bootstrap process
  - **Maintenance**: How to maintain your cluster

- **[Azure K8s Node Addition Runbook](Azure_K8s_Node_Addition_Runbook.md)** - *Scaling*
  - **Production Connection**: How to add nodes to your cluster
  - **Real Examples**: Your actual node addition process
  - **Automation**: Scripts for node management

---
## 🔧 **Practical Exercises**

### **🎯 Exercise 1: Explore Your Production Cluster**
```bash
# 1. Connect to your cluster
kubectl config use-context your-production-cluster

# 2. Explore your namespaces
kubectl get namespaces | grep freeleaps

# 3. Check your actual deployments
kubectl get deployments -A | grep freeleaps

# 4. Monitor your reconciler
kubectl logs -f deployment/freeleaps-devops-reconciler -n freeleaps-devops-system
```
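
Before tailing the reconciler logs in step 4, it can be useful to know whether the deployment is fully ready. The sketch below reads the ready and desired replica counts with a JSONPath query; `deployment_ready` is a hypothetical helper name, not an existing FreeLeaps script.

```bash
# Hypothetical helper: succeed only when every desired replica of a deployment is ready.
deployment_ready() {
  name="$1"; namespace="$2"
  # jsonpath prints "<ready>/<desired>"; readyReplicas is omitted from the status
  # when no replica is ready, so the left side may be empty.
  status="$(kubectl get deployment "$name" -n "$namespace" \
    -o jsonpath='{.status.readyReplicas}/{.spec.replicas}')"
  ready="${status%/*}"
  desired="${status#*/}"
  echo "$name: ${ready:-0}/${desired:-0} replicas ready"
  # Treat "0 desired" (or a failed lookup) as not ready.
  [ "${ready:-0}" = "${desired:-0}" ] && [ "${desired:-0}" != "0" ]
}
```

A possible use is `deployment_ready freeleaps-devops-reconciler freeleaps-devops-system || echo "reconciler is not fully ready"`.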

### **🎯 Exercise 2: RabbitMQ Management UI**
```bash
# 1. Port forward to RabbitMQ management UI
kubectl port-forward svc/rabbitmq-headless -n freeleaps-alpha 15672:15672

# 2. Access the UI: http://localhost:15672
# Username: user
# Password: NjlhHFvnDuC7K0ir

# 3. Explore your queues:
# - freeleaps.devops.reconciler.queue
# - freeleaps.devops.reconciler.input
```

### **🎯 Exercise 3: ArgoCD GitOps**
```bash
# 1. Access ArgoCD: https://argo.mathmast.com

# 2. Explore your applications:
# - freeleaps-devops-reconciler
# - freeleaps-apps
# - monitoring stack

# 3. Check deployment status
kubectl get applications -n argocd
```
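
When the application list is long, filtering for anything that is not both `Synced` and `Healthy` makes problems easier to spot. This is a sketch that assumes the default columns printed for Argo CD `Application` resources (name, sync status, health status) and the `argocd` namespace used above; `unhealthy_apps` is a hypothetical helper name.

```bash
# Hypothetical helper: list Argo CD applications that are not both Synced and Healthy.
# Assumes the default `kubectl get applications` columns: NAME, SYNC STATUS, HEALTH STATUS.
unhealthy_apps() {
  namespace="${1:-argocd}"
  kubectl get applications -n "$namespace" --no-headers |
    awk '$2 != "Synced" || $3 != "Healthy" { print $1 ": sync=" $2 ", health=" $3 }'
}
```

Running `unhealthy_apps` with no arguments checks the `argocd` namespace; empty output means every application is synced and healthy.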

### **🎯 Exercise 4: Monitor Your Infrastructure**
```bash
# 1. Check cluster health
kubectl get nodes -o wide

# 2. Monitor resource usage
kubectl top nodes
kubectl top pods -A

# 3. Check ingress status
kubectl get ingress -A
```

---

## ⚡ **Essential Commands**

### **🔍 Production Monitoring**
```bash
# Your cluster health
kubectl get nodes -o wide
kubectl get pods -A --field-selector=status.phase!=Running

# Your services
kubectl get svc -A | grep -E "(rabbitmq|argocd|jenkins|gitea)"

# Your reconciler status
kubectl get deployment freeleaps-devops-reconciler -n freeleaps-devops-system
kubectl logs -f deployment/freeleaps-devops-reconciler -n freeleaps-devops-system
```

### **🔧 Troubleshooting**
```bash
# Check reconciler health
kubectl describe deployment freeleaps-devops-reconciler -n freeleaps-devops-system

# Check RabbitMQ status
kubectl get pods -n freeleaps-alpha | grep rabbitmq
kubectl logs -f deployment/rabbitmq -n freeleaps-alpha

# Check ingress issues
kubectl describe ingress -A
kubectl get events -A --sort-by='.lastTimestamp'
```

### **📊 Resource Management**
```bash
# Monitor resource usage
kubectl top nodes
kubectl top pods -A

# Check storage
kubectl get pvc -A
kubectl get pv

# Check networking
kubectl get svc -A
kubectl get endpoints -A
```

---
## 🎓 **Learning Path**

### **📅 Week 1: Foundations**
- **Day 1-2**: [Kubernetes Core Concepts](Kubernetes_Core_Concepts_Guide.md)
- **Day 3-4**: [PVC Deep Dive](PVC_Deep_Dive_Guide.md)
- **Day 5**: Practice exercises with your actual cluster

### **📅 Week 2: DevOps Infrastructure**
- **Day 1-2**: [Custom Resources & Operators](Custom_Resources_And_Operators_Guide.md)
- **Day 3-4**: [Reconciler Architecture](Reconciler_Architecture_Deep_Dive.md)
- **Day 5**: [Reconciler Framework Analysis](Reconciler_Framework_Analysis.md)

### **📅 Week 3: Networking & Communication**
- **Day 1-2**: [Ingress Setup & Redirects](Ingress_Setup_And_Redirects_Guide.md)
- **Day 3**: [Current Ingress Analysis](Current_Ingress_Analysis.md)
- **Day 4-5**: [RabbitMQ Management](RabbitMQ_Management_Analysis.md)

### **📅 Week 4: Operations & Production**
- **Day 1-2**: [Kubernetes Bootstrap](Kubernetes_Bootstrap_Guide.md)
- **Day 3-4**: [Azure Node Addition](Azure_K8s_Node_Addition_Runbook.md)
- **Day 5**: Production troubleshooting and monitoring

---
## 🔍 **Production Troubleshooting**

### **🚨 Common Issues & Solutions**

#### **1. Reconciler Not Working**
```bash
# Check reconciler status
kubectl get deployment freeleaps-devops-reconciler -n freeleaps-devops-system
kubectl logs -f deployment/freeleaps-devops-reconciler -n freeleaps-devops-system

# Check RabbitMQ connection
kubectl exec -it deployment/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_connections
```
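
If the connection looks fine but the reconciler still appears stuck, a growing queue backlog is another signal worth checking. The sketch below reuses the `kubectl exec` target from the command above and runs `rabbitmqctl list_queues`; `queue_backlog` is a hypothetical helper name and the default threshold of 100 messages is an arbitrary assumption.

```bash
# Hypothetical helper: print queues holding more than a given number of messages.
# Assumes the same exec target (deployment/rabbitmq in freeleaps-alpha) as the command above.
queue_backlog() {
  threshold="${1:-100}"
  # -q suppresses informational banners; any header row is skipped in awk
  # by requiring a purely numeric message count.
  kubectl exec deployment/rabbitmq -n freeleaps-alpha -- \
    rabbitmqctl -q list_queues name messages |
    awk -v limit="$threshold" '$2 ~ /^[0-9]+$/ && $2 + 0 > limit + 0 { print $1, $2 }'
}
```

For example, `queue_backlog 500` lists only queues with more than 500 messages. Queue names that contain spaces are not handled by this sketch.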

#### **2. Ingress Issues**
```bash
# Check ingress controller
kubectl get pods -n ingress-nginx
kubectl logs -f deployment/ingress-nginx-controller -n ingress-nginx

# Check certificates
kubectl get certificates -A
kubectl describe certificate -n your-namespace
```
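
To narrow the certificate list down to the ones that need attention, you can filter on the `READY` column. This sketch assumes the default cert-manager columns shown by `kubectl get certificates -A` (namespace, name, ready, secret, age); `certs_not_ready` is a hypothetical helper name.

```bash
# Hypothetical helper: list cert-manager Certificates whose READY column is not "True".
# Assumes the default columns with -A: NAMESPACE, NAME, READY, SECRET, AGE.
certs_not_ready() {
  kubectl get certificates -A --no-headers |
    awk '$3 != "True" { print $1 "/" $2 " (ready=" $3 ")" }'
}
```

Each line of output names a certificate to inspect further with `kubectl describe certificate`.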

#### **3. Storage Problems**
```bash
# Check PVC status
kubectl get pvc -A
kubectl describe pvc your-pvc-name -n your-namespace

# Check storage classes
kubectl get storageclass
```
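
Most storage problems show up as a PVC that is stuck in a state other than `Bound`. The following sketch filters the `kubectl get pvc -A` output on the `STATUS` column; `unbound_pvcs` is a hypothetical helper name.

```bash
# Hypothetical helper: list PersistentVolumeClaims whose STATUS is not "Bound".
# With -A the first three columns are NAMESPACE, NAME, STATUS.
unbound_pvcs() {
  kubectl get pvc -A --no-headers |
    awk '$3 != "Bound" { print $1 "/" $2 " (status=" $3 ")" }'
}
```

Any claim it prints is a candidate for `kubectl describe pvc`, whose events usually explain why binding has not happened.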

### **📊 Monitoring & Alerts**

#### **Key Metrics to Watch**
- **Cluster health**: Node status, pod restarts
- **Resource usage**: CPU, memory, disk
- **Network**: Ingress traffic, service connectivity
- **Applications**: Reconciler health, RabbitMQ queues
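
Pod restarts, the first metric in the list above, can be checked from the command line without opening a dashboard. This is a minimal sketch that reads the `RESTARTS` column of `kubectl get pods -A`; `restarting_pods` is a hypothetical helper name and the default threshold of 5 is an arbitrary assumption.

```bash
# Hypothetical helper: list pods whose restart count exceeds a threshold.
# Assumes the default columns with -A: NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE.
restarting_pods() {
  threshold="${1:-5}"
  kubectl get pods -A --no-headers |
    awk -v limit="$threshold" '$5 + 0 > limit + 0 { print $1 "/" $2 " restarts=" $5 }'
}
```

For example, `restarting_pods 10` prints only pods that have restarted more than ten times.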
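
#### **Alerting Setup**
```bash
# Check Prometheus scrape targets ("targets" is not a kubectl resource type).
# Assuming the Prometheus Operator CRDs are installed, list the ServiceMonitors
# that define them; otherwise open Status > Targets in the Prometheus UI.
kubectl get servicemonitors -n freeleaps-monitoring-system

# Check Grafana dashboards
# Access: https://grafana.mathmast.com
```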

---

## 📖 **Additional Resources**

### **🔗 Official Documentation**
- **[Kubernetes Documentation](https://kubernetes.io/docs/)** - Official K8s docs
- **[ArgoCD Documentation](https://argo-cd.readthedocs.io/)** - GitOps platform
- **[RabbitMQ Documentation](https://www.rabbitmq.com/documentation.html)** - Message broker
- **[Helm Documentation](https://helm.sh/docs/)** - Package manager

### **🎥 Video Resources**
- **Kubernetes Crash Course**: [TechWorld with Nana](https://www.youtube.com/watch?v=s_o8dwzRlu4)
- **ArgoCD Tutorial**: [ArgoCD Official](https://www.youtube.com/watch?v=MeU5_k9ssOY)
- **RabbitMQ Basics**: [RabbitMQ Official](https://www.youtube.com/watch?v=deG25y_r6OI)

### **📚 Books**
- **"Kubernetes in Action"** by Marko Lukša
- **"GitOps and Kubernetes"** by Billy Yuen
- **"RabbitMQ in Depth"** by Gavin M. Roy

### **🛠️ Tools & Utilities**
- **[k9s](https://k9scli.io/)** - Terminal UI for K8s
- **[Lens](https://k8slens.dev/)** - Desktop IDE for K8s
- **[kubectx](https://github.com/ahmetb/kubectx)** - Context switching

---
## 🎯 **Next Steps**

### **🚀 Immediate Actions**
1. **Set up your development environment** with kubectl and helm
2. **Access your production cluster** and explore the resources
3. **Complete the practical exercises** in this guide
4. **Open the monitoring dashboards** and learn what each metric means

### **📈 Career Development**
1. **Get certified**: [CKA (Certified Kubernetes Administrator)](https://www.cncf.io/certification/cka/)
2. **Contribute**: Help improve the reconciler and infrastructure
3. **Learn**: Stay up to date with the latest K8s and DevOps practices
4. **Share**: Document what you learn and share it with the team

### **🤝 Team Collaboration**
- **Code reviews**: Review reconciler changes
- **Documentation**: Improve this guide based on your experience
- **Mentoring**: Help other junior engineers
- **Innovation**: Suggest improvements to the infrastructure

---
## 📞 **Support & Contact**

### **🆘 Getting Help**
- **Team Slack**: #devops-support channel
- **Documentation**: This guide and linked resources
- **Code Reviews**: GitHub pull requests
- **Pair Programming**: Schedule sessions with senior engineers

### **📝 Feedback**
- **Documentation**: Create issues for improvements
- **Process**: Suggest workflow optimizations
- **Tools**: Recommend new tools or improvements

---

**🎉 Welcome to the FreeLeaps DevOps team! You're now part of a production infrastructure that serves real users. Take ownership, learn continuously, and help us build amazing things!**

---

*Last updated: $(date)*
*Maintained by: FreeLeaps DevOps Team*