# 🐰 RabbitMQ Management Analysis & Production Guide

> **Complete Guide to Managing RabbitMQ in Your FreeLeaps Production Environment**
> *From configuration to monitoring to troubleshooting*

---

## 📋 **Table of Contents**

1. [🎯 **Quick Start**](#-quick-start)
2. [🏗️ **Your Production Setup**](#️-your-production-setup)
3. [🔧 **Current Configuration Analysis**](#-current-configuration-analysis)
4. [📊 **Management UI Guide**](#-management-ui-guide)
5. [🔍 **Production Monitoring**](#-production-monitoring)
6. [🚨 **Troubleshooting Guide**](#-troubleshooting-guide)
7. [⚡ **Performance Optimization**](#-performance-optimization)
8. [🔒 **Security Best Practices**](#-security-best-practices)
9. [📈 **Scaling & High Availability**](#-scaling--high-availability)
10. [🛠️ **Maintenance Procedures**](#️-maintenance-procedures)
11. [🎯 **Summary & Next Steps**](#-summary--next-steps)

---

## 🎯 **Quick Start**

### **🚀 First Day Checklist**

- [ ] **Access RabbitMQ Management UI**: Port forward to `http://localhost:15672`
- [ ] **Check your queues**: Verify `freeleaps.devops.reconciler.*` queues exist
- [ ] **Monitor connections**: Check if reconciler is connected
- [ ] **Review metrics**: Check message rates and queue depths
- [ ] **Test connectivity**: Verify RabbitMQ is accessible from your apps

### **🔑 Essential Commands**

```bash
# Note: the Bitnami chart runs RabbitMQ as a StatefulSet (pods rabbitmq-0, rabbitmq-1, ...),
# so the commands in this guide target statefulset/rabbitmq.

# Access your RabbitMQ cluster
kubectl get pods -n freeleaps-alpha | grep rabbitmq

# Port forward to management UI
kubectl port-forward svc/rabbitmq-headless -n freeleaps-alpha 15672:15672

# Check RabbitMQ logs
kubectl logs -f statefulset/rabbitmq -n freeleaps-alpha

# Access RabbitMQ CLI
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_queues
```

---

## 🏗️ **Your Production Setup**

### **🌐 Production Architecture**

```
┌─────────────────────────────────────────────────────────────┐
│                  RABBITMQ PRODUCTION SETUP                  │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────────┐  ┌─────────────────┐  ┌──────────────┐ │
│  │   freeleaps-    │  │   freeleaps-    │  │  freeleaps-  │ │
│  │     devops-     │  │      apps       │  │  monitoring  │ │
│  │   reconciler    │  │   (Your Apps)   │  │  (Metrics)   │ │
│  └─────────────────┘  └─────────────────┘  └──────────────┘ │
│           │                    │                    │       │
│           │ AMQP 5672          │ AMQP 5672          │       │
│           │ HTTP 15672         │ HTTP 15672         │       │
│           └────────────────────┼────────────────────┘       │
│                                │                            │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │                    RABBITMQ CLUSTER                     │ │
│ │   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   │ │
│ │   │   Node 1    │   │   Node 2    │   │   Node 3    │   │ │
│ │   │  (Primary)  │   │  (Replica)  │   │  (Replica)  │   │ │
│ │   │ Port: 5672  │   │ Port: 5672  │   │ Port: 5672  │   │ │
│ │   │  UI: 15672  │   │  UI: 15672  │   │  UI: 15672  │   │ │
│ │   └─────────────┘   └─────────────┘   └─────────────┘   │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
```

### **📊 Production Namespaces**

| **Environment** | **Namespace** | **Purpose** | **Status** |
|-----------------|---------------|-------------|------------|
| **Alpha** | `freeleaps-alpha` | Development & Testing | ✅ Active |
| **Production** | `freeleaps-prod` | Live Production | ✅ Active |

### **🔧 Production Services**

```bash
# Your actual RabbitMQ services
kubectl get svc -n freeleaps-alpha | grep rabbitmq
kubectl get svc -n freeleaps-prod | grep rabbitmq

# Service details:
# - rabbitmq-headless: Internal cluster communication
# - rabbitmq: External access (if needed)
# - rabbitmq-management: Management UI access
```

---

## 🔧 **Current Configuration Analysis**

### **📋 Configuration Sources**

#### **1. Helm Chart Configuration**

```yaml
# Location: freeleaps-ops/freeleaps/helm-pkg/3rd/rabbitmq/
# Primary configuration files:
# - values.yaml (base configuration)
# - values.alpha.yaml (alpha environment overrides)
# - values.prod.yaml (production environment overrides)
```

#### **2. Reconciler Configuration**

```yaml
# Location: freeleaps-devops-reconciler/helm/freeleaps-devops-reconciler/values.yaml
rabbitmq:
  host: "rabbitmq-headless.freeleaps-alpha.svc.cluster.local"
  port: 5672
  username: "user"
  password: "NjlhHFvnDuC7K0ir"
  vhost: "/"
```

#### **3. Python Configuration**

```python
# Location: freeleaps-devops-reconciler/reconciler/config/config.py
import os

# Each setting falls back to a local-development default when the variable is unset
RABBITMQ_HOST = os.getenv('RABBITMQ_HOST', 'localhost')
RABBITMQ_PORT = int(os.getenv('RABBITMQ_PORT', '5672'))
RABBITMQ_USERNAME = os.getenv('RABBITMQ_USERNAME', 'guest')
RABBITMQ_PASSWORD = os.getenv('RABBITMQ_PASSWORD', 'guest')
```

### **🔍 Configuration Analysis**

#### **✅ What's Working Well**

1. **Helm-based deployment** - Consistent and repeatable
2. **Environment separation** - Alpha vs Production
3. **Clustering enabled** - High availability
4. **Management plugin** - Web UI available
5. **Resource limits** - Proper resource management

#### **⚠️ Issues Identified**

##### **1. Configuration Mismatch**

```yaml
# ❌ PROBLEM: Different image versions
# Helm chart: bitnami/rabbitmq:4.0.6-debian-12-r0
# Reconciler: rabbitmq:3.12-management-alpine

# ❌ PROBLEM: Different credentials
# Alpha: username: "user", password: "NjlhHFvnDuC7K0ir"
# Production: Different credentials (not shown in config)
```

##### **2. Security Concerns**

```yaml
# ❌ PROBLEM: Hardcoded passwords in values files
auth:
  username: user
  password: "NjlhHFvnDuC7K0ir"  # Should be in Kubernetes secrets
```

##### **3. Network Configuration**

```yaml
# ❌ PROBLEM: Inconsistent hostnames
# Reconciler uses: rabbitmq-headless.freeleaps-alpha.svc.cluster.local
# The alpha namespace is hardcoded; the hostname should be derived
# from the target namespace through service discovery
```

### **🎯 Recommended Improvements**

#### **1. Centralized Configuration**

```yaml
# Create a centralized RabbitMQ configuration
# Location: freeleaps-ops/config/rabbitmq/
# The ${...} placeholders must be substituted at deploy time; Helm does not expand them itself
rabbitmq-config:
  image:
    repository: bitnami/rabbitmq
    tag: "4.0.6-debian-12-r0"
  auth:
    username: ${RABBITMQ_USERNAME}
    password: ${RABBITMQ_PASSWORD}
  clustering:
    enabled: true
    name: "freeleaps-${ENVIRONMENT}"
```

#### **2. Secret Management**

```yaml
# Use Kubernetes secrets instead of hardcoded values
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq-credentials
  namespace: freeleaps-alpha
type: Opaque
data:
  username: dXNlcg==  # base64 encoded
  password: TmpsaEhGdm5EdUM3SzBpcg==  # base64 encoded
```

#### **3. Service Discovery**

```yaml
# Use consistent service discovery
# Instead of hardcoded hostnames, use:
RABBITMQ_HOST: "rabbitmq-headless.${NAMESPACE}.svc.cluster.local"
```

---

## 📊 **Management UI Guide**

### **🌐 Accessing the Management UI**

#### **Method 1: Port Forward (Recommended)**

```bash
# Port forward to RabbitMQ management UI
kubectl port-forward svc/rabbitmq-headless -n freeleaps-alpha 15672:15672

# Access: http://localhost:15672
# Username: user
# Password: NjlhHFvnDuC7K0ir
```

#### **Method 2: Ingress (If configured)**

```bash
# If you have ingress configured for RabbitMQ
# Access: https://rabbitmq.freeleaps.mathmast.com
```

### **📋 Management UI Features**

#### **1. Overview Dashboard**

- **Cluster status** and health indicators
- **Node information** and resource usage
- **Connection counts** and message rates
- **Queue depths** and performance metrics

#### **2. Queues Management**

```bash
# Your actual queues to monitor:
# - freeleaps.devops.reconciler.queue (heartbeat)
# - freeleaps.devops.reconciler.input (input messages)
# - freeleaps.devops.reconciler.output (output messages)

# Queue operations:
# - View queue details and metrics
# - Purge queues (remove all messages)
# - Delete queues (with safety confirmations)
# - Monitor message rates and consumer counts
```

#### **3. Exchanges Management**

```bash
# Your actual exchanges:
# - amq.default (default direct exchange)
# - amq.topic (topic exchange)
# - amq.fanout (fanout exchange)

# Exchange operations:
# - View exchange properties and bindings
# - Create new exchanges with custom types
# - Monitor message routing and performance
```

#### **4. Connections & Channels**

```bash
# Monitor your reconciler connections:
# - Check if reconciler is connected
# - Monitor connection health and performance
# - View channel details and limits
# - Force disconnect if needed
```

#### **5. Users & Permissions**

```bash
# Current user setup:
# - Username: user
# - Permissions: Full access to vhost "/"
# - Tags: management

# User management:
# - Create new users for different applications
# - Set up proper permissions and access control
# - Monitor user activity and connections
```

### **🔧 Practical UI Operations**

#### **Monitoring Your Reconciler**

```bash
# 1. Check if reconciler is connected
# Go to: Connections tab
# Look for: freeleaps-devops-reconciler connections

# 2. Monitor message flow
# Go to: Queues tab
# Check: freeleaps.devops.reconciler.* queues
# Monitor: Message rates and queue depths

# 3. Check cluster health
# Go to: Overview tab
# Monitor: Node status and resource usage
```

#### **Troubleshooting via UI**

```bash
# 1. Check for stuck messages
# Go to: Queues > freeleaps.devops.reconciler.input
# Look for: High message count or no consumers

# 2. Check connection issues
# Go to: Connections tab
# Look for: Disconnected or error states

# 3. Monitor resource usage
# Go to: Overview tab
# Check: Memory usage and disk space
```

---

## 🔍 **Production Monitoring**

### **📊 Key Metrics to Monitor**

#### **1. Cluster Health**

```bash
# Check cluster status
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl cluster_status

# Monitor node health
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmq-diagnostics check_running
```

#### **2. Queue Metrics**

```bash
# Check queue depths
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_queues name messages consumers

# Monitor message rates
# Use Management UI: Queues tab > Queue details > Message rates
```

#### **3. Connection Metrics**

```bash
# Check active connections
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_connections

# Monitor connection health
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_channels
```

#### **4. Resource Usage**

```bash
# Check memory usage
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl status

# Monitor disk usage
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- df -h
```

### **🚨 Alerting Setup**

#### **1. Queue Depth Alerts**

```yaml
# Alert when queue depth exceeds threshold
# Queue: freeleaps.devops.reconciler.input
# Threshold: > 100 messages
# Action: Send Slack notification
```

#### **2. Connection Loss Alerts**

```yaml
# Alert when reconciler disconnects
# Monitor: freeleaps-devops-reconciler connections
# Threshold: Connection count = 0
# Action: Page on-call engineer
```

#### **3. Resource Usage Alerts**

```yaml
# Alert when memory usage is high
# Threshold: Memory usage > 80%
# Action: Scale up or investigate
```

### **📈 Monitoring Dashboard**

#### **Grafana Dashboard**

```yaml
# Your existing RabbitMQ dashboard
# Location: freeleaps-ops/cluster/manifests/freeleaps-monitoring-system/kube-prometheus-stack/dashboards/rabbitmq.yaml
# Access: https://grafana.mathmast.com
# Dashboard: RabbitMQ Management Overview
```

#### **Key Dashboard Panels**

1. **Queue Depth** - Monitor message accumulation
2. **Message Rates** - Track throughput
3. **Connection Count** - Monitor client connections
4. **Memory Usage** - Track resource consumption
5. **Error Rates** - Monitor failures

---

## 🚨 **Troubleshooting Guide**

### **🔍 Common Issues & Solutions**

#### **1. Reconciler Connection Issues**

##### **Problem**: Reconciler can't connect to RabbitMQ

```bash
# Symptoms:
# - Reconciler logs show connection errors
# - No connections in RabbitMQ UI
# - Pods restarting due to connection failures

# Diagnosis:
kubectl logs -f deployment/freeleaps-devops-reconciler -n freeleaps-devops-system
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_connections

# Solutions:
# 1. Check network connectivity
kubectl exec -it deployment/freeleaps-devops-reconciler -n freeleaps-devops-system -- ping rabbitmq-headless.freeleaps-alpha.svc.cluster.local

# 2. Verify credentials
kubectl get secret rabbitmq-credentials -n freeleaps-alpha -o yaml

# 3. Check RabbitMQ status
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl status
```

#### **2. Queue Message Accumulation**

##### **Problem**: Messages stuck in queues

```bash
# Symptoms:
# - High message count in queues
# - No consumers processing messages
# - Increasing queue depth

# Diagnosis:
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_queues name messages consumers

# Solutions:
# 1. Check consumer health
kubectl logs -f deployment/freeleaps-devops-reconciler -n freeleaps-devops-system

# 2. Restart consumers
kubectl rollout restart deployment/freeleaps-devops-reconciler -n freeleaps-devops-system

# 3. Purge stuck messages (if safe)
# Via Management UI: Queues > Queue > Purge
```

#### **3. Memory Pressure**

##### **Problem**: RabbitMQ running out of memory

```bash
# Symptoms:
# - High memory usage
# - Slow performance
# - Connection drops

# Diagnosis:
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl status
kubectl top pods -n freeleaps-alpha | grep rabbitmq

# Solutions:
# 1. Increase memory limits
kubectl patch statefulset rabbitmq -n freeleaps-alpha -p '{"spec":{"template":{"spec":{"containers":[{"name":"rabbitmq","resources":{"limits":{"memory":"2Gi"}}}]}}}}'

# 2. Restart RabbitMQ
kubectl rollout restart statefulset/rabbitmq -n freeleaps-alpha

# 3. Check for memory leaks (per-queue memory usage)
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_queues name memory
```

#### **4. Cluster Issues**

##### **Problem**: RabbitMQ cluster not healthy

```bash
# Symptoms:
# - Nodes not in sync
# - Replication lag
# - Split-brain scenarios

# Diagnosis:
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl cluster_status
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmq-diagnostics check_running

# Solutions:
# 1. Check node connectivity
kubectl get pods -n freeleaps-alpha | grep rabbitmq

# 2. Restart problematic nodes
kubectl delete pod rabbitmq-0 -n freeleaps-alpha

# 3. Rejoin cluster if needed
# Run this on the node that must rejoin, between `rabbitmqctl stop_app` and `rabbitmqctl start_app`;
# the target node name must match the one shown by cluster_status
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl join_cluster rabbit@rabbitmq-0
```

### **🛠️ Debugging Commands**

#### **Essential Debugging Commands**

```bash
# Check RabbitMQ status
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl status

# List all queues
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_queues

# List all exchanges
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_exchanges

# List all bindings
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_bindings

# List all connections
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_connections

# List all channels
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_channels

# Check user permissions
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_users
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_user_permissions user
```

#### **Advanced Debugging**

```bash
# Check RabbitMQ logs
kubectl logs -f statefulset/rabbitmq -n freeleaps-alpha

# Check logs from the previous container instance
# (the container has no systemd journal, so journalctl is not available)
kubectl logs --previous rabbitmq-0 -n freeleaps-alpha

# Check network connectivity
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- netstat -tlnp

# Check disk usage
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- df -h

# Check memory usage
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- free -h
```

---

## ⚡ **Performance Optimization**

### **🎯 Performance Tuning**

#### **1. Memory Optimization**

```yaml
# Optimize memory settings
# Location: values.alpha.yaml
configuration: |-
  # Memory management
  vm_memory_high_watermark.relative = 0.6
  vm_memory_high_watermark_paging_ratio = 0.5

  # Message store
  msg_store_file_size_limit = 16777216
  msg_store_credit_disc_bound = 4000
```

#### **2. Disk Optimization**

```yaml
# Optimize disk settings
configuration: |-
  # Disk free space
  disk_free_limit.relative = 2.0

  # Queue master location
  queue_master_locator = min-masters

  # Consumer prefetch
  queue.default_consumer_prefetch = 50
```

#### **3. Network Optimization**

```yaml
# Optimize network settings
configuration: |-
  # TCP settings
  tcp_listen_options.backlog = 128
  tcp_listen_options.nodelay = true

  # Heartbeat
  heartbeat = 60

  # Connection limits
  max_connections = 1000
  max_connections_per_user = 100
```

### **📊 Performance Monitoring**

#### **Key Performance Indicators**

1. **Message Throughput** - Messages per second
2. **Latency** - Message processing time
3. **Queue Depth** - Messages waiting to be processed
4. **Memory Usage** - Heap and process memory
5. **Disk I/O** - Write and read operations

#### **Performance Benchmarks**

```bash
# Your expected performance:
# - Message rate: 1000+ messages/second
# - Latency: < 10ms for local messages
# - Queue depth: < 100 messages (normal operation)
# - Memory usage: < 80% of allocated memory
# - Disk usage: < 70% of allocated storage
```

---

## 🔒 **Security Best Practices**

### **🛡️ Current Security Analysis**

#### **✅ Security Strengths**

1. **Network isolation** - RabbitMQ runs in Kubernetes namespace
2. **Resource limits** - Memory and CPU limits set
3. **Non-root user** - Runs as non-root in container
4. **TLS support** - SSL/TLS configuration available

#### **⚠️ Security Weaknesses**

1. **Hardcoded passwords** - Passwords in YAML files
2. **Default permissions** - Overly permissive user access
3. **No audit logging** - Limited security event tracking
4. **No network policies** - No ingress/egress restrictions

### **🔧 Security Improvements**

#### **1. Secret Management**

```yaml
# Use Kubernetes secrets
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq-credentials
  namespace: freeleaps-alpha
type: Opaque
data:
  username: dXNlcg==  # base64 encoded
  password: <base64-encoded-password>  # base64 encoded
---
# Reference in Helm values
# (key names vary by chart version; check the chart's values.yaml)
auth:
  existingSecret: rabbitmq-credentials
  existingSecretPasswordKey: password
  existingSecretUsernameKey: username
```

#### **2. User Access Control**

```yaml
# Create application-specific users
# Instead of one user with full access:
# - freeleaps-reconciler (reconciler access only)
# - freeleaps-monitoring (read-only access)
# - freeleaps-admin (full access, limited to admins)
```

#### **3. Network Policies**

```yaml
# Restrict network access
# Caution: as written, Egress is listed without egress rules, so all outbound
# traffic from the RabbitMQ pods is denied, and the ingress rule does not admit
# peer RabbitMQ pods (epmd 4369, inter-node 25672). Add rules for DNS and
# cluster peers before applying this to a clustered deployment.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: rabbitmq-network-policy
  namespace: freeleaps-alpha
spec:
  podSelector:
    matchLabels:
      app: rabbitmq
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: freeleaps-devops-system
      ports:
        - protocol: TCP
          port: 5672
        - protocol: TCP
          port: 15672
```

#### **4. Audit Logging**

```yaml
# Enable audit logging
configuration: |-
  # Audit logging
  log.file.level = info
  log.file.rotation.date = $D0
  log.file.rotation.size = 10485760

  # Security events
  log.security = true
```

---

## 📈 **Scaling & High Availability**

### **🏗️ Current HA Setup**

#### **Cluster Configuration**

```yaml
# Your current clustering setup
clustering:
  enabled: true
  name: "freeleaps-alpha"
  addressType: hostname
  rebalance: false
  forceBoot: false
  partitionHandling: autoheal
```

#### **Replication Strategy**

```yaml
# Queue replication
# - Queues are replicated across cluster nodes
#   (on RabbitMQ 4.x only quorum queues and streams replicate;
#   classic queue mirroring was removed in 4.0)
# - Automatic failover if primary node fails
# - Data consistency maintained across cluster
```

### **🚀 Scaling Strategies**

#### **1. Horizontal Scaling**

```bash
# Scale RabbitMQ cluster
kubectl scale statefulset rabbitmq -n freeleaps-alpha --replicas=5

# Verify scaling
kubectl get pods -n freeleaps-alpha | grep rabbitmq
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl cluster_status
```

#### **2. Vertical Scaling**

```yaml
# Increase resource limits
resources:
  requests:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: 2000m
    memory: 4Gi
```

#### **3. Queue Partitioning**

```yaml
# Partition large queues across nodes
# Strategy: Hash-based partitioning
# Benefits: Better performance and fault tolerance
```

### **🔧 High Availability Best Practices**

#### **1. Node Distribution**

```yaml
# Ensure nodes are distributed across availability zones
# Use pod anti-affinity to prevent single points of failure
# (kubernetes.io/hostname spreads pods across nodes;
# use topology.kubernetes.io/zone to spread across zones)
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: app
              operator: In
              values:
                - rabbitmq
        topologyKey: kubernetes.io/hostname
```

#### **2. Data Replication**

```yaml
# Configure proper replication
# - All queues should have at least 2 replicas
# - Use quorum queues for critical data
# - Monitor replication lag
```

#### **3. Backup Strategy**

```bash
# Backup RabbitMQ definitions (users, vhosts, queues, exchanges, bindings, policies);
# messages are not included in a definitions export
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl export_definitions /tmp/rabbitmq-definitions.json

# Restore from backup
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl import_definitions /tmp/rabbitmq-definitions.json
```

---

## 🛠️ **Maintenance Procedures**

### **📅 Regular Maintenance Tasks**

#### **Daily Tasks**

```bash
# 1. Check cluster health
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl cluster_status

# 2. Monitor queue depths
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_queues name messages

# 3. Check connection count (no -t: a TTY is not needed when piping)
kubectl exec statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_connections | wc -l

# 4. Review error logs (RabbitMQ writes the level in lowercase, e.g. [error])
kubectl logs --tail=100 statefulset/rabbitmq -n freeleaps-alpha | grep -i error
```

#### **Weekly Tasks**

```bash
# 1. Review performance metrics
# Access Grafana dashboard: RabbitMQ Management Overview

# 2. Check disk usage
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- df -h

# 3. Review user permissions
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_users
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl list_user_permissions user

# 4. Backup configurations
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl export_definitions /tmp/weekly-backup-$(date +%Y%m%d).json
```

#### **Monthly Tasks**

```bash
# 1. Security audit
# Review user access and permissions
# Check for unused queues and exchanges
# Verify network policies

# 2. Performance review
# Analyze message rates and latency
# Review resource usage trends
# Optimize configurations

# 3. Capacity planning
# Project growth based on usage trends
# Plan for scaling if needed
# Review backup and disaster recovery procedures
```

### **🔧 Maintenance Scripts**

#### **Health Check Script**

```bash
#!/bin/bash
# scripts/rabbitmq-health-check.sh

NAMESPACE="freeleaps-alpha"
POD_NAME=$(kubectl get pods -n "$NAMESPACE" -l app=rabbitmq -o jsonpath='{.items[0].metadata.name}')

echo "🐰 RabbitMQ Health Check - $(date)"
echo "=================================="

# Check cluster status
echo "📊 Cluster Status:"
kubectl exec "$POD_NAME" -n "$NAMESPACE" -- rabbitmqctl cluster_status

# Check queue depths
echo "📋 Queue Depths:"
kubectl exec "$POD_NAME" -n "$NAMESPACE" -- rabbitmqctl list_queues name messages consumers

# Check connections
echo "🔗 Active Connections:"
kubectl exec "$POD_NAME" -n "$NAMESPACE" -- rabbitmqctl list_connections | wc -l

# Check resource usage
echo "💾 Resource Usage:"
kubectl top pods -n "$NAMESPACE" | grep rabbitmq
```

#### **Backup Script**

```bash
#!/bin/bash
# scripts/rabbitmq-backup.sh

NAMESPACE="freeleaps-alpha"
BACKUP_DIR="/tmp/rabbitmq-backups"
DATE=$(date +%Y%m%d_%H%M%S)
# kubectl cp needs a pod name (not deployment/...), so resolve one first
POD_NAME=$(kubectl get pods -n "$NAMESPACE" -l app=rabbitmq -o jsonpath='{.items[0].metadata.name}')

mkdir -p "$BACKUP_DIR"

echo "📦 Creating RabbitMQ backup..."

# Export definitions
kubectl exec "$POD_NAME" -n "$NAMESPACE" -- rabbitmqctl export_definitions "/tmp/rabbitmq-definitions-$DATE.json"

# Copy backup file
kubectl cp "$NAMESPACE/$POD_NAME:/tmp/rabbitmq-definitions-$DATE.json" "$BACKUP_DIR/rabbitmq-definitions-$DATE.json"

echo "✅ Backup created: $BACKUP_DIR/rabbitmq-definitions-$DATE.json"
```

### **🚨 Emergency Procedures**

#### **1. RabbitMQ Node Failure**

```bash
# If a RabbitMQ node fails:
# 1. Check node status
kubectl get pods -n freeleaps-alpha | grep rabbitmq
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmq-diagnostics check_running

# 2. Restart failed node
kubectl delete pod rabbitmq-1 -n freeleaps-alpha

# 3. Verify cluster health
kubectl exec -it statefulset/rabbitmq -n freeleaps-alpha -- rabbitmqctl cluster_status
```

#### **2. Data Loss Recovery**

```bash
# If data is lost:
# 1. Stop all consumers
kubectl scale deployment freeleaps-devops-reconciler -n freeleaps-devops-system --replicas=0

# 2. Restore from backup (kubectl cp needs a pod name; import on the same pod)
kubectl cp backup-file.json freeleaps-alpha/rabbitmq-0:/tmp/backup-file.json
kubectl exec -it rabbitmq-0 -n freeleaps-alpha -- rabbitmqctl import_definitions /tmp/backup-file.json

# 3. Restart consumers
kubectl scale deployment freeleaps-devops-reconciler -n freeleaps-devops-system --replicas=1
```

#### **3. Performance Emergency**

```bash
# If performance is severely degraded:
# 1. Check resource usage
kubectl top pods -n freeleaps-alpha | grep rabbitmq

# 2. Scale up resources
kubectl patch statefulset rabbitmq -n freeleaps-alpha -p '{"spec":{"template":{"spec":{"containers":[{"name":"rabbitmq","resources":{"limits":{"memory":"4Gi","cpu":"2000m"}}}]}}}}'

# 3. Restart RabbitMQ
kubectl rollout restart statefulset/rabbitmq -n freeleaps-alpha
```

---

## 🎯 **Summary & Next Steps**

### **📊 Current State Assessment**

#### **✅ Strengths**

1. **Production-ready setup** - Clustering, monitoring, management UI
2. **Helm-based deployment** - Consistent and repeatable
3. **Environment separation** - Alpha vs Production
4. **Integration working** - Reconciler successfully using RabbitMQ
5. **Monitoring available** - Grafana dashboards and metrics

#### **⚠️ Areas for Improvement**

1. **Security hardening** - Remove hardcoded passwords, implement secrets
2. **Configuration standardization** - Centralize configuration management
3. **Performance optimization** - Tune settings for your workload
4. **Documentation** - Create runbooks for common operations
5. **Automation** - Implement automated health checks and alerts

### **🚀 Recommended Actions**

#### **Immediate (This Week)**

1. **Implement secret management** - Move passwords to Kubernetes secrets
2. **Standardize configuration** - Create centralized RabbitMQ config
3. **Set up monitoring alerts** - Configure alerts for critical metrics
4. **Document procedures** - Create runbooks for common operations

#### **Short Term (Next Month)**

1. **Security audit** - Review and improve security posture
2. **Performance tuning** - Optimize settings based on usage patterns
3. **Automation** - Implement automated health checks and backups
4. **Training** - Train team on RabbitMQ management and troubleshooting

#### **Long Term (Next Quarter)**

1. **High availability** - Implement multi-zone deployment
2. **Disaster recovery** - Set up automated backup and recovery procedures
3. **Advanced monitoring** - Implement predictive analytics and alerting
4. **Capacity planning** - Plan for growth and scaling

### **📚 Additional Resources**

#### **Official Documentation**

- **[RabbitMQ Documentation](https://www.rabbitmq.com/documentation.html)** - Official guides
- **[RabbitMQ Management UI](https://www.rabbitmq.com/management.html)** - UI documentation
- **[RabbitMQ Clustering](https://www.rabbitmq.com/clustering.html)** - Cluster setup

#### **Community Resources**

- **[RabbitMQ Slack](https://rabbitmq-slack.herokuapp.com/)** - Community support
- **[RabbitMQ GitHub](https://github.com/rabbitmq/rabbitmq-server)** - Source code
- **[RabbitMQ Blog](https://blog.rabbitmq.com/)** - Latest updates and tips

#### **Books & Courses**

- **"RabbitMQ in Depth"** by Gavin M. Roy
- **"RabbitMQ Essentials"** by Lovisa Johansson
- **RabbitMQ Tutorials** - Official tutorial series

---

**🎉 You now have a comprehensive understanding of your RabbitMQ production environment! Use this guide to maintain, monitor, and optimize your message broker infrastructure.**

---

*Last updated: $(date)*
*Maintained by: FreeLeaps DevOps Team*