Freeleaps PVC Backup Job

This job creates daily snapshots of critical PVCs in the Freeleaps production environment using the volume snapshot feature of the Azure Disk CSI driver.

Overview

The backup job runs daily at 00:00 PST (Pacific Standard Time, UTC-8) and creates snapshots of the following PVCs:

  • gitea-shared-storage in namespace freeleaps-prod
  • data-freeleaps-prod-gitea-postgresql-ha-postgresql-0 in namespace freeleaps-prod

Components

  • backup_script.py: Python script that creates snapshots and monitors their status
  • Dockerfile: Container image definition
  • build.sh: Script to build the Docker image
  • deploy-argocd.sh: Script to deploy via ArgoCD
  • helm-pkg/: Helm Chart for Kubernetes deployment
  • argo-app/: ArgoCD Application configuration

Features

  • Creates snapshots with timestamp-based naming (YYYYMMDD format)
  • Uses PST timezone for snapshot naming
  • Monitors snapshot status until ready
  • Comprehensive logging to console
  • Error handling and retry logic
  • RBAC permissions for secure operation
  • Resource limits and security context
  • Concurrency control (prevents overlapping jobs)
  • Helm Chart for flexible configuration
  • ArgoCD integration for GitOps deployment
  • Incremental snapshots for cost efficiency

Building and Deployment

Option 1: ArgoCD Deployment

1. Build and Push Docker Image

# Make build script executable
chmod +x build.sh

# Build the image
./build.sh

# Push to registry
docker push freeleaps-registry.azurecr.io/freeleaps-pvc-backup:latest

2. Deploy via ArgoCD

# Deploy ArgoCD Application
./deploy-argocd.sh

3. Monitor in ArgoCD

# Check ArgoCD application status
kubectl get applications -n freeleaps-devops-system

# Access ArgoCD UI
kubectl port-forward svc/argocd-server -n freeleaps-devops-system 8080:443

Then visit https://localhost:8080 in your browser.

Option 2: Direct Helm Deployment

1. Build and Push Docker Image

# Build the image
./build.sh

# Push to registry
docker push freeleaps-registry.azurecr.io/freeleaps-pvc-backup:latest

2. Deploy with Helm

# Deploy using Helm Chart
helm install freeleaps-data-backup ./helm-pkg/freeleaps-data-backup \
  --values helm-pkg/freeleaps-data-backup/values.prod.yaml \
  --namespace freeleaps-prod \
  --create-namespace

Monitoring

Check CronJob Status

kubectl get cronjobs -n freeleaps-prod

Check Job History

kubectl get jobs -n freeleaps-prod

View Job Logs

# Get the latest job name
kubectl get jobs -n freeleaps-prod --sort-by=.metadata.creationTimestamp

# View logs
kubectl logs -n freeleaps-prod job/freeleaps-data-backup-<timestamp>

Check Snapshots

kubectl get volumesnapshots -n freeleaps-prod

Configuration

Schedule

The job runs daily at 00:00 PST. To modify the schedule, edit the cronjob.schedule field in helm-pkg/freeleaps-data-backup/values.prod.yaml:

cronjob:
  schedule: "0 8 * * *"  # 08:00 UTC = 00:00 PST (01:00 PDT during daylight saving time)
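
As a quick check of the UTC-to-Pacific conversion in the comment above, the following sketch converts a daily cron hour using fixed offsets. The function name and the fixed-offset zones are illustrative assumptions and are not part of the chart. It also shows that the same UTC schedule fires one hour later on the local clock while daylight saving time (PDT, UTC-7) is in effect.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for illustration only (assumption: no tz database lookup).
PST = timezone(timedelta(hours=-8), "PST")
PDT = timezone(timedelta(hours=-7), "PDT")


def cron_hour_in_zone(utc_hour: int, zone: timezone) -> int:
    """Return the local hour at which a daily UTC cron hour fires."""
    # The date is arbitrary; only the hour arithmetic matters here.
    fire_time = datetime(2025, 1, 1, utc_hour, 0, tzinfo=timezone.utc)
    return fire_time.astimezone(zone).hour


print(cron_hour_in_zone(8, PST))  # local hour under standard time
print(cron_hour_in_zone(8, PDT))  # local hour under daylight time
```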

PVCs to Backup

To add or remove PVCs, modify the backup.pvcs list in helm-pkg/freeleaps-data-backup/values.prod.yaml:

backup:
  pvcs:
    - "gitea-shared-storage"
    - "data-freeleaps-prod-gitea-postgresql-ha-postgresql-0"
    # Add more PVCs here

Snapshot Class

The job uses the csi-azuredisk-vsc snapshot class with incremental snapshots enabled. This can be modified in helm-pkg/freeleaps-data-backup/values.prod.yaml:

backup:
  snapshotClass: "csi-azuredisk-vsc"
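
To show where the snapshot class ends up, here is a minimal sketch of the VolumeSnapshot object that a job like this would need to create for one PVC. The helper name build_snapshot_manifest is hypothetical, and whether backup_script.py builds the object this way is an assumption; the field names follow the snapshot.storage.k8s.io/v1 API.

```python
from typing import Any, Dict


def build_snapshot_manifest(
    pvc_name: str,
    snapshot_name: str,
    namespace: str = "freeleaps-prod",
    snapshot_class: str = "csi-azuredisk-vsc",
) -> Dict[str, Any]:
    """Build a VolumeSnapshot manifest as a plain dict (hypothetical helper)."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": snapshot_name, "namespace": namespace},
        "spec": {
            # Selects the driver and its parameters for this snapshot.
            "volumeSnapshotClassName": snapshot_class,
            # The PVC must live in the same namespace as the snapshot.
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


manifest = build_snapshot_manifest(
    "gitea-shared-storage", "gitea-shared-storage-snapshot-20250805"
)
print(manifest["spec"])
```

One way to submit such a dict is the create_namespaced_custom_object method of CustomObjectsApi in the Kubernetes Python client, with group snapshot.storage.k8s.io, version v1, and plural volumesnapshots.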

Resource Limits

Resource limits can be configured in helm-pkg/freeleaps-data-backup/values.prod.yaml:

resources:
  requests:
    memory: "256Mi"
    cpu: "200m"
  limits:
    memory: "512Mi"
    cpu: "500m"

How It Works

Snapshot Naming

Snapshots are named using the format: {PVC_NAME}-snapshot-{YYYYMMDD}

Examples:

  • gitea-shared-storage-snapshot-20250805
  • data-freeleaps-prod-gitea-postgresql-ha-postgresql-0-snapshot-20250805
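
The naming rule can be sketched as follows. The function name snapshot_name and the fixed UTC-8 offset are assumptions made for illustration; the actual script may resolve the timezone differently (for example through a tz database entry that observes daylight saving time).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Fixed UTC-8 offset standing in for "PST" (assumption for this sketch).
PST = timezone(timedelta(hours=-8), "PST")


def snapshot_name(pvc_name: str, now: Optional[datetime] = None) -> str:
    """Return {PVC_NAME}-snapshot-{YYYYMMDD}, with the date taken in PST.

    `now` should be timezone-aware; it defaults to the current UTC time.
    """
    current = now if now is not None else datetime.now(timezone.utc)
    return f"{pvc_name}-snapshot-{current.astimezone(PST):%Y%m%d}"


# 08:00 UTC on 2025-08-05 is 00:00 PST on the same calendar day.
run_time = datetime(2025, 8, 5, 8, 0, tzinfo=timezone.utc)
print(snapshot_name("gitea-shared-storage", run_time))
```

Because the date is computed in Pacific time, a run shortly before 08:00 UTC still belongs to the previous Pacific calendar day.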

Processing Flow

  1. PVC Verification: Each PVC is verified to exist before processing
  2. Snapshot Creation: Individual snapshots are created for each PVC
  3. Status Monitoring: Each snapshot is monitored until ready
  4. Independent Processing: PVCs are processed independently (one failure doesn't affect others)
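
The four steps above can be sketched as a single loop. This is not the code in backup_script.py: the function backup_pvcs, its parameters, and the result strings are hypothetical, and the Kubernetes calls are passed in as plain callables so the control flow can be read (and tested) without a cluster.

```python
import time
from typing import Callable, Dict, Iterable


def backup_pvcs(
    pvcs: Iterable[str],
    pvc_exists: Callable[[str], bool],
    create_snapshot: Callable[[str], str],
    is_ready: Callable[[str], bool],
    timeout_s: float = 600.0,
    poll_s: float = 10.0,
) -> Dict[str, str]:
    """Process each PVC independently and return a per-PVC status string."""
    results: Dict[str, str] = {}
    for pvc in pvcs:
        try:
            # Step 1: verify the PVC exists before doing anything else.
            if not pvc_exists(pvc):
                results[pvc] = "pvc-not-found"
                continue
            # Step 2: create the snapshot; the callable returns its name.
            snapshot = create_snapshot(pvc)
            # Step 3: poll until the snapshot is ready or the timeout passes.
            deadline = time.monotonic() + timeout_s
            while not is_ready(snapshot):
                if time.monotonic() >= deadline:
                    raise TimeoutError(f"{snapshot} not ready in time")
                time.sleep(poll_s)
            results[pvc] = "ready"
        except Exception as exc:
            # Step 4: record the failure and continue with the next PVC.
            results[pvc] = f"failed: {exc}"
    return results


# Usage with stand-in callables instead of real Kubernetes API calls.
print(backup_pvcs(
    ["gitea-shared-storage", "missing-pvc"],
    pvc_exists=lambda name: name != "missing-pvc",
    create_snapshot=lambda name: f"{name}-snapshot-20250805",
    is_ready=lambda snapshot: True,
))
```

Catching the exception inside the loop, rather than around it, is what keeps one failed PVC from aborting the rest.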

Incremental Snapshots

The job uses Azure Disk CSI incremental snapshots, which:

  • Save storage costs by only storing changed data blocks
  • Complete faster than full snapshots
  • Maintain full recovery capability

Troubleshooting

Common Issues

  1. Permission Denied: Ensure the job's service account has RBAC permissions to read PVCs and to create and read VolumeSnapshots
  2. PVC Not Found: Verify PVC names and namespace
  3. Snapshot Creation Failed: Check Azure Disk CSI driver status
  4. Job Timeout: Increase timeout in the values file if needed

Debug Mode

To run the script locally for testing:

# Install dependencies
pip install -r requirements.txt

# Run with local kubeconfig
python3 backup_script.py

Security

  • The job runs with minimal required permissions
  • Non-root user execution
  • Dropped capabilities
  • Resource limits enforced
  • No privileged access

Maintenance

Cleanup Old Snapshots

Old snapshots can be cleaned up manually:

# List all snapshots
kubectl get volumesnapshots -n freeleaps-prod

# Delete specific snapshot
kubectl delete volumesnapshot <snapshot-name> -n freeleaps-prod

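# Delete snapshots older than 30 days (example; requires jq, GNU date, and GNU xargs)
CUTOFF=$(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%SZ)
kubectl get volumesnapshots -n freeleaps-prod -o json \
  | jq -r --arg cutoff "$CUTOFF" '.items[] | select(.metadata.creationTimestamp < $cutoff) | .metadata.name' \
  | xargs -r kubectl delete volumesnapshot -n freeleaps-prod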
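
The same age filter can be expressed in Python, which makes the cutoff calculation explicit and easy to dry-run before deleting anything. This is a sketch under assumptions: the function name snapshots_older_than is hypothetical, and the input is assumed to be the items list from kubectl get volumesnapshots -n freeleaps-prod -o json.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Optional


def snapshots_older_than(
    items: List[Dict[str, Any]],
    days: int = 30,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the names of snapshots created more than `days` days ago."""
    current = now or datetime.now(timezone.utc)
    cutoff = current - timedelta(days=days)
    old: List[str] = []
    for item in items:
        meta = item["metadata"]
        # Kubernetes serializes creationTimestamp in UTC, e.g. 2025-08-05T08:00:12Z.
        created = datetime.strptime(
            meta["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if created < cutoff:
            old.append(meta["name"])
    return old


# Sample data shaped like the kubectl JSON output (values are made up).
sample = [
    {"metadata": {"name": "gitea-shared-storage-snapshot-20250701",
                  "creationTimestamp": "2025-07-01T08:00:05Z"}},
    {"metadata": {"name": "gitea-shared-storage-snapshot-20250801",
                  "creationTimestamp": "2025-08-01T08:00:07Z"}},
]
print(snapshots_older_than(sample, days=30,
                           now=datetime(2025, 8, 5, tzinfo=timezone.utc)))
```

Printing the selected names first and deleting in a separate step gives a chance to review the list.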

Updating Configuration

To update the backup configuration:

  1. Modify the appropriate values file in helm-pkg/freeleaps-data-backup/
  2. Commit and push changes to the repository
  3. ArgoCD will automatically sync the changes
  4. Alternatively, upgrade manually with Helm: helm upgrade freeleaps-data-backup ./helm-pkg/freeleaps-data-backup --values helm-pkg/freeleaps-data-backup/values.prod.yaml --namespace freeleaps-prod

Backup Data

What Gets Backed Up

  • gitea-shared-storage: Gitea repository data, attachments, and configuration
  • data-freeleaps-prod-gitea-postgresql-ha-postgresql-0: PostgreSQL database data

Recovery

To restore from a snapshot:

# Create a PVC from snapshot
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-pvc
  namespace: freeleaps-prod
spec:
  dataSource:
    name: <snapshot-name>
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi  # Must be at least the size of the snapshot's source volume
EOF