Backup & Recovery SOP

Version: 1.0
Last Updated: 2024-01-01
Owner: DevOps Team

Purpose

Ensure critical data is regularly backed up and can be restored in a timely manner.

Scope

Databases, configuration files, persistent volumes, and infrastructure state.

Backup Schedule

Asset              Frequency   Retention   Method
PostgreSQL         Daily       30 days     pg_dump → S3
MongoDB            Daily       14 days     mongodump → S3
Application files  Daily       7 days      tar → S3
Terraform state    On change   90 days     S3 versioning
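
The schedule names mongodump and tar as the methods for MongoDB and application files, while the procedure in this SOP only shows PostgreSQL. A minimal sketch of the other two follows. The function names, the MONGO_URI environment variable, and any S3 prefix other than s3://backups/postgres/ are assumptions for illustration and are not defined by this SOP.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Archive a directory into a gzip-compressed tarball.
# Paths inside the archive are relative to the source directory.
backup_app_files() {
  local src_dir="$1" out_file="$2"
  tar -czf "$out_file" -C "$src_dir" .
}

# Dump a MongoDB deployment into a single gzip-compressed archive file.
# MONGO_URI is an assumed environment variable holding the connection string.
backup_mongodb() {
  local out_file="$1"
  mongodump --uri="$MONGO_URI" --archive="$out_file" --gzip
}

# Upload a finished backup file to an S3 prefix ending in "/".
# The prefix is passed in by the caller; this SOP only names s3://backups/postgres/.
upload_backup() {
  local file="$1" s3_prefix="$2"
  aws s3 cp "$file" "$s3_prefix"
}
```

A call might look like backup_app_files <app-dir> "files-$(date +%Y%m%d).tar.gz" followed by upload_backup with the matching prefix. The retention periods in the table are not enforced by these commands and would need a separate mechanism, such as S3 lifecycle rules.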

Procedure

Database Backup

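# PostgreSQL backup (custom format, restorable with pg_restore)
# Compute the file name once so the dump and upload steps agree, even across midnight
BACKUP_FILE="backup-$(date +%Y%m%d).dump"
pg_dump -h <host> -U <user> -d <database> -Fc -f "$BACKUP_FILE"

# Upload to S3
aws s3 cp "$BACKUP_FILE" s3://backups/postgres/

# Verify the uploaded object exists (exits non-zero if it is missing)
aws s3 ls "s3://backups/postgres/$BACKUP_FILE"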
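
For scheduled runs, the three steps above can be wrapped so that a failed dump is never uploaded. This is a sketch under stated assumptions: the function names are hypothetical, and authentication for pg_dump and the AWS CLI is assumed to be configured already (for example through ~/.pgpass and an instance role).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build the dump file name for a given date stamp (YYYYMMDD).
backup_filename() {
  printf 'backup-%s.dump\n' "$1"
}

# Dump, upload, and verify one database.
# When called as a plain command, `set -e` stops at the first failing step.
run_pg_backup() {
  local host="$1" user="$2" database="$3"
  local stamp file
  stamp="$(date +%Y%m%d)"              # computed once, reused for every step
  file="$(backup_filename "$stamp")"

  pg_dump -h "$host" -U "$user" -d "$database" -Fc -f "$file"

  # Refuse to upload an empty dump.
  [ -s "$file" ] || { echo "empty dump: $file" >&2; return 1; }

  aws s3 cp "$file" "s3://backups/postgres/$file"
  # Listing the exact key fails if the object did not arrive.
  aws s3 ls "s3://backups/postgres/$file" >/dev/null
  echo "backup uploaded: s3://backups/postgres/$file"
}
```

A scheduler entry would then call run_pg_backup <host> <user> <database> and alert on a non-zero exit status.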

Recovery Procedure

Database Restore

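# Download backup from S3 (replace the date with the backup to restore)
aws s3 cp s3://backups/postgres/backup-20240101.dump .

# Restore PostgreSQL
# -c drops existing objects before recreating them; --if-exists avoids errors for objects that are not present
pg_restore -h <host> -U <user> -d <database> -c --if-exists backup-20240101.dump

# Verify data (replace users with a table that matters for the application)
psql -h <host> -U <user> -d <database> -c "SELECT count(*) FROM users;"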
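
The restore example hard-codes a backup date. One way to pick the most recent backup instead is to sort the bucket listing by name, which works because the backup-YYYYMMDD.dump pattern sorts chronologically. The function names below are hypothetical; this is a sketch, not part of the established procedure.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Read `aws s3 ls` output on stdin and print the newest backup file name.
# Lines that do not end in a backup-YYYYMMDD.dump name are ignored.
# Returns non-zero if no backup is listed.
latest_backup() {
  awk '{ print $NF }' | grep -E '^backup-[0-9]{8}\.dump$' | sort | tail -n 1
}

# Download the newest PostgreSQL backup into the current directory
# and print its file name.
fetch_latest_backup() {
  local name
  name="$(aws s3 ls s3://backups/postgres/ | latest_backup)"
  # --only-show-errors keeps transfer progress out of stdout.
  aws s3 cp --only-show-errors "s3://backups/postgres/$name" .
  echo "$name"
}
```

The printed name can be passed straight to pg_restore in place of backup-20240101.dump.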

Recovery Testing

Test Schedule

Full recovery drills are performed quarterly. Database restores are tested monthly.

Test               Frequency     Responsible
Database restore   Monthly       DB Admin
Full file restore  Quarterly     DevOps
DR failover        Bi-annually   DevOps + Infra
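
A monthly database restore test could be run against a throwaway database so that production data is never overwritten. The sketch below assumes the caller may create and drop databases on the target host, reuses the users table from the restore example as the sanity check, and uses hypothetical function and database names.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Succeeds only if the argument is a positive integer row count.
row_count_ok() {
  local count="$1"
  [[ "$count" =~ ^[0-9]+$ ]] && (( 10#$count > 0 ))
}

# Restore a dump into a scratch database, run a sanity query, then drop it.
restore_drill() {
  local host="$1" user="$2" dump_file="$3"
  local scratch_db count
  scratch_db="restore_drill_$(date +%Y%m%d)"   # hypothetical naming scheme

  createdb -h "$host" -U "$user" "$scratch_db"
  pg_restore -h "$host" -U "$user" -d "$scratch_db" "$dump_file"

  # -At prints just the value, with no header or alignment padding.
  count="$(psql -h "$host" -U "$user" -d "$scratch_db" -At \
    -c 'SELECT count(*) FROM users;')"
  dropdb -h "$host" -U "$user" "$scratch_db"

  row_count_ok "$count" || { echo "drill failed: users has $count rows" >&2; return 1; }
  echo "drill passed: users has $count rows"
}
```

If pg_restore fails, the script stops before dropdb, which leaves the scratch database in place for inspection; it then has to be dropped by hand.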

Verification

  • Backup completed successfully
  • Backup file size is consistent with recent backups (a sharp drop suggests a truncated or empty dump)
  • Restore tested within SLA window
  • Monitoring of backup jobs is active
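
The file size check in the list above can be automated by comparing the newest backup with the previous one. The 50% default threshold and the function name are assumptions for illustration; a suitable threshold depends on how much the data normally changes from day to day.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Succeeds if the current backup is non-empty and at least min_percent
# of the previous backup's size. Sizes are in bytes.
size_is_reasonable() {
  local current_bytes="$1" previous_bytes="$2" min_percent="${3:-50}"
  (( current_bytes > 0 )) || return 1
  (( current_bytes * 100 >= previous_bytes * min_percent ))
}
```

The byte sizes can be read from the third column of the aws s3 ls output for the two most recent objects. A failing check is a reason to inspect the backup, not proof that it is corrupt.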