Disaster Recovery Policy

Created by Andy Robinson, Modified on Fri, 11 Oct at 2:48 PM by Andy Robinson

The full DR policy and drills are tracked under User Story 210382: Add a system for testing disaster recovery for Azure client production data. Once that work is complete, this article will be updated with the policy that is in place.


Azure SQL Databases

Backups

Automated Backups: Azure SQL Database takes automatic backups by default (no action required)

Point-in-Time Restore Verification: Test restoring backups to verify data integrity and recovery processes (to split SQL pool)

Frequency: Restore verification (monthly)

Reporting: Store results for distribution to clients on request
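
One way to turn a restore test into a storable result is to compare per-table row counts recorded at the restore point with the counts found in the restored copy. The sketch below assumes those counts have already been collected; the RestoreCheck class, the verify_restore function and the table names are hypothetical and not part of any existing tooling.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RestoreCheck:
    """Outcome of one restore verification run (hypothetical structure)."""
    database: str
    mismatches: List[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # The run passes only when no table differs.
        return not self.mismatches


def verify_restore(database: str,
                   expected: Dict[str, int],
                   restored: Dict[str, int]) -> RestoreCheck:
    """Compare row counts recorded at the restore point with the restored copy."""
    check = RestoreCheck(database)
    # Walk the union of table names so missing or extra tables are reported too.
    for table in sorted(expected.keys() | restored.keys()):
        want = expected.get(table)
        got = restored.get(table)
        if want != got:
            check.mismatches.append(f"{table}: expected {want}, restored {got}")
    return check
```

A run with identical counts yields passed == True; any difference is listed in mismatches, which can be stored with the monthly result.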


Performance Monitoring and Tuning

Performance Monitoring: Use Azure SQL Analytics, Query Performance Insight, or other monitoring tools to track performance metrics

Index Maintenance: Rebuild indexes

Statistics: Update statistics

Query Optimisation: Identify and optimise long-running queries

Frequency: Update Stats (daily), Index maintenance (weekly), Query optimisation (monthly)

Reporting: Report slow-running queries to the C8 team
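
The index and statistics tasks above could be scripted by generating the T-SQL per table. This is a minimal sketch, not the team's actual job: the helper names and the dbo.Orders example are assumptions, and it rebuilds all indexes on a table rather than choosing by fragmentation level.

```python
def quote_name(name: str) -> str:
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def index_rebuild_sql(schema: str, table: str) -> str:
    """Weekly task: rebuild every index on one table."""
    return f"ALTER INDEX ALL ON {quote_name(schema)}.{quote_name(table)} REBUILD;"


def update_stats_sql(schema: str, table: str) -> str:
    """Daily task: refresh the optimiser statistics on one table."""
    return f"UPDATE STATISTICS {quote_name(schema)}.{quote_name(table)};"


if __name__ == "__main__":
    # Hypothetical table used only for illustration.
    print(index_rebuild_sql("dbo", "Orders"))
    print(update_stats_sql("dbo", "Orders"))
```

Quoting identifiers in one helper keeps unusual table names from breaking the generated statements.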


Security Management

TBC


Database Maintenance

Integrity Checks: Run DBCC CHECKDB

Update Statistics: Ensure statistics are updated to maintain query performance.

Frequency: Integrity checks (weekly), Update statistics (weekly).
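
As an illustration, the weekly maintenance run could be expressed as a short batch of T-SQL statements. The sketch assumes the batch is executed while connected to the database being checked; the function name and the example database name are hypothetical.

```python
from typing import List


def weekly_maintenance_batch(database: str) -> List[str]:
    """Statements for the weekly integrity check and statistics update."""
    # Bracket-quote the database name, doubling any closing bracket.
    quoted = "[" + database.replace("]", "]]") + "]"
    return [
        # Suppress informational messages but report every error found.
        f"DBCC CHECKDB ({quoted}) WITH NO_INFOMSGS, ALL_ERRORMSGS;",
        # sp_updatestats updates statistics for the tables in the current database.
        "EXEC sp_updatestats;",
    ]


if __name__ == "__main__":
    for statement in weekly_maintenance_batch("ClientProd"):
        print(statement)
```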


Disaster Recovery Planning

DR Drills: Conduct disaster recovery drills to test failover and recovery procedures

Review DR Plan: Review and update the disaster recovery plan based on drill outcomes

Frequency: DR drills (annually), DR plan review (annually).


Azure Storage Accounts

Replication: Data is replicated across different geographical locations

Frequency: One-time setup with periodic review (annually)


Regular Backups

Backup Strategy: Recovery point strategy covering 31 days

Frequency: As per RPO (Recovery Point Objective) requirements
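
To make the 31-day window concrete, the sketch below checks whether a requested restore time falls inside it. It assumes the window is measured back from the current time in UTC; the function names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 31  # recovery point window stated in the policy


def earliest_restore_point(now: datetime,
                           retention_days: int = RETENTION_DAYS) -> datetime:
    """Oldest point in time that is still inside the retention window."""
    return now - timedelta(days=retention_days)


def is_restorable(requested: datetime, now: datetime,
                  retention_days: int = RETENTION_DAYS) -> bool:
    """True when the requested time is within the window and not in the future."""
    return earliest_restore_point(now, retention_days) <= requested <= now


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(is_restorable(now - timedelta(days=10), now))
```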


Monitoring and Alerts

Metrics and Logs: Enable and review metrics and logs for storage accounts to monitor usage, performance, and detect anomalies

Alerts: Set up alerts for critical metrics and events (e.g., storage capacity, transaction rates)

Frequency: Review of alerts and logs (weekly)


Data Integrity Checks

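Azure Blob Storage: Use Azure Blob Storage lifecycle management policies to automate tiering and retention. Lifecycle policies do not themselves verify data integrity, so a separate check, such as comparing checksums, is needed to confirm it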

Frequency: As per policy schedule (weekly)
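
Integrity can also be checked directly by comparing checksums. The sketch below assumes each blob was uploaded with a Content-MD5 property, which holds the base64-encoded MD5 digest of the content; how the blob bytes and the stored value are fetched is left out, and the function names are hypothetical.

```python
import base64
import hashlib


def content_md5(data: bytes) -> str:
    """Base64-encoded MD5 digest, the form used by the Content-MD5 header."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def matches_stored_md5(data: bytes, stored_md5: str) -> bool:
    """True when the downloaded bytes hash to the value stored with the blob."""
    return content_md5(data) == stored_md5


if __name__ == "__main__":
    payload = b"example blob content"
    stored = content_md5(payload)  # stands in for the value read from the blob
    print(matches_stored_md5(payload, stored))
```

A mismatch indicates the content differs from what was uploaded and should be recorded with the weekly results.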


Disaster Recovery Drills

Failover Testing: Conduct failover testing to ensure that data can be successfully replicated and accessed from the secondary region

Recovery Procedures: Document and test the recovery procedures to ensure they are effective and up-to-date.

Frequency: Drill (annually)


Geo-Replication Testing

Read-Access Geo-Redundant Storage (RA-GRS): Regularly test accessing data from the secondary region in read-only mode to ensure it is available

Frequency: Test data access (quarterly)
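
With RA-GRS, the read-only secondary endpoint uses the account name with a "-secondary" suffix. The sketch below only builds that blob endpoint URL for use in a quarterly access test; the account name shown is made up, and the actual read request and authentication are not covered.

```python
import re

# Storage account names are 3 to 24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def secondary_blob_endpoint(account: str) -> str:
    """Return the read-only secondary blob endpoint for an RA-GRS account."""
    if not _ACCOUNT_NAME.fullmatch(account):
        raise ValueError(f"invalid storage account name: {account!r}")
    return f"https://{account}-secondary.blob.core.windows.net"


if __name__ == "__main__":
    # "examplestore" is a hypothetical account name.
    print(secondary_blob_endpoint("examplestore"))
```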


Reporting Back to Clients

Frequency: Report of checks and results of all of the above (quarterly)
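
For the quarterly report, it may help to list which checks are overdue against the frequencies in this article. This is a sketch under assumptions: the day counts per frequency are approximations chosen here, and the check names and the overdue_checks function are illustrative.

```python
from datetime import date, timedelta
from typing import Dict, List

# Approximate day counts for the policy's frequencies (assumed values).
INTERVAL_DAYS = {"daily": 1, "weekly": 7, "monthly": 30,
                 "quarterly": 91, "annually": 365}


def overdue_checks(last_run: Dict[str, date],
                   frequency: Dict[str, str],
                   today: date) -> List[str]:
    """Names of checks that have never run or are past their interval."""
    overdue = []
    for check, freq in frequency.items():
        last = last_run.get(check)
        limit = timedelta(days=INTERVAL_DAYS[freq])
        if last is None or today - last > limit:
            overdue.append(check)
    return sorted(overdue)


if __name__ == "__main__":
    frequency = {"restore verification": "monthly", "DR drill": "annually"}
    last_run = {"restore verification": date(2024, 9, 1)}
    print(overdue_checks(last_run, frequency, date(2024, 10, 11)))
```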
