Disaster Recovery Part One: The Foundation of Your Disaster Recovery Plan
This is the first of a three-part series about my experiences with a Disaster Recovery (DR) project for a long-time client that was entering a new line of business. Part one provides some background on Disaster Recovery and where it fits into an enterprise application ecosystem. Part two, "Minimizing RPO by Replicating with Oracle Dataguard," will discuss how and why Oracle Data Guard was used to provide a robust database replica that stayed less than 5 minutes behind the production site. Part three will detail the reasons for a shift from Data Guard to Oracle GoldenGate.
The focus of this article is to provide a bit of background on what Disaster Recovery is and what is reasonable to expect from your Disaster Recovery solution. I have been involved in dozens of DR projects, mostly revolving around the database or storage part of the solution. Many of those projects were driven by the IT department as a way to safeguard the assets it is responsible for, usually with input from the business to define what is required. Much less common is the DR project that is driven by the business as part of a new application roll-out.
RTO – Recovery Time Objective
Recovery Time Objective is loosely defined as the maximum acceptable amount of time between an outage and the system being generally available again. This includes the time it takes to find out something is wrong, for IT and the business to meet and discuss the issues, and for IT to attempt to resolve the issues before declaring a failover, as well as the failover itself. The most common RTO our team has seen companies establish is 2 hours. Less commonly, several of our clients require high availability and need a 15-minute RTO.
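To make the arithmetic concrete, here is a minimal sketch of an RTO check. It simply adds up the phases described above and compares the total with the objective. The phase names and durations are hypothetical, chosen only for illustration, and the failover phase is included on the assumption that the clock keeps running until the system is available again.

```python
from datetime import timedelta

# Hypothetical phase durations for one outage (illustrative figures only).
phases = {
    "detect": timedelta(minutes=10),       # finding out something is wrong
    "discuss": timedelta(minutes=20),      # IT and the business meet
    "attempt_fix": timedelta(minutes=45),  # IT tries to resolve in place
    "failover": timedelta(minutes=30),     # declaring and executing the failover
}


def meets_rto(phases, rto):
    """Return (total elapsed time, True if the total is within the RTO)."""
    total = sum(phases.values(), timedelta())
    return total, total <= rto


total, within_two_hours = meets_rto(phases, timedelta(hours=2))
_, within_fifteen_minutes = meets_rto(phases, timedelta(minutes=15))
print(total, within_two_hours, within_fifteen_minutes)
```

With these made-up figures the outage fits a 2-hour RTO but not a 15-minute one, which is why a tight RTO usually leaves little room for meetings or repair attempts before failing over.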
RPO – Recovery Point Objective
Recovery Point Objective is loosely defined as the amount of data that can be ‘lost in transit’ when the application loses serviceability, usually expressed as a window of time. We typically see this number in the 15 to 30-minute range, though in practice the solution is often managed so that less than one minute of data is lost. Once we are armed with the RTO and RPO, we can proceed to design solutions that will meet these requirements.
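As a rough illustration, the RPO check can be sketched as a comparison between the time of the failure and the time of the last change known to have reached the DR site. The function names and timestamps below are hypothetical. How the "last applied" time is actually obtained depends on the replication technology and is not shown here.

```python
from datetime import datetime, timedelta


def data_loss_window(last_applied, failure_time):
    """Time span of changes that never reached the DR site (never negative)."""
    return max(failure_time - last_applied, timedelta(0))


def meets_rpo(last_applied, failure_time, rpo):
    """True if the data loss window is within the Recovery Point Objective."""
    return data_loss_window(last_applied, failure_time) <= rpo


# Hypothetical timestamps: the replica was 40 seconds behind at the failure.
failure_time = datetime(2024, 1, 1, 14, 0, 0)
last_applied = datetime(2024, 1, 1, 13, 59, 20)

window = data_loss_window(last_applied, failure_time)
print(window, meets_rpo(last_applied, failure_time, timedelta(minutes=15)))
```

In this example a 40-second lag comfortably meets a 15-minute RPO, but it would miss an RPO of 30 seconds.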
Who is responsible for an organization’s Disaster Recovery?
The person who owns the application is usually responsible. It is very common for the burden to fall on IT Management to ensure that a proper solution is designed and put in place based on requirements provided by the business. When engaging in a DR project with a client, we obtain the customer’s RTO and RPO first. With these metrics as core requirements, IT must educate the business on the feasibility of meeting the organization’s RPO and RTO requirements as well as the associated costs. Our team will typically engage a Business Analyst to help the client decide what the RTO and RPO really need to be and determine the best way to use the available budget. As with most things in life, DR is a juggling act in which we try to use the allotted budget to cover as many scenarios as possible.
Disaster Recovery plays a big part in the application ecosystem. It is usually the first step in the transition from an SMB mindset to an enterprise solution. It is important for everyone to understand what DR is not. It is not a backup, nor a replacement for a solid backup and recovery solution. It is not Business Continuity, and it is not a way to process additional workload. DR is exactly what it says: a way to make the system available when there is a catastrophic failure at the primary site.
About the author
Dan Elliott has been with Eagle since its inception in 2003. In Eagle’s early days, Dan was “the tech guy,” responsible for everything tech-related, while Chuck Egerter handled everything admin. Dan is now our leading Database Systems Architect and Senior DBA, responsible for managing the Technical Delivery Team. Dan is a results-driven technical leader with experience in implementing a wide variety of IT solutions. He oversees the team that manages and monitors hundreds of databases worldwide on various versions of Oracle, providing installations, maintenance, test environments, and primary DBA services.