Over the past two articles (read part 1 and part 2), we examined what could bring your business to a halt in the event of a natural or man-made disaster. We also looked at what is required to build a Business Continuity Plan, including exposing the critical aspects of the business by performing a Business Impact Analysis. We’ve determined our Recovery Time Objective (RTO), the amount of time we can allow to return the business to operation, and our Recovery Point Objective (RPO), the amount of data we are willing to sacrifice. Now, it’s time to bring all the pieces of the puzzle together to develop our disaster recovery plan and procedures, in order to make sure the business is going to be operational again within the defined timeframes.
Step 1: Redundancy
Having used your business impact analysis to identify the systems, services and applications at risk, you can now replicate those services onto a redundant platform that is hosted or managed somewhere else, or at the very least can be run from a remote location. Any system or service that has a single point of failure should have a Plan B associated with it. The cloud can be a great tool for bringing up redundant systems and services quickly when local physical systems and data are lost. It doesn’t have to be perfect; it just needs to be able to run most of the functions of your department while the local business is in recovery. Services like Amazon Web Services, Office 365 and hosting providers can be your backup at a lower price point.
You have to include your non-IT personnel as well. Your redundancy here might be a staff accountant in California who can manage your receivables and payables and get a handle on any cash and transactions that might be coming through the bank. This person may have a local copy of your QuickBooks files that could be uploaded to a hosted virtual server that the rest of your team will be able to get to. Look at those areas where you can implement redundancy to minimize your downtime, and include those resources within your budgeting.
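One way to keep track of where a Plan B is still missing is a small inventory script. The sketch below is only an illustration: the `Resource` class, the `single_points_of_failure` helper and the sample inventory entries are hypothetical, and your own list would come from your business impact analysis.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    name: str                # system, service, or job function
    primary: str             # where it runs, or who handles it, today
    fallback: Optional[str]  # the "Plan B"; None means no redundancy yet


def single_points_of_failure(resources: List[Resource]) -> List[str]:
    """Return the names of resources that have no fallback defined."""
    return [r.name for r in resources if not r.fallback]


# Hypothetical inventory; replace with the results of your own analysis.
inventory = [
    Resource("Email", "on-site Exchange server", "Office 365"),
    Resource("Accounting", "local QuickBooks files", "copy on a hosted virtual server"),
    Resource("Receivables and payables", "head-office accountant", "staff accountant in California"),
    Resource("File shares", "on-site file server", None),
]

gaps = single_points_of_failure(inventory)
print("No Plan B yet:", gaps)
```

Anything the script reports is a candidate for your redundancy budget. Note that the same structure covers people as well as systems.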
Step 2: Documentation, Contacts, Software and Tools
Make sure you have all your documentation and instructions available in multiple formats: printed out, burned to CD, and stored online or at a repository facility like Iron Mountain. You want multiple ways to get at your procedures, because documentation stored only online does you no good if no internet connection is immediately available.
If a large-scale disaster should happen, you never know who or what may be delayed, injured or, worse, lost permanently. You are going to need people to fill in those missing pieces of the puzzle when areas of expertise are missing, and this may include catching up on days lost due to your disaster. Be prepared to hire temporary staff who can follow your procedures to complete a job function. If you don’t have those procedures, you are just wasting time and money. Provide the temps with instructional procedures in order to minimize the amount of time it takes to get them trained. Remember, the more you have, the more you are prepared, and the faster you can get your business running again!
With the documentation, make sure you include copies of all the licenses and software that IT will need in case they have to rebuild any of the infrastructure. You may not need to include every last bit of software, but you should have the key software and licenses to get your primary systems back online. Windows Server, Microsoft Exchange, Windows workstation operating systems, SQL Server, financial, billing and invoicing programs, and backup and restoration software are prime examples. Make sure you have compiled all your vendor and supplier contacts and included them within your disaster recovery and business continuity plan.
Step 3: Restoration
This is the longest part of your disaster recovery initiatives. Most of it will be done by the IT team, but they are going to need help in order to get your systems up and running simultaneously in the least amount of time possible.
- Activate the secondary office and services
- Order your equipment and network services. Use expedited options for shipping and service delivery
- Program your network equipment and build out your servers
- Restore your Tier-1 data and servers first
- Get Tier-1 systems and services functional as quickly as possible to meet your primary RTO and RPO
- Have personnel utilize Tier-1 services and verify that they are operational
- Restore your Tier-2 data and servers second
- Get Tier-2 systems and services functional as quickly as possible to meet your Tier-2 RTO and RPO
- Have personnel utilize Tier-2 services and verify that they are operational. Troubleshoot any problems
- Restore your Tier-3 and any remaining data and services
- Get all remaining systems and services functional
- Troubleshoot, triage and continue restoration of your business
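The tier-by-tier ordering in the list above can be sketched in a few lines of code. This is a minimal illustration, not a restoration tool: the system names and their tier assignments are hypothetical, and the `restoration_plan` and `checklist` helpers are invented for this example.

```python
from itertools import groupby
from typing import Dict, List, Tuple

# Hypothetical systems mapped to tiers; Tier 1 is the most critical.
SYSTEMS: List[Tuple[str, int]] = [
    ("Archive file share", 3),
    ("Email", 1),
    ("Billing", 2),
    ("Domain controller", 1),
    ("Intranet wiki", 3),
    ("Reporting database", 2),
]


def restoration_plan(systems: List[Tuple[str, int]]) -> Dict[int, List[str]]:
    """Group systems by tier, with the lowest tier number first."""
    ordered = sorted(systems, key=lambda s: (s[1], s[0]))
    return {
        tier: [name for name, _ in group]
        for tier, group in groupby(ordered, key=lambda s: s[1])
    }


def checklist(plan: Dict[int, List[str]]) -> List[str]:
    """Expand each tier into restore, bring-online and verify steps."""
    steps = []
    for tier, names in plan.items():
        steps.append(f"Restore Tier-{tier} data and servers: {', '.join(names)}")
        steps.append(f"Bring Tier-{tier} services online")
        steps.append(f"Have personnel verify Tier-{tier} services")
    return steps


plan = restoration_plan(SYSTEMS)
steps = checklist(plan)
for number, step in enumerate(steps, start=1):
    print(f"{number}. {step}")
```

Keeping the tier assignments in one place makes it easy to print a fresh checklist whenever the business impact analysis changes.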
Many company services are dependent on IT. Be sure to take into consideration the minimum amount of time needed to get the essentials up and operational, before the rest of the company can start anything. There will be plenty of papers and rubble to go through to find out what is salvageable. Everyone should be ready to get down and dirty and pitch in!
Step 4: Test, TEST, T E S T !
It’s been said before, but it’s definitely worth repeating! Your disaster recovery plan is only as good as the amount of time and testing you have invested in it. If you don’t test your plan, how do you know it is going to work? And how are you going to discover what might have been forgotten? You don’t necessarily have to test the ENTIRE plan, but you should make sure that all your backup equipment and procedures can be switched over in both a controlled and uncontrolled fashion. You want to verify that you are able to complete your objectives and meet or beat your RTO and RPO. Like learning to play the piano, your team and the staff will only get better the more you practice.
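To make the comparison against your RTO and RPO concrete, you can record three timestamps during each test and let a short script do the arithmetic. The sketch below rests on assumptions: the `DrillResult` class, the `evaluate` helper, the timestamps and the 8-hour RTO and 4-hour RPO targets are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DrillResult:
    outage_start: datetime      # when the simulated disaster began
    service_restored: datetime  # when users could work again
    last_good_backup: datetime  # newest backup that restored cleanly


def evaluate(result: DrillResult, rto: timedelta, rpo: timedelta) -> dict:
    """Compare measured recovery time and data loss against the targets."""
    recovery_time = result.service_restored - result.outage_start
    data_loss_window = result.outage_start - result.last_good_backup
    return {
        "recovery_time": recovery_time,
        "data_loss_window": data_loss_window,
        "rto_met": recovery_time <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


# Hypothetical drill: outage at 09:00, service back at 14:30,
# most recent usable backup taken at 03:00 the same day.
drill = DrillResult(
    outage_start=datetime(2024, 1, 15, 9, 0),
    service_restored=datetime(2024, 1, 15, 14, 30),
    last_good_backup=datetime(2024, 1, 15, 3, 0),
)

report = evaluate(drill, rto=timedelta(hours=8), rpo=timedelta(hours=4))
for key, value in report.items():
    print(f"{key}: {value}")
```

In this made-up drill the recovery time is within the RTO, but six hours of data would have been lost against a four-hour RPO, which points to more frequent backups as the thing to fix before the next test.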
Step 5: Review
What went wrong with your disaster recovery plan? What areas were not considered? What was missing or was a complete failure? What could be improved for next time? Your disaster recovery plan is never going to be perfect, but each review should bring it closer to what you set out to accomplish. After every disaster recovery test and any actual disaster, make sure you perform a post-mortem. Update all procedures and documents and make sure that your new operational procedures work!
If you have completed your business continuity plan and completed and tested your disaster recovery plan, you will be well on your way to saving your business and keeping your customers happy. If you haven’t started, “What are you waiting for?!” Get working on your business impact analysis today!