“Two out of every five companies struck with a major disaster are unable to recover. Of the survivors, one third go out of business within the next two years.”
— Gartner Study – 1996
Availability and Integrity – why you need Business Continuity Plans and Disaster Recovery Plans.
Did you know that Information Security considers “Availability” and “Integrity” to be just as important as “Confidentiality”? If your IT systems go down or your data is lost, it will be just as serious for your business as if you were ‘hacked’.
Availability means “Your users can access your systems”. They are able to work. Without availability, you have nothing. Failover, redundancy, and quality systems design all affect your availability.
Integrity means “You have data, and it is good data.” Some cyber-attackers focus on corrupting business data, then demand a “ransom” to restore it. If your data is destroyed, will you be able to perform work? Backups and segmenting your sensitive data from the rest of the network are big ways to improve integrity.
Confidentiality means “Only the right people are accessing your systems.” This is the focus of most cyber-security efforts and, ironically, has the least impact on whether your business can function.
Do you see how availability and integrity are just as important as, or more important than, confidentiality? We focus on availability and integrity for this reason.
What are these plans for?
We write this plan to be used when you are having a very bad day.
– Amira Armond, President of Kieri Solutions
During normal times…
When there is no hurricane, these plans are used primarily to show that your business is being proactive about risks.
They are a required component of many cyber-security compliance frameworks and are required by law for some types of businesses, particularly medical providers.
Many businesses are asked by their customers to provide public-facing Business Continuity Plans (BCPs) to show that they are responsible partners and will be capable of operating in the future.
The process of writing a plan and reviewing it will make your business more resilient. Often we find issues such as missing backups, no identified alternative site, or lack of a communications plan. Before finalizing the plan, we will work with your team to fix these issues.
When something goes horribly wrong, these plans are designed to help your staff respond appropriately, even if the lights are out.
The plan will remind you to do things like communicate with news organizations (depending on your business) or give contact information to reach your insurance company.
The plan will either have step-by-step instructions to perform recovery, or it will point you to the correct procedures document to use.
The best way to get management excited about a disaster plan is to burn down the building across the street. — Dan Erwin, Security Officer, Dow Chemical Co.
What is the difference between a Business Continuity Plan and a Disaster Recovery Plan?
Business Continuity Plans cover more aspects of your operations. Most IT compliance frameworks like HIPAA require a Business Continuity Plan, not a Disaster Recovery Plan.
The business continuity plan will include getting your critical information systems online, ideally through fail-over, but also focuses on how your employees will be able to continue providing services.
A business continuity plan addresses these topics:
- Description of your business operations and critical IT systems
- Identifies risks that can impact your ability to do business. If your business is considered critical infrastructure (such as a public utility or medical provider), we consider more risks. Each risk is evaluated for likelihood and impact. Examples include:
  - Pandemic (for critical infrastructure)
  - Regional disruption such as flooding or hurricane, which causes your employees to evacuate
  - Building disruption such as fire or power outage
  - Physical attacker or bomb threat (depending on type of business)
  - Information system disruption such as ransomware or cyber-attack
- Identifies plans and procedures to continue operations during several scenarios. For example, a critical medical provider might coordinate with the police department and emergency medical services to set up temporary operations in a public space.
- Communication plans for coordinating with internal employees, customers, and the public.
- Contacts used during an emergency, such as police, hospitals, corporate insurance, and corporate attorneys.
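The likelihood-and-impact evaluation mentioned in the list above can be sketched as a simple risk matrix. This is only an illustration under stated assumptions, not necessarily the scoring used in any real plan: the 1-to-5 scales, the sample scores, and the names `Risk` and `rank_risks` are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (minor) to 5 (business-ending)

    @property
    def score(self) -> int:
        # Likelihood multiplied by impact; other weightings are possible.
        return self.likelihood * self.impact


def rank_risks(risks):
    """Return the risks sorted from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Illustrative scores only; real values come from researching your business.
example_risks = [
    Risk("Pandemic", likelihood=1, impact=4),
    Risk("Regional flooding or hurricane", likelihood=2, impact=5),
    Risk("Building fire or power outage", likelihood=3, impact=3),
    Risk("Ransomware or cyber-attack", likelihood=4, impact=5),
]

for risk in rank_risks(example_risks):
    print(f"{risk.score:>2}  {risk.name}")
```

Ranking the risks this way helps decide which scenarios deserve detailed procedures first.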
Disaster Recovery Plans (DRPs) generally focus more on IT systems rather than overall business operations.
A disaster recovery plan addresses these topics:
- Identify specific operations (normally IT systems) which are the focus.
- Identify risks that impact your IT systems such as:
  - Specific system failures (such as a server crashing)
  - Server room disruption (fire, flooding, power outage)
  - Cyber-attack requiring rebuild of your IT systems
  - User error requiring recovery of data
- Details about how your IT systems are backed up and procedures to perform backups and test these backups.
- Plans for how to fail over operations, move to a new site, or recover your systems.
How much will downtime cost your business?
Check out this recovery calculator from Datto.com. Enter your variables (average employee salary, revenue, etc.) to find out how much a critical system outage can cost your business.
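As a rough illustration of the arithmetic behind this kind of estimate, the sketch below adds lost revenue to the wages paid for work that could not be done. It is not Datto's formula; the function name `estimate_downtime_cost`, its parameters, and the sample figures are assumptions chosen for the example.

```python
def estimate_downtime_cost(
    outage_hours: float,
    revenue_per_hour: float,
    affected_employees: int,
    avg_hourly_wage: float,
    productivity_loss: float = 1.0,
) -> float:
    """Estimate outage cost as lost revenue plus wages paid for lost work.

    productivity_loss is the assumed fraction of work that stops (0.0 to 1.0).
    """
    if not 0.0 <= productivity_loss <= 1.0:
        raise ValueError("productivity_loss must be between 0 and 1")
    lost_revenue = outage_hours * revenue_per_hour
    lost_wages = (
        outage_hours * affected_employees * avg_hourly_wage * productivity_loss
    )
    return lost_revenue + lost_wages


# Hypothetical numbers: an 8-hour outage at a 25-person office.
cost = estimate_downtime_cost(
    outage_hours=8,
    revenue_per_hour=1_500,
    affected_employees=25,
    avg_hourly_wage=40,
    productivity_loss=0.5,
)
print(f"Estimated cost: ${cost:,.2f}")
```

A real estimate would also account for costs this sketch ignores, such as recovery labor, contractual penalties, and lost customers.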
I do not fear computers. I fear the lack of them. — Isaac Asimov
Stakeholders for your Business Continuity Plan
Why are we talking about stakeholders? Isn’t the IT guy responsible for disaster recovery? The answer is no. If you are leaving all responsibility on your IT person, you are being unfair to them. Making your business resilient takes C-level input because it is highly dependent on corporate strategy, goals, and budgeting.
Your corporate officers should identify how important IT systems are to the company. Can the business survive without them for a week? How about a day? Would it cause front-page news if customers couldn’t access their services? What about losing data? Could your company survive if the last 8 hours of changes to the data were lost? These questions are highly dependent on the company type. Amazon Web Services cannot afford to lose data or have even a few minutes of outage. A construction company might be able to absorb a few days of outage without their customers noticing.
In larger companies, it is common for IT to maintain a server that they don’t know much about. The “marketing department uses it”. Without input from each department, you won’t know how critical a system is, and what level of continuity is needed. This feedback should go up the chain to the corporate officers for prioritization.
You also need a leadership role for your IT department, such as a CIO. This person understands how much your IT department can do, and what it should be doing. For example, it is entirely possible for a 100-person company to operate with just one (fast) IT person. But if life gets hectic, the first things to fall through the cracks are preventative tasks and maintenance: installing security patches, using processes to control system changes and document configurations, and backing up systems. Your CIO can add staff, bring in consultants, or prioritize tasks to make sure that prevention work happens. In addition, your CIO should schedule irregular tasks like testing backups, performing failovers, and running incident response drills. If leadership doesn’t champion drills and testing, they tend to be forgotten.
Finally, we get to the IT staff. They do most of the heavy lifting: engineering and designing solutions to meet the corporate goals. They are responsible for making sure that each system has a recovery plan, and for updating it over time. They should also periodically (quarterly or monthly) attempt to use these recovery procedures. Your IT staff are responsible for making sure that the design and procedures are realistic. Without this sanity check, a disaster recovery plan is much less effective.
Why choose Kieri Solutions to write your plan?
We are local to businesses in DC, Frederick, Baltimore, Rockville, Gaithersburg, and Columbia MD.
We think that Business Continuity Plans and Disaster Recovery Plans should be more than a piece of paper. Your Disaster Recovery Plan should have detailed procedures to follow to get your operations running again. Your Business Continuity Plan should have insurance policy numbers, contact information for your vendors, communications templates, and a well thought out risk assessment and response to a large number of possible incidents.
We research, train, prepare, and test the ability to recover from the unexpected.
Our staff just came back from a Cyber War Training event in Northern Virginia. Our employees have been doing DoD-level disaster recovery and failover (they call it “Continuity Of Operations”) planning, testing, and support since 2005. In our careers, we have coordinated enterprise fail-overs, recovered hundreds of failed servers, and designed secure military networks to handle infrastructure attacks automatically.
We will suggest improvements
If there are glaring problems, such as backups that are not enabled, technologies that are known to fail often, or a proposed incident response that would not be effective, we will give you a heads up. If you want help, we can help implement most fixes. For example, while writing a recent BCP, we discovered that two critical systems were not encrypted per HIPAA requirements, and one system wasn’t being backed up. We worked with company engineers to fix the problems before finalizing the BCP.
How does the DR / BCP process work?
A systems architect who specializes in ‘Resilient IT’ will be assigned to your business.
There will be an initial call with your management to identify the scope of the plan (for example, do you only want to focus on one critical system, or all business operations?). We will also work with management to identify key service levels such as the amount of time a system can be down and how much data can be lost.
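These two service levels are commonly called the recovery time objective (RTO) and the recovery point objective (RPO). A minimal sketch of how they can be compared against a system's current design follows. The class `ServiceLevel`, the function `find_gaps`, and the "Billing database" figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ServiceLevel:
    system: str
    max_downtime: timedelta   # recovery time objective (RTO)
    max_data_loss: timedelta  # recovery point objective (RPO)


def find_gaps(level, backup_interval, estimated_restore_time):
    """Return plain-language gaps between the current design and the targets."""
    gaps = []
    # Worst case, a failure happens just before the next backup would run,
    # so up to one full backup interval of changes can be lost.
    if backup_interval > level.max_data_loss:
        gaps.append(
            f"{level.system}: backup interval {backup_interval} "
            f"exceeds allowed data loss {level.max_data_loss}"
        )
    if estimated_restore_time > level.max_downtime:
        gaps.append(
            f"{level.system}: estimated restore time {estimated_restore_time} "
            f"exceeds allowed downtime {level.max_downtime}"
        )
    return gaps


# Hypothetical system: at most 4 hours down, at most 8 hours of data lost.
billing = ServiceLevel(
    "Billing database",
    max_downtime=timedelta(hours=4),
    max_data_loss=timedelta(hours=8),
)
gaps = find_gaps(
    billing,
    backup_interval=timedelta(hours=24),      # nightly backup
    estimated_restore_time=timedelta(hours=2),
)
for gap in gaps:
    print(gap)
```

In this made-up case the nightly backup is the gap: restoring is fast enough, but a failure late in the day could lose more than the 8 hours of data that management said it could tolerate.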
We will brainstorm a list of possible business impacts (such as hardware failure, flood, cyber incident, and more). This helps guide the questions later.
There will be several calls, in-person visits, or screen shares with your technical experts to gather data about how the system is designed, how it is backed up, and what failover or redundancies are configured. We will also talk through various scenarios to see how the company would respond.
We will research the risk of each type of incident. For example, we might check flood histories in your area or research the failure rate of your network devices. We will also make architectural diagram(s) to show critical systems and dependencies for your operations to continue.
Through this process, we will be drafting a business continuity or disaster recovery plan. The next step is normally identifying exact procedures to recover operations. Your IT staff might provide these procedures, we might research vendor documentation to find them, or we might work with your IT staff to discover the best method.
Around now, your BCP or DRP is version 1.0.
We highly recommend testing the procedures and other information in the plan (such as contact numbers for your vendors) as soon as possible. Invariably, testing will identify missing steps or faulty equipment. The easiest form of testing is called a “tabletop exercise”. This is where we run a scenario and talk through each step. For example, we might move from discovery of a problem (who do you report it to?), to pulling in experts (internal and external), walking through how we would notify clients, and calling the insurance company.
If you are willing, we will work with your system administrators to perform test fail-overs and restores from backup. If gaps are found, we will help you solve them, either by updating the procedures or re-engineering systems.
Schrödinger’s Backup: “The condition of any backup is unknown until a restore is attempted.”
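One small part of a restore test can be automated: confirming that a restored file matches the original byte for byte. The sketch below compares SHA-256 digests using only Python's standard library. The function names `sha256_of` and `restore_matches` and the demo files are hypothetical, and a file comparison like this is not a substitute for test-restoring a full application or server.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """Return True when the restored file is identical to the original."""
    return sha256_of(original) == sha256_of(restored)


# Demo with throwaway files standing in for an original and two restores.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "ledger.db"
    good_restore = Path(tmp) / "ledger.restored.db"
    bad_restore = Path(tmp) / "ledger.corrupt.db"

    original.write_bytes(b"customer records")
    good_restore.write_bytes(b"customer records")
    bad_restore.write_bytes(b"customer rec")  # simulated truncated restore

    print("good restore matches:", restore_matches(original, good_restore))
    print("bad restore matches:", restore_matches(original, bad_restore))
```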
Tip: Put ‘real’ information in your plan.
This is information that is used if something really goes wrong (such as insurance policy information or procedures to restore from backup). You may want to create a second, public-facing plan which has sensitive information removed. This public-facing plan can be provided to your clients to prove that your company is being responsible and diligent.
You will want to have copies of your plan printed out at multiple locations. If you can’t get to the office, the plan will help you contact vendors and insurance. It will remind you to do things like communicate with your customers or news organizations. And it will guide you in recovery procedures when things have gone wrong.
We have the experience with databases, cloud, virtualization, backups, SAN, networking, server hardware, and other technologies used by your business.
Kieri Solutions is at our heart a systems engineering company. We are used to designing and implementing solutions for real companies that want Resilient IT. So when we talk to your IT staff, it will be peer-to-peer, not a disappointing process of trying to explain concepts to a non-technical person.
We are local, and will be available to support in the future.
When you fly in a consultant from a big name company, you will probably never meet that person again. In contrast, once we have performed a project for you, we stand by our work and will respond if you have problems later on. We will also remember you and your network – you won’t be starting from scratch with us.
Our rates are typically half that of a big-name company.
Since we don’t need to fly our employees around, and because we have a smaller footprint, we don’t need to charge crazy rates. We will be glad to give you a no-risk estimate.
Examples of recent projects
- Write HIPAA-compliant business continuity plans for medical providers.
- Install Veeam Backup & Replication suite for VMware and Windows servers. Train staff to recover from various scenarios (file-level restore, server-level restore).
- Design disaster recovery site and test ability to quickly recover from Ransomware and other scenarios.
- Use NetApp SnapRestore and Snapshot technology to recover large (4TB+) virtual machines in 1-2 minutes.
- Identify missing backups and lack of redundancy during SaaS business continuity planning; work with engineers to resolve them before finishing the BCP.
- Configure and successfully test cloud restores using Datto and Office 365.
- Create Continuity Of Operations Plans (COOP), the DoD-specific form of disaster recovery plan, for US Navy deployed hospital systems.