Knowledge Ridge

Keys to IT Continuity Success

December 23, 2025

Q1. Could you briefly describe your current role as Global BCP & DR Lead and the types of systems and regions you are accountable for?

In my current role, I am responsible for leading Global IT Business Continuity and Disaster Recovery across several regions. My remit covers both corporate IT and manufacturing environments, and I oversee a broad set of business-critical systems. These include ERP and MES platforms, identity and access management, finance systems, collaboration tools, and applications that are tightly integrated with operational technology and have a direct impact on plant operations.

My responsibilities go well beyond maintaining documentation. I focus on building and sustaining a consistent, repeatable lifecycle for business continuity and disaster recovery. This includes running Business Impact Analyses, defining and implementing recovery strategies, carrying out disaster recovery testing, managing risk acceptance, and providing governance. Each region I support has its own maturity level and risk profile, which adds complexity and requires a tailored approach.

 

Q2. Many organizations have documented BCP/DR plans—what typically breaks first when a real disruption happens, and why?

In my experience, the first thing to break during a real disruption is rarely the technology itself. Instead, it is usually the assumptions made during planning—such as who will make decisions, whether key people will be available, how critical assets will be accessed, and how interdependent systems will be recovered.

Too often, BCP and DR plans are created with the assumption that resources will be available and escalation paths will work as intended. In reality, disruptions rarely happen under ideal conditions. Organizations may face partial outages, competing priorities, and confusion about who has authority to escalate. The gap between what is documented and how things actually work under pressure is where most failures occur.

 

Q3. From your experience, where do BIAs most often go wrong: business ownership, data quality, or assumptions around RTO/RPO?

All of these factors matter, but in my experience, the biggest challenges come from unrealistic assumptions about Recovery Time Objectives and Recovery Point Objectives. Too often, BIAs are treated as a compliance checkbox instead of a tool for making real decisions.

Business owners sometimes sign off on RTOs and RPOs without a clear understanding of what is technically possible or what recovery really involves. Meanwhile, IT teams may accept these targets without the right capabilities or investment to deliver them. This misalignment usually shows up during testing, or worse, during a real incident. The gap between expectations and what can actually be achieved can be significant. 
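One way to make this misalignment visible before sign-off is a simple feasibility check on the requested RPO. The sketch below is illustrative only: it assumes periodic backups are the sole recovery mechanism, so the worst-case data loss is roughly one full backup interval. The function names and the figures are hypothetical and do not come from any specific tool or from the interview.

```python
def worst_case_data_loss_min(backup_interval_min: float) -> float:
    """Worst-case data loss, in minutes, under periodic backups.

    Assumption: a failure just before the next backup loses up to one
    full interval of data. Real environments may add transfer or
    replication delays on top of this.
    """
    return backup_interval_min


def rpo_is_achievable(requested_rpo_min: float, backup_interval_min: float) -> bool:
    """Return True if the requested RPO can be met by the backup schedule."""
    return worst_case_data_loss_min(backup_interval_min) <= requested_rpo_min


# Hypothetical figures: a 15-minute RPO signed off against nightly backups.
nightly = rpo_is_achievable(requested_rpo_min=15, backup_interval_min=24 * 60)
# Hypothetical figures: a 60-minute RPO against 30-minute backups.
half_hourly = rpo_is_achievable(requested_rpo_min=60, backup_interval_min=30)
print(nightly, half_hourly)
```

A check like this does not replace a recovery test, but it can flag targets that are out of reach of the current backup schedule before anyone signs off on them.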

 

Q4. How should leadership differentiate between major incidents and true crisis scenarios, and what governance gaps do you most often see during escalation?

In my view, a major incident turns into a true crisis when it has a significant business impact and requires decisive leadership, especially if there is potential for external exposure—be it regulatory, financial, or reputational. One of the most common governance gaps I see is delayed escalation. Teams often try to handle the situation on their own for too long, which can lead to critical decisions being made without the right authority or context.

Another issue I often encounter is fragmented governance during escalation. IT, business, and facilities teams tend to work in silos instead of following a unified crisis management approach. Effective crisis response relies on having a clear governance model, with defined escalation triggers, clear roles, and decision-making authority.

 

Q5. As enterprises rely more on managed services and cloud vendors, what practical controls actually reduce third-party continuity risk beyond contractual SLAs?

Service-level agreements are important, but in my experience, they offer limited protection when a real disruption occurs. Reducing third-party continuity risk depends much more on transparency and hands-on validation. This means understanding a vendor’s recovery assumptions, testing integrations, validating dependencies, and running joint disaster recovery exercises or tabletop simulations.

Hands-on testing gives far more confidence than relying on contracts alone. It shows how a vendor’s recovery processes will actually work during a disruption, which is essential for building real end-to-end resilience.

 

Q6. With ISO 22301 adoption increasing, what separates organizations that are merely audit-compliant from those that are genuinely resilient?

Organizations that focus mainly on audits tend to concentrate on having the right documentation to meet the standard. While documentation is necessary, in my experience, truly resilient organizations go further and focus on how people, processes, and systems actually perform under pressure.

The difference is most evident in how frequently plans are reviewed, how effectively lessons learned are captured and implemented, and whether corrective actions result in tangible changes to behaviors and processes. ISO 22301 delivers real value when it is treated as a management system—a continuous cycle of improvement—rather than simply a certification requirement. True resilience is about ongoing adaptation and readiness, not just passing an annual audit.

 

Q7. If you were advising a board or investor, which resilience indicators would you prioritize to assess whether an organization is fully equipped for enterprise-wide disruption?

If I were advising a board or investor, I would focus on three key indicators to assess whether an organization is truly resilient:

  1. Actual recovery performance versus stated RTO and RPO
    It is important to look at recovery outcomes from real tests or actual incidents. The main question is whether the targets set in BIAs are consistently met in practice.
  2. Maturity of risk acceptance
    This means checking whether leadership has a clear view of existing risk gaps and whether those risks have been formally acknowledged and accepted. It shows whether decisions are made deliberately, rather than exposing the organization to risk by default.
  3. Clarity of governance during disruption
    Effective resilience depends on having clear decision-making structures during a disruption. This means knowing who has authority, how quickly decisions can be made, and what information is available to support them. Any ambiguity in governance during a crisis is a major vulnerability.

Taken together, these indicators give a far more accurate and realistic picture of how prepared an organization really is than simply counting the number of documented plans.
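The first indicator, actual recovery performance versus stated RTO and RPO, lends itself to a simple calculation. The following is a minimal sketch, assuming test or incident results are recorded per system in minutes. The `RecoveryResult` structure, the system names, and all figures are hypothetical and serve only to illustrate the comparison.

```python
from dataclasses import dataclass


@dataclass
class RecoveryResult:
    """One recovery test or incident outcome (all values in minutes)."""
    system: str
    rto_target: float  # target agreed in the BIA
    rto_actual: float  # time actually taken to recover
    rpo_target: float  # maximum tolerable data loss agreed in the BIA
    rpo_actual: float  # data loss actually observed

    def met_targets(self) -> bool:
        # Both objectives must be met for the recovery to count as a pass.
        return (self.rto_actual <= self.rto_target
                and self.rpo_actual <= self.rpo_target)


def attainment_rate(results: list[RecoveryResult]) -> float:
    """Share of recoveries that met both RTO and RPO (0.0 if no data)."""
    if not results:
        return 0.0
    return sum(r.met_targets() for r in results) / len(results)


def systems_missing_targets(results: list[RecoveryResult]) -> list[str]:
    """Names of systems whose recovery missed RTO, RPO, or both."""
    return [r.system for r in results if not r.met_targets()]


# Hypothetical results from a disaster recovery test cycle.
results = [
    RecoveryResult("ERP", rto_target=240, rto_actual=310, rpo_target=60, rpo_actual=45),
    RecoveryResult("MES", rto_target=120, rto_actual=95, rpo_target=15, rpo_actual=10),
    RecoveryResult("IAM", rto_target=60, rto_actual=50, rpo_target=30, rpo_actual=30),
    RecoveryResult("Finance", rto_target=480, rto_actual=400, rpo_target=240, rpo_actual=300),
]

print(attainment_rate(results))          # 0.5
print(systems_missing_targets(results))  # ['ERP', 'Finance']
```

Tracking this rate across successive test cycles, rather than as a one-off figure, would show whether the targets set in BIAs are consistently met in practice.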

 

