Executive Summary
In October 2025, two of the world's largest cloud service providers, Amazon Web Services (AWS) and Microsoft Azure, suffered major outages nine days apart. On October 20, AWS's US-EAST-1 region suffered a failure in the Domain Name System (DNS), the system that translates web addresses into IP addresses. On October 29, Azure customers faced widespread access issues caused by a faulty configuration change that disrupted connectivity and sign-ins worldwide.
Together, these events served as a wake-up call for organizations that rely heavily on the public cloud, revealing that even the biggest names in cloud computing can go down.
For financial institutions and other critical sectors, now is the time to reassess cloud dependency, review business continuity plans, and strengthen resilience strategies. Cloud convenience does not replace redundancy.
Who Can Be Affected?
Any organization that hosts critical systems, data, or operations in the cloud — especially those relying on a single provider or region — is vulnerable to disruption.
Even industries with strong uptime expectations, such as banking, healthcare, and government services, felt the effects of the October AWS outage. Major platforms like Snapchat, Fortnite, Coinbase, Halifax Bank, and the Bank of Scotland were all impacted when AWS's DNS failure took key infrastructure offline. Less than two weeks later, Azure customers worldwide experienced authentication and connectivity issues that left users unable to log in or access applications.
These incidents underscore that cloud reliability is not infallible, no matter how large or sophisticated the provider.
How Does This Threat Work?
Both incidents illustrate how complex, interconnected systems can fail — and how those failures cascade quickly.
- AWS US-EAST-1 outage: A DNS resolution failure for the DynamoDB service endpoint in US-EAST-1, one of AWS's busiest regions, cascaded into dependent services and took applications across multiple industries offline for hours.
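- Azure outage: A faulty configuration change in Azure Front Door, Microsoft's global traffic-routing and content delivery service, caused worldwide connectivity and sign-in failures for Microsoft Entra ID (formerly Azure Active Directory) and the applications that depend on it.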
For many organizations, the issue wasn't just downtime — it was dependency. If your environment is built on a single cloud provider or region, even a brief service interruption can disrupt customer access, transaction processing, and critical internal operations.
Increased Probability
Cloud outages are not new, but the frequency and scale of recent incidents are reminders that "rare" events can and do happen. As cloud infrastructure becomes more centralized and interconnected, the blast radius of a single point of failure increases.
High availability, once considered a built-in benefit of the cloud, now depends on intentional architecture: redundancy across regions, hybrid strategies that blend on-premises and cloud workloads, and well-tested failover plans.
Organizations that assume "99.99% uptime" means total reliability may be unprepared for the real-world implications of even a few hours of downtime.
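To make the gap concrete, the arithmetic behind an availability target can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and the calculation assumes a 365-day year and treats the target as a simple percentage of total time, which may differ from how a given provider's service level agreement measures availability.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def allowed_downtime_minutes(availability_percent: float, period_hours: float) -> float:
    """Return the minutes of downtime an availability target permits over a period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return period_hours * 60 * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Compare the annual downtime budget for several common targets.
    for target in (99.0, 99.9, 99.99, 99.999):
        budget = allowed_downtime_minutes(target, HOURS_PER_YEAR)
        print(f"{target}% uptime allows about {budget:.1f} minutes of downtime per year")

    # How many years of a 99.99% budget does a single three-hour outage consume?
    outage_minutes = 3 * 60
    annual_budget = allowed_downtime_minutes(99.99, HOURS_PER_YEAR)
    print(f"A 3-hour outage uses {outage_minutes / annual_budget:.1f} years of that budget")
```

Under these assumptions, a 99.99% target leaves roughly 52.6 minutes of downtime per year, so a single outage lasting several hours exceeds multiple years of budget.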
What Can You Do?
Diversify Cloud Deployment
Avoid hosting all workloads in a single region or provider. Deploying across multiple regions or cloud providers can help maintain availability if one platform experiences an outage.
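At the application level, one simple form of this idea is a client that tries a primary endpoint and falls back to a secondary one. The sketch below is a minimal illustration, not a production pattern: the endpoint URLs, the `call_with_failover` helper, and the simulated request function are all hypothetical, and a real deployment would also need health checks, timeouts, and data replication between regions.

```python
from typing import Callable, Sequence


class AllEndpointsFailed(Exception):
    """Raised when every configured endpoint has failed."""


def call_with_failover(
    endpoints: Sequence[str], request: Callable[[str], str]
) -> tuple[str, str]:
    """Try each endpoint in order; return (endpoint, response) from the first success."""
    errors = []
    for endpoint in endpoints:
        try:
            return endpoint, request(endpoint)
        except OSError as exc:  # covers ConnectionError and TimeoutError
            errors.append(f"{endpoint}: {exc}")
    raise AllEndpointsFailed("; ".join(errors))


# Hypothetical endpoints in two regions (placeholders, not real services).
ENDPOINTS = [
    "https://api.us-east-1.example.com",
    "https://api.us-west-2.example.com",
]


def simulated_request(endpoint: str) -> str:
    """Stand-in for a real HTTP call; pretends the primary region is down."""
    if "us-east-1" in endpoint:
        raise ConnectionError("DNS resolution failed")
    return "ok"


if __name__ == "__main__":
    served_by, response = call_with_failover(ENDPOINTS, simulated_request)
    print(f"Served by {served_by}: {response}")
```

Passing the request function in as a parameter keeps the failover logic independent of any particular HTTP library, which also makes it easy to exercise in a drill without touching production.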
Test Your Recovery Objectives
Define and routinely test your recovery time objectives (RTO) and recovery point objectives (RPO). Simulate downtime events to ensure recovery plans work in practice, not just on paper.
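One way to turn a drill into a pass or fail result is to record three timestamps and compare them against the stated objectives. The sketch below assumes RTO is measured from the start of the outage to service restoration, and RPO from the last usable backup to the start of the outage; the `DrillResult` and `evaluate_drill` names and the sample times are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DrillResult:
    outage_start: datetime      # when the simulated outage began
    service_restored: datetime  # when service was usable again
    last_good_backup: datetime  # most recent recoverable data point before the outage


def evaluate_drill(result: DrillResult, rto: timedelta, rpo: timedelta) -> dict:
    """Compare measured recovery time and data-loss window against RTO and RPO."""
    recovery_time = result.service_restored - result.outage_start
    data_loss_window = result.outage_start - result.last_good_backup
    return {
        "recovery_time": recovery_time,
        "data_loss_window": data_loss_window,
        "rto_met": recovery_time <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


if __name__ == "__main__":
    # Hypothetical drill: outage at 09:00, restored at 11:30, last backup at 08:45.
    drill = DrillResult(
        outage_start=datetime(2025, 11, 3, 9, 0),
        service_restored=datetime(2025, 11, 3, 11, 30),
        last_good_backup=datetime(2025, 11, 3, 8, 45),
    )
    report = evaluate_drill(drill, rto=timedelta(hours=2), rpo=timedelta(minutes=15))
    for key, value in report.items():
        print(f"{key}: {value}")
```

In this sample drill the 15-minute data-loss window meets the RPO, but the 2.5-hour recovery misses a 2-hour RTO, which is the kind of gap that only shows up when the plan is actually exercised.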
Strengthen Backup Connectivity
Ensure your business can still operate if a cloud service is disrupted. This may include on-premises backups, local failover servers, or alternative authentication methods.
Evaluate Vendor Dependencies
Review your vendor ecosystem for hidden dependencies. Some "on-premises" applications quietly rely on cloud-based services for updates, licensing, or authentication, so a cloud outage can affect systems you believed were fully local.
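If you maintain an inventory of what each system depends on, hidden cloud dependencies can be surfaced by walking that inventory transitively. The sketch below is illustrative only: the system names, the `DEPENDS_ON` map, and the `CLOUD_SERVICES` set are invented for the example, and a real inventory would come from vendor questionnaires, contracts, or network monitoring.

```python
def cloud_dependencies(system: str, depends_on: dict, cloud_services: set) -> list:
    """Return, sorted, every cloud service reachable from `system` via `depends_on`."""
    found, seen, stack = set(), set(), [system]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already visited; also guards against circular dependencies
        seen.add(node)
        if node in cloud_services:
            found.add(node)
        stack.extend(depends_on.get(node, ()))
    return sorted(found)


# Hypothetical inventory: each key depends directly on the listed components.
DEPENDS_ON = {
    "core-banking-onprem": ["license-server", "update-agent"],
    "license-server": ["vendor-auth-saas"],
    "update-agent": ["vendor-cdn"],
    "branch-kiosk": ["local-database"],
}

# Hypothetical set of components known to be hosted in a public cloud.
CLOUD_SERVICES = {"vendor-auth-saas", "vendor-cdn"}

if __name__ == "__main__":
    for system in ("core-banking-onprem", "branch-kiosk"):
        hidden = cloud_dependencies(system, DEPENDS_ON, CLOUD_SERVICES)
        print(f"{system}: {hidden if hidden else 'no cloud dependencies found'}")
```

In this invented inventory, the "on-premises" core banking system turns out to depend on two cloud-hosted services through its license server and update agent, even though neither appears in its direct dependency list.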
Build a Culture of Resilience
Technology alone won't solve the problem. Educate leadership and teams on resilience planning, incident response, and business continuity practices. Resilience is not just an IT function — it's an organizational mindset.
Don't Wait Until It's Too Late
The October AWS and Azure outages are a reminder that even the most reliable technology has limits. The goal isn't to eliminate downtime entirely but rather to minimize impact and ensure continuity when downtime happens.
Organizations that proactively test failover plans, diversify infrastructure, and plan for the unexpected will weather the next outage far better than those that assume the cloud will always be up.
A little redundancy today can save a lot of explaining tomorrow.
