Looking Into the Black Box. How Resiliency Starts with Visibility.
Every holiday season, tens of millions of consumers shop online for gifts and deals, then eagerly wait for the packages to arrive on their doorsteps shortly thereafter. We now take for granted knowing when those deliveries will happen down to the hour, but that wasn’t always the case.
Started in the early 1970s, FedEx staked its business on delivering packages overnight. But initially there was no way to know where any given package was inside its shipping system. FedEx realized it needed real-time visibility inside the black box that connected its hubs and trucks. As a result, the tracking number was created, leading to revolutionary operational efficiencies. This granular visibility also resulted in a more resilient system.
Just as FedEx needed to see inside its interconnected shipping network, many enterprises must understand and control their highly interconnected mission-critical applications across environments to ensure the highest operational performance and resiliency. And modern application architectures use microservices and cloud-native functions that rely far more on complex interconnections among applications, users, and databases.
Meanwhile, many large enterprises rely on custom IT software systems that sit between these applications, users, and databases to ensure the delivery of inter-application communications and messaging. These middleware systems, such as message queues, FTP servers, and API gateways, are deployed to keep the entire IT infrastructure highly reliable so that business runs more smoothly, and in many cases they have been in place for many years. Enterprises still spend billions of dollars a year on middleware to process tens of millions of transactions a day. And these are not just legacy on-premises systems; public cloud providers like AWS and Azure can now boast significant market share for their own middleware services.
The irony is that while critical systems were put into place to improve transactional reliability, their use can actually decrease systemic resiliency by obscuring application relationships and dependencies that are too numerous and complex to track. That leads to blind spots that can have massive negative impacts that enterprises can no longer ignore, given the exploding costs of service outages and cyber risk.
To ensure better operational and cyber resilience, enterprises like yours must gain visibility and insights beyond what network-based point-to-point dependencies can provide. You must begin to observe and understand the data flows from the application relationships that extend through middleware in your infrastructure. Data flows are essentially the complex communications among connected applications that originate from users, exchange information, and depend on underlying data.
Without fully understanding data flows, operations teams can’t govern or control their application environments effectively. They cannot plan changes to applications or infrastructure with confidence, or understand the impact of a failure, because changes can result in unanticipated service outages and disruptions. For example, migrating one application to public cloud may unexpectedly break other applications, or planned maintenance on a less important application may cause unplanned downtime for a mission-critical one. Cyber risk is also a big issue because middleware increases the attack surface in a way that is seldom well understood. Unknown vulnerabilities can expand the blast radius of a breach, as attackers are able to move laterally inside the perimeter and access sensitive data.
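To make the idea of impact analysis concrete, here is a minimal sketch, assuming data flows have already been discovered and recorded as (producer, middleware, consumer) triples. Every application and middleware name below is hypothetical, and the traversal is just a plain breadth-first search, not any particular product's method. It shows how a point-to-point view sees only the hop to the middleware, while following the flow through the middleware reveals the applications that a change could actually affect.

```python
from collections import defaultdict, deque

# Hypothetical data flows, each recorded as (producer, middleware, consumer).
# All names are illustrative assumptions, not taken from a real environment.
FLOWS = [
    ("orders-app", "order-queue", "billing-app"),
    ("orders-app", "order-queue", "shipping-app"),
    ("billing-app", "sftp-server", "ledger-app"),
    ("reporting-app", "api-gateway", "dashboard-app"),
]


def build_graph(flows):
    """Build a directed graph: producer -> middleware -> consumer."""
    graph = defaultdict(set)
    for producer, middleware, consumer in flows:
        graph[producer].add(middleware)
        graph[middleware].add(consumer)
    return graph


def impacted_by(graph, changed):
    """Return every node reachable downstream of `changed` (breadth-first)."""
    seen = set()
    pending = deque([changed])
    while pending:
        node = pending.popleft()
        for nxt in graph.get(node, ()):
            if nxt != changed and nxt not in seen:
                seen.add(nxt)
                pending.append(nxt)
    return seen


graph = build_graph(FLOWS)

# Point-to-point view: only the direct hop to the middleware is visible.
direct = set(graph["orders-app"])

# Data-flow view: everything downstream, through the middleware.
impact = impacted_by(graph, "orders-app")

print(sorted(direct))
print(sorted(impact))
```

In this toy inventory, a change to `orders-app` appears to touch only `order-queue` when viewed point to point, yet the traversal also reaches `billing-app`, `shipping-app`, `sftp-server`, and `ledger-app`. The hard part in practice is discovering the flows in the first place; the sketch assumes that inventory already exists.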
Enterprises have a hard enough time discovering and mapping application relationships across their IT infrastructure. Today, that is a costly, manual process that is prone to errors and produces data that quickly goes out of date, because legacy software solutions cannot solve these new challenges. And discovering information flows through middleware systems presents even more headaches.
Just as FedEx had to ensure packages could reach their destinations through a complex, interconnected shipping system, enterprises must begin to improve their operational and cyber resiliency by understanding the complex relationships and information flows among applications, users, and data, even when hub-like systems such as middleware sit in between.