Cloud Networking: From Magic Smoke to Black Boxes


Written by: Sarah Lantz, Security Specialist

Cloud networking was once hailed as the savior of our industry. While DevOps teams have embraced it, security and networking teams still struggle to manage and scale it. With vendors and clients continuing to adopt this approach, and acronyms like Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) becoming common vernacular, organizations find themselves pushed ever further into adapting to the black box of cloud networking.

The Magic Smoke that Makes Technology Work

Consider what traveling through a data center is like from a packet’s perspective. It encounters an edge router, possibly an edge firewall, leading into a switch stack. It then passes through controls like zones and Virtual Local Area Networks (VLANs), where tagging directs where the packet goes. It then navigates host-level controls, such as host firewalls, before eventually reaching its destination host within the organization. Here, the “magic smoke” of hardware reigns supreme. After all, it stops working when the “magic smoke” leaves the hardware.

How much hardware sits between that ingress point and the destination host? With the trending shift toward popular cloud platforms such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), these data center encounters fall by the wayside. Cloud networking becomes a black box, as the organization can’t see the underlying connectivity. For instance, VLAN tagging is no longer the mechanism that sends traffic along a path. Some argue that cloud constructs such as Availability Zones or Regions are the virtual equivalents of data center firewalls or similar controls. I would not agree, given that these are physically distinct locations and are treated differently from true cloud-based control mechanisms, such as the Network Access Control Lists (ACLs) and Security Groups used by AWS, Azure, GCP, and other cloud-based solutions.

Looking at the Black Box

These controls feel lacking compared to an on-premises physical firewall. The flexible, dynamic address groups you are used to generally don’t exist in your cloud provider. Rules instead feel more akin to the firewalls of years past, where limiting traffic by subnet address and port is the only control you can apply. Worse, once traffic leaves a virtualized asset in the cloud, there is no insight into how it arrived at another part of the deployment. The intentional black box of cloud deployments leaves networking teams who inherit these designs unsure of themselves when an issue arises.
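To make that limitation concrete, here is a minimal Python sketch of the rule model cloud security groups expose. This is an illustrative toy, not any provider’s actual API: each allow rule is nothing more than a source CIDR, a protocol, and a port range, with an implicit deny for everything else. There is no place to hang a dynamic address group or an application-level object.

```python
import ipaddress

# Hypothetical rules for illustration only (not a real provider API):
# each entry allows a source CIDR, a protocol, and a port range.
RULES = [
    {"cidr": "10.0.1.0/24", "proto": "tcp", "ports": (443, 443)},    # web tier
    {"cidr": "10.0.2.0/24", "proto": "tcp", "ports": (5432, 5432)},  # app tier -> DB
]

def is_allowed(src_ip: str, proto: str, port: int, rules=RULES) -> bool:
    """Security groups are allow-lists: traffic passes if ANY rule matches."""
    src = ipaddress.ip_address(src_ip)
    for rule in rules:
        low, high = rule["ports"]
        if (src in ipaddress.ip_network(rule["cidr"])
                and proto == rule["proto"]
                and low <= port <= high):
            return True
    return False  # implicit deny: anything unmatched is dropped

print(is_allowed("10.0.1.17", "tcp", 443))  # True
print(is_allowed("10.0.3.9", "tcp", 443))   # False: subnet not in any rule
```

Notice that the only vocabulary available is addresses and ports; expressing “allow the members of this dynamic group” requires tooling outside the rule model itself.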

One common design tactic is to route traffic freely within the cloud provider’s controls but then route it through a virtual firewall to provide the controls networking and security teams are used to. This design works to a degree. However, it often proves cost-prohibitive for deployments that span more than two regions. Even when isolated to a single region, this design usually has capacity limitations and runs counter to cloud design principles such as auto-scaling.

This design tactic is an example of trying to replicate data center networking within the cloud. Network teams want it to work in a way with which they are familiar. DevOps just needs it to work. Fast-forward a few months to when the implementation budget is dry. Now the network team has a mess to untangle, and every attempt at resolution is met with pushback.

From Magic Smoke to Black Boxes

So, how do you migrate from the “magic smoke-powered” hardware to the black box networking of the cloud in such circumstances? Rebuilding the application from scratch to be cloud native is both painful and expensive. Many development hours have already been spent breaking apart the application and building the network and security rules to allow only good traffic, and cloud deployments are sold to the business as a cost-saving exercise, after all. A slow redesign and migration are possible, but every day of that transition leaves assets exposed through incomplete network rules. A “tear it all down and move it” approach is in keeping with how cloud resources get utilized; however, while you are redirecting traffic through Domain Name System (DNS) changes and waiting out Time to Live (TTL) expirations, your ability to scale the application is effectively non-existent.

Perhaps your organization has been lucky so far. You have held off on such a migration to cloud resources, imagining that hardware will remain available. However, hardware can be problematic to acquire, creating pressure to embrace cloud-based technical solutions fully. Recognizing that cloud networking is not the same as data center networking means accepting that new strategies are needed. These should be chosen holistically, knowing that the selected methods must be repeatable, scalable, and easy to understand while still providing sufficient controls to protect your organization’s assets.

Nexum’s Roadside Map Shop

Nexum is here to help you both plan and implement this architecture. We have helped many customers migrate successfully to cloud-based solutions, designing and implementing precise and repeatable cloud-based network architecture to provide appropriate services.
