Visibility Series Recap
In Part 1 of this series, Tackling the Visibility Monolith, we discussed the categories of visibility and the objectives for a successful visibility project.
In Part 2, Flashlight on Shadow IT, we dove into the importance of understanding and identifying shadow IT.
Then in Part 3 we talked about Network Alerts, their history, and the problems they can cause.
For our final post in this series, we’ll wrap things up by discussing why documentation matters and how visibility helps with operations.
Many engineers know the feeling of the late-night support call. Let’s say application changes occurred during the night, and now the application no longer works. The engineer, in a likely caffeine-deficient state, needs to understand how the application works – what pieces are involved, where they are located, and through what protocols they communicate. Network diagrams don’t exist, and the change window is closing.
The engineer needs to determine which of the current security controls are keeping the application from functioning. Could it be host-level firewalls? Virtual local area network (VLAN) assignment and some trunk that needs to be changed? There are numerous controls in the network, and there is no guiding map to help detangle this problem.
The engineer in this scenario will fall back on simple tools. If they have to, they will drag out a drawing program, even MS Paint, and sketch how the application should work. It’s not uncommon for the engineer to look at the logs and make a quick rule or route change to get things working. This is where the centralized logging from Part 3 of this series proves its worth. However, these quick fixes tend to become permanent changes that work around the standard firewall rules and routes, creating a potential hole for other traffic to worm its way through. Depending on its nature, a late-night break/fix may even open the organization up to Shadow IT (as discussed in Part 2 of this series).
Mapping the Tunnels
Pretend your network is a city. If you are dropped into the middle of a city with no map and no cell phone, you could still navigate your way around if the street signs are readable. Similarly, in your network you likely know where the firewalls are and have a database (or Excel sheet) of switches and routers. Eventually, you will find out how these talk to each other and trace the traffic between the devices. Perhaps this is where you turn to the Network Alerting and log storage we discussed in Part 3. Can you find this traffic as it traverses the network?
In time, the problem is resolved. As you close the bridge for the night, you hope to clean up the scribbles that became your map and get them into your organization’s documentation solution tomorrow. Then, after some sleep, the day’s schedule starts up, and there goes that idea as you move on to the next project.
A Different Road
Visibility project requests may come from realizing this problem. Maybe management agrees that such a thing would be beneficial to the organization. Generally, visibility projects require an understanding of a given location’s function, how that function is delivered, and the business requirements for the application. The early problem I have seen with this approach is the time it will take to find everyone who knows the disparate pieces, let alone create a cohesive document.
As mentioned above, the first hurdle is finding what applications the organization hosts, and then tracking down the owners for each. But let’s not forget about Shadow IT. How well do you know what is deployed and what everyone uses? The next challenge is to understand how those applications communicate with the rest of the organization. The time and resources for this project are going to add up quickly.
Perhaps the company is keen on micro-segmentation and isn’t sure how to get there, so they bring in an outside organization to help review what they have and how they approach segmentation. The goal is to ensure network visibility as traffic moves between routers and switches. But in two years, when the application team decides to tear down some of the internal apps and replace them with something else, who is responsible for updating that documentation? Certainly not the outside consultant team.
If the organization tries to leverage existing tools, an open-source tool can work, with some caveats. Network Mapper (Nmap) and its forks can scan and show you the hops between devices (assuming, of course, that the intervening devices have Internet Control Message Protocol, or ICMP, enabled, and that it isn’t simply blocked by a firewall somewhere). Parsing that data into something readable and valuable can be extra challenging, since the results are presented from the vantage point of wherever the tool is run.
When trying to turn all this information into cohesive documentation, how would you map this out? You could create multiple pages in a Visio diagram to show logical and physical connections for a given application. Or you could type it all out, with screenshots as supporting evidence, in some shared document. Maybe you opt for a different type of self-hosted knowledge base. Either way, presenting the data effectively and cohesively is a challenge.
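As a rough illustration of that parsing step, here is a minimal Python sketch. It assumes the scan was run with Nmap's --traceroute option and saved as XML with -oX. The SAMPLE_XML fragment is hand-written and trimmed for illustration (the addresses are made up), and hops_by_target is a hypothetical helper name, not part of any tool. Hops that never answered are kept as None so the gaps stay visible.

```python
import xml.etree.ElementTree as ET

# Hand-written, trimmed fragment modeled on Nmap XML output (an assumption
# for illustration; real output carries many more elements and attributes).
SAMPLE_XML = """<nmaprun>
  <host>
    <address addr="10.0.2.15" addrtype="ipv4"/>
    <trace>
      <hop ttl="1" ipaddr="10.0.0.1" rtt="0.52"/>
      <hop ttl="3" ipaddr="10.0.2.15" rtt="1.87"/>
    </trace>
  </host>
</nmaprun>"""


def hops_by_target(xml_text):
    """Return {target address: [hop address or None, ...]} ordered by TTL."""
    root = ET.fromstring(xml_text)
    result = {}
    for host in root.iter("host"):
        address = host.find("address")
        if address is None:
            continue
        # Map each TTL that produced a reply to the address that replied.
        hops = {int(h.get("ttl")): h.get("ipaddr") for h in host.iter("hop")}
        if not hops:
            continue
        # TTLs with no reply (for example, ICMP blocked) become None.
        path = [hops.get(ttl) for ttl in range(1, max(hops) + 1)]
        result[address.get("addr")] = path
    return result


if __name__ == "__main__":
    for target, path in hops_by_target(SAMPLE_XML).items():
        print(target, path)
```

Keep in mind that every path here is still relative to the machine that ran the scan, which is the limitation described above.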
Asking the Locals
If the street signs in this hypothetical city are unreadable, you could ask the people around you for help. In the same way, we can ask the end devices how they communicate. This certainly has its caveats. You are approaching this potential troubleshooting process from the top of the network stack, looking at protocols used, and working down the OSI model. The team also insisted this application was working before the change, so why would VLANs have changed, right?
The idiom I hear often is “trust but verify.” The application team thinks it communicates over port 443. But filtering for traffic in the log repository shows there aren’t any logs. You might then check with a packet capture to verify it’s leaving the source machine. The packet capture reveals it’s traveling to port 8443. No difference at all, really.
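The "trust but verify" check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Flow records stand in for whatever your log repository or packet capture exports, the host names are made up, and compare_ports is a hypothetical helper, not a feature of any product.

```python
from typing import Iterable, NamedTuple


class Flow(NamedTuple):
    """One observed connection (hypothetical record shape)."""
    src: str
    dst: str
    dst_port: int


def compare_ports(flows: Iterable[Flow], src: str, dst: str, documented_ports):
    """Compare the ports a team says it uses with the ports actually observed."""
    seen = {f.dst_port for f in flows if f.src == src and f.dst == dst}
    documented = set(documented_ports)
    return {
        "confirmed": sorted(documented & seen),     # documented and observed
        "undocumented": sorted(seen - documented),  # observed, but nobody mentioned it
        "never_seen": sorted(documented - seen),    # documented, but no traffic found
    }


if __name__ == "__main__":
    observed = [
        Flow("app01", "db01", 8443),
        Flow("app01", "db01", 8443),
        Flow("app01", "web01", 443),
    ]
    # The application team believes app01 talks to db01 over 443.
    print(compare_ports(observed, "app01", "db01", [443]))
```

With the sample data, 443 lands under "never_seen" and 8443 under "undocumented", which mirrors the mismatch in the story above.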
If a tool existed that integrated with the host from the start, you could have saved yourself some time – a tool that reads what traffic is going in and out of that machine, and could be used across the entire organization. Host firewalls often have this data, and depending on how the tool is integrated, it might even control which ports are open in the first place. Such a tool could quickly verify what expected and allowed traffic might exist between two hosts without first needing to take packet captures.
This would require an agent in place on the various hosts. I know, I’ve brought up “yet another agent.” However, its potential value is impressive. Using data from traceroutes, these agents may give information about what routers exist between a given host pair. Add some grouping functionality, and that sounds like a reasonably snazzy tool. It seems the need for this approach is still growing. In time, it may better represent those switches and the transit of firewall traffic to make an application work.
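To make the grouping idea concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the agents are imagined to report traceroute hops per host pair as a plain dictionary, the host names, group labels, and addresses are invented, and routers_between_groups is a hypothetical function. Hops that did not reply are represented as None and skipped.

```python
from collections import defaultdict


def routers_between_groups(traces, host_groups):
    """Collapse per-host-pair traceroute hops into per-group-pair router sets.

    traces:      {(src_host, dst_host): [hop address or None, ...]}
    host_groups: {host: group label}
    """
    paths = defaultdict(set)
    for (src, dst), hops in traces.items():
        key = (
            host_groups.get(src, "ungrouped"),
            host_groups.get(dst, "ungrouped"),
        )
        # Ignore hops that never answered; they carry no address to record.
        paths[key].update(hop for hop in hops if hop is not None)
    return {pair: sorted(routers) for pair, routers in paths.items()}


if __name__ == "__main__":
    traces = {
        ("web01", "db01"): ["10.0.0.1", None, "10.0.2.1"],
        ("web02", "db01"): ["10.0.0.1", "10.0.3.1", "10.0.2.1"],
    }
    groups = {"web01": "web", "web02": "web", "db01": "db"}
    print(routers_between_groups(traces, groups))
```

Grouping by application tier rather than by individual host keeps the result small enough to drop into documentation, at the cost of hiding per-host differences in the path.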
This hypothetical tool exists in a limited capacity. I was inspired by a few of Nexum’s technology partners in writing this article, and in the near future, I believe their solutions will reflect this article’s approach. Those solutions aren’t quite there yet, and open source or self-built automations are too complex and difficult to operate to be reliable. But don’t lose hope; Nexum is here to help in the meantime.
Leaving the City Better Than You Found It
The challenge of getting “yet another agent” onto systems is a fair point to be raised. However, this agent, when used correctly, has the potential to help all teams involved. Nexum has been looking at a few solutions from our technology partners in this space and continues to help organizations build their baseline documentation (depending on how they plan to grow). If this is a challenge you are looking to address, reach out to our team of experts.