How Observability Can Help Manage Complex IT Networks
The key to improving IT infrastructure management is observability, a matter of growing concern for IT leaders as networks become more complex.
When setting out on a digital transformation journey, organizations usually end up with complex infrastructures -- the opposite of the initial goal of these projects.
This is because teams update existing legacy applications and infrastructure, while adding multi-cloud, virtual, and cloud-native capabilities. Eventually, IT pros find themselves managing diverse, complex, and distributed networks across cloud, system, application, and database infrastructures.
A recent IDC report indicates a myriad of obstacles hinder the ability of IT teams to contribute to business goals successfully, in large part due to ineffective tools to manage IT infrastructure.
Sascha Giese, head geek at SolarWinds, says to get a handle on the resulting complexity, organizations tend to accumulate monitoring and managing tools, with the goal of simplifying systems overnight. “But instead, using a wide variety of tools to manage networks or infrastructures causes silos to develop, only hindering IT teams further,” he explains.
These silos worsen operational blind spots, delay problem resolution, and increase security exposures. “Ultimately, this leads to overwhelmed IT pros that can’t keep up with app modernization or infrastructure dynamics,” he adds. “A long-term solution to the struggles faced by IT professionals is observability.”
Integrated observability solutions measure the internal states of systems by examining the outputs from various layers. These tools look at applications and systems in their entirety -- from the end-user experience to server-side metrics and logs.
“Not only does it show what's happening with IT tools, but it helps teams understand the ‘why’,” Giese says. “A well-built observability system uses AI/ML to rapidly identify course correction or provide the essential insights that allow an IT pro to act immediately.”
He explains that with observability, service is predictable, and downtime is significantly reduced. “In addition, teams can become more proactive in issue and anomaly detection -- allowing them to achieve optimum IT performance, compliance, and resilience.”
IT Complexity Stymies Infrastructure Management
William Morgan, CEO and co-founder of Buoyant, agrees that complexity is the biggest challenge facing anyone trying to manage infrastructure.
“As our infrastructure becomes more capable, it tends to also specialize and become more complex,” Morgan says. “Unfortunately, tooling to manage it tends to end up being equally complex, especially when the tools are still fairly new.”
He explains nowhere is this more obvious than in the service mesh space, notorious for its complexity.
“Everything in computing is difficult for humans to see, simply because humans are so much slower than any computer,” Morgan says. “Almost anything we can do to provide visibility into what’s really happening inside the application can be a big help in understanding.”
This means not just fixing things that break, but improving things that are working, or explaining them to users and new developers.
He points to the oldest observability tool, ad-hoc logging -- still in use today -- but adds tools like distributed tracing can provide a standard layer of visibility into the entire application without requiring application changes.
This in turn reduces the burden on developers (less code to write) and on support staff (fewer distinct things to learn).
“As an industry, we’ve created many tools for observability over the years, from print statements to distributed tracing,” Morgan says. “Network analytics bring a welcome uniformity to observability.”
He adds that at a certain level, network traffic is the same no matter what the application is doing, so you can easily get equivalent transparency for every service in your application.
At the same time, it's not possible to understand the details about what’s going on inside a specific service by watching the network from outside (especially in a world with encryption).
“Network analytics are a useful tool in your toolbox, but not a panacea,” he says.
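Morgan's point about uniformity can be illustrated with a minimal sketch. It assumes traffic has already been captured as generic request records; the `Request` shape, its field names, and the `summarize` helper are hypothetical and not taken from any particular tool. The same few numbers come out for every service, whatever the service does.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Request:
    """One observed network request (hypothetical record shape)."""
    service: str       # name of the service that received the request
    status: int        # HTTP-style status code
    latency_ms: float  # time taken to respond, in milliseconds


def summarize(requests):
    """Compute the same metrics for every service from raw traffic records."""
    by_service = defaultdict(list)
    for req in requests:
        by_service[req.service].append(req)

    summary = {}
    for service, reqs in by_service.items():
        # Assumption for this sketch: any status below 500 counts as success.
        ok = sum(1 for r in reqs if r.status < 500)
        summary[service] = {
            "requests": len(reqs),
            "success_rate": ok / len(reqs),
            "max_latency_ms": max(r.latency_ms for r in reqs),
        }
    return summary


traffic = [
    Request("checkout", 200, 42.0),
    Request("checkout", 503, 950.0),
    Request("catalog", 200, 12.5),
]
report = summarize(traffic)
```

A summary like this shows that a service is failing or slow, but, as Morgan notes, not what is happening inside it.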
Bringing IT Observability to the Entire Team
The entire technical organization, from developers to platform engineers to customer support staff and the C-suite, needs observability across the entire application.
Morgan points out developers need detailed information about how well each piece of the application is functioning, while platform engineers need to easily see areas where the infrastructure is limiting performance of the application as a whole.
“Again, in a microservice architecture, it’s critical for these stakeholders to have the visibility they need anywhere in the application, no matter which service is failing, how deeply it’s buried in the call graph or how far from end-user visibility it is,” Morgan says.
From his perspective, it’s not enough to be able to quickly see failures in front-end services, which is why it’s important that the observability tools be applied uniformly across all services within the application.
Collaboration Critical to Observability Projects
Giese adds that when updating IT environments significantly, collaboration between IT teams and the C-suite is crucial, especially as implementing observability solutions within budget and time constraints can be a challenge.
As such, strategic discussions must take place between IT pros and senior leadership -- with discussions focusing on priorities and the necessity for investment of both time and money.
He says all too often, a lack of alignment between IT professionals and the wider business is rooted in disconnected goals.
“To successfully prove the worth of observability, IT pros must be prepared with water-tight proposals that use the language of business and align IT goals with overall targets,” he says. “Only then will this essential solution become a key part of the IT professional’s digital transformation toolkit.”
Giese adds that by using AI to automate repetitive actions, observability tools can also increase IT capacity. “Without spending time responding to false alerts or easy fixes, IT professionals are free to tackle the problems that interest them and push the organization forward,” he says.
However, the more advanced features that come with observability, like automation and ML, require the environment to be somewhat prepared.
Conversely, AI doesn’t need much preparation, as the system will understand what it’s looking at within a few days and provide so-called actionable intelligence -- the system will independently watch the current state, create baselines, and spot anomalies.
“Others call it smart automation, but whatever the name, it’s a way for IT to outsource tasks to a machine, and the engine makes decisions on the data,” Giese explains. “We use deeper analytics from, for example, the network or the infrastructure layer to get this data.”
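The watch, baseline, and spot-anomalies loop Giese describes can be sketched in a few lines. This is an illustration only, not how SolarWinds or any other vendor implements it: the `BaselineDetector` class, its parameters, and the three-standard-deviation rule are all assumptions chosen for simplicity.

```python
from collections import deque
from statistics import mean, stdev


class BaselineDetector:
    """Learns a rolling baseline for one metric and flags outliers."""

    def __init__(self, window=60, threshold=3.0, min_samples=10):
        self.samples = deque(maxlen=window)  # recent "normal" readings
        self.threshold = threshold           # allowed deviation, in std devs
        self.min_samples = min_samples       # readings needed before judging

    def observe(self, value):
        """Return True if value deviates from the learned baseline."""
        anomalous = False
        if len(self.samples) >= self.min_samples:
            baseline = mean(self.samples)
            spread = stdev(self.samples)
            if spread > 0:
                anomalous = abs(value - baseline) > self.threshold * spread
            else:
                # A perfectly flat baseline: any change counts as a deviation.
                anomalous = value != baseline
        if not anomalous:
            # Only normal readings update the baseline, so a single spike
            # does not distort it. A lasting level shift would keep being
            # flagged until the detector is reset.
            self.samples.append(value)
        return anomalous


detector = BaselineDetector(window=30)
readings = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 99, 102, 250]
alerts = [r for r in readings if detector.observe(r)]
```

Here the detector stays silent while it learns from the first readings, then flags only the final spike.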