The pace of change in today’s IT world is truly astonishing. The traditional boundaries between IT roles have blurred. In today’s complex IT environments, Development, Application Support, and IT Operations often collaborate to ensure that application service levels are met. Operations has traditionally relied on siloed monitoring tools to figure out how the infrastructure supports applications. However, an emerging approach called ops-centric APM claims to provide horizontal visibility into transaction performance across all the silos that support applications, all the way from the desktop down to the database and the application code. In this series of posts, I’ll examine the subtleties of the ops-centric APM approach compared with developer-centric APM, and what it means for IT.
Let’s start with the developer-centric APM approach. These tools use bytecode instrumentation (BCI) technology to gather application performance data. They instrument applications at the method level and provide KPIs and notifications based on the performance of these methods. This allows them to find problems related to the application code. Many of these tools evolved from code profilers, and their language and user interface are generally compelling to developers. Bytecode instrumentation is known to add overhead, so the best practice in production is to limit instrumentation to API calls and avoid deep method instrumentation.
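To make the idea of method-level instrumentation concrete, here is a minimal sketch in Python. It is only an analogy: real BCI tools rewrite compiled bytecode when classes are loaded, whereas this sketch wraps a function with a decorator. The names `instrumented`, `call_durations`, and `lookup_order`, as well as the threshold values, are made up for illustration and do not come from any APM product.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory KPI store: function name -> list of durations (seconds).
call_durations = defaultdict(list)


def instrumented(threshold_s=0.5):
    """Wrap a function so every call is timed and slow calls are reported."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs whether the call returned or raised.
                elapsed = time.perf_counter() - start
                call_durations[func.__qualname__].append(elapsed)
                if elapsed > threshold_s:
                    # Stand-in for a real notification channel.
                    print(f"SLOW: {func.__qualname__} took {elapsed:.3f}s")
        return wrapper
    return decorator


@instrumented(threshold_s=0.01)
def lookup_order(order_id):
    time.sleep(0.02)  # stand-in for real work such as a database call
    return {"id": order_id}


lookup_order(42)
```

Every wrapped call pays for the timing and bookkeeping, which hints at why instrumenting every method deep in the call stack becomes expensive in production.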
The main drawback of BCI technology is its limited view of application deployments. BCI technology can only be applied to Java and .NET applications. However, a production enterprise application depends on many other components that are simply off the radar of BCI technology. For example, proxy servers, web servers (e.g. Apache, Sun One), load balancers, ESBs, SSO, rich clients, and many other components are not visible to a tool that depends on BCI. The bottom line is that developer-centric APM lacks a holistic view of how the entire infrastructure supports applications.
An operations-centric approach to APM starts from different requirements. Ops handles numerous applications and doesn’t have the bandwidth to get familiar with the code of each managed and packaged application. It requires a tool that can easily be deployed across the enterprise without depending on Development to help set it up, and that offers broader visibility into the entire infrastructure that supports the application, beyond the .NET and Java processes. Finally, it requires an “always on” solution that models, baselines, and shows the trends of application performance across the entire infrastructure.
Consider a production deployment of a packaged application. You will have a few Apache web servers in the front end, connected to the organization’s LDAP and SSO platform. The next tier will be a cluster of Java processes behind a load balancer, and at the back end you will probably find a database, web services, ESBs, and a mainframe.
IT Operations is in charge of keeping this running and meeting its SLAs. CPU readings, network throughput, and other resource consumption metrics are important, but they don’t tell much about application performance and the end-user experience. Dev-centric APM tools are usually hard to implement on a packaged application, and even when they work they only provide visibility into the performance of the Java containers. A failure at the SSO level will not be visible to these tools.
Ops-centric APM promises to provide an “always on” horizontal monitoring solution that tracks each end-user request from the desktop down to the database, showing how each silo in the transaction path impacts transaction performance and the end-user response time. The continuous modeling of application performance helps to detect changes in that performance and isolate the root cause across the tiers by comparing the performance model to a baseline period.
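As a rough illustration of comparing current performance to a baseline period, the sketch below takes per-tier latency samples and picks the tier whose average latency has grown the most. It is a simplified assumption of how such a comparison might work, not a description of any vendor’s model. The tier names, sample values, function names, and the 50% threshold are all hypothetical.

```python
from statistics import mean


def tier_deviation(baseline, current):
    """Return, per tier, the relative change in mean latency versus the baseline."""
    deviations = {}
    for tier, base_samples in baseline.items():
        base_avg = mean(base_samples)
        cur_avg = mean(current[tier])
        deviations[tier] = (cur_avg - base_avg) / base_avg
    return deviations


def likely_culprit(baseline, current, threshold=0.5):
    """Return the tier with the largest slowdown, or None if nothing exceeds the threshold."""
    deviations = tier_deviation(baseline, current)
    tier, change = max(deviations.items(), key=lambda kv: kv[1])
    return tier if change > threshold else None


# Hypothetical latency samples in milliseconds for each tier in the transaction path.
baseline = {"web": [20, 22, 21], "sso": [15, 14, 16], "app": [80, 85, 78], "db": [40, 42, 41]}
current = {"web": [21, 23, 20], "sso": [60, 58, 65], "app": [82, 84, 80], "db": [41, 40, 43]}

print(likely_culprit(baseline, current))  # prints "sso"
```

In this made-up data the Java tier looks healthy, so a tool that watches only the application containers would miss the slowdown, while a per-tier comparison points at SSO.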
The operations-centric approach should be strongly considered by IT shops that need broad-based coverage in a complex environment. What capabilities are most important to you or your IT team?