The Transaction Tracing Opportunity

Corporate computing environments can be complex, involving many different programming languages, applications, middleware components, and endpoints. Many composite applications integrate new software, legacy code, and packaged solutions. Enterprise application performance monitoring works by solving the transaction tracing problem. This means finding a way to identify application latency by tracking a transaction from a mobile application or browser (or a Windows desktop), across the application server and messaging system, to one or more databases and then back to the user. (And that is just a simple example.)
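To make the idea concrete, here is a minimal sketch of what "tracking a transaction" can look like in data terms. It assumes each tier reports a timed record tagged with a shared transaction ID, and that the records do not overlap (each one covers only the time spent in that tier). The Span record, the tier names, and the timings are hypothetical and are not taken from any particular monitoring product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One timed hop of a transaction through a single tier (hypothetical record)."""
    transaction_id: str
    tier: str
    start_ms: int
    end_ms: int


def latency_by_tier(spans, transaction_id):
    """Sum the time one transaction spent in each tier."""
    totals = defaultdict(int)
    for span in spans:
        if span.transaction_id == transaction_id:
            totals[span.tier] += span.end_ms - span.start_ms
    return dict(totals)


def end_to_end_ms(spans, transaction_id):
    """Elapsed time from the first hop's start to the last hop's end."""
    own = [s for s in spans if s.transaction_id == transaction_id]
    return max(s.end_ms for s in own) - min(s.start_ms for s in own)


# Made-up timings: browser -> app server -> messaging -> database -> app server.
SPANS = [
    Span("tx-1", "browser", 0, 20),
    Span("tx-1", "app_server", 20, 60),
    Span("tx-1", "messaging", 60, 75),
    Span("tx-1", "database", 75, 400),
    Span("tx-1", "app_server", 400, 420),
    Span("tx-2", "browser", 5, 15),  # a different transaction, ignored below
]

per_tier = latency_by_tier(SPANS, "tx-1")
slowest_tier = max(per_tier, key=per_tier.get)
print(per_tier, end_to_end_ms(SPANS, "tx-1"), slowest_tier)
```

In this made-up trace the database accounts for most of the end-to-end time, which is the kind of answer the user's "why is it slow?" question is really asking for.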

This is called the “holistic” approach to performance monitoring.  What matters to the end user is not the seek time on the disk, the latency across the network, or the SQL statement running out of control.  The user just wants to know why the application is running sloooooow.

The application performance monitoring tool knows the architecture of the entire IT infrastructure, so it knows which components are involved in each type of transaction. The monitoring tool presents a dashboard to the IT operations staff so they can monitor end-to-end performance. If the application is operating within service levels, everything is green. If not, the dashboard turns red. Either way, the tool lets the analyst click on components in the topology and drill down to see the response time of each piece.

The performance monitoring tool presents two possibilities to the analyst:  (1) the dashboard is red or (2) the dashboard is green.

Dashboard is Red

If the help desk starts getting calls that the system is slow, the support person consults the application monitoring dashboard. If the dashboard shows red, the application is operating outside norms or established thresholds. With the end-to-end performance monitoring tool, the analyst selects the application marked red and then clicks to show the tiers or components that make up the transaction. The tool breaks the application down into its tiers (LDAP, shared services, web server, proxy, messaging, and application server), showing the response time in each. When the analyst clicks on any of the components, the performance monitoring tool shows the response time for each transaction, marking in red those components that are operating outside norms.
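The red/green logic behind such a breakdown can be sketched in a few lines. This is an illustration only: the tier names come from the list above, but the response times, the thresholds, and the rule that one red tier turns the whole application red are assumptions, not the behavior of any specific tool.

```python
def tier_status(response_ms, threshold_ms):
    """A tier is red when its response time exceeds its threshold."""
    return "red" if response_ms > threshold_ms else "green"


def application_status(tier_response_ms, tier_threshold_ms):
    """Return the overall color plus the color of each tier.

    Assumption: the application is red as soon as any one tier is red.
    """
    statuses = {
        tier: tier_status(ms, tier_threshold_ms[tier])
        for tier, ms in tier_response_ms.items()
    }
    overall = "red" if "red" in statuses.values() else "green"
    return overall, statuses


# Hypothetical measurements and thresholds, in milliseconds.
RESPONSE_MS = {
    "LDAP": 900,
    "shared services": 40,
    "web server": 25,
    "proxy": 10,
    "messaging": 30,
    "application server": 120,
}
THRESHOLD_MS = {
    "LDAP": 200,
    "shared services": 100,
    "web server": 100,
    "proxy": 50,
    "messaging": 100,
    "application server": 300,
}

overall, statuses = application_status(RESPONSE_MS, THRESHOLD_MS)
red_tiers = [tier for tier, color in statuses.items() if color == "red"]
print(overall, red_tiers)
```

With these numbers only the LDAP tier exceeds its threshold, so that is where the analyst would click next.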

Dashboard is Green

What do you do when a user calls and the dashboard is green? Green means the application is operating within the agreed service level, say, 99%. But that number is only an average; it does not reflect every individual event.
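A small worked example shows how this can happen. It assumes the 99% service level means "at least 99% of transactions finish within a target time"; that reading, the 2-second target, and the sample response times are all made up for illustration.

```python
TARGET_MS = 2000        # hypothetical response-time target
SERVICE_LEVEL = 0.99    # share of transactions that must meet the target

# 995 fast transactions and 5 very slow ones (made-up sample).
response_times_ms = [300] * 995 + [15000] * 5

within_target = sum(1 for t in response_times_ms if t <= TARGET_MS)
compliance = within_target / len(response_times_ms)
mean_ms = sum(response_times_ms) / len(response_times_ms)
status = "green" if compliance >= SERVICE_LEVEL else "red"
slow = [t for t in response_times_ms if t > TARGET_MS]

print(status, compliance, mean_ms, len(slow))
```

Here the dashboard stays green (99.5% of transactions meet the target and the mean is well under it), yet five transactions took 15 seconds each. The users behind those five are the ones calling the help desk.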

If the user is experiencing latency and the dashboard is green, the analyst cannot infuriate the customer by telling them that everything is green and he or she cannot reproduce the problem. Instead, the analyst needs to drill down into the application to get transaction-level details. If the top-level dashboard shows green, drill down into the application the user is using. Then drill down into each component until you get transaction-level details. Sort these by response time. There the analyst sees that a particular transaction is taking too long. Click on the transaction to see the details of, for example, the Java methods or the LDAP calls. It could be that this user is looking up something or entering data that is not used very often. The analyst could then discover, for example, that the LDAP group lookup (&(cn=something)(objectclass=group)) is taking too long. This could indicate corruption in the LDAP directory for that particular group, or a coding issue.
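The drill-down described above amounts to filtering, sorting, and picking the slowest call. The sketch below assumes the tool can export transaction records with per-call timings; the record layout, user names, Java method names, and timings are hypothetical. Only the LDAP filter string is taken from the example above.

```python
# Hypothetical transaction records: each has per-call (name, milliseconds) details.
TRANSACTIONS = [
    {"id": "tx-101", "user": "alice", "response_ms": 240,
     "calls": [("java:OrderService.list", 180),
               ("ldap:(&(cn=staff)(objectclass=group))", 40)]},
    {"id": "tx-102", "user": "bob", "response_ms": 310,
     "calls": [("java:OrderService.list", 290)]},
    {"id": "tx-103", "user": "alice", "response_ms": 9200,
     "calls": [("java:ReportService.build", 350),
               ("ldap:(&(cn=something)(objectclass=group))", 8800)]},
]


def slowest_transactions(transactions, user, limit=3):
    """Return one user's transactions, slowest first."""
    own = [t for t in transactions if t["user"] == user]
    return sorted(own, key=lambda t: t["response_ms"], reverse=True)[:limit]


def slowest_call(transaction):
    """Return the (name, milliseconds) pair of the slowest call in a transaction."""
    return max(transaction["calls"], key=lambda call: call[1])


worst = slowest_transactions(TRANSACTIONS, "alice")[0]
call_name, call_ms = slowest_call(worst)
print(worst["id"], call_name, call_ms)
```

In this sample the sort surfaces one outlier transaction for the user, and the per-call detail points at the LDAP group lookup rather than the Java code.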

This is the basic approach to using the performance monitoring tool to solve the transaction tracing problem.