The Evolution of Transaction Management – Part 2

Transaction management tools monitor the entire system, not just singular components. This article discusses how transaction management enables true availability.

Transaction management is a natural continuation of the past 15 years of IT systems management evolution. In the last few years, the data center has finally started to stabilize—the set of “node types” has settled, and dedicated tools have been developed for each of them.

It seems that distributed systems are not going to gain any additional node types beyond proxies, Web servers, application servers, databases, message brokers, and storage anytime soon. This is not to understate the complexity of service-oriented architectures, but as far as tracking transactions is concerned, things have stabilized enough to make a viable solution possible.

Silo-specific tools can solve 90 percent of the problems, leaving IT departments with the hardest to solve: the last 10 percent. That last 10 percent is characterized, for example, by application bottlenecks that occur even though every silo-specific tool reports 100 percent availability. Traditional monitoring tools struggle to obtain good response-time metrics outside the mainframe, and in mission-critical incidents, simply locating the source of the latency is usually the longest step on the way to resolution.

If the monitoring tools at all tiers are showing 100 percent availability, how do you know there is a problem? Either the enterprise has put in place an end-user measurement tool, or the help desk is receiving user complaints.

The IT organization’s number one priority is very simple: ensure that all transactions are executing correctly and in a timely manner.

The demand for business transaction management tools originates from exactly that sort of issue: service levels are not acceptable, but current tools fail to show it. What has enabled the development of these transaction management tools is the data center’s recent stabilization. With the number of node types staying constant for a few years, it is finally possible to catch up and enable full visibility into the entire data center. This is exactly what transaction management solutions must provide.

Transaction Management Enables True Availability

The only way to resolve the problems of traditional monitoring tools is to take into account the interactions between the nodes and to connect every click of the user to the many events that are triggered by that activation. This makes it possible both to see everything in its business context—the user’s click of a button—and to solve problems where the symptom and the cause are located at different tiers. For example, IBM WebSphere Portal v5.1.1 had a bug where the internal caching wasn’t working properly, which caused the server to send out 15,000 SQL statements per second to DB2, overloading the database and producing poor response times.

Transaction management tools monitor the entire system by tracking every single transaction that is activated by the users throughout the entire data center, collecting information on all of the interactions along the way. This is the essence of connecting business and IT—understanding the business context by linking every single event in the data center to the click of a user.

With transaction management, a common language is created, where performance can be measured as the time it takes for the transaction to travel between all of the nodes and back. Since all interactions are recorded along the way, if something goes wrong—for example, a latency that is out of specification—then the event that caused the problem can be immediately singled out, since the latencies at each tier are known and all of the necessary data has been collected.
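To make the idea concrete, here is a minimal sketch of how recorded per-tier latencies let a tool single out the offending tier. The record fields and tier names are illustrative assumptions, not any vendor's actual data model.

```python
# Hypothetical sketch: given timestamped hop records for one transaction,
# sum the dwell time per tier and single out the worst offender.
from dataclasses import dataclass

@dataclass
class Hop:
    tier: str     # e.g. "web", "app", "db" (illustrative names)
    entered: int  # milliseconds since the user's click
    exited: int

def slowest_tier(hops: list[Hop]) -> tuple[str, int]:
    """Return the tier with the largest total dwell time for this transaction."""
    dwell: dict[str, int] = {}
    for h in hops:
        dwell[h.tier] = dwell.get(h.tier, 0) + (h.exited - h.entered)
    worst = max(dwell, key=lambda t: dwell[t])
    return worst, dwell[worst]

# A transaction that spent most of its time waiting on the database:
hops = [Hop("web", 0, 20), Hop("app", 20, 100),
        Hop("db", 100, 1350), Hop("app", 1350, 1400)]
print(slowest_tier(hops))  # → ('db', 1250)
```

Because every hop is recorded, the answer falls out of simple arithmetic rather than guesswork across silo dashboards.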

What Qualifies as a Transaction Management Solution?

Every transaction management solution should be able to track a transaction that was sent out from a browser, through a load balancer, and to a proxy. Let’s say that this proxy is not a known vendor’s proxy, but instead one developed by a start-up. The proxy still runs in production, but no one has its source code—only the binary.

The proxy is sending the work to another proxy, and the transaction is split across two different applications, each with its own Web server. Each application is running on its own app server. One is Java-based, while the other is a homegrown C++ program. The first app server is sending out database requests to four different databases: Oracle, DB2, Sybase, and SQL Server. The second app server is SOAP-based, sending SOAP requests to an external application. The external application fronts a message queue: ‘put’ operations are sent to the MQ, and a mainframe receives the messages from it.

If a vendor wants to be called a business transaction management (BTM) vendor, it has to be able to track the transaction across all of these components and account for the time every transaction spends at every point in the data center.

Developing a Transaction Management Solution
Developing this kind of solution is far from trivial. The task of connecting each event within the data center to a specific transaction is a big challenge in distributed and heterogeneous systems.

The general concept behind a transaction management solution is straightforward: agents are installed at every tier, each collecting information about everything that flows through it, and all of the collected data is sent back to a central, dedicated server that correlates the events to a single user’s click of a button in the application. The various methods of gluing these events together are the core competency of the solution, along with having collectors that support all of the various servers found in the data center.
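The central correlation step described above can be sketched in a few lines. This is a deliberately minimal model, with illustrative field names, assuming the hard problem of tagging each event with the right transaction ID has already been solved by the agents.

```python
# Minimal sketch of the correlation server: agents at each tier report
# (txn_id, tier, detail) events, and the server groups them back into
# per-transaction timelines, one timeline per user click.
from collections import defaultdict

def correlate(events: list[tuple[str, str, str]]) -> dict[str, list[tuple[str, str]]]:
    """Group agent events by transaction ID, preserving arrival order."""
    timelines: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for txn_id, tier, detail in events:
        timelines[txn_id].append((tier, detail))
    return dict(timelines)

# Interleaved events from two concurrent user clicks (made-up data):
events = [
    ("t1", "web", "GET /checkout"),
    ("t2", "web", "GET /search"),
    ("t1", "app", "processOrder()"),
    ("t1", "db",  "INSERT INTO orders"),
    ("t2", "db",  "SELECT FROM catalog"),
]
timelines = correlate(events)
print(timelines["t1"])
# → [('web', 'GET /checkout'), ('app', 'processOrder()'), ('db', 'INSERT INTO orders')]
```

The gluing methods the article mentions are precisely what produce a reliable `txn_id` on every event; once that exists, reassembling the timeline is the easy part.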

Many transaction management solutions on the market today include some of the essential elements, but fail to compile all of the necessary components into one solution. All too often these products:

  • Introduce too much overhead when intercepting transaction segments at different tiers, making it impossible to continuously capture all transactions in the production environment.
  • Lack coverage for all of the various tiers, including Web servers, non-J2EE/.NET application servers, message brokers, databases, and mainframes.
  • Are unable to correlate between the various hops that a transaction makes through the data center, which greatly reduces the value of transaction monitoring and limits it to individual silos.
  • Are too invasive for many enterprises to adopt; often, code changes and long implementation times are required.
  • Are unable to break down the response time into browser rendering time, network time, and the time spent at the various tiers within the data center.
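The last bullet's breakdown is, at its core, simple accounting: if an end-user measurement reports the total and rendering times, and the data center agents report server-side time, the network's share is the remainder. The numbers and function below are purely illustrative.

```python
# Illustrative arithmetic only: decomposing user-perceived response time,
# assuming the browser reports total and rendering time and data center
# agents report the time spent inside the tiers. All numbers are made up.
def breakdown(total_ms: int, render_ms: int, server_ms: int) -> dict[str, int]:
    """Split response time; the unaccounted remainder is attributed to the network."""
    network_ms = total_ms - render_ms - server_ms
    return {"browser": render_ms, "network": network_ms, "server": server_ms}

print(breakdown(total_ms=2400, render_ms=300, server_ms=1600))
# → {'browser': 300, 'network': 500, 'server': 1600}
```

Tools that cannot produce all three inputs cannot perform this split, which is why the breakdown appears on the list of missing capabilities.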

SharePath from Correlsense is designed to solve these problems. Not only does it capture all of the relevant transaction data, but it also presents that data in usable graphs, with tools that immediately convey the specific actions required to resolve the problem at hand.