
February 26, 2014

Posted by Elad Katav

Why Enterprise APM Projects Fail

I am on a plane, on my way back from the best customer visit I have ever had, in the UK. One of our largest customers asked, prior to beginning his project, to visit another customer site to discuss “implementation.” He wanted to learn from a successful customer and improve his plan for utilizing SharePath. I was so pleased with the results of the meeting that I realized I know not just why APM projects succeed, but also why they fail. That’s what this post is all about. In order to understand why APM projects fail, let’s first discuss what is so special and challenging about the APM market, and especially the Enterprise APM market.

This is what a successful enterprise APM project looks like.

Standard APM tools focus on one isolated application built on one technology stack. That’s simply not what enterprise APM is all about. Enterprise IT is like a spider web: many dedicated apps connected to legacy applications, linked to commercial applications, and so on. The enterprise may have multiple middleware layers mixing message brokers, service buses, queues, and other components. Employees and customers may access applications using browsers, rich clients, and even terminals. Enterprise IT has many layers, some of them old, really old, and it can get very complicated, very fast.

The major challenge in addressing enterprise application performance is simply this: who owns APM? Think about it for a moment. Most companies do not have an application performance management group. When I ask the question “Who owns end-to-end application performance at your company?” I get many different answers. It may be application development, application support, QA, operations, engineering, capacity planning, networking, or the most common two: “No one” or “It depends.” Beyond that, APM is often treated like old-fashioned network monitoring: “When we have a problem, please let me know by sending an alert.” This is a huge mistake; APM is not just about crisis management but also about performance analytics.

The simple question of why two clicks that both met the terms of an SLA, on the same system, at the same time, have different response times is usually an enigma. Very few companies can answer that question, and that’s what APM is mainly about. Companies need to find these answers by closely observing performance phenomena and analyzing them. Even more importantly, they need to take action to prevent the next occurrence. Crisis management is only one kind of performance issue, and usually the result of a lack of effective analytics. Enterprise APM is not easy. That sounds like an overly simple statement, but it is true. In order to understand performance you have to understand programming, databases, networks, servers, operating systems, middleware, storage, and information security.
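
To make the analytics idea concrete, here is a minimal sketch, in Python with invented tier names and timing numbers (this is not SharePath’s actual data model), of comparing the hop-level breakdown of two identical clicks to see where their response times diverge:

    # Minimal sketch: compare hop-level timings of two identical clicks.
    # Tier names and millisecond values are invented for illustration.
    trace_a = {"web": 12, "app": 45, "queue": 8, "db": 30}    # within SLA
    trace_b = {"web": 14, "app": 48, "queue": 9, "db": 390}   # same click, slower

    deltas = {tier: trace_b[tier] - trace_a[tier] for tier in trace_a}
    worst = max(deltas, key=deltas.get)

    print(f"totals: {sum(trace_a.values())} ms vs {sum(trace_b.values())} ms")
    print(f"largest contributor to the gap: {worst} (+{deltas[worst]} ms)")

With hop-level data like this, the “enigma” reduces to reading off which tier absorbed the extra time.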

People who understand the complete picture are not easy to find and are usually very expensive. When these smart guys are not around, performance becomes everyone’s problem, and many (too many) different groups “own” it. The challenge is that these groups look at problems differently, speak different technology dialects, and own different management tools. This situation usually forces a reactive approach to performance rather than a proactive one. Enterprise APM software can’t fix these problems by itself; buying a software tool doesn’t address the root problem. It can, however, trigger the discussion that we are so keen to have with our customers: how does APM fit in your organization? Your answer to that question is why Enterprise APM projects succeed or fail.

Deployment vs. Implementation

It is important to distinguish between the two different parts of an APM project: deployment and implementation. Good deployment involves solid R&D behind the product, extremely professional support, effective training, and a successful configuration process that supports business needs. It might sound counterintuitive, but deployment rarely fails. Deployment issues always arise, and bugs will be part of the software industry forever, but a good and responsive R&D organization can fix them all. Add effective and reliable customer communication, and most deployments succeed.

Implementation is a totally different story. It is a different project, with a different set of owners, a different challenge, and a different goal, because it involves ALL IT departments, and each of these groups has its own role in APM. Enterprise APM implementation managers should ask: how can I get the maximum value with the minimum installation in the least time? Do I need to do all the crazy things that everyone is talking about? What are the real technical requirements, and where do they lead the organization? Why do I need all those installations in order to deliver a simple report and start delivering better performance?

Troubleshooting vs. “Value-Shooting”

Troubleshooting is about fixing problems. That is obviously important, and it may have been the justification for an enterprise APM deployment. However, it isn’t enough. The goal is to create value for the company: more uptime, increased revenue, happier customers. How should we deal with this challenge? Again, we must distinguish between deployment and implementation. During deployment, APM providers engage in troubleshooting, a straightforward technical process that often uses templates to get things done. During implementation, however, APM providers must engage in a process I call “value-shooting.” This involves asking the question: what value can we deliver to you?

The answer is what we at Correlsense call the “Target 4”: alerts, analysis, live dashboards, and reports. Everyone in your IT organization can benefit from at least one of the four. Analysis is the hardest part; it might involve advanced installation and changes to organizational processes, so I recommend starting to shoot for value from the first day of the deployment. From the first collector in a test environment we can start measuring ROI (remember: APM is for test as well as for production).

When an APM Project Focuses on Troubleshooting and Deployment, It Will Probably Fail

Why is APM such an important issue in the enterprise? Take out your iPhone and use Google to search for the term “Enterprise APM.” How long does it take? Now think for a second about what Google had to do to respond to your search. How many servers were involved? Think about the algorithm that prioritized the results, think about the algorithm that targeted the advertising on the side of the results page (remember, it is specific to your browser location), and now think about the performance. It’s insanely good! Do the same while rebooting your Mac, or while uploading a picture to Facebook. These companies changed the concept of performance in the technology industry forever.

Performance is not a luxury anymore. It is expected. More and more people are asking themselves why it takes eight seconds to access a single insurance policy from a LOCAL database when searching the entire web takes a few milliseconds. Everybody in your organization is using Google; expectations are changing. There is no reason your company can’t deliver the same level of performance as the leading Internet companies. There is simply no excuse for it to take more time to log in to a banking system than to boot a Mac.
Good luck with your project and feel free to comment.

Elad Katav, COO

September 28, 2012

Webinar: 5 APM and Capacity Planning Imperatives for a Virtualized World

The proliferation of virtualized applications has greatly increased the complexity of capacity planning and performance management. Monitoring and forecasting CPU utilization is no longer enough. IT operations and capacity planners now must understand and optimize their applications and infrastructure from the end user to the data center.

Join Correlsense and Metron-Athene for an online seminar which will explore key performance management and capacity planning strategies for a virtualized world. We will discuss:

  • What you need to know about capacity management when operating in both physical and virtual environments
  • How performance monitoring in virtual environments relates to your capacity management goals
  • What is unique about capacity and performance management for virtualized applications

January 16, 2012

Business Transaction Management – the Next Generation of Business Service Management

Why a New Generation? What’s Wrong with the Old One?

Traditional systems management tools focused on monitoring the health of individual components. Tools like IBM Tivoli, BMC Patrol, CA Unicenter, and HP OpenView initially focused on management of servers, services, and resources. In those days, the equation was relatively simple: 100% CPU utilization = bad, 10% CPU utilization = good.

However, the increasing complexity of applications introduced numerous new enterprise application components including databases, connection pools, Web servers, application servers, load balancing routers, and middleware. The business service management (BSM) industry followed shortly after, and began offering tools for database management, network traffic monitoring, application metrics mining, and analyzing Web server access logs.

Each of these business service management tools “speaks” a different language: database management tools speak in “SQL statements,” network traffic tools use “packets,” while systems monitoring tools report in “CPU and disk usage.”

What Happens When the Application Crashes or Hangs? What Do You Do if a Single Transaction Suffers Slow Response Times?

In comes the war room. To cope with the proliferation of information sources, enterprises developed the notion of the war room. Whenever slow response times or poor performance of critical applications is detected, the relevant personnel are grouped together in a room for brainstorming and joint monitoring. This involves a large number of professionals, since a single transaction may flow through several infrastructure components. For example, a financial transaction will trigger an HTTP request to an Apache Web server installed on top of Red Hat Enterprise Linux, which in turn calls a WebSphere application server on a Windows machine, flowing through an MQSeries queue and eventually querying an Oracle database.

Members of the war room typically include Java and J2EE performance experts, Microsoft Windows system managers, Unix (Linux, Solaris, HP-UX, etc.) system managers, database administrators, network sysadmins, and proxy specialists, just to name a few. This is a lengthy process that can take thousands of man hours to complete.

The New Paradigm – Business Transaction Monitoring

The new generation of systems monitoring and management tools, widely referred to as Business Transaction Management (BTM), offers a new approach. Instead of monitoring SQL statements, TCP/IP packets, and CPU utilization, transaction management tools view everything from an application perspective. In the world of transaction management, an application is considered a collection of transactions and events, each triggering actions on the infrastructure. The goal is to track every transaction end-to-end and correlate it with the information collected from the infrastructure. Such an end-to-end view makes it possible to quickly isolate and troubleshoot the root cause of performance problems and to start tuning proactively. This application-centric information base enables a group of professionals working together to speak the same language and focus on facts, rather than guesswork.
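
As an illustration of the transaction-centric view, here is a minimal Python sketch, with an invented event format rather than any real BTM product’s API, that correlates per-tier events by a shared transaction ID and reports the slowest hop of each transaction:

    from collections import defaultdict

    # Invented records: each tier reports (transaction_id, tier, duration_ms).
    # In a real BTM deployment these would come from instrumentation agents.
    events = [
        ("tx-1001", "apache", 15), ("tx-1001", "websphere", 95),
        ("tx-1001", "mqseries", 10), ("tx-1001", "oracle", 640),
        ("tx-1002", "apache", 14), ("tx-1002", "websphere", 90),
        ("tx-1002", "mqseries", 9), ("tx-1002", "oracle", 35),
    ]

    # Correlate hops into end-to-end transactions by shared ID.
    transactions = defaultdict(list)
    for tx_id, tier, ms in events:
        transactions[tx_id].append((tier, ms))

    # Report total time and the slowest hop: facts, not guesswork.
    for tx_id, hops in sorted(transactions.items()):
        tier, ms = max(hops, key=lambda h: h[1])
        total = sum(m for _, m in hops)
        print(f"{tx_id}: {total} ms end-to-end, slowest hop: {tier} ({ms} ms)")

The point is not the few lines of code but the data: once every hop carries the transaction’s identity, the war-room question of whose component is slow becomes a lookup.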

According to IDC (Business Transaction Management – Another Step in the Evolution of IT Management), BTM will likely become a core offering of established IT system management vendors, since it can contribute to almost every aspect of IT management – ranging from performance management, SLA management, and capacity planning, to change and configuration management (CMDB).

November 22, 2010

Transaction-Based Capacity Planning

Managing day-to-day IT operations is like piloting a large freighter: some days the trip is smooth sailing, while other days are fraught with stress. Why is our e-commerce site slowing down at 11:00 pm every Thursday? How will I support the roll-out of a new service without adding more hardware? Having the analytics needed to accurately isolate problems and make informed business decisions is important. This paper discusses the Correlsense and Metron approach to capturing and correlating transaction-level data, and how enterprises can use this information to determine whether they have sufficient capacity at both the service and component levels to meet the needs of the business.

Transaction-Level Data Drives Business Planning
Every IT shop has a dashboard of information that shows what’s happening at any given time. For most IT organizations this dashboard is assembled from an arsenal of tools, each able to handle only a specific task, or tools that take only random samples, which in turn cannot give a complete picture of what’s really happening.

Understanding how service levels are affected by infrastructure and application components requires studying how transactions are performing. A transaction includes everything that happens in the data center from the moment the user clicks a button until they get a response. Transactions are what drive business. Transactions are what the user experiences, and transactions are what traverse the various infrastructure, network, and application components. Simply put, transactions are the common denominator that links the business and all of the various elements of IT.

Capacity Planning Challenges
The following comments are typical of the challenges facing capacity managers today:

“We can monitor and understand the performance of our estate at the component level, but we are struggling to determine the performance of our applications and services and how they are affected by the components.”

Capacity planning is an important step in forecasting how IT can support the growing needs of the business. One of the challenges that many capacity planning prospects and customers have is simply knowing which servers and components serve which applications to begin with. Without this knowledge it is difficult to go through a proper capacity planning process.

Many solutions focus on monitoring the individual IT components and creating usage models to predict their utilization alongside business growth. In simple application architectures this will suffice, but in complex and dynamic architectures, where applications communicate with each other, it is difficult to predict the true impact of business growth for a specific department or product line, because the complexity makes it hard to tell which infrastructure items (servers and their logical software components) are serving which business applications.
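
As a toy illustration of why that mapping matters, the Python sketch below (all names and numbers are invented) forecasts per-server CPU demand from per-transaction resource costs and a growth assumption; the cpu_ms_per_tx mapping is exactly the knowledge that is missing without transaction tracing:

    # Toy capacity model with invented numbers: the CPU cost each transaction
    # type imposes on each server it touches, and current hourly volumes.
    cpu_ms_per_tx = {
        "login":     {"web1": 5, "app1": 20, "db1": 10},
        "buy_stock": {"web1": 8, "app2": 60, "db1": 45},
    }
    hourly_volume = {"login": 40_000, "buy_stock": 12_000}
    growth = 1.30  # assume 30% business growth

    demand_ms = {}
    for tx, costs in cpu_ms_per_tx.items():
        for server, ms in costs.items():
            demand_ms[server] = demand_ms.get(server, 0) + ms * hourly_volume[tx] * growth

    for server, ms in sorted(demand_ms.items()):
        # one core supplies 3,600,000 CPU-ms per hour
        print(f"{server}: {ms / 3_600_000:.2f} cores needed")

Grow only the “login” volume instead of everything, and the answer changes server by server; without knowing which transactions hit which servers, no such statement can be made.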

The main challenges of capacity planning, with respect to infrastructure planning and the optimization of resources, are:

  • What is the expected business growth next year, and how will it affect our infrastructure?
  • Which infrastructure items (servers and logical components) are serving which business application or business transaction?
  • Which application infrastructure can be consolidated without degrading service levels?
  • How will the relationship between the underlying components affect both the services and, in a wider context, the business that uses them?
  • How do constantly changing environments impact monitoring requirements?

These challenges are very hard to address when:

  • The IT infrastructure is complex, dynamic and based on a multi-tier structure
  • Multiple applications continuously interact with each other across multi-platform infrastructures

“We’ve consolidated and virtualized our servers and now they’re running at much higher utilization, but our service performance targets don’t seem to be as relevant.”

How to Jointly Address the Challenges
The transaction is the missing link that correlates across all of these components. By tracing transactions, Correlsense SharePath provides the contextual visibility that capacity planning for complex environments requires. Feeding this transaction-contextual data into Metron’s Athene modeling software makes it the basis for addressing all of the capacity questions faced by an organization.

About Correlsense SharePath
Correlsense offers software that provides enterprises with an IT Reliability™ platform to keep business applications working properly.

The company’s flagship product, SharePath, provides unprecedented visibility into how transactions perform across the enterprise’s infrastructure, from a bird’s-eye view down to deep transaction details, which helps to rapidly pinpoint and resolve performance problems. Using patent-pending transaction path detection technology, SharePath traces every discrete transaction from the click of an end user through each hop in the data center, while maintaining its context, 24×7, with negligible overhead and 100% accuracy, as it is purpose-built for production environments.

By recording and correlating every transaction activation across both physical and virtual components, IT gains full visibility and the transaction-contextual metrics required to ensure IT Reliability for both packaged and homegrown applications.

The rich data from SharePath is used by enterprises to rapidly pinpoint and solve problems and to gain unprecedented insights that can help to:

  • Reduce time to isolate and resolve performance issues, eliminating finger-pointing and the need for “war rooms” whenever a performance issue arises
  • Reduce the risks in rolling out new services, and reduce the length of these rollouts
  • Understand how configuration changes impact application performance and service levels
  • Optimize applications and their use of infrastructure resources to allow a better user experience, and enable infrastructure consolidation
  • Improve the capacity planning process

About Metron Athene

Athene, the most scalable product in its class, provides enterprise ITIL-aligned Capacity Management, automatic performance analysis and reporting for physical and virtual environments. Athene enables the capacity manager to quickly identify what systems to focus on first, where the potential capacity ‘pinch points’ will occur and what to do about them.

With the widest and most flexible range of automated data capture mechanisms, businesses around the globe use Athene to:

  • Better understand how the underlying infrastructure components are performing
  • Analyze performance trends to ensure IT infrastructure continues to meet the requirements of the business
  • Accurately monitor current service levels relative to the environment and predict how these may change based on real-life business scenarios
  • Manage infrastructure costs by modeling hardware and workload scenarios, preventing over-expenditure on hardware/software and assuring optimal levels of capacity
  • Diagnose the true cause of system performance problems
  • Reduce the skills and manpower required to actively manage performance and capacity
  • Provide a single pane of glass to view enterprise capacity and performance management
  • Reduce virtual and physical server sprawl

Correlsense and Metron
The primary goals for capacity planning are to plan infrastructure resources based on expected future demand; maintain quality of service, minimize ‘surprises’ (such as performance degradations and outages) and the costs associated with correcting them; and optimize resource utilization to enable consolidation and reduce costs.

SharePath has the ability to see how transactions perform across the infrastructure, and can therefore create an accurate, dynamic, auto-detected topology map. This real-time topology map provides the missing link between the business applications and transactions being executed in the data center and the IT infrastructure they rely upon. In addition, SharePath knows, for each transaction type (e.g., “login”, “send_money”, “buy_stock”), which tiers are utilized (Web, application, database, ESB, Web services, etc.), what workload it consumes on each tier, at what volumes, and on behalf of which department.

Athene is designed to capture data from the widest range of infrastructure components possible. Capacity and performance information is stored in a central database, analyzed, and provides the core output for the Capacity Management process. Athene monitors current performance, analyzes recent behavior, reviews past trends and predicts future service levels, with advice and exception reporting on alarms and alerts.

By combining the core strengths of the two products, SharePath can provide the necessary data to Athene, so that, for example, based on 30% expected growth in the Sales department, Athene can know exactly which servers and components will be affected, which transactions are activated, and what the volumes are.
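
A minimal sketch of that lookup, in Python with invented data (this is not the actual SharePath/Athene integration), might look like this:

    # Invented catalog: transaction type -> owning department, tiers touched,
    # and current hourly volume, as a transaction-tracing tool could report.
    tx_catalog = {
        "create_quote": {"dept": "Sales", "tiers": ["web1", "app1", "db1"], "volume": 9_000},
        "send_money": {"dept": "Payments", "tiers": ["web2", "app2", "db1"], "volume": 22_000},
        "update_lead": {"dept": "Sales", "tiers": ["web1", "app3", "db2"], "volume": 4_500},
    }
    growth = 1.30  # 30% expected growth in the Sales department

    affected = set()
    for tx, info in tx_catalog.items():
        if info["dept"] == "Sales":
            affected.update(info["tiers"])
            print(f"{tx}: {info['volume']} -> {int(info['volume'] * growth)} tx/hour")

    print("servers affected by Sales growth:", sorted(affected))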

Metron can then use this data to build more accurate “transaction-based capacity planning” workload models, which better fit the complex and dynamic architectures of today’s IT infrastructure. The outcome is a more accurate prediction of:

  • Which infrastructure components should be strengthened or provisioned (or deprovisioned)
  • How application consolidation impacts business, and which departments and transactions are affected

The benefits to the customer are:

  • Reduced risk of capacity planning mistakes/wrong assumptions for complex and dynamic architectures
  • Improved service delivery
  • Cost savings by optimizing resources with business demand and growth

Summary
The combination of Correlsense SharePath and Metron Athene provides a complete capacity management solution that allows organizations to understand the performance of both the core systems and the critical services. By combining this data with business forecast information, IT organizations can predict whether they have sufficient capacity to meet the needs of the business at both the component and service levels. Using these complementary products, enterprises can move to the next level of virtualization performance assurance by optimizing the performance of the virtual infrastructure while maintaining the required business service levels.

In summary, the combination of Correlsense SharePath and Metron Athene can provide valuable insight into the performance of enterprise data center environments that was never available before, along with a sound basis from which to build a comprehensive, customer-focused capacity management process.


© 2010 Metron
Metron, Metron-Athene and the Metron logo as well as Athene and other names of products referred to herein are trade marks or registered trade marks of Metron Technology Limited. Other products and company names mentioned herein may be trade marks of the respective owners. Any rights not expressly granted herein are reserved.

September 28, 2010

Metron Capacity Planning Webinar

Correlsense, a provider of IT reliability solutions, will hold a webinar with its partner, Metron, a performance management and capacity planning specialist. The webinar, “Transaction-based Capacity Planning for Greater IT Reliability™,” will discuss how transaction-level data delivered by Correlsense is used by Metron to improve the capacity planning process, reduce costs and better align IT operations with business objectives.

Attendees will learn the answers to critical questions facing IT and business executives, including:

  • How dynamic and complex IT environments bring challenges to capacity planning and how they can be addressed
  • How to combine business forecast information with infrastructure performance metrics to predict whether there is sufficient capacity at both the component and service levels
  • How performance metrics with transaction context can deliver unique results

The two companies recently announced a partnership to deliver advanced solutions for transaction-based capacity planning. By tracing transactions, Correlsense SharePath provides the contextual visibility that enables capacity planning for complex environments. Feeding this transaction-contextual data into Metron’s Athene modeling software makes it the basis for addressing all of the capacity questions faced by an organization.

About Correlsense
Correlsense SharePath provides a breakthrough in IT Reliability™ by enabling for the first time both a bird’s-eye and detailed view of how business transactions perform across the four dimensions of end-users, applications, infrastructure and business processes. While other service management and performance management applications focus on identifying problems at individual components, SharePath automatically detects and traces each entire transaction path, from a click in the browser through all its hops across data center tiers. With the ability to record and correlate individual transaction activations across both physical and virtual components, IT gains full visibility of the transaction metrics required to ensure IT Reliability™ for packaged and homegrown applications. The rich data from SharePath is used by enterprises to rapidly pinpoint and solve problems and to gain unprecedented insights for their IT Service Management initiatives such as ITIL. For more information please visit www.correlsense.com.

September 13, 2010

New Webinars on Transaction Management and Capacity Planning

Last month we announced a partnership with Metron to provide advanced solutions for transaction-based capacity planning. To showcase our combined offering, we’re holding two live Webinars in October.

We’ll tackle common problems such as:

  • What is the best approach for predicting the true impact of business growth on a specific department or product line?
  • Which application infrastructure can be consolidated without degrading service levels?
  • How will the relationship between the underlying components affect both the services and, in a wider context, the business that uses them?

By tracing transactions, Correlsense SharePath provides the contextual visibility that enables better and more accurate capacity planning for complex environments. Feeding this transaction-contextual data into Metron’s Athene modeling software makes it the basis for addressing all of the capacity questions faced by an organization. We hope you’ll join us!

August 31, 2010

Correlsense SharePath for IT Reliability (Part 3 of 3)


Video: 4B2oXbxSlVY

Lanir Shacham, Correlsense CTO, on using transaction data for capacity planning, performance management, cost allocation, change management, and auditing.
Duration: 3:16

May 25, 2009

The Wall

I ran my first marathon last month. I did a triathlon a few months before that, and I can undoubtedly state that a marathon is much, much harder. I trained for it for three months, and since I’ve been running for more than 15 years, I was aiming for a 3:45, which you need to be in decent shape to accomplish. And I was. I had a specific plan for the race: how fast to run, and when. My plan was to hold 5:25/km for the first half and then push to 5:15 in the second half, which would average out to a clean 5:20/km (8:30/mile) and bring me my well-earned 3:45. My training overall supported this game plan.

Race day came; at 6:45 am we were off, and I was running a little too fast. I held a 5:15/km pace from the start all the way to 32 km. I was feeling great, imagining I might end up with a 3:38 or so, which is what I really desired, if I just pushed myself some more over the last 10K. HOWEVER, when it comes to the marathon it’s not a question of desire and willpower, but a question of numbers: how much energy you have in you, and the pace at which that energy is vaporizing. You have to determine that pace from your training and be realistic about what you can or can’t accomplish.

Well, I had many marathon runners tell me this again and again, since burning yourself out is a well-known phenomenon called “the wall,” which could just as well have been called, “Trying to move your useless body while feeling a tremendous amount of pain, while every second feels like a minute and every minute feels like an hour and all you want is for this torture to end end end why doesn’t it end and why am I here and why is this happening to me????”

So I had to run through this 10K “wall,” which took me around 80 minutes, and I finally finished with a 3:58 (which is still OK). My whole body was one giant muscle cramp. I could hardly move, and it took me five days before I could climb stairs again. Pain. A lot of pain.

So what does this have to do with performance, you ask? Well, it turns out that capacity planning is exactly the same, at least if you want to stretch your resources rather than just spend an enormous amount of money on hardware you don’t really need. Capacity is all about measuring TPS and resource utilization (speed and energy) in a pre-production environment (training) and applying an accurate plan when going to production (race day). I’ve been called in again and again by customers whose systems, at crunch time, simply can’t hold the load. And I always asked myself: why? What’s so hard? Measure what you need and plan accordingly. But we all try to stretch it at one point or another, and it turns out “the wall” is everywhere…
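
In that spirit, here is a back-of-the-envelope headroom check in Python, with invented numbers, of the kind the analogy suggests: measure the resource cost per transaction in pre-production, then compute the maximum sustainable TPS before committing to a production plan:

    # Back-of-the-envelope headroom check; all numbers are invented.
    cpu_ms_per_tx = 35.0   # CPU time one transaction consumes (from a load test)
    cores = 8              # cores on the production box
    target_util = 0.70     # stay below 70% CPU to leave headroom

    # Each core supplies 1,000 CPU-ms per second of wall-clock time.
    max_tps = (cores * 1000 * target_util) / cpu_ms_per_tx
    print(f"max sustainable load: {max_tps:.0f} TPS")  # 160 TPS here

    expected_peak_tps = 190
    if expected_peak_tps > max_tps:
        print("expected peak exceeds capacity: tune or add hardware, "
              "or you will hit the wall on race day")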

Lanir