Performance monitoring has traditionally meant setting thresholds and then sending alerts when those thresholds are exceeded. Analytics is more sophisticated: it borrows techniques from data mining to establish norms in application performance and then points out when an application is operating outside those norms.
If these two ideas sound the same, they are not. Analytics is said to be “machine-driven,” i.e., based on observable events; thresholds are set by people. Analytics applies mathematical and statistical techniques, such as multivariate anomaly detection, to show when observed data does not match historical data. The same technique is used in plant maintenance systems to predict when machinery needs an overhaul. Credit card companies and banks use it to flag fraudulent activity. Credit rating agencies use it to assign borrowers a credit score.
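To make the multivariate idea concrete, here is a minimal sketch in plain Python. It scores how far a new observation sits from a set of historical observations using the Mahalanobis distance for two metrics (the data, metric names, and the two-variable simplification are all illustrative assumptions, not from any particular product):

```python
import statistics

# Hypothetical historical observations: (response_time_ms, requests_per_sec).
history = [(120, 50), (130, 55), (125, 52), (118, 48), (135, 58),
           (122, 51), (128, 54), (119, 49), (131, 56), (126, 53)]

def mahalanobis_2d(point, data):
    """Distance of `point` from the historical norm, accounting for how
    the two metrics vary together (two-variable special case)."""
    xs = [p[0] for p in data]
    ys = [p[1] for p in data]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    n = len(data)
    # Sample covariance matrix entries
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mx, point[1] - my
    # Apply the inverse of the 2x2 covariance matrix to the deviation vector
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 ** 0.5

normal = mahalanobis_2d((127, 53), history)     # fits the historical pattern
anomaly = mahalanobis_2d((250, 52), history)    # slow response at normal load
print(normal < anomaly)                         # the outlier scores far higher
```

The point of the multivariate distance is that (250, 52) is anomalous not because either number is impossible on its own, but because the combination breaks the historical relationship between load and response time, which is exactly what a fixed per-metric threshold would miss.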
Big Data is the logical place to keep the data needed to drive these calculations. Big Data is typically unstructured data stored in a NoSQL database or in the Apache Hadoop distributed file system. You can dump logs there and then use analytics to study them.
A simple example of analytics is linear regression. Linear regression takes observed data (e.g., transaction execution time versus transaction volume) and plots the points on a chart. From those points it calculates a formula that, in this case, could estimate transaction execution time for a given transaction volume. Readers who want to experiment can use the statistical tools built into Microsoft Excel to create their own model.
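For readers who prefer code to spreadsheets, the same fit can be done in a few lines of Python. This is a minimal sketch with made-up numbers: ordinary least squares over hypothetical (volume, execution time) pairs, yielding a formula that predicts execution time from volume:

```python
import statistics

# Hypothetical observations: (transaction volume, execution time in ms)
observations = [(100, 210), (200, 260), (300, 330), (400, 370),
                (500, 440), (600, 480), (700, 550)]

xs = [v for v, _ in observations]
ys = [t for _, t in observations]

# Ordinary least squares: slope = cov(x, y) / var(x)
mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in observations)
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

def estimate_time(volume):
    """Estimate execution time (ms) at a given transaction volume."""
    return intercept + slope * volume

print(f"time ≈ {intercept:.1f} + {slope:.3f} * volume")
print(f"estimated time at volume 450: {estimate_time(450):.0f} ms")
```

This is the whole trick: once the machine has fit the line, any observed execution time far above the line's prediction for the current volume is a candidate anomaly, with no human-chosen threshold involved.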
Predicting future behavior from historical data makes sense for machines. Trying to do the same thing for the stock market is called “technical analysis.” People, of course, are not machines, which leads some investors, like Warren Buffett, to say that technical analysis is a complete waste of time.
Analytics can be deployed to give a view of how the infrastructure, applications, transactions, and shared services are working together. You can see how this idea of using statistics and letting the machine set its own performance thresholds works by looking at an analytics system. “Trends” are predictions of future behavior based on the vast collection of past transactions in the Big Data store. Tools that let you compare performance right now with performance an hour ago or yesterday can pinpoint issues. An analytics tool can also study how an application performs under various levels of stress testing and point out cross-tier bottlenecks.
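The idea of the machine setting its own thresholds can be sketched in a few lines. The class below (a hypothetical illustration, not any vendor's implementation) keeps a rolling window of recent response times and flags a new sample only when it sits well above the mean of that window, so the alert level adapts to the data instead of being set by hand:

```python
import statistics
from collections import deque

class AdaptiveBaseline:
    """Keep a rolling window of recent samples and flag any value more
    than `k` standard deviations above the window mean. The threshold
    is derived from the data itself, not configured by a person."""

    def __init__(self, window=60, k=3.0):
        self.samples = deque(maxlen=window)
        self.k = k

    def observe(self, value):
        flagged = False
        if len(self.samples) >= 10:   # need some history before judging
            mean = statistics.mean(self.samples)
            stdev = statistics.pstdev(self.samples)
            flagged = value > mean + self.k * stdev
        self.samples.append(value)
        return flagged

baseline = AdaptiveBaseline()
alerts = [baseline.observe(t) for t in [100, 102, 98, 101, 99, 103,
                                        97, 100, 102, 99, 101, 350]]
print(alerts[-1])  # the 350 ms spike is flagged against the learned norm
```

A real system would keep separate baselines per transaction type and time of day (yesterday's norm, last hour's norm), which is what makes the "compare now with an hour ago" view possible, but the adaptive principle is the same.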
This is a basic introduction to application analytics. We will dig further into the details in future posts.