Digging Deeper into Analytics

SharePath captures transaction data from many applications, programming languages, and network components. It auto-aggregates all dimensions of many millions of individual transaction paths in real time, creating over 180 different analytic views of the data. Customers use the data for tasks such as baselining, trending, modeling, change-impact analysis, and root-cause analysis.

As a result, analytics is an important topic for our customers. Some want to understand the performance profiles of their applications; others want to measure their performance against an SLA, forecast capacity, refine their architecture, roll out a new application, or simply fix a problem.

We previously discussed how analytics can be used to determine whether an application is performing outside norms.  We wrote that analytics “borrows from the techniques of data-mining to establish norms in application performance and then points out when an application is working outside those norms.”

We also said that analytics can be better than other approaches, which set thresholds based on: (a) experience, (b) empirical evidence (similar to experience), (c) a hunch, or (d) some kind of rule-of-thumb.  In other words: guessing.

Big Data analytics software vendors have programmed statistical models into their software without giving much explanation about how it works.  They say things like “mining the Hadoop Big Data database lets you work smarter,” whatever that means.

It seems reasonable that someone should have a basic understanding of what is going on when they make business decisions based upon what their data mining software is telling them.

Here we give you a brief introduction to one statistic used in such software: the t-test.

Statistics can be grouped into three categories:

Descriptive—takes data and gives you some explanation of what you have.  For example, it gives you the average (the mean) and the typical variation from that average, which is called the standard deviation (its square is the variance).
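As a quick sketch of descriptive statistics, here is how you might compute the mean, standard deviation, and variance of a set of response times using Python's standard library (the sample values are invented for illustration):

```python
# Descriptive statistics for a small sample of response times.
# The values below are made-up latencies in milliseconds.
import statistics

response_times_ms = [120, 135, 128, 142, 131, 125]

mean = statistics.mean(response_times_ms)          # the average
stdev = statistics.stdev(response_times_ms)        # sample standard deviation
variance = statistics.variance(response_times_ms)  # the square of the stdev

print(f"mean={mean:.1f} ms, stdev={stdev:.1f} ms, variance={variance:.1f}")
```

Note that the variance is simply the standard deviation squared; they describe the same spread on different scales.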

Predictive—stock-traders and merchandisers use this to try to predict future events based on past ones.  A simple model of this is linear regression, which means given some value of x (e.g., last year's stock prices) what might be the value of y (e.g., tomorrow's stock price).
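A minimal sketch of what linear regression does: fit a line y = ax + b to past observations with least squares, then use it to predict y for a new x. The data points here are invented purely for illustration:

```python
# Least-squares fit of a line y = a*x + b, then a prediction for a new x.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by variance of x.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4, 5]       # e.g., week number (illustrative)
ys = [10, 12, 15, 17, 20]  # e.g., observed price (illustrative)
a, b = fit_line(xs, ys)

print(f"predicted y at x=6: {a * 6 + b:.1f}")
```

Real predictive models are far more elaborate, but they rest on the same idea: extrapolate from a fitted relationship between past x and y values.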

Inferential—this is where we focus our efforts in performance monitoring.  These statistics let you infer or draw conclusions from what you are looking at.  One example of this is to tell you when observed values are outside norms.  That is a key goal of performance monitoring.

Inferential statistics let you overcome the signal-to-noise problem.  That’s easy enough to explain: it just means sorting out when an observed value (i.e., the signal) is statistically different from all the other values (i.e., the noise).

One of the easiest ways to do that is to calculate the t-test statistic.  One common use of the t-test statistic is to see when a student’s test scores are far enough outside the norms to cause concern, and whether it is time to call in the parents.

Microsoft Excel calculates the t-test using the function =TTEST.  You give it two arrays of values (in our case, averages rather than the actual raw observations), along with the number of tails and the test type, and the software returns the p-value: the probability that the difference between the two arrays arose by chance.  (It is this p-value, not the t statistic itself, that you compare against a significance threshold, commonly called alpha.)  Statisticians’ rule of thumb is that a p-value < 0.05 is statistically significant.  (The careful reader would say that we are guessing again.  That is a subject you can argue with social scientists, who have adopted that rule of thumb.) (You can watch a YouTube video on how to use the TTEST function in Excel here.)

What does this mean for performance monitoring?  Gather two sets of data: (1) one from a day when the customer is reporting latency in end-to-end performance and (2) a control group, which is a normal day or your stated service level.  Gather data every 10 minutes over an hour, so that each array you plug into Excel holds the six 10-minute averages for that day.  The t-test will then tell you, via the p-value, how likely it is that the two sets differ only by chance.
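The same comparison can be sketched outside of Excel. The pure-Python example below computes the two-sample (Student's) t statistic for two arrays of six 10-minute averages; the latency figures are invented for illustration. Excel's =TTEST goes one step further and converts this statistic into the p-value using the t distribution:

```python
# A sketch of the two-sample t statistic for the workflow described above.
# The latency numbers are invented 10-minute averages in milliseconds.
import statistics

def t_statistic(a, b):
    """Two-sample Student's t statistic, assuming equal variances (pooled)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / \
           (pooled * (1 / na + 1 / nb)) ** 0.5

problem_day = [310, 295, 340, 325, 360, 330]  # customer-reported slow day
control_day = [210, 205, 220, 215, 225, 212]  # a normal day

print(f"t = {t_statistic(problem_day, control_day):.2f}")
```

A large t statistic (as in this made-up example) corresponds to a small p-value, meaning the two days very likely differ for real and not just by chance.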

If the p-value is greater than or equal to 0.05, then you can explain to your customer that the system is operating within norms.  If he or she asks how that could be, explain that the observed values are not significantly different from the control and let them puzzle over that.

If you want to read more deeply about the t-test statistic, you could start here. Alternatively, you might purchase SharePath and let it analyze your data in real time.