Forgotten scaling: the role of APM in customer support

Application Monitor generally emphasizes the necessity of application performance management (APM): make good use of APM, or lose vital customers. Let’s consider the alternatives and consequences in a more sophisticated way, where the choices shade beyond “black” and “white”.

Suppose your organization is like one I currently observe closely: Web-based applications deliver mission-critical services. Response time perceived at customers’ browsers varies on regular daily and weekly cycles, with occasional flurries of congestion or slowdown. End users are mostly satisfied with performance, and focused just on going about their business. What will happen as the business grows, perhaps dramatically, over the next couple of years?

This is the “scalability” question. Much of the software and related infrastructure is already in place; programmers often discuss and plan for scaling, and the code base appears to be in adequate shape to support at least a tenfold expansion. While more hardware will be necessary, the system appears to be designed so that new hosts can be provisioned and deployed in a manageable and cost-effective way.

There’s one human element in service delivery, though, whose scaling hasn’t been solved: customer support. The company’s current model has expert support staff covering all 168 hours of every week, fielding inquiries through telephone, chat, and e-mail. As business grows, there’s every reason to expect that the number of support calls will grow at least proportionally.

This is a problem.

Or if not precisely a problem, at least a considerable challenge. The company will need to be hiring and training new support staff nearly continuously, if it’s to prosper. That’s inherently difficult, and particularly so during an “up” phase in the national business cycle, when skilled workers become increasingly hard to locate and secure.

The alternative has its own challenges. In principle, the company can upgrade its on-line help facilities and IVR (interactive voice response), streamline diagnostics, and enhance the general quality of its application to make the most of its support technicians’ time and gradually reduce the incidence of upsets that lead to support calls. In other words, each support tech will cover a wider and wider span of potential callers. This makes the support center “brittle”: when something systemic goes wrong, an overwhelming avalanche of calls arrives.

The only solution we know for this combines a more robust application with disciplined, proactive APM. The company is learning that it’s not enough to have logs that allow faults to be diagnosed after they’ve occurred; to grow the business as planned, it needs control panels that flag root causes before customers realize they have problems. While that sounds a bit like magic, several APM solutions deliver exactly that kind of insight. Early experiments hint that at least a couple of these are practical for the company I’m describing.
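To make the contrast between after-the-fact logs and proactive flagging concrete, here is a minimal sketch in Python. It is not how any particular APM product works. The class name, window size, and thresholds are all hypothetical, chosen only for illustration. It raises a flag when a measured response time climbs well above its own recent baseline, which is the kind of early signal that can reach staff before the support calls do.

```python
from collections import deque
from statistics import mean, stdev


class ResponseTimeWatch:
    """Hypothetical monitor: flags response times far above a rolling baseline."""

    def __init__(self, window=60, threshold_sigmas=3.0,
                 min_samples=10, min_spread=0.05):
        # Keep only the most recent `window` normal measurements (seconds).
        self.samples = deque(maxlen=window)
        self.threshold_sigmas = threshold_sigmas
        self.min_samples = min_samples
        # Floor on the spread so a very steady baseline does not
        # turn tiny fluctuations into alarms (assumed value).
        self.min_spread = min_spread

    def observe(self, seconds):
        """Record one measurement; return True if it looks anomalous."""
        anomalous = False
        if len(self.samples) >= self.min_samples:
            baseline = mean(self.samples)
            spread = max(stdev(self.samples), self.min_spread)
            anomalous = seconds > baseline + self.threshold_sigmas * spread
        if not anomalous:
            # Only normal readings feed the baseline, so a slowdown
            # in progress does not quietly become the new "normal".
            self.samples.append(seconds)
        return anomalous
```

A real deployment would also have to account for the regular daily and weekly cycles mentioned earlier, for instance by keeping a separate baseline per hour of the week. This sketch ignores that.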

In abstract terms, APM has benefits beyond the basic requirement of maintaining application availability. In this case, it makes the human resources in customer support less of a limiting factor, and allows the company to grow in a healthier and more balanced fashion.

As we’ll see in follow-ups later this winter, APM pays off in customer support in other ways as well.