IT Performance Metrics and Airplanes

I am writing this post on an airplane, traveling home from a great visit with a customer of ours. Well, more like a partner. It was the kind of meeting that makes me really love my job: working with smart people who run huge, complex systems in constantly-changing environments.

These meetings involve the usual stuff – syncing about the latest tech issues, planning the next steps, discussing training, challenges, implementation processes, unforeseen needs and service feedback. I try to learn what they are doing with the product, look for test cases and probe into the product as well as the service.

One specific thing I always try to find out is how they run their performance process. Who is responsible for it? How is it implemented, both in test and in production?

I know that performance work is fluid and its ownership changes often. There isn’t always a dedicated team to run it. Sometimes it is IT ops (mainly for production), sometimes the app owners or developers, and sometimes performance engineers. These teams can operate either per business unit or for the company as a whole. Obviously, the root cause of any degradation needs to be found in both testing and production, and finding it should be a cross-IT effort.

Since performance involves so many components throughout IT, it’s hard to find the right people who can maximize it. A performance expert needs to be able to look horizontally across all the components, and to investigate and comment on each of them professionally. Each component (OS, DB, network, etc.) is a field in its own right, and it can take years to master so many fields.

Generally speaking, performance simply does not have a standard organizational structure.

I remember that when I ran an operations team, we had a very long argument over whether the DBAs should be centralized or assigned per application. Eventually, we decided to have one centralized group (system DBAs) and several application-specific ones (application DBAs). This kind of structure, along with Data Warehouse DBAs, has since become a standard in the IT industry.

It will be interesting to see how this works out for performance.