Pinpoint a problem in minutes with Operations Analytics

24 February 2016

A highly visible, customer-facing application suffers a severe performance degradation. As the outage wears on, the potential business impact grows. IT springs into crisis mode, and teams of experts are assembled, pulled away from their other tasks. But it can take hours—in this case, 36—before the root cause is identified.

This scenario comes from an actual outage we experienced at HP more than a year ago. A critical application that customers used to find up-to-date information about products they’d ordered went down. The HP IT Global Data Services team investigating the incident took 36 hours to determine the root cause, and more than two weeks to clear the backlogged transactions. Since then, the team has been using an operations analytics solution. If it had been in place when the application went down, the team could have found the source of the problem in less than 30 minutes.

That’s a significant improvement, which is why we believe applying Big Data analytics to Operations data should be a priority for the enterprise. If CIOs are going to transform their IT organisations to adapt to today’s new digital business priorities, they will need to embrace analytics solutions to pinpoint critical problems.


Monitoring alone isn’t enough

As IT environments become more complex, it becomes harder to find the root cause of problems. Applications are constantly changing. Today, apps can move across the infrastructure to the cloud, and they can scale up and down automatically. In addition, your infrastructure is constantly changing. And if you’re like most organisations, you’re dealing with multiple IT service providers.

When a problem occurs, it’s impossible to gain the timely insight you need from traditional methods alone. You may be collecting data from various monitoring tools, but even with multiple tools, you don’t have the visibility you need.

You still need traditional monitoring, of course. But monitoring alone is not enough. To pinpoint the root cause of an incident in minutes, you need to apply analytics to all the data you’re collecting from your various monitoring tools as well as to the so-called “dark” data of your system log files. Only then can you see correlations that tell you where the problem is.
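To make the idea of correlating monitoring data with log data concrete, here is a minimal sketch in Python. It is not how any particular Operations Analytics product works: the data, the component names, and the helper functions (`pearson`, `error_counts`, `rank_components`) are all invented for illustration. It counts log errors per minute for each component and ranks components by how closely their error counts track a response-time metric.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Hypothetical monitoring data: average response time (ms) per minute.
response_ms = {0: 120, 1: 125, 2: 118, 3: 640, 4: 910, 5: 870}

# Hypothetical parsed log lines: (minute, component, level).
log_lines = [
    (0, "web", "ERROR"),
    (1, "db", "INFO"),
    (3, "db", "ERROR"), (3, "db", "ERROR"),
    (4, "db", "ERROR"), (4, "db", "ERROR"), (4, "db", "ERROR"),
    (5, "db", "ERROR"), (5, "db", "ERROR"), (5, "web", "ERROR"),
]


def error_counts(lines, component):
    """Count ERROR lines per minute for one component."""
    return Counter(m for m, c, lvl in lines if c == component and lvl == "ERROR")


def rank_components(metric, lines):
    """Rank components by correlation between their error counts and the metric."""
    minutes = sorted(metric)
    values = [metric[m] for m in minutes]
    scores = {}
    for component in sorted({c for _, c, _ in lines}):
        counts = error_counts(lines, component)
        scores[component] = pearson(values, [counts.get(m, 0) for m in minutes])
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranking = rank_components(response_ms, log_lines)
for component, score in ranking:
    print(f"{component}: {score:.2f}")
```

A high score does not prove that a component caused the slowdown. It only tells the operator which logs to read first, which is the kind of narrowing the article describes.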


Three ways Operations Analytics saves time and money

Organisations see a number of benefits when they implement an Operations Analytics solution. Here are some of the most important:

  • Reduce Mean Time to Repair: When you use Big Data to correlate your Operations data, you’ll be able to see problems before they happen (the ideal case) or before they become a real issue. But if something has become a real issue, you’re able to pinpoint it much more quickly.
  • Use smaller teams to solve problems: In many enterprises, the response to an outage is to jump in with 20 people on the phone. This type of reaction is not only costly, but it also has far-reaching effects: You’re pulling people away from their real work so that they can troubleshoot. But with Operations Analytics you don’t have to mobilise a crack team of experts to solve every issue. You can often solve problems at what we call level one-and-a-half: the level directly behind the Operations Bridge.
  • Avoid future issues by improving your applications: The third benefit is that you can now provide some of the Operations Analytics data back to your application teams. When they go through a new cycle of their build, they have more knowledge of how the application behaves in the real world. Providing this type of feedback on app behaviour ultimately reduces issues because development teams have the information they need to improve their applications.


Keys to success with Operations Analytics

When we work with organisations on implementing an Operations Analytics solution, we look to help IT come up with what we call “Big Data maps” for their applications. Basically, we take everything that has to do with an application (your metrics data, your log files, your event data, your flow data, and so on) and instrument the solution so that you see all that data together. A Big Data map lets you see trends quickly. Over time, as trends develop, you’ll be able to predict problems instead of reacting to them after the fact.
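As a rough illustration of seeing all of an application’s data together, the sketch below merges records from several sources onto one timeline. The record layout, the application names, and the `build_map` function are assumptions made up for this example; the article does not describe how a real Big Data map is stored or built.

```python
from collections import defaultdict

# Hypothetical records from separate tools: (minute, application, source, detail).
metrics = [
    (2, "order-status", "metric", "response_ms=118"),
    (3, "order-status", "metric", "response_ms=640"),
]
logs = [
    (3, "order-status", "log", "db: connection pool exhausted"),
    (3, "billing", "log", "invoice batch finished"),
]
events = [
    (2, "order-status", "event", "configuration change deployed"),
]


def build_map(application, *sources):
    """Merge every record for one application into a single minute-by-minute timeline."""
    timeline = defaultdict(list)
    for source in sources:
        for minute, app, kind, detail in source:
            if app == application:
                timeline[minute].append((kind, detail))
    # Return a plain dict ordered by minute so it reads top to bottom.
    return dict(sorted(timeline.items()))


app_map = build_map("order-status", metrics, logs, events)
for minute, records in app_map.items():
    for kind, detail in records:
        print(f"minute {minute}: [{kind}] {detail}")
```

Reading the merged timeline, an operator sees the configuration change at minute 2 next to the slowdown and the log error at minute 3, instead of finding each in a separate tool.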

To be successful with an Operations Analytics solution, take these three considerations into account:

  • How do I instrument this solution so I get the insight I need?
  • How do I onboard my critical applications?
  • How do I make it easy for my developers or my level 1.5 operations team to create Big Data maps quickly?

Asking these questions can help you adapt an analytics solution to your organisation’s most pressing needs. The benefit is that you can then use the solution to improve monitoring, application development, troubleshooting, and more.