An Important Question to Ask After a Very Public Software Disaster


BATS Software Glitch Causes Perfect Storm – March 2012
Nasdaq Outlines $40M Fund for Facebook IPO Glitches – June 2012
Knight Capital Blames Software for Computer Trading Glitch – August 2012
American Airlines Computer Glitch Causes Delays – April 2013
CBOE Identifies Software Glitch that Halted Trading – April 2013

What do all of these headlines have in common? Wait for it: software quality problems! These are just the glitches significant enough to make the news. Thousands more happen every day, ranging from something as simple as a smartphone app crashing to something as serious as the situations listed above.

In each case, there is an executive at the top who has to answer questions about how “it” happened. I have been involved in many of these situations over my career as part of the solution, and the ensuing conversations always focus on what technology was missing from these environments that would have provided a glimpse of the problem before the event occurred. In our experience at ProServices, the answer is rarely a gap in technology. Rather, it is a big data problem that these organizations have but failed to recognize.

As the old saying goes, “Hindsight is always 20/20.” When such events occur and my team of experts shows up on site (picture us as the high-tech guys wearing HAZMAT suits, prepared to dive through source code), we perform a forensic analysis of the crime scene or, in this case, the software. Interestingly, what we find is that these environments already held plenty of data that could have led someone to foresee the possibility of such an event. However, the volume of data seemed overwhelming, and under schedule pressure to field these software releases, it was ignored because the experts lacked time for proper analysis.

What’s the important question we empower our clients to ask themselves after such an event?

“What is our data management strategy for collecting data across the various tools we already have, correlating it against risks important for us to proactively understand, and socializing this information from executives to engineers to provide transparency into our accrued technical debt so that this doesn’t happen again?”

Take a breath. I know it’s a long question, but as a guest blogger for DCG, I hope to explore its many answers with you over a series of posts and have some fun along the way. Many of these glitches, both large and small, are preventable, and I look forward to showing you how!

In the meantime, learn more about ProServices’ partnership with DCG.

Rob Cross
ProServices, Vice President

Written by Rob Cross at 08:31


