The ubiquity of big data is such that Gartner dropped it from its Hype Cycle for Emerging Technologies back in 2015. Across sectors, businesses are scrambling to make every function “data driven,” and there’s no shortage of firms lining up to help them. The big data analytics industry, dedicated to helping big businesses leverage the petabytes of information they now generate and store, is worth $122 billion and growing.
The basic premise of the industry’s offering is this: Hidden in that huge mass of enterprise data are latent patterns. If only you could interpret your data properly, like an explorer deciphering an ancient scroll, you’d be able to unearth these precious business secrets. Specialist analytic software tools are needed to crack the code. The big, diverse, disparate, messy data go into these tools, and “actionable insights” come out.
Here is a game you can play at home: Search online for a real-world story of how big data analytics produced a piece of “hidden” or “unexpected” intelligence, based upon which the business took action, with quantifiable commercial results (preferably expressed in one of the major world currencies). You might just detect a conspicuous absence of concrete case studies to validate this “data-insight-action-value” chain as a concept.
In the original version of that game, popular among jaded office workers in the mid-2000s, players would seek examples of bloggers who made so much cash from blogging that they quit their jobs to blog full time (at home, in a hammock, with a daiquiri). Veteran players eventually noticed that there is only one blogging topic lucrative enough to support such a lifestyle change — How To Make A Living From Your Blog So You Can Quit The 9-5.
Clicking through pages of “unlock the value of your big data!” advertorials, a cynic might suspect that the best (and perhaps only) method of deriving value from big data is to go into the business of telling people how to get value from their big data.
All that’s happened is that technological innovations in data handling capability (made by companies like Google to deal with the scale and complexity of Web 2.0) temporarily leapt ahead of our progress in learning how to apply them — progress we make through experimentation.
In the interim, firms have defaulted to leveraging big data in exactly the same way they previously used small data: for reporting and business intelligence. Having invested in purpose-built tools to analyze data at scale, they’ve been rewarded with cool interactive dashboards visualizing it. These are basically auto-generated charts, conspicuously similar to the manually created Excel and PowerPoint reports executives were staring at back in 2005, but far prettier and costlier. It’s easy to see why this approach hasn’t quite delivered on the big data promise.
Firstly, for a puny human brain to interpret large and complex data sets, the data sets must first be made “smaller” via aggregation, summarization, description and presentation, which rather defeats the purpose of collecting data at scale.
Secondly, there’s just a natural limit on how far having information about your business is going to help you win at it. An enterprise’s data is simply the digital impression left behind by real-world transactions. Typically, mining that internal data will validate basic hypotheses upon which the business is predicated (“we make profits in our luxury fashion stores when they’re located in affluent areas”). In the worst case, it can make you uncomfortable by totally undermining those core assumptions without suggesting a back-up plan (“we thought people bought ice cream on impulse when it’s hot and sunny outside; turns out we were wrong”).
Big businesses have absorbed Google-style tech, but are only just beginning to adopt Google-style thinking alongside it. Machine-learned translation algorithms, made possible by the availability of a massive corpus of textual training data and souped-up processing power, have no conception of French or Arabic grammar. Amazon’s recommendation algorithms reportedly generate 35 percent of sales without knowing why certain products are “frequently bought together.” It’s this very characteristic that makes them so powerful: if a machine can’t judge, it can’t make the errors of judgement to which humans are prone.
Algorithms now detect when drilling equipment in oil fields is about to fail based on thousands of sensor data points, enabling “predictive maintenance.” Imagine if, instead of applying machine learning to the problem, analysts had compiled these complex data sets into summary reports and tried to divine “insights” about why the equipment breaks so they could attempt to stop it from happening.
The beauty of predictive algorithms is that they don’t need to understand the cause and effect behind statistical relationships in order to work incredibly well in practice. For an enterprise to glean the benefits of prediction, it must first give up trying to deduce why things are a certain way, and start trusting the lines of code that tell it they are.
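To make that concrete, here is a minimal Python sketch of prediction without explanation. It is an illustration only, not anything an oil company is known to run: the two sensor channels (vibration and temperature), their values, and the function names are all invented, and the data is synthetic. The technique is a nearest-centroid classifier, about the simplest predictive model there is.

```python
import random

def make_readings(n, failing, rng):
    """Return n made-up (vibration, temperature) readings for one class."""
    vib_mean, temp_mean = (8.0, 95.0) if failing else (3.0, 70.0)
    return [(rng.gauss(vib_mean, 1.0), rng.gauss(temp_mean, 5.0)) for _ in range(n)]

def centroid(rows):
    """Average each sensor channel across a list of readings."""
    return tuple(sum(channel) / len(rows) for channel in zip(*rows))

def train(healthy, failing):
    """'Training' is just remembering the average reading of each class."""
    return {"healthy": centroid(healthy), "failing": centroid(failing)}

def predict(model, reading):
    """Label a reading by whichever class average it sits closest to."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: squared_distance(model[label], reading))

rng = random.Random(0)  # fixed seed so the synthetic data is reproducible
model = train(make_readings(200, False, rng), make_readings(200, True, rng))

# Score the model on fresh readings it has never seen.
test_set = [(r, "healthy") for r in make_readings(100, False, rng)] + \
           [(r, "failing") for r in make_readings(100, True, rng)]
accuracy = sum(predict(model, r) == label for r, label in test_set) / len(test_set)
print(f"held-out accuracy: {accuracy:.2f}")
```

Nothing in the model says why hot, shaky equipment tends to fail; it only records that such readings have preceded failures before. A real system would draw on far more channels, scale them so no single sensor dominates the distance, and be validated against actual breakdowns.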
This requires a cultural shift, and all new technologies encounter initial mistrust. But the time is right. It’s 2017, and your understanding is unnecessary. The artificial intelligence has rendered you obsolete. Now rejoice, because we are about to achieve some incredible things.