The ideas behind Big Data were put to the test on BBC Radio 4’s Start the Week this week, and were found wanting.
Kenneth Cukier was on the Big Data side of the debate and Tiffany Jenkins sounded a loud note of caution on what she described as ‘data determinism’. She had a point, not least because Cukier’s examples were simply not persuasive.
Cukier argued (and argues at greater length in his book) that we have to give up on our human need to understand and let the data speak. In other words, we should give up on causality and rely on correlation, because Big Data offers better correlations.
So, for example, Cukier talked about the relationship between college basketball and flu outbreaks as revealed by Google. The trouble is that we’ve been there before – the correlation between flu and basketball is no safer than the one between storks and babies simply because the numbers are bigger.
In other words, letting the data speak – the idea that data somehow increases our pool of knowledge without the need to understand causality – is a recipe for disaster.
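To make the storks-and-babies point concrete, here is a minimal sketch in Python. It uses nothing but random numbers (not Google’s data or anything from Cukier’s book), and the function names are made up for illustration. It takes one random "target" series and looks for the best match among a pile of equally random candidate series. Nothing here causes anything else, yet the more candidates we trawl through, the stronger the best correlation we find.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_spurious_correlation(n_series, n_points, seed=0):
    """Strongest |correlation| between one random series and n_series others.

    Every series is independent noise, so any correlation found is chance.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_series):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson(target, candidate)))
    return best


# Same 20 data points each time; only the number of series searched changes.
few = best_spurious_correlation(n_series=10, n_points=20)
many = best_spurious_correlation(n_series=10_000, n_points=20)
print(f"best match among 10 random series:     {few:.2f}")
print(f"best match among 10,000 random series: {many:.2f}")
```

The second figure comes out much closer to 1 than chance intuition would suggest, purely because there were more series to search. A bigger haystack yields more impressive-looking needles, not more trustworthy ones.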
We know the hype … on the Radio 4 programme, it’s that we’re generating the ancient Library of Alexandria every couple of minutes (OK, it’s not quite that, but I bet it will be before long). Or there is the claim in Cukier’s book title that Big Data is a “Revolution That Will Transform How We Live, Work and Think”.
I don’t question the basics. Data is growing in volume and volume growth is speeding up. But that doesn’t necessarily mean that we’ve got more information or that the body of knowledge is increasing.
What I do know is that:
1. We can answer questions that we couldn’t before because we can handle the data associated with them. For example, we can now easily answer the question “which doctors are prescribing which drugs?” because we can manage the prescribing database in its entirety.
2. There is huge potential in the analysis of social media data – but it’s a resource looking for a question. It’s not something that a machine can stare at and deliver insight from.
3. Concentrating on the data that’s ‘BIG’ is blinding us to the data that, although not perhaps as ‘big’, is waiting to be more carefully understood.
In the corporate world, the fascination with the potential of social media to offer customer insight or the potential of loyalty cards to do the same is blinding us to the potential of working with (for example) supplier data.
The data might not be as ‘big’, but if we got to grips with it, we might perhaps find ourselves eating beef and not horse.