The Bogus in Big Data

August 13, 2015

By William Looney

Pharmaceutical Executive, 08-01-2015, Volume 35, Issue 8

BIG DATA IS BIG: it’s the one trend in healthcare that demands to be described in superlatives. To fully grasp the scope, we must venture to the very end of the alphabet for the right word, zettabyte, that quantifies the vast amounts of random data being generated every day through human activity on the Internet.

Sometime this year, the world will have made the transition to the era of the zettabyte, with Internet traffic reaching a volume written with 21 zeros, equivalent to the storage capacity of 250 billion DVDs. To use another comparison, we are producing bytes of computing activity at sufficient scale to give all of the earth’s seven billion people access to 200 newspapers per day. And the pace is headlong: some 90% of this data has been generated within just the last two years.

On July 28, the New York Academy of Sciences held an expert symposium on the implications of the big data revolution for drug development. Underlying the discussion was an awareness of how disruptive the data revolution is to traditional ways of bringing medicines to market. While there is potential for cost efficiencies and risk mitigation in areas ranging from precision medicine in fighting cancer to the innovative repurposing of old drugs, there is a larger issue at stake: how to turn data quantity into data quality. As one speaker put it, “the big data revolution now gives us access to a billion health records, yet all of this data is flawed in some way. That said, is it acceptable to draw on this vast and diverse record pool in developing useful inferences for research? The answer requires we confront our propensity for bias: if we do find something interesting in a survey of 30 million patients, how can it be wrong?”

The consensus was that it is not bigness alone that hampers reliance on data to identify efficient health solutions. Instead, it’s the complexities within the data that often lead researchers astray. These include factors like reliance on different hardware and software infrastructure; enrollee nomenclature gaps like double-counting patients or the inability to track the full patient journey through an episode of care; the impact of concomitant medicine use (the co-Rx effect); and risk-adjustment problems, illustrated by the conflict between the impulse toward uniformity in a centralized database and individual privacy mandates like HIPAA. In essence, making the transition from big data to a study model designed to perform a specific task, one consistent with a hypothesis that yields the true knowledge necessary to drive action, is proving downright messy.

Solving this challenge is critical to achieving the promise of big data. If observation is indeed the starting point of biological discovery, as Charles Darwin famously said, then what must be done to augment the tools that technology now gives us and move those powers of observation to the next stage? This was the question that Pharm Exec joined with Quintiles, several big pharma companies, and two academic partners to grapple with in our Roundtable cover feature this month.

Our focus was on reviewing the internal institutional capabilities of biopharmaceutical companies in leveraging the data surge to improve their value proposition to payers, who often rely on the same data to render judgment on patient access to new drugs. What we discovered is the role the in-house pharmacoepidemiology practice can play in finding better ways to classify and make sense of all the background noise from big data. Derived from two words in classical Greek, epidemiology is the study of people’s health and has traditionally focused on evaluating risk factors related to the incidence of disease in a broad population setting. It takes research from the relatively restricted plane of the RCT and elevates it to the population level. Insights from study at the population level allow for the identification of factors to inform treatment decisions at the individual patient level, completing the circle.

This makes the function ideal for capitalizing on big data’s potential in building broad observational studies to drive understanding of how medicines actually work, in real-world settings, where patient satisfaction and overall health outcomes count. The group’s advice on how and where pharmacoepidemiology can sharpen its impact is useful reading.

Finally, our Roundtable convened just 10 days after the death of the father of the evidence-based medicine (EBM) movement, Dr. David Sackett. It’s worth noting that a man associated with aggregated data-based pathways to drive treatment never intended to take the physician and patient out of the picture. In his seminal 1996 British Medical Journal article on EBM, Sackett avowed that “external evidence can inform but never replace individual clinical expertise.” Now that the anonymized zettabyte is becoming a “numbers don’t lie” measure of system performance, his advice remains pertinent: to improve health, the best algorithm is found in the face of every single patient.

William Looney is Editor-in-Chief of Pharm Exec. He can be reached at wlooney@advanstar.com. Follow Bill on Twitter: @BillPharmExec
