Big Data in the Pharma R&D Landscape

July 31, 2014
Guest Blogger

Pharmaceutical Executive

Big Data and the mindset that comes with it will have a fundamental impact on how pharmaceutical research will be conducted in the future, writes Peter Tormay.


Innovation is the cornerstone of the pharmaceutical industry. Unfortunately, the industry as a whole is not very good at it. For the last two decades the approval of new molecular entities (NMEs) has been largely flat (with a blip in 1996). At the same time, the costs associated with drug development are constantly increasing. This is often referred to as the innovation gap. The situation is exacerbated by two further pressures: patients are demanding medicines better suited to their individual diseases (personalized medicine), and payers are requiring better real-world evidence that the drugs coming to market provide added value over the current standard of patient care (comparative effectiveness research).

Unfortunately, the industry has not yet come to grips with these challenges. In the past it has mainly looked at how to reduce costs by improving operational efficiency rather than at how to improve innovation itself. The result has been high attrition rates and ever-increasing costs per drug.

Only recently have initiatives such as cooperative and pre-competitive collaboration, open-source innovation and strategic partnering become more common. The industry is also starting to focus on the main asset underpinning innovation: data and information.

Enter Big Data from the left. While the term Big Data implies large datasets, there are several dimensions to Big Data, often referred to as the three Vs: volume, velocity and variety.

The most interesting one, particularly from a pharmaceutical perspective, is variety. Data comes in all shapes and sizes. In addition to structured, well-defined data there is an increasing amount of unstructured data. Unstructured data can be anything from sensor data captured through a mobile phone, to patient blogs and tweets on the internet, to semi-structured questionnaires and patient-reported outcome reports in clinical research or healthcare. Variety is also the most challenging dimension. Not only do the data types of one data source have to be mapped to the correct data types of another, but the meaning of the actual information also needs to be matched. Unfortunately, there are currently several competing standards in use across the healthcare system and the life sciences industry.
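To make the mapping problem concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two sources (`clinic_a`, `registry_b`), their field names, the diagnosis labels and the shared concept names are invented and do not correspond to any real standard. It shows the two steps described above: converting data types and units, and matching the meaning of source-specific labels.

```python
# All sources, field names, labels and values are invented for illustration.

# Source-specific diagnosis labels matched to one shared (hypothetical) concept.
DIAGNOSIS_CONCEPTS = {
    "t2dm": "type_2_diabetes",
    "type ii diabetes": "type_2_diabetes",
    "htn": "hypertension",
}


def pounds_to_kg(value):
    """Convert a weight in pounds (string or number) to kilograms."""
    return round(float(value) * 0.45359237, 1)


def to_concept(label):
    """Match a source-specific diagnosis label to the shared concept name."""
    return DIAGNOSIS_CONCEPTS[label.strip().lower()]


# Per-source mapping: source field -> (shared field, converter).
SOURCE_MAPPINGS = {
    "clinic_a": {
        "pt_id": ("patient_id", str),
        "weight_lb": ("weight_kg", pounds_to_kg),
        "dx": ("diagnosis", to_concept),
    },
    "registry_b": {
        "patient": ("patient_id", str),
        "weight_kg": ("weight_kg", float),
        "condition": ("diagnosis", to_concept),
    },
}


def harmonise(source, record):
    """Rename fields and convert values so records share one schema."""
    mapping = SOURCE_MAPPINGS[source]
    return {
        target: convert(record[field])
        for field, (target, convert) in mapping.items()
    }


record_a = harmonise(
    "clinic_a", {"pt_id": 1001, "weight_lb": "176", "dx": "T2DM"}
)
record_b = harmonise(
    "registry_b",
    {"patient": "1001", "weight_kg": 79.8, "condition": "Type II diabetes"},
)
```

After harmonisation both records use the same field names, the same unit for weight and the same concept name for the diagnosis, so they can be compared or merged. In practice the hand-written lookup table would be replaced by whichever terminology standard the organisation has chosen.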

Velocity refers to the speed with which new data is created and added to already existing data. It also incorporates the notion of change – how does data change over time.

With respect to data in the pharmaceutical industry, a further dimension should be added to these: variability. Clinical data is highly variable; after all, we are talking about individuals. Because of this high variability, the first dimension, volume, is actually a good thing, as it allows conclusions to be drawn from datasets with a high degree of intrinsic variability.

It is easy to see the value Big Data can offer in relation to marketing and sales. The focus here clearly lies with the velocity dimension. Companies need to be able to evaluate drug performance as well as changes in sales patterns and patient sentiment quickly so that they can act on them in a timely manner.

The situation in drug discovery and development is somewhat different. Here the focus is on the volume dimension. With the increasing implementation of electronic health records, a vast amount of longitudinal patient data is suddenly becoming available. In addition, companies can mine their own legacy data as well as scientific and patient-derived information available in the public domain. The volume of genome data is also growing at an astronomical rate.

An important aspect is that much of this information can be integrated and brought together at the patient level. Integrating a huge variety of data sources and types into one holistic picture of the patient will help in the quest to understand disease mechanisms and to better define patient populations and medical need.

Obviously, it is not about the data per se but about integrating all these different types of data in order to identify meaningful trends, patterns and correlations for better insight and decision making. Before a company starts to look at Big Data it needs to understand what it needs that data for: a clear goal must be defined.

Beyond Big Data – Enterprise Knowledge Management
Enter knowledge management from the right. In order to be effective, any Big Data initiative needs to be part of a wider knowledge management strategy. While the insights gained through the evaluation of Big Data can help with knowledge creation and knowledge utilisation, it is equally important to make sure these insights are shared within the organisation. To this end one needs to distinguish between explicit knowledge, which can be easily codified and transmitted, and tacit knowledge, which has a personal quality and is thus difficult to codify and disseminate. This also means that there are two sides to any knowledge management strategy. The first is aimed at data integration, giving everybody access to a growing body of reusable codified knowledge. The second is aimed at enabling a dialog between experts and bringing their expertise together. The first supports clinical development, whereas the second is a prerequisite for innovation.

Both strategies require the implementation of appropriate IT systems. Key in both cases is that these systems are aimed at the relevant knowledge domain experts rather than the IT department. Gaining insight is all about relevance and context, yet most of the data is not amenable to being stored in relational databases the way data used to be.

Therefore, these systems need to provide a unified knowledge base and focus not only on the data itself but, more importantly, on the relationships within the data. To achieve this, they need to employ semantic technologies and restructure the information into meaningful “mind maps” of messages controlled by a multidisciplinary ontology.
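As a rough illustration of what "focusing on the relationships" can mean, the following Python sketch stores facts as subject-relation-object triples and uses a tiny "is_a" hierarchy as a stand-in for an ontology. The identifiers, relation names and facts are all hypothetical, and a real system would rely on a dedicated semantic store and a curated ontology rather than an in-memory list.

```python
from collections import defaultdict

# Hypothetical facts expressed as (subject, relation, object) triples.
TRIPLES = [
    ("patient:1001", "has_diagnosis", "type_2_diabetes"),
    ("patient:1001", "takes", "metformin"),
    ("patient:1002", "has_diagnosis", "type_2_diabetes"),
    ("patient:1003", "has_diagnosis", "hypertension"),
    ("metformin", "treats", "type_2_diabetes"),
    ("type_2_diabetes", "is_a", "metabolic_disease"),
    ("hypertension", "is_a", "cardiovascular_disease"),
]


def build_index(triples):
    """Index objects by (subject, relation) for quick relationship lookups."""
    index = defaultdict(set)
    for subject, relation, obj in triples:
        index[(subject, relation)].add(obj)
    return index


def broader_concepts(index, concept):
    """Follow 'is_a' links upwards and return every broader concept."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in index.get((current, "is_a"), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def patients_with(triples, concept):
    """Find patients diagnosed with the concept or anything narrower."""
    index = build_index(triples)
    matches = set()
    for subject, relation, obj in triples:
        if relation != "has_diagnosis":
            continue
        if obj == concept or concept in broader_concepts(index, obj):
            matches.add(subject)
    return sorted(matches)


metabolic_patients = patients_with(TRIPLES, "metabolic_disease")
```

No record mentions "metabolic_disease" directly, yet the query finds both diabetes patients because the relationship between the two concepts is part of the knowledge base. That kind of question is awkward to ask when only the raw records are stored.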
