Many real-world data sources contain unstructured text, making it difficult and time-consuming to glean actionable insights from the data. Natural language processing technology can alleviate this problem, writes Jane Z. Reed.
The proliferation of online data has created a potential trove of real-world insights for life sciences companies hungry to better understand patient safety risks, new market opportunities, and clinical effectiveness. New artificial-intelligence-based technologies, such as natural language processing (NLP), are allowing researchers to more quickly identify and extract meaning from multiple sources, including social media, call center feeds, full-text literature, safety reports and more.
For life sciences companies, real-world evidence (RWE) is critical, informing all phases of drug and device development, as well as commercialization. However, many real-world data (RWD) sources, like electronic health records, patient forums and social media, contain unstructured text, making it difficult and time-consuming for clinicians and researchers to glean actionable insights from the data. NLP technology can alleviate this problem.
Rather than retrieving documents based on keywords and leaving users to read them, NLP essentially reads the documents for users, identifies relevant facts and relationships, and extracts the information in a structured format for review and faster analysis. NLP can also connect these facts in new ways to synthesize knowledge and create actionable insights.
The following three examples detail how healthcare organizations have used NLP to mine unstructured data for insights that helped them solve real-world problems.
Identify market trends: Drug maker Novo Nordisk wanted to identify healthcare market trends and detect patterns from three disparate RWD sources: 1) call center feeds, 2) medical information requests and 3) conversations with healthcare providers. Novo Nordisk was already using this data, but through an inefficient and labor-intensive process in which vendors did manual extraction and scanning. They needed a better approach.
To solve the problem, Novo Nordisk built a workflow to transform RWD from the three sources to fuel a medical and patient dashboard, making medical and patient data actionable across its global workforce. Novo Nordisk hosts this information in an Amazon Web Services data lake, running NLP queries to pull out key topics and trends.
The new workflow eliminated the need for manual scanning, saving the company the equivalent of one full-time employee's cost per year. Novo Nordisk also reduced its spend on external vendor report generation, automated the generation of evidence-based insights, and significantly broadened access to these insights across its global team.
Understand market access and healthcare economics: Evaluating the potential for market access is essential for all pharmaceutical companies, and information to characterize the burden of disease and local standard of care in different countries is critical when launching any new drug. Prior to a launch, companies need to assess the landscape of epidemiological data, health economics and outcomes information to develop the optimal commercialization strategy, understand market options and avoid expenses associated with unnecessary epidemiological studies.
A top-10 pharma company was interested in surveying the market landscape in Latin America around drugs for three therapeutic areas: inflammatory bowel disease, ulcerative colitis and Crohn’s disease. Key data points to develop deeper market knowledge included risk factors, health status, clinical effectiveness, compliance, costs, hospitalization rates and quality-of-life indicators.
The company opted to utilize NLP text mining to extract, normalize, and visualize data from scientific journals, abstracts, and conferences. This diverse set of structured data was then easily fed to a visualization dashboard that provided at-a-glance information filtered by country, disease and other factors. The company then used this data to generate a comprehensive understanding of the available evidence and identify market gaps related to each of the three therapeutic areas.
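The "normalize" and "filter" steps in this example can be pictured with a small sketch. It is an illustration only, not the company's actual pipeline: the synonym table, the record fields (`country`, `disease_mention`, `topic`) and the sample rows are all invented for the purpose, and a production system would rely on curated ontologies rather than a hand-written dictionary.

```python
from collections import Counter

# Hypothetical synonym table: maps surface forms found in text
# to a single canonical disease label.
DISEASE_SYNONYMS = {
    "crohn's disease": "Crohn's disease",
    "crohn disease": "Crohn's disease",
    "ulcerative colitis": "ulcerative colitis",
    "colitis ulcerosa": "ulcerative colitis",
}


def normalize_disease(mention):
    """Return the canonical disease label, or None if the mention is unknown."""
    return DISEASE_SYNONYMS.get(mention.strip().lower())


def summarize(records, country=None):
    """Count extracted records per (country, disease), optionally for one country."""
    counts = Counter()
    for rec in records:
        disease = normalize_disease(rec["disease_mention"])
        if disease is None:
            continue  # skip mentions the synonym table does not cover
        if country is not None and rec["country"] != country:
            continue
        counts[(rec["country"], disease)] += 1
    return counts


# Invented rows standing in for facts already extracted from the literature.
EXAMPLE_RECORDS = [
    {"country": "Brazil", "disease_mention": "Crohn disease", "topic": "hospitalization rates"},
    {"country": "Brazil", "disease_mention": "Crohn's Disease", "topic": "costs"},
    {"country": "Mexico", "disease_mention": "colitis ulcerosa", "topic": "quality of life"},
    {"country": "Mexico", "disease_mention": "unknown condition", "topic": "costs"},
]

if __name__ == "__main__":
    for key, n in sorted(summarize(EXAMPLE_RECORDS).items()):
        print(key, n)
```

Once differently worded mentions collapse to one label, counts per country and disease become straightforward to feed into a dashboard filter.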
Mine EHR notes to evaluate heart failure device performance: A large health system based in the Midwest needed to analyze information on cardiac resynchronization therapy device outcomes and clinical characteristics for heart failure patients for a contract research project. The health system’s goal was to evaluate a range of outcomes to understand how well cardiac devices such as pacemakers were working across several patient populations.
However, a significant amount of the information needed for the project – ejection fractions, New York Heart Association classification details, symptoms, information about the device and outcomes, and any reasons for removal of the device – was saved as unstructured data and not easily accessible to researchers. The health system’s volume of information was so substantial – seven years, 100,000 patients, 34 million documents – that researchers estimated extracting all the needed information via manual processes such as chart reviews would require 55 person-years.
Using NLP, the health system was able to help the device manufacturer understand how to improve its implantable products and help its own clinicians make better data-driven decisions on cardiac treatment. Two employees of the health system, who did not have previous experience with NLP, completed the project in just two months with 95% accuracy.
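To give a feel for what turning a free-text note into structured fields means, here is a deliberately simple, rule-based sketch that pulls an ejection fraction and a New York Heart Association class out of a sentence. It is not the health system's method. The patterns, the function name `extract_cardiac_facts` and the sample note are assumptions made up for illustration, and real clinical NLP also has to cope with negation, abbreviations, misspellings and context that plain regular expressions miss.

```python
import re

# Matches e.g. "EF 35%", "LVEF: 25%", "ejection fraction of 30 %".
EF_PATTERN = re.compile(
    r"\b(?:LV)?(?:EF|ejection fraction)\b\D{0,15}?(\d{1,2})\s*%",
    re.IGNORECASE,
)

# Matches e.g. "NYHA class III", "NYHA functional class II".
# Longer numerals come first so "III" is not read as "I".
NYHA_PATTERN = re.compile(
    r"\bNYHA\s+(?:functional\s+)?class\s+(IV|III|II|I)\b",
    re.IGNORECASE,
)


def extract_cardiac_facts(note):
    """Return the first ejection fraction and NYHA class found in a note."""
    ef = EF_PATTERN.search(note)
    nyha = NYHA_PATTERN.search(note)
    return {
        "ejection_fraction_pct": int(ef.group(1)) if ef else None,
        "nyha_class": nyha.group(1).upper() if nyha else None,
    }


# Invented note text, written only to exercise the patterns above.
EXAMPLE_NOTE = (
    "Pt with CRT-D placed 2015. Echo today shows LVEF of 30%. "
    "Symptoms consistent with NYHA class III heart failure."
)

if __name__ == "__main__":
    print(extract_cardiac_facts(EXAMPLE_NOTE))
```

Each note becomes a small record with named fields, which is the form researchers need before they can compare outcomes across patient populations.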
Real-world data provides high value at all stages of drug and device development, from bench to bedside. But too often, critical real-world data is locked in unstructured documents, rendering manual curation slow and tedious. NLP extracts and synthesizes that high-value information, uncovering data that may lead to insights that change patients’ lives.
Jane Z. Reed, Ph.D. is Director Life Sciences at Linguamatics, an IQVIA company.