Data Vigilance and Validation: Lessons Learned, Future Hopes for Real-World Data

Pharmaceutical Executive, May 1, 2022, Volume 42, Issue 5

Biopharma experts at the forefront of applying real-world data and real-world evidence across the product life cycle converge to discuss the lingering challenges in fully realizing the vast potential of these insights to transform the value equation.

In March, nine biopharma executives joined Pharmaceutical Executive to discuss the ongoing advancement of real-world data (RWD) and real-world evidence (RWE) as critical components of the life sciences industry. With the use of RWD continuing to edge into every aspect of biopharma, from R&D strategy to commercialization and across all therapeutic areas, roundtable participants shared their insights into how the use of RWD has evolved at their organizations and how the lessons of the last two years will shape this use in the future. Discussing the challenges that remain around RWD and RWE, the roundtable (moderated by Peter Malamis, senior director, life sciences market development at Phreesia) aimed to shine a light on what is needed to ensure that the pharma C-suite understands the potential of RWD and where it fits into an organization’s overall strategy. Highlights of the discussion are presented here.

Peter Malamis: Are you seeing more alignment across your organization in terms of RWD strategies—top-down driven decisions regarding purchasing, partnerships, and standards—or is it still somewhat fragmented?

Dan Riskin, Verantos: In the advanced RWE space it was fully fragmented a few years ago, with the market access, medical affairs, and regulatory groups basically not talking with each other. Now, as people are spending more money to get high-validity evidence, we’re seeing people starting to work together. RWE divisions are being created that are trying to ensure that the information gained can span the organization. It’s more about the questions being asked and the evidence being produced than about the group that will use it.

Mike D’Ambrosio, Syneos Health: Within our company, we can have an integrated product offering, going from clinical to commercial and back again, leveraging circular insights as we go around. To our customers, we can show the value or what we perceive as the value at each end of the spectrum of the development cycle. But there is still some work to be done to bring those silos together, to make sure we’re getting the best use of high-value data across the product life cycle and beyond.

Kristen Bibeau, Incyte: I’m coming into an organization that has been rapidly growing over the past couple of years. I was the first epidemiologist hired, so I’ve been trying to bring those types of silos together and recognize the value of bringing in-house that data and that evidence generation that coordinates across the continuum of the life cycle. It’s been difficult, but we’re well on our way. The impetus of the FDA gaining some traction has really generated a top-down interest in how valuable this data can be, how it can speed efficiency of trials, how it can help leverage market access and reimbursement on the other end. So, we’re gaining steam, but I wouldn’t say that we’re all the way there yet.

Matthew W. Reynolds, IQVIA: It’s hard to give one answer because, for example, from the contract research organization (CRO) perspective, a company that might sell data and data solutions can still be highly fragmented, whereas others may be highly aligned due to focused efficiency. We need to keep in mind that even when we’re talking about the same real-world data, different groups within a pharmaceutical company may use that data differently. For example, with an open claims data set, a commercial team may want it because it’s nearly real-time and they are able to identify who’s prescribing, who’s switching medications for their patients, and how many new prescriptions are written. That same data may also be used by a clinical team to identify the geographic locations and specific sites where they can find their patients of interest for recruitment into prospective studies.

Robert J. LoCasale Jr., Sanofi: It’s still fragmented (by business unit or global function), but it’s okay for some areas to be fragmented because of their purpose (e.g., business needs). But when it comes to the data aspect, that’s where we’re trying to get more centralized because these are no longer low-cost items. When you start to spend high numbers on assets like this or on capability builds, you need to justify a return. We need to have a top-down centralization data strategy that allows for enterprise value generation, but also global business unit (GBU) and global function (GF) value generation. We’re largely seeing a tight integration on the data side, but it’s okay for the GBUs or GFs to generate differentiated value that’s important to them because they need to report vertically on that.

Debra A. Schaumberg, Thermo Fisher Scientific: The level of complexity compared to even a decade ago has increased exponentially. At Thermo Fisher Scientific’s PPD clinical research business, we’re organized with a cross-functional RWD evaluation and implementation structure, with representatives from the corporate strategy team, along with leaders in data architecture, technology, data curation, scientific strategy, core scientific disciplines (e.g., epidemiology, biostatistics, data science), and therapeutic area expertise, among several others, that are constantly evaluating the RWD/RWE landscape.

We felt the need to examine our structure because it is not just about the data anymore; the whole ecosystem has become immensely more complex. Our process has evolved from accessing, for example, broad claims and EMR (electronic medical record) databases to more focused and deeply curated offerings, such as therapeutic or disease area-specific databases. The addition of technology solutions and platform approaches to RWD, specifically those aimed at trying to achieve a 360-degree patient view, are all part of our new RWD ecosystem requiring a strong multifunctional team and evaluation structure.

There is also a stronger focus on the connectivity of data and ability to access real-world data using technology solutions that can, for example, extract EMR data from multiple sources, and potentially integrate those data at the patient-level with patient-generated health data, biomarker data, and other types of data.

Malamis: When your organizations are assessing the data sets, what are you looking for, given the current environment of greater integration, greater alignment, and increased complexity?

LoCasale: Ten to 20 years ago it was about whether you had the data features, the level of completeness, and the quality of those features. Those things are still important, but now it’s whether your data partner has advanced their product in those 10 to 20 years. Are they thinking outside of the traditional way we used to do this business? Are they putting any “skin” in the game with us as a company, because these (i.e., real world data investments) are not low-cost endeavors? We’re consumers, right? And partners need to show us a better product over time.

Riskin: For high-validity evidence, we have a fairly straightforward first bar, which is consistent with what FDA has put out, and we view them as a leading indicator of what good looks like. That’s completeness and accuracy. On the completeness side, if you haven’t linked EHR (electronic health record) unstructured with EHR structured, with claims, with a death registry, you are not going to be very complete, and it probably isn’t going to fall into the bucket of high validity. Accuracy is called out in the recent FDA guidance, but, over time, will be adopted by payers and providers if they desire high-validity evidence. And accuracy is simply a number. It always starts with completeness and accuracy.

Reynolds: Dan, I’m going to challenge you a bit here when you say that linking data gets you to more completeness. You’re right, that those new FDA guidance documents do talk about improving the data and the quality of it by linking the EHR and the medical claims data. But we need to keep in mind that every time we link data, we also lose a lot of data because many of the patients in Data Set 1 may not show up in Data Set 2. The more you link, the smaller that patient group with “complete data” becomes, and the more select or different they become from where we started. I think we have to be somewhat careful and counterbalance the decision of when to link, how to link, and how to supplement that data and make sure we’re thinking through what the implications of that linkage might be.
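
As a rough illustration of the linkage trade-off described above, the following Python sketch links three data sources on a shared patient identifier and reports how much of the starting cohort remains. The patient-ID sets, their sizes, and the helper names `linked_cohort` and `retention` are hypothetical, chosen only to make the arithmetic visible; real linkage is usually probabilistic or token-based rather than an exact ID match.

```python
# Hypothetical patient-ID sets for three illustrative sources (not real data).
ehr = set(range(0, 1000))        # 1,000 patients in the EHR source
claims = set(range(400, 1600))   # 1,200 patients in the claims source
deaths = set(range(700, 2000))   # 1,300 patients in the death registry


def linked_cohort(*sources):
    """Return the patients present in every source (an exact-match inner join on ID)."""
    cohort = set(sources[0])
    for source in sources[1:]:
        cohort &= source
    return cohort


def retention(*sources):
    """Fraction of the first source's patients that remain after linking to all others."""
    return len(linked_cohort(*sources)) / len(sources[0])


ehr_claims = linked_cohort(ehr, claims)
all_three = linked_cohort(ehr, claims, deaths)

print(f"EHR only:              {len(ehr)} patients")
print(f"EHR + claims:          {len(ehr_claims)} patients ({retention(ehr, claims):.0%} retained)")
print(f"EHR + claims + deaths: {len(all_three)} patients ({retention(ehr, claims, deaths):.0%} retained)")
```

In this toy setup each added source makes the per-patient record more complete but shrinks the cohort, from 1,000 to 600 to 300 patients. A real analysis would also need to check whether the patients who remain still resemble the population the study started with.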

Going back to the original question, “How do we choose our data?” I think it’s very different when a company is choosing a data set to purchase for multiple uses internally. They consider the utility of it, its breadth, and how many different questions it can answer. Because a data set’s not good or bad in and of itself, other than, as Dan pointed out, accuracy. If the data is accurate and representative of what you would expect, it can be a good data set if you ask it the right question, which is the second component of “How do I choose my data?” It’s very specific as to whether a data set fits the purpose for a particular research question. That’s a very different decision process than, “Do I purchase data for my company?” vs. “Do I purchase data to answer this specific research question?”

Riskin: I would agree on this issue of fitness for purpose. If you are trying to show Drug A is 20% more effective than Drug B in a subgroup, then you need to use really high-quality data to do that. If you’re just trying to say the prevalence of a condition is 3% vs. 3.5%, it doesn’t really matter. But I would disagree on the idea that it matters who the audience is, which is to say if you’re trying to say Drug A is 20% better than Drug B, does it matter if it’s a doctor or a payer or a regulator who’s consuming that? Each of them will change the standard of care by what they prescribe, reimburse, or approve. What matters is the question. It’s fitness for purpose on the question, not for the end user.

Simu Thomas, ALEXION: I would assess the quality of data using three different lenses. The first lens would be the level of meaningfulness: Is the data reflective of the question that needs to be answered? What are the implications of the purpose of the original data design and collection? The second lens is level of importance: How impactful will the data be in answering the question, and how can we enhance the credibility and acceptability of the inferences? The third lens is level of worthiness: How big of a need do we have with the data source? What is the practicality of answering the question with the resources we have? In rare diseases, the diagnosis journey is long and we may get only a handful of patients even from large academic centers. So, practicality and worthiness are big considerations. And underlying all of this would be the perspective of how the data fits with the wider spectrum of the patient journey, understanding what light the data sheds and what the other missing elements are. This is fundamental, especially in rare diseases where much of the information on the lived experience of patients needs to be fulfilled by RWE.

Alexa Berk, Invitae: Certainly, fitness for purpose, reproducibility, high quality, reliability, and validity are the kind of baselines we’re looking at in data sources. But what is the next level in thinking about the data partner? From the Invitae perspective, we have a very powerful genomics database and are now building out a more robust RWE offering that includes various platforms, partnerships with claims data providers, acquisitions, and partnerships with the human resources data providers. The acquisition of Citizen was certainly a big moment for Invitae in the RWE space because there’s only so much you can do with genomic data without having a robust clinical picture of each patient as well.

One thing we’re thinking about is the underlying data acquisition model of each of these data sources. Is it flexible, and is it robust in a way that it can replace some of these older ways of acquiring data with the same representativeness, with the same quality, with the same diversity of sites and diversity of patients? Are we really modernizing the way we’re collecting data or are we still just plugging into EHR systems and importing everything in the standard way? We’re trying to think innovatively about a genre-pushing RWD source, if you will, that looks at different mechanisms of getting the data that might address a lot of the limitations that we tend to see in the standard kind of EHR and claims databases out there.

Schaumberg: The end user of RWE is very important, particularly because their evidence needs and the sponsors’ available resources don’t necessarily support the same level of effort and analytic strategy for every single question we’re trying to answer. If I’m working on a medical affairs communication of an already approved medication that is going to inform practitioners or patients on their use of a medication, it’s a question of real-world safety and effectiveness, and potentially providing more evidence to extrapolate insights to broader patient groups than were included in the pivotal clinical trials, or to refine a target patient profile. That’s potentially a somewhat different standard than if I’m trying to get a new product or new molecular entity approved in the first place, for example, by incorporation of a RWD-constructed external comparator into a new drug application. In the case of a product or a use of the product that is still unproven, in which the regulatory authority has a mission to evaluate the safety and efficacy to protect public health, it’s reasonable that the end user (the regulatory authority) is going to set a higher bar because the risk of “getting it wrong” is potentially higher. There is a spectrum of what we can do that’s valid and meaningful, but the rigor of data collection, documentation, curation, and analysis is not all the same for every end user.

Bibeau: My experience across the industry is that there’s also a need for education for some of our clinical colleagues on the understanding of how data can be used, what the quality is, what needs to be in there vs. what doesn’t. The HEOR (health economics and outcomes research) teams, the market access teams are very comfortable with this because they’ve been doing it for many years. The clinical teams are still getting up to speed with what’s going to be acceptable to FDA with respect to quality and rigor.

Reynolds: Sometimes we knowingly and purposely use data that we know is not good/optimal. You’ll know this if you’ve worked with spontaneous report adverse event data (e.g., FAERS or WHO adverse event data), for example. The data has a number of flaws (under-reported numerator and no definable denominator), but it can be highly effective at identifying signals even if we are not providing a definitive, hypothesis-driven response. It might not be high-grade data, it is not complete, it may not be valid, but it’s incredibly effective. It’s very good at picking up signals and driving us to a better study on a data set that is more fit for purpose for driving subsequent hypothesis testing.
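
One common way signals are screened in spontaneous-report data despite the missing denominator is a disproportionality measure such as the proportional reporting ratio (PRR), which compares report proportions within the database rather than estimating incidence. The roundtable does not name a specific method, so the Python sketch below is an illustration only; the function name and the counts are hypothetical.

```python
def proportional_reporting_ratio(a, b, c, d):
    """Compute the PRR from a 2x2 table of spontaneous reports.

    a: reports for the drug of interest that mention the event
    b: reports for the drug of interest that do not mention the event
    c: reports for all other drugs that mention the event
    d: reports for all other drugs that do not mention the event
    """
    drug_proportion = a / (a + b)    # share of the drug's reports with the event
    other_proportion = c / (c + d)   # share of all other reports with the event
    return drug_proportion / other_proportion


# Hypothetical counts, for illustration only.
prr = proportional_reporting_ratio(a=20, b=480, c=100, d=9400)
print(f"PRR = {prr:.2f}")
```

Here the event appears in 4% of the drug's reports against roughly 1.05% of all other reports, giving a PRR of about 3.8. A value well above 1 flags a pairing for follow-up in a fit-for-purpose data set; it is not an estimate of risk, because reporting rates and the exposed population are unknown.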

Schaumberg: We recognize that data are just data. They don’t inherently do anything on their own to generate evidence or insights. We are focused on how we can make data smarter: how we apply the appropriate set of methodologies to the data we have (e.g., methods and level of curation, standardization, etc.) and/or improve the data if the available RWD are insufficient to support the type of evidence generation we need. Then, I’d tend to pivot the conversation from the data to the evidence that can be generated from the data. As a life sciences community in general, we have broad access to various sources of RWD; we’ve got structures and technologies in place to generate research-quality data out of data collected during routine patient visits, or directly from patients, and other sources. The question then becomes how do we apply the right methodologies and the right structures to generate the fit-for-purpose evidence (e.g., adhering to core epidemiological first principles when approaching a specific research question) that we need to satisfy all the relevant stakeholders (including regulators, providers, patients, payers, and others) and move the research forward, and further? And, just as we have methodologies ranging from the case series to the clinical trial, the level of RWE required might be different, depending on what the research question is, who the end users are, and a number of other factors.

D’Ambrosio: I think it’s also true in some cases that the data itself doesn’t always have to answer a specific question. It can also, in some respects, be directional. For example, if you’re thinking about how commercial might use RWD, well, RWD doesn’t necessarily need to fall within the realm of health data. It can also be insights that come back from medical science liaisons in conversations. It can be picking up chatter using natural language processing techniques and so forth. The validity of that data is sometimes questionable, but at the same time, it can be very directional in terms of immediate to short-term tactics that you might employ within that space.

Thomas: I think we all agree that there is no one golden data source, as they are usually set up for diverse purposes and they are complementary. It would be ideal to link them together to weave the different elements of the patient journey. However, one has to consider the time, resources, and opportunity they provide to help with the decision at hand.

Malamis: If there was an area of data and its applications that you could say has the most room for improvement, where would it be?

Reynolds: I think a major issue in RWD right now is getting a gold standard for mortality, given the declining completeness of, and hurdles to the usage of, data sources that we have used historically for mortality, like the Social Security data set. Oftentimes the RWD sets that we use are missing the components to be able to link to mortality because very few of them have it included, unless it’s a single health system or payer that happens to collect it directly from the patients. This is an endpoint that everybody seems to care about, but so often we are hindered in assessing it without having to conduct some sort of prospective study.

I know from the IQVIA perspective that often as we are bringing data in, especially the more patients you have and the more data and the more components you have, the higher the likelihood you’re going to be able to reidentify patients. Because of patient privacy issues, we tend not to bring that. We tend to scrub it out on the way in. So, there has to be a concerted effort on a particular project to try to figure out how we build this and if it is acceptable.

LoCasale: What’s most interesting to me about your answer, Matt, is the privacy-preserving access need. We need to enhance access but in a privacy-preserving way, not just for mortality but for a tremendous amount of other data elements. In the next three to five years, I see us moving further with the technology to be able to do that, because we need to get closer to the source data, which also minimizes this provenance issue (i.e., lack of transparency on provenance) that we have. We need to clear up the provenance issue and get better with the access because then we get those downstream endpoints that we really need.

Berk: To echo the mortality issue, certainly within oncology, five-year overall survival is a very important endpoint and so difficult to get from real-world evidence. Even without the privacy and identifier issue, it’s a challenge. At Citizen we have patient consent, so we don’t have the issue of a need to de-identify data, but still, to be transparent, mortality is a very challenging endpoint to capture because it is so poorly documented in the medical records. Almost without fail it needs to be externally generated as some sort of composite endpoint with a death index.

Another area for improvement would, of course, be the structuring of unstructured data. Again, even at Citizen, where unstructured medical records go through a machine learning pipeline as well as two levels of human clinical review for extraction standardization, it’s still incredibly time-consuming and difficult. It requires a high level of expertise from the human clinical reviewers. And if you want to have regulatory-grade data, you need human eyes on every single datapoint. Machine learning can assist with the lift and with the scale, but I don’t think it’s at the point yet where it can be trusted as a completely automated process.

Malamis: We heard the term “regulatory grade” there. What does that mean to you?

Schaumberg: I don’t think it’s always very meaningful because it’s a term being used loosely without reference to a clear consensus or detailed data standards framework for what would be considered “regulatory grade.” The required standard also depends on the specific use case, as well as the type of data. While there is emerging uniformity of guidance across agencies, there is also still wide authority within the organizations to evaluate data for its fitness for use depending on the specific application of the data and the role of the resulting evidence. In addition, there is not necessarily one unified voice or one unified standard; we’ve seen some indications of differing interpretations within regulatory agencies. In our experience, if we really dig in and evaluate data against the established regulatory RWE framework and resulting set of broad criteria, some RWD sets being termed “regulatory grade” might actually fail to satisfy all the elements regulatory guidance has suggested, depending on the specific way in which the data are being used to inform the regulatory decisions needing to be made.

Reynolds: I don’t think that you can look at a data set and say, “This is regulatory grade” without knowing what it’s being used for. To me, regulatory-grade means that you’ve assessed it for fitness for purpose, or fitness for use for a particular research question. A lot of data, if you chose it for a particular research question, is going to fall apart. It’s not going to meet that regulatory-grade level. But the question is, do you have something where you can say, “I feel comfortable answering this question with this data. I feel like I can identify the patients and the outcomes completely and accurately?”

Riskin: High-quality data is, at a basic level, about accuracy and completeness. It’s okay to use low-quality data for certain purposes. You use the data quality you need to achieve fitness for purpose. Regulatory-grade is a term we don’t use because we think it’s overused and underdefined. We say research-grade. There’s a ton of value for the idea of research-grade, for which you’ve actually put in the effort to make sure that you are fit for purpose, no matter who you’re speaking with.

Malamis: To close, if you could assign one to 10 to the real-world data environment, with 10 being ideal and one being nonexistent, where do you think we are?

D’Ambrosio: Best-case scenario, I would say a four. To jump back to one of the points Bob LoCasale made, are we getting value-add from our partners as opposed to just, “Here is our data?” From a methodological and a partnership perspective, how are we going to work together to use this data to solve a problem? From where we are to where we want to be, I think there is still a lot of work to do.

Reynolds: I didn’t think that five would be optimistic, but apparently it is! If you compare what we had to work with in the year 2000 to where we are now, it hasn’t been a straight line; but over the past three or four years particularly, we’ve seen an exponential acceleration in the ability to link data, expand data, and to create new data almost passively through patient EMR mediation, for example. I feel like things are going to expand and improve much more rapidly over the next five, 10 years than they did the past 20. So, I’d say a five, trending toward six.

Berk: I would say six. I tend to look on the sunny side of things, but even a few years ago, the idea that FDA would be releasing multiple draft guidances regarding the use of RWD was something we all hoped for but it was something that remained pending. And while they’re rough guidances, now they’re here and they’re informative.

The other thing we cannot overlook is the impact of COVID on the use and acceptance of RWE. It has sped things up and pushed things forward. Where I might have given a four or a five before COVID, the pandemic has really demonstrated the power and potential of RWE. The consensus view from the regulators down is that we’re not going to go back to the way we were looking at data before COVID. We’re going to use the momentum and continue to accelerate.

Schaumberg: I’m hovering around a four overall. I think a lot of what is left to be fully realized is related to the connectivity of data. Also, one thing we didn’t discuss was what’s in the medical record and what’s simply not there because it either wasn’t measured or wasn’t recorded in a sufficiently granular or standardized manner. Given that we’re operating within an ecosystem, some of the challenging work ahead of us is building systems based around the notion that there are additional uses for the data being collected in routine-care settings that extend beyond the clinician’s need for documentation and taking care of the patient. And can we move the EMR providers in a way that maybe structures the data differently or pulls in some different data elements that are simply not there currently to allow us to move research further? I think that’s a tough nut to crack, but it’s another element that I think will be instrumental in heightening the value of real-world data and evidence.

Bibeau: I’m going to echo that—what’s not in the data really drives my perspective on how we can leverage it. We work in oncology and dermatology, and in dermatology you want to know the severity of the disease. It’s got to be in the notes, but most of the time it’s not even there. Some of the frustrations we have with our data partners when we’re exploring new access is the transparency that the data partners bring to the table. It’s better if they say, “You know, we’ve got this field, but it’s not very complete. We still have work to do.” I value those types of conversations because then I know I can make a solid decision that is going to help us. So, I’d give it probably a five; there are still gaps that remain for me.

Riskin: What I want to give is a 10. What I’m going to give is a three. I think the majority of evidence produced uses traditional techniques. It’s not believable. With that said, there are a few green shoots out there. There’s work in oncology that I would give a six. Unfortunately, it’s fully manual, but it achieves high validity with two annotators and high inter-rater reliability. But I’d like to point out the promise here. The most important thing, in my opinion, happening to healthcare for a number of decades is using information from routine care. Every other industry does it. The idea that we’re going to use only a tiny fraction of information, i.e., the RCT, to learn and that we’re never going to compare or check subgroups is insanity.

LoCasale: 99.9% of us never enter a clinical trial; 100% of those people have a completely real-world journey. With digital today and the data footprint we’re dropping all over the place, we should be much better at collecting that footprint. Because we’re not there, I would only say four or five at best today.

Thomas: My number is five. Not because of the limitations but more because of the future opportunity and potential. The recent COVID situation has catapulted digital enablement and experiences in the common household. Technology has reduced our dependence on typical measurement tools and we are far less constrained by boundaries. With wearables, we now have the new potential to address what is usually not measured or captured using traditional methods. I’m reminded of a perspective of a patient who said (paraphrased), “You’re measuring all these things based on my experience at the hospital or the doctor’s clinic, but that’s not where I live. I live in my house and outside of the health system, and that is where I spend the majority of my time. So, if you want real-world experience, come and measure where I live.” It is important to think of people’s lifestyles and how their lived experience can be translated, and digital opens up that opportunity.

Julian Upton is Pharm Exec’s former European and Online Editor, and currently Editor-in-Chief of sister publication Pharmaceutical Commerce. He can be reached at jupton@mjhlifesciences.com.