Mining Cancer Kryptonite from Historic Clinical Trial Data

May 12, 2017

Clinical trial data from completed trials are now beginning to be shared through a variety of channels. While skeptics remain, evidence is growing that these data can lead to practice-changing behavior in patient care.

For decades, the randomized clinical trial has been the gold standard for determining the safety and efficacy of new treatments for cancer and other diseases. Typically, patients enroll in a clinical trial and receive one of at least two possible treatments. Data are collected throughout the clinical trial for each patient, and then, after the trial, the data are analyzed, and an outcome is determined. Some treatments are identified as effective and safe; others are not. Regardless of the outcome, the trial data are typically archived at the conclusion of the trial.

These archived clinical trial data, however, have enormous value to the research community, and to patients, beyond their primary association with the specific trial for which they were collected. The data are rich in demographics, laboratory and other measurements, and timing information, and can be even more valuable when combined with data from other clinical trials. Unfortunately for patients and researchers, these data have traditionally been unavailable to anyone outside the organizations that funded the original research.

The walls that restrict access to data are slowly being taken down. Clinical trial data from completed clinical trials are now beginning to be shared through a variety of channels. While skeptics remain, often struggling with legitimate concerns such as patient privacy and commercial interests, the evidence is growing that these data can lead to practice-changing behavior when it comes to patient care.

Data sharing associated with cancer trials is beginning to lead to important new scientific insights. This is because for cancer trials, unlike trials for most other indications, the comparator arm used to evaluate the effectiveness of the experimental arm is not a placebo but an active treatment (i.e., “standard of care”). Clinical trial sponsors are more willing to share comparator arm data than experimental arm data, and the comparator data are still rich in value because they include not only measures associated with disease progression but also measures associated with active treatments.

The value of the active comparator arm data can be seen in a series of peer-accepted manuscripts that have been recently published based upon data aggregated within the Project Data Sphere Cancer Research Platform.1 These manuscripts, based solely upon the comparator arm data shared by research sponsors, are the tip of the iceberg when it comes to generating insights from clinical trial data.

What does this mean for patients and their healthcare providers? It means improved clinical trial design that will likely lead to improved treatment options, more informed decisions, better care and, hopefully, improved patient outcomes.

  • Anthony Joshua, et al.,2 developed an improved prognostic model for patients with prostate cancer. Prognostic models such as this one help forecast the likely outcomes for patients based upon a variety of factors, and Joshua was additionally able to determine that low-molecular-weight heparin and warfarin were associated with poorer patient survival, while patients taking metformin and COX-2 inhibitors exhibited better outcomes.

  • Tito Fojo, et al.,3 determined that prostate cancer tumors grow at different rates when treated with docetaxel vs. a combination of prednisone and mitoxantrone. He was also able to determine that trials with fewer patients could be used to obtain similar outcome results, which could reduce the number of patients that need to be enrolled in a trial. Fewer patients in one trial translates not only into lower costs per trial but also into more patients available for other trials, which helps stretch the very narrow pool of cancer patients willing to participate in clinical trials.

  • James Costello, et al.,4 led an effort to identify new and improved prognostic models for prostate cancer, and his research team’s results were significantly superior to the prognostic model that was most widely used at the time. Additionally, aspartate aminotransferase was newly identified as an important and previously under-reported prognostic biomarker associated with prostate cancer. One interesting consideration here is that the model was identified through a crowd-sourced challenge, and the winning model was developed by a team of data scientists who had little pre-existing knowledge regarding prostate cancer. Without data being shared, these data scientists would never have had access to this dataset.

  • Atul Butte, et al.,5 determined that patients taking docetaxel plus prednisone survived significantly longer than patients treated only with prednisone, and that survival rates were higher for patients treated with docetaxel plus prednisone than for those treated with mitoxantrone plus prednisone.

These new research and practice-changing insights would not have been made if not for the growing availability of shared clinical trial data. In recent years, several of these data sharing platforms have emerged, and they differ in how open they are and in how quickly, and through what process, researchers gain access to the data. Notably, open-access data sharing platforms are showing more value than gatekeeper-controlled data sharing platforms.6 These open-access platforms recognize that the value of ready access to shared data outweighs the risks that more tightly controlled access is meant to address, and the evidence can be found in the steady stream of peer-accepted manuscripts being driven by this open-access shared data.

As these new insights are peer-accepted for publication, they will result in greater interest in providing more data to data-sharing repositories, and this will lead to further insights, creating a self-sustaining virtuous cycle. Once published, these insights will, in turn, migrate to healthcare practices.

Dave Handelsman is Vice President for Development, Project Data Sphere, LLC


  • N. Geifman and A. J. Butte, et al. A patient-level data meta-analysis of standard-of-care treatments from eight prostate cancer clinical trials, Nature Scientific Data, May 2016,