Normalizing Data with AI: Q&A with Andrew Anderson, VP of Innovation and Informatics Strategy at ACD/Labs


Anderson discusses how AI can be used to take data collected by different processes and bring it into a usable format.

Andrew Anderson, VP of innovation and informatics strategy

AI and machine learning are based around data, but as anyone in the pharmaceutical industry can attest, not all data is created the same way. Andrew Anderson, VP of innovation and informatics strategy at ACD/Labs, spoke with Pharmaceutical Executive about the ways that these algorithms can bring data into more easily usable formats.

Pharmaceutical Executive: What role can AI and ML play in drug development?
Andrew Anderson: It’s an interesting time, with a lot of excitement around AI and machine learning. We see the success stories. The challenge is moving to scale. I’ve used this analogy a few times: it’s easy to boil a quart of water, but boiling a tanker car full of water requires a different system or platform. What I’ve seen, and what our collaborators have seen, is that when it comes to AI and ML, the promise is there.

When you have a structured training set, you can develop really good models that give you useful, insightful, and revealing output. What I see as the greatest impediment to scale is the volume, variety, and disposition of the training data that is required for scientific applications.

I’ve seen many presentations from startup AI firms where the data source is ChEMBL, the curated database from the European Bioinformatics Institute. A lot of those training sets come from literature, and the ChEMBL data is human-extracted, curated data. The challenge is that the source of the data is often not the proprietary screening results of a large pharmaceutical organization. Getting to ubiquitous scale, where you can accurately predict therapeutically relevant attributes such as information about a target, what will modulate the activity of that target, and what the biochemical consequences are, requires a lot more data. This is particularly true for predicting off-target effects.

You can predict cell activity fairly well, but the off-target effects still require a good deal of wet-lab experimentation. In terms of finding things that will have a therapeutic effect, scale and cell activity are among the challenges we’ve heard about.

PE: In what other areas are you seeing opportunities?
Anderson: The other place we see an interesting opportunity in applying AI and ML technology is pharmaceutical development. Let’s say you’ve identified through your discovery process a molecule or some sort of therapeutic that has been shown to have an effect. The challenge lies in preparing the processes to make your clinical trial material, and in how AI tools can be put to good use there.

When you make a material, the regulatory authorities require that you follow certain conventions. Good manufacturing practice (GMP) is one of those standards that one must adhere to when producing material that will be subject to human contact or ingestion. Developing processes that produce high-quality material is one thing, but ensuring that those processes are reproducible and reliable is a harder, but desirable, goal in pharmaceutical development.

Where can you apply predictive modeling and be able to reduce the amount of experimentation you must do? There is still a need to do confirmatory studies. You make a material and maybe the process to make that material is AI-guided. Once you make the material, you’ll confirm that you have a high degree of reproducibility and high quality.

PE: What is the industry focused on at the moment?
Anderson: What labs and our customers are focused on is applying AI principles to the development of those processes. What do they require? Lots of training data in a format that these systems can consume. The data generated during these processes has the same problem as in discovery: all of the different assays and processes produce data in different formats. The next time you need to do a discovery campaign, how can you use your prior knowledge to accelerate the process of discovery and development?

For us, it’s about data engineering. I’ve heard a stat that a large share of the total cost to implement an effort in AI and ML is data engineering. It’s about cutting down that cost.

Here’s the good news: the need for strict formats is changing as generative AI evolves, in particular large language models. You don’t necessarily have to prepare the data in a very strict structure, as long as you can access it, train your large language model, and describe the disposition of the data. What we’re investigating is how to take these very nascent additions to the AI portfolio and best leverage them. It’s a very exciting time.

If you and I were having this conversation three months ago, it would’ve been very different. It’s exciting, not just for the pharmaceutical industry, but for society in general.
