How AI could transform drug modeling and pharmacometric workflows


In a sweeping new analysis, researchers from the University of Pavia have laid out a bold roadmap for the future of pharmacometrics in the age of artificial intelligence. The study, “Pharmacometrics in the Age of Large Language Models: A Vision of the Future”, published in Pharmaceutics, examines how large language models (LLMs) like ChatGPT could support, augment, or even transform the deeply technical workflows that define model-informed drug development (MIDD).

While large language models are being rapidly integrated into many aspects of biomedical research, clinical diagnostics, and drug discovery, the study highlights how their application in pharmacometrics, a field built on mathematical modeling of pharmacokinetics (PK), pharmacodynamics (PD), and disease progression, has so far been sparse and exploratory.

The authors offer a critical evaluation of existing use cases, map out a series of plausible future applications, and identify the technical and methodological hurdles that must be addressed before LLMs can be meaningfully deployed in high-stakes pharmaceutical modeling environments.

Can LLMs assist pharmacometrics without replacing mechanistic models?

The research tackles whether LLMs can serve as useful tools in pharmacometrics without undermining the integrity of mechanistic, domain-specific models. The analysis shows that while LLMs are unlikely to replace mechanistic PK/PD models in the near future, they have significant potential as collaborative assistants.

Current documented uses of LLMs in pharmacometrics are mostly confined to code generation, whether in general-purpose languages such as R or in NONMEM control stream syntax. Tools like ChatGPT, Microsoft Copilot, Claude, and Gemini have been tested on tasks such as drafting model code, generating visualizations, simulating drug behavior, and interpreting control streams. The results have been mixed. In some cases, models like Claude and GPT-4o demonstrated accurate translation of NONMEM outputs into structured summaries or Python scripts. However, challenges remain in maintaining reproducibility and correctness, particularly when models are tasked with more complex or multistep problems.
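
To illustrate the kind of translation the study describes, the short sketch below shows a one-compartment oral-absorption PK model written as a plain Python simulation, roughly the sort of script an LLM might produce when asked to re-express a NONMEM control stream. It is not taken from the paper; the parameter values, dose, and function names are illustrative assumptions.

```python
# Minimal sketch: a one-compartment oral-absorption PK model as a Python
# simulation, the kind of script an LLM might be asked to produce from a
# NONMEM control stream. Parameter values and dose are illustrative assumptions.
import numpy as np
from scipy.integrate import odeint

def one_compartment_oral(y, t, ka, cl, v):
    """First-order absorption from a depot and first-order elimination."""
    a_depot, a_central = y
    return [-ka * a_depot, ka * a_depot - (cl / v) * a_central]

ka, cl, v, dose = 1.2, 5.0, 35.0, 100.0     # 1/h, L/h, L, mg (assumed values)
times = np.linspace(0, 24, 97)              # 24 h grid in 15-minute steps
amounts = odeint(one_compartment_oral, [dose, 0.0], times, args=(ka, cl, v))
conc = amounts[:, 1] / v                    # central concentration, mg/L
print(f"Cmax = {conc.max():.2f} mg/L at t = {times[conc.argmax()]:.2f} h")
```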

A key example is the Data-Driven Discovery (D3) framework developed by Holt and colleagues. This framework employed three specialized LLMs to automate the construction and refinement of pharmacological models based on system dynamics and existing drug data. Though exploratory, the approach demonstrated that LLMs could move beyond mere assistants to become active contributors in hypothesis generation and model building.
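
The D3 paper's internal architecture is not reproduced here, but the following schematic sketch conveys the general idea of specialized LLM roles iterating over a propose-critique-refine loop. The call_llm placeholder and the role prompts are assumptions, not the published design.

```python
# Schematic sketch of an iterative propose -> critique -> refine loop with
# specialized LLM roles, in the spirit of the D3 framework. The call_llm
# placeholder and the role prompts are assumptions, not the published design.
def call_llm(role_prompt: str, task: str) -> str:
    """Hypothetical stand-in for a call to an LLM with a given role."""
    raise NotImplementedError

def refine_model(system_description: str, data_summary: str, rounds: int = 3) -> str:
    model_code = call_llm("You draft pharmacological model code.", system_description)
    for _ in range(rounds):
        critique = call_llm("You critique model structure and fit plausibility.",
                            f"Model:\n{model_code}\n\nData summary:\n{data_summary}")
        model_code = call_llm("You revise model code to address a critique.",
                              f"Model:\n{model_code}\n\nCritique:\n{critique}")
    return model_code
```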

However, the study states that LLMs, in their current form, lack adequate training on pharmacometrics-specific languages and datasets. This underrepresentation significantly hinders their performance in modeling environments that rely on domain-specific software like Monolix, NONMEM, or Stan. The authors argue that to unlock the full utility of LLMs in this domain, targeted fine-tuning on pharmacometrics datasets and workflows is essential.

Where can LLMs make the greatest immediate impact?

The research identifies seven critical areas in the pharmacometrics workflow where LLMs can provide immediate and high-impact assistance: information retrieval and synthesis, data collection and formatting, code generation and debugging, model building and covariate selection, support for PBPK and QSP modeling, report writing, and pharmacometrics education.

LLMs could help accelerate information synthesis at the outset of modeling projects by extracting key data on disease mechanisms, biomarkers, and prior modeling assumptions from clinical literature. This could streamline the model rationale phase and improve alignment with regulatory standards. Similarly, LLMs can assist in parsing and structuring raw data from sources like electronic health records and clinical notes. Studies have shown that ChatGPT-3.5, for instance, could outperform traditional natural language processing tools in extracting relevant structured data from pathology reports.
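
As a concrete, hypothetical illustration of this kind of extraction task, an LLM could be asked to return structured JSON from a free-text note. The prompt wording, field names, and call_llm helper below are assumptions, not the cited study's protocol.

```python
# Hypothetical sketch of LLM-based structured extraction from a clinical note.
# The prompt wording, field names, and call_llm placeholder are assumptions.
import json

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call that returns a JSON string."""
    raise NotImplementedError

def extract_fields(note: str) -> dict:
    prompt = (
        "Return a JSON object with the fields age, weight_kg, "
        "serum_creatinine_mg_dl, and concomitant_medications, "
        "extracted from this clinical note:\n\n" + note
    )
    return json.loads(call_llm(prompt))
```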

The use of LLMs for code generation and debugging is currently the most active area of application. With enhancements like Codex and GitHub Copilot, LLMs have proven adept at writing, translating, and annotating code. While these capabilities have shown strong results in general-purpose languages, their performance in pharmacometrics-specific syntax is limited. The study points to a need for community-driven efforts to curate and train LLMs on domain-specific pharmacometrics codebases to bridge this gap.
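
One way such community curation could look in practice is a shared set of prompt/completion pairs for supervised fine-tuning. The snippet below is a purely illustrative sketch of that format; the file name and example content are assumptions.

```python
# Illustrative sketch of community-curated prompt/completion pairs for
# fine-tuning an LLM on pharmacometrics code. File name and content are assumed.
import json

examples = [
    {
        "prompt": "Write a NONMEM $PK block for a one-compartment model with "
                  "first-order absorption, parameterized by KA, CL, and V.",
        "completion": "$PK\n KA = THETA(1)*EXP(ETA(1))\n CL = THETA(2)*EXP(ETA(2))\n"
                      " V  = THETA(3)*EXP(ETA(3))\n S2 = V",
    },
]

with open("pmx_finetune.jsonl", "w") as fh:
    for example in examples:
        fh.write(json.dumps(example) + "\n")
```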

Another important avenue is the potential for LLMs to assist in population model development and covariate screening. The study speculates that fine-tuned LLMs could help identify biologically plausible covariates, suggest alternative model structures, and even flag inconsistencies in diagnostic outputs. This would mark a shift from simple task execution to intelligent decision support.
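
To make the covariate-screening idea concrete, here is a minimal, self-contained sketch of the kind of univariate screen whose output an LLM assistant might be asked to rank or justify biologically. The simulated data and the weight-clearance relationship are assumptions, and this is not the paper's method.

```python
# Minimal sketch of a univariate covariate screen on simulated data (values,
# covariates, and the weight-clearance relationship are assumptions). An LLM
# assistant might be asked to rank or biologically justify such candidates.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 80
covs = {
    "weight_kg": rng.normal(70, 12, n),
    "age_yr": rng.normal(55, 15, n),
    "crcl_ml_min": rng.normal(90, 25, n),
}
# Simulated post-hoc clearance estimates driven mainly by body weight.
cl_i = 5.0 * (covs["weight_kg"] / 70.0) ** 0.75 * np.exp(rng.normal(0, 0.2, n))

for name, values in covs.items():
    r, p = stats.pearsonr(np.log(cl_i), values)
    print(f"{name:>12}: r = {r:+.2f}, p = {p:.3f}")
```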

In complex modeling domains like physiologically based pharmacokinetic (PBPK) and quantitative systems pharmacology (QSP), the authors see longer-term potential. LLMs could help translate textual descriptions of physiological processes into differential equations and modular code blocks, thereby accelerating model construction. LLMs could also automate parameter sourcing by retrieving physiological constants from the literature and facilitate model navigation for non-expert users through conversational interfaces.
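
A toy example of that translation step, under entirely assumed physiology and parameter values, might turn the sentence "the drug distributes from blood to a flow-limited liver, where it is cleared, and to the rest of the body" into modular ODE code like this:

```python
# Toy sketch of turning a textual physiological description into modular ODEs:
# blood, a flow-limited liver with intrinsic clearance, and a lumped "rest of
# body" compartment. All flows, volumes, and partition coefficients are assumed.
import numpy as np
from scipy.integrate import odeint

def pbpk_rhs(y, t, p):
    c_bl, c_li, c_re = y                                   # concentrations, mg/L
    dc_bl = (p["q_li"] * c_li / p["kp_li"] + p["q_re"] * c_re / p["kp_re"]
             - (p["q_li"] + p["q_re"]) * c_bl) / p["v_bl"]
    dc_li = (p["q_li"] * (c_bl - c_li / p["kp_li"]) - p["cl_int"] * c_li) / p["v_li"]
    dc_re = p["q_re"] * (c_bl - c_re / p["kp_re"]) / p["v_re"]
    return [dc_bl, dc_li, dc_re]

p = {"q_li": 90.0, "q_re": 210.0, "v_bl": 5.0, "v_li": 1.8, "v_re": 35.0,
     "kp_li": 2.0, "kp_re": 1.5, "cl_int": 30.0}           # L/h and L, assumed
y0 = [100.0 / p["v_bl"], 0.0, 0.0]                         # 100 mg IV bolus, assumed
t = np.linspace(0, 12, 121)
sol = odeint(pbpk_rhs, y0, t, args=(p,))
print(f"Blood concentration at 12 h: {sol[-1, 0]:.3f} mg/L")
```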

Will LLMs eventually predict clinical outcomes?

The study examines whether LLMs might one day replace traditional PK/PD models as direct predictive engines. Several emerging projects offer early hints. The NYUTron system, trained on billions of words from hospital records, has demonstrated high accuracy in predicting patient outcomes such as in-hospital mortality. Similarly, GPT-based models have shown promise in survival analysis, outperforming traditional statistical models in forecasting post-operative outcomes for cancer patients.

The OncoGPT framework, still conceptual, proposes a way for LLMs to predict cancer treatment responses by learning mappings between treatment inputs and clinical outcomes. By integrating multimodal data (clinical, genomic, and radiological), LLMs could drive one-step-ahead treatment optimizations in a reinforcement learning environment.

In time-series forecasting, LLMs have been repurposed to predict temporal patient trajectories using structured and unstructured EHR data. Digital twin frameworks such as DT-GPT are already using fine-tuned biomedical LLMs to simulate disease progression and personalize treatment planning. These systems create virtual representations of patients and offer individualized therapeutic recommendations, sometimes mirroring expert decisions.
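
The sketch below is not DT-GPT's actual interface; it simply illustrates the underlying idea of serializing a lab trajectory into text and asking a fine-tuned biomedical LLM, represented here by a hypothetical call_llm placeholder, for a one-step-ahead forecast. The example values are assumptions.

```python
# Schematic sketch (not DT-GPT's actual interface) of serializing a lab
# trajectory into text for a one-step-ahead LLM forecast. The call_llm
# placeholder and the example values are assumptions.
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a fine-tuned biomedical LLM call."""
    raise NotImplementedError

def forecast_next(history: list[tuple[int, float]], lab_name: str) -> str:
    series = ", ".join(f"day {day}: {value}" for day, value in history)
    prompt = (f"Patient {lab_name} measurements so far: {series}. "
              "Predict the value at the next scheduled visit and explain briefly.")
    return call_llm(prompt)

# Example usage with an assumed creatinine trajectory (mg/dL):
# forecast_next([(0, 1.1), (7, 1.3), (14, 1.6)], "serum creatinine")
```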

Despite these advances, the study notes that such approaches still face hurdles in robustness, generalizability, and interpretability, especially within regulated environments like drug development. Nonetheless, the authors contend that the combination of LLMs’ predictive capacity with digital health data could usher in a new era of precision pharmacometrics.

A hybrid future for pharmacometrics

To sum up, the future of pharmacometrics in the LLM era will likely be one of hybrid intelligence. Rather than fully replacing traditional modeling, LLMs are poised to become collaborative reasoning partners that enhance productivity, broaden access, and accelerate innovation. Platforms like InsightRX Apollo AI, which deploy multiple LLM agents under a planning controller to guide analysis, exemplify this evolving vision.
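
The sketch below is not the Apollo AI architecture itself; it is a schematic illustration, with an assumed call_llm placeholder and invented agent roles, of how a planning controller might route sub-tasks to specialized LLM agents.

```python
# Schematic sketch (not the actual Apollo AI design) of a planning controller
# that routes sub-tasks to specialized LLM agents. The call_llm placeholder
# and the agent roles are assumptions.
def call_llm(role_prompt: str, task: str) -> str:
    """Hypothetical stand-in for a call to an LLM agent."""
    raise NotImplementedError

AGENTS = {
    "data_qc": "You check pharmacometric datasets for formatting problems.",
    "modeling": "You draft and revise PK/PD model code.",
    "reporting": "You summarize modeling results in a regulatory-style report.",
}

def run_analysis(request: str) -> dict:
    plan = call_llm("You are a planner. Reply with agent names, one per line, "
                    "chosen from: " + ", ".join(AGENTS), request)
    return {name.strip(): call_llm(AGENTS[name.strip()], request)
            for name in plan.splitlines() if name.strip() in AGENTS}
```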

Still, rigorous validation, domain-specific training, and regulatory alignment remain essential for responsible deployment. The researchers call for interdisciplinary collaboration between academia, industry, and regulators to develop the infrastructure needed to support safe and effective integration of LLMs into the pharmacometric toolkit.
