AI in medicine gets a boost from synthetic data, but only the smart kind

A new peer-reviewed study has found that synthetic medical data can enhance the performance and transparency of artificial intelligence algorithms in healthcare, but only under specific conditions. Published in Electronics, the study evaluates how machine learning and deep neural network models perform when trained on real, synthetic, and hybrid datasets, revealing both potential benefits and risks tied to data sensitivity and explainability.
The study, titled "The Explanation and Sensitivity of AI Algorithms Supplied with Synthetic Medical Data" and conducted by researchers from Dunărea de Jos University of Galați, Romania, tests the effectiveness of synthetic data using two widely used medical datasets: the Pima Indians Diabetes Dataset (PIDD) and the Breast Cancer Wisconsin Diagnostic Dataset (BCWD). Using a range of machine learning models and custom-built neural networks, the authors compared model performance across multiple scenarios, including real-only data, synthetic-only data, and combined configurations.
Synthetic medical data, typically used to address privacy restrictions or to balance underrepresented cases in training data, is increasingly common in healthcare AI. However, the study warns that improper use or poor-quality generation techniques can introduce distortions, undermining model performance and producing misleading outputs. The researchers found that in some cases, synthetic data significantly improved classification accuracy. In others, it degraded results or misaligned model interpretations, raising questions about reliability.
For the BCWD dataset, models trained solely on real data outperformed all synthetic and hybrid variants. Random forest classifiers using the original data achieved an accuracy of 97.2%, while the same models trained on Gaussian Copula Synthesizer (GCS) data dropped to 73.4%, and models trained on hybrid data reached only 86%. This suggested that the original dataset was already well-distributed, and that artificial augmentation was unnecessary, and potentially harmful, in this context.
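The article does not say which implementation of the Gaussian copula approach the researchers used. The sketch below shows one common way to reproduce this kind of real-versus-synthetic-versus-hybrid comparison, assuming the open-source SDV library for the copula model and scikit-learn for the classifier; the file path and column names are illustrative, and the paper's exact accuracies will not be reproduced.

```python
# Hedged sketch of a real vs. GCS-synthetic vs. hybrid comparison.
# Assumes the SDV library for the Gaussian copula model and scikit-learn
# for the random forest; file/column names are illustrative.
import pandas as pd
from sdv.metadata import SingleTableMetadata
from sdv.single_table import GaussianCopulaSynthesizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

real = pd.read_csv("bcwd.csv")  # Breast Cancer Wisconsin data (illustrative path)
train, test = train_test_split(real, test_size=0.2, random_state=42,
                               stratify=real["diagnosis"])

# Fit a Gaussian copula to the real training split, then sample a
# same-sized synthetic set from the learned joint distribution.
metadata = SingleTableMetadata()
metadata.detect_from_dataframe(train)
synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.fit(train)
synthetic = synthesizer.sample(num_rows=len(train))

def held_out_accuracy(df: pd.DataFrame) -> float:
    """Train a random forest on df; always evaluate on the real test split."""
    clf = RandomForestClassifier(n_estimators=200, random_state=42)
    clf.fit(df.drop(columns="diagnosis"), df["diagnosis"])
    preds = clf.predict(test.drop(columns="diagnosis"))
    return accuracy_score(test["diagnosis"], preds)

print("real only: ", held_out_accuracy(train))
print("GCS only:  ", held_out_accuracy(synthetic))
print("hybrid:    ", held_out_accuracy(pd.concat([train, synthetic])))
```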
Results differed, however, on the diabetes dataset. The PIDD data, known for its class imbalance, benefited from synthetic augmentation methods such as the Synthetic Minority Oversampling Technique (SMOTE). When SMOTE-generated samples were combined with the original data, classification accuracy rose from 78.5% to 94.2% using an Extra Trees classifier in PyCaret's AutoML framework. Custom deep neural networks also performed better, with the more complex DNN2 model achieving 89.7% accuracy on SMOTE-augmented data compared to lower results on the original dataset alone.
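A minimal sketch of that SMOTE-plus-Extra-Trees pipeline follows, written directly against imbalanced-learn and scikit-learn rather than PyCaret's AutoML wrapper (in PyCaret, passing fix_imbalance=True to setup() applies SMOTE by default). The file path and column names follow the public Pima dataset and are assumptions.

```python
# Minimal SMOTE + Extra Trees sketch, assuming imbalanced-learn and
# scikit-learn; the paper ran this through PyCaret's AutoML, not shown here.
import pandas as pd
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("pima_diabetes.csv")  # illustrative path
X, y = df.drop(columns="Outcome"), df["Outcome"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# SMOTE interpolates new minority-class samples between real neighbours;
# fit_resample returns the original training rows plus the synthetic ones.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

clf = ExtraTreesClassifier(n_estimators=300, random_state=42)
clf.fit(X_res, y_res)
print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```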
To further test data sensitivity, the authors introduced synthetic features derived from discretization techniques, converting continuous values into categorical representations. These transformations improved feature salience in several configurations, further boosting model performance, especially for the diabetes classification task.
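The article does not specify which discretization technique was used. Scikit-learn's KBinsDiscretizer is one plausible way to derive such categorical companion features, sketched below with invented glucose values.

```python
# Illustrative discretization of a continuous feature into ordinal bins,
# assuming scikit-learn's KBinsDiscretizer (the paper's exact method is
# not specified in the article).
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

glucose = np.array([[85.0], [120.0], [150.0], [199.0]])  # invented values

# Quantile binning assigns each value an ordinal bin index; the binned
# column can be appended to the feature set alongside the raw value.
disc = KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="quantile")
print(disc.fit_transform(glucose).ravel())  # e.g. [0. 1. 2. 2.]
```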
The study also incorporated LIME (Local Interpretable Model-Agnostic Explanations) to evaluate which features contributed most to the models' decisions. In high-performing models using SMOTE or feature-enriched data, key indicators such as body mass index (BMI), insulin levels, and glucose concentration were ranked as the most influential, aligning with clinical expectations. In contrast, models built on GCS data often produced less coherent feature importance rankings, suggesting potential distribution drift or over-smoothing in the synthetic generation process.
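As a rough illustration of how LIME produces such rankings, the sketch below explains a single prediction. It assumes the lime package and reuses the fitted classifier and splits (clf, X_train, X_test) from the SMOTE sketch above; these names are assumptions, not the paper's code.

```python
# Sketch of a per-prediction LIME explanation, assuming the lime package
# and the fitted classifier / data splits from the SMOTE sketch above.
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    training_data=np.asarray(X_train),
    feature_names=list(X_train.columns),
    class_names=["non-diabetic", "diabetic"],
    mode="classification",
)

# LIME perturbs the chosen row, queries the model, and fits a local linear
# surrogate; the resulting weights rank features such as glucose or BMI.
exp = explainer.explain_instance(
    np.asarray(X_test)[0], clf.predict_proba, num_features=5)
print(exp.as_list())  # [(feature condition, weight), ...]
```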
While PyCaret’s AutoML platform was praised for its efficiency and speed, especially in early testing stages, the authors noted that carefully constructed deep neural networks offered greater flexibility and interpretability when tuned properly. The advanced DNN2 architecture, which included five hidden layers and dropout stages to prevent overfitting, demonstrated particular value when applied to hybrid datasets.
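The article describes DNN2 only as having five hidden layers with dropout stages. The Keras sketch below is one plausible reading of that description; the layer widths, dropout rates, and optimizer settings are chosen for illustration, not taken from the paper.

```python
# Illustrative five-hidden-layer network with dropout in the spirit of the
# article's DNN2; all hyperparameters here are assumptions, not the paper's.
from tensorflow import keras
from tensorflow.keras import layers

def build_dnn2(n_features: int) -> keras.Model:
    model = keras.Sequential([
        layers.Input(shape=(n_features,)),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.3),                    # dropout stages curb overfitting
        layers.Dense(96, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.2),
        layers.Dense(32, activation="relu"),
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # binary diagnosis output
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_dnn2(n_features=8)  # the Pima dataset has 8 input features
model.summary()
```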
Despite these gains, the researchers cautioned that synthetic data remains a double-edged sword. Without careful selection of generation methods, integration techniques, and validation tools, AI models may inherit bias, distort clinical indicators, or fail in real-world applications. They emphasized that LIME and other explainability tools are essential for validating not just performance metrics, but the integrity of underlying decision logic.
Simply put, synthetic data, when properly calibrated and combined with real records, can improve AI performance and reduce dependence on sensitive patient information. However, the models' sensitivity to how that data is generated and integrated remains a critical risk.
The authors call for future research to explore additional explainability methods, such as SHAP or counterfactual reasoning, and to incorporate larger, more diverse datasets for greater generalizability. They also recommend that AI development teams in healthcare prioritize transparency, robust validation, and clear documentation when using synthetic data, particularly in clinical contexts.
First published in: Devdiscourse