New AI model harmonizes medical data across institutions without compromising privacy

COE-EDP | Updated: 18-02-2025 10:35 IST | Created: 18-02-2025 10:35 IST

Electronic Health Records (EHRs) have become a fundamental resource for clinical and translational research, providing vast amounts of patient data for large-scale studies. However, conducting multi-institutional studies using EHR data remains a challenge due to data heterogeneity and privacy concerns.

A recent study, "Representation Learning to Advance Multi-Institutional Studies with Electronic Health Record Data" by Doudou Zhou, Han Tong, Linshanshan Wang, Suqi Liu, and others, posted on arXiv (2025), introduces a method called GAME (Graph Alignment for Multi-institutional EHR Data). The algorithm combines representation learning and federated learning to harmonize EHR data across institutions without sharing patient-level information, opening the door to multi-center research.

Addressing data heterogeneity in multi-institutional EHR research

One of the biggest obstacles in multi-institutional EHR research is the inconsistency in medical coding systems across healthcare facilities. Different institutions use unique local codes for laboratory tests, diagnoses, and medications, making data integration difficult. The GAME algorithm tackles this challenge by utilizing knowledge graphs, pretrained language models, and graph attention networks to map local codes to standardized medical terminologies such as ICD, LOINC, and RxNorm.
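To make the idea of code mapping concrete, here is a minimal sketch in Python. It assumes each code already has an embedding vector and simply picks the standard code whose vector is most similar to the local one. The code names (LAB_GLU_01, STD_GLUCOSE, and so on) and the three-dimensional vectors are hypothetical placeholders, not values from the study, and nearest-neighbor matching by cosine similarity is an illustrative stand-in rather than GAME's actual procedure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def best_match(vec, candidates):
    """Return the candidate code whose embedding is closest to vec."""
    return max(candidates, key=lambda code: cosine(vec, candidates[code]))


# Hypothetical embeddings for standardized codes (placeholder names).
standard_codes = {
    "STD_GLUCOSE": [0.9, 0.1, 0.0],
    "STD_HEMOGLOBIN": [0.1, 0.9, 0.1],
    "STD_METFORMIN": [0.0, 0.1, 0.9],
}

# Hypothetical embeddings for one institution's local codes.
local_codes = {
    "LAB_GLU_01": [0.8, 0.2, 0.1],
    "RX_MET_500": [0.1, 0.0, 0.95],
}

# Map every local code to its most similar standard code.
mapping = {code: best_match(vec, standard_codes) for code, vec in local_codes.items()}
print(mapping)
```

In practice the quality of such a mapping depends entirely on how the embeddings are learned, which is the part GAME addresses.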

GAME operates at three levels of integration: (1) within institutions, it constructs knowledge graphs to establish relationships between local codes and standard codes; (2) between institutions, it leverages language models to identify relationships across different coding systems; and (3) it applies graph attention networks (GATs) to quantify the strength of these relationships. By jointly training embeddings through federated learning, GAME enables robust code translation across institutions while preserving patient privacy.
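The third level, using attention to quantify relationship strength, can be illustrated with a simplified sketch of how a graph attention layer weights a node's neighbors. This is a generic, stripped-down version of the standard attention computation (a LeakyReLU score on the concatenated node vectors, followed by a softmax over neighbors); the learned weight matrix is omitted, and the vectors, the attention vector `a`, and the code names are hypothetical. It is not the configuration used in the study.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU activation, as commonly used for attention scores."""
    return x if x > 0 else slope * x


def attention_weights(h_i, neighbors, a):
    """Softmax-normalized attention of node i over its neighbors.

    score_j = LeakyReLU(a . [h_i || h_j]); the learned projection matrix
    of a full GAT layer is left out (treated as the identity) for brevity.
    """
    scores = {}
    for name, h_j in neighbors.items():
        concat = h_i + h_j  # list concatenation: [h_i || h_j]
        scores[name] = leaky_relu(sum(w * x for w, x in zip(a, concat)))
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


# Hypothetical local code and two candidate standard codes it is linked to.
local_vec = [1.0, 0.0]
linked_codes = {
    "STD_GLUCOSE": [0.9, 0.1],
    "STD_HEMOGLOBIN": [0.1, 0.9],
}
attn_vector = [0.5, 0.5, 1.0, -1.0]  # hypothetical attention parameters

weights = attention_weights(local_vec, linked_codes, attn_vector)
print(weights)
```

The weights sum to one across a node's neighbors, so a larger weight can be read as a stronger relationship relative to the other linked codes.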

Preserving privacy with federated learning

Traditional multi-institutional collaborations require the sharing of patient-level EHR data, which raises significant privacy and compliance concerns. Federated learning offers a solution by enabling institutions to train models on local data and share only aggregated parameters rather than raw patient information. GAME incorporates federated learning techniques to create jointly trained embeddings without exposing individual patient records.
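As a rough illustration of what "sharing only aggregated parameters" means, the sketch below shows federated averaging, a common aggregation scheme in which a coordinator combines each site's locally trained parameters, weighted by how many records the site trained on. The site sizes and parameter values are made up, and the article does not say that GAME uses this exact scheme, so treat it as a generic example of the technique.

```python
def federated_average(updates):
    """Combine per-site parameters into one global parameter vector.

    updates: list of (num_local_records, parameter_vector) tuples.
    Only these summaries leave each site; patient records never do.
    """
    total = sum(n for n, _ in updates)
    dim = len(updates[0][1])
    return [
        sum(n * params[k] for n, params in updates) / total
        for k in range(dim)
    ]


# Hypothetical updates from two institutions after local training.
site_updates = [
    (1000, [0.2, 0.4]),  # smaller site
    (3000, [0.6, 0.0]),  # larger site, so it carries more weight
]

global_params = federated_average(site_updates)
print(global_params)
```

Weighting by record count keeps a small site from pulling the shared model as strongly as a large one; other weighting choices are possible.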

The study evaluates GAME in a multi-institutional setting, testing it across seven healthcare institutions in the United States and France. The researchers applied the algorithm to patient stratification in heart failure, rheumatoid arthritis, and Alzheimer's disease, as well as to suicide risk assessment. The reported results indicate that GAME can keep patient-level data secure while improving predictive modeling accuracy.

Enhancing clinical research through AI-driven data integration

Beyond harmonizing EHR data, GAME significantly improves the quality of multi-institutional studies by facilitating AI-driven feature selection and predictive analytics. The researchers evaluated the algorithm's performance by applying it to clinical studies on Alzheimer's disease outcomes and suicide risk among patients with mental health disorders. The results demonstrated that GAME-processed EHR data retained valuable clinical information, enabling accurate AI-driven patient stratification.

For Alzheimer's disease, the study used GAME embeddings to cluster patients based on clinical profiles and predict nursing home admissions, a key indicator of disease progression. Similarly, in suicide risk assessment, GAME embeddings allowed the identification of high-risk patient subgroups, demonstrating its potential in predictive healthcare applications. These findings indicate that GAME can enhance precision medicine efforts by enabling more accurate patient subgroup identification across diverse healthcare settings.
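One simple way to picture embedding-based patient clustering is sketched below: each patient is represented by the average of the embeddings of the codes in their record, and patients are then grouped with a small k-means loop. The code names, vectors, patient records, and the choice of k-means with fixed starting centroids are all assumptions made for illustration; the article does not describe the study's actual clustering method.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means; returns one cluster label per point."""
    centroids = [list(c) for c in initial_centroids]  # avoid mutating the input
    labels = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its assigned points.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels


# Hypothetical code embeddings and patient records.
code_embeddings = {
    "CODE_A": [1.0, 0.0],
    "CODE_B": [0.9, 0.2],
    "CODE_C": [0.0, 1.0],
    "CODE_D": [0.1, 0.9],
}
patients = {
    "p1": ["CODE_A", "CODE_B"],
    "p2": ["CODE_A"],
    "p3": ["CODE_C", "CODE_D"],
    "p4": ["CODE_D"],
}

# One vector per patient: the mean of that patient's code embeddings.
points = [mean_vector([code_embeddings[c] for c in codes]) for codes in patients.values()]

# Fixed starting centroids keep this toy example deterministic.
labels = kmeans(points, [points[0], points[2]])
print(dict(zip(patients, labels)))
```

Here the first two patients share codes from one region of the embedding space and the last two from another, so they fall into separate clusters.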

Future of multi-institutional EHR studies

The GAME algorithm represents a significant advancement in multi-institutional EHR research, offering a scalable, privacy-preserving, and interpretable approach to data harmonization. By combining representation learning, knowledge graphs, and federated learning, it provides a robust framework for integrating, analyzing, and interpreting heterogeneous EHR data across institutions.

As healthcare data continues to grow in complexity, AI-driven solutions like GAME will play an increasingly important role in breaking down data silos and enabling collaborative, large-scale medical research. Future research will focus on expanding GAME's applicability to new disease areas, refining its AI models, and integrating additional medical ontologies to further improve data standardization.

With GAME, the potential for conducting scalable, privacy-conscious, and high-quality multi-institutional EHR studies becomes a reality, paving the way for more comprehensive, data-driven insights in medical research and patient care.

FIRST PUBLISHED IN:
Devdiscourse

