AI model accurately classifies breast cancer subtypes using mammograms and metadata
Artificial intelligence (AI) is reshaping precision oncology with new tools designed to decode the complex molecular subtypes of breast cancer. While invasive methods like immunohistochemistry remain standard for subtype classification, researchers are increasingly exploring non-invasive alternatives that harness the power of deep learning.
A new study published in Diagnostics, titled "A Multimodal Deep Learning Model for the Classification of Breast Cancer Subtypes," demonstrates that integrating mammography images with patient metadata significantly improves the accuracy of identifying key breast cancer subtypes. The research, led by Chaima Ben Rabah and colleagues from Weill Cornell Medicine and New Vision University, sets a new benchmark in digital diagnostic models, offering a potentially transformative solution for early detection and personalized treatment.
The team’s multimodal framework pairs convolutional neural networks (CNNs) with two input streams, mammography images and patient metadata (age and tumor lesion type), to classify breast tumors into five categories: benign, luminal A, luminal B, HER2-enriched, and triple-negative (TN). Trained on the publicly available Chinese Mammography Database (CMMD), the model achieved an area under the curve (AUC) of 88.87%, outperforming previous single-modality approaches that relied solely on imaging. In contrast, a unimodal model using mammography images alone achieved an AUC of just 61.3%, underscoring the value of fusing clinical context with radiological data.
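To make the setup concrete, the sketch below shows roughly how mammograms could be paired with the two metadata fields the study uses (age and lesion type) and the five target classes, using Python and TensorFlow. The file name, column names, and pipeline details are illustrative assumptions rather than details taken from the paper.

```python
# Illustrative pairing of CMMD-style images with age, lesion type, and subtype labels.
# "cmmd_metadata.csv" and its column names are assumptions, not the authors' files.
import pandas as pd
import tensorflow as tf

SUBTYPES = ["benign", "luminal_A", "luminal_B", "HER2", "TN"]

meta = pd.read_csv("cmmd_metadata.csv")   # assumed columns: image_path, age, lesion_type, subtype

def load_example(path, age, lesion_code, label):
    img = tf.io.decode_png(tf.io.read_file(path), channels=3)
    img = tf.image.resize(img, (224, 224)) / 255.0        # input size used in the study
    return (img, tf.stack([age, lesion_code])), tf.one_hot(label, len(SUBTYPES))

ds = tf.data.Dataset.from_tensor_slices((
    meta["image_path"].values,
    meta["age"].values.astype("float32"),
    meta["lesion_type"].astype("category").cat.codes.values.astype("float32"),
    meta["subtype"].map(SUBTYPES.index).values,
))
ds = ds.map(load_example).shuffle(1024).batch(32).prefetch(tf.data.AUTOTUNE)
```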
How Does the Multimodal AI Model Improve Diagnostic Accuracy Over Traditional Methods?
Breast cancer subtyping is essential for determining appropriate treatment strategies, as each molecular category (luminal A, luminal B, HER2-enriched, and triple-negative, or TN) responds differently to therapy. For example, luminal A tumors are generally hormone receptor-positive and respond well to endocrine therapy, while triple-negative tumors often require chemotherapy and carry a worse prognosis. Current gold-standard methods such as immunohistochemistry are invasive and sometimes limited in their ability to capture the spatial heterogeneity of tumors, especially in metastatic or hard-to-access locations.
This new multimodal AI approach bridges that diagnostic gap. The model’s architecture includes three separate CNNs: one for image feature extraction using a pre-trained Xception model, another for processing metadata like age and lesion type, and a third that integrates both streams to classify tumor subtypes. By leveraging pre-trained networks on ImageNet and fine-tuning them on mammographic data, the model benefits from transfer learning while mitigating the risks of overfitting on relatively small clinical datasets.
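A minimal Keras rendering of that three-branch design might look like the sketch below. The Xception backbone, the two metadata fields, and the 544-dimensional fused vector come from the paper; the remaining layer sizes and choices are assumptions for illustration.

```python
# Sketch of the three-branch design described above (layer sizes beyond those reported
# in the paper are assumptions; the authors' exact configuration may differ).
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import Xception

# Branch 1: image features from an ImageNet-pretrained Xception backbone.
img_in = layers.Input(shape=(224, 224, 3), name="mammogram")
backbone = Xception(include_top=False, weights="imagenet", pooling="avg")
img_feat = backbone(img_in)                            # 2048-d global feature vector

# Branch 2: metadata network (binarized age + encoded lesion type).
meta_in = layers.Input(shape=(2,), name="metadata")
meta_feat = layers.Dense(32, activation="relu")(meta_in)
meta_feat = layers.BatchNormalization()(meta_feat)

# Branch 3: fusion of both streams, then the 5-way subtype classifier.
fused = layers.Concatenate()([img_feat, meta_feat])
fused = layers.Dense(544, activation="relu")(fused)    # joint feature vector (544-d in the paper)
fused = layers.Dropout(0.5)(fused)
out = layers.Dense(5, activation="softmax", name="subtype")(fused)

model = Model(inputs=[img_in, meta_in], outputs=out)
```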
To account for class imbalance - a persistent issue in medical datasets, where subtypes such as TN and HER2 are underrepresented - the researchers applied weighted loss functions. This ensures the model does not favor more common subtypes at the expense of rarer but clinically significant cases. The result is a more balanced performance profile across all five categories, with particularly high accuracy in distinguishing benign tumors and identifying TN cancers.
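The paper does not spell out the exact weighting scheme, but a common way to implement a weighted loss in this setting is to scale each class's contribution by its inverse frequency, as in the hedged sketch below (toy labels are used for illustration).

```python
# One common implementation of a class-weighted loss (an assumption about the exact
# mechanism; the study only reports that weighted loss functions were used).
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

train_labels = np.array([0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4])   # toy, imbalanced integer labels

weights = compute_class_weight(class_weight="balanced",
                               classes=np.unique(train_labels),
                               y=train_labels)
class_weight = dict(enumerate(weights))   # rarer classes (e.g. HER2, TN) receive larger weights

# In Keras, these weights scale each sample's loss during training, e.g.:
# model.fit(train_ds, class_weight=class_weight, epochs=10)
```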
The binary one-vs.-all classification approach further supports the model’s utility, showing an AUC of 100% for benign classifications and 78% for both HER2 and TN subtypes. However, luminal A and luminal B remained more challenging to separate due to overlapping features in mammographic appearance - a limitation commonly acknowledged in radiological research.
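For context, a one-vs.-all AUC of the kind reported here can be computed per class directly from the model's softmax outputs; the sketch below uses scikit-learn, with toy labels and scores standing in for real test-set predictions.

```python
# Per-class (one-vs.-all) AUC from softmax outputs; y_true and y_prob are toy stand-ins.
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import label_binarize

SUBTYPES = ["benign", "luminal_A", "luminal_B", "HER2", "TN"]

y_true = np.array([0, 1, 2, 3, 4, 0, 1, 3, 4, 2])                          # toy ground-truth labels
y_prob = np.random.default_rng(0).dirichlet(np.ones(5), size=len(y_true))  # toy softmax scores

y_bin = label_binarize(y_true, classes=list(range(5)))   # one column per class-vs-rest task
for k, name in enumerate(SUBTYPES):
    auc = roc_auc_score(y_bin[:, k], y_prob[:, k])
    print(f"{name:10s} one-vs-all AUC: {auc:.3f}")
```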
What Are the Technical Innovations and Key Challenges Identified in the Study?
One of the critical innovations in this study is the fusion of multimodal data into a single predictive model. Rather than treating imaging and clinical metadata as separate entities, the framework encodes both types into a joint feature space. This fusion was accomplished through concatenation and dense layers, resulting in a 544-dimensional vector used for final classification. Notably, metadata preprocessing included the binarization of patient age around the 40-year threshold - a clinically significant marker for breast cancer risk and screening guidelines - and categorical encoding of lesion type (mass, calcification, both, or none).
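The metadata side of that fusion is small enough to show in full; a plausible rendering of the described preprocessing is sketched below (integer codes for lesion type are an assumption - one-hot encoding would also fit the description).

```python
# Plausible metadata preprocessing: age binarized at 40, lesion type as categorical codes.
import numpy as np
import pandas as pd

LESION_TYPES = ["mass", "calcification", "both", "none"]

def encode_metadata(df: pd.DataFrame) -> np.ndarray:
    """Turn raw age / lesion-type columns into the model's 2-D metadata input."""
    age_over_40 = (df["age"] >= 40).astype("float32")          # 40-year screening threshold
    lesion_code = pd.Categorical(df["lesion_type"],
                                 categories=LESION_TYPES).codes.astype("float32")
    return np.column_stack([age_over_40, lesion_code])

example = pd.DataFrame({"age": [36, 57], "lesion_type": ["mass", "both"]})
print(encode_metadata(example))   # [[0. 0.]  [1. 2.]]
```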
To ensure model generalizability and reduce overfitting, data was divided into training, validation, and test sets using stratified sampling. All models were trained for 10 epochs with the Adam optimizer and a learning rate of 0.001. Despite the high dimensionality and potential for overfitting, the model maintained robust performance, thanks in part to dropout layers and batch normalization.
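Continuing the earlier sketches (the `meta` table, `model`, and `class_weight` defined above), the reported training setup translates roughly as follows; the split proportions are an assumption, since the summary does not give them.

```python
# Stratified splits plus the reported training configuration (Adam, lr 0.001, 10 epochs).
# Reuses `meta`, `SUBTYPES`, `model`, and `class_weight` from the sketches above.
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

labels = meta["subtype"].map(SUBTYPES.index).values

# Stratified 70/15/15 split (proportions assumed; the paper may use different ones).
idx_train, idx_tmp = train_test_split(np.arange(len(labels)), test_size=0.30,
                                      stratify=labels, random_state=42)
idx_val, idx_test = train_test_split(idx_tmp, test_size=0.50,
                                     stratify=labels[idx_tmp], random_state=42)

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="categorical_crossentropy",
              metrics=["accuracy", tf.keras.metrics.AUC(name="auc")])

# train_ds / val_ds would be built from idx_train / idx_val with the pairing pipeline above.
# model.fit(train_ds, validation_data=val_ds, epochs=10, class_weight=class_weight)
```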
Interestingly, the researchers tested whether higher-resolution images (512×512 and 1024×1024 pixels) would yield better performance. Counterintuitively, the model’s accuracy dropped as image resolution increased, falling to 54.68% and 43.6% respectively, compared to 63.79% at the original 224×224 resolution. This suggests that greater pixel detail can increase computational complexity and may lead to overfitting without enhancing feature detection - a crucial insight for resource allocation in real-world deployments.
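Reproducing that comparison is mostly a matter of re-running the same pipeline at each input size; the loop below sketches the idea, with `build_model` and `make_datasets` as hypothetical wrappers around the architecture and data sketches above.

```python
# Hypothetical resolution sweep: same architecture and training recipe, different input sizes.
for size in (224, 512, 1024):
    model = build_model(input_shape=(size, size, 3))            # hypothetical wrapper, see above
    train_ds, test_ds = make_datasets(image_size=(size, size))  # hypothetical wrapper, see above
    model.fit(train_ds, epochs=10, class_weight=class_weight)
    _, accuracy, _ = model.evaluate(test_ds)                    # loss, accuracy, auc
    print(f"{size}x{size}: test accuracy {accuracy:.2%}")
```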
Comparative analysis with previous work further underscored the model’s superiority. For example, a study by Mota et al., which used imaging alone on the OPTIMAM dataset, achieved only a 60.62% AUC. In contrast, the current multimodal model not only improves on that figure by nearly 30 percentage points but also includes benign tumors and a broader patient age range, making it more applicable to real-world scenarios.
What Future Directions Could Enhance Model Performance and Clinical Adoption?
While the study makes substantial strides in AI-driven breast cancer diagnosis, it also outlines several limitations and areas for future research. The dataset used is geographically and demographically limited to a Chinese patient cohort. Broader validation across diverse populations is essential to ensure generalizability and reduce the risk of demographic bias in AI models.
The metadata used (age and lesion type), while valuable, may not capture the full clinical complexity of breast cancer. Incorporating genomic data, hormone receptor status, and other biomarkers could further enhance classification accuracy. Integration with multi-omics data, such as transcriptomics and proteomics, may be particularly valuable for differentiating hard-to-classify subtypes like luminal A and B, which remain a challenge even in this advanced model.
Another avenue for improvement lies in imaging modalities. While digital mammography is widely accessible and cost-effective, combining it with MRI or ultrasound could provide richer feature sets and improve subtype differentiation, especially in younger patients with dense breast tissue.
The study also points to federated learning as a promising method for expanding model training across institutions without compromising patient privacy. This approach could support the development of globally robust models trained on diverse datasets, essential for equitable AI deployment in oncology.
The authors stress the need for explainable AI tools to facilitate clinical adoption. Interpretability remains a major barrier to trust in machine learning, particularly in high-stakes fields like cancer diagnosis. Incorporating attention mechanisms or saliency maps could help clinicians understand model decisions and build confidence in AI-assisted workflows.
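As a flavor of what such tooling could look like, the sketch below computes a simple input-gradient saliency map for the image branch of the model defined earlier; this is a generic illustrative technique, not something the paper implements.

```python
# Generic input-gradient saliency (illustrative only): how strongly each pixel of the
# mammogram influences the model's top predicted subtype score.
import tensorflow as tf

def saliency_map(model, image, metadata):
    """image: (224, 224, 3) float tensor; metadata: (2,) float tensor."""
    img = tf.Variable(image[tf.newaxis, ...])        # batch of one, tracked for gradients
    meta = metadata[tf.newaxis, ...]
    with tf.GradientTape() as tape:
        probs = model([img, meta], training=False)
        top_score = tf.reduce_max(probs, axis=-1)    # probability of the predicted class
    grads = tape.gradient(top_score, img)            # d(score) / d(pixel)
    return tf.reduce_max(tf.abs(grads), axis=-1)[0]  # (224, 224) heat map over the image
```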
FIRST PUBLISHED IN: Devdiscourse

