New AI framework detects hidden credit card fraud with real-time accuracy

A team of international researchers has unveiled a powerful new machine learning framework, “FraudX AI,” designed to detect credit card fraud with high precision on real-world, highly imbalanced datasets, while remaining interpretable for regulators and financial institutions. The model, which combines Random Forest and XGBoost classifiers, achieved a recall rate of 95% and an AUC-PR of 97% in testing, outperforming multiple state-of-the-art fraud detection systems.
The study was conducted by scientists from Al-Farabi Kazakh National University, Purdue University, and the National Aviation University in Kyiv, among other institutions. Their innovation tackles long-standing challenges in the field, including class imbalance, model transparency, and real-time applicability.
Unlike many earlier fraud detection models that depend on oversampling techniques like SMOTE or computationally heavy deep learning architectures, FraudX AI preserves the natural imbalance of the dataset and uses an ensemble approach that averages the probability scores of Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) models. The two classifiers are trained separately on the original distribution of data, and their averaged score is then compared against an optimized decision threshold, chosen to maximize recall while minimizing false positives.
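The averaging and threshold step can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy probabilities, and the rule "lowest threshold that stays within a false-positive budget" are assumptions made for illustration. In practice the two probability lists would come from the fitted models (for example via a `predict_proba`-style call).

```python
def average_probabilities(p_rf, p_xgb):
    """Soft voting: mean of the two models' fraud probabilities per transaction."""
    if len(p_rf) != len(p_xgb):
        raise ValueError("probability lists must have the same length")
    return [(a + b) / 2.0 for a, b in zip(p_rf, p_xgb)]


def pick_threshold(y_true, scores, max_false_positives=0):
    """Return (threshold, true_positives, false_positives) for the lowest
    threshold whose false-positive count stays within the budget.

    Lowering the threshold can only add flagged transactions, so the lowest
    admissible threshold also gives the highest recall under that budget.
    Returns None if even the strictest threshold exceeds the budget.
    """
    best = None
    for t in sorted(set(scores), reverse=True):
        flagged = [s >= t for s in scores]
        fp = sum(1 for y, f in zip(y_true, flagged) if f and y == 0)
        tp = sum(1 for y, f in zip(y_true, flagged) if f and y == 1)
        if fp > max_false_positives:
            break
        best = (t, tp, fp)
    return best


# Hypothetical toy data: 1 = fraud, 0 = legitimate.
y_true = [0, 0, 0, 0, 1, 1, 0, 1]
p_rf = [0.0, 0.125, 0.0, 0.25, 1.0, 0.5, 0.0, 0.25]    # made-up RF scores
p_xgb = [0.0, 0.125, 0.0, 0.5, 0.75, 0.75, 0.0, 0.25]  # made-up XGBoost scores

scores = average_probabilities(p_rf, p_xgb)
strict = pick_threshold(y_true, scores, max_false_positives=0)
relaxed = pick_threshold(y_true, scores, max_false_positives=1)
print("zero false positives allowed:", strict)
print("one false positive allowed:  ", relaxed)
```

On this toy data, allowing a single false positive lets the threshold drop far enough to catch the third fraud case, which is the recall-versus-false-positive trade-off the threshold is meant to manage.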
FraudX AI was tested on the widely used European credit card fraud dataset, which includes 284,807 transactions, only 492 of which are fraudulent. The researchers maintained the original class distribution to simulate real-world detection conditions. Traditional metrics like accuracy were deliberately downplayed in favor of recall and Area Under the Precision–Recall Curve (AUC-PR), more reliable indicators for anomaly detection in imbalanced settings.
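A short calculation shows why accuracy is a poor guide on this dataset. The sketch below uses only the two counts quoted above; the "always legitimate" classifier is a hypothetical baseline introduced here for illustration, not something evaluated in the study.

```python
TOTAL = 284_807  # transactions in the dataset (figure from the article)
FRAUD = 492      # fraudulent transactions (figure from the article)

fraud_rate = FRAUD / TOTAL

# Hypothetical baseline that never flags anything as fraud.
tp, fp = 0, 0
fn = FRAUD           # every fraud case is missed
tn = TOTAL - FRAUD   # every legitimate transaction is "correct"

accuracy = (tp + tn) / TOTAL
recall = tp / (tp + fn)

print(f"fraud rate: {fraud_rate:.4%}")
print(f"accuracy of the do-nothing baseline: {accuracy:.4%}")
print(f"recall of the do-nothing baseline:   {recall:.0%}")
```

With fraud making up roughly 0.17% of transactions, a model that catches nothing still scores above 99.8% accuracy, which is why recall and precision-recall measures carry more information here.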
In the evaluation, FraudX AI recorded the highest F1-score (0.97) and perfect precision (1.00), detecting 93 out of 98 fraudulent transactions in the test set with zero false positives. Comparisons with eight baseline machine learning models, among them Logistic Regression, Decision Trees, Gradient Boosting, LightGBM, Multilayer Perceptron, Naive Bayes, and XGBoost on its own, demonstrated FraudX AI’s clear superiority across recall and AUC-PR.
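The reported precision, recall, and F1 figures follow directly from those counts. A minimal check, using the standard metric definitions and the numbers quoted above (93 detected, 5 missed, 0 false alarms); the helper name is an assumption for illustration:

```python
def precision_recall_f1(tp, fp, fn):
    """Standard definitions; returns 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# 93 of 98 frauds caught, so 5 were missed; no legitimate transaction flagged.
precision, recall, f1 = precision_recall_f1(tp=93, fp=0, fn=5)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

The recall works out to 93/98, which rounds to the 95% quoted earlier, and the F1-score to 186/191, which rounds to 0.97.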
Further bolstering its transparency, the framework integrates Shapley Additive Explanations (SHAP), an explainable AI tool that quantifies the influence of individual features on model predictions. Despite the anonymized nature of the input features (due to privacy-preserving PCA transformations), SHAP analysis was able to identify key contributors to fraud classification, notably components V14, V10, V4, and V12. These findings allow financial analysts and auditors to interpret the rationale behind fraud flags, supporting regulatory compliance and trust in automated systems.
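To make concrete what a Shapley attribution measures, the sketch below computes exact Shapley values by brute force for a tiny, made-up scoring function. It does not use the SHAP library and is not the study's model: the coefficients, the input values, and the all-zero baseline are assumptions, and only the feature names V14, V10, and V4 are borrowed from the article. SHAP itself relies on much faster algorithms than this enumeration, which is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for a single input `x` (dict: feature -> value).

    A feature that is "absent" from a coalition is replaced by its value
    in `baseline`. Cost grows exponentially with the number of features.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return predict(mixed)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_score(row):
    # Hypothetical linear fraud score; coefficients are invented.
    return -0.5 * row["V14"] - 0.25 * row["V10"] + 0.25 * row["V4"]


x = {"V14": -4.0, "V10": -2.0, "V4": 2.0}       # made-up transaction
baseline = {"V14": 0.0, "V10": 0.0, "V4": 0.0}  # made-up reference point

phi = shapley_values(toy_score, x, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:+.3f}")
```

The contributions add up to the difference between the score for this transaction and the score at the baseline, which is the property that lets an analyst read each value as that feature's share of a particular fraud flag.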
In contrast to prior studies that relied on deep learning models with low transparency, or balanced datasets that fail to reflect operational complexity, FraudX AI provides robust results without artificially altering the input data. The study also critiques earlier ensemble techniques that suffer from computational overhead, generalization issues, or poor scalability for real-time systems.
The researchers implemented the framework using common open-source tools (scikit-learn, XGBoost, TensorFlow, and SHAP) and conducted training on standard computing hardware, further demonstrating the system’s accessibility. FraudX AI was developed and tested in a Jupyter Notebook environment running on a MacBook Pro with Apple’s M1 Pro chip, underscoring its deployability without requiring specialized infrastructure.
In comparative analysis against leading models like TAI-LSTM, TH-LSTM, and LightGBM-based frameworks, FraudX AI consistently achieved better or equivalent performance across key benchmarks while retaining explainability and real-time compatibility. While models like TAI-LSTM posted high recall scores, they lagged in precision, making them less practical due to excessive false positives.
Importantly, the study also evaluated the impact of anonymized features, with the authors acknowledging the limitations of explainability in the absence of real-world variable names. They recommend future versions of the framework be applied to labeled datasets with non-PCA-transformed attributes to enhance the interpretive value of SHAP outputs.
Looking ahead, the team plans to extend the model to cover a broader range of financial anomalies beyond credit card fraud, such as insurance fraud or network intrusion detection, where similar class imbalance issues prevail. Additional research will also explore adaptive learning mechanisms to help the system evolve in response to emerging fraud tactics.
The study titled "FraudX AI: An Interpretable Machine Learning Framework for Credit Card Fraud Detection on Imbalanced Datasets" is published in the journal Computers.
First published in: Devdiscourse