Privacy-preserving AI system tackles rising digital payment fraud


A new artificial intelligence-driven fraud detection system is changing the way financial institutions combat increasingly sophisticated digital payment fraud, offering a privacy-preserving alternative to traditional centralized models. The system introduces a hybrid framework that combines federated learning and ensemble machine learning to improve fraud detection performance while addressing growing concerns over data privacy and regulatory compliance.

The study, titled "A Federated Ensemble Learning Framework for Distributed Fraud Detection," published in Applied Sciences (2026, Volume 16), presents a novel architecture that enables multiple financial institutions to collaboratively train fraud detection models without sharing raw transactional data. The research comes amid rising global fraud losses and mounting pressure on banks and payment providers to deploy more advanced, privacy-compliant detection systems.

Rising fraud threats push need for privacy-preserving AI systems

The global financial ecosystem has seen a sharp increase in fraud risk alongside the rapid expansion of digital banking, e-commerce, and mobile payment platforms. The study highlights that fraud losses in the EU and European Economic Area alone reached €4.2 billion in 2024, reflecting a 17 percent increase year-on-year, while global credit card fraud losses are projected to exceed $43 billion by 2026.

Despite the deployment of strong authentication mechanisms, fraudsters are increasingly using social engineering and AI-driven tactics to manipulate users into authorizing fraudulent transactions. Traditional fraud detection systems, largely based on centralized machine learning models, struggle to keep pace with these evolving threats due to limitations in data sharing, model adaptability, and class imbalance in datasets.

The research identifies a critical tension at the core of modern fraud detection systems. Financial institutions require large volumes of diverse data to train accurate models, but privacy regulations and competitive concerns prevent them from sharing sensitive customer data. This has created a gap that conventional approaches have failed to address effectively.

Federated learning and ensemble models combine to close performance gaps

To overcome these limitations, the study proposes a hybrid framework that merges federated learning with ensemble learning techniques. Federated learning allows multiple institutions to train machine learning models collaboratively without exchanging raw data. Instead, each institution trains a local model on its own data and shares only model parameters with a central server for aggregation.
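In outline, the server-side aggregation step can be sketched as below. This is an illustrative FedAvg-style average weighted by each institution's sample count; the article does not specify the paper's exact aggregation rule, so the weighting scheme here is an assumption.

```python
import numpy as np

def federated_average(client_params, client_sizes):
    """Aggregate local model parameters into a global model (FedAvg-style).

    client_params: list of parameter vectors, one per institution
    client_sizes:  number of training samples held by each institution
                   (used as aggregation weights -- an assumption here)
    """
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    # Weighted average across clients; raw transaction data never leaves them
    return np.average(np.stack(client_params), axis=0, weights=weights)

# Three hypothetical institutions share only parameters, never raw data
local_params = [np.array([0.2, 1.0]), np.array([0.4, 0.8]), np.array([0.6, 0.6])]
sample_counts = [1000, 3000, 6000]
global_params = federated_average(local_params, sample_counts)
```

Each round, the server broadcasts the averaged parameters back to the clients, which continue training locally.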

The framework incorporates three distinct machine learning models: XGBoost, CatBoost, and a multilayer perceptron neural network. Each model captures different aspects of transaction data, from nonlinear relationships to categorical features and hidden patterns. These models are trained independently across distributed environments and then combined using a weighted ensemble approach to produce a final prediction.

This dual-layer architecture addresses two major challenges simultaneously. First, it preserves data privacy by eliminating the need for centralized data storage. Second, it enhances predictive performance by leveraging the strengths of multiple models rather than relying on a single algorithm.

The ensemble component plays a critical role in improving accuracy. By assigning optimized weights to each model based on performance, the system generates more reliable predictions, particularly in highly imbalanced datasets where fraudulent transactions represent a small fraction of total activity.
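A minimal sketch of that weighted combination is shown below. The article does not describe how the weights are optimized, so weighting each model in proportion to its validation score is purely an illustrative assumption.

```python
import numpy as np

def ensemble_predict(probas, val_scores):
    """Combine per-model fraud probabilities with performance-based weights.

    probas:     (n_models, n_transactions) predicted fraud probabilities
    val_scores: one validation score per model (e.g. AUC-PR); proportional
                weighting is an illustrative assumption, not the paper's rule
    """
    w = np.asarray(val_scores, dtype=float)
    w /= w.sum()
    return w @ np.asarray(probas)  # weighted average per transaction

# Hypothetical outputs from XGBoost, CatBoost, and an MLP on two transactions
probas = [[0.9, 0.1],   # XGBoost
          [0.8, 0.2],   # CatBoost
          [0.6, 0.4]]   # MLP
scores = [0.85, 0.80, 0.60]  # hypothetical validation AUC-PR per model
combined = ensemble_predict(probas, scores)
```

Better-performing models pull the final probability toward their own prediction, which is what lets the ensemble outperform any single member.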

This approach avoids common data manipulation techniques such as oversampling or undersampling, which can distort data distributions. Instead, it focuses on improving the learning process itself, resulting in more robust and realistic fraud detection outcomes.

Strong experimental results across multiple datasets and environments

The proposed system was evaluated using three widely recognized fraud detection datasets, including IEEE-CIS Fraud Detection data, simulated credit card transactions, and a European credit card dataset. The experiments compared centralized learning models with federated configurations involving three, five, and ten distributed clients.

Results show that traditional metrics such as accuracy are insufficient in fraud detection due to extreme class imbalance, with most transactions being legitimate. Instead, the study prioritizes recall, F1-score, and area under the precision-recall curve (AUC-PR) as more meaningful indicators of model performance.
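A toy calculation with made-up counts makes the point concrete: on a heavily imbalanced dataset, a model that flags nothing can still score near-perfect accuracy while catching zero fraud.

```python
# Hypothetical counts: 10,000 transactions, only 50 fraudulent.
# A "do-nothing" model predicts legitimate for everything:
tp, fp, fn, tn = 0, 0, 50, 9950

accuracy = (tp + tn) / (tp + fp + fn + tn)       # 99.5% -- looks great
recall_do_nothing = tp / (tp + fn)               # 0.0 -- catches no fraud

# A model that actually detects fraud, at the cost of some false alarms:
tp, fp, fn, tn = 40, 60, 10, 9890
precision = tp / (tp + fp)                       # fraction of alerts that are real
recall = tp / (tp + fn)                          # fraction of fraud caught
f1 = 2 * precision * recall / (precision + recall)
```

Recall and F1 expose the difference between the two models that accuracy hides, which is why the study leans on them alongside AUC-PR.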

In centralized environments, boosting algorithms such as XGBoost and CatBoost consistently outperformed neural network models. However, when transitioning to federated learning, all individual models experienced slight performance degradation due to data distribution challenges and reduced data availability per client.

The ensemble model, however, demonstrated consistent superiority across all configurations. It achieved higher recall rates and improved detection of fraudulent transactions while maintaining competitive precision. In several cases, the ensemble model not only matched but exceeded the performance of centralized systems, particularly in configurations with fewer distributed clients.

For example, in the simulated transaction dataset, the ensemble model achieved an AUC-PR score of 0.8690 in a three-client federated setup, outperforming both individual models and centralized benchmarks. Even as the number of clients increased and data fragmentation intensified, the ensemble approach maintained strong performance, mitigating the typical decline seen in federated systems.

Similar trends were observed across other datasets. In the IEEE-CIS dataset, the ensemble model preserved near-centralized performance levels while improving recall, a critical metric in fraud detection. In the European dataset, the ensemble approach even surpassed centralized models in certain configurations, achieving higher AUC-PR and F1-scores.

Statistical validation confirms model superiority

The study conducted rigorous statistical testing to validate the superiority of the proposed framework. Descriptive analysis showed that the ensemble model achieved the highest average AUC-PR score while maintaining the lowest variance, indicating both strong performance and stability across datasets.

The Wilcoxon signed-rank test further confirmed that the ensemble model significantly outperformed individual models, with statistically significant differences observed across all comparisons. The Friedman test reinforced these findings, ranking the ensemble model as the top performer among all evaluated approaches.
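Both tests are standard paired, non-parametric comparisons and are available in SciPy. The sketch below uses invented AUC-PR scores (not the paper's actual numbers) purely to show the mechanics of the comparison.

```python
from scipy.stats import wilcoxon, friedmanchisquare

# Illustrative per-run AUC-PR scores -- NOT the paper's reported values
ensemble = [0.87, 0.84, 0.82, 0.86, 0.83, 0.85, 0.88, 0.81]
xgboost  = [0.84, 0.80, 0.79, 0.83, 0.80, 0.82, 0.85, 0.78]
mlp      = [0.78, 0.74, 0.73, 0.77, 0.75, 0.76, 0.79, 0.72]

# Paired Wilcoxon signed-rank test: ensemble vs. a single model
w_stat, w_p = wilcoxon(ensemble, xgboost)

# Friedman test: ranks all three models across the same paired runs
f_stat, f_p = friedmanchisquare(ensemble, xgboost, mlp)
```

A small p-value in both tests, as with this illustrative data, is the kind of evidence the study uses to claim the ensemble's advantage is systematic rather than due to chance.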

These results suggest that combining federated learning with ensemble strategies not only improves accuracy but also enhances model reliability in distributed environments, where data heterogeneity and imbalance are major challenges.

Trade-offs, limitations, and future directions

While the framework delivers strong results, the study acknowledges several trade-offs and limitations. One key observation is the balance between recall and precision. The model tends to prioritize detecting fraudulent transactions, leading to higher recall but potentially increasing false positives. This trade-off reflects a broader challenge in fraud detection systems, where minimizing missed fraud often comes at the cost of increased false alarms.

The research also notes that the current federated setup assumes balanced data distribution across clients, which may not fully reflect real-world scenarios where institutions vary widely in size and data characteristics. Additionally, while federated learning reduces the need for data sharing, it does not provide full protection against potential information leakage through model updates.

Future research is expected to focus on integrating advanced privacy-preserving techniques such as differential privacy and secure aggregation, as well as improving model performance under highly heterogeneous data conditions. The development of more adaptive aggregation strategies is also identified as a key area for further exploration.

First published in: Devdiscourse