Billions at stake: AI can spot high-growth firms before investors do

CO-EDP, VisionRI | Updated: 18-09-2025 23:33 IST | Created: 18-09-2025 23:33 IST

A group of European researchers has unveiled a machine learning framework designed to identify investment-ready small and medium-sized enterprises (SMEs), aiming to strengthen equity access and spur economic growth. The study provides policymakers and investors with a data-driven tool to reduce missed opportunities in financing innovative and fast-growing firms.

The paper, titled "Identification of Investment-Ready SMEs: A Machine Learning Framework to Enhance Equity Access and Economic Growth," was published in the journal Forecasting. It analyzes how advanced algorithms can be applied to the European Central Bank’s Survey on Access to Finance of Enterprises (SAFE) to streamline screening and improve investment decisions.

How does the framework define and identify investment readiness?

The authors establish a clear operational definition of investment readiness. According to their criteria, an SME is considered investment-ready if it demonstrates innovation, records turnover growth of more than 20 percent, and shows openness to equity financing.
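The three-part definition can be read as a simple logical rule. The sketch below is only an illustration of that rule: the record type and field names are hypothetical, and the SAFE survey encodes these answers differently.

```python
from dataclasses import dataclass

GROWTH_THRESHOLD = 0.20  # "more than 20 percent" turnover growth, per the article


@dataclass
class SmeRecord:
    # Hypothetical field names chosen for illustration only.
    is_innovative: bool
    turnover_growth: float  # fractional growth, e.g. 0.25 for 25 percent
    open_to_equity: bool


def is_investment_ready(sme: SmeRecord) -> bool:
    """Return True only when all three criteria hold at once."""
    return (
        sme.is_innovative
        and sme.turnover_growth > GROWTH_THRESHOLD
        and sme.open_to_equity
    )


examples = [
    SmeRecord(True, 0.25, True),   # meets all three criteria
    SmeRecord(True, 0.20, True),   # growth is exactly 20 percent, not more
    SmeRecord(False, 0.40, True),  # fast-growing but not innovative
]
labels = [is_investment_ready(s) for s in examples]
print(labels)
```

Because the criteria are combined with a logical AND, a firm that fails any one of them is labeled not investment-ready, which helps explain why the positive class is small.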

The SAFE dataset used for the analysis contained 10,937 SME records, of which 12 percent were classified as investment-ready. Because the remaining 88 percent were not, the target class was heavily imbalanced, and the researchers selected algorithms and cost-sensitive methods intended to keep predictions from being biased toward the majority class.

Of the 51 variables in the dataset, 44 were used as inputs, covering financial conditions, innovation activity, and management priorities. These were fed into a suite of machine learning models to determine which could best classify SMEs as suitable for investment.

Which machine learning models deliver the most accurate predictions?

The study tested nine algorithms: Logistic Regression, K-Nearest Neighbors, Random Forest, Support Vector Machines, Naïve Bayes, AdaBoost, Easy Ensemble, Balanced Bagging, and Gradient Boosting. Each was trained using stratified five-fold cross-validation with grid search optimization.
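Stratification matters here because a plain random split could leave some folds with very few investment-ready firms. The following is a minimal pure-Python sketch of the idea, not the authors' pipeline: the function name and the synthetic labels are assumptions, and in practice a library routine such as scikit-learn's StratifiedKFold would normally be used, with each grid-search candidate scored on the held-out fold.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign each sample index to a fold, keeping class shares similar per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's indices round-robin so every fold gets its share.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


# Synthetic labels: 1,000 made-up firms with 12 percent positives,
# mirroring the class share quoted in the article.
labels = [1] * 120 + [0] * 880
folds = stratified_folds(labels)

for fold_id, test_idx in enumerate(folds):
    positives = sum(labels[i] for i in test_idx)
    print(fold_id, len(test_idx), positives / len(test_idx))
```

Each fold serves once as the test set while the other four are used for training, so every record is evaluated exactly once.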

Gradient Boosting emerged as the top performer, achieving a balanced accuracy score of 0.754 and a ROC AUC of 0.815. Logistic Regression followed closely, providing stable performance but at a slightly higher error cost.

The authors also used cost-sensitive evaluation to account for the economic implications of errors. False negatives, cases where an investment-ready firm is misclassified as not ready, were weighted five times more heavily than false positives, given the lost opportunities they represent.
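The five-to-one weighting can be illustrated with a small cost function. This is a sketch under stated assumptions: the article gives only the ratio, so the unit cost of 1 per false positive is an assumption, and the "alternative" model is hypothetical. The first set of counts is the error profile the article reports for the best-performing model.

```python
FN_WEIGHT = 5  # a missed investment-ready firm, weighted five times per the article
FP_WEIGHT = 1  # a non-ready firm flagged as ready (unit cost is an assumption)


def misclassification_cost(false_negatives: int, false_positives: int) -> int:
    """Weighted sum of errors; lower is better."""
    return FN_WEIGHT * false_negatives + FP_WEIGHT * false_positives


# Error counts the article reports for the best-performing model.
reported = misclassification_cost(false_negatives=107, false_positives=374)

# Hypothetical alternative with fewer errors in total (440 vs 481)
# but more missed investment-ready firms.
alternative = misclassification_cost(false_negatives=140, false_positives=300)

print(reported, alternative)  # 909 1000
```

Under this weighting, the hypothetical alternative scores worse despite making fewer mistakes overall, which is the kind of trade-off a cost-sensitive evaluation is meant to expose.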

Under this evaluation, Gradient Boosting still performed best, producing the lowest total misclassification cost. It correctly identified more than 71 percent of investment-ready SMEs while also maintaining strong accuracy in filtering out firms that were not ready for equity financing. Logistic Regression came a close second, while other models underperformed either in accuracy or cost efficiency.

Statistical testing added nuance to these results. A DeLong test showed no significant difference in ROC AUC between Gradient Boosting and Logistic Regression, but a McNemar test on the paired predictions confirmed that Gradient Boosting had a lower error rate overall.
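For readers unfamiliar with the McNemar test, it compares two classifiers on the same cases by looking only at the cases where they disagree. The sketch below uses the common continuity-corrected chi-square form, which may differ from the exact variant the authors applied. The discordant counts are hypothetical, since the article does not report them.

```python
import math


def mcnemar_test(only_a_wrong: int, only_b_wrong: int):
    """McNemar chi-square test with continuity correction.

    Only discordant pairs matter: cases one model got wrong and the other got right.
    Returns (statistic, p_value) for a chi-square distribution with 1 degree of freedom.
    """
    n = only_a_wrong + only_b_wrong
    if n == 0:
        return 0.0, 1.0
    # Clamp at zero so equal discordant counts give a statistic of exactly 0.
    diff = max(abs(only_a_wrong - only_b_wrong) - 1, 0)
    statistic = diff ** 2 / n
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical discordant counts, for illustration only:
# 60 firms misclassified only by model A, 95 misclassified only by model B.
statistic, p_value = mcnemar_test(only_a_wrong=60, only_b_wrong=95)
print(round(statistic, 3), p_value < 0.05)
```

A small p-value suggests the two models' error rates genuinely differ, even when a ranking metric such as ROC AUC shows no significant gap.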

What signals matter most for predicting investment readiness?

Apart from model performance, the authors sought to understand which factors most strongly influence predictions. Using feature importance rankings and SHAP analysis, they identified six key signals that consistently separated investment-ready firms from others.

The most influential predictors included confidence in negotiations with equity investors, financing sought specifically for growth, the priority given to future financing conditions, the overall external financing environment, organizational autonomy, and investor willingness to provide capital. These variables captured both the financial realities of SMEs and their strategic positioning in attracting equity.

An error profile of the best-performing model revealed that it successfully identified 267 of 374 investment-ready SMEs while correctly classifying 1,440 of 1,814 non-ready firms. The model thus produced 374 false positives and 107 false negatives, a pattern consistent with its cost-sensitive design, which tolerates more false alarms in exchange for fewer missed investment-ready firms, the more economically damaging error.
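The headline metrics can be checked directly from these counts. The short calculation below uses only the figures quoted in the article; the variable names are illustrative.

```python
# Counts from the article's error profile of the best-performing model.
tp, fn = 267, 107    # investment-ready firms: caught vs missed
tn, fp = 1440, 374   # non-ready firms: correctly rejected vs wrongly flagged

recall = tp / (tp + fn)             # share of ready firms identified
specificity = tn / (tn + fp)        # share of non-ready firms filtered out
balanced_accuracy = (recall + specificity) / 2
precision = tp / (tp + fp)          # share of flagged firms that are truly ready

print(round(recall, 3), round(specificity, 3), round(balanced_accuracy, 3))
# 0.714 0.794 0.754
print(round(precision, 3))
```

The balanced accuracy works out to the reported 0.754, and the recall matches the "more than 71 percent" figure. The precision shows the flip side of the trade-off: fewer than half of the flagged firms are actually investment-ready, so the output is best read as a shortlist for further screening.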

Why does this matter for Europe’s SME ecosystem?

SMEs are widely recognized as the backbone of Europe’s economy, yet many face persistent challenges in accessing equity financing. Investors, on the other hand, often rely on personal networks or fragmented screening methods to identify promising firms. The machine learning framework presented by the researchers offers a standardized, EU-wide tool that could transform this process.

By systematically flagging investment-ready firms, the model could expand deal flow for equity investors, ensuring that innovative SMEs with high growth potential are not overlooked. For policymakers, it provides an evidence-based mechanism to support entrepreneurship, channel capital more efficiently, and address gaps in equity access.

The researchers note several limitations. The SAFE survey is self-reported and cross-sectional, introducing risks of bias and limiting longitudinal insights. Class imbalance remains a challenge, as investment-ready firms represent only a small share of the dataset. Moreover, external validation is needed across different countries and sectors to confirm robustness.

First published in: Devdiscourse