AI monsoon forecasts help millions of Indian farmers plan for rainfall risk


CO-EDP, VisionRI | Updated: 16-03-2026 07:42 IST | Created: 16-03-2026 07:42 IST

For millions of farmers, the timing of the monsoon is not a background climate signal but a direct economic trigger that can shape planting, seed choice, and investment decisions. Researchers argue that better forecasts can reduce this uncertainty, but only if they are designed around how farmers actually make choices in the field.

In their study, “Designing Probabilistic AI Monsoon Forecasts to Inform Agricultural Decision-Making,” the authors present a system that blends artificial intelligence weather models with a new statistical approach built around changing farmer expectations through the season. The paper shows how this framework improved monsoon onset forecasting and supported a large-scale rollout to Indian farmers.

A forecast built around farmer decisions, not just atmospheric skill

The decision-theory framework is meant for situations where forecasters cannot prescribe a single best action because farmers are too heterogeneous. One farmer may be more risk-averse and unwilling to plant unless the chance of a stable rainy onset is very high. Another may have irrigation or drought-tolerant seed and be more willing to take early risks. A household with income from outside farming may tolerate uncertainty differently from one fully dependent on the harvest. Because those differences are often invisible to a central forecaster, the authors argue that the most useful forecast is not a simplified advisory but a well-calibrated probabilistic message that preserves information instead of flattening it.

According to the researchers, probabilistic forecasts allow users with different constraints to act on the same weather information in different ways. A deterministic message that says onset is or is not expected soon may hide exactly the uncertainty that matters most to a farmer weighing whether to plant, wait, invest in a higher-yield seed, or protect against possible failure. The paper therefore argues that forecasters should avoid coarsening information when the end users are diverse and when the optimal response depends on private knowledge the users hold about their own circumstances.

That logic shaped the operational choices in the Indian monsoon system. The researchers focused on monsoon onset, a key trigger for agricultural decisions in many parts of the tropics. They used an agronomic definition centered on the first wet spell of the season that is not quickly followed by a prolonged dry spell, rather than relying only on broader circulation-based weather markers. This distinction matters because a few early rains can tempt planting, but if those rains are followed by a damaging dry period, crops may fail. The forecast target was therefore built around the event farmers care about operationally: the beginning of sustained, agriculturally meaningful rainfall.
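An onset definition of this kind can be expressed directly in code. The sketch below is illustrative only: the thresholds (20 mm over 3 days for a wet spell, 10 near-rainless days as a damaging dry run, a 20-day lookahead) are assumptions for demonstration, not the paper's exact agronomic definition.

```python
import numpy as np

def agronomic_onset(rain_mm, wet_total=20.0, wet_days=3,
                    dry_thresh=1.0, dry_run=10, lookahead=20):
    """Return the first day starting a wet spell (>= wet_total mm over
    wet_days days) that is NOT followed by dry_run consecutive
    near-rainless days within the next lookahead days.
    Thresholds are illustrative, not the paper's definition."""
    rain = np.asarray(rain_mm, float)
    for d in range(len(rain) - wet_days + 1):
        if rain[d:d + wet_days].sum() >= wet_total:
            future = rain[d + wet_days: d + wet_days + lookahead]
            run, worst = 0, 0
            for is_dry in (future < dry_thresh):
                run = run + 1 if is_dry else 0
                worst = max(worst, run)
            if worst < dry_run:   # wet spell not undone by a dry run
                return d
    return None                   # no qualifying onset this season

# A false start (rain on days 5-7, then a long dry spell) is skipped;
# sustained rain beginning on day 25 qualifies as onset.
rain = np.zeros(60)
rain[5:8] = 10.0
rain[25:45] = 8.0
onset = agronomic_onset(rain)
```

Under this definition, the tempting early rains are correctly rejected because the dry run that follows them exceeds the tolerance, which is exactly the failure mode the agronomic target is designed to screen out.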

The study also argues that forecast value depends on what farmers already know before receiving any new model output. A standard climatology baseline, such as the historical median onset date, can be misleading in this context. If the usual onset date has already passed and continuous rain still has not begun, farmers will naturally update their expectations and look ahead rather than continue believing onset has likely already happened. A forecasting system that ignores this evolving understanding may appear skillful against a simplistic historical benchmark while offering little real value to farmers, or even misleading some of them.

To solve that problem, the authors developed what they call an evolving-expectations model. This statistical model uses Bayesian updating and long-run observational rainfall data to continually revise the probability of onset as the season progresses without onset occurring. In effect, it shifts probability mass forward through the season as each dry day passes. That makes it a more realistic representation of what an informed farmer would believe in the absence of a new AI forecast. The paper argues that any forecast meant for dissemination should beat this updated baseline, not merely a static climatological median.
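The core of that Bayesian update is conditioning a historical onset-date distribution on the fact that onset has not yet happened. A minimal sketch, using a made-up Gaussian climatology rather than the paper's 124-year rainfall record, shows how probability mass shifts forward as dry days pass:

```python
import numpy as np

def onset_prob_next_week(hist_onset_probs, today):
    """P(onset within the next 7 days | no onset through `today`).

    hist_onset_probs: array where index d holds the historical
    probability that onset occurs on day-of-season d.
    """
    future = hist_onset_probs[today + 1:]
    if future.sum() == 0:        # past every historical onset date
        return 1.0
    window = hist_onset_probs[today + 1: today + 8]
    return window.sum() / future.sum()   # Bayes: renormalize on "not yet"

# Illustrative climatology: onset roughly normal around day 30.
days = np.arange(60)
hist = np.exp(-0.5 * ((days - 30) / 7.0) ** 2)
hist /= hist.sum()

early = onset_prob_next_week(hist, 10)   # well before median onset
late = onset_prob_next_week(hist, 35)    # median onset already passed
```

Early in the season the conditional probability of imminent onset is small, but once the median date has passed without rain, the same climatology implies onset is now very likely within the week, which is the behavior a static median cannot capture.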

The authors suggest that forecast evaluation in climate services should generally compare new systems against baselines that include information already available to users. Otherwise, forecasters risk overstating the benefit of new technology and distributing information that is less useful than the beliefs people would form on their own from direct observation and local knowledge.

Why blending AI with evolving expectations outperformed standard approaches

The forecasting system itself combines the evolving-expectations model with outputs from two artificial intelligence weather prediction models selected through earlier decision-oriented benchmarking: Google’s NeuralGCM and the European Centre for Medium-Range Weather Forecasts’ AIFS. NeuralGCM is described as a hybrid model, while AIFS is a fully data-driven system. The researchers used both to forecast rainfall patterns relevant to onset, then blended those predictions with the statistical expectation model through a multinomial logistic regression structure that allows the weight of each information source to vary by lead time.

That flexible weighting is one of the paper’s main technical advances. AI rainfall forecasts can be highly informative at short lead times but tend to degrade as the horizon lengthens. By contrast, the evolving-expectations model remains useful precisely because it captures how the set of plausible onset dates changes as the season unfolds. The blended model can therefore lean more heavily on AI when short-run rainfall signals are strong and more heavily on evolving expectations when longer-run AI rainfall skill weakens. The result is not a simple average but a forecast designed to exploit the different strengths of each component.
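The blending idea can be sketched with a binary logistic regression whose features interact each component's log-odds with lead time, so the fitted weights drift between sources as the horizon lengthens. Everything below is synthetic and simplified: the paper uses a multinomial logistic structure on real forecasts, while this toy fits a plain gradient-descent logistic model on fabricated data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000

# Synthetic components: an "AI" probability whose noise grows with
# lead time, and a steadier "evolving expectations" probability.
lead = rng.integers(1, 29, n) / 28.0            # normalized lead time
truth = (rng.random(n) < 0.4).astype(float)     # onset event
p_ai = np.clip(0.4 + 0.5 * (truth - 0.4)
               + rng.normal(0, 0.6, n) * lead, 0.01, 0.99)
p_ee = np.clip(0.4 + 0.2 * (truth - 0.4)
               + rng.normal(0, 0.1, n), 0.01, 0.99)

logit = lambda p: np.log(p / (1 - p))
sigmoid = lambda z: 1 / (1 + np.exp(-z))

# Interacting each logit with lead time lets the regression lean on
# the AI signal at short leads and on expectations at long leads.
X = np.column_stack([np.ones(n), logit(p_ai), logit(p_ee),
                     lead * logit(p_ai), lead * logit(p_ee)])
X[:, 1:] = (X[:, 1:] - X[:, 1:].mean(0)) / X[:, 1:].std(0)

w = np.zeros(X.shape[1])
for _ in range(3000):                 # gradient descent on log loss
    w -= 0.5 * X.T @ (sigmoid(X @ w) - truth) / n
p_blend = sigmoid(X @ w)
```

Because the lead-time interactions are learned rather than fixed, the blend is not a constant-weight average; that is the property the paper credits for beating every fixed-weight ensemble it tested.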

The study reports that this blended system outperformed both static climatology and the evolving-expectations baseline across key probabilistic metrics, including Brier Skill Score, Ranked Probability Skill Score, and area under the ROC curve, during the main 2000–2024 cross-validation period, an additional 1965–1978 holdout period, and the live 2025 dissemination period. The paper states that when pooled across lead times out to four weeks, the blended model improved Brier Score by roughly 5 to 10 percent, improved Ranked Probability Score by 20 to 25 percent, and increased AUC by 3 to 5 percentage points relative to static climatology during 2000–2024.
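The headline metric is straightforward to compute. The Brier Score is the mean squared error of probability forecasts against binary outcomes, and the Brier Skill Score expresses the fractional improvement over a reference such as static climatology (the toy numbers below are invented for illustration):

```python
import numpy as np

def brier_score(p, y):
    """Mean squared error of probability forecasts vs 0/1 outcomes."""
    p, y = np.asarray(p, float), np.asarray(y, float)
    return np.mean((p - y) ** 2)

def brier_skill_score(p, y, p_ref):
    """Fractional Brier improvement over a reference forecast:
    1 is perfect, 0 matches the reference, negative is worse."""
    return 1.0 - brier_score(p, y) / brier_score(p_ref, y)

y = np.array([1, 0, 1, 1, 0])            # did onset occur in window?
p_model = np.array([0.8, 0.2, 0.7, 0.9, 0.3])
p_clim = np.full(5, 0.6)                 # static climatological rate
bss = brier_skill_score(p_model, y, p_clim)   # → 0.775
```

A roughly 5 to 10 percent Brier improvement, as the paper reports against static climatology, corresponds to a Brier Skill Score of about 0.05 to 0.10 on this scale.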

The gains were particularly striking in 2025, when the monsoon behaved unusually. The paper notes that the monsoon first reached mainland India eight days earlier than normal on May 24, then stalled in its northward progression for more than two weeks. In such a year, static climatology performed poorly because the season did not follow the usual historical pattern. Yet the blended model maintained skill and, relative to the static climatological baseline, showed around a 20 percent improvement in both Brier Score and Ranked Probability Score as well as about a 20-point increase in AUC.

The authors emphasize that the system remained well calibrated, meaning that the probabilities it issued aligned with observed frequencies more closely than raw AI model outputs. They show that raw NeuralGCM forecasts were overconfident, while the blended approach corrected this problem and yielded better reliability. This mattered because the entire decision-theory framework depends on users being able to trust that forecast probabilities genuinely reflect weather likelihoods rather than exaggerated model certainty.
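Calibration of this kind is typically checked with a reliability table: bin the issued probabilities and compare the mean forecast in each bin to the observed frequency. The sketch below fabricates an overconfident forecaster (saying 0.9 when events occur 70 percent of the time) to show how the mismatch surfaces; it is a generic diagnostic, not the paper's evaluation code.

```python
import numpy as np

def reliability_table(p, y, bins=5):
    """Per-bin (mean forecast, observed frequency, count).
    A well-calibrated forecast has the first two nearly equal."""
    p, y = np.asarray(p, float), np.asarray(y, float)
    edges = np.linspace(0, 1, bins + 1)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (p >= lo) & (p <= hi) if hi == 1 else (p >= lo) & (p < hi)
        if mask.any():
            rows.append((p[mask].mean(), y[mask].mean(), int(mask.sum())))
    return rows

# Overconfident forecasts: 0.9/0.1 issued, 0.7/0.3 observed.
rng = np.random.default_rng(1)
p_over = np.where(rng.random(10000) < 0.5, 0.9, 0.1)
y = (rng.random(10000) < np.where(p_over > 0.5, 0.7, 0.3)).astype(int)
table = reliability_table(p_over, y)
```

In the high-probability bin the observed frequency lands near 0.7 against an issued 0.9, the signature of the overconfidence the authors report in raw NeuralGCM output and correct through blending.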

The study also found that the blended model beat every fixed-weight multimodel ensemble tested during the main cross-validation period, even when the ensemble weights were chosen after the fact to maximize performance. In other words, simply averaging models, even with optimized weights, did not match the more flexible blending procedure. That result suggests the system’s gains came not just from pooling models but from dynamically combining different kinds of information in a decision-oriented way.

Another practical strength was resilience under data constraints. The evolving-expectations model was trained on 124 years of Indian rain-gauge data, but the authors tested what would happen if much shorter training histories were available, as in other monsoon regions with sparser records. They found that while the standalone statistical model lost skill with less historical data, those losses did not meaningfully carry through into the blended model. This suggests the approach could remain viable in data-poor settings if AI rainfall information and climatology still provide overlapping but complementary signals.

From academic model to national rollout in India

What separates this paper from many forecast studies is that the system was not left at the prototype stage. The authors say the framework and models were used in a 2025 program of India’s Ministry of Agriculture and Farmers’ Welfare, which distributed local probabilistic onset forecasts weekly to 38 million farmers across 13 states. That deployment turned the study from a methodological contribution into a real-world test of whether AI-enabled forecast design can function at national scale.

The forecasts were produced weekly rather than once at the start of the season, allowing the information to become more useful as onset approached. This updating feature is central to the study’s logic. Seasonal forecasts are often issued just once, but the authors argue that a weekly system lets farmers revise plans as the probability distribution changes and as the atmosphere reveals more information. A forecast made in late May, early June, and mid-June can carry very different value, even within the same season, because the decision window for sowing and investment is dynamic rather than fixed.

The paper also connects the forecast design to previous evidence on forecast use. Rather than claiming value solely from improved skill scores, the authors ground the effort in earlier experimental work showing that farmers do change decisions when they receive information about local rainy season onset. Their decision-theory framework argues more broadly that the right way to assess forecast value in such heterogeneous settings is not by asking whether outside observers think a decision was correct, nor by looking only at realized yield or profit in a single season. A risk-averse farmer may rationally choose a lower-yield but safer option in response to a forecast, and that decision could still be welfare-improving. What matters most is whether the forecast shifts choices by providing relevant information.

That argument is likely to influence how future climate adaptation tools are judged. The study warns that experiments measuring only yields or profits can miss the true value of forecasts because outcomes depend on many unobserved shocks and because some forecast responses are explicitly aimed at reducing risk, not maximizing average output. In this framework, a forecast is useful if it changes behavior in a context where better information should matter. For agricultural policy, that provides a clearer rationale for forecast dissemination even in years when the realized weather outcome does not make every cautious decision look profitable after the fact.

The researchers are also explicit about the system’s current limitations. The 2025 implementation worked at weekly resolution and on a spatial grid of 2 degrees by 2 degrees, which is coarser than many local farm decisions. They note that future work could push toward daily forecasts and finer spatial detail while retaining the gains from blending. They also suggest that more models, including conventional numerical weather prediction systems, could be added to the framework if they contribute new information rather than duplicating what other components already capture.

  • FIRST PUBLISHED IN:
  • Devdiscourse