IndiaAI Mission Driving Inclusive Growth with Open Datasets, Language Tech & R&D



Devdiscourse News Desk | New Delhi | Updated: 08-08-2025 20:34 IST | Created: 08-08-2025 20:34 IST

India’s Artificial Intelligence strategy, shaped by Prime Minister Narendra Modi’s vision to democratise technology, is fast transforming the nation’s digital future. The approach focuses on using AI to tackle India-specific challenges, create new avenues for economic growth, and generate employment opportunities across diverse sectors.

Union Minister for Electronics and IT, Ashwini Vaishnaw, outlined these developments in the Rajya Sabha, highlighting the IndiaAI Mission as a cornerstone for building an inclusive and world-class AI ecosystem rooted in the country’s development priorities.


India’s Current AI Landscape

India’s technology sector is expanding at record speed, with annual revenue projected to exceed $280 billion in 2025. The sector employs over six million professionals and hosts more than 1,800 Global Capability Centres (GCCs), over 500 of which focus specifically on AI.

The country’s startup ecosystem is equally vibrant, with nearly 1.8 lakh (180,000) startups; 89% of the new ventures launched last year are AI-powered. International benchmarks such as the Stanford AI Index rank India among the global leaders in AI skills, capabilities, and policy readiness.

India is also the second-largest contributor to AI projects on GitHub, reflecting the strength of its developer community and its global relevance in open-source innovation.


The IndiaAI Mission: Launched in 2024

Introduced in 2024, the IndiaAI Mission seeks to make AI accessible, affordable, and applicable to all. Central to this mission is the development of robust, locally relevant datasets and the provision of platforms that empower developers, researchers, and startups to innovate without reinventing basic AI modules.


AIKosh – Unified Dataset Platform for India

AIKosh, the flagship IndiaAI Datasets Platform, consolidates data from government and non-government sources, offering:

  • 1,200+ India-specific datasets and 217 AI models across sectors such as health, agriculture, and education.

  • Curated datasets from government departments, academia, and Indian startups, ensuring cultural and geographic relevance.

  • Examples include farmer query data from Kisan Call Centres, geological surveys from states, and clinical, imaging, and pathology datasets for AI-based diagnosis of brain lesions.

  • Small AI models, such as Text-to-Speech (TTS) models for Indian languages including Bengali, Gujarati, Kannada, and Malayalam.

  • A sandbox environment for controlled testing by startups and academic institutions.

The platform has recorded 265,000+ visits, 6,000 registered users, and 13,000+ resource downloads, demonstrating strong adoption.


Bharat Data Exchange – Strengthening Data Accessibility

The Bharat Data Exchange (Bharat Data Platform) extends the Open Government Data (OGD) initiative by serving as a central repository for AIKosh. It provides machine-readable and human-readable datasets owned by government agencies, enabling secure and streamlined access for AI development.


Digital India Bhashini – AI for Language Inclusion

As part of the National Language Translation Mission (NLTM), Bhashini develops AI-driven language solutions to break linguistic barriers. Through the BhashaDaan platform, citizens contribute voice, text, and translations in 22 Indian languages.

With contributions from 70+ research institutions and domain experts, large annotated datasets are curated for technologies such as:

  • Speech recognition

  • Machine translation

  • Indic NLP applications

The project captures dialects and regional variations to keep its datasets diverse and to guard against AI bias. Partnerships with ministries, states, academia, and industry support cross-sector collaboration in Indic AI.


R&D Through National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS)

Under NM-ICPS, 25 Technology Innovation Hubs (TIHs) have been set up in leading academic institutions covering AI, ML, IoT, robotics, cybersecurity, and quantum technology. Key initiatives include:

  • IIIT Hyderabad TIH: Developed 105+ India-specific datasets, including the India Driving Dataset (IDD), clinical datasets, and 2,000+ digitised pathology images.

  • BharatGen consortium (IIT Bombay, IIT Madras, IIT Kanpur): Built a massive India-centric AI corpus with trillions of tokens, thousands of multilingual speech hours, and millions of local documents.

  • ARTPARK at IISc Bengaluru: Created the Vaani dataset (16,000 hours of audio in 54 languages) and the MIDAS medical imaging datasets for public health applications.


Medical Research Data for AI Innovation

The Indian Council of Medical Research (ICMR) has developed a Health Research Data Repository with centralised, secure access to datasets compliant with WHO, ISO, and national health protocols.

These include:

  • National NCD Monitoring Survey

  • ICMR-INDIAB study (2008–2020) covering 113,043 participants

  • TB treatment trials, diabetes registries, antimicrobial resistance data

  • IN-CXR chest radiographs


Funding and Education in AI

Government initiatives such as IMPRINT and Uchhatar Avishkar Yojana (UAY) have allocated ₹1,000 crore for AI curriculum development, joint R&D, and industry-academia collaboration.

The Anusandhan National Research Foundation (ANRF) runs an AI-for-Science programme to accelerate research in physics, chemistry, and biology using machine learning models.

The India AI Open Stack initiative offers a foundational AI architecture tailored for Indian researchers, embedding science and engineering models for national priorities.


Outcome: Democratised AI for India’s Growth

These integrated efforts, from open data platforms like AIKosh to language initiatives such as Bhashini, and from medical research datasets to high-performance R&D hubs, aim to produce high-quality, unbiased, vernacular AI solutions.

The government envisions an AI-driven India that is self-reliant, globally competitive, and inclusive, harnessing its diverse cultural, linguistic, and scientific resources for sustainable growth.
