Machine learning unlocks secrets of soil contamination via satellites

CO-EDP, VisionRI | Updated: 01-04-2025 17:42 IST | Created: 01-04-2025 17:42 IST

A new systematic review published in the journal Remote Sensing reveals that satellite imagery, paired with advanced machine learning algorithms, is transforming the way scientists detect soil pollution, offering a cost-effective and scalable solution to a growing global crisis.

Conducted by researchers from the University of Porto and the Universidade de Lisboa, the study "A Systematic Review of Machine Learning Algorithms for Soil Pollutant Detection Using Satellite Imagery" synthesizes eight years of data, highlighting the technology’s potential to monitor contaminants like heavy metals and microplastics across vast agricultural and industrial landscapes.

The research, led by Amir TavallaieNejad and a team of environmental engineers, analyzed 47 peer-reviewed studies from an initial pool of 1,018, spanning 2016 to 2024. It underscores the urgency of addressing soil pollution, which affects an estimated 506 million hectares of land worldwide, an area roughly the size of Western Europe. With traditional soil sampling methods proving slow, costly, and limited in scope, the findings signal a shift toward remote sensing as a vital tool for environmental protection.

Soil pollution, driven by industrial waste, agricultural runoff, and heavy metal contamination, poses a serious threat to ecosystems, food security, and public health. Traditional methods of soil sampling and laboratory analysis are costly, labor-intensive, and limited in spatial coverage. Satellite-based remote sensing, enhanced by machine learning techniques, offers a scalable, cost-effective alternative that enables the identification of contaminants across vast and often inaccessible regions.

The review pinpoints Sentinel-2 and Landsat 8 as the most widely used satellites for soil pollution detection. Sentinel-2, operated by the European Space Agency (ESA), delivers high-resolution multispectral images every five days, while Landsat 8, a NASA-USGS collaboration, provides comprehensive coverage every 16 days. Nearly 40% of the studies leveraged multiple satellites, enhancing data reliability and enabling detailed mapping of contaminants over large areas. 

Machine learning emerged as the backbone of this technological leap. Among the algorithms evaluated, Random Forest, an ensemble of decision trees, stood out, appearing in 33 of the studies and achieving top performance in 13 cases. Its ability to process large, complex datasets and identify key variables makes it well suited to classifying land cover and extracting pollution indicators from satellite imagery. Other methods, including Support Vector Machines and Artificial Neural Networks, also showed promise, with 56% of studies comparing multiple models to pinpoint the most effective tools for specific pollutants.
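
For readers curious about what a tree ensemble does, the Python sketch below is a deliberately simplified illustration and not the pipeline of any study in the review. It trains one-split decision "stumps" on bootstrap resamples of the data, lets each stump see only a random subset of features, and classifies by majority vote. A real Random Forest grows much deeper trees and re-draws the feature subset at every split. The reflectance values, the labels (0 for clean, 1 for contaminated), and all function names are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, features):
    """Pick the one-threshold split, over the allowed features, with the lowest weighted Gini."""
    best = None
    for f in features:
        values = sorted({row[f] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2  # midpoint between neighbouring values
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, threshold, majority(left), majority(right))
    if best is None:  # no split possible: always predict the majority class
        label = majority(labels)
        return (features[0], float("inf"), label, label)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, n_features=2, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in rows]  # sample rows with replacement
        features = rng.sample(range(len(rows[0])), n_features)
        forest.append(
            fit_stump([rows[i] for i in picks], [labels[i] for i in picks], features)
        )
    return forest


def predict_forest(forest, row):
    """Classify one pixel by majority vote across all stumps."""
    votes = [left if row[f] <= threshold else right for f, threshold, left, right in forest]
    return majority(votes)


# Hypothetical reflectance values in three spectral bands for eight pixels.
pixels = [
    [0.10, 0.12, 0.11], [0.12, 0.10, 0.13], [0.11, 0.14, 0.10], [0.13, 0.11, 0.12],
    [0.30, 0.32, 0.31], [0.33, 0.30, 0.34], [0.31, 0.35, 0.30], [0.34, 0.31, 0.33],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = clean, 1 = contaminated (made-up ground truth)

forest = fit_forest(pixels, labels)
print(predict_forest(forest, [0.11, 0.11, 0.11]))
print(predict_forest(forest, [0.32, 0.32, 0.32]))
```

Because each stump sees a different resample and a different feature subset, the individual voters disagree slightly, and averaging over them is what makes ensembles of this kind robust to noisy inputs.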

The findings come at a critical time. Traditional soil sampling, which relies on discrete data points, struggles to provide continuous maps of contamination over expansive regions. Laboratory analyses, while accurate, are labor-intensive and environmentally taxing. By contrast, satellite-based remote sensing, particularly visible and near-infrared spectroscopy, offers a rapid, scalable alternative.

Despite its promise, the study identifies significant challenges. The lack of standardized datasets and methodologies across studies complicates comparisons and hinders broader adoption. Variations in evaluation metrics, such as R-squared, Root Mean Squared Error, and F1 scores, further muddy the waters, making it difficult to benchmark algorithmic performance consistently. Sensor limitations also persist, with current spaceborne systems struggling to detect certain pollutants at fine scales. The researchers call for standardized frameworks and upgraded satellite capabilities to close these gaps.
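
The three metrics named above measure different things, which is part of why results are hard to compare across studies. As a minimal sketch using the standard textbook definitions, the Python below computes R-squared and Root Mean Squared Error for a regression task (predicting a concentration) and the F1 score for a classification task (flagging a site as polluted). The sample numbers and function names are invented for illustration and do not come from the review.

```python
import math


def r_squared(y_true, y_pred):
    """Share of variance in the measurements explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Typical prediction error, in the same units as the measurements."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def f1_score(labels_true, labels_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(labels_true, labels_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical copper concentrations (mg/kg): lab measurements vs. model estimates.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
print(round(r_squared(measured, predicted), 3))  # 0.948
print(round(rmse(measured, predicted), 3))       # 2.55

# Hypothetical polluted (1) / clean (0) labels for six sampling sites.
site_truth = [1, 1, 0, 0, 1, 0]
site_model = [1, 0, 0, 1, 1, 0]
print(round(f1_score(site_truth, site_model), 3))  # 0.667
```

A study reporting only R-squared on concentrations and another reporting only F1 on polluted-versus-clean labels cannot be ranked against each other directly, which illustrates the benchmarking problem the researchers describe.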

Of the 47 studies reviewed, 34 focused on direct pollutant detection, targeting substances like copper, zinc, and microplastics, while 13 explored indirect indicators, such as vegetation health tied to soil contaminants. Validation remains a cornerstone of the approach, with most studies cross-referencing satellite data against on-ground soil samples. Larger sample sizes and rigorous accuracy metrics were prioritized, lending credibility to the results.

The implications are far-reaching. With soil pollution threatening ecological balance and human well-being, the integration of satellite imagery and machine learning could reshape environmental monitoring and policy. The review cites examples of success: one study mapped urban green spaces with 89% accuracy using Landsat 8, while another achieved 83% accuracy in land cover classification with Sentinel-2. Such precision could guide targeted remediation efforts, from cleaning up industrial sites to safeguarding farmland.

The review also highlighted emerging trends, including the use of deep learning methods such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), particularly Long Short-Term Memory (LSTM) models. These algorithms have demonstrated promise in capturing temporal changes in soil conditions and predicting pollution trends over time.

Moving forward, the researchers encourage the scientific community to expand upon these findings. The team promotes the integration of multi-sensor data and stronger interdisciplinary collaboration among data scientists, soil scientists, and environmental policymakers to fully realize the potential of this method.

FIRST PUBLISHED IN: Devdiscourse