Monitoring Deforestation with Open Data and Machine Learning — Part 2

Written by
Luis Di Martino
Published on
March 15, 2024

This article is the second part of a two-part story. You can find the first part here.

In the previous article, we:

  • Presented how essential forests are as part of the solution for tackling climate change.
  • Introduced open data sources that we can use to monitor deforestation.
  • Explained how we could train a classifier to automatically detect changes in the available data.

In this article, we use the trained classifier to detect forest changes, analyze the results, and discuss alternatives to improve the implemented solution.

Access to the evaluation data

Here we show how to access forest imagery from recent years for a region in the state of Pará, Brazil. The approach works the same way for any part of the world covered by the imagery program of Norway’s International Climate & Forests Initiative (NICFI).

Region of interest in Pará state in Brazil

We can query the available mosaics using Planet’s Python client as follows:

planet mosaics list | jq -r '.mosaics[] | [.name, .first_acquired, .last_acquired] | @tsv'

After selecting a particular mosaic (e.g., planet_medres_normalized_analytic_2020-06_2020-08_mosaic), we can download the data for the region of interest (ROI) by providing its bounding-box coordinates as shown below:

planet mosaics download planet_medres_normalized_analytic_2020-06_2020-08_mosaic --bbox -53,-4,-52,-3

The resulting product is a set of GeoTIFF image files covering the requested ROI, divided into tiles of 4096 by 4096 pixels. Given the ground-sampling distance of 4.77 m, each tile covers an area of roughly 380 square kilometers.
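As a quick sanity check of the coverage figure, the arithmetic can be sketched in a few lines of Python. The function name is ours and purely illustrative; the tile size and ground-sampling distance are the values quoted above.

```python
def tile_area_km2(side_px: int, gsd_m: float) -> float:
    """Ground area covered by a square tile, in square kilometers."""
    side_km = side_px * gsd_m / 1000.0  # length of one tile side on the ground
    return side_km ** 2


# 4096-pixel tiles at 4.77 m per pixel, as quoted in the text.
area = tile_area_km2(4096, 4.77)
print(f"{area:.1f} km^2")
```

Each side spans about 19.5 km, so the result lands slightly above the rounded 380 square kilometers.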


We now need to classify the downloaded data. The ResNet50 neural network that we trained previously takes as input images of 224 by 224 pixels with the three visible bands as channels. To comply with the expected format, we:

  • Drop the near-infrared (NIR) channel from the input data. Later on, we discuss how the NIR band could be used as part of future work.
  • Divide the area covered by each GeoTIFF file into tiles of the expected size.
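To make these two steps concrete, here is a minimal sketch in plain Python. The helper names and the band order are assumptions for illustration only; a real pipeline would read the rasters with a geospatial library rather than work with bare offsets.

```python
from typing import Iterator, List, Tuple

# (row_offset, col_offset, height, width) of a tile inside a raster
Window = Tuple[int, int, int, int]


def tile_windows(height: int, width: int, tile: int = 224) -> Iterator[Window]:
    """Yield non-overlapping square windows, skipping partial tiles at the edges."""
    for row in range(0, height - tile + 1, tile):
        for col in range(0, width - tile + 1, tile):
            yield (row, col, tile, tile)


def rgb_band_indexes(band_names: List[str]) -> List[int]:
    """Return the positions of the red, green, and blue bands, leaving NIR out."""
    return [band_names.index(name) for name in ("red", "green", "blue")]


# Assumed band order for the downloaded mosaics; check the actual file metadata.
bands = ["blue", "green", "red", "nir"]
rgb = rgb_band_indexes(bands)

windows = list(tile_windows(4096, 4096))
print(rgb)           # [2, 1, 0]
print(len(windows))  # 324, i.e. 18 x 18 full tiles
```

Skipping partial tiles is a design choice of this sketch: a 4096-pixel side leaves a 64-pixel strip uncovered, which could instead be padded or handled with overlapping windows.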

Luckily, the open-source library telluric, developed by Satellogic, provides this tiling functionality out of the box. The library also makes it easy to manage geographical polygons and to save and load them as GeoJSON files. This feature allows us to handle the classification results efficiently. For each tile that goes through the classifier, we generate a corresponding shape with the tile’s footprint as geometry and the labels output by the classifier as properties. We collect all the shapes and save them in a GeoJSON file that we can easily open and inspect in a geographic information system (e.g., QGIS), as shown below.

You can find the code implementing the loading of rasters, the division into tiles, their classification, and the generation of the output GeoJSON files here.

Image captured in 2020, its tile division, and classification results in a particular tile.
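The idea of storing each tile's footprint together with its labels can be sketched with the standard library alone. This is not telluric's API; the function names, the example bounds, and the "labels" property key are hypothetical and only illustrate the structure of the output file.

```python
import json
from typing import Dict, List, Tuple


def tile_feature(bounds: Tuple[float, float, float, float], labels: List[str]) -> Dict:
    """Build a GeoJSON Feature from (west, south, east, north) bounds and labels."""
    west, south, east, north = bounds
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # a GeoJSON ring repeats its first position to close
    ]
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": {"labels": labels},
    }


def feature_collection(features: List[Dict]) -> str:
    """Serialize a list of features as a GeoJSON FeatureCollection string."""
    return json.dumps({"type": "FeatureCollection", "features": features})


# Hypothetical tile bounds (in degrees) and classifier output.
feature = tile_feature((-53.0, -4.0, -52.99, -3.99), ["primary", "clear"])
geojson_text = feature_collection([feature])
```

Writing `geojson_text` to a file with a .geojson extension would produce something a GIS such as QGIS can open directly.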

Let’s do a quick recap: we now have images of the region of interest taken in different years, along with their corresponding land cover classification. The only missing piece for an automatic deforestation monitor is a criterion to tag zones that were deforested between two points in time.

The classifier identifies each tile using up to seventeen categories. Some can be assigned concurrently (e.g., primary and clear), while others are opposites (e.g., clear and cloudy). We can easily detect deforested tiles by evaluating some particular labels:

  • primary: is shorthand for “primary rainforest”, or what is known colloquially as virgin forest. Generally speaking, we use the “primary” label for any area that exhibits dense tree cover.
  • agriculture: one of the main drivers of deforestation in the Amazon is agriculture. Large plots of land are cleared from trees to make plantations or breed animals on a ranch. The classifier labels portions of land with such usages with this tag.
  • habitation and road: a second reason behind deforestation is expanding human-populated areas and the infrastructure this augmentation demands. We use these two categories for labeling human-made structures. habitation is employed to tag homes or buildings, including anything from dense urban areas to rural villages. road is used to label portions of land having paths, roads, or highways.

Clouds are a common phenomenon in remote sensing, making it impossible to use some captured images. Luckily, the classifier we trained sorts tiles into four cloud-cover categories: clear, partly cloudy, cloudy, and haze. In our particular case, clouds are rarely an issue when working with data provided by the NICFI program. The images they deliver are mosaics generated from several captures for each period, and the pictures considered for each mosaic were curated to be cloud-free. But if one wants to work with high-resolution data from another source (e.g., Maxar or Satellogic), this classification can become valuable.

Using the classification output GeoJSON files and the labels detailed above, we can develop different strategies to automatically tag regions that were deforested between two points in time.


Here we are not evaluating the classifier with the typical approach of benchmarking its accuracy on previously unseen labeled test data. We did that when we submitted the classifier to the Kaggle challenge, and we report its score in the first part of this story. Sadly, we can’t evaluate against ground-truth labels on the NICFI imagery because it is unlabeled. Instead, we can inspect some outputs visually to get an idea of the classifier’s performance on NICFI data.

We show changes in the region of interest between 2016 and 2020. We mark as deforested the areas that had neither the agriculture nor the habitation label in 2016 and have at least one of them in 2020. You can find the code implementing the classification here.
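The rule itself is small enough to sketch in a few lines. This is an illustrative reimplementation under our own naming, not the linked code; it only assumes that each tile comes with a collection of label strings per year.

```python
from typing import Iterable

# Labels whose appearance we treat as a sign that forest was cleared.
CLEARING_LABELS = {"agriculture", "habitation"}


def is_deforested(labels_before: Iterable[str], labels_after: Iterable[str]) -> bool:
    """True if no clearing label was present initially and at least one is later."""
    before = set(labels_before)
    after = set(labels_after)
    return not (before & CLEARING_LABELS) and bool(after & CLEARING_LABELS)


print(is_deforested(["primary", "clear"], ["agriculture", "clear"]))       # True
print(is_deforested(["agriculture", "clear"], ["agriculture", "habitation"]))  # False
print(is_deforested(["primary", "clear"], ["primary", "clear"]))           # False
```

Note that the rule never looks at the primary label, so a tile that was already bare ground in the first year can still be flagged if agriculture shows up later.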

Usually, when classifying an input into two possible classes, we obtain both correct and incorrect detections. Those are referred to as True Positives and False Positives. The first refers to cases where the system tags the area as being deforested, and we can visually confirm that trees were felled in the region. Let’s first review some true positive examples:

In the examples shown above, the areas were deforested to make room for what look like either plantations or ranches for animal breeding.

False Positives correspond to cases in which the classifier makes an incorrect detection, labeling as deforested an area where we can visually assess that no changes in the forest have occurred. Let’s now review some erroneous detections:

By visually inspecting the images, we could try to assess why each one was misclassified:

  • a: This is a very tricky case. One could say that there is deforestation in the region because of the small portion in the top-left corner where trees were cut down. But across a significant part of the area, the forest seems unchanged. Clouds in the first image and cloud shadows in the second tricked the classifier in this case.
  • b: In this example, the classifier did not assign the habitation label to the first image but did assign it to the second. There seem to be some buildings on the shoreline, which can explain why this category was assigned. These buildings look saturated in the 2016 capture, which most likely made the classifier miss the constructions. This example contains an additional difficulty: it captures part of the Amazon River. In cases like this one, we could benefit from using the water label that the classifier also outputs to further refine the criteria we use to consider a region deforested.
  • c: There are cases like this one in which clouds cover most of the captured area. For this to be an issue, clouds do not need to appear in both captures. When either of the two is covered, detecting deforested regions becomes unfeasible. In this example, the image obtained in 2020 clearly shows that the area lacks trees and contains a ranch. But we cannot consider this region deforested because we cannot see how it looked in 2016. Clouds are not expected, as the provided mosaics were generated trying to avoid them, but they do occur in some tiles. The classifier outputs a set of labels related to cloud coverage: clear, partly cloudy, cloudy, and haze. We could potentially use them to detect cases like this one and make the labeling of deforested regions more robust.
  • d: In this case, we are affected by a usual problem in remote sensing: the capture taken in 2016 was oversaturated. When this occurs, details of highly reflective objects are lost. Colloquially, this is referred to as burnt pixels. In this example, we can see this behavior in the white roofs of buildings. When the details get lost, the classifier struggles to identify the underlying structures. Because of this, the classifier missed the habitation label in the first capture and therefore incorrectly considered the region deforested.
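One way the cloud labels could make the criterion more robust is to return an explicit "unknown" outcome whenever either capture is obscured, as in case c. The sketch below is only one possible policy: the label spellings and the decision to treat haze and partly cloudy tiles as obscured are assumptions, and the water refinement mentioned in case b is not attempted.

```python
from typing import Iterable

# Assumed label spellings; adapt them to the classifier's actual output.
OBSCURED_LABELS = {"cloudy", "partly_cloudy", "haze"}
CLEARING_LABELS = {"agriculture", "habitation"}


def classify_change(labels_before: Iterable[str], labels_after: Iterable[str]) -> str:
    """Return 'unknown', 'deforested', or 'unchanged' for a pair of captures."""
    before = set(labels_before)
    after = set(labels_after)
    if (before | after) & OBSCURED_LABELS:
        return "unknown"  # at least one capture cannot be trusted
    if not (before & CLEARING_LABELS) and (after & CLEARING_LABELS):
        return "deforested"
    return "unchanged"


print(classify_change(["cloudy"], ["agriculture", "clear"]))            # unknown
print(classify_change(["primary", "clear"], ["agriculture", "clear"]))  # deforested
print(classify_change(["primary", "clear"], ["primary", "clear"]))      # unchanged
```

Tiles tagged "unknown" could then be queued for another mosaic period or for manual review instead of being counted as detections.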

Future work

Classifier improvements

We could try several variations when implementing the classifier. On the one hand, we could try different network architectures. We used a ResNet50 convolutional network (ConvNet) architecture, but several other architectures could achieve better performance, or similar performance with fewer parameters and operations, e.g., Inception, DenseNets, or the newer EfficientNet proposed by Google in 2019. There is a post in the Kaggle challenge’s forum sharing experiments with different types of ConvNets. On the other hand, without changing the network architecture, there are several other aspects of the classifier for which we could test alternatives: the definition of the loss function, the learning rate schedule, and the data augmentation applied to the training data, among others. Modifying these properties could improve the classifier’s performance.

Training data improvements

A machine learning system performs best when it generalizes well beyond its training data. The key for this to happen is that the training dataset contains examples resembling the data that the classifier will later see in production. We are not entirely meeting this condition in our scenario, for two main reasons.

  1. Resolution. The image resolution of the training dataset is not the same as that of the evaluation data. In the former, the ground-sampling distance (GSD) is 3 m; in the latter, it is 4.77 m.

We can partially compensate for this difference in GSD by introducing scaling as part of the data augmentation during training, which we didn’t do in this proof of concept. Apart from this, we may benefit from directly resampling all the training data to the same resolution as the evaluation dataset.
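To give a feeling for what such resampling involves, here is a minimal nearest-neighbor sketch in plain Python operating on a single band stored as a list of rows. It assumes the 3 m training GSD and the 4.77 m GSD quoted for the NICFI mosaics; the function name is ours, and a real pipeline would more likely use an area-averaging resampler from an imaging library.

```python
from typing import List


def resample_nearest(image: List[List[int]], src_gsd: float, dst_gsd: float) -> List[List[int]]:
    """Resample one band from src_gsd to dst_gsd using nearest-neighbor lookup."""
    scale = src_gsd / dst_gsd  # below 1 when moving to a coarser resolution
    src_h, src_w = len(image), len(image[0])
    dst_h = max(1, round(src_h * scale))
    dst_w = max(1, round(src_w * scale))
    return [
        [
            # Map the center of each output pixel back to the source grid.
            image[min(src_h - 1, int((r + 0.5) / scale))][min(src_w - 1, int((c + 0.5) / scale))]
            for c in range(dst_w)
        ]
        for r in range(dst_h)
    ]


# A toy 10 x 10 band whose values encode their own position (row * 10 + col).
band = [[row * 10 + col for col in range(10)] for row in range(10)]
coarse = resample_nearest(band, src_gsd=3.0, dst_gsd=4.77)
print(len(coarse), len(coarse[0]))  # 6 6
```

The same scale factor, 3 / 4.77 or roughly 0.63, could also serve as the center of a random scaling range in the data augmentation.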

  2. Differences in colors. Satellite imagery companies typically provide two types of products: scientific or analytic. Analytic products are optimized for visual assessment by humans. They have natural and appealing colors for people working with the images. When ConvNets are used with remote sensing data in some applications, e.g., detecting cars or buildings, this may not be important because the classifier looks for structures and shapes. In our scenario, this is not the case. The trees and other features we classify are distinguished more by texture than by shape, which makes our system very dependent on the color content. We can safely assume that changes in an input’s three RGB channels will affect the labels assigned to it. Both the training and evaluation datasets we use in this demo come from Planet, but there’s no guarantee that the red (R), green (G), and blue (B) bands were enhanced in the same way in both. The data collections were produced at different moments in time, and the code and algorithms used for enhancing the visual products may have changed in between. This results in color differences between the training and evaluation datasets.

A possible solution to this issue would be to analyze the distribution of colors in both datasets. We could then apply statistics-based transformations to the images’ color bands so that the two datasets have similar distributions. Another solution could be to try accessing the scientific products instead of the analytic ones. Sadly, the NICFI program does not provide this type of product.
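One simple statistics-based transformation is to shift and rescale each color band so that its mean and standard deviation match those of a reference dataset. The sketch below shows this for a single band given as a flat list of values; the function name and the 0 to 255 clamping (which assumes 8-bit imagery) are our own choices, and histogram matching would be a more thorough alternative.

```python
from statistics import mean, pstdev
from typing import List


def match_mean_std(source: List[float], reference: List[float]) -> List[float]:
    """Linearly rescale `source` so its mean and spread match `reference`."""
    src_mean, src_std = mean(source), pstdev(source)
    ref_mean, ref_std = mean(reference), pstdev(reference)
    if src_std == 0:
        # A flat band carries no contrast to rescale; move it to the reference mean.
        return [ref_mean for _ in source]
    gain = ref_std / src_std
    # Clamp to the 8-bit range, an assumption about the image encoding.
    return [min(255.0, max(0.0, (value - src_mean) * gain + ref_mean)) for value in source]


# Toy values standing in for one band of an evaluation image and of the training set.
evaluation_band = [10.0, 20.0, 30.0]
training_band = [100.0, 120.0, 140.0]
adjusted = match_mean_std(evaluation_band, training_band)
print([round(value, 3) for value in adjusted])
```

In practice the reference statistics would be computed once per band over the whole training set and then applied to every evaluation tile.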


Reviewing correct and incorrect detections is an excellent first approach to understanding how the classifier performs and getting some intuition of the problematic cases. But, to improve it reliably, we need an automatic evaluation of its performance. For implementing such validation, in our particular case, we would need to have ground-truth labels for the NICFI imagery of the region of interest, which are not provided.

There are tree cover and tree loss data sources publicly available online. For example, the information shown in the Global Forest Watch visualization tool comes from a project carried out by the University of Maryland. They characterized the evolution of forests from 2000 to 2019 and provide the classification at a resolution of thirty meters per pixel. This is expected because it is based on Landsat 7 and 8 imagery, which has this GSD. Our classification is based on images with a finer resolution of 4.77 meters per pixel. Nevertheless, the data provided by the University of Maryland can still be used as ground truth. We are classifying tiles of 224x224 pixels, which, at 4.77 m/px, correspond to a region of about 1068x1068 meters. This area is spanned by roughly 36x36 pixels of the data source provided at 30 m/px. A possible strategy for using this data as a reference of tree cover and tree loss would be to find the most prevalent label among the pixels inside the tile’s footprint and assign it to the tile.
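The footprint arithmetic and the majority-vote strategy can be sketched as follows. The function names and the example labels are hypothetical. With the 4.77 m GSD quoted for the NICFI mosaics, a tile side spans about 35.6 reference pixels; with the value rounded to 4.7 m, it spans about 35.1.

```python
from collections import Counter
from typing import List


def reference_pixels_per_side(tile_px: int, tile_gsd_m: float, ref_gsd_m: float) -> float:
    """Number of reference pixels spanned by one side of a classified tile."""
    return tile_px * tile_gsd_m / ref_gsd_m


def majority_label(reference_pixels: List[List[str]]) -> str:
    """Most frequent label among the reference pixels inside a tile's footprint."""
    counts = Counter(label for row in reference_pixels for label in row)
    # On a tie, Counter returns the label that was encountered first.
    return counts.most_common(1)[0][0]


span = reference_pixels_per_side(224, 4.77, 30.0)
print(round(span, 1))  # 35.6

# A toy 2 x 2 footprint with made-up labels.
footprint = [["forest", "forest"], ["loss", "forest"]]
print(majority_label(footprint))  # forest
```

A majority vote discards small clearings inside a mostly forested tile, so a threshold on the fraction of "loss" pixels might be a better fit if small-scale deforestation matters.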

Explore other data sources

There are other sources of data we are currently not using that can help improve the classification performance.

One of these is the near-infrared (NIR) spectral band. It captures information that is not available in the visible RGB bands we see with the naked eye. This information is useful for assessing changes in vegetated areas. The NIR band is typically combined with the red band to obtain the Normalized Difference Vegetation Index (NDVI). Variations in this index allow us to measure changes in live green vegetation.
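For reference, NDVI is defined per pixel as (NIR - Red) / (NIR + Red). A minimal sketch follows; returning 0.0 when both bands are zero is a convention chosen here to avoid a division by zero, not part of the index definition.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for a single pixel."""
    denominator = nir + red
    if denominator == 0:
        return 0.0  # convention chosen here for pixels with no signal
    return (nir - red) / denominator


# Healthy vegetation reflects strongly in NIR and absorbs red light.
print(round(ndvi(nir=0.5, red=0.1), 3))  # 0.667
# Bare surfaces reflect the two bands more evenly.
print(round(ndvi(nir=0.3, red=0.25), 3))  # 0.091
```

For non-negative inputs the index falls between -1 and 1, with higher values indicating denser green vegetation, so a drop in NDVI between two dates for the same tile is a candidate sign of vegetation loss.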

The NIR band is available in both the training and evaluation datasets. In the former, it is included in the GeoTIFF files, which we can use instead of the JPEG previews, as we mentioned in the previous part of the article when describing the training data. Before jumping into using the NIR band, it is good advice to review some messages posted by competitors in the Kaggle challenge, e.g., here and here. They reported not seeing an improvement in performance when using the GeoTIFF files instead of their JPEG counterparts.

Another useful source of data could be three-dimensional information about the forest canopy. We can obtain it using two different approaches: either capturing it with a LiDAR device or reconstructing it using stereoscopy.

Height map of forest canopy obtained with LiDAR.

Using three-dimensional information is an approach that several people working on forest monitoring have already reported as beneficial. For instance, Pachama reports using it as one of their data sources.

Another data type that has proven helpful for monitoring forests comes from Synthetic Aperture Radar (SAR) satellites.

In summary, there are many diverse types of data coming from satellites using varied capture strategies; this is good news for us as they may complement each other by capturing different features of the forests. The image below is an excellent review of data sources available to monitor forests.

Useful satellites for forest monitoring — Taken from Harry Carstairs (@harry_carstairs on Twitter)


This article allows us to see first-hand the impact of human activities on forests over time. It is one example of the vast amount of information available to understand how we impact ecosystems. It’s not the only one; there is plenty of data showing how we disrupt natural landscapes. How we unbalanced the sources and sinks of greenhouse gases in the atmosphere is well understood. There is extensive evidence that climate change is real. The latest report from the Intergovernmental Panel on Climate Change (IPCC) of the United Nations (UN) confirms what we already knew from previous assessment reports.

Despite all this, there are still people denying climate change or our responsibility in causing it. We should not be pessimistic about it. This is not the first time that data showing how human activity affects the environment has been met with criticism and denial. The use of chlorofluorocarbons (CFCs) became widespread in the 1960s. These compounds were used in industry as well as in households, in air conditioning units and aerosols like hairspray. When scientists analyzed the changes in the ozone layer, it became well understood that we were damaging it through the extensive use of products containing those chemicals. The discovery initially faced various attempts by deniers and by companies using the harmful compounds to discredit the research. The president of one aerosol manufacturing company even claimed that the KGB was behind the criticism of CFCs. The scientific findings finally won the dispute, giving rise to the Montreal Protocol of 1987. This treaty phased out the production of ozone-depleting substances such as CFCs. The UN estimates that the hole has shrunk between one and three percent per decade since then. Several articles (e.g., here and here) show how the healing of the ozone layer is closely related to this change in our behavior and is not just a lucky coincidence.

Conserving existing forests and carrying out projects to increase tree-covered regions are part of the solution. Carbon offset initiatives are also a great way to provide economic incentives for these activities. But increasing forest coverage or developing more technological solutions to remove greenhouse gases from the atmosphere will not completely solve the problem. To reach the point where the concentration of such gases starts to decrease, we need to cut emissions as much as possible, as soon as we can. It will not be an easy task, but it is one worth pursuing. This is explained very well in the article “To Stop Climate Change, Time is as Important as Tech” by Dr. Jonathan Foley (Executive Director of Project Drawdown).

We’re in a race against time. Luckily we have science and data as our allies. It is up to us to use this information to shape our behavior and push policymakers and companies to implement changes to reduce emissions.