Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention (CVPR 2020, Oral)
PyTorch implementation of the model presented in "Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention" published at CVPR 2020.
Paper abstract:
Satellite image time series, bolstered by their growing availability, are at the forefront of an extensive effort towards automated Earth monitoring by international institutions. In particular, large-scale control of agricultural parcels is an issue of major political and economic importance. In this regard, hybrid convolutional-recurrent neural architectures have shown promising results for the automated classification of satellite image time series. We propose an alternative approach in which the convolutional layers are advantageously replaced with encoders operating on unordered sets of pixels to exploit the typically coarse resolution of publicly available satellite images. We also propose to extract temporal features using a bespoke neural architecture based on self-attention instead of recurrent networks. We demonstrate experimentally that our method not only outperforms previous state-of-the-art approaches in terms of precision, but also significantly decreases processing time and memory requirements. Lastly, we release a large open-access annotated dataset as a benchmark for future work on satellite image time series.
- 17.08.2021 Check out our new approach for panoptic segmentation of satellite image time series, as well as our new benchmark dataset for semantic and panoptic segmentation of satellite image time series.
- 17.07.2020 Check out our lightweight version of the TAE: a channel grouping strategy brings better performance with 10 times fewer parameters.
- 30.03.2020 Dataset preparation script available in the 'preprocessing' folder + variation of the PixelSetData class that loads all samples to RAM at init.
- 12.03.2020 Bug fix in the TAE script (see pull request comments): if you were using a previous version, re-download the pre-trained weights.
- Pytorch + torchnet
- numpy + pandas + sklearn
The code has been tested in the following environment:
Ubuntu 18.04.1 LTS, python 3.6.6, pytorch 1.1.0, CUDA 10.0
A toy version of the Pixel-set dataset can be directly downloaded here, to get an idea of the dataset structure.
The complete Pixel-set and Pixel-patch datasets are accessible on Zenodo at the following links:
We also provide the pre-trained weights for inference.
- The PyTorch implementations of the PSE, TAE and PSE+TAE architectures are located in the models folder.
- The folder learning contains some additional utilities that are used for training.
- The repository also contains two high-level scripts, train.py and inference.py, that should make it easier to get started.
Run the train.py script to reproduce the results of the PSE+TAE architecture presented in the paper.
You will just need to specify the path to the Pixel-Set dataset (link above) with the --dataset_folder argument.
The default settings of the train.py script are those used to produce the results in the paper.
Yet, some options are already implemented to play around with the model's hyperparameters and other training settings.
These options are accessible through an argparse menu (see directly inside the script).
- You can use the pre-trained weights in the inference.py script to produce predictions on our dataset or your own, provided that it is formatted as per the indications below. You will need to pass the path to the unzipped folder containing the weights with the --weight_dir argument. (Do not uncompress the model.pth.tar files, as the script takes care of this.)
- The two components of our model (the PSE and the TAE) are implemented as stand-alone pytorch nn.Modules (in pse.py and tae.py) and can be used for other applications. While the PSE needs to be used in combination with the PixelSetData class, the TAE can be applied to any sequential data (with input tensors of shape batch_size x sequence_length x embedding_size).
In order to use the PixelSetData dataset class with data other than those provided in the link above, the data folder should be structured in the following fashion:
Each dataset sample consists of the different observations for a single parcel.
The observations are aggregated in a single array of shape TxCxS with T the number of temporal observations,
C the number of channels, and S the number of pixels in the parcel (different for each data sample).
Each of these arrays should be stored separately in a numpy file: unique_id_of_the_sample.npy
All the individual .npy files are stored in the same sub-directory DATA.
The normalisation values should be computed beforehand and stored in the form of a tuple of arrays (means, stds) in a pickle file in the main folder. The PixelSetData dataset class can adapt to different normalisation strategies depending on the shape of the arrays:
- Channel-wise normalisation for each date → the arrays have shape (TxC)
- Channel-wise normalisation → the arrays have shape (C,)
- Global normalisation → in that case each of the two arrays consists of a single value.
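The shape-dependent behaviour listed above can be sketched as follows. This is only an illustration under stated assumptions: `normalise_sample` is a hypothetical helper, not the code of the PixelSetData class, and it assumes that a 1-D array holds one value per channel and that a sample has shape T x C x S.

```python
import numpy as np


def normalise_sample(x, means, stds):
    """Normalise one sample of shape (T, C, S).

    Hypothetical helper, not part of the repository. The strategy is
    inferred from the number of dimensions of `means`.
    """
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    if means.ndim == 0:
        # Global normalisation: a single mean and a single std.
        return (x - means) / stds
    if means.ndim == 1:
        # Channel-wise: one value per channel, broadcast over dates and pixels.
        return (x - means[None, :, None]) / stds[None, :, None]
    if means.ndim == 2:
        # Channel-wise for each date: shape (T, C), broadcast over pixels.
        return (x - means[:, :, None]) / stds[:, :, None]
    raise ValueError("means/stds must be scalars, 1-D or 2-D arrays")


rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(4, 3, 10))  # T=4, C=3, S=10

per_date = normalise_sample(x, x.mean(axis=2), x.std(axis=2))
per_channel = normalise_sample(x, x.mean(axis=(0, 2)), x.std(axis=(0, 2)))
global_norm = normalise_sample(x, x.mean(), x.std())
```

In all three cases the output keeps the T x C x S shape of the input; only the granularity of the statistics changes.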
The labels should be stored in the META/labels.json file. This file has a nested, dictionary-like structure and can contain multiple nomenclatures:
labels.json = {
"Name_of_nomenclature1": {
"unique_id_0": label_0,
...,
"unique_id_N": label_N,
},
"Name_of_nomenclature2": {
"unique_id_0": label_0,
...,
"unique_id_N": label_N,
}
}
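As a rough illustration of this layout, the snippet below writes a toy labels.json and reads one nomenclature back. The sample ids, label values and temporary folder are made up for the example; only the nested structure comes from the description above.

```python
import json
import os
import tempfile

# Toy content following the nested structure described above.
# The sample ids and label values are invented for this example.
labels = {
    "Name_of_nomenclature1": {"10": 0, "11": 3, "12": 1},
    "Name_of_nomenclature2": {"10": 0, "11": 1, "12": 0},
}

meta_dir = os.path.join(tempfile.mkdtemp(), "META")
os.makedirs(meta_dir)
labels_path = os.path.join(meta_dir, "labels.json")
with open(labels_path, "w") as f:
    json.dump(labels, f, indent=2)

# Read the file back and select a single nomenclature.
with open(labels_path) as f:
    loaded = json.load(f)

nomenclature = loaded["Name_of_nomenclature1"]
sample_ids = sorted(nomenclature)  # JSON object keys are always strings
targets = [nomenclature[i] for i in sample_ids]
```

Because JSON object keys are strings, the sample ids must be compared as strings (or converted explicitly) when they are matched against the .npy file names.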
The dates of the observations, if they are going to be used for the positional encoding,
should be stored in YYYYMMDD format in the META/dates.json file:
dates.json = {
1: date_0,
...,
T: date_T,
}
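One plausible way to turn such dates into positions is to count the days elapsed since the first observation. The sketch below assumes that convention and uses made-up dates; the description above only fixes the storage format, not how the positions are derived.

```python
import json
from datetime import datetime

# Toy dates.json content; the dates themselves are invented.
dates_json = '{"1": 20170103, "2": 20170113, "3": 20170202, "4": 20170304}'
dates = json.loads(dates_json)

# JSON keys are strings, so sort them numerically to recover the time order.
ordered = [dates[k] for k in sorted(dates, key=int)]

# Parse YYYYMMDD and express each date as days since the first observation.
parsed = [datetime.strptime(str(d), "%Y%m%d") for d in ordered]
positions = [(p - parsed[0]).days for p in parsed]
print(positions)  # [0, 10, 30, 60]
```

Sorting the keys with `key=int` avoids the lexicographic trap where "10" would come before "2" in longer series.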
If some pre-computed static parcel features are to be used between the two MLPs of the PSE,
they should be stored in another json file META/name_of_features.json:
name_of_features.json = {
"unique_id_0": features_0,
...,
"unique_id_N": features_N,
}
The dataset folder should thus have the following structure:
Dataset_folder
│ normalisation_values.pkl
└─DATA
│ │ sample0.npy
│ │ . . .
│ │ sampleN.npy
└─META
│ labels.json
│ dates.json
│ geomfeat.json
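To make the layout concrete, here is a minimal sketch that builds a toy folder with this structure from random data. The array sizes, labels, dates and the temporary root directory are all invented for illustration, the optional static-features file is left out, and the normalisation shown is the channel-wise variant.

```python
import json
import os
import pickle
import tempfile

import numpy as np

rng = np.random.default_rng(0)
T, C = 5, 10                # number of dates and channels (toy values)
root = tempfile.mkdtemp()   # stands in for Dataset_folder
os.makedirs(os.path.join(root, "DATA"))
os.makedirs(os.path.join(root, "META"))

labels, samples = {}, []
for sample_id in range(3):
    S = int(rng.integers(20, 50))                 # pixel count varies per parcel
    x = rng.random((T, C, S)).astype(np.float32)  # one T x C x S array per parcel
    np.save(os.path.join(root, "DATA", f"{sample_id}.npy"), x)
    labels[str(sample_id)] = sample_id % 2        # made-up class labels
    samples.append(x)

# Channel-wise statistics over all dates and all pixels of all parcels.
stacked = np.concatenate(samples, axis=2)         # T x C x (sum of S)
means, stds = stacked.mean(axis=(0, 2)), stacked.std(axis=(0, 2))
with open(os.path.join(root, "normalisation_values.pkl"), "wb") as f:
    pickle.dump((means, stds), f)

with open(os.path.join(root, "META", "labels.json"), "w") as f:
    json.dump({"Name_of_nomenclature1": labels}, f)
with open(os.path.join(root, "META", "dates.json"), "w") as f:
    json.dump({str(t + 1): 20170101 + t for t in range(T)}, f)
```

The statistics are computed once over the whole toy dataset and pickled as a (means, stds) tuple, matching the requirement that normalisation values be prepared beforehand.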
- The Temporal Attention Encoder is heavily inspired by the works of Vaswani et al. on the Transformer, and this pytorch implementation served as code base for the TAE.py script.
- Credits to github.com/clcarwin/ for the pytorch implementation of the focal loss
In case you use part of the present code, please include a citation to the following paper:
@article{garnot2019psetae,
title={Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention},
author={Sainte Fare Garnot, Vivien and Landrieu, Loic and Giordano, Sebastien and Chehata, Nesrine},
journal={CVPR},
year={2020}
}