Release blank / nonblank model (#228)
* add bnb weights and model option

* update docs and add model as a test option

* missing bracket

* correct load validation

* update name and version number

* add official models folder for blank nonblank

* format

* add bnb template and add mdlite image size to td

* reset patience

* always keep blank model if binary and it exists

* update comment

* test real pred for species and blank model

* note default in doc string

* WIP add number of videos to table and include blank nonblank

* simplify redundancy and finish BNB

* update to four models and add bnb

* build docs

* tweak

* lowercase label column before OHE

* fix test because we are now lowercasing species

* test case for blank

* remove model count
ejm714 authored Sep 23, 2022
1 parent 2a4e9fc commit 291b34f
Showing 20 changed files with 364 additions and 59 deletions.
4 changes: 2 additions & 2 deletions docs/docs/configurations.md
@@ -188,7 +188,7 @@ Path to a model checkpoint to load and use for inference. If you train your own

#### `model_name (time_distributed|slowfast|european, optional)`

Name of the model to use for inference. The three model options that ship with `zamba` are `time_distributed`, `slowfast`, and `european`. See the [Available Models](models/species-detection.md) page for details. Defaults to `time_distributed`
Name of the model to use for inference. The model options that ship with `zamba` are `blank_nonblank`, `time_distributed`, `slowfast`, and `european`. See the [Available Models](models/species-detection.md) page for details. Defaults to `time_distributed`

#### `gpus (int, optional)`

@@ -301,7 +301,7 @@ A [PyTorch learning rate schedule](https://pytorch.org/docs/stable/optim.html#ho

#### `model_name (time_distributed|slowfast|european, optional)`

Name of the model to use for inference. The three model options that ship with `zamba` are `time_distributed`, `slowfast`, and `european`. See the [Available Models](models/species-detection.md) page for details. Defaults to `time_distributed`
Name of the model to use for inference. The model options that ship with `zamba` are `blank_nonblank`, `time_distributed`, `slowfast`, and `european`. See the [Available Models](models/species-detection.md) page for details. Defaults to `time_distributed`

#### `dry_run (bool, optional)`

2 changes: 1 addition & 1 deletion docs/docs/debugging.md
@@ -36,7 +36,7 @@ The dry run will also catch any GPU memory errors. If you hit a GPU memory error

#### Decreasing video size

Resize video frames to be smaller before they are passed to the model. The default for all three models is 240x426 pixels. `model_input_height` and `model_input_width` cannot be passed directly to the command line, so if you are using the CLI these must be specified in a [YAML file](yaml-config.md).
Resize video frames to be smaller before they are passed to the model. The default for all models is 240x426 pixels. `model_input_height` and `model_input_width` cannot be passed directly to the command line, so if you are using the CLI these must be specified in a [YAML file](yaml-config.md).

If you are using MegadetectorLite to select frames (which is the default for the official models we ship with), you can also decrease the size of the frame used at this stage by setting [`frame_selection_height` and `frame_selection_width`](configurations/#frame_selection_height-int-optional-frame_selection_width-int-optional).

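For instance, a YAML file along these lines would shrink the frames (the 120x213 values here are illustrative, not defaults):

```yaml
video_loader_config:
  model_input_height: 120   # hypothetical smaller-than-default values
  model_input_width: 213
```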
4 changes: 2 additions & 2 deletions docs/docs/extra-options.md
@@ -31,7 +31,7 @@ The options for `weight_download_region` are `us`, `eu`, and `asia`. Once a mode

## Video size

When `zamba` loads videos prior to either inference or training, it resizes all of the video frames before feeding them into a model. Higher resolution videos will lead to superior accuracy in prediction, but will use more memory and take longer to train and/or predict. The default video loading configuration for all three pretrained models resizes images to 240x426 pixels.
When `zamba` loads videos prior to either inference or training, it resizes all of the video frames before feeding them into a model. Higher resolution videos will lead to superior accuracy in prediction, but will use more memory and take longer to train and/or predict. The default video loading configuration for all pretrained models resizes images to 240x426 pixels.

Say that you have a large number of videos, and you are more concerned with detecting blank v. non-blank videos than with identifying different species. In this case, you may not need a very high resolution and iterating through all of your videos with a high resolution would take a very long time. For example, to resize all images to 150x150 pixels instead of the default 240x426:

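A sketch of such a configuration (only the 150x150 values are given by the text; the structure mirrors the video loader options shown elsewhere in this commit):

```yaml
video_loader_config:
  model_input_height: 150
  model_input_width: 150
```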
@@ -113,7 +113,7 @@ A simple option is to sample frames that are evenly distributed throughout a vid

### MegadetectorLite

You can use a pretrained object detection model called [MegadetectorLite](models/species-detection.md#megadetectorlite) to select only the frames that are most likely to contain an animal. This is the default strategy for all three pretrained models. The parameter `megadetector_lite_config` is used to specify any arguments that should be passed to the MegadetectorLite model. If `megadetector_lite_config` is None, the MegadetectorLite model will not be used.
You can use a pretrained object detection model called [MegadetectorLite](models/species-detection.md#megadetectorlite) to select only the frames that are most likely to contain an animal. This is the default strategy for all pretrained models. The parameter `megadetector_lite_config` is used to specify any arguments that should be passed to the MegadetectorLite model. If `megadetector_lite_config` is None, the MegadetectorLite model will not be used.

For example, to take the 16 frames with the highest probability of detection:

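A configuration along these lines would do it (a sketch mirroring the template settings added in this commit; the exact example may differ):

```yaml
video_loader_config:
  megadetector_lite_config:
    confidence: 0.25
    fill_mode: score_sorted
    n_frames: 16
```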
5 changes: 4 additions & 1 deletion docs/docs/index.md
@@ -5,7 +5,10 @@
[![codecov](https://codecov.io/gh/drivendataorg/zamba/branch/master/graph/badge.svg)](https://codecov.io/gh/drivendataorg/zamba)
<!-- [![PyPI](https://img.shields.io/pypi/v/zamba.svg)](https://pypi.org/project/zamba/) -->

<div class="embed-responsive embed-responsive-16by9" width=500> <iframe width=600 height=340 class="embed-responsive-item" src="https://s3.amazonaws.com/drivendata-public-assets/monkey-vid.mp4" frameborder="0" allowfullscreen=""></iframe></div>

<div class="embed-responsive embed-responsive-16by9" width=500>
<iframe width=600 height=340 class="embed-responsive-item" src="https://s3.amazonaws.com/drivendata-public-assets/monkey-vid.mp4"
frameborder="0" allowfullscreen=""></iframe></div>

> *Zamba* means "forest" in Lingala, a Bantu language spoken throughout the Democratic Republic of the Congo and the Republic of the Congo.
Expand Down
79 changes: 54 additions & 25 deletions docs/docs/models/species-detection.md
@@ -1,6 +1,6 @@
# Available models

The algorithms in `zamba` are designed to identify species of animals that appear in camera trap videos. There are three models that ship with the `zamba` package: `time_distributed`, `slowfast`, and `european`. For more details of each, read on!
The algorithms in `zamba` are designed to identify species of animals that appear in camera trap videos. The pretrained models that ship with the `zamba` package are: `blank_nonblank`, `time_distributed`, `slowfast`, and `european`. For more details of each, read on!

## Model summary

@@ -10,34 +10,52 @@ The algorithms in `zamba` are designed to identify species of animals that appea
<th>Geography</th>
<th>Relative strengths</th>
<th>Architecture</th>
<th>Number of training videos</th>
</tr>
<tr>
<td><code>blank_nonblank</code></td>
<td>Central Africa, West Africa, and Western Europe</td>
<td>Just blank detection, without species classification </td>
<td>Image-based <code>TimeDistributedEfficientNet</code></td>
<td>~263,000</td>
</tr>
<tr>
<td><code>time_distributed</code></td>
<td>Central and West Africa</td>
<td>Better than <code>slowfast</code> at duikers, chimps, and gorillas and other larger species</td>
<td>Recommended species classification model for jungle ecologies</td>
<td>Image-based <code>TimeDistributedEfficientNet</code></td>
<td>~250,000</td>
</tr>
<tr>
<td><code>slowfast</code></td>
<td>Central and West Africa</td>
<td>Better than <code>time_distributed</code> at blank detection and small species detection</td>
<td>Potentially better than <code>time_distributed</code> at small species detection</td>
<td>Video-native <code>SlowFast</code></td>
<td>~15,000</td>
</tr>
<tr>
<td><code>european</code></td>
<td>Western Europe</td>
<td>Trained on non-jungle ecologies</td>
<td>Finetuned <code>time_distributed</code> model</td>
<td>~13,000</td>
</tr>
</table>

The models trained on the largest datasets took a couple of weeks to train on a single GPU machine. Some models will be updated in the future, and you can always check the [changelog](../../changelog) to see if there have been updates.

All models support training, fine-tuning, and inference. For fine-tuning, we recommend using the `time_distributed` model as the starting point.

<h2 id="species-classes"></h2>

## What species can `zamba` detect?

`time_distributed` and `slowfast` are both trained to identify 32 common species from Central and West Africa. The output labels in these models are:
The `blank_nonblank` model is trained to do blank detection without species classification. The output labels from this model are:

* `blank`
* `nonblank`

The `time_distributed` and `slowfast` models are both trained to identify 32 common species from Central and West Africa. The output labels in these models are:

* `aardvark`
* `antelope_duiker`
@@ -72,7 +90,7 @@ All models support training, fine-tuning, and inference. For fine-tuning, we rec
* `small_cat`
* `wild_dog_jackal`

`european` is trained to identify 11 common species in Western Europe. The possible class labels are:
The `european` model is trained to identify 11 common species in Western Europe. The possible class labels are:

* `bird`
* `blank`
@@ -86,6 +104,25 @@ All models support training, fine-tuning, and inference. For fine-tuning, we rec
* `weasel`
* `wild_boar`

<a id='blank-nonblank'></a>

## `blank_nonblank` model

### Architecture

The `blank_nonblank` model uses the same [architecture](#time-distributed) as the `time_distributed` model, but it has only one output class because this is a binary classification problem.

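With a single output class, the two labels' probabilities come from one sigmoid rather than a softmax over many species. A minimal sketch of the idea (the function name and the convention that the logit scores "blank" are assumptions for illustration, not zamba's API):

```python
import math

def blank_nonblank_probs(logit: float) -> dict:
    """Turn a single output logit into the two class probabilities.

    Illustrative sketch only -- the name and the assumption that the
    logit scores "blank" are hypothetical, not zamba's actual API.
    """
    p_blank = 1.0 / (1.0 + math.exp(-logit))  # sigmoid
    return {"blank": p_blank, "nonblank": 1.0 - p_blank}

probs = blank_nonblank_probs(0.0)  # a zero logit leaves both classes at 0.5
```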
### Default configuration

The full default configuration is available on [Github](https://github.com/drivendataorg/zamba/blob/master/zamba/models/official_models/blank_nonblank/config.yaml).

The `blank_nonblank` model uses the same [default configuration](#time-distributed-config) as the `time_distributed` model. For the frame selection, an efficient object detection model called [MegadetectorLite](#megadetectorlite) is run on all frames to determine which are the most likely to contain an animal. Then the classification model is run on only the 16 frames with the highest predicted probability of detection.

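Following the CLI pattern shown in the quickstart, running this model presumably looks like the following (`example_vids/` is a placeholder directory):

```console
$ zamba predict --data-dir example_vids/ --model blank_nonblank
```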
### Training data

The `blank_nonblank` model was trained on all of the data used for the [`time_distributed`](#time-distributed-training-data) and [`european`](#european-training-data) models.


<a id='time-distributed'></a>

## `time_distributed` model
@@ -98,7 +135,7 @@ The `time_distributed` model was built by re-training a well-known image classif

### Training data

`time_distributed` was trained using data collected and annotated by trained ecologists from Cameroon, Central African Republic, Democratic Republic of the Congo, Gabon, Guinea, Liberia, Mozambique, Nigeria, Republic of the Congo, Senegal, Tanzania, and Uganda, as well as citizen scientists on the [Chimp&See](https://www.chimpandsee.org/) platform.
The `time_distributed` model was trained using data collected and annotated by trained ecologists from Cameroon, Central African Republic, Democratic Republic of the Congo, Gabon, Guinea, Liberia, Mozambique, Nigeria, Republic of the Congo, Senegal, Tanzania, and Uganda, as well as citizen scientists on the [Chimp&See](https://www.chimpandsee.org/) platform.

The data included camera trap videos from:

@@ -197,7 +234,7 @@ The data included camera trap videos from:
</tr>
</table>

The most recent release of trained models took around 2-3 days to train on a single GPU machine on approximately 14,000 1-minute long videos for the African species, and around 13,000 videos for the European species. These models will be updated in the future, and you can always check the [changelog](../../changelog) to see if there have been updates.
<a id='time-distributed-config'></a>

### Default configuration

@@ -218,6 +255,9 @@ video_loader_config:
    confidence: 0.25
    fill_mode: score_sorted
    n_frames: 16
    frame_batch_size: 24
    image_height: 640
    image_width: 640
```
You can choose different frame selection methods and vary the size of the images that are used by passing in a custom [YAML configuration file](../yaml-config.md). The only requirement for the `time_distributed` model is that the video loader must return 16 frames.
@@ -240,7 +280,7 @@ Unlike `time_distributed`, `slowfast` is video native. This means it takes into

### Training data

The `slowfast` model was trained using the same data as the [`time_distributed` model](#time-distributed-training-data).
The `slowfast` model was trained on a subset of the [data used](#time-distributed-training-data) for the `time_distributed` model.

### Default configuration

Expand All @@ -262,6 +302,8 @@ video_loader_config:
confidence: 0.25
fill_mode: score_sorted
n_frames: 32
image_height: 416
image_width: 416
```

You can choose different frame selection methods and vary the size of the images that are used by passing in a custom [YAML configuration file](../yaml-config.md). The two requirements for the `slowfast` model are that:
@@ -275,7 +317,9 @@ You can choose different frame selection methods and vary the size of the images

### Architecture

The `european` model starts from the trained `time_distributed` model, and then replaces and trains the final output layer to predict European species.
The `european` model starts from a previous version of the `time_distributed` model, and then replaces and trains the final output layer to predict European species.

<a id='european-training-data'></a>

### Training data

@@ -285,22 +329,7 @@ The `european` model is finetuned with data collected and annotated by partners

The full default configuration is available on [Github](https://github.com/drivendataorg/zamba/blob/master/zamba/models/official_models/european/config.yaml).

The `european` model uses the same frame selection as the `time_distributed` model. By default, an efficient object detection model called [MegadetectorLite](#megadetectorlite) is run on all frames to determine which are the most likely to contain an animal. Then `european` is run on only the 16 frames with the highest predicted probability of detection. By default, videos are resized to 240x426 pixels following frame selection.

The full default video loading configuration is:
```yaml
video_loader_config:
  model_input_height: 240
  model_input_width: 426
  crop_bottom_pixels: 50
  fps: 4
  total_frames: 16
  ensure_total_frames: true
  megadetector_lite_config:
    confidence: 0.25
    fill_mode: score_sorted
    n_frames: 16
```
The `european` model uses the same [default configuration](#time-distributed-config) as the `time_distributed` model.

As with all models, you can choose different frame selection methods and vary the size of the images that are used by passing in a custom [YAML configuration file](../yaml-config.md). The only requirement for the `european` model is that the video loader must return 16 frames.

2 changes: 1 addition & 1 deletion docs/docs/predict-tutorial.md
@@ -25,7 +25,7 @@ To run `zamba predict` in the command line, you must specify `--data-dir` and/or
* **`--data-dir PATH`:** Path to the folder containing your videos. If you don't also provide `filepaths`, Zamba will recursively search this folder for videos.
* **`--filepaths PATH`:** Path to a CSV file with a column for the filepath to each video you want to classify. The CSV must have a column for `filepath`. Filepaths can be absolute on your system or relative to the data directory that you provide in `--data-dir`.

All other flags are optional. To choose the model you want to use for prediction, either `--model` or `--checkpoint` must be specified. Use `--model` to specify one of the three [pretrained models](models/species-detection.md) that ship with `zamba`. Use `--checkpoint` to run inference with a locally saved model. `--model` defaults to [`time_distributed`](models/species-detection.md#what-species-can-zamba-detect).
All other flags are optional. To choose the model you want to use for prediction, either `--model` or `--checkpoint` must be specified. Use `--model` to specify one of the [pretrained models](models/species-detection.md) that ship with `zamba`. Use `--checkpoint` to run inference with a locally saved model. `--model` defaults to [`time_distributed`](models/species-detection.md#what-species-can-zamba-detect).

## Basic usage: Python package

2 changes: 1 addition & 1 deletion docs/docs/quickstart.md
@@ -81,7 +81,7 @@ eleph.mp4,elephant
leopard.mp4,leopard
```

There are three pretrained models that ship with `zamba`: `time_distributed`, `slowfast`, and `european`. Which model you should use depends on your priorities and geography (see the [Available Models](models/species-detection.md) page for more details). By default `zamba` will use the `time_distributed` model. Add the `--model` argument to specify one of other options:
Several pretrained models ship with `zamba`: `blank_nonblank`, `time_distributed`, `slowfast`, and `european`. Which model you should use depends on your priorities and geography (see the [Available Models](models/species-detection.md) page for more details). By default `zamba` will use the `time_distributed` model. Add the `--model` argument to specify one of the other options:

```console
$ zamba predict --data-dir example_vids/ --model slowfast
43 changes: 43 additions & 0 deletions templates/blank_nonblank.yaml
@@ -0,0 +1,43 @@
train_config:
  # data_dir: YOUR_DATA_DIR HERE
  # labels: YOUR_LABELS_CSV_HERE
  model_name: blank_nonblank
  backbone_finetune_config:
    backbone_initial_ratio_lr: 0.01
    multiplier: 1
    pre_train_bn: true
    train_bn: false
    unfreeze_backbone_at_epoch: 3
    verbose: true
  early_stopping_config:
    patience: 5
  scheduler_config:
    scheduler: MultiStepLR
    scheduler_params:
      gamma: 0.5
      milestones:
        - 3
      verbose: true

video_loader_config:
  model_input_height: 240
  model_input_width: 426
  crop_bottom_pixels: 50
  fps: 4
  total_frames: 16
  ensure_total_frames: true
  megadetector_lite_config:
    confidence: 0.25
    fill_mode: score_sorted
    frame_batch_size: 24
    image_height: 640
    image_width: 640
    n_frames: 16

predict_config:
  # data_dir: YOUR_DATA_DIR HERE
  # or
  # filepaths: YOUR_FILEPATH_CSV_HERE
  model_name: blank_nonblank
  # or
  # checkpoint: YOUR_CKPT_HERE
3 changes: 3 additions & 0 deletions templates/time_distributed.yaml
@@ -29,6 +29,9 @@ video_loader_config:
  megadetector_lite_config:
    confidence: 0.25
    fill_mode: score_sorted
    frame_batch_size: 24
    image_height: 640
    image_width: 640
    n_frames: 16

predict_config:
7 changes: 5 additions & 2 deletions tests/test_cli.py
@@ -71,7 +71,7 @@ def test_shared_cli_options(mocker, minimum_valid_train, minimum_valid_predict):
    assert "Config file: None" in result.output

    # check all models options are valid
    for model in ["time_distributed", "slowfast", "european"]:
    for model in ["time_distributed", "slowfast", "european", "blank_nonblank"]:
        result = runner.invoke(app, command + ["--model", model])
        assert result.exit_code == 0

@@ -154,7 +154,8 @@ def test_predict_specific_options(mocker, minimum_valid_predict, tmp_path):  # n
    assert result.exit_code == 0


def test_actual_prediction_on_single_video(tmp_path):  # noqa: F811
@pytest.mark.parametrize("model", ["time_distributed", "blank_nonblank"])
def test_actual_prediction_on_single_video(tmp_path, model):  # noqa: F811
    data_dir = tmp_path / "videos"
    data_dir.mkdir()
    shutil.copy(TEST_VIDEOS_DIR / "data" / "raw" / "benjamin" / "04250002.MP4", data_dir)
@@ -172,6 +173,8 @@ def test_actual_prediction_on_single_video(tmp_path):  # noqa: F811
            "--yes",
            "--save-dir",
            str(save_dir),
            "--model",
            model,
        ],
    )
    assert result.exit_code == 0