[CVPRW 2024] Cluster Self-Refinement for Enhanced Online Multi-Camera People Tracking

The official repository for the 8th NVIDIA AI City Challenge (Track 1: Multi-Camera People Tracking) from team NetsPresso (Nota Inc.).



# git clone this repository
git clone
cd AIC2024_Track1_Nota

Prepare Datasets

The official dataset is available for download at the website of the AI City Challenge.
To get the password to download it, you must complete the dataset request form.
(We are not permitted to share the dataset, per the DATASET LICENSE AGREEMENT from the dataset author(s).)

  1. After unzipping the dataset zip files, make sure the data structure is as follows:
└── data
    └── videos
        ├── train
        │   ├── scene_001
        │   │   ├── camera_0001
        │   │   │   ├── calibration.json
        │   │   │   └── video.mp4
        │   │   ├── ...
        │   │   └── ground_truth.txt
        │   ├── scene_002
        │   ├── ...
        ├── val
        │   ├── ...
        └── test
            ├── ...
  2. Generate datasets from videos
  • Option 1: If you want to train the object detection and re-identification models
bash scripts/
  • Option 2: If you want to use the pre-trained models
bash scripts/
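
Before generating the datasets, it can help to confirm that the unzipped files match the layout in step 1. The following is a minimal sketch, not part of this repository: the function name `find_layout_problems` is hypothetical, and checking `ground_truth.txt` only under `train` is an assumption based on the tree shown above.

```python
from pathlib import Path


def find_layout_problems(data_root):
    """Return a list of human-readable problems found under data_root/videos."""
    problems = []
    videos = Path(data_root) / "videos"
    for split in ("train", "val", "test"):
        split_dir = videos / split
        if not split_dir.is_dir():
            problems.append(f"missing split directory: {split_dir}")
            continue
        for scene in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            # Assumption: the tree above only shows ground_truth.txt for the
            # train split, so it is only required there.
            if split == "train" and not (scene / "ground_truth.txt").is_file():
                problems.append(f"missing ground truth: {scene / 'ground_truth.txt'}")
            cameras = sorted(p for p in scene.iterdir() if p.is_dir())
            if not cameras:
                problems.append(f"no camera directories in: {scene}")
            for camera in cameras:
                # Each camera directory should hold one calibration file and one video.
                for name in ("calibration.json", "video.mp4"):
                    if not (camera / name).is_file():
                        problems.append(f"missing file: {camera / name}")
    return problems


if __name__ == "__main__":
    issues = find_layout_problems("data")
    for issue in issues:
        print(issue)
    print("layout OK" if not issues else f"{len(issues)} problem(s) found")
```

Run it from the repository root so that `data` resolves to the directory shown in the tree.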

Setup Environment

# Build a docker image
docker build -t aic2024/track1_nota:latest .

# Run a docker container
docker run -it --gpus all --shm-size=8g \
-v /path/to/AIC2024_Track1_Nota:/home/workspace/AIC2024_Track1_Nota \
-v /path/to/AIC2024_Track1_Nota/data:/workspace/ \
aic2024/track1_nota:latest /bin/bash


  1. Train People Detection Model
  • Modify the 'batch' and 'device' arguments in '' based on the available GPUs.
bash scripts/
  2. Train ReID Model
  • Modify the 'CUDA_VISIBLE_DEVICES' and 'num-gpus' arguments in '' based on the available GPUs.
  • Download the Market1501 pretrained weights from here and place them in the './reid' directory.
bash scripts/

If you want to use pretrained models, please download them from the provided Google Drive and place them in the './pretrained' directory.

Reproduce MCPT Results

  • Option 1: Run inference on each scene sequentially
bash scripts/
  • Option 2: Run inference on scenes in parallel (to get faster results)
    • Modify the '' based on the number of available GPUs and the number of scenes you wish to process simultaneously.
bash scripts/

(If errors occur, run inference on the affected scenes separately, then run 'python3 tools/')
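
When planning a parallel run, one simple way to decide which scenes go to which GPU is a round-robin split. The sketch below only illustrates that idea and is not how the repository's script distributes work: `partition_scenes` is a hypothetical helper, and the scene names and GPU ids in the example are placeholders.

```python
def partition_scenes(scenes, gpu_ids):
    """Assign scenes to GPUs round-robin; returns {gpu_id: [scene, ...]}."""
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    plan = {gpu: [] for gpu in gpu_ids}
    for index, scene in enumerate(scenes):
        # Scene 0 goes to the first GPU, scene 1 to the second, and so on,
        # wrapping around when the GPU list is exhausted.
        plan[gpu_ids[index % len(gpu_ids)]].append(scene)
    return plan


if __name__ == "__main__":
    # Hypothetical example: five scenes spread over two GPUs.
    scenes = [f"scene_{n:03d}" for n in range(61, 66)]
    for gpu, assigned in partition_scenes(scenes, [0, 1]).items():
        print(f"GPU {gpu}: {' '.join(assigned)}")
```

The resulting plan can then be used as a guide when editing the parallel inference script by hand.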

The result files will be saved as follows:

└── results
    ├── scene_061.txt
    ├── ...
    └── track1_submission.txt
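
To find which scenes need to be re-run after a failure, the per-scene files in `results` can be compared against the scenes you expected to process. This is a minimal sketch under stated assumptions: `missing_scene_results` is a hypothetical helper, the list of expected scenes is supplied by you, and treating an empty file as a failed scene is an assumption rather than documented behavior.

```python
from pathlib import Path


def missing_scene_results(results_dir, expected_scenes):
    """Return the expected scene names that lack a non-empty result file."""
    results = Path(results_dir)
    missing = []
    for scene in expected_scenes:
        result_file = results / f"{scene}.txt"
        # Assumption: an absent or zero-byte file means inference did not finish.
        if not result_file.is_file() or result_file.stat().st_size == 0:
            missing.append(scene)
    return missing


if __name__ == "__main__":
    # Hypothetical scene list; replace it with the scenes you actually ran.
    expected = ["scene_061", "scene_062"]
    for scene in missing_scene_results("results", expected):
        print(f"no result for {scene}")
```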

Terms of use

The multi-camera people tracking system published in this repository was developed by combining several modules (e.g., object detector, re-identification model, multi-object tracking model). Commercial use of any modifications, additions, or newly trained parameters made to combine these modules is not allowed. However, commercial use of the unmodified modules is allowed under their respective licenses. If you wish to use the individual modules commercially, you may refer to their original repositories and licenses provided below.

Object detector (license) link : Github, License

Re-identification model (license) link : Github, License

Multi-object tracking model (license) link : Github, License