A general PyTorch-based framework for learning tracking representations.
The installation script will automatically generate a local configuration file "admin/local.py". If the file was not generated, run admin.environment.create_default_local_file() to generate it. Next, set the path to the training workspace, i.e. the directory where the checkpoints will be saved. Also set the paths to the datasets you want to use.
If all the dependencies have been correctly installed, you can train a network using the run_training.py script in the correct conda environment.
conda activate pytracking
python run_training.py train_module train_name
Here, train_module is the sub-module inside train_settings and train_name is the name of the train setting file to be used.
For example, you can train using the included default ATOM settings by running:
python run_training.py bbreg atom_default
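To make the role of the two arguments concrete, the following is a minimal sketch of how they could be combined into a dotted module path and imported. The helper names and the default package prefix "train_settings" are assumptions made for illustration; the actual run_training.py may resolve the module differently, and the package prefix depends on how the repository is laid out on your Python path.

```python
import importlib


def settings_module_path(train_module, train_name, package="train_settings"):
    """Build the dotted path of a settings module from the two CLI arguments.

    `package` is an assumed prefix, not taken from the framework itself.
    """
    return "{}.{}.{}".format(package, train_module, train_name)


def load_settings_module(train_module, train_name, package="train_settings"):
    """Import the settings module; raises ImportError if it cannot be found."""
    return importlib.import_module(
        settings_module_path(train_module, train_name, package)
    )


if __name__ == "__main__":
    # Mirrors `python run_training.py bbreg atom_default`.
    print(settings_module_path("bbreg", "atom_default"))
```

With the assumed prefix, the example command maps to train_settings.bbreg.atom_default. A wrong train_module or train_name surfaces as an ImportError, which is usually the first thing to check when a training run fails to start.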
The framework consists of the following sub-modules.
- actors: Contains the actor classes for different trainings. The actor class is responsible for passing the input data through the network and calculating losses.
- admin: Includes functions for loading networks, tensorboard etc. and also contains environment settings.
- dataset: Contains integration of a number of training datasets, namely TrackingNet, GOT-10k, LaSOT, ImageNet-VID, DAVIS, YouTube-VOS, MS-COCO, SBD, LVIS, ECSSD, MSRA10k, and HKU-IS. Additionally, it includes modules to generate synthetic videos from image datasets.
- data_specs: Information about train/val splits of different datasets.
- data: Contains functions for processing data, e.g. loading images, data augmentations, sampling frames from videos.
- external: External libraries needed for training. Added as submodules.
- models: Contains different layers and network definitions.
- trainers: The main class which runs the training.
- train_settings: Contains settings files, specifying the training of a network.
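As an illustration of the actor role described above, here is a dependency-free sketch of an actor that passes a batch through a network and computes a loss. The class name SimpleActor, the dictionary keys, and the choice to return a (loss, stats) pair are assumptions for illustration, not the framework's actual interface; plain Python lists stand in for tensors.

```python
class SimpleActor:
    """Hypothetical actor: runs the network on a batch and computes the loss."""

    def __init__(self, net, objective):
        self.net = net              # callable: inputs -> predictions
        self.objective = objective  # callable: (predictions, targets) -> loss

    def __call__(self, data):
        # Forward pass through the network.
        predictions = self.net(data["inputs"])
        # Training loss against the ground truth.
        loss = self.objective(predictions, data["targets"])
        # Statistics that a trainer could log alongside the loss.
        stats = {"Loss/total": loss}
        return loss, stats


def mean_squared_error(predictions, targets):
    """Average squared difference between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def double_net(inputs):
    """Toy stand-in for a network: doubles every input value."""
    return [2.0 * x for x in inputs]


actor = SimpleActor(double_net, mean_squared_error)
loss, stats = actor({"inputs": [1.0, 2.0], "targets": [2.0, 5.0]})
print(loss)  # 0.5
```

Keeping the forward pass and loss computation in the actor means the trainer only has to iterate over batches and update the weights, so the same trainer can be reused for networks with very different inputs and losses.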
The framework currently contains the training code for the following trackers.
The following setting files can be used to train the LWL networks, or to know the exact training details.
- lwl.lwl_stage1: The default settings used for initial network training with fixed backbone weights. We initialize the backbone ResNet with pre-trained Mask-RCNN weights. These weights can be obtained from here. Download and save these weights in the env_settings().pretrained_networks directory.
- lwl.lwl_stage2: The default settings used for training the final LWL model. This setting fine-tunes all layers in the model trained using lwl_stage1.
- lwl.lwl_boxinit: The default settings used for training the bounding box encoder network in order to enable VOS with box initialization.
The following setting files can be used to train the KYS networks, or to know the exact training details.
- kys.kys: The default settings used for training the KYS model with ResNet-50 backbone.
The following setting files can be used to train the PrDiMP networks, or to know the exact training details.
- dimp.prdimp18: The default settings used for training the PrDiMP model with ResNet-18 backbone.
- dimp.prdimp50: The default settings used for training the PrDiMP model with ResNet-50 backbone.
- dimp.super_dimp: Combines the bounding-box regressor of PrDiMP with the standard DiMP classifier and better training and inference settings.
The following setting files can be used to train the DiMP networks, or to know the exact training details.
- dimp.dimp18: The default settings used for training the DiMP model with ResNet-18 backbone.
- dimp.dimp50: The default settings used for training the DiMP model with ResNet-50 backbone.
The following setting files can be used to train the ATOM network, or to know the exact training details.
- bbreg.atom: The settings used in the paper for training the network in ATOM.
- bbreg.atom: Newer settings used for training the network in ATOM, also utilizing the GOT-10k dataset.
- bbreg.atom: Settings for ATOM with the probabilistic bounding box regression proposed in this paper.
- bbreg.atom: The baseline ATOM* setting evaluated in this paper.
To train a custom network using the toolkit, the following components need to be specified in the train settings. For reference, see atom.py.
- Datasets: The datasets to be used for training. A number of standard tracking datasets are already available in the dataset module.
- Processing: This function should perform the necessary post-processing of the data, e.g. cropping of target region, data augmentations etc.
- Sampler: Determines how the frames are sampled from a video sequence to form the batches.
- Network: The network module to be trained.
- Objective: The training objective.
- Actor: The trainer passes the training batch to the actor, which is responsible for passing the data through the network correctly and calculating the training loss.
- Optimizer: Optimizer to be used, e.g. Adam.
- Trainer: The main class which runs the epochs and saves checkpoints.
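To illustrate what a Sampler from the list above has to decide, here is a minimal sketch that picks train and test frame indices close to a randomly chosen base frame. The function name, its parameters, and the windowing scheme are assumptions for illustration; the samplers shipped with the framework may use different strategies and constraints.

```python
import random


def sample_frame_ids(num_frames, num_train, num_test, max_gap, rng=random):
    """Pick train and test frame indices within `max_gap` of a random base frame.

    Hypothetical helper: train and test frames are drawn independently from
    the same window, so they may overlap.
    """
    needed = max(num_train, num_test)
    # The window shrinks at the sequence borders, so check its smallest size.
    if min(num_frames, max_gap + 1) < needed:
        raise ValueError("sequence or max_gap too small for the requested sizes")

    base = rng.randrange(num_frames)
    low = max(0, base - max_gap)
    high = min(num_frames - 1, base + max_gap)
    candidates = list(range(low, high + 1))

    train_ids = sorted(rng.sample(candidates, num_train))
    test_ids = sorted(rng.sample(candidates, num_test))
    return train_ids, test_ids


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for a reproducible demonstration
    print(sample_frame_ids(num_frames=100, num_train=3, num_test=1, max_gap=10, rng=rng))
```

Limiting the gap between sampled frames keeps the target appearance in the train and test frames related, while a larger max_gap exposes the network to stronger appearance changes.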