This repository is a mirror of the original High-Resolution Network (HRNet) repository, which can be found here. The primary purpose of this repository is to provide instructions and code modifications for training HRNet on your own custom dataset.
To get started with training your own custom dataset using HRNet, follow these steps:
First, clone this repository to your local machine:
git clone https://github.com/MaxRondelli/HRNet-with-Custom-Dataset.git
To complete the installation refer to the original repository of HRNet
Your custom dataset should be organized in a specific structure. In my experiment I use the structure of COCO.
Create a data folder and the structure must look like that:
data/
custom_dataset/
annotations/
train.json
val.json
images/
train/
val/
The json structure follows COCO structure as well. This is an example of mine train.json
.
{
"images": [
{
"id": 2,
"file_name": "train_image.png",
"width": 1280,
"height": 720
},
],
"annotations": [
{
"id": 2,
"image_id": 2,
"category_id": 1,
"keypoints": [
0,
0,
0,
0,
0,
0,
559,
253,
2,
0,
0,
0,
433,
300,
2,
0,
0,
0,
481,
244,
2,
0,
0,
0,
483,
196,
2,
0,
0,
0,
368,
158,
2,
0,
0,
0
],
"num_keypoints": 5,
"bbox": [
368,
158,
191,
142
],
"area": 27122,
"iscrowd": 0
},
],
"categories": [
{
"id": 1,
"name": "person",
"supercategory": "person",
"keypoints": [ your keypoints ],
"skeleton": []
}
]
}
Note: the structure of the dataset is critical. Both of the json files and how the folders are organized. If not respected, the network has difficulty to learn.
Modify the configuration files located in the experiments directory to suit your dataset and training parameters.
Update the DATASET section to point to your custom dataset:
DATASET:
COLOR_RGB: true
DATASET: 'custom_dataset'
DATA_FORMAT: png
FLIP: false
NUM_JOINTS_HALF_BODY: 6
PROB_HALF_BODY: 0.3
ROOT: 'HRNet-with-Custom-Dataset/data/custom_dataset/'
ROT_FACTOR: 45
SCALE_FACTOR: 0.35
TRAIN_SET: 'train'
TEST_SET: 'val'
In the original COCO dataset, there are 17 keypoints. However, in my custom dataset, there are only 12 keypoints. You must modify the keypoint configurations accordingly to your dataset.
In the configuration file (.yaml) update the NUM_JOINTS
parameter:
MODEL:
NUM_JOINTS: 12
Note: you will definitely have to change other values during the experiment. Whenever you see some mismatch where a shape is highlighted (12,) is different from (17,) it means you have to change something because the number of keypoints is different.
To start training, run the following command:
python3 tools/train.py --cfg HRNet-Human-Pose-Estimation/experiments/coco/hrnet/w48_384x288_adam_lr1e-3.yaml
In the Demo
folder you can run tests on images and videos of the model trained with your custom dataset. Edit the yaml file specifically with the custom model checkpoint.
You can run it in this way:
python3 demo.py --video video.mkv --write
or
python3 inference.py --cfg inference-config.yaml --videoFile video.mkv --writeBoxFrames --outputDir /output TEST.MODEL_FILE tools/output/custom_dataset/pose_hrnet/w48_384x288_adam_lr1e-3/model_best.pth
The dataset has 374 images in total divided in 296 for training and 78 for validation.
Lower results are obtained than the state-of-the-art presented in the paper. However, good human-pose estimation can be achieved at the inference time, resulting in a practical product that can be used for the purposes of the experiment with your custom dataset.
Method | Backbone | Pretrain | Input size | #Params | GFLOPs | AP | AP50 | AP75 | APM | APL | AR |
---|---|---|---|---|---|---|---|---|---|---|---|
HRNet-W32 | HRNet-W32 | N | 256x192 | 28.5M | 7.1 | 0.384 | 0.761 | 0.321 | 0.103 | 0.398 | 0.473 |
HRNet-W32 | HRNet-W32 | Y | 256x192 | 28.5M | 7.1 | 0.394 | 0.792 | 0.274 | 0.129 | 0.405 | 0.528 |
HRNet-W32 | HRNet-W32 | N | 384x288 | 28.5M | 16.0 | 0.384 | 0.795 | 0.278 | 0.129 | 0.397 | 0.492 |
HRNet-W32 | HRNet-W32 | Y | 384x288 | 28.5M | 16.0 | 0.396 | 0.826 | 0.338 | 0.203 | 0.406 | 0.509 |
HRNet-W48 | HRNet-W48 | N | 256x192 | 63.6M | 14.6 | 0.501 | 0.863 | 0.452 | 0.252 | 0.515 | 0.560 |
HRNet-W48 | HRNet-W48 | Y | 256x192 | 63.6M | 14.6 | 0.426 | 0.804 | 0.367 | 0.129 | 0.440 | 0.544 |
HRNet-W48 | HRNet-W48 | N | 384x288 | 63.6M | 32.8 | 0.526 | 0.831 | 0.537 | 0.077 | 0.548 | 0.581 |
HRNet-W48 | HRNet-W48 | Y | 384x288 | 63.6M | 32.8 | 0.568 | 0.950 | 0.579 | 0.253 | 0.584 | 0.665 |
This repository is based on the original HRNet repository. I extend our gratitude for their pioneering work in high-resolution visual recognition.
For detailed documentation and advanced usage, please refer to the original HRNet repository.