GTA_Head is a large-scale virtual-world dataset for crowd counting and head detection. It contains 5096 images labeled with 1732043 head bounding boxes. The images and target center coordinates are taken from the GCC dataset. For each visible head we provide xmin, ymin, length and width, so the annotations can be used to train and evaluate object detection models. The dataset covers 35 scenes: 24 in the training set and 11 in the test set. Compared with other datasets, GTA_Head provides pedestrian head annotations for a large number of complex scenes, including indoor shopping malls, subways, and outdoor stadiums and squares. The annotation and data format of GTA_Head follow the standard guidelines outlined by the MOTChallenge benchmark.
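As a rough illustration of how such annotations might be consumed, the sketch below parses one comma-separated line laid out in the usual MOTChallenge order (frame, id, bb_left, bb_top, bb_width, bb_height, followed by extra fields) and converts the box to corner and center form. The names `HeadBox` and `parse_mot_line` are hypothetical, and it is an assumption that the GTA_Head files use exactly this column order and that "length" and "width" map to the horizontal and vertical box extents; check the actual annotation files before relying on it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HeadBox:
    """One head annotation: top-left corner plus box size, in pixels."""
    xmin: float
    ymin: float
    box_w: float  # assumed to be the horizontal extent
    box_h: float  # assumed to be the vertical extent

    def corners(self) -> Tuple[float, float, float, float]:
        """Return the box as (xmin, ymin, xmax, ymax)."""
        return (self.xmin, self.ymin,
                self.xmin + self.box_w, self.ymin + self.box_h)

    def center(self) -> Tuple[float, float]:
        """Return the box center, e.g. for point-based crowd counting."""
        return (self.xmin + self.box_w / 2.0, self.ymin + self.box_h / 2.0)


def parse_mot_line(line: str) -> Tuple[int, int, HeadBox]:
    """Parse one MOTChallenge-style line (assumed layout, see lead-in).

    Only the first six fields are used; any trailing fields are ignored.
    """
    fields = [f.strip() for f in line.split(",")]
    frame = int(fields[0])
    track_id = int(float(fields[1]))
    x, y, w, h = (float(v) for v in fields[2:6])
    return frame, track_id, HeadBox(x, y, w, h)


if __name__ == "__main__":
    # Made-up example line, not taken from the dataset.
    frame, track_id, box = parse_mot_line("1,7,100,50,20,24,1,-1,-1,-1")
    print(frame, track_id, box.corners(), box.center())
```

Keeping the top-left-plus-size representation as the stored form and deriving corners or centers on demand avoids committing to one detector's box convention.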
The CroHD dataset provides tracking annotations of pedestrian heads in densely populated video sequences. It contains 2,276,838 human heads in 11,463 frames across 9 Full-HD sequences.
The GCC dataset is a large-scale, diverse synthetic dataset of 15,212 images at a resolution of 1080 × 1920, containing 7,625,843 persons.
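To put the three sets of figures side by side, the short sketch below computes the mean number of annotations per image (or per frame) from the totals quoted above. It is plain arithmetic on those totals; note that the counts are not strictly comparable, since GCC counts persons while GTA_Head and CroHD count heads. The names `DATASETS` and `mean_per_image` are introduced here for illustration only.

```python
# (name, number of images or frames, number of annotated heads or persons),
# using the totals quoted in the descriptions above.
DATASETS = [
    ("GTA_Head", 5096, 1732043),
    ("CroHD", 11463, 2276838),
    ("GCC", 15212, 7625843),
]


def mean_per_image(n_images: int, n_annotations: int) -> float:
    """Average number of annotations per image or frame."""
    return n_annotations / n_images


if __name__ == "__main__":
    for name, n_images, n_annotations in DATASETS:
        density = mean_per_image(n_images, n_annotations)
        print(f"{name}: {density:.1f} annotations per image")
    # GTA_Head: 339.9, CroHD: 198.6, GCC: 501.3
```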
The GTA_Head dataset is free for research purposes only. For any questions about the dataset, please contact: [email protected].
@article{zhong2024mask,
title={Mask focal loss: a unifying framework for dense crowd counting with canonical object detection networks},
author={Zhong, Xiaopin and Wang, Guankun and Liu, Weixiang and Wu, Zongze and Deng, Yuanlong},
journal={Multimedia Tools and Applications},
pages={1--23},
year={2024},
publisher={Springer}
}