Environmental Scene Classification: Comparing Pseudolabeled Data and Traditionally Augmented Data

This paper explores semi-supervised learning as a way to improve environmental scene classification, generating pseudolabels for the previously unlabeled ESC-US dataset of 250,000 audio recordings. A MobileNetV3 convolutional neural network (CNN), pretrained on ImageNet and fine-tuned on the labeled ESC-50 dataset (2,000 recordings), is used to pseudolabel ESC-US. Various VGG-like CNNs are then trained from scratch on ESC-50, supplemented with one of two additional data sources: augmented ESC-50 data (e.g., pitch shifting, time stretching, silence trimming), or pseudolabeled ESC-US data filtered at different confidence thresholds. The results show that, while adding the 250,000 pseudolabeled ESC-US samples can in principle improve performance, carefully applied data augmentation on the much smaller ESC-50 dataset can yield better performance at lower computational cost.
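The confidence-threshold filtering mentioned above can be illustrated with a minimal sketch. It assumes that confidence is the maximum class probability from the fine-tuned model; the abstract does not say how confidence is measured, so this is an assumption. The function name `select_pseudolabels`, the toy probabilities, and the threshold values are all hypothetical and not taken from the repository.

```python
from typing import List, Sequence, Tuple


def select_pseudolabels(
    probabilities: Sequence[Sequence[float]],
    threshold: float,
) -> List[Tuple[int, int, float]]:
    """Return (sample_index, predicted_class, confidence) for confident predictions.

    Assumption: confidence is the maximum class probability of each sample.
    """
    selected = []
    for index, probs in enumerate(probabilities):
        # The predicted class is the one with the highest probability.
        predicted_class = max(range(len(probs)), key=probs.__getitem__)
        confidence = probs[predicted_class]
        # Keep the sample only if the model is confident enough.
        if confidence >= threshold:
            selected.append((index, predicted_class, confidence))
    return selected


# Hypothetical class probabilities for four unlabeled clips over three classes.
toy_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.15, 0.75],
    [0.02, 0.97, 0.01],
]

# A higher threshold keeps fewer, but more reliable, pseudolabels.
for threshold in (0.5, 0.8, 0.95):
    kept = select_pseudolabels(toy_probs, threshold)
    print(threshold, [(i, c) for i, c, _ in kept])
```

Raising the threshold trades dataset size for label quality: fewer unlabeled clips survive, but the surviving pseudolabels are less likely to be wrong. This is the trade-off that training at several thresholds would probe.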

Repository files: README.md, experiment.pdf

teaden/ESC-Semi-Supervised
