BBC Sound Effects Data Card

Data Collection

Source: BBC website
Collecting Method:
1. Scrape the BBC website to obtain the metadata.
2. Download the audio files from the URLs listed in the metadata file, using a downloading script.
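
The download step might look roughly like the sketch below. It is not the project's actual downloading script: the record layout (a list of dicts with a `url` key) and the names `target_path` and `download_all` are assumptions made for illustration only.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def target_path(url, out_dir):
    """Derive a local file path from the last component of the URL path."""
    return Path(out_dir) / Path(urlparse(url).path).name


def download_all(records, out_dir, fetch=urllib.request.urlretrieve):
    """Download every record's URL into out_dir.

    `records` is assumed (hypothetically) to be a list of dicts with a
    "url" key. Returns (saved_paths, failed_urls).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved, failed = [], []
    for rec in records:
        url = rec["url"]
        dest = target_path(url, out)
        try:
            fetch(url, dest)
        except OSError:
            # urllib.error.URLError and HTTPError are subclasses of OSError.
            failed.append(url)
            continue
        saved.append(dest)
    return saved, failed
```

The `fetch` parameter is there so that the network call can be swapped out, for example when testing the loop offline.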

Preprocessing Principles

You may refer to freesonud_preprocess.py for all the details.

{
    "text": "Busy French Village Street, atmosphere with cars, horns, occasional quiet periods"
}

I. JSON file generation principles

  • text entry
    text generation: We take the title of the sound effect as the text entry, with the class name removed (e.g. in the example above, the class name "French-Traffic" has been removed).
  • tag entry: No such entry for this dataset.
  • original_data entry: No such entry for this dataset.
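
As a minimal sketch of the text-entry rule, the snippet below strips a class name from a title and builds the JSON record. The data card does not say where the class name sits in the title, so this assumes it appears as a prefix followed by a separator; `make_text_entry` and `build_record` are hypothetical names, not functions from the preprocessing script.

```python
import json


def make_text_entry(title, class_name):
    """Return the title with a leading class name removed.

    Assumption: the class name, when present, is a prefix of the title
    followed by separator characters such as ":", "-", "," or spaces.
    """
    text = title.strip()
    if text.startswith(class_name):
        text = text[len(class_name):].lstrip(" :-,")
    return text


def build_record(title, class_name):
    """Build the per-file JSON record; only the "text" entry is used here."""
    return {"text": make_text_entry(title, class_name)}


if __name__ == "__main__":
    record = build_record(
        "French-Traffic: Busy French Village Street, atmosphere with cars",
        "French-Traffic",
    )
    print(json.dumps(record, indent=4))
```

Titles that do not start with the class name pass through unchanged.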

II. Audio filtering principles

  1. Discard every audio file that cannot be read by soundfile.read() or that FFmpeg rejects during processing.
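
The readability check could be sketched as follows. This is an illustration rather than the script's actual code: `filter_readable` and the injectable `reader` argument are hypothetical, and the default reader assumes the third-party `soundfile` package is installed.

```python
def _read_with_soundfile(path):
    """Try to decode the file; soundfile.read() raises if it cannot."""
    import soundfile as sf  # third-party; imported lazily

    sf.read(str(path))


def filter_readable(paths, reader=_read_with_soundfile):
    """Return only the paths that `reader` can open without raising."""
    kept = []
    for path in paths:
        try:
            reader(path)
        except Exception:
            # Any read failure means the file is discarded.
            continue
        kept.append(path)
    return kept
```

Catching `Exception` broadly is a deliberate choice here, since the rule is to drop a file on any read failure, whatever its cause.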

III. Audio format specifications

After preprocessing, all audio files should be in FLAC format with a sampling rate of 48 kHz. The conversion is done with FFmpeg.
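
A conversion call along these lines would produce 48 kHz FLAC output. This is a sketch under assumptions: the exact FFmpeg options used by the preprocessing script are not stated in this card, and `build_ffmpeg_cmd` and `convert_to_flac` are hypothetical helper names. It requires the `ffmpeg` executable on the PATH.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src, dst, sample_rate=48000):
    """Build an FFmpeg command that resamples to `sample_rate` and encodes FLAC."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-loglevel", "error",    # print only errors
        "-i", str(src),          # input file
        "-ar", str(sample_rate), # output sampling rate in Hz
        "-c:a", "flac",          # encode audio as FLAC
        str(dst),
    ]


def convert_to_flac(src, out_dir):
    """Convert one file; return the output path, or None if FFmpeg fails."""
    dst = Path(out_dir) / (Path(src).stem + ".flac")
    result = subprocess.run(build_ffmpeg_cmd(src, dst), capture_output=True)
    return dst if result.returncode == 0 else None
```

Returning `None` on a non-zero exit code lets the caller discard files that FFmpeg rejects, in line with the filtering rule in section II.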