This repo contains our recommendations and tools for manual revision of speech transcripts.
- on-linux-in-vim: suggests how to do it on Linux with the text editor Vim
- youtube-subtitle-editor: suggests how to do it with the YouTube web editor
- Amara subtitle editor (please add a subdir and a README on how to do it)
- Filmtit + Matus Namesny's project (talk to Ondrej Bojar if we could revive this)
Depending on the particular editing environment, please try to adhere to the following annotation rules as much as possible:
- full correct letter casing + punctuation
- one line per sentence
- handling of numbers and abbreviations is unclear; the following options seem equally good, but try to be consistent:
  - spelled out (write down these words to the transcript): "forty four", "Czech Tech" (if the abbreviation was not spelled letter by letter ["C-T-U"])
  - condensed (write down abbreviations to transcripts): "44", "CTU"
- In any case, avoid expanding what was said: do not write "Czech Technical University" when the person said "Czech Tech"
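The casing, punctuation, and one-line-per-sentence rules above lend themselves to a rough automated check before a transcript is handed over. The following is only a minimal sketch in Python. The function names `check_line` and `check_transcript` are hypothetical and not part of this repo, and the multi-sentence test is a heuristic that assumes sentences start with an ASCII capital letter.

```python
import re

# Hypothetical helper, not part of this repo: flags transcript lines that
# appear to break the casing, punctuation, or one-sentence-per-line rules.
TERMINAL = (".", "?", "!")


def check_line(line):
    """Return a list of suspected rule violations for one transcript line."""
    problems = []
    text = line.strip()
    if not text:
        return problems  # empty lines are ignored
    # Only a lowercase first character is flagged, so a line that starts
    # with a digit or a quotation mark passes this check.
    if text[0].islower():
        problems.append("does not start with a capital letter")
    # Closing quotes or parentheses after the final mark are tolerated.
    if not text.rstrip("\"')").endswith(TERMINAL):
        problems.append("missing final punctuation")
    # Heuristic: a sentence-ending mark followed by whitespace and an ASCII
    # capital suggests two sentences on one line. Abbreviations such as
    # "Dr. Smith" will produce false positives.
    if re.search(r"[.?!]\s+[A-Z]", text):
        problems.append("possibly more than one sentence")
    return problems


def check_transcript(lines):
    """Return (line_number, problem) pairs for every flagged line."""
    report = []
    for number, line in enumerate(lines, start=1):
        for problem in check_line(line):
            report.append((number, problem))
    return report
```

A possible use is to run `check_transcript(open(path, encoding="utf-8"))` on a revised file and review each flagged line by hand. The checks are deliberately loose, so a flag is a hint to look again, not a verdict.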