
Commit

Update README.md
rowanz authored Jun 1, 2022
1 parent 1144b80 commit d22c50f
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion data/README.md
@@ -9,10 +9,12 @@ The rough workflow:
* Use [process.py](process.py) to convert them into tfrecord format
* Then you can train the model.

Note: Because of this issue (https://github.com/ytdl-org/youtube-dl/issues/30710), you may need to use an alternative such as yt-dlp instead of youtube-dl.

A few pieces that could be useful:
* Our model for slightly improving the timing of YouTube ASR: [offset_model](offset_model)
* [process.py](process.py) converts audio into spectrograms
* [process.py](process.py) extracts frames from videos
* [process.py](process.py) also calls a lightweight MobileNet V2 CNN to remove videos whose images seem too similar
Also:
* [download_youtube.py](download_youtube.py) is a wrapper around YouTube-DL that you could use for downloading videos.
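
The note about falling back to yt-dlp can be sketched as a small helper that prefers yt-dlp when it is installed and otherwise uses youtube-dl. This is only an illustration under stated assumptions, not the contents of download_youtube.py: the names `pick_downloader` and `build_command`, the `best` format choice, and the output layout are all hypothetical.

```python
import shutil
from typing import Callable, List, Optional

# Candidate executables in order of preference (yt-dlp first, per the note above).
DOWNLOADERS = ("yt-dlp", "youtube-dl")


def pick_downloader(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return the first downloader found on PATH (hypothetical helper)."""
    for name in DOWNLOADERS:
        if which(name):
            return name
    raise RuntimeError("neither yt-dlp nor youtube-dl was found on PATH")


def build_command(video_id: str, out_dir: str, downloader: str) -> List[str]:
    """Build a download command; -f and -o are accepted by both tools."""
    return [
        downloader,
        "-f", "best",                       # assumed format choice
        "-o", f"{out_dir}/%(id)s.%(ext)s",  # assumed output template
        f"https://www.youtube.com/watch?v={video_id}",
    ]
```

To run a download, pass the result to `subprocess.run(build_command(video_id, out_dir, pick_downloader()), check=True)`. Returning the command as a list keeps the choice of tool separate from the execution, so either executable can be swapped in without changing the rest of the pipeline.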
