Ever needed to get the text from that unpopular video on YouTube that does not have auto-generated captions? Look no further! The YouTube transcriber will do that for you using Google's Speech-to-Text API.
True story: I made this simple YouTube video transcriber for my gf, because she was translating some videos for her job. Initially she had to listen, transcribe and then translate. Now she only needs a quick pass to correct the speech-to-text mistakes and take her time with translation, while her colleagues have to do the whole thing manually 100%. I think she likes me more now.
Clone this repo:
git clone https://github.com/Lef-F/yt_transcriber.git
Install requirements:
pip install requirements.txt
Install dependencies:
For this to work you need to install ffmpeg.
For mac users:
brew install ffmpeg
Make sure to have your Google service account credentials available as a .json
file and loaded. You will need permissions to create objects in the storage bucket and use the translation API. Currently known as Cloud Translation API User
and Storage Admin
.
You can set up a Google cloud project with free credits from Google, which should be enough for quite some transcriptions.
For mac/linux users:
export GOOGLE_ACCOUNT_CREDENTIALS=/path/to/service/account/file.json
To run, simply call the transcribe.py
script from Python 3 and add your YouTube video URL and your bucket name as arguments. An additional optional argument is the video language which by default is en-US
. (See all supported languages at Google's official documentation)
python3 transcribe.py "https://www.youtube.com/watch?v=kJQP7kiw5Fk" your_bucket_name es-ES