This project generates a podcast episode using several APIs and models. The podcast's topic is birds, so each episode covers one specific bird and the prompts are written with that in mind.
We use the Wikipedia library to fetch the text about a bird, and LangChain with OpenAI to generate a structured podcast dialogue from it.
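As a rough illustration of this step, here is a minimal sketch, not the notebook's actual code. The prompt wording, the `Line` dataclass, the helper names, and the model name are all assumptions made for the example. It also assumes the model replies with plain JSON, which may need extra cleanup in practice.

```python
import json
from dataclasses import dataclass


@dataclass
class Line:
    """One line of dialogue (hypothetical structure for this sketch)."""
    speaker: str
    text: str


# Hypothetical prompt; the prompts used in the notebooks may differ.
PROMPT_TEMPLATE = (
    "You are writing a podcast episode about birds.\n"
    "Using only the source text below, write a dialogue between two hosts "
    "about the {bird}.\n"
    "Return a JSON list of objects with the keys 'speaker' and 'text'.\n\n"
    "Source text:\n{source}"
)


def build_prompt(bird: str, source: str, max_chars: int = 6000) -> str:
    """Fill the template, truncating the article to keep the prompt small."""
    return PROMPT_TEMPLATE.format(bird=bird, source=source[:max_chars])


def parse_dialogue(raw: str) -> list[Line]:
    """Parse the model's JSON reply into a list of dialogue lines."""
    items = json.loads(raw)
    return [Line(speaker=item["speaker"], text=item["text"]) for item in items]


def generate_dialogue(bird: str) -> list[Line]:
    """Fetch the article and ask the model for a dialogue.

    Needs network access and an OPENAI_API_KEY in the environment.
    """
    import wikipedia  # pip install wikipedia
    from langchain_openai import ChatOpenAI  # pip install langchain-openai

    source = wikipedia.page(bird).content
    # The model name is an assumption; use whichever model you have access to.
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)
    reply = llm.invoke(build_prompt(bird, source))
    return parse_dialogue(reply.content)
```

Calling `generate_dialogue("Atlantic puffin")` would return a list of `Line` objects that can be passed on to the voice generation step.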
We use the ElevenLabs API to generate the voices for the podcast. We also create the intro audio with AudioCraft: a song generated with MusicGen plus sound effects generated with AudioGen. We put everything together using the pydub library.
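The assembly step could look roughly like the sketch below. The file paths, the 6 dB gain reduction, the fade length, and the pause between lines are illustrative assumptions, not values taken from the project. pydub needs ffmpeg installed to read and write MP3 files.

```python
def expected_length_ms(intro_ms: int, clip_ms: list[int], pause_ms: int = 400) -> int:
    """Length of the final episode: intro, then a pause before every voice clip."""
    return intro_ms + sum(pause_ms + clip for clip in clip_ms)


def assemble_episode(intro_path, effect_path, voice_paths, out_path, pause_ms=400):
    """Mix the intro, append the voice clips in order, and export one MP3.

    All paths are hypothetical; returns the episode length in milliseconds.
    """
    from pydub import AudioSegment  # pip install pydub

    music = AudioSegment.from_file(intro_path)    # MusicGen song
    effect = AudioSegment.from_file(effect_path)  # AudioGen sound effect

    # Overlay the effect (6 dB quieter) on the music, then fade the intro out.
    # overlay() keeps the length of the segment it is called on.
    intro = music.overlay(effect - 6).fade_out(2000)

    pause = AudioSegment.silent(duration=pause_ms)
    episode = intro
    for path in voice_paths:  # ElevenLabs clips, in dialogue order
        episode += pause + AudioSegment.from_file(path)

    episode.export(out_path, format="mp3")
    return len(episode)  # len() of an AudioSegment is its duration in ms
```

`expected_length_ms` is only a sanity check: if the value returned by `assemble_episode` differs from it, a clip was probably skipped or loaded twice.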
We use the Stable Diffusion XL model (stable-diffusion-xl-base-1.0) as a base model and Refiner XL (stable-diffusion-xl-refiner-1.0) as a refiner model to generate a podcast cover.
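A minimal sketch of the base plus refiner setup, following the pattern shown in the Hugging Face diffusers documentation. The prompt wording, step count, and the 0.8 hand-off fraction are assumptions for illustration, and the code expects a CUDA GPU with enough memory for both models.

```python
def cover_prompt(bird: str, style: str = "watercolour illustration") -> str:
    """Build a cover prompt for one bird in one art style (hypothetical wording)."""
    return f"Podcast cover art featuring a {bird}, {style}, highly detailed, no text"


def generate_cover(prompt: str, steps: int = 40, handoff: float = 0.8):
    """Run the SDXL base model, then let the refiner finish the last steps."""
    import torch
    from diffusers import DiffusionPipeline  # pip install diffusers transformers accelerate

    base = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    ).to("cuda")

    # The refiner reuses the base model's second text encoder and VAE to save memory.
    refiner = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-refiner-1.0",
        text_encoder_2=base.text_encoder_2,
        vae=base.vae,
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    ).to("cuda")

    # The base model handles the first part of denoising and returns latents.
    latents = base(
        prompt=prompt,
        num_inference_steps=steps,
        denoising_end=handoff,
        output_type="latent",
    ).images

    # The refiner picks up at the same point and produces the final image.
    return refiner(
        prompt=prompt,
        num_inference_steps=steps,
        denoising_start=handoff,
        image=latents,
    ).images[0]
```

For example, `generate_cover(cover_prompt("Atlantic puffin", "linocut print")).save("cover.png")` would write the cover to disk. Changing the `style` argument is one simple way to try different art styles.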
This project was part of the Uplimit course "Prompt Design & Building AI Products". You can see the Colab notebooks here:
- Week 1: Getting the text about your podcast topic, generating the podcast dialogue using OpenAI, and generating the voices with ElevenLabs.
- Week 2: Generating a cover for our podcast using the Stable Diffusion XL model with a refiner. In this notebook, I tried different prompts to generate a cover in different art styles.
- Week 3: Generating music for the intro using AudioCraft. I also used LangChain to create a sequential chain that summarises the source text and then generates the podcast dialogue, and repeated the same process using LangChain agents.
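
The summarise-then-generate chain from Week 3 can be sketched as below. This version uses LangChain's pipe syntax rather than whichever chain classes the notebook uses, and the two prompt templates and the `build_chain` helper are assumptions made for the example.

```python
# Hypothetical prompts; the ones in the notebook may differ.
SUMMARY_TEMPLATE = (
    "Summarise the key facts about this bird in five bullet points:\n\n{article}"
)
DIALOGUE_TEMPLATE = (
    "Write a short two-host podcast dialogue based on these facts:\n\n{summary}"
)


def build_chain(llm):
    """Chain two steps: article -> summary -> dialogue.

    `llm` is any LangChain chat model, for example ChatOpenAI.
    """
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate

    summarise = (
        ChatPromptTemplate.from_template(SUMMARY_TEMPLATE) | llm | StrOutputParser()
    )
    write = (
        ChatPromptTemplate.from_template(DIALOGUE_TEMPLATE) | llm | StrOutputParser()
    )
    # The output of the first step is fed in as the {summary} variable of the second.
    return {"summary": summarise} | write
```

With a model in hand, `build_chain(llm).invoke({"article": text})` would return the dialogue as a string.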