In this video, the creator builds a podcast scripting and editing app using Gemini 3 and multispeech technology that transforms uploaded documents into customizable podcast episodes with different styles and lengths. They demonstrate the app’s versatility by generating humorous, educational, and debate-style podcasts from various documents, highlighting its potential for entertaining and informative content creation.
In this video, the creator introduces a new project for Shipmas Day 2: building a podcast scripting and editing app using Gemini 3 and multispeech technology. The app lets users upload documents such as PDFs or markdown text files, which Gemini 3 then processes to generate a podcast script. Users can customize the length of the podcast and its style, with options such as roasting, steelmanning, strawmanning, or explaining concepts as if to a fifth grader. The generated script is sent to the multispeech API to produce a podcast episode featuring two different speakers, which users can listen to or download.
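The flow described above (document in, two-speaker script out, audio back) can be sketched roughly as follows. This is a minimal illustration, not the creator's code: the `Turn` type, the `parse_script` and `make_podcast` helpers, and the "Speaker 1:" / "Speaker 2:" line format are all assumptions made for the example, and the model and speech calls are passed in as plain functions so that no real API signature is implied.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Turn:
    """One spoken line in the script (hypothetical structure)."""
    speaker: str
    text: str


def parse_script(
    script: str, speakers: Sequence[str] = ("Speaker 1", "Speaker 2")
) -> List[Turn]:
    """Split 'Name: text' lines into turns.

    Assumes the script labels each turn with a known speaker name.
    Unlabelled lines are appended to the previous turn; any text
    before the first labelled turn is dropped.
    """
    turns: List[Turn] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        name, sep, rest = line.partition(":")
        if sep and name.strip() in speakers:
            turns.append(Turn(name.strip(), rest.strip()))
        elif turns:
            turns[-1].text = f"{turns[-1].text} {line}".strip()
    return turns


def make_podcast(
    document_text: str,
    style: str,
    write_script: Callable[[str, str], str],
    synthesize: Callable[[List[Turn]], bytes],
) -> bytes:
    """Run the pipeline: script generation, parsing, then speech synthesis.

    write_script and synthesize are placeholders for the language-model
    call and the two-speaker speech call, respectively.
    """
    turns = parse_script(write_script(document_text, style))
    if len({turn.speaker for turn in turns}) < 2:
        raise ValueError("expected a script with two distinct speakers")
    return synthesize(turns)
```

In a real app, `write_script` would wrap the Gemini request and `synthesize` would wrap the speech request. Keeping them as injected functions also makes the pipeline easy to exercise with stubs before any API key is involved.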
The creator explains the initial setup, which starts with gathering the necessary context for Gemini 3: the audio, multispeech, text, and document understanding sections of the Gemini API documentation. They share that they have prepared their API key and a simple test script to verify functionality. Development begins in the terminal, using Claude Max in plan mode to generate a detailed plan for building the app, with Python for the backend and a Next.js-style frontend. The plan defines the voices, styles, and podcast lengths, along with a prompt for each podcast style.
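A plan of this kind often boils down to a small configuration module. The sketch below is one possible shape for it, under stated assumptions: the style wording, the "longer" length preset, the voice placeholders, and the 150 words-per-minute speaking rate are all invented for illustration. Only the style names and the 2-4 minute length come from the video.

```python
from typing import Dict, Tuple

# Style instructions (wording is hypothetical; the style names follow the video).
STYLES: Dict[str, str] = {
    "roast": "Two hosts humorously roast the contents of the document.",
    "steelman_vs_strawman": (
        "One host gives the strongest version of the argument, "
        "the other a deliberately weak version."
    ),
    "fifth_grader": "Two hosts explain the document as if to a fifth grader.",
}

# Length presets in minutes. Only "2-4" appears in the video; "5-8" is made up.
LENGTHS: Dict[str, Tuple[int, int]] = {"2-4": (2, 4), "5-8": (5, 8)}

# Placeholder voice labels, not real voice names from any API.
VOICES: Dict[str, str] = {"Speaker 1": "voice-a", "Speaker 2": "voice-b"}

# Assumed average speaking rate, used to turn minutes into a word budget.
WORDS_PER_MINUTE = 150


def word_budget(length: str) -> Tuple[int, int]:
    """Convert a length preset into a (min_words, max_words) target."""
    low, high = LENGTHS[length]
    return low * WORDS_PER_MINUTE, high * WORDS_PER_MINUTE


def build_prompt(style: str, length: str, document_text: str) -> str:
    """Assemble the script-writing prompt for the chosen style and length."""
    if style not in STYLES:
        raise ValueError(f"unknown style: {style!r}")
    if length not in LENGTHS:
        raise ValueError(f"unknown length: {length!r}")
    min_words, max_words = word_budget(length)
    names = " and ".join(VOICES)
    return (
        f"Write a podcast script for two speakers, {names}.\n"
        f"Style: {STYLES[style]}\n"
        f"Target length: {min_words}-{max_words} words.\n"
        "Prefix every line with the speaker's name and a colon.\n\n"
        f"Document:\n{document_text}"
    )
```

For example, `build_prompt("roast", "2-4", text)` asks for a 300-600 word script under these assumptions. Expressing length as a word budget is a design choice: a model can count words far more easily than it can estimate minutes of audio.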
Once the plan is ready, the creator proceeds to build the app, running the agent to implement the features. They demonstrate the app’s interface, which allows file uploads and selection of podcast styles and lengths. As a test, they upload a credit card statement and choose the roast style for a 2-4 minute podcast. The app generates a humorous podcast episode with two speakers roasting the statement, showcasing the app’s ability to create entertaining content from mundane documents.
Next, the creator tests the app with a more complex document, a PDF paper titled “Butterbench,” which evaluates LLM-controlled robots for practical intelligence. They select the “explain like a fifth grader” style and generate a 2-4 minute podcast episode. The resulting audio explains the research paper in simple terms, demonstrating the app’s versatility in handling technical content and making it accessible to a general audience. This highlights the app’s potential for educational and informative podcast creation.
Finally, the creator experiments with an article about Anthropic’s CEO discussing the AI industry and risk-taking. They use the steelman versus strawman debate style to generate a podcast episode presenting different viewpoints on the AI bubble. Although the output did not fully match the initial prompt, it still provided a balanced discussion. The creator concludes by praising the smooth workflow enabled by Claude-based tools and context engineering, expressing satisfaction with the app’s performance and encouraging viewers to look forward to the next Shipmas day project.