Veo 3 Advanced Prompting: JSON, XML, and NLP Experimental Prompts

The video experiments with three prompting methods (JSON, XML, and enhanced natural language) to generate AI videos using Veo 3, finding that each format produces quality results with slight variations in effects, realism, and accuracy depending on the scenario. The creator concludes that all three methods are valuable, highlights how easily prompt conversion can be automated, and encourages viewers to explore these techniques through the creator's free resources and AI video course.

In this video, the creator runs an experiment comparing three prompting methods for generating AI videos with Veo 3: JSON prompting, XML prompting, and an enhanced natural language prompt. The goal is to see whether the different formats produce noticeable differences in the resulting videos. The creator sets up three agents that automatically convert the same basic input prompt into each format, allowing a side-by-side comparison of the outputs.
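To make the three formats concrete, here is a minimal sketch of what converting one basic prompt into JSON, XML, and a natural language string could look like. It is not the creator's setup: the video uses agents for the conversion, while this sketch uses plain deterministic functions as a stand-in. The field names (`scene`, `action`, `style`), the `video_prompt` root tag, and the function names are all assumptions chosen for illustration, and the example values are paraphrased from the first test described in this summary.

```python
import json
import xml.etree.ElementTree as ET


def to_json_prompt(fields: dict) -> str:
    """Serialize the prompt fields as an indented JSON object."""
    return json.dumps(fields, indent=2)


def to_xml_prompt(fields: dict) -> str:
    """Serialize the prompt fields as child elements of one root tag."""
    root = ET.Element("video_prompt")  # hypothetical root tag name
    for key, value in fields.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def to_natural_language_prompt(fields: dict) -> str:
    """Join the prompt fields into a single descriptive sentence pair."""
    return f"In {fields['scene']}, {fields['action']}. Style: {fields['style']}."


# Hypothetical field breakdown of the cardboard box prompt.
example_fields = {
    "scene": "an empty living room with a cardboard box",
    "action": "the box explodes to reveal a fully decorated interior in seconds",
    "style": "IKEA ad",
}

if __name__ == "__main__":
    print(to_json_prompt(example_fields))
    print(to_xml_prompt(example_fields))
    print(to_natural_language_prompt(example_fields))
```

All three functions carry the same information, which is what makes a side-by-side comparison meaningful: only the packaging of the prompt changes between runs.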

The first test uses a prompt describing a cardboard box in an empty living room that explodes to reveal a fully decorated interior in seconds, styled like an IKEA ad. The JSON and XML prompts produce very similar videos, with the XML version showing a slightly more pronounced explosion effect and audio the creator prefers. The enhanced natural language prompt also produces a smooth video, but the couch animation glitches slightly near the end. Overall, the structured prompts (JSON and XML) perform well, and the enhanced natural language prompt shows promise despite the small issue.

Next, the creator tests a street interview scenario set in rainy New York, where a man asks a woman how a 30-day rain shower affects her life. All three prompting methods generate coherent videos with good audio and visuals. The XML prompt is favored for its realistic background movement and sound design, followed by the enhanced natural language prompt, and lastly the JSON prompt. Each version captures the essence of the scene well, demonstrating the effectiveness of all three approaches in handling dialogue and environmental details.

The final experiment involves an ASMR-style video of a person slicing a hyperrealistic Minecraft obsidian lava block. Here, the JSON prompt produces the most satisfying and accurate slicing action, while the enhanced natural language prompt also performs well. The XML prompt, however, results in a less convincing cut, with awkward finger positioning and slicing angles. Despite this, all three prompts deliver good results, showing that each format can be useful depending on the specific video content and desired outcome.

In conclusion, the creator finds the experiment valuable and plans to incorporate these prompting methods into their Veo 3 AI video workflow. They highlight how easy it is to create agents that automate prompt conversion and note that example prompts are available for free on their website, aivididecourse.com. The video encourages viewers to explore these prompting techniques and to consider joining the creator's AI video course for deeper learning, emphasizing the potential of structured and enhanced natural language prompts to improve AI video generation.