GPT-5 vs. Sonnet-4 design contest: Python script to web app refactoring

The video compares GPT-5’s and Sonnet-4’s redesigns of a complex Python script into FastAPI web apps, weighing Sonnet’s superior design and solid UX against GPT-5’s more functional layout with detailed cost and report features. Ultimately, Sonnet wins on visual appeal and user experience, while GPT-5 is praised for functionality and detailed output presentation.

The video compares two redesigns of a complex Python script called the Context Engineer, which generates and answers questions on any topic using web search, then compiles a detailed report. The original script is about 600 lines long and includes features like token usage calculation, directory structuring, and two separate agents for question creation and answering. The creator tasked both GPT-5 and Sonnet-4 with converting this script into a FastAPI web app using the same prompt, and the video evaluates their outputs across several criteria.
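The token-usage accounting the original script is said to perform could be sketched along the following lines; the agent names and per-token rates here are illustrative assumptions for the sketch, not figures from the video or the script itself.

```python
# Sketch of per-call token accounting for a two-agent pipeline.
# The rate table below is a hypothetical example, not real pricing.
ILLUSTRATIVE_RATES = {  # USD per 1M tokens: (input_rate, output_rate)
    "question-agent": (3.00, 15.00),
    "answer-agent": (3.00, 15.00),
}

def estimate_cost(agent: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single agent call."""
    in_rate, out_rate = ILLUSTRATIVE_RATES[agent]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

def total_cost(calls: list[tuple[str, int, int]]) -> float:
    """Sum costs over a run's (agent, input_tokens, output_tokens) records."""
    return sum(estimate_cost(a, i, o) for a, i, o in calls)
```

A web frontend like GPT-5’s can then surface `total_cost(...)` alongside the per-agent breakdown after a run completes.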

Starting with design quality, the presenter, while admitting he is not a designer, prefers Sonnet’s version for its more pleasant color scheme and overall aesthetic appeal. Sonnet’s interface includes nice touches such as gradients and a brain icon that make it visually appealing. GPT-5’s design is more compact and functional, fitting everything on one screen without scrolling, which some users might prefer. Nevertheless, the presenter awards Sonnet a double pass for design quality, appreciating its stronger visual presentation.

In terms of user experience (UX), both apps let users enter a topic and select a purpose and timeframe for the research. GPT-5’s version offers more customization, such as adjusting the number of questions, toggling web search, and removing citations, whereas Sonnet’s app provides preset purposes with no custom input and fewer adjustable settings. GPT-5 also shows a progress bar and waiting animation during processing, which Sonnet lacks, making GPT-5’s UX more informative and interactive. Sonnet, for its part, displays the generated questions clearly and updates progress visibly, earning mixed but generally positive marks for UX.

When it comes to functionality and adherence to the original script, both apps organize output files into folders named after the topic and purpose, with Sonnet’s organization slightly more refined. GPT-5’s app excels in displaying detailed token usage, costs, and the final report with a download option, while Sonnet shows questions but does not display or allow downloading the final report. This gives GPT-5 an edge in final result presentation, though Sonnet’s output structure is commendable.
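The folder-per-run organization both apps adopt, with directories named after the topic and purpose, could look something like this sketch; the slug format and helper names are assumptions for illustration, not code from either app.

```python
import re
from pathlib import Path

def slugify(text: str) -> str:
    """Lowercase the text and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def run_directory(base: Path, topic: str, purpose: str) -> Path:
    """Create (if needed) and return a run folder named after topic and purpose."""
    path = base / f"{slugify(topic)}_{slugify(purpose)}"
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Generated questions, answers, and the final report for a run would then all land under one predictable directory, which also makes a per-run download endpoint straightforward to wire up.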

Overall, the evaluation is subjective but thorough, with Sonnet winning the contest primarily due to its superior design and solid UX features, despite some limitations like lack of waiting animations and final report display. GPT-5’s app is praised for its compact layout, detailed cost breakdown, and final report presentation but loses points for less appealing design and missing progress indicators. The presenter notes that previous comparisons favored GPT-5, but this time Sonnet takes the lead. The code for both web apps is available on the presenter’s Patreon for viewers interested in exploring further.