Gemma 4 Local AI Test

Google’s Gemma 4 series of AI models introduces significant advances over its predecessor, including expanded multilingual support, larger context windows, enhanced reasoning, and optimized on-device performance across a range of model sizes. The models handle complex tasks and agentic applications well, and integration tools such as the Hermes agent make local AI interactions more efficient.

Google has recently released the Gemma 4 series of AI models, succeeding the popular Gemma 3 lineup. The new models bring several improvements, including a shift to the standard Apache 2.0 open license and support for up to 140 languages. The lineup spans several sizes, including E2B, E4B, 26B, A4B (a mixture-of-experts model), and a dense 31B model, all optimized for on-device use and capable of reasoning. The smaller models, such as the 2B and 4B, are designed to run efficiently on low-end hardware, including phones and modest GPUs, making them accessible to a wide range of users.

The Gemma 4 models feature enhanced capabilities such as extended context windows (up to 128k tokens), improved coding and agentic functionality, and full multimodal support, minus audio, in the smaller models. The mixture-of-experts model (A4B) promises a balance of speed and quality, while the 31B model offers the highest capability. The new models show significant performance gains over Gemma 3, with benchmark scores jumping dramatically on MMLU, Codeforces, and LiveCodeBench, indicating a substantial leap in quality and reasoning ability.

Testing the models revealed some strengths and weaknesses. For example, the model handled complex ethical dilemmas with nuanced reasoning but maintained strict safety protocols that limited certain responses, such as refusing to endorse violence or coercion. While it did not outright refuse challenging prompts, its reasoning sometimes relied heavily on safety guidelines rather than deeper internal logic. The model performed well on various parsing and math problems, though it struggled with some precision tasks like counting letters in a word, highlighting areas for further refinement.
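Letter-counting failures like the one above are easy to verify against ground truth. A minimal Python check (the word and letter here are illustrative, not the exact prompt used in the video):

```python
# Ground-truth check for the classic letter-counting prompt.
# The word/letter pair below is illustrative, not the exact test prompt.
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())

print(count_letter("strawberry", "r"))  # -> 3
```

Comparing a model's answer against a check like this makes precision regressions easy to spot across model versions.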

The presenter also demonstrated how to set up and run the Gemma 4 models with vLLM, integrating with tools like Open WebUI and the Hermes agent. The agentic framework, particularly the Hermes agent, was highlighted as a game-changer for interacting with local AI models, letting users issue commands and review results asynchronously. Some technical issues, such as tool-calling parser bugs, were mentioned but expected to be resolved soon. The presenter emphasized updating dependencies such as Transformers and the vLLM nightly builds to ensure smooth operation.
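A rough sketch of that setup, assuming a Linux box with a recent Python and CUDA: the exact model ID is a placeholder (the video does not name the Hugging Face repo), and the nightly wheel index follows vLLM's documented pattern.

```shell
# Update Transformers and install a vLLM nightly build, as the presenter advises.
pip install -U transformers
pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly

# Serve the model behind an OpenAI-compatible API that Open WebUI (or the
# Hermes agent) can point at. <gemma-4-model-id> is a placeholder, not a
# confirmed repo name; 131072 matches the 128k-token context window.
vllm serve <gemma-4-model-id> --max-model-len 131072
```

Open WebUI can then be configured to use `http://localhost:8000/v1` (vLLM's default OpenAI-compatible endpoint) as its model backend.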

Overall, Gemma 4 represents a major advancement in local AI models, offering improved multilingual support, larger context windows, and better reasoning capabilities. The models are well-suited for agentic applications and on-device deployment, with promising performance across a variety of tasks. The presenter plans further testing, especially with smaller models, and encourages viewers to explore guides and resources available on digitalspaceport.com for setup and usage tips. The release marks an exciting step forward for accessible, powerful local AI technology.