FunctionGemma - Function Calling at the Edge

FunctionGemma is a lightweight, customizable function-calling language model designed to run efficiently on edge devices like mobile phones, enabling apps and games to execute specific functions locally. Built on the Gemma 3 base model and fine-tuned for practical use cases, it supports developer customization, ships with tools for fine-tuning and deployment, and is demonstrated through a mobile app and browser-compatible versions.

The video introduces FunctionGemma, a new open model release from the Gemma team, designed to bring customizable function-calling capabilities to small language models that run efficiently on edge devices such as mobile phones. Unlike the more research-focused T5Gemma 2, FunctionGemma targets practical applications such as games and apps, where a lightweight model can not only chat but also execute specific functions within the app. It builds on the Gemma 3 270M-parameter base model, which is notable for its strong performance at that size and its training on 6 trillion tokens, making it well suited to edge deployment and educational use.

FunctionGemma stands out because it is fine-tuned specifically for function calling, and developers can customize it further by fine-tuning it on their own datasets so it performs well on the specific functions an app or game requires. The model uses special tokens for function declarations, calls, and responses, enabling a workflow in which the model suggests a function call based on user input, the app executes the function, and the result is fed back into the model to generate a final response. This approach mirrors the function-calling mechanisms of larger proprietary models but is optimized for running locally on phones or edge hardware such as the Jetson Nano.
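The round trip described above (model suggests a call, the app executes it, the result goes back to the model) can be sketched in plain Python. Note that the `<function_call>` tag format, the tool names, and the stubbed model output below are illustrative assumptions, not FunctionGemma's actual special tokens or API:

```python
import json

# Hypothetical tool implementations an app might expose to the model.
# The tool name and return shape are assumptions for illustration.
TOOLS = {
    "get_weather": lambda city: {"city": city, "forecast": "sunny", "temp_c": 21},
}

def run_function_call(model_output: str) -> dict:
    """Parse a model-suggested call of the (assumed) form
    <function_call>{"name": ..., "args": {...}}</function_call>,
    execute it, and return a result to feed back to the model."""
    start = model_output.index("<function_call>") + len("<function_call>")
    end = model_output.index("</function_call>")
    call = json.loads(model_output[start:end])
    result = TOOLS[call["name"]](**call["args"])
    # In a real app, this result would be wrapped in the model's
    # function-response token format and appended to the conversation
    # so the model can generate its final natural-language answer.
    return {"name": call["name"], "response": result}

# Stand-in for what the model might emit for "What's the weather in Paris?"
suggested = '<function_call>{"name": "get_weather", "args": {"city": "Paris"}}</function_call>'
print(run_function_call(suggested))
```

The same dispatch loop works regardless of which runtime serves the model, which is what makes the pattern portable across phone, browser, and Jetson-class deployments.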

Google has released a mobile app demonstrating FunctionGemma's capabilities, showcasing example apps and games that run fully locally on phones using this model. The model can also be converted to run in browsers with transformers.js, enabling lightweight, client-side function calling. The Gemma team has additionally published documentation and notebooks covering inference and fine-tuning, with examples of defining functions, passing prompts, and handling function-call outputs, making it easier for developers to integrate and customize the model.
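The "defining functions and passing prompts" step usually means embedding a machine-readable function declaration in the prompt. A minimal sketch, assuming a JSON-schema-style declaration like those used by other function-calling APIs (FunctionGemma's exact declaration format and tag names may differ; consult the official docs):

```python
import json

# A hypothetical declaration for a meeting-scheduling function.
# The schema style is an assumption borrowed from common function-calling APIs.
schedule_meeting = {
    "name": "schedule_meeting",
    "description": "Create a calendar event.",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "start_time": {"type": "string", "description": "ISO 8601 timestamp"},
            "duration_minutes": {"type": "integer"},
        },
        "required": ["title", "start_time"],
    },
}

def build_prompt(user_message: str, declarations: list[dict]) -> str:
    """Embed declarations in the prompt so the model knows which functions
    it may call. The tag names here are placeholders, not real tokens."""
    decls = "\n".join(json.dumps(d) for d in declarations)
    return (
        f"<function_declarations>\n{decls}\n</function_declarations>\n"
        f"<user>{user_message}</user>"
    )

prompt = build_prompt("Set up a 30 minute sync tomorrow at 10am", [schedule_meeting])
print(prompt)
```

In practice the model's chat template handles this wrapping for you; the sketch just makes visible what the template produces.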

Fine-tuning FunctionGemma is a key part of its utility, as the base model may not perform well on specific tasks without customization. The video walks through a fine-tuning notebook that uses the Hugging Face TRL library to train the model on about 10,000 examples from the Mobile Actions dataset released by Google. Fine-tuning improves the model's accuracy significantly, enabling it to correctly identify and execute functions such as scheduling meetings. The notebook also covers converting the fine-tuned model to the LiteRT format (formerly TensorFlow Lite), which is optimized for deployment on mobile and edge devices, making it practical for real-world applications.

In conclusion, FunctionGemma represents a significant step forward for running customizable function-calling language models on edge devices. It gives developers a powerful yet lightweight tool for building apps and games that operate fully locally, with the ability to fine-tune the model for specific functions and deploy it efficiently using LiteRT. While it is not the anticipated Gemma 4 release, FunctionGemma is a valuable resource for anyone interested in edge AI and mobile deployment. The video encourages viewers to explore the model, try out the demos, and engage with the community for further development and support.