Apple's SECRET iPhone LLM Destroys Phi-3 Performance! WWDC 2024

The video discusses Apple’s bespoke Large Language Model (LLM) for iPhones and other Apple hardware: a 3-billion-parameter model said to rival the performance of larger models. Running on Apple silicon, it reflects Apple’s focus on on-device processing. The video claims it outperforms existing models of comparable size and could change what AI can do on edge devices.

A recent video covering Apple’s WWDC 2024 event reports that Apple is developing its own bespoke Large Language Models (LLMs) for iPhones and other Apple hardware. This internal work has produced a 3-billion-parameter model that is claimed to rival existing 7- and 8-billion-parameter models. The model runs on Apple silicon and, according to the video, handles image recognition as well as text. Its reported speed is about 0.6 milliseconds per prompt token for time to first token, with generation at around 30 tokens per second. It uses grouped-query attention together with activation and embedding quantization, features not commonly seen in edge-device models.
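To make the quoted speed figures concrete, here is a rough back-of-the-envelope sketch. It assumes the 0.6 ms figure is the per-prompt-token cost before the first output token appears, and reads the quoted throughput as roughly 30 generated tokens per second. The function name and the example prompt and response sizes are hypothetical, chosen only for illustration.

```python
def estimate_response_seconds(prompt_tokens, output_tokens,
                              ms_per_prompt_token=0.6,
                              tokens_per_second=30.0):
    """Rough latency estimate from the two figures quoted in the video.

    Assumption: prompt processing costs a fixed time per prompt token,
    and generation proceeds at a constant token rate afterwards.
    """
    # Time spent reading the prompt before the first output token.
    time_to_first_token = prompt_tokens * ms_per_prompt_token / 1000.0
    # Time spent producing the response at the quoted generation rate.
    generation_time = output_tokens / tokens_per_second
    return time_to_first_token, generation_time, time_to_first_token + generation_time


# Hypothetical workload: a 1,000-token prompt and a 150-token reply.
ttft, gen, total = estimate_response_seconds(1000, 150)
print(f"time to first token: {ttft:.2f} s")
print(f"generation time:     {gen:.2f} s")
print(f"total:               {total:.2f} s")
```

Under these assumptions the prompt is processed in well under a second, and generation accounts for most of the wait, so response length matters more than prompt length for perceived latency.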

Apple has also built server versions of the model designed to run on Apple silicon servers, showing the company’s commitment to integrating its AI technology across its hardware ecosystem. The on-device model is optimized with techniques such as dynamically loading, caching, and swapping LoRA adapters, which lets one base model serve apps with different language or functionality requirements. Apple’s product-first mindset and tight hardware-software integration set it apart in the AI space and may put it ahead of competitors in deploying LLMs on edge devices. According to the video, the model outperforms previous state-of-the-art models, including mobile-oriented models from other tech giants.
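The adapter mechanism mentioned above can be illustrated with a toy sketch. This is not Apple’s implementation: it shows only the general LoRA idea of adding a small low-rank update to a frozen base weight, plus a minimal least-recently-used cache so adapters are loaded once and swapped per request. The task names, matrices, and the AdapterCache class are all hypothetical.

```python
from collections import OrderedDict


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class AdapterCache:
    """Hypothetical LRU cache: load an adapter on first use, keep the most recent ones."""

    def __init__(self, load_adapter, capacity=2):
        self.load_adapter = load_adapter
        self.capacity = capacity
        self._cache = OrderedDict()

    def get(self, name):
        if name in self._cache:
            self._cache.move_to_end(name)      # mark as most recently used
        else:
            self._cache[name] = self.load_adapter(name)
            if len(self._cache) > self.capacity:
                self._cache.popitem(last=False)  # evict least recently used
        return self._cache[name]


def forward(base_weight, x, adapter=None):
    """Compute y = W x, plus the low-rank update scale * B (A x) if an adapter is given."""
    y = matvec(base_weight, x)
    if adapter is not None:
        a, b, scale = adapter                  # A is r x d_in, B is d_out x r
        delta = matvec(b, matvec(a, x))
        y = [yi + scale * di for yi, di in zip(y, delta)]
    return y


# Toy rank-1 adapters for two made-up tasks; the base weight stays frozen.
ADAPTERS = {
    "summarize": ([[1, 1]], [[1], [0]], 0.5),
    "proofread": ([[1, -1]], [[0], [1]], 1.0),
}
load_log = []


def load_from_storage(name):
    load_log.append(name)                      # record each (simulated) load
    return ADAPTERS[name]


cache = AdapterCache(load_from_storage, capacity=2)
base_weight = [[1, 0], [0, 1]]
x = [2, 4]
for task in ["summarize", "proofread", "summarize"]:
    print(task, forward(base_weight, x, cache.get(task)))
print("loads from storage:", load_log)
```

In this sketch the third request reuses the cached "summarize" adapter rather than loading it again, which is the point of caching: switching tasks changes only the small A and B matrices while the large base weight is shared.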

The benchmarks shown for Apple’s LLM are impressive, with the on-device version outperforming previous top models, including those from Google and Microsoft. Apple’s dedication to on-device processing and edge compute signals a shift in how developers can use AI in their applications. The integration of Apple’s LLMs into iOS 18, alongside a collaboration with OpenAI, brings advanced AI capabilities to Apple devices, potentially improving user experiences and enabling new functionality such as predictive text and AI suggestions.

The video also highlights the significance of edge compute and the impact of bringing advanced AI capabilities to mobile devices. When tasks traditionally handled on cloud servers can run on-device, developers can build new kinds of features and improve how users interact with their applications. Apple’s move toward local LLM processing on its own hardware points to more accessible and powerful AI on mobile devices and sets the stage for further advances in the AI ecosystem. Overall, Apple’s work on developing and deploying its own LLMs reflects its commitment to integration across hardware and software, and it could reshape the landscape of AI on edge devices.