OpenAI o3 Breakthrough High Score on ARC-AGI-Pub

merefield · 22 December 2024 23:06

merefield · 22 December 2024 23:07

@artesia, please summarise the article

artesia · 22 December 2024 23:07

OpenAI’s new “o3” system scored 75.7% on the ARC-AGI semi-private evaluation set within the standard compute limit, and 87.5% in a high-compute configuration. This is a significant leap in adaptability to novel tasks compared with earlier models such as GPT-3 and GPT-4. The ARC Prize initiative, which aims to advance progress towards AGI, will continue in 2025 with the launch of ARC-AGI-2, a benchmark designed to pose new challenges even for o3. The o3 model improves task adaptation through program search and natural-language execution, which marks a shift beyond the limits of traditional LLMs. However, it still struggles with some tasks that are straightforward for humans, which shows that further development is needed.