In this podcast, Professor Hannah Fry and Four Flynn discuss the evolving cybersecurity landscape in the AI era, highlighting challenges like phishing, zero-day vulnerabilities, and AI-specific threats such as prompt injections and polymorphic malware. Flynn also introduces innovative AI-driven projects like Big Sleep and Mender that autonomously discover and patch software vulnerabilities, aiming to enhance global cybersecurity through automation and collaboration.
In this podcast episode, Professor Hannah Fry interviews Four Flynn, VP of Security at Google DeepMind, about the evolving landscape of cybersecurity, particularly in the age of AI. Flynn reflects on the landmark Operation Aurora cyberattack in 2009, in which a nation-state targeted Google and other companies through a phishing attack that exploited a vulnerability in Internet Explorer. This event marked a significant shift in cybersecurity, highlighting the move from server-side attacks to client-side attacks that exploit human vulnerabilities. It also led to innovations such as Google’s BeyondCorp (zero trust) security model and a broader push for multifactor authentication, although Flynn notes that industry-wide adoption of these best practices has been slow.
Flynn explains the complexity of cybersecurity, emphasizing the vast number of potential vulnerabilities in the millions of lines of code that underpin modern systems. He categorizes security failures into three main types: social engineering, configuration errors, and integrity issues. He illustrates these with examples such as phishing, misconfigured access controls, and vulnerabilities in IoT devices like smart fish tank thermometers that can serve as entry points for attackers. Flynn also discusses zero-day vulnerabilities, which are flaws not yet known to defenders and can therefore be exploited even when all known defenses are in place, and the importance of defense in depth, where multiple layers of security work together so that a failure in one layer does not compromise the whole system.
The conversation then shifts to the impact of large language models (LLMs) like Gemini on cybersecurity. Flynn highlights that LLMs introduce new challenges because they are non-deterministic and susceptible to attacks such as prompt injections, where malicious inputs can manipulate the model’s behavior. He also notes emerging threats where attackers use AI to create polymorphic malware that can evade detection by constantly changing its code. Additionally, data poisoning and tampering with model weights pose risks to the integrity of AI systems. To counter these threats, Google DeepMind employs adaptive attack simulations and layered defenses, including classifiers that detect malicious behavior, to harden their models against exploitation.
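To make the polymorphic malware point concrete, here is a minimal sketch of why detection based on exact file signatures fails once code changes between copies. It is not taken from the episode: the blocklist `KNOWN_BAD`, the helper names, and the harmless stand-in payload are all hypothetical, and real detection systems use far more than hash matching.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Harmless stand-in for a known-bad sample (assumption for illustration).
original = b"print('payload')"

# A variant with identical behavior but different bytes:
# only a trailing comment has been added.
variant = original + b"  # 7f3a"

# Hypothetical signature blocklist containing only the original's digest.
KNOWN_BAD = {sha256_hex(original)}


def is_flagged(sample: bytes) -> bool:
    """Flag a sample only if its exact digest is on the blocklist."""
    return sha256_hex(sample) in KNOWN_BAD


original_flagged = is_flagged(original)  # exact match against the blocklist
variant_flagged = is_flagged(variant)    # digest differs, so no match
```

Because any change to the bytes produces a different digest, the variant slips past the exact-match check, which is one reason behavior-based classifiers and layered defenses of the kind mentioned above are needed.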
On the defensive side, Flynn introduces two pioneering projects at Google DeepMind: Big Sleep and Mender. Big Sleep uses AI to autonomously search for novel zero-day vulnerabilities in widely used open-source software, effectively acting as a tireless vulnerability researcher. This approach leverages the AI’s superhuman ability to understand vast codebases and identify subtle flaws that humans might miss. Mender complements this by automatically generating patches for discovered vulnerabilities, using a combination of AI-generated code, formal verification methods, and human review to ensure the fixes are secure and maintain functionality. Together, these projects aim to accelerate vulnerability discovery and remediation, benefiting the entire software ecosystem.
Finally, Flynn expresses his ambition to use AI to find and patch every vulnerability in code worldwide, though he acknowledges the human and organizational challenges involved in deploying patches effectively. He stresses the importance of transparency and open-source collaboration in improving security and notes that Google’s unique combination of data, code, and talent positions it at the forefront of this effort. The episode concludes with a teaser for the second part of the podcast, which will explore the human side of cybercrime and how AI is changing the ways people can be manipulated and tricked.