Researchers demonstrated in October 2025 that most AI security defenses are easily bypassed, raising serious concerns about the effectiveness of current AI security products. A team from OpenAI, Anthropic, and Google DeepMind published a paper titled, "The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against Llm Jailbreaks and Prompt Injections," which detailed how they successfully circumvented 12 published AI defenses, many of which claimed near-zero attack success rates. The research highlights a critical gap between the security measures being deployed and the sophistication of potential attacks.
The study revealed that the research team achieved bypass rates exceeding 90% on most of the defenses tested. This suggests that many AI security products are not being adequately tested against realistic attacker behaviors. The team evaluated prompting-based, training-based, and filtering-based defenses under adaptive attack conditions, and found that all of them were vulnerable. Prompting defenses, for example, experienced attack success rates ranging from 95% to 99% under adaptive attacks. Training-based methods fared similarly poorly, with bypass rates reaching 96% to 100%.
To rigorously test the defenses, the researchers designed a comprehensive methodology that included 14 authors and a $20,000 prize pool for successful attacks. This approach aimed to simulate real-world adversarial conditions and incentivize the development of effective bypass techniques. The fact that the researchers were able to consistently overcome the defenses, despite their claimed near-zero attack success rates, underscores the severity of the problem.
Louis Columbus, writing in January 2026, emphasized the implications for enterprises, stating that many AI security products are being tested against attackers that don't behave like real attackers. This raises questions about the due diligence processes of security teams and the accuracy of vendor claims.
The findings have prompted calls for a more robust and adversarial approach to AI security testing. Experts recommend that organizations ask vendors critical questions about their testing methodologies, including whether they have been subjected to adaptive attacks and red teaming exercises. The research also highlights the need for ongoing monitoring and adaptation of AI defenses, as attackers continuously evolve their techniques. The rapid advancement of AI technology necessitates a proactive and dynamic approach to security, rather than relying on static defenses that can be easily bypassed.
Discussion
Join the conversation
Be the first to comment