Researchers demonstrated the vulnerability of artificial intelligence defenses, successfully breaching every system they tested, according to a study published in October 2025. The paper, titled "The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against LLM Jailbreaks and Prompt Injections," revealed that 12 AI defenses, many of which claimed near-zero attack success rates, were bypassed with success rates exceeding 90% in most cases. The research was conducted by a team from OpenAI, Anthropic, and Google DeepMind.
The findings raise serious concerns about the effectiveness of AI security products currently being deployed by enterprises. Louis Columbus reported on January 23, 2026, that many of these products are tested against attackers that do not accurately represent real-world threats.
The research team evaluated prompting-based, training-based, and filtering-based defenses under adaptive attack conditions. Prompting defenses, designed to prevent malicious prompts from manipulating AI models, experienced attack success rates between 95% and 99%. Training-based methods, which aim to fortify AI models against attacks through specific training data, fared similarly poorly, with bypass rates ranging from 96% to 100%.
The researchers employed a rigorous methodology to validate the claims made by the AI defense systems. This included a team of 14 authors and a $20,000 prize pool incentivizing successful attacks. The study tested defenses across four categories, all of which initially claimed near-zero attack success rates.
The implications of this research extend beyond immediate security concerns. The widespread adoption of AI across various sectors, from finance to healthcare, necessitates robust security measures. The demonstrated vulnerability of current AI defenses highlights the need for a more proactive and adaptive approach to AI security.
Given these findings, enterprises procuring AI security solutions should ask vendors critical questions about their testing methodologies and adaptive attack resilience. These questions should include:
1. What types of adaptive attacks have been used to test the system?
2. What is the documented attack success rate under adaptive attack conditions?
3. How frequently is the system re-evaluated against new attack vectors?
4. What methods are used to simulate real-world attacker behavior?
5. How does the system handle prompt injections and jailbreaking attempts?
6. What is the process for updating the system in response to newly discovered vulnerabilities?
7. Can the vendor provide independent verification of the system's security claims?
The research underscores the importance of continuous monitoring and adaptation in the face of evolving AI threats. As AI technology advances, so too must the strategies for defending against malicious actors. The findings suggest a need for greater collaboration between AI developers, security researchers, and enterprises to develop more robust and resilient AI security solutions.
Discussion
Join the conversation
Be the first to comment