
Bug bounty hunter on the reality of autonomous AI hacking – “AI found vulnerabilities I would not have looked for”



Artificial intelligence is often discussed in the context of future cybersecurity threats and opportunities. But according to Pedro Paniago, Offensive Security Consultant at PwC Belgium, security researcher and bug bounty hunter, the transformation is already underway. At the recent Application Security Experience Sharing Day, he shared the results of an experiment in which he used AI extensively to identify vulnerabilities in real-world systems. In a conversation with the Coalition, Paniago explained that AI is no longer just a productivity tool; it is becoming an active participant in offensive security operations.

What prompted you to start experimenting with AI for bug hunting? 

Pedro Paniago: “For months, I kept hearing conflicting messages. Some people argued that AI could only find simple vulnerabilities and that experienced security researchers had little to worry about. Others claimed that AI-powered penetration testing tools were already outperforming human hackers in certain situations. I wanted to test those claims for myself. 

At the same time, I had recently become a father and had less time available for bug bounty hunting. If AI could help automate parts of my workflow, it would allow me to continue researching effectively. A bug bounty event in Belgium provided the perfect opportunity to experiment on a large, mature target that I had never tested before.” 

How did the experiment begin? 

Pedro Paniago: “Initially, not great. I started with a standard AI setup, vanilla Claude Code, and simply asked it to find vulnerabilities. It quickly became obvious that this approach would not work. The model lacked context, methodology and access to the information it needed. The first important lesson was: prompting alone is not enough. 

I then began building a more structured environment around the AI. I provided testing scopes, credentials, instructions, methodologies and open-source security tools. In essence, I tried to translate my own manual workflow into something the AI could follow. Once that framework was in place, the results changed dramatically.” 
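To make the idea of a structured environment concrete, here is a minimal Python sketch of how scope, methodology and a tool list might be bundled into one object that an agent is briefed from. It is an illustration only, not Paniago's actual setup: the `Engagement` class, its field names, the example hosts and the tool names are all assumptions, and credential handling is deliberately left out.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class Engagement:
    """Hypothetical container for the context handed to an AI agent."""

    in_scope_hosts: set[str]
    methodology: list[str] = field(default_factory=list)
    tools: list[str] = field(default_factory=list)

    def is_in_scope(self, url: str) -> bool:
        """True only if the URL's host is an in-scope host or a subdomain of one."""
        host = (urlsplit(url).hostname or "").lower()
        return any(host == h or host.endswith("." + h) for h in self.in_scope_hosts)

    def briefing(self) -> str:
        """Render the context as plain text that could be prepended to a prompt."""
        lines = ["In-scope hosts: " + ", ".join(sorted(self.in_scope_hosts))]
        lines += [f"Step {i}: {step}" for i, step in enumerate(self.methodology, 1)]
        lines.append("Allowed tools: " + ", ".join(self.tools))
        return "\n".join(lines)


# Example values are placeholders, not taken from the interview.
engagement = Engagement(
    in_scope_hosts={"example.com"},
    methodology=["Map endpoints", "Test access control on each endpoint"],
    tools=["http-prober", "dir-enumerator"],
)
print(engagement.briefing())
print(engagement.is_in_scope("https://shop.example.com/cart"))
```

The scope check runs in code rather than being left to the model, so a request to an out-of-scope host can be refused before any tool is invoked.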

What kind of results did you achieve? 

Pedro Paniago: “During the two-week experiment, the system identified authentication bypasses, IDOR vulnerabilities, business logic flaws, payment bypasses and account takeover issues. I submitted fourteen valid reports, earned plenty in bug bounty rewards and finished at the top of the event leaderboard, competing against highly skilled security researchers. The surprising part was that I had deliberately avoided performing manual hacking. The goal was to see how far the AI could go on its own, and it went much further than I expected.” 

How did you evolve the system after those initial successes? 

Pedro Paniago: “Once I saw the potential, I started building a more advanced architecture. I added specialised sub-agents to perform specific tasks, memory systems to track testing progress, context management to improve efficiency and integrations with my reconnaissance databases. I also automated repetitive processes so that the AI could focus on higher-value activities. 

The objective was not simply to automate scans. It was to create a system capable of reasoning through complex attack paths while retaining enough context to continue progressing over long investigations.” 
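A memory system for tracking testing progress can be as simple as a ledger of which checks have already run against which targets. The Python sketch below shows one possible shape, under the assumption that progress is stored as a small JSON file; `ProgressLedger`, its methods and the example endpoint names are hypothetical and are not described in the interview.

```python
import json
from pathlib import Path


class ProgressLedger:
    """Hypothetical progress memory: which (target, check) pairs have been run."""

    def __init__(self, path=None):
        # With path=None the ledger lives in memory only; otherwise it is
        # loaded from and saved to a JSON file so it survives restarts.
        self.path = Path(path) if path is not None else None
        self.entries = {}
        if self.path is not None and self.path.exists():
            self.entries = json.loads(self.path.read_text())

    @staticmethod
    def _key(target, check):
        return f"{check}::{target}"

    def already_done(self, target, check):
        return self._key(target, check) in self.entries

    def record(self, target, check, outcome):
        """Store the outcome of a check and persist it if a file path was given."""
        self.entries[self._key(target, check)] = outcome
        if self.path is not None:
            self.path.write_text(json.dumps(self.entries, indent=2, sort_keys=True))

    def pending(self, targets, checks):
        """All (target, check) pairs that have not been recorded yet."""
        return [(t, c) for t in targets for c in checks if not self.already_done(t, c)]


ledger = ProgressLedger()  # in-memory for the example
ledger.record("/api/orders", "idor", "no finding")
print(ledger.pending(["/api/orders", "/api/users"], ["idor", "auth-bypass"]))
```

Feeding only the `pending` pairs back to the agent is one way to keep a long investigation from repeating work it has already done.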

Were the vulnerabilities it found genuinely complex? 

Pedro Paniago: “Absolutely. One example involved a sandbox escape vulnerability that ultimately provided root-level access within an isolated environment. That access could potentially have enabled broader attacks against multiple tenants and even supply chain compromise. 

Another case began with a payment bypass issue. The AI discovered a chain of weaknesses that eventually led to a zero-click account takeover. By combining several vulnerabilities and understanding how different components interacted, it was able to gain access to sensitive administrative functions and user data. 

In a separate engagement involving a Brazilian bank, the AI uncovered publicly exposed credentials and certificates on an obscure website. By connecting several pieces of information, it identified a path to highly privileged banking functionality. 

What surprised me most was not that the AI found vulnerabilities, but that it was able to combine seemingly unrelated findings into meaningful attack chains.” 

Does that mean AI can now replace human security researchers? 

Pedro Paniago: “Not really. AI is powerful, but it still has important limitations. One of the biggest challenges is maintaining focus. The system may start investigating one vulnerability, discover something else and immediately switch directions. Without proper controls, it can lose track of its original objective. 

It also struggles with business context. Human researchers can often recognise when behaviour is intended rather than vulnerable. AI does not always make that distinction correctly. False positives remain an issue as well. Modern models hallucinate less than earlier versions, but they often exaggerate severity or misunderstand business impact. 

Long-running investigations can also be problematic. Without effective memory and tracking mechanisms, the AI may repeat tests or forget what has already been examined.” 

What does this mean for the future of offensive security? 

Pedro Paniago: “I believe we are already entering a new paradigm. Traditionally, humans performed most of the technical work while tools supported them. Today, I would estimate that AI can perform around 80% of the execution, leaving humans responsible for validation, strategic decisions and communication. 

That shifts the skill set required from security professionals. The challenge is no longer just finding vulnerabilities. It is managing context, memory, workflows and cost while directing AI systems effectively. In many ways, the role of the security researcher is evolving from operator to orchestrator.” 

You also spoke about the shrinking gap between disclosure and exploitation. Why is that important? 

Pedro Paniago: “Because speed is becoming the defining factor. Historically, attackers needed significant time to analyse patches, understand what had changed and develop working exploits. That process could take days, weeks or even months. AI is dramatically compressing that timeline. 

Recently, I reported a zero-day vulnerability that was fixed within twenty-four hours. After the vendor released the patch, I provided the patched and unpatched versions to an AI system. Within five minutes, it had identified the underlying vulnerability. That illustrates how quickly modern systems can reverse-engineer security fixes. 

As AI capabilities continue to improve, the window between patch release and active exploitation will become increasingly small.” 
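The first step of the patch comparison described in this answer, isolating what changed between the unpatched and patched versions, can be sketched with Python's standard `difflib` module. This is only an assumed illustration of that step: the two code snippets are invented, `changed_hunks` and `added_lines` are hypothetical helper names, and the interview does not say which tooling was actually used.

```python
import difflib


def changed_hunks(unpatched: str, patched: str, context: int = 2) -> list[str]:
    """Return the unified-diff lines between two versions of a source file."""
    return list(difflib.unified_diff(
        unpatched.splitlines(),
        patched.splitlines(),
        fromfile="unpatched",
        tofile="patched",
        n=context,
        lineterm="",
    ))


def added_lines(diff: list[str]) -> list[str]:
    """Lines introduced by the patch, skipping the '+++' file header."""
    return [l[1:].strip() for l in diff if l.startswith("+") and not l.startswith("+++")]


# Invented example: the patch adds an ownership check to an invoice lookup.
OLD = """def get_invoice(user, invoice_id):
    return db.load(invoice_id)
"""
NEW = """def get_invoice(user, invoice_id):
    invoice = db.load(invoice_id)
    if invoice.owner_id != user.id:
        raise PermissionError("not your invoice")
    return invoice
"""

diff = changed_hunks(OLD, NEW)
print("\n".join(diff))
print(added_lines(diff))
```

In this invented case the added ownership check points straight at a missing access-control test in the old version, which is the kind of inference that makes a published patch informative to anyone who compares the two versions.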

What impact is this having on bug bounty programmes and defenders? 

Pedro Paniago: “We are already seeing the effects. Bug bounty platforms are receiving more reports than ever before, and many of those reports are valid. The challenge for programme owners and triage teams is that the findings are becoming increasingly complex.  

AI is capable of correlating large volumes of information and constructing sophisticated attack chains. As a result, defenders must spend more time understanding and validating reports. At the same time, exploitation timelines are accelerating, leaving organisations with less time to assess and respond to risks. This combination creates significant operational pressure on security teams.” 

What should organisations do now? 

Pedro Paniago: “First, they need to accept that AI-powered offensive security is already a reality. Organisations should begin integrating AI into their own defensive processes, including code reviews, vulnerability discovery and continuous testing. Internal teams have access to source code, infrastructure documentation and system knowledge that external attackers do not. 

Second, security testing needs to become more continuous. Traditional penetration testing conducted once or twice a year is no longer sufficient when software is being updated constantly. 

Finally, security-by-design must become a priority. Building security into systems from the beginning is far more effective than trying to add it later.” 

What is the key takeaway for security leaders? 

Pedro Paniago: “The biggest risk is not that AI will replace security professionals. The real risk is that attackers, researchers and defenders will all become dramatically faster, but not at the same pace. 

The organisations that learn to harness AI effectively will gain a significant advantage. Those that wait may find themselves reacting to threats that move far faster than traditional security processes were ever designed to handle. 

The shift is already happening. The question is no longer whether AI will transform cybersecurity, but how quickly organisations can adapt.” 

 

About the author
Jo De Brabandere

Jo De Brabandere is an experienced marketing & communications expert and strategist.