Cyberveille, curated by Decio
6 results tagged Anthropic
Anthropic debuts preview of powerful new AI model Mythos in new cybersecurity initiative https://techcrunch.com/2026/04/07/anthropic-mythos-ai-model-preview-security/
08/04/2026 06:43:35

TechCrunch
Lucas Ropek
11:00 AM PDT · April 7, 2026

The new model will be used by a small number of high-profile companies to engage in defensive cybersecurity work.

Anthropic on Tuesday released a preview of its new frontier model, Mythos, which it says will be used by a small coterie of partner organizations for cybersecurity work. In a previously leaked memo, the AI startup called the model one of its “most powerful” yet.

The model’s limited debut is part of a new security initiative, dubbed Project Glasswing, in which 12 partner organizations will deploy the model for the purposes of “defensive security work” and to secure critical software, Anthropic said. While it was not specifically trained for cybersecurity work, the model will be used to scan both first-party and open source software systems for code vulnerabilities, the company said.

Anthropic claims that, over the past few weeks, Mythos identified “thousands of zero-day vulnerabilities, many of them critical.” Many of the vulnerabilities are one to two decades old, the company added.

Mythos is a general-purpose model for Anthropic’s Claude AI systems that the company claims has strong agentic coding and reasoning skills. Anthropic’s frontier models are considered its most sophisticated and high-performance models, designed for more complex tasks, including agent-building and coding.

The partner organizations previewing Mythos as part of Project Glasswing include Amazon, Apple, Broadcom, Cisco, CrowdStrike, the Linux Foundation, Microsoft, and Palo Alto Networks. As part of the initiative, these partners will ultimately share what they’ve learned from using the model so that the rest of the tech industry can benefit from it. The preview will not be made generally available, Anthropic said, though 40 organizations in total will gain access to the Mythos preview.

Anthropic also claims that it has engaged in “ongoing discussions” with federal officials about the use of Mythos. One imagines those discussions are complicated by the fact that Anthropic and the Trump administration are currently locked in a legal battle, after the Pentagon labeled the AI lab a supply-chain risk over Anthropic’s refusal to allow autonomous targeting or surveillance of U.S. citizens.

News of Mythos was originally leaked in a data security incident reported last month by Fortune. A draft blog about the model (then called “Capybara”) was left in an unsecured cache of documents available on a publicly inspectable data lake. The leak, which Anthropic subsequently attributed to “human error,” was originally spotted by security researchers. “‘Capybara’ is a new name for a new tier of model: larger and more intelligent than our Opus models — which were, until now, our most powerful,” the leaked document said, adding later that it was “by far the most powerful AI model we’ve ever developed,” according to the report.

In the leak, Anthropic claimed that its new model far exceeded the performance of its currently public models in areas like “software coding, academic reasoning, and cybersecurity,” and that it could potentially pose a cybersecurity threat if weaponized by bad actors to find and exploit bugs (rather than fix them, which is how Mythos will be deployed).

Last month, the company accidentally exposed nearly 2,000 source code files and over half a million lines of code via a mistake it made in the launch of version 2.1.88 of its Claude Code software package. The company then accidentally caused thousands of code repositories on GitHub to be taken down as it attempted to clean up the mess.

Correction April 7, 2026: An earlier version of this article erroneously stated how many partners are working with Anthropic on Project Glasswing. There are 12 partner organizations, though 40 organizations total will have access to the Mythos preview.

techcrunch.com EN 2026 AI Anthropic Mythos
Anthropic employee error exposes Claude Code source | InfoWorld https://www.infoworld.com/article/4152856/anthropic-employee-error-exposes-claude-code-source.html
01/04/2026 10:04:29

infoworld.com
by Howard Solomon
Mar 31, 2026

A version of the AI coding tool in Anthropic's npm registry included a source map file, which leads to the full proprietary source code.

An Anthropic employee accidentally exposed the entire proprietary source code of the company’s AI programming tool, Claude Code, by including a source map file in a version of the tool posted to Anthropic’s open npm registry account, a mistake one AI expert calls risky.

“A compromised source map is a security risk,” said US-based cybersecurity and AI expert Joseph Steinberg. “A hacker can use a source map to reconstruct the original source code and [see] how it works. Any secrets within that code – if someone coded in an API key, for example – [are] at risk, as is all of the logic. And any vulnerabilities found in the logic could become clear to the hacker who can then exploit the vulnerabilities.”

However, an Anthropic spokesperson told CSO, “no sensitive customer data or credentials were involved or exposed. This was a release packaging issue caused by human error, not a security breach. We’re rolling out measures to prevent this from happening again.”

But it wasn’t the first time this had happened; according to Fortune and other news sources, the same thing happened last month.

Don’t expose .map files
Map files shouldn’t be left in the final version of code published on open source registries, where anyone can download a package; they can be sources of useful information for hackers.

According to developer Kuber Mehta, who published a blog on the latest incident, when someone publishes a JavaScript/TypeScript package to npm, the build toolchain often generates source map files (.map files). These files are a bridge between the minified/bundled production code and the original source; they exist so that when something crashes in production, the stack trace can point to the actual line of code in the original file, not to some unintelligible reference.

What’s available in these files? “Every file. Every comment. Every internal constant. Every system prompt. All of it, sitting right there in a JSON file that npm happily serves to anyone who runs npm pack or even just browses the package contents,” said Mehta.

“The mistake is almost always the same: someone forgets to add *.map to their .npmignore or doesn’t configure their bundler to skip source map generation for production builds,” Mehta said. “With Bun’s bundler (which Claude Code uses), source maps are generated by default unless you explicitly turn them off.”
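The exclusion Mehta describes is a one-line config fix. A minimal sketch (the comment and layout are illustrative, not taken from any real package):

```
# .npmignore — same syntax as .gitignore: keep generated
# source maps out of the published npm package
*.map
```

An allowlist via the `files` field in package.json is often a more robust alternative, since only the paths explicitly listed there are ever published.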

Think of a source map as a file that maps minified code, which is not easily understandable to humans, back to the human-readable source, said Steinberg. For example, he said, it may indicate that a specific portion of the executable code performs the instructions that appear in a specific snippet of the source code.

A source map can help with debugging, he added. Without it, he said, many errors would be identified as coming from a larger portion of code, rather than showing exactly where the errors occur.
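What Steinberg and Mehta describe can be illustrated with a short sketch: a version-3 source map is just JSON whose `sourcesContent` array carries the original files verbatim, so “recovering” them requires nothing more than parsing. The map below is a made-up, hypothetical example, not Anthropic’s actual file:

```python
import json

# A made-up, minimal version-3 source map of the kind npm bundlers emit.
raw_map = """
{
  "version": 3,
  "sources": ["src/internal/config.ts"],
  "sourcesContent": ["// internal constant\\nconst API_ENDPOINT = 'https://example.invalid';\\n"],
  "mappings": "AAAA"
}
"""

source_map = json.loads(raw_map)

# The original files come back verbatim: every path, comment, and constant.
for path, content in zip(source_map["sources"], source_map["sourcesContent"]):
    print(f"--- {path} ---")
    print(content)
```

This is why shipping a .map file alongside minified code is effectively shipping the source itself.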

infoworld.com EN 2026 Anthropic Claude code-leak leak source-map error
The Pentagon Is Spending Millions On AI Hacking From Startup Twenty https://www.forbes.com/sites/thomasbrewster/2025/11/15/pentagon-spends-millions-on-ai-hackers/
18/11/2025 12:02:50

forbes.com
By Thomas Brewster, Forbes Staff.
Nov 15, 2025, 08:00am EST; updated Nov 16, 2025, 06:40am EST

The U.S. government has been contracting stealth startup Twenty, which is working on AI agents and automated hacking of foreign targets at massive scale.
The U.S. is quietly investing in AI agents for cyberwarfare, spending millions this year on a secretive startup that’s using AI for offensive cyberattacks on American enemies.
According to federal contracting records, a stealth, Arlington, Virginia-based startup called Twenty, or XX, signed a contract with the U.S. Cyber Command this summer worth up to $12.6 million. It scored a $240,000 research contract with the Navy, too. The company has received VC support from In-Q-Tel, the nonprofit venture capital organization founded by the CIA, as well as Caffeinated Capital and General Catalyst. Twenty couldn’t be reached for comment at the time of publication.

Twenty’s contracts are a rare case of an AI offensive cyber company with VC backing landing Cyber Command work; typically cyber contracts have gone to either small bespoke companies or to the old guard of defense contracting like Booz Allen Hamilton or L3Harris.

Though the firm hasn’t launched publicly yet, its website states its focus is “transforming workflows that once took weeks of manual effort into automated, continuous operations across hundreds of targets simultaneously.” Twenty claims it is “fundamentally reshaping how the U.S. and its allies engage in cyber conflict.”

Its job ads reveal more. In one, Twenty is seeking a director of offensive cyber research, who will develop “advanced offensive cyber capabilities including attack path frameworks… and AI-powered automation tools.” AI engineer job ads indicate Twenty will be deploying open source tools like CrewAI, which is used to manage multiple autonomous AI agents that collaborate. And an analyst role says the company will be working on “persona development.” Often, government cyberattacks use social engineering, relying on convincing fake online accounts to infiltrate enemy communities and networks. (Forbes has previously reported on police contractors who’ve created such avatars with AI.)

Twenty’s executive team, according to its website, is stacked with former military and intelligence agents. CEO and cofounder Joe Lin is a former U.S. Navy Reserve officer who was previously VP of product management at cyber giant Palo Alto Networks. He joined Palo Alto after the firm acquired Expanse, where he helped national security clients determine where their networks were vulnerable. CTO Leo Olson also worked on the national security team at Expanse and was a signals intelligence officer in the U.S. Army. VP of engineering Skyler Onken spent over a decade at U.S. Cyber Command and the U.S. Army. The startup’s head of government relations, Adam Howard, spent years on the Hill, most recently working on the National Security Council transition team for the incoming Trump administration.

The U.S. government isn’t the only country using AI to build out its hacking capabilities. Last week, AI giant Anthropic released some startling research: Chinese hackers were using its tools to carry out cyberattacks. The company said hackers had deployed Claude to spin up AI agents to do 90% of the work on scouting out targets and coming up with ideas on how to hack them.

It’s possible the U.S. could also be using OpenAI, Anthropic or Elon Musk’s xAI in offensive cyber operations. The Defense Department gave each company contracts worth up to $200 million for unspecified “frontier AI” projects. None have confirmed what they’re working on for the DOD.

Given its focus on simultaneous attacks on hundreds of targets, Twenty’s products appear to be a step up in terms of cyberwarfare automation.

By contrast, beltway contractor Two Six Technologies has received a number of contracts in the AI offensive cyber space, including one for $90 million in 2020, but its tools are mostly to assist humans rather than replace them. For the last six years, it’s been working on developing automated AI “to assist cyber battlespace” and “support development of cyber warfare strategies” under a project dubbed IKE. Reportedly its AI was allowed to press ahead with carrying out an attack if the chances of success were high. The contract value was ramped up to $190 million by 2024, but there’s no indication IKE uses agents to carry out operations at the scale that Twenty is claiming. Two Six did not respond to requests for comment.

AI is much more commonly used on the defensive side, particularly in enterprises. As Forbes reported earlier this week, an Israeli startup called Tenzai is tweaking AI models from OpenAI and Anthropic, among others, to try to find vulnerabilities in customer software, though its goal is red teaming, not hacking.

forbes.com EN 2025 IA Pentagon OpenAI Anthropic AI artificial-intelligence US Hacking Twenty
Researchers question Anthropic claim that AI-assisted attack was 90% autonomous https://arstechnica.com/security/2025/11/researchers-question-anthropic-claim-that-ai-assisted-attack-was-90-autonomous/
15/11/2025 16:18:03

arstechnica.com
Dan Goodin, Nov 14, 2025, 13:20

The results of AI-assisted hacking aren’t as impressive as many might have us believe.

Researchers from Anthropic said they recently observed the “first reported AI-orchestrated cyber espionage campaign” after detecting China-state hackers using the company’s Claude AI tool in a campaign aimed at dozens of targets. Outside researchers are much more measured in describing the significance of the discovery.

Anthropic published the reports on Thursday here and here. In September, the reports said, Anthropic discovered a “highly sophisticated espionage campaign,” carried out by a Chinese state-sponsored group, that used Claude Code to automate up to 90 percent of the work. Human intervention was required “only sporadically (perhaps 4-6 critical decision points per hacking campaign).” Anthropic said the hackers had employed AI agentic capabilities to an “unprecedented” extent.

“This campaign has substantial implications for cybersecurity in the age of AI ‘agents’—systems that can be run autonomously for long periods of time and that complete complex tasks largely independent of human intervention,” Anthropic said. “Agents are valuable for everyday work and productivity—but in the wrong hands, they can substantially increase the viability of large-scale cyberattacks.”

“Ass-kissing, stonewalling, and acid trips”
Outside researchers weren’t convinced the discovery was the watershed moment the Anthropic posts made it out to be. They questioned why these sorts of advances are often attributed to malicious hackers when white-hat hackers and developers of legitimate software keep reporting only incremental gains from their use of AI.

“I continue to refuse to believe that attackers are somehow able to get these models to jump through hoops that nobody else can,” Dan Tentler, executive founder of Phobos Group and a researcher with expertise in complex security breaches, told Ars. “Why do the models give these attackers what they want 90% of the time but the rest of us have to deal with ass-kissing, stonewalling, and acid trips?”

Researchers don’t deny that AI tools can improve workflow and shorten the time required for certain tasks, such as triage, log analysis, and reverse engineering. But the ability for AI to automate a complex chain of tasks with such minimal human interaction remains elusive. Many researchers compare advances from AI in cyberattacks to those provided by hacking tools such as Metasploit or SEToolkit, which have been in use for decades. There’s no doubt that these tools are useful, but their advent didn’t meaningfully increase hackers’ capabilities or the severity of the attacks they produced.

Another reason the results aren’t as impressive as they’re made out to be: The threat actors—which Anthropic tracks as GTG-1002—targeted at least 30 organizations, including major technology corporations and government agencies. Of those, only a “small number” of the attacks succeeded. That, in turn, raises questions. Even assuming so much human interaction was eliminated from the process, what good is that when the success rate is so low? Would the number of successes have increased if the attackers had used more traditional, human-involved methods?

According to Anthropic’s account, the hackers used Claude to orchestrate attacks using readily available open source software and frameworks. These tools have existed for years and are already easy for defenders to detect. Anthropic didn’t detail the specific techniques, tooling, or exploitation that occurred in the attacks, but so far, there’s no indication that the use of AI made them more potent or stealthy than more traditional techniques.

“The threat actors aren’t inventing something new here,” independent researcher Kevin Beaumont said.

Even Anthropic noted “an important limitation” in its findings:

Claude frequently overstated findings and occasionally fabricated data during autonomous operations, claiming to have obtained credentials that didn’t work or identifying critical discoveries that proved to be publicly available information. This AI hallucination in offensive security contexts presented challenges for the actor’s operational effectiveness, requiring careful validation of all claimed results. This remains an obstacle to fully autonomous cyberattacks.

How (Anthropic says) the attack unfolded
Anthropic said GTG-1002 developed an autonomous attack framework that used Claude as an orchestration mechanism that largely eliminated the need for human involvement. This orchestration system broke complex multi-stage attacks into smaller technical tasks such as vulnerability scanning, credential validation, data extraction, and lateral movement.

“The architecture incorporated Claude’s technical capabilities as an execution engine within a larger automated system, where the AI performed specific technical actions based on the human operators’ instructions while the orchestration logic maintained attack state, managed phase transitions, and aggregated results across multiple sessions,” Anthropic said. “This approach allowed the threat actor to achieve operational scale typically associated with nation-state campaigns while maintaining minimal direct involvement, as the framework autonomously progressed through reconnaissance, initial access, persistence, and data exfiltration phases by sequencing Claude’s responses and adapting subsequent requests based on discovered information.”

The attacks followed a five-phase structure that increased AI autonomy through each one.

[Figure: The life cycle of the cyberattack, showing the move from human-led targeting to largely AI-driven attacks using various tools, often via the Model Context Protocol (MCP). At various points during the attack, the AI returns to its human operator for review and further direction. Credit: Anthropic]
The attackers were able to bypass Claude guardrails in part by breaking tasks into small steps that, in isolation, the AI tool didn’t interpret as malicious. In other cases, the attackers couched their inquiries in the context of security professionals trying to use Claude to improve defenses.

As noted last week, AI-developed malware has a long way to go before it poses a real-world threat. There’s no reason to doubt that AI-assisted cyberattacks may one day produce more potent attacks. But the data so far indicates that threat actors—like most others using AI—are seeing mixed results that aren’t nearly as impressive as those in the AI industry claim.

arstechnica.com EN 2025 Anthropic claim china cyberattack
Many-shot jailbreaking \ Anthropic https://www.anthropic.com/research/many-shot-jailbreaking
08/01/2025 12:17:06

Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.

anthropic EN 2024 AI LLM Jailbreak Many-shot
Anthropic researchers find that AI models can be trained to deceive https://techcrunch.com/2024/01/13/anthropic-researchers-find-that-ai-models-can-be-trained-to-deceive/
15/01/2024 06:44:13

A study co-authored by researchers at Anthropic finds that AI models can be trained to deceive -- and that this deceptive behavior is difficult to combat.

techcrunch EN 2024 AI models study deceive research Anthropic