Cyberveille, curated by Decio
6 results tagged Anthropic
Anthropic debuts preview of powerful new AI model Mythos in new cybersecurity initiative https://techcrunch.com/2026/04/07/anthropic-mythos-ai-model-preview-security/
08/04/2026 06:43:35

TechCrunch
Lucas Ropek
11:00 AM PDT · April 7, 2026

The new model will be used by a small number of high-profile companies to engage in defensive cybersecurity work.

Anthropic on Tuesday released a preview of its new frontier model, Mythos, which it says will be used by a small coterie of partner organizations for cybersecurity work. In a previously leaked memo, the AI startup called the model one of its “most powerful” yet.

The model’s limited debut is part of a new security initiative, dubbed Project Glasswing, in which 12 partner organizations will deploy the model for the purposes of “defensive security work” and to secure critical software, Anthropic said. While it was not specifically trained for cybersecurity work, the model will be used to scan both first-party and open source software systems for code vulnerabilities, the company said.

Anthropic claims that, over the past few weeks, Mythos identified “thousands of zero-day vulnerabilities, many of them critical.” Many of the vulnerabilities are one to two decades old, the company added.

Mythos is a general-purpose model for Anthropic’s Claude AI systems that the company claims has strong agentic coding and reasoning skills. Anthropic’s frontier models are considered its most sophisticated and high-performance models, designed for more complex tasks, including agent-building and coding.

The partner organizations previewing Mythos as part of Project Glasswing include Amazon, Apple, Broadcom, Cisco, CrowdStrike, the Linux Foundation, Microsoft, and Palo Alto Networks. As part of the initiative, these partners will ultimately share what they’ve learned from using the model so that the rest of the tech industry can benefit from it. The preview will not be made generally available, Anthropic said, though 40 organizations in total will gain access to the Mythos preview.

Anthropic also claims that it has engaged in “ongoing discussions” with federal officials about the use of Mythos. One imagines those discussions are complicated by the fact that Anthropic and the Trump administration are currently locked in a legal battle, after the Pentagon labeled the AI lab a supply-chain risk over Anthropic’s refusal to allow autonomous targeting or surveillance of U.S. citizens.

News of Mythos was originally leaked in a data security incident reported last month by Fortune. A draft blog about the model (then called “Capybara”) was left in an unsecured cache of documents available on a publicly inspectable data lake. The leak, which Anthropic subsequently attributed to “human error,” was originally spotted by security researchers. “‘Capybara’ is a new name for a new tier of model: larger and more intelligent than our Opus models — which were, until now, our most powerful,” the leaked document said, adding later that it was “by far the most powerful AI model we’ve ever developed,” according to the report.

In the leak, Anthropic claimed that its new model far exceeded the performance of its currently public models in areas like “software coding, academic reasoning, and cybersecurity,” and that it could potentially pose a cybersecurity threat if weaponized by bad actors to find and exploit bugs (rather than fix them, which is how Mythos will be deployed).

Last month, the company accidentally exposed nearly 2,000 source code files and over half a million lines of code via a mistake it made in the launch of version 2.1.88 of its Claude Code software package. The company then accidentally caused thousands of code repositories on GitHub to be taken down as it attempted to clean up the mess.

Correction April 7, 2026: An earlier version of this article erroneously stated how many partners are working with Anthropic on Project Glasswing. There are 12 partner organizations, though 40 organizations total will have access to the Mythos preview.

techcrunch.com EN 2026 AI Anthropic Mythos
Anthropic employee error exposes Claude Code source | InfoWorld https://www.infoworld.com/article/4152856/anthropic-employee-error-exposes-claude-code-source.html
01/04/2026 10:04:29

infoworld.com
by Howard Solomon
Mar 31, 2026

A version of the AI coding tool in Anthropic's npm registry included a source map file, which leads to the full proprietary source code.

An Anthropic employee accidentally exposed the entire proprietary source code of the company’s AI programming tool, Claude Code, by including a source map file in a version of the tool posted to Anthropic’s open npm registry account, a mistake one AI expert calls risky.

“A compromised source map is a security risk,” said US-based cybersecurity and AI expert Joseph Steinberg. “A hacker can use a source map to reconstruct the original source code and [see] how it works. Any secrets within that code – if someone coded in an API key, for example – [are] at risk, as is all of the logic. And any vulnerabilities found in the logic could become clear to the hacker who can then exploit the vulnerabilities.”

However, an Anthropic spokesperson told CSO, “no sensitive customer data or credentials were involved or exposed. This was a release packaging issue caused by human error, not a security breach. We’re rolling out measures to prevent this from happening again.”

But it wasn’t the first time this had happened; according to Fortune and other news sources, the same thing happened last month.

Don’t expose .map files
Map files shouldn’t be left in the final version of code published on open source registries, where anyone can download a package; they can be sources of useful information for hackers.

According to developer Kuber Mehta, who published a blog on the latest incident, when someone publishes a JavaScript/TypeScript package to npm, the build toolchain often generates source map files (.map files). These files are a bridge between the minified/bundled production code and the original source; they exist so that when something crashes in production, the stack trace can point to the actual line of code in the original file, not to some unintelligible reference.

What’s available in these files? “Every file. Every comment. Every internal constant. Every system prompt. All of it, sitting right there in a JSON file that npm happily serves to anyone who runs npm pack or even just browses the package contents,” said Mehta.

“The mistake is almost always the same: someone forgets to add *.map to their .npmignore or doesn’t configure their bundler to skip source map generation for production builds,” Mehta said. “With Bun’s bundler (which Claude Code uses), source maps are generated by default unless you explicitly turn them off.”
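The exclusion Mehta describes is a one-line config fix. A minimal sketch (the comment and layout are illustrative, not taken from any real package):

```
# .npmignore — same syntax as .gitignore: keep generated
# source maps out of the published npm package
*.map
```

An allowlist via the `files` field in package.json is often a more robust alternative, since only the paths explicitly listed there are ever published.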

Think of a source map as a file that maps minified code, which is not easily understandable to humans, back to the human-readable source, said Steinberg. For example, he said, it may indicate that a specific portion of the executable code performs the instructions that appear in a specific snippet of the source code.

A source map can help with debugging, he added. Without it, he said, many errors would be identified as coming from a larger portion of code, rather than showing exactly where the errors occur.
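What Steinberg and Mehta describe can be illustrated with a short sketch: a version-3 source map is just JSON whose `sourcesContent` array carries the original files verbatim, so “recovering” them requires nothing more than parsing. The map below is a made-up, hypothetical example, not Anthropic’s actual file:

```python
import json

# A made-up, minimal version-3 source map of the kind npm bundlers emit.
raw_map = """
{
  "version": 3,
  "sources": ["src/internal/config.ts"],
  "sourcesContent": ["// internal constant\\nconst API_ENDPOINT = 'https://example.invalid';\\n"],
  "mappings": "AAAA"
}
"""

source_map = json.loads(raw_map)

# The original files come back verbatim: every path, comment, and constant.
for path, content in zip(source_map["sources"], source_map["sourcesContent"]):
    print(f"--- {path} ---")
    print(content)
```

This is why shipping a .map file alongside minified code is effectively shipping the source itself.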

infoworld.com EN 2026 Anthropic Claude code-leak leak source-map error
The Pentagon Is Spending Millions On AI Hacking From Startup Twenty https://www.forbes.com/sites/thomasbrewster/2025/11/15/pentagon-spends-millions-on-ai-hackers/
18/11/2025 12:02:50

forbes.com
By Thomas Brewster, Forbes Staff.
Nov 15, 2025, 08:00am EST; updated Nov 16, 2025, 06:40am EST

The U.S. government has been contracting stealth startup Twenty, which is working on AI agents and automated hacking of foreign targets at massive scale.
The U.S. is quietly investing in AI agents for cyberwarfare, spending millions this year on a secretive startup that’s using AI for offensive cyberattacks on American enemies.
According to federal contracting records, a stealth, Arlington, Virginia-based startup called Twenty, or XX, signed a contract with the U.S. Cyber Command this summer worth up to $12.6 million. It scored a $240,000 research contract with the Navy, too. The company has received VC support from In-Q-Tel, the nonprofit venture capital organization founded by the CIA, as well as Caffeinated Capital and General Catalyst. Twenty couldn’t be reached for comment at the time of publication.

Twenty’s contracts are a rare case of an AI offensive cyber company with VC backing landing Cyber Command work; typically cyber contracts have gone to either small bespoke companies or to the old guard of defense contracting like Booz Allen Hamilton or L3Harris.

Though the firm hasn’t launched publicly yet, its website states its focus is “transforming workflows that once took weeks of manual effort into automated, continuous operations across hundreds of targets simultaneously.” Twenty claims it is “fundamentally reshaping how the U.S. and its allies engage in cyber conflict.”

Its job ads reveal more. In one, Twenty is seeking a director of offensive cyber research, who will develop “advanced offensive cyber capabilities including attack path frameworks… and AI-powered automation tools.” AI engineer job ads indicate Twenty will be deploying open source tools like CrewAI, which is used to manage multiple autonomous AI agents that collaborate. And an analyst role says the company will be working on “persona development.” Often, government cyberattacks use social engineering, relying on convincing fake online accounts to infiltrate enemy communities and networks. (Forbes has previously reported on police contractors who’ve created such avatars with AI.)

Twenty’s executive team, according to its website, is stacked with former military and intelligence agents. CEO and cofounder Joe Lin is a former U.S. Navy Reserve officer who was previously VP of product management at cyber giant Palo Alto Networks. He joined Palo Alto after the firm acquired Expanse, where he helped national security clients determine where their networks were vulnerable. CTO Leo Olson also worked on the national security team at Expanse and was a signals intelligence officer in the U.S. Army. VP of engineering Skyler Onken spent over a decade at U.S. Cyber Command and the U.S. Army. The startup’s head of government relations, Adam Howard, spent years on the Hill, most recently working on the National Security Council transition team for the incoming Trump administration.

The U.S. government isn’t the only country using AI to build out its hacking capabilities. Last week, AI giant Anthropic released some startling research: Chinese hackers were using its tools to carry out cyberattacks. The company said hackers had deployed Claude to spin up AI agents to do 90% of the work on scouting out targets and coming up with ideas on how to hack them.

It’s possible the U.S. could also be using OpenAI, Anthropic or Elon Musk’s xAI in offensive cyber operations. The Defense Department gave each company contracts worth up to $200 million for unspecified “frontier AI” projects. None have confirmed what they’re working on for the DOD.

Given its focus on simultaneous attacks on hundreds of targets, Twenty’s products appear to be a step up in terms of cyberwarfare automation.

By contrast, beltway contractor Two Six Technologies has received a number of contracts in the AI offensive cyber space, including one for $90 million in 2020, but its tools are mostly to assist humans rather than replace them. For the last six years, it’s been working on developing automated AI “to assist cyber battlespace” and “support development of cyber warfare strategies” under a project dubbed IKE. Reportedly its AI was allowed to press ahead with carrying out an attack if the chances of success were high. The contract value was ramped up to $190 million by 2024, but there’s no indication IKE uses agents to carry out operations at the scale that Twenty is claiming. Two Six did not respond to requests for comment.

AI is much more commonly used on the defensive side, particularly in enterprises. As Forbes reported earlier this week, an Israeli startup called Tenzai is tweaking AI models from OpenAI and Anthropic, among others, to try to find vulnerabilities in customer software, though its goal is red teaming, not hacking.

forbes.com EN 2025 IA Pentagon OpenAI Anthropic AI artificial-intelligence US Hacking Twenty
Researchers question Anthropic claim that AI-assisted attack was 90% autonomous https://arstechnica.com/security/2025/11/researchers-question-anthropic-claim-that-ai-assisted-attack-was-90-autonomous/
15/11/2025 16:18:03

arstechnica.com
Dan Goodin, Nov 14, 2025, 13:20

The results of AI-assisted hacking aren’t as impressive as many might have us believe.

Researchers from Anthropic said they recently observed the “first reported AI-orchestrated cyber espionage campaign” after detecting China-state hackers using the company’s Claude AI tool in a campaign aimed at dozens of targets. Outside researchers are much more measured in describing the significance of the discovery.

Anthropic published the reports on Thursday here and here. In September, the reports said, Anthropic discovered a “highly sophisticated espionage campaign,” carried out by a Chinese state-sponsored group, that used Claude Code to automate up to 90 percent of the work. Human intervention was required “only sporadically (perhaps 4-6 critical decision points per hacking campaign).” Anthropic said the hackers had employed AI agentic capabilities to an “unprecedented” extent.

“This campaign has substantial implications for cybersecurity in the age of AI ‘agents’—systems that can be run autonomously for long periods of time and that complete complex tasks largely independent of human intervention,” Anthropic said. “Agents are valuable for everyday work and productivity—but in the wrong hands, they can substantially increase the viability of large-scale cyberattacks.”

“Ass-kissing, stonewalling, and acid trips”
Outside researchers weren’t convinced the discovery was the watershed moment the Anthropic posts made it out to be. They questioned why these sorts of advances are often attributed to malicious hackers when white-hat hackers and developers of legitimate software keep reporting only incremental gains from their use of AI.

“I continue to refuse to believe that attackers are somehow able to get these models to jump through hoops that nobody else can,” Dan Tentler, executive founder of Phobos Group and a researcher with expertise in complex security breaches, told Ars. “Why do the models give these attackers what they want 90% of the time but the rest of us have to deal with ass-kissing, stonewalling, and acid trips?”

Researchers don’t deny that AI tools can improve workflow and shorten the time required for certain tasks, such as triage, log analysis, and reverse engineering. But the ability for AI to automate a complex chain of tasks with such minimal human interaction remains elusive. Many researchers compare advances from AI in cyberattacks to those provided by hacking tools such as Metasploit or SEToolkit, which have been in use for decades. There’s no doubt that these tools are useful, but their advent didn’t meaningfully increase hackers’ capabilities or the severity of the attacks they produced.

Another reason the results aren’t as impressive as they’re made out to be: The threat actors—which Anthropic tracks as GTG-1002—targeted at least 30 organizations, including major technology corporations and government agencies. Of those, only a “small number” of the attacks succeeded. That, in turn, raises questions. Even assuming so much human interaction was eliminated from the process, what good is that when the success rate is so low? Would the number of successes have increased if the attackers had used more traditional, human-involved methods?

According to Anthropic’s account, the hackers used Claude to orchestrate attacks using readily available open source software and frameworks. These tools have existed for years and are already easy for defenders to detect. Anthropic didn’t detail the specific techniques, tooling, or exploitation that occurred in the attacks, but so far, there’s no indication that the use of AI made them more potent or stealthy than more traditional techniques.

“The threat actors aren’t inventing something new here,” independent researcher Kevin Beaumont said.

Even Anthropic noted “an important limitation” in its findings:

Claude frequently overstated findings and occasionally fabricated data during autonomous operations, claiming to have obtained credentials that didn’t work or identifying critical discoveries that proved to be publicly available information. This AI hallucination in offensive security contexts presented challenges for the actor’s operational effectiveness, requiring careful validation of all claimed results. This remains an obstacle to fully autonomous cyberattacks.

How (Anthropic says) the attack unfolded
Anthropic said GTG-1002 developed an autonomous attack framework that used Claude as an orchestration mechanism that largely eliminated the need for human involvement. This orchestration system broke complex multi-stage attacks into smaller technical tasks such as vulnerability scanning, credential validation, data extraction, and lateral movement.

“The architecture incorporated Claude’s technical capabilities as an execution engine within a larger automated system, where the AI performed specific technical actions based on the human operators’ instructions while the orchestration logic maintained attack state, managed phase transitions, and aggregated results across multiple sessions,” Anthropic said. “This approach allowed the threat actor to achieve operational scale typically associated with nation-state campaigns while maintaining minimal direct involvement, as the framework autonomously progressed through reconnaissance, initial access, persistence, and data exfiltration phases by sequencing Claude’s responses and adapting subsequent requests based on discovered information.”

The attacks followed a five-phase structure that increased AI autonomy through each one.

[Figure: The life cycle of the cyberattack, showing the move from human-led targeting to largely AI-driven attacks using various tools, often via the Model Context Protocol (MCP). At various points during the attack, the AI returns to its human operator for review and further direction. Credit: Anthropic]
The attackers were able to bypass Claude guardrails in part by breaking tasks into small steps that, in isolation, the AI tool didn’t interpret as malicious. In other cases, the attackers couched their inquiries in the context of security professionals trying to use Claude to improve defenses.

As noted last week, AI-developed malware has a long way to go before it poses a real-world threat. There’s no reason to doubt that AI-assisted cyberattacks may one day produce more potent attacks. But the data so far indicates that threat actors—like most others using AI—are seeing mixed results that aren’t nearly as impressive as those in the AI industry claim.

arstechnica.com EN 2025 Anthropic claim china cyberattack
Many-shot jailbreaking \ Anthropic https://www.anthropic.com/research/many-shot-jailbreaking
08/01/2025 12:17:06

Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.

anthropic EN 2024 AI LLM Jailbreak Many-shot
Anthropic researchers find that AI models can be trained to deceive https://techcrunch.com/2024/01/13/anthropic-researchers-find-that-ai-models-can-be-trained-to-deceive/
15/01/2024 06:44:13

A study co-authored by researchers at Anthropic finds that AI models can be trained to deceive -- and that this deceptive behavior is difficult to combat.

techcrunch EN 2024 AI models study deceive research Anthropic