Anthropic's AI Marketplace Experiment Reveals an Uncomfortable Truth: Better Agents Win, and You Won't Notice

Anthropic ran a classified internal marketplace where 69 employees let AI agents negotiate on their behalf. The agents completed 186 deals worth $4,000+ — but employees with weaker (Haiku) agents received $3.64 less per transaction on average, and critically, had no idea they were getting worse deals.

Original source

Anthropic just published the results of Project Deal, one of the most revealing AI agent experiments to date: a real-money marketplace where employees couldn't negotiate for themselves. The AI had to do it.

The experiment worked like an internal Craigslist. Sixty-nine employees each received a $100 budget and went through an intake interview with Claude, describing what they wanted to sell, their acceptable price ranges, and their negotiation preferences. Claude then built a custom agent for each person and unleashed them into a multi-channel Slack marketplace — where the agents posted listings, made offers, countered, and closed deals entirely autonomously.

The headline number: 186 deals, $4,000+ in total transactions. But the more important finding was buried in the model comparison. Anthropic ran four simultaneous versions of the marketplace — two where all agents used Claude Opus (the strongest model), and two that mixed Opus with Claude Haiku (a faster, weaker model). Employees with Haiku agents received approximately $3.64 less per item on average. One broken bike sold for $38 via a Haiku agent — and $65 via an Opus agent. Same bike. Same listing. Different AI quality.

The critical finding isn't the price gap itself — it's that participants had no idea. Perceived fairness ratings were essentially identical across model types. The people getting worse deals didn't feel like they were getting worse deals. They had no visibility into their agent's capability level, no way to audit the negotiation, no mechanism to know they were at a disadvantage.

This is the uncomfortable preview of an agentic economy. As AI agents handle more of our commercial interactions — price negotiations, contract terms, service agreements — access to higher-quality agents will quietly confer real financial advantages that most participants won't perceive or be able to contest. Anthropic acknowledges that "policy frameworks for this simply don't exist yet." Project Deal is a small experiment, but the inequality dynamics it surfaces are not small at all.

Panel Takes

The Builder

Developer Perspective

“The methodology is clean — real money, real goods, real consequences. The $3.64 per-item gap sounds small but compounds dramatically at scale. This is exactly the kind of empirical benchmark that agent infrastructure teams should be designing for.”

The Skeptic

Reality Check

“69 employees in a low-stakes marketplace is a very small sample for drawing conclusions about agentic inequality at scale. The $100 budgets and trivial items also mean participants had little incentive to scrutinize outcomes. Real agentic commerce involves much higher stakes and more adversarial counterparties.”

The Futurist

Big Picture

“The fairness invisibility problem Anthropic surfaced will define the next decade of AI policy debates. When your agent negotiates your rent, your salary, your medical bills — and you can't tell if it's competitive or not — we need disclosure standards that don't yet exist. Project Deal should be required reading for every AI regulator.”

Panel Takes

Bookmarks