Beyond browser automation: the protocols defining agentic commerce

Almost every dollar an AI agent spends on the consumer web in 2026 still moves through a browser. The agent loads the merchant page, clicks the button, fills the form, submits. Browser automation. It works. We've shipped a lot of it ourselves at Axiom, because the consumer long tail (Amazon, Etsy, Discogs, Best Buy, the airlines, the grocery delivery services, the concert ticket sites, the indie record stores) lives in browser-rendered checkout. That's not changing fast.

But the floor on browser automation is bad. CAPTCHAs catch agents and humans both. Anti-bot defenses are getting more aggressive, not less. The DOM changes overnight and a recipe breaks. Latency is measured in seconds, not milliseconds. Disputes don't have a clean trace because the merchant doesn't know it was an agent. None of this gets meaningfully better with smarter scrapers. It requires a different abstraction.

That different abstraction is what 2025 and early 2026 have been building. Five protocols are now in the field, and they're not competing for the same spot in the stack. They're layered. And critically, none of them solve the part of the problem that consumers actually feel.

The five protocols you should know

x402: machine-payable HTTP

Coinbase introduced x402 by reviving the dormant HTTP 402 Payment Required status code. The pattern is simple: when an agent hits a paid endpoint, the server responds with a 402 and a payment header describing what's owed. The agent authorizes the payment (currently USDC and other EIP-3009 stablecoins on Base, Polygon, and Solana, with broader card support on the roadmap), then retries with a payment-signature header carrying the signed payment. A facilitator handles verification and on-chain settlement, so the merchant doesn't need to run blockchain infrastructure.
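
To make the request, pay, retry loop concrete, here is a minimal in-memory sketch. It is not the x402 wire format: the header names, the payload fields, and the `facilitator_verify` stand-in are assumptions for illustration, and the "signature" is a placeholder string rather than anything a wallet produced.

```python
import base64
import json

PRICE = {"amount": "0.01", "asset": "USDC", "network": "base"}  # illustrative terms


def encode(obj: dict) -> str:
    """Pack a JSON object into a header-safe base64 string."""
    return base64.b64encode(json.dumps(obj).encode()).decode()


def decode(value: str) -> dict:
    """Reverse of encode()."""
    return json.loads(base64.b64decode(value))


def facilitator_verify(payment: dict) -> bool:
    """Stand-in for the facilitator; a real one checks the signature and settles on-chain."""
    return (payment.get("amount") == PRICE["amount"]
            and payment.get("signature") == "signed-by-agent")


def server(headers: dict) -> tuple:
    """A paid endpoint: answers 402 with payment terms until a valid payment header arrives."""
    payment_header = headers.get("payment-signature")  # assumed header name
    if payment_header is None or not facilitator_verify(decode(payment_header)):
        return 402, {"payment-required": encode(PRICE)}, ""
    return 200, {}, "premium content"


def agent_fetch() -> tuple:
    """Request, pay on 402, then retry once with the payment header attached."""
    status, headers, body = server({})
    if status == 402:
        terms = decode(headers["payment-required"])
        payment = {**terms, "signature": "signed-by-agent"}  # placeholder for a wallet signature
        status, headers, body = server({"payment-signature": encode(payment)})
    return status, body


print(agent_fetch())
```

The point of the shape is that the agent never leaves HTTP: the price arrives in a response, the payment leaves in a request, and everything chain-specific hides behind the facilitator.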

x402 moved to open governance in April 2026 as the x402 Foundation under the Linux Foundation, with backers including Google, Microsoft, Visa, Stripe, AWS, and the Solana Foundation. As a wire-level primitive, it's the closest thing to "machine-payable HTTP" with industry consensus behind it. Real-world volume is still small (about $28k/day at the time of writing), but the spec has buy-in.

ACP: checkout interaction model

Stripe and OpenAI co-developed the Agentic Commerce Protocol (ACP), the interaction layer between a buyer's agent and a merchant. Instead of having the agent click around the merchant's checkout, ACP exposes a structured set of endpoints the agent talks to directly: discover product, create cart, take payment, get confirmation. ACP is already in production. It's what powers Instant Checkout in ChatGPT, where users can buy from US Etsy sellers (with Shopify support coming) without leaving the chat.
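
A rough sketch of what "structured endpoints instead of clicking" means in practice. Everything here is hypothetical: the `CheckoutSession` class, its method names, and the catalog entry only mirror the four steps named above and are not the ACP schema or its real endpoints.

```python
from dataclasses import dataclass, field

CATALOG = {"sku-123": {"title": "Hand-thrown mug", "price_cents": 2400}}  # made-up product


def discover(query: str) -> list:
    """Step 1: find products by title instead of scraping a search page."""
    return [sku for sku, p in CATALOG.items() if query.lower() in p["title"].lower()]


@dataclass
class CheckoutSession:
    """Hypothetical merchant-side session; step names follow the prose, not the ACP spec."""
    items: list = field(default_factory=list)
    state: str = "open"

    def add_item(self, sku: str, qty: int) -> None:
        """Step 2: build the cart."""
        if self.state != "open":
            raise RuntimeError("cart is no longer editable")
        if sku not in CATALOG:
            raise KeyError(sku)
        self.items.append((sku, qty))

    def total_cents(self) -> int:
        return sum(CATALOG[sku]["price_cents"] * qty for sku, qty in self.items)

    def pay(self, payment_token: str) -> dict:
        """Steps 3 and 4: take payment and return a confirmation."""
        if self.state != "open" or not self.items:
            raise RuntimeError("nothing to pay for")
        if not payment_token:
            raise ValueError("payment token required")
        self.state = "completed"
        return {"status": "confirmed", "charged_cents": self.total_cents()}


session = CheckoutSession()
session.add_item(discover("mug")[0], 2)
confirmation = session.pay("tok_test")  # token value is a placeholder
print(confirmation)
```

Each step either succeeds with structured data or fails with a typed error, which is exactly what a DOM-driven checkout can't promise.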

ACP is Apache 2.0 licensed and jointly governed by OpenAI and Stripe. Salesforce, PayPal, and others have announced support. Of the protocols on this list, ACP has the most production usage today.

AP2: trust and authorization

Google's Agent Payments Protocol (AP2) sits one level above ACP. It defines what the user actually authorized and provides a cryptographic record of that authorization. AP2's centerpiece is mandates: verifiable digital credentials that capture specific user permissions, signed with hardware-backed keys on the user's device.

There are three mandate types in AP2:

Cart Mandate. The user signs a finalized cart in real time when they're present at purchase.
Intent Mandate. The user signs a description of what they're authorizing the agent to buy when they won't be present.
Payment Mandate. A separate credential that signals to the network and issuer that an AI agent is involved, while keeping the cart details private.
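
As a mental model, the three mandate types can be sketched as signed payloads. This is not the AP2 credential format: the field names are invented, and an HMAC over a demo secret stands in for the hardware-backed asymmetric signature a real device would produce.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass

DEVICE_KEY = b"demo-only-secret"  # stand-in for a hardware-backed private key


def sign(payload: dict) -> str:
    """Sign a canonical JSON encoding of the payload (HMAC used as a placeholder)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(DEVICE_KEY, canonical, hashlib.sha256).hexdigest()


@dataclass(frozen=True)
class Mandate:
    kind: str        # "cart", "intent", or "payment"
    payload: dict
    signature: str

    def verify(self) -> bool:
        """Recompute the signature over kind + payload and compare in constant time."""
        return hmac.compare_digest(self.signature, sign({"kind": self.kind, **self.payload}))


def issue(kind: str, payload: dict) -> Mandate:
    return Mandate(kind, payload, sign({"kind": kind, **payload}))


# User present: sign the exact, finalized cart.
cart = issue("cart", {"items": [["sku-123", 1]], "total_cents": 2400})
# User absent: sign a description of what the agent may buy (hypothetical fields).
intent = issue("intent", {"description": "one ceramic mug", "max_total_cents": 3000})
# For the network and issuer: flags agent involvement, carries no cart details.
payment = issue("payment", {"agent_involved": True, "user_present": False})

# Changing any signed field invalidates the mandate.
tampered = Mandate("cart", {**cart.payload, "total_cents": 1}, cart.signature)
print(cart.verify(), tampered.verify())
```

Note that the payment mandate's payload deliberately says nothing about what was bought, which is how the cart stays private from the network.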

AP2 is payment-agnostic: it works with cards, bank transfers, stablecoins, and, via an x402 extension, on-chain payments. Google announced AP2 with 60+ launch partners including Mastercard, Coinbase, Etsy, PayPal, Salesforce, ServiceNow, and Adyen. We go deeper on mandates in the next post in this series.

Visa Trusted Agent Protocol: bot vs. agent verification

Visa introduced Trusted Agent Protocol (TAP) in October 2025 as part of its broader Visa Intelligent Commerce initiative. TAP solves a narrower but crucial problem: it lets merchants tell the difference between a malicious bot and a legitimate AI agent acting on behalf of a real consumer. It's built on HTTP Message Signatures and Web Bot Auth so it can be deployed with minimal changes to merchant infrastructure.
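
The underlying idea, signing selected parts of an HTTP request so the receiver can check who sent it, can be sketched in the style of HTTP Message Signatures. This is a simplified illustration, not TAP itself: which components TAP requires and which key types it accepts aren't covered here, so the covered components, the key id, and the shared-secret HMAC are all assumptions.

```python
import base64
import hashlib
import hmac

SHARED_KEY = b"demo-only-secret"  # illustration only; an agent would normally hold a private key


def signature_base(method: str, authority: str, path: str, params: str) -> str:
    """Simplified signature base: one line per covered component, then the params line."""
    return "\n".join([
        f'"@method": {method.upper()}',
        f'"@authority": {authority.lower()}',
        f'"@path": {path}',
        f'"@signature-params": {params}',
    ])


def sign_request(method: str, authority: str, path: str, created: int, keyid: str) -> dict:
    """Agent side: produce the two signature headers for a request."""
    params = f'("@method" "@authority" "@path");created={created};keyid="{keyid}"'
    base = signature_base(method, authority, path, params)
    mac = hmac.new(SHARED_KEY, base.encode(), hashlib.sha256).digest()
    return {
        "Signature-Input": f"sig1={params}",
        "Signature": f"sig1=:{base64.b64encode(mac).decode()}:",
    }


def verify_request(method: str, authority: str, path: str, headers: dict) -> bool:
    """Merchant side: rebuild the base from the request as received and compare."""
    params = headers["Signature-Input"].removeprefix("sig1=")
    base = signature_base(method, authority, path, params)
    expected = hmac.new(SHARED_KEY, base.encode(), hashlib.sha256).digest()
    encoded = headers["Signature"].removeprefix("sig1=:").removesuffix(":")
    return hmac.compare_digest(expected, base64.b64decode(encoded))


headers = sign_request("POST", "shop.example", "/checkout", 1700000000, "agent-key-1")
print(verify_request("POST", "shop.example", "/checkout", headers))
```

Because the signature rides in headers, a merchant can check it at the edge without changing the checkout itself, which is the "minimal changes" claim in practice.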

Visa scheduled its commerce pilots with Trusted Agent Protocol for early 2026 in Asia Pacific and Europe.

Mastercard Verifiable Intent: disputable intent records

In March 2026, Mastercard introduced Verifiable Intent, an open-source framework that links the consumer's identity, their specific instructions, and the outcome of a transaction into a single tamper-resistant record. The technical centerpiece is SD-JWT selective disclosure: the merchant sees the checkout-relevant claims, the network sees payment-relevant claims, and neither sees more than they need. Mastercard Verifiable Intent is being integrated into Microsoft's Copilot Checkout and OpenAI's Instant Checkout in ChatGPT.
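
Selective disclosure is easier to see in code than in prose. The sketch below shows only the salted-hash mechanism used by SD-JWT-style credentials: the claim names are made up, the JWT signature and key binding are omitted, and `signed_payload` is a plain dict standing in for what the issuer would actually sign. It says nothing about how Verifiable Intent structures its records.

```python
import base64
import hashlib
import json
import secrets


def b64url(data: bytes) -> str:
    """URL-safe base64 without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def make_disclosure(name: str, value) -> str:
    """A disclosure is the encoded triple [salt, claim name, claim value]."""
    salt = b64url(secrets.token_bytes(16))
    return b64url(json.dumps([salt, name, value]).encode())


def digest(disclosure: str) -> str:
    """Hash of the encoded disclosure; only this goes into the signed payload."""
    return b64url(hashlib.sha256(disclosure.encode("ascii")).digest())


claims = {"cart_total": "24.00 USD", "card_last4": "4242", "shipping_zip": "94107"}
disclosures = {name: make_disclosure(name, value) for name, value in claims.items()}
# The issuer signs only the digests; this dict stands in for the signed payload.
signed_payload = {"_sd": sorted(digest(d) for d in disclosures.values())}


def reveal(presented: list) -> dict:
    """Verifier side: accept only disclosures whose digest appears in the signed payload."""
    out = {}
    for d in presented:
        if digest(d) not in signed_payload["_sd"]:
            raise ValueError("disclosure not covered by the signed payload")
        padded = d + "=" * (-len(d) % 4)
        _salt, name, value = json.loads(base64.urlsafe_b64decode(padded))
        out[name] = value
    return out


merchant_view = reveal([disclosures["cart_total"], disclosures["shipping_zip"]])
network_view = reveal([disclosures["card_last4"]])
print(merchant_view, network_view)
```

Both parties verify against the same signed set of digests, but each can read only the claims it was handed; the salts keep anyone from guessing the rest.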

Verifiable Intent extends Mastercard Agent Pay, which was announced in 2025 as the broader framework for tokenized agent payments on the Mastercard network.

They're a stack, not a contest

The instinct when you see five protocols is to expect a winner-take-all standards fight. That's not what's happening. Each one solves a different layer of the same problem.

x402 is wire transport. Machine-payable HTTP via the 402 status code and payment headers.
ACP is the checkout interaction. An agent-merchant API for cart, pay, confirm.
AP2 is authorization. A cryptographic record of what the user agreed to, in the form of mandates.
Visa TAP is verification. Merchants can tell legitimate agents from bots.
Mastercard Verifiable Intent is the dispute trail. A tamper-evident record across parties.

In practice, a single transaction can (and probably will) use several of these together. An agent could use ACP to negotiate a cart, sign an AP2 Cart Mandate to authorize it, present credentials via Visa TAP so the merchant knows it's a real agent, and settle via x402 if the merchant supports it. None of these layers is in tension with the others.

The layer the protocols don't define

A funny thing about all five of these protocols is that they're server-to-server. They define how agents talk to merchants, how merchants verify agents, how networks see agent-initiated transactions. None of them define the relationship between a consumer and the money the agent is spending.

That's not a flaw. They're doing their job. Wire-level standards aren't supposed to specify product surfaces, and a Cart Mandate doesn't need to know what a payment app looks like on a phone screen. But the absence is conspicuous when you step back and ask who is going to make a person feel okay about their AI agent buying things while they sleep.

The person on whose behalf the agent spends (the principal) needs a specific list of things from whatever product fills that gap. They need to see what their agents are spending and what's left under their rules. They need to define what their agents are allowed to do, in plain terms. They need to watch transactions happen in something close to real time. They need to audit any purchase, including the agent's reasoning for why it bought what it bought. They need to be able to pause everything if something feels off.
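
Those requirements are small enough to sketch. The `SpendPolicy` class below is entirely hypothetical (no protocol defines it, and the field names and limits are invented), but it shows how a budget, plain-terms rules, an audit trail with the agent's reasoning, and a pause switch fit together.

```python
from dataclasses import dataclass, field


@dataclass
class SpendPolicy:
    """Hypothetical consumer-side guardrail; none of the five protocols defines this."""
    monthly_limit_cents: int
    allowed_merchants: set
    paused: bool = False
    spent_cents: int = 0
    audit_log: list = field(default_factory=list)

    def remaining_cents(self) -> int:
        """What is left under the user's rules."""
        return self.monthly_limit_cents - self.spent_cents

    def authorize(self, merchant: str, amount_cents: int, reason: str) -> bool:
        """Approve or decline a purchase, recording the agent's stated reason either way."""
        if self.paused:
            verdict = "declined: paused"
        elif merchant not in self.allowed_merchants:
            verdict = "declined: merchant not allowed"
        elif amount_cents > self.remaining_cents():
            verdict = "declined: over limit"
        else:
            verdict = "approved"
            self.spent_cents += amount_cents
        self.audit_log.append({
            "merchant": merchant,
            "amount_cents": amount_cents,
            "reason": reason,
            "verdict": verdict,
        })
        return verdict == "approved"


policy = SpendPolicy(monthly_limit_cents=10_000, allowed_merchants={"etsy", "discogs"})
first = policy.authorize("etsy", 2400, "user asked for a ceramic mug under $30")
second = policy.authorize("unknown-shop", 500, "cheaper listing found")
policy.paused = True  # the "something feels off" switch
third = policy.authorize("etsy", 100, "restock tea")
print(first, second, third, policy.remaining_cents())
```

Declined attempts are logged alongside approved ones, because the audit question is as often "what did my agent try to do" as "what did it buy".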

None of that is in any of the five protocols. None of it should be. But all of it has to exist somewhere.

Banks are the historical default for the "place a person manages their money" job, and whether they end up playing that role for agent commerce is an open question. Bank product cycles are measured in years. Agent commerce is moving in weeks. The category of "consumer-native payment surface for the agent era" doesn't have an obvious incumbent yet, and the most interesting thing in the space over the next eighteen months might not be which protocol wins, but who builds the layer that sits above all of them.

Where this goes

The protocols are converging on roughly the same layered shape, whichever ones end up dominant at each layer. The harder question is the trust layer that has to sit above them: who builds it, what it looks like in the hand, and how it earns the right to be the surface a person opens when they want to know what their agents have been up to.

The next post in this series goes deeper on mandates, the AP2 primitive that captures what the user actually authorized. It's the most architecturally interesting piece of the agentic commerce stack, and the shape of it has implications for what any consumer product in this space ends up looking like.