Agents Payment Architecture

Why I believe agent-to-agent payments will not run on just Stripe, Visa, or any other single rail you've heard of, and what the actual architecture is going to look like.

Author

  • Satvik Jagannath

    Co-founder & CEO, Vitra.ai

I've been building products in AI for long enough to know when something shifts under my feet. It's shifting now. And the shift is not the one most people are writing about.

Everyone is fixated on whether ChatGPT will replace Google, whether agents will write all our code, whether OpenAI or Anthropic will win. These are interesting questions. But they are not the most interesting question.

The most interesting question is this: when there are a billion agents in the world making decisions, calling each other, hiring each other, and buying from each other, how does the money move?

Because money has to move. An agent that books your flight has to pay the airline. An agent that scrapes a news site has to pay the publisher. An agent that calls another agent for specialized research has to compensate it. An agent running an e-commerce business has to pay its suppliers, its compute providers, its data vendors, its other agent collaborators. Multiply that across a world where agents are calling each other tens of billions of times a day, each transaction a fraction of a cent, each transaction needing to settle in milliseconds — and you realize almost immediately that the financial infrastructure we have today is not built for this world. Not even close.

This is my thesis on how that future actually looks. I'm writing it the way I'd explain it to a founder over coffee — not as a research report, not hedged to death, but as what I actually believe.

The math breaks before the philosophy does

Start with the economics, because this is where every other argument collapses. A credit card transaction costs roughly 30 cents + about 3%. That is a floor, not a feature to be optimized. It exists because of the underlying infrastructure — the issuing bank, the acquiring bank, the network, the risk models, the chargeback reserve, the fraud operations team, the settlement windows, the lawyers. You can make it cheaper in a specific corridor, but you cannot make it fundamentally cheaper as a global rail because the cost structure is baked into the physics of how the system is governed.

Now imagine an agent paying another agent 3 cents for a single model inference. Or half a cent for a piece of data. Or a hundredth of a cent for an API call. The 30-cent fixed fee alone is 10 times the first payment, 60 times the second, and 3,000 times the third. You cannot clear that on a card rail. Not now, not after any amount of optimization, not ever. The rails were built for humans buying things in the tens and hundreds of dollars. They are economically impossible at the scales at which agents will transact.
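To make the arithmetic concrete, here is a minimal Python sketch. The 30 cents plus 3% pricing comes from the rough figure quoted above; the function and variable names are illustrative, not any processor's API.

```python
from decimal import Decimal

# Assumed card pricing, taken from the rough figure in the text: 30 cents + 3%.
FIXED_FEE = Decimal("0.30")
PERCENT_FEE = Decimal("0.03")


def card_fee(amount: Decimal) -> Decimal:
    """Fee in dollars for one card transaction of `amount` dollars."""
    return FIXED_FEE + PERCENT_FEE * amount


def fee_multiple(amount: Decimal) -> Decimal:
    """How many times larger the fee is than the payment itself."""
    return card_fee(amount) / amount


payments = {
    "model inference (3 cents)": Decimal("0.03"),
    "piece of data (half a cent)": Decimal("0.005"),
    "API call (a hundredth of a cent)": Decimal("0.0001"),
}

for label, amount in payments.items():
    print(f"{label}: fee is about {fee_multiple(amount):,.0f}x the payment")
```

Even the cheapest case for the card rail, the 3-cent inference, pays roughly ten dollars of fees for every dollar of value moved.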

This is not a small point. This is the entire point. The reason the future of agent payments will not be Stripe is not because Stripe is doing something wrong — it is because the arithmetic of card interchange cannot survive contact with sub-cent machine transactions.

When I say “not Stripe,” I don't mean Stripe the company loses. I mean the Visa-rail-with-a-nice-API thing Stripe is famous for cannot be the substrate. Stripe the company is smart enough to know this, which is why Stripe is already building the thing that replaces it. More on that in a minute.

There are two kinds of agent payments, and people keep confusing them

Here is the distinction that unlocks almost everything: consumer-facing agent payments and machine-to-machine agent payments are completely different problems, and they need completely different rails.

Consumer-facing is when I ask my agent to buy me concert tickets. The ticket costs four hundred dollars. I want a chargeback option if something goes wrong. I want the merchant to be verified. I want fraud protection. I want the receipt in my email. I want dispute resolution if the event is cancelled. For this use case, card networks are not merely good — they are irreplaceable. Visa and Mastercard have spent 60 years building the dispute, trust, and insurance infrastructure around these flows, and no blockchain replicates it. When Amex rolled out purchase protection for agent-initiated transactions, I thought it was the most important defensive move any incumbent had made in the entire cycle, because it answers the only question that matters for consumer agent commerce: who eats the loss when the agent is wrong? Cards eat it. Crypto does not.

Machine-to-machine is when my agent calls your agent's specialized tool for data enrichment at $0.003 per call. Across a day, my agent might make fifty thousand such calls across a hundred different providers. There is no dispute. There is no chargeback. Either the data came back or it didn't. Either the inference ran or it didn't. The request either succeeded in milliseconds or failed in milliseconds. Reversibility has no meaning here. Paying a thirty-cent fee on a three-tenths-of-a-cent transaction is not an optimization problem, it's an absurdity.
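Scaled to a full day, the mismatch is easier to see. A rough sketch, assuming the same 30 cents plus 3% card pricing mentioned earlier and the fifty thousand calls at $0.003 described here (all names are illustrative):

```python
from decimal import Decimal

CALLS_PER_DAY = 50_000
PRICE_PER_CALL = Decimal("0.003")  # three-tenths of a cent

# Assumed card pricing: 30 cents fixed plus 3% of the amount.
FIXED_FEE = Decimal("0.30")
PERCENT_FEE = Decimal("0.03")


def daily_totals(calls: int, price: Decimal) -> tuple:
    """Return (value purchased, card fees paid) in dollars for one day."""
    value = calls * price
    fees = calls * (FIXED_FEE + PERCENT_FEE * price)
    return value, fees


value, fees = daily_totals(CALLS_PER_DAY, PRICE_PER_CALL)
print(f"Value purchased: ${value:,.2f}")
print(f"Card fees paid:  ${fees:,.2f}")
```

Under these assumptions the agent buys $150 of data and pays about $15,000 in fees to do it.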

For this second flow — the one that will represent the overwhelming majority of agent transactions by count, even if not by dollar value — the rails have to be new. And they are being built, right now, on stablecoins.

When I say stablecoins, I mean specifically dollar-pegged tokens — the USDCs and PYUSDs of the world — running on fast settlement chains. I do not mean Bitcoin. I do not mean meme coins. I do not mean “crypto” in the 2021 sense. I mean a boring, regulated, fully-reserved digital dollar that moves on a programmable network in milliseconds for fractions of a cent. That is the substrate for the machine economy. It's not ideological. It's arithmetic.

The protocol layer is where the real action is

If you zoom in past the rails question, the more interesting architectural story is happening at the protocol layer. And what's happening there is that the industry is, in real time, resurrecting a nearly 30-year-old, unused piece of HTTP to make it the foundation of machine payments.

Here's the background. In the original HTTP spec, there was a status code, 402 Payment Required, that was reserved for a future feature and never implemented. It sat there for nearly 30 years as a curiosity, a footnote, a “we'll figure this out later.” In the last 18 months, a handful of teams independently realized that 402 is exactly the primitive needed for agent payments. A server replies to an agent's request with “payment required, here are the terms.” The agent signs a payment authorization. The agent retries the request with proof of payment. The server serves the content. One extra round trip. Stateless. No accounts, no API keys, no onboarding, no human in the loop. The entire interaction (discovery, payment, fulfillment) happens in two HTTP requests.
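The request, 402, retry-with-proof loop can be sketched in a few lines of Python. This is an in-process simulation under loud assumptions, not any real protocol's wire format: the terms dictionary, the function names, and the use of an HMAC with a demo key in place of a real wallet signature are all hypothetical.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Optional

# Demo key only. A real scheme would use the agent's own key and an asymmetric
# signature, so the server could verify without sharing a secret.
AGENT_KEY = b"demo-agent-key"

# Hypothetical payment terms the server quotes for this resource.
TERMS = {"amount": "0.003", "currency": "USD", "pay_to": "merchant-123"}


@dataclass
class Response:
    status: int
    body: dict


def sign_payment(terms: dict, key: bytes) -> str:
    """Agent side: authorize exactly the quoted terms."""
    message = json.dumps(terms, sort_keys=True).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def handle_request(path: str, payment_proof: Optional[str] = None) -> Response:
    """Server side: quote terms with a 402, or serve once the proof checks out."""
    if payment_proof is None:
        return Response(402, {"terms": TERMS})
    expected = sign_payment(TERMS, AGENT_KEY)
    if not hmac.compare_digest(expected, payment_proof):
        return Response(402, {"terms": TERMS, "error": "invalid payment proof"})
    return Response(200, {"data": f"contents of {path}"})


def fetch_with_payment(path: str) -> Response:
    """Agent side: request, sign the quoted terms if asked, retry with proof."""
    first = handle_request(path)
    if first.status != 402:
        return first
    proof = sign_payment(first.body["terms"], AGENT_KEY)
    return handle_request(path, payment_proof=proof)


result = fetch_with_payment("/enrich/acme")
print(result.status, result.body)
```

The sketch leaves out everything that makes this hard in production, such as replay protection, settlement, and checking the quoted price against a budget, but the shape is the point: the server keeps no session, and the agent needs no account.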

This is a profound design choice, and I don't think most people appreciate how profound. The web's original payment primitive was hiding in plain sight, and it turns out to be exactly what the agent economy needs. Not because someone was clever enough to design it, but because the same constraints that made HTTP the right protocol for documents (stateless, simple, composable) are the right constraints for machine payments at scale.

The other thing happening at the protocol layer is the idea of mandates. A mandate is a cryptographically signed piece of paper that says, in essence, “my user has authorized me, their agent, to spend up to $400 on concert tickets in the next 48 hours, refundable within 24 hours of purchase.” It is a verifiable digital credential, signed by a hardware-backed key on the user's device, that the agent carries with it. When the agent goes to buy something, the merchant can verify the signature, see the scope of authority, and know — cryptographically, not just on faith — that this transaction is authorized.
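A mandate of this kind can be sketched as a small signed payload plus a verification function. Everything here is an assumption for illustration: the field names are hypothetical, the refund clause is omitted, and an HMAC with a demo key stands in for the hardware-backed asymmetric signature a real merchant would verify against a public key.

```python
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone
from decimal import Decimal

# Demo key only: stands in for a hardware-backed key on the user's device.
DEVICE_KEY = b"demo-device-key"


def _sign(payload: dict) -> str:
    message = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(DEVICE_KEY, message, hashlib.sha256).hexdigest()


def issue_mandate(max_amount: str, purpose: str, valid_hours: int, now: datetime) -> dict:
    """User's device: grant the agent scoped, time-limited spending authority."""
    payload = {
        "max_amount": max_amount,
        "purpose": purpose,
        "expires_at": (now + timedelta(hours=valid_hours)).isoformat(),
    }
    return {"payload": payload, "signature": _sign(payload)}


def authorizes(mandate: dict, amount: str, purpose: str, now: datetime) -> bool:
    """Merchant: check the signature, expiry, purpose, and spending cap."""
    payload = mandate["payload"]
    if not hmac.compare_digest(_sign(payload), mandate["signature"]):
        return False  # payload was altered after signing
    if now >= datetime.fromisoformat(payload["expires_at"]):
        return False
    if purpose != payload["purpose"]:
        return False
    return Decimal(amount) <= Decimal(payload["max_amount"])


signed_at = datetime(2026, 4, 1, 12, 0, tzinfo=timezone.utc)
mandate = issue_mandate("400.00", "concert_tickets", 48, signed_at)
one_hour_later = signed_at + timedelta(hours=1)
print(authorizes(mandate, "380.00", "concert_tickets", one_hour_later))
print(authorizes(mandate, "450.00", "concert_tickets", one_hour_later))
```

The first check passes and the second fails because it exceeds the cap. The same function rejects an expired mandate or one whose payload was edited after signing.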

The mandate model matters because it solves the question that haunts every conversation about agent payments: when the agent makes a bad purchase, who is liable? Without mandates, the answer is a legal nightmare — the user didn't click anything, the agent doesn't have legal personhood, the merchant only knows an AI showed up. With mandates, you have a cryptographic chain of custody from the user's biometric authorization on their phone, through the agent's scoped authority, to the merchant's fulfillment commitment, to the payment rail's execution. Every step is signed. Every dispute has evidence. You can build insurance on top of it. You can build regulation on top of it. You can build a real economy on top of it.

The architecture I actually believe in

Here is what I think the stack will look like by the end of this decade. I'm going to describe it from the bottom up, because that's how the value flows. You can refer to the image at the top of this post for a visual.

At the bottom, you have settlement — the actual movement of value. This is not one rail. It's multiple rails chosen per-transaction by policy. Consumer purchases above a certain threshold settle on card networks, because you want chargebacks and insurance. Machine-to-machine micropayments settle on stablecoins, because the economics and latency demand it. Cross-border B2B settles on stablecoin corridors, because the existing correspondent banking system is a tire fire at that size. The decision of which rail to use is not a product decision the user makes — it's a policy decision the protocol makes based on transaction size, counterparty trust, jurisdiction, and dispute risk.
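As a rough illustration of rail selection by policy, here is a minimal Python sketch. The threshold value, the field names, and the choice to send small consumer purchases to stablecoins are my assumptions for the example; a real policy would also weigh counterparty trust, jurisdiction, and dispute risk.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class Rail(Enum):
    CARD = "card network"
    STABLECOIN = "stablecoin"


# Hypothetical cutoff above which chargebacks and insurance justify card fees.
CONSUMER_CARD_THRESHOLD = Decimal("5.00")


@dataclass(frozen=True)
class Transaction:
    amount: Decimal
    consumer_facing: bool
    cross_border_b2b: bool = False


def choose_rail(tx: Transaction) -> Rail:
    """Pick a settlement rail from transaction attributes, not user preference."""
    if tx.cross_border_b2b:
        return Rail.STABLECOIN  # avoid the correspondent banking path
    if tx.consumer_facing and tx.amount >= CONSUMER_CARD_THRESHOLD:
        return Rail.CARD  # dispute rights matter more than fees here
    return Rail.STABLECOIN  # micropayments: economics and latency decide


print(choose_rail(Transaction(Decimal("400.00"), consumer_facing=True)))
print(choose_rail(Transaction(Decimal("0.003"), consumer_facing=False)))
```

The caller never names a rail. It describes the transaction, and the policy decides.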

Above that, you have capability negotiation — the protocol layer where the buying agent and the selling agent figure out what's being sold, what's being paid, and on which rail. This is where the revived 402 pattern lives, generalized to support multiple payment methods in a single handshake. The agent says “I want this.” The merchant says “it costs this, payable in these ways.” The agent picks the rail and signs the payment. The merchant verifies and serves. It looks the same whether the money moves on a card, a stablecoin, or a bank transfer — the rail is invisible above this layer.
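The negotiation step itself can be very small. In this sketch the offer shape and method names are hypothetical; the only idea it demonstrates is that the merchant lists acceptable methods and the agent picks its most preferred one that both sides support.

```python
from typing import Optional, Sequence

# Hypothetical offer a merchant might attach to a "payment required" response.
offer = {
    "resource": "/tickets/123",
    "amount": "400.00",
    "currency": "USD",
    "methods": ["card", "stablecoin", "bank_transfer"],
}


def pick_method(offered: Sequence[str], preferred: Sequence[str]) -> Optional[str]:
    """Return the agent's most preferred method that the merchant also accepts."""
    for method in preferred:
        if method in offered:
            return method
    return None  # no common rail: the agent cannot complete this purchase


# The agent's preference order would come from its own rail policy.
print(pick_method(offer["methods"], ["stablecoin", "card"]))
```

Everything after this point (signing, verification, fulfillment) can stay the same regardless of which method was picked, which is what keeps the rail invisible to the layers above.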

Above that, you have authorization and intent — the mandate layer. This is where the user's device-signed authorization lives. When my phone unlocks with my face, that biometric event can be used to sign a mandate that grants my agent scoped spending authority. The mandate travels with the agent, gets shown to merchants, gets verified by payment networks, and creates the evidentiary chain that makes dispute resolution possible.

Above that, you have identity and attestation — the question of whether this agent is who it says it is and is authorized to be acting for who it says it's acting for. The answer here is going to be cryptographic signatures on HTTP requests, with keys registered in network-operated directories. Which networks? The card networks, yes — Visa and Mastercard and Amex will end up operating the trust registries for consumer-facing agents, because they are the only entities with the scale and the compliance posture to do it. And Cloudflare, which quietly ended up in a position to be the trust registry for the web itself. And a handful of open registries for agents that operate outside the consumer payment flow.

At the top, you have distribution — the agent interfaces people actually use. This is OpenAI, Google, Anthropic, Meta, and a handful of others. These are the aggregators, the surfaces, the places where users delegate their purchases to an agent in the first place. This layer is the most valuable and the most concentrated, and it's where Aggregation Theory plays out in the agent era the same way it did in the consumer internet era. There will be a handful of winners, the winners will have enormous margin, and the rails below them will commoditize.

The thing I believe that most people are missing

If you squint at this architecture, you'll notice something. The interesting value capture is not at the settlement layer. Stablecoin issuers compete on yield and regulatory posture; their margin is thin. Card networks survive but see their interchange slowly ground down on agent flows. The bottom of the stack becomes a commodity — dollars moving through programmable pipes, with the pipes' operators making thin spreads.

The interesting value capture is at the identity and intent layers. Who operates the cryptographic trust registry that every agent must register with? That's a toll booth. Who operates the mandate-signing infrastructure that every consumer's phone uses to authorize agent purchases? Also a toll booth. Who insures the gap between what the mandate authorized and what the agent actually did? A new category entirely, probably worth ten billion dollars of annual fees by the early 2030s.

And the most interesting capture of all — the one I think is genuinely underbuilt right now — is programmable attribution across multi-agent chains. When Agent A calls Agent B which calls Agent C which calls a data provider which calls an inference API, and the end user pays ten cents for the whole thing, how does the ten cents split back through the chain? There is no primitive for this today. Everyone is rolling their own. Whoever builds the neutral, cross-rail, protocol-level answer to this question — think ASCAP but for agent networks — is building a multi-billion-dollar business. I don't know who wins this, but I'd bet on it happening before 2030.
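To show what such a primitive has to get right at minimum, here is a sketch of a proportional split in Python. The participants and weights are invented, and proportional weighting is only one possible rule. The code works in integer micro-dollars and uses a largest-remainder step so the shares always sum exactly to what the user paid.

```python
from typing import Dict


def split_payment(total_micros: int, weights: Dict[str, int]) -> Dict[str, int]:
    """Split a payment in proportion to integer weights, with no lost units."""
    weight_sum = sum(weights.values())
    # Floor each proportional share first.
    shares = {name: total_micros * w // weight_sum for name, w in weights.items()}
    leftover = total_micros - sum(shares.values())
    # Hand the leftover units to the participants with the largest remainders.
    by_remainder = sorted(
        weights,
        key=lambda name: (total_micros * weights[name]) % weight_sum,
        reverse=True,
    )
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


# Ten cents is 100,000 micro-dollars. The weights below are hypothetical.
chain = {
    "agent_a": 40,
    "agent_b": 25,
    "agent_c": 15,
    "data_provider": 10,
    "inference_api": 10,
}
print(split_payment(100_000, chain))
```

The hard parts are not in this function: who sets the weights, how each hop proves it took part, and how the split settles across different rails. That is the gap the paragraph above describes.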

What I don't believe

I want to be honest about the things I don't believe, because a thesis that doesn't contradict anything is probably not a real thesis.

I do not believe the future is a single blockchain hosting all agent payments. The crypto-maximalist version of this story — where every transaction is on-chain and “traditional finance” is disintermediated — misunderstands what people actually want from consumer-facing commerce. They want chargebacks. They want fraud protection. They want to undo mistakes. Blockchains are good at many things; being undoable is not one of them.

I do not believe the future is a single winning payment protocol. The “HTTP of agent payments” framing is a useful analogy but a misleading prediction. HTTP didn't win alone. HTTP won with TCP/IP underneath, TLS alongside, DNS as the directory, and payment networks layered on top. The agent payment future is the same — layers, not a single winner.

I do not believe Stripe loses, even though I think the rail Stripe was built on will not carry the agent economy. Stripe is one of the smartest infrastructure companies ever built. They're already hedged. They have the card-rail play for consumer commerce. They have a stablecoin chain in development for machine commerce. They have a protocol for both. Betting against Stripe is betting against a company that has already seen this transition coming and is building for it from both ends.

I do not believe meme-coin agent tokens matter. There is a whole cohort of crypto projects right now issuing tokens that are supposed to “power the agent economy.” Most of them are marketing. The real plumbing is being built by boring companies with boring names using boring regulated stablecoins. Pay attention to the plumbing. Ignore the tokens.

And I do not believe this is happening as fast as the hype suggests. The protocols are being built faster than the behavior is changing. Consumers are not yet comfortable buying things through chat windows. Enterprises are not yet comfortable letting agents spend their budgets. The infrastructure will be years ahead of the adoption, and the gap is going to be awkward and full of failed pilots. That's fine. That's how every platform shift has ever worked. The internet was ready to sell things in 1995; consumers were ready to buy things in 2003. We are currently in the 1997 of agent commerce.

The part I get existentially excited about

I want to end on what I think the real story is, past all the protocol-war stuff, past the rail analysis, past the thesis on who wins.

The story is that for the first time in the history of capital, the default participant in the economy is not going to be a human. The overwhelming majority of transactions by count will be machines transacting with machines at scales and speeds no human economy could support. Agents will hire other agents. Agents will accumulate capital. Agents will pay humans for training data, for compute, for specialized judgment. Agents will run businesses, and those businesses will be profitable in ways that look alien to us, because the operating cost structure — no rent, no payroll, no physical space, no sleep — is fundamentally different from anything humans have ever run.

The rails I've been describing in this post are the circulatory system of that economy. They're being built now. They're not finished. They're going to be rebuilt several times before they're stable. And the people building them, right now, in April of 2026, are doing the most consequential infrastructure work of this generation, even though almost nobody outside a narrow technical circle is paying attention.

A founder should build on the gaps. Identity. Mandates. Insurance. Attribution. The boring plumbing. The toll booths.

An investor should stop looking at the tokens and start looking at the standards bodies. The IETF drafts from the last 6 months tell you more about the next decade than any pitch deck will.

If you're just curious — pay attention. This is the transition. It's not the most cinematic part of the AI revolution, but it might be the part that matters most, because it's the part that decides who captures the value created by everything else.

The money is going to move differently. The infrastructure to make it move differently is being built in public, on GitHub, in developer docs, in bilateral partnerships between companies that would never have collaborated five years ago. The rails are bifurcating. The protocols are converging. The identity layer is consolidating. The insurance layer doesn't exist yet but will. The attribution primitive doesn't exist yet but has to.

This is how I think it's going to look. I'll probably be wrong about the specifics and right about the shape. That's true of most predictions. But I feel good about the general contours of this thesis, because it's based on the math of payments and the history of platform shifts, not on the hype of the moment.

Build it, or build on it. But don't ignore it. The last time a payment layer got rewritten from scratch, it defined the next three decades of the internet. This one is going to define the next three decades of whatever comes after.

I can't wait to see who builds it and how the economy changes when it gets built. It's going to be a wild ride.
