AI Agent Economic Infrastructure in Progress: The X402 Protocol and the PayFi Revolution

Guest Columnist

2026-04-23 10:39



AI Summary


Core Thesis: By embedding payments into the HTTP request flow, the X402 protocol enables AI agents to conduct instant micropayments without human intervention. Combined with the ERC-8004 identity standard, it establishes a dual-layer trust system of "proof of identity" and "proof of execution," providing foundational infrastructure for the agent economy.
Key Elements:
1. X402 is positioned as a payment-triggered protocol, embedding payment capabilities into HTTP requests to form a "request-payment-delivery" closed loop. It does not maintain accounts or handle identity, and when integrated with the ERC-8004 identity registry and reputation registry, it addresses trust issues.
2. The core technical architecture separates the "Authorization Chain" from the "Execution Chain." The Authorization Chain addresses "whether payment is allowed," while the Execution Chain addresses "how to complete the payment." The Facilitator plays a pivotal role in identity verification, proof of execution generation, and risk control.
3. The commercial outlook focuses on shifting the API economy from account-driven to request-driven. It supports high-frequency, low-value micropayments, promotes instant settlement and credit decoupling in resource markets (such as computing power, data, and content), and enables payment actions themselves to become a foundational input for generating financial credit.
4. In terms of financial compliance, it faces structural challenges including fragmented AML/CFT responsibilities, the lack of legal personhood for AI agents, and ambiguous legal validity of algorithmic authorizations. Code-level protective mechanisms (such as conditional releases and dispute arbitration) are needed to compensate for these gaps.
5. Legal and regulatory frameworks need to clearly define the status of AI agents as "electronic agents." Through limited authorization, auditable logs, and authorization layer protocols like AP2, vague intents can be converted into traceable cryptographic contracts to address consumer protection and copyright compliance requirements.

Introduction: The Background of X402

In the spring of 2025, a seemingly inconspicuous standard quietly emerged. The X402 payment trigger protocol, jointly launched by institutions including Coinbase and Cloudflare, extends the HTTP "402 Payment Required" status code so that machines can issue and answer payment requests instantly, without human confirmation. The proposal was not a whim but a response to a rapidly forming real-world demand: AI agents are stepping to the forefront of actual economic activities.

By the second half of 2025, with the maturation of the virtual agent ecosystem, X402 quickly transitioned from concept to real-world operation. By the end of 2025, the X402 protocol had been upgraded to V2, supporting features like multi-chain parallelism and session mechanisms. It was integrated by mainstream infrastructure such as Google's Agent Payments Protocol (AP2), helping agents bridge payment behaviors between on-chain and off-chain systems.

Entering 2026, discussions and practices surrounding autonomous economic behaviors of agents intensified rapidly. A wave of AI Agent platforms and frameworks, represented by OpenClaw, sparked an "agent craze": developers used them to build autonomous task executors, content producers, automated service providers, and even attempted to enable these intelligent entities to autonomously earn rewards, pay fees, and purchase services. This phenomenon not only captured the interest of entrepreneurs and developer communities but was also incorporated into incentive systems by major ecosystems. For instance, the collaboration between AIsa and OpenClaw Demo Day promoted the implementation of agent payment infrastructure.

However, these practices simultaneously exposed a deep-seated problem: existing traditional payment systems were designed for human users—each transaction requires identity verification, interaction confirmation, authorization approval, and human-machine dialogue. These processes are clearly unsuitable for high-frequency, on-demand, millisecond-triggered agent payment scenarios, severely constraining the scaling of agent collaboration, automated procurement, and the micro-payment economy. To meet these automated, precise, and high-frequency transaction needs, merely calling existing payment APIs is insufficient. Traditional systems emphasize human control and accountability, whereas agents seek to complete payments freely and securely within authorized boundaries.

The introduction of X402 precisely targets this conflict. It is not a simple settlement tool or API, but a payment trigger protocol that embeds payment capabilities into the request flow, enabling machines to instantly initiate payments and complete operations within authorized scopes in a verifiable manner. The protocol itself does not maintain accounts or handle identities. Instead, using one-time, request-bound proof of payment capability, it allows automated systems to conduct micro-payments safely, efficiently, and without human intervention. This design not only bridges the gap between traditional payment systems and machine-autonomous payments but also provides the underlying payment semantics and infrastructure support for the emerging Agent economy.

I. Technical Architecture

1.1 The Fundamental Logic of Agent Payments

1.1.1 Why the Paying Entity Shifts from "Human" to "Machine"

AI Agent payment is not a "new payment tool" but a systemic restructuring issue arising from the change in the subject of payment decision-making. The key lies not in the settlement method, but in how authorization is expressed, how execution is constrained, and how risks are traced.

In traditional payments, every transaction initiation involves human confirmation—clicking a button, entering a password, scanning a code for authorization. However, when an AI Agent becomes the entity initiating service calls, this premise no longer holds true. Agents need to autonomously complete high-frequency tasks like API calls, data purchases, and compute rental, involving small-value resource consumption, without human intervention. These transactions are characterized by high frequency, low value, long-tail distribution, and strong real-time requirements.

The shift of the paying entity from human to machine introduces a series of new risks:

• Unforeseeable triggering: an Agent might trigger a payment without explicit instructions;

• Authority abuse: an Agent given excessive authority could cause fund loss;

• Accountability challenges: when problems arise, it is difficult to trace back to the specific authorization source and decision logic;

• Missing audit trail: transaction records lack structured proof, making post-event audit difficult.

These risks dictate that an Agent-oriented payment system cannot simply reuse traditional payment architecture; it must undergo systemic restructuring along the three dimensions of authorization, execution, and audit.

1.1.2 Why "Authorization" and "Execution" Must Be Separated

In traditional payment systems, "whether payment is allowed" and "how to complete the payment" are usually handled by the same system. In the AI Agent scenario, this approach is no longer viable. A more reasonable path, on which protocol designs have begun to converge, is to decompose Agent payment into two independent yet composable logical paths:

• Authorization Path: Addresses "whether the Agent is allowed to pay on behalf of the user"—including the source, boundaries, validity, and revocation mechanisms of authorization.

• Execution Path: Addresses "how a specific payment is completed and triggers resource delivery"—including payment initiation, verification, settlement, and proof generation.

The core reason for this separation is that if an Agent is both the source of authorization and the party with discretion over execution, system risk becomes uncontrollable. The authorization source must always be the user (the fund owner), while execution can be delegated to the Agent under strict constraints. This separation ensures that even if the Agent behaves abnormally, losses are confined within the authorized boundaries, and that any payment can be traced back to a clear human authorization decision.

From a system design perspective, AI Agent payment involves at least five roles, each with clearly defined responsibilities:

• User (Fund Owner): Provides the initial authorization and bears ultimate responsibility;

• AI Agent (Executing Entity): Initiates payment requests under authorization constraints;

• Merchant/Resource Provider: Sets pricing, verifies payment, and decides whether to release resources;

• Authorization/Trust Layer: Expresses and verifies "who can pay under what conditions";

• Payment Execution Layer: Completes a specific, verifiable, and settleable payment action.
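The split between the authorization path and the execution path can be illustrated with a minimal in-memory sketch. Everything here is an assumption made for illustration: the `Mandate` fields, the class names, and the integer amounts are hypothetical and do not come from x402 or any related specification. The point is only that the execution layer never decides on its own whether a payment is allowed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mandate:
    """Hypothetical limited mandate: issued by the user, immutable for the agent."""
    user: str
    agent: str
    max_amount: int               # per-payment cap, in smallest currency units
    allowed_merchants: frozenset
    expires_at: int               # Unix timestamp


class AuthorizationLayer:
    """Answers only 'is this payment allowed?'; it never moves funds."""

    def __init__(self):
        self._revoked = set()

    def revoke(self, mandate):
        self._revoked.add(mandate)

    def permits(self, mandate, agent, merchant, amount, now):
        return (
            mandate not in self._revoked
            and mandate.agent == agent
            and merchant in mandate.allowed_merchants
            and 0 < amount <= mandate.max_amount
            and now < mandate.expires_at
        )


class ExecutionLayer:
    """Answers only 'how is the payment completed?'; it defers to authorization."""

    def __init__(self, auth):
        self._auth = auth
        self.ledger = []          # stand-in for a real settlement network

    def pay(self, mandate, agent, merchant, amount, now):
        if not self._auth.permits(mandate, agent, merchant, amount, now):
            return False
        self.ledger.append((mandate.user, merchant, amount))
        return True


auth = AuthorizationLayer()
execution = ExecutionLayer(auth)
mandate = Mandate("alice", "agent-1", 500, frozenset({"api.example"}), expires_at=2_000)

within_limit = execution.pay(mandate, "agent-1", "api.example", 300, now=1_000)
over_limit = execution.pay(mandate, "agent-1", "api.example", 900, now=1_000)
auth.revoke(mandate)
after_revocation = execution.pay(mandate, "agent-1", "api.example", 300, now=1_000)
```

Because the mandate is frozen and revocation lives in the authorization layer, a misbehaving agent in this sketch can at worst spend up to the cap until the user revokes the mandate.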

1.1.3 How a Standardized Agent Payment Cycle Forms

A standardized AI Agent payment follows the principle of "authorization upfront, conditional triggering, autonomous execution," and the entire process can be divided into four stages:

Authorization Stage: The user issues a Limited Mandate to the Agent, defining the amount, target, and duration, and thereby establishing the legal boundaries for fund expenditure. The authorization statement needs to be stored on-chain or in another verifiable form.

Request Stage: The Agent sends a request to the resource provider. The provider returns a machine-readable payment instruction (including amount, currency, receiving address, expiration time, etc.), clarifying the transaction target and payment consideration.

Execution Stage: After the Agent internally verifies that the request complies with authorization boundaries and risk control policies, it automatically completes the payment and generates a one-time, non-replayable payment proof.

Settlement Stage: The resource provider verifies the authenticity and uniqueness of the payment proof, releases the resource once receipt is confirmed, and records the transaction for auditing.

These four stages form an automated "conditional payment—conditional delivery" cycle. Its engineering significance lies in decoupling "payment intent" from "execution action," allowing the Agent to efficiently complete high-frequency, low-value resource procurement without relying on subjective trust.
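The four stages can be modeled as plain data passing between an agent and a provider. The sketch below is illustrative only: the type names, the `budget` parameter (standing in for the mandate issued in the authorization stage), and the 60-second expiry are assumptions, and no funds actually move.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentInstruction:          # returned by the provider in the request stage
    resource: str
    amount: int
    pay_to: str
    expires_at: int


@dataclass(frozen=True)
class PaymentProof:                # produced by the agent in the execution stage
    resource: str
    amount: int
    nonce: str


class Provider:
    def __init__(self, prices):
        self.prices = prices
        self.seen_nonces = set()
        self.audit_log = []

    def quote(self, resource, now):
        """Request stage: answer with a machine-readable payment instruction."""
        return PaymentInstruction(resource, self.prices[resource], "provider-wallet", now + 60)

    def settle(self, proof):
        """Settlement stage: release the resource only for a fresh, correctly priced proof."""
        fresh = proof.nonce not in self.seen_nonces
        priced = self.prices.get(proof.resource) == proof.amount
        if not (fresh and priced):
            return None
        self.seen_nonces.add(proof.nonce)
        self.audit_log.append(proof)
        return f"content of {proof.resource}"


def agent_pay(instruction, budget, now):
    """Execution stage: pay only if the instruction fits the authorized budget."""
    if instruction.amount > budget or now >= instruction.expires_at:
        return None
    return PaymentProof(instruction.resource, instruction.amount, secrets.token_hex(8))


provider = Provider({"/report": 5})
instruction = provider.quote("/report", now=0)
proof = agent_pay(instruction, budget=10, now=1)
first = provider.settle(proof)
replay = provider.settle(proof)    # same nonce, so the provider refuses
```

The replayed proof is rejected because the provider has already recorded its nonce, which is one simple way to obtain the "one-time, non-replayable" property.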

The main risks of AI Agent payment are not concentrated in the transfer step itself, but in three areas: Authorization Risk (whether the Agent is over-authorized, whether authorization is revocable and auditable, whether there are semantic ambiguities); Execution Integrity Risk (whether payment is strictly bound to a specific resource request, whether there is replay or concurrent abuse); Systemic and Compliance Risk (whether a complete chain of evidence can be formed, whether it supports post-event reconciliation and compliance audits).

1.1.4 Criteria for Determining Commercial Readiness of a System

A scalable AI Agent payment system should at least meet the following constraints:

Security Dimension: All payments traceable to a clear authorization source; one-to-one correspondence between payment and single resource request; payment credentials non-reusable and non-transferable; execution layer does not rely on subjective trust assumptions; full-process events auditable and reproducible; authorization must be revocable and limitable.

Cost Dimension: Transaction friction low enough to support high-frequency micro-payment scenarios; integration cost friendly for developers, merchants, and platforms; does not require complex pre-established account systems or subscription relationships.

Ecosystem Dimension: Compatible with existing payment networks and settlement systems; supports an open competitive landscape with multiple chains, multiple assets, and multiple Facilitators; protocol standardization sufficient for interoperability.

Compliance Dimension: Clear responsible party—who authorized, who executed, who bears ultimate responsibility, all recorded on-chain or in structured format; supports KYC/AML, cross-border sanctions, KYT and other regulatory checks; clear dispute resolution path with evidence foundation for dispute arbitration.

Whether these conditions are met is the core criterion for evaluating different protocols and the starting point for assessing x402 and its complementary mechanisms.

1.2 x402's Positioning, Mechanism, and Trust Complements

1.2.1 What x402 Specifically Solves and What It Does Not Solve

x402's core positioning is a payment trigger protocol—it sits between HTTP APIs and settlement networks, acting as an information relay bridge. x402 is not a payment system; it solves a precise problem: how to embed "payment" into the "HTTP request—resource delivery" chain, forming a standardized skeleton for per-request payment.

Specifically, x402 solves the following:

• Complements the HTTP 402 status code with a machine-readable, executable, and verifiable minimal information set;

• Completes the "request—payment—delivery" cycle without breaking HTTP's stateless nature;

• Makes payment a capability proof attached to the request, rather than a prerequisite account relationship or subscription binding.

It is equally important to understand what x402 does not solve:

• Does not solve the full trust problem of identity and authorization—it doesn't care "who you are," only "whether this request has been paid for";

• Does not handle all payment networks and settlements—it doesn't bind to a specific chain or payment channel; settlement is handled by external networks;

• Does not cover complex business relationships—it is unsuitable for scenarios requiring long-term customer relationship management or complex billing strategies;

• Does not solve regulatory compliance and dispute arbitration—these need to be handled by upper-layer business systems or supplementary protocols.

This restrained positioning allows x402 to avoid the trap of "over-design and difficult implementation." It acknowledges it is not a universal payment system, accepts the existence of external settlement networks, and does not seek to cover all commercial relationships. Precisely because of this, it emphasizes being "machine-oriented"—if the interacting party is a human user, payment issues can be solved through page redirects, QR codes, or wallet prompts; it's only when the caller is a program that protocol-level expression becomes necessary.

A brief comparison with existing payment models is also worthwhile: Web2 payments usually happen before the call (subscription/prepaid), Web3 payments are initiated by the user's active transaction, whereas x402 embeds payment into the request process, triggered by the server. Web2 maintains a large amount of account state, Web3 puts state on-chain, and x402 attempts to compress state into a single request. From an automation perspective, x402's advantage is clear—it requires neither interactive authorization nor holding long-term private keys; as long as the Client can execute payment and attach proof, the call can be completed.

1.2.2 From HTTP 402 to Request-Level Payment

HTTP 402 (Payment Required) has been part of HTTP/1.1 since its early specifications (RFC 2068 and its successor RFC 2616), where it is marked as reserved for future use with no defined executable semantics. HTTP design intentionally avoided payment mechanisms, leaving them for higher-level systems to handle. There were earlier attempts to fill the gap. LSAT (Lightning Service Authentication Token), proposed by Lightning Labs in 2019, bound HTTP 402 to Lightning invoices, so that a paid invoice could be exchanged for a bearer token. LSAT proved that 402 was usable but exposed several problems: after payment, the system degraded to session-based tokens; the server needed to maintain token-to-permission mappings; and payment was decoupled from the request, making "one request, one proof" difficult.

x402 explicitly abandons the "payment for identity" path, instead treating the payment result itself as a one-time proof of capability. Its core design constraint is: no session introduced, no user accounts maintained, no requirement for the server to remember "who has already paid." Under this constraint, payment is redefined as a capability proof attached to a request—as long as the proof is verifiable, non-forgeable, and bindable to the request, the server can release resources in a completely stateless manner.
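To make "a capability proof attached to a request" concrete, the sketch below binds a proof to the fields of one request and verifies it by recomputation, with no session or account lookup. It is a simplification under stated assumptions: an HMAC with a shared demo key stands in for whatever signature a settlement network or Facilitator would actually produce, and the function names and field layout are hypothetical.

```python
import hashlib
import hmac

# Assumption: a shared secret stands in for a real settlement-layer signature.
FACILITATOR_KEY = b"demo-key-not-for-production"


def issue_proof(method, path, amount, nonce):
    """Bind a payment to exactly one request by MAC-ing the request fields."""
    message = f"{method}|{path}|{amount}|{nonce}".encode()
    return hmac.new(FACILITATOR_KEY, message, hashlib.sha256).hexdigest()


def verify_proof(method, path, amount, nonce, proof):
    """Recompute the MAC from the incoming request and compare in constant time."""
    expected = issue_proof(method, path, amount, nonce)
    return hmac.compare_digest(expected, proof)


proof = issue_proof("GET", "/data", 5, "nonce-1")
same_request = verify_proof("GET", "/data", 5, "nonce-1", proof)
other_path = verify_proof("GET", "/other", 5, "nonce-1", proof)
other_amount = verify_proof("GET", "/data", 1, "nonce-1", proof)
```

Verification here needs nothing but the request and the proof, so a proof issued for one path or amount does not unlock another. Single-use enforcement is not shown; the uniqueness of the nonce would still have to be checked somewhere, for example by the settlement layer or a small replay cache.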

A complete x402 payment execution path is as follows:

a. Client/Agent Initiates an HTTP Request to access a resource endpoint requiring payment;

b. Server Returns a 402 Response with a machine-readable payment requirement (amount, currency, receiving address, network, expiration time, proof type, etc.), allowing the Client to make a decision without human intervention;

c. Agent Completes Payment within the Authorized Boundary, generating a payment proof;

d. Agent Re-initiates the Request with the Payment Proof;

e. Server Verifies the Authenticity and Uniqueness of the Proof, then releases the resource and records the audit.

The key point is that payment is not a prerequisite step but a branch taken after the initial request is refused. This design lets x402 fit naturally into stateless HTTP architecture.
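Steps a through e can be simulated without a network. In the sketch below, the `Payment-Proof` header name, the JSON field names, and the stub wallet and Facilitator are all illustrative assumptions and are not the protocol's actual wire format; only the status codes 402 and 200 are taken from HTTP itself.

```python
import json

# Hypothetical payment requirement returned with a 402 response.
PRICE = {"amount": "0.01", "currency": "USDC", "pay_to": "0xMerchant", "network": "example-net"}
used_proofs = set()


def facilitator_verify(proof):
    """Stub verifier: accepts any unseen proof carrying the 'paid:' prefix."""
    return proof.startswith("paid:") and proof not in used_proofs


def handle_request(path, headers):
    """Server side: returns (status, headers, body)."""
    proof = headers.get("Payment-Proof")
    if proof is None or not facilitator_verify(proof):
        return 402, {"Content-Type": "application/json"}, json.dumps(PRICE)
    used_proofs.add(proof)                       # step e: record, then release
    return 200, {}, f"resource at {path}"


def pay(requirement, request_id):
    """Stub wallet: pretends to settle and returns a proof string."""
    return f"paid:{requirement['amount']}:{request_id}"


def fetch_with_payment(path, request_id):
    status, _, body = handle_request(path, {})                          # step a
    if status != 402:
        return status, body
    requirement = json.loads(body)                                      # step b
    proof = pay(requirement, request_id)                                # step c
    status, _, body = handle_request(path, {"Payment-Proof": proof})    # steps d and e
    return status, body


result = fetch_with_payment("/weather", "req-1")
replayed_status = handle_request("/weather", {"Payment-Proof": "paid:0.01:req-1"})[0]
```

The client only pays after seeing a 402, and a replayed proof simply sends the caller back to the 402 branch.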

The protocol evolution of x402 is also noteworthy. Early experimental implementations faced interoperability issues: payment instruction formats varied, proof verification logic was highly customized, and Facilitators were tightly coupled with Servers. V2 addressed this fragmentation by standardizing the payment instruction format, introducing an independent Facilitator role, and abstracting the settlement layer as a replaceable backend. The real engineering significance of V2 is its demonstration that payment can be a protocol state rather than a business prerequisite. This matters most for scenarios such as automated API calls by Agents, cross-organizational service access without account relationships, and per-use instant settlement, where introducing an account system is itself a form of friction.

In engineering practice, x402 cannot "elegantly solve" all edge cases, such as a payment that succeeds while the request ultimately fails, a Client replaying requests, or a temporarily unavailable Facilitator. Similar problems have long existed in Lightning, rollup bridges, and webhook systems, and are ultimately left to the implementation layer. x402 does not pretend to resolve this complexity at the protocol level.

1.2.3 Why x402 Needs a Complementary Identity/Trust Layer

x402 solves "how to embed the payment action into the request," but it has a natural shortcoming: it does not address the identity and trust issues like "who is a trusted Agent," "who authorized it," or "how to revoke authorization." x402 cares about "whether this request is paid for," not "who you are" or "whether you are authorized."

In simple scenarios, this design is sufficient—the Agent has money, pays, and uses the service. However, in scaled commercial scenarios, merely being able to pay is far from enough. Merchants need to know if the counterparty is trustworthy or has malicious history; users need to ensure the Agent doesn't exceed authorization boundaries; regulators need to trace the complete identity chain of a transaction.

Therefore, x402 needs to be supplemented with the following capabilities:

• Identity Registration and Verification: Agents need an on-chain verifiable identity identifier so that merchants and Facilitators can decide whether to accept the request;

• Structured Expression of Authorization Boundaries: It is not enough that the Agent "has money"; the authorization must specify who authorized it, for what resource, under what conditions, and for how long;

• Defined Responsibilities of the Execution Intermediary: A trusted intermediary execution layer (Facilitator) is needed between the Agent and the Merchant, handling identity verification, proof generation, and risk control;

• Auditable and Traceable Evidence Foundation: Transaction records need on-chain anchoring to support post-event dispute arbitration.

These are exactly the capabilities that mechanisms like ERC-8004 attempt to supplement.

1.2.4 What Complementary Mechanisms like ERC-8004 Bring

ERC-8004, formally named "Trustless Agents," is a standard proposed on Ethereum in August 2025 and officially launched on mainnet in January 2026. It does not modify Ethereum's core but extends inter-agent protocol capabilities through three on-chain registries:

• IdentityRegistry: Provides on-chain identity NFT identifiers for Agents, supporting identity validity queries, blacklisting, and freezing mechanisms;

• ReputationRegistry: Provides multi-dimensional aggregated scores for Agents based on historical transaction performance, with transparent and deterministically updated scoring rules;

• ValidationRegistry: Stores execution proofs (Merkle proofs, transaction hashes, timestamps, etc.), ensuring transaction records are tamper-proof and available for auditing and dispute arbitration.
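The division of labor among the three registries can be pictured with a small in-memory model. This is not the ERC-8004 contract interface: the method names, the simple-mean scoring rule, and the SHA-256 anchoring below are assumptions chosen for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class IdentityRegistry:
    agents: dict = field(default_factory=dict)    # agent_id -> owner
    frozen: set = field(default_factory=set)

    def register(self, agent_id, owner):
        self.agents[agent_id] = owner

    def freeze(self, agent_id):
        self.frozen.add(agent_id)

    def is_valid(self, agent_id):
        return agent_id in self.agents and agent_id not in self.frozen


@dataclass
class ReputationRegistry:
    ratings: dict = field(default_factory=dict)   # agent_id -> list of ratings

    def record(self, agent_id, rating):
        self.ratings.setdefault(agent_id, []).append(rating)

    def score(self, agent_id):
        """Assumed scoring rule: plain average of recorded ratings."""
        values = self.ratings.get(agent_id, [])
        return sum(values) / len(values) if values else None


@dataclass
class ValidationRegistry:
    proofs: dict = field(default_factory=dict)    # receipt digest -> agent_id

    def anchor(self, agent_id, receipt: bytes):
        digest = hashlib.sha256(receipt).hexdigest()
        self.proofs[digest] = agent_id
        return digest

    def lookup(self, digest):
        return self.proofs.get(digest)


def should_accept(identity, reputation, agent_id, min_score):
    """Merchant-side check combining identity validity and reputation."""
    score = reputation.score(agent_id)
    return identity.is_valid(agent_id) and score is not None and score >= min_score


identity, reputation, validation = IdentityRegistry(), ReputationRegistry(), ValidationRegistry()
identity.register("agent-1", owner="alice")
reputation.record("agent-1", 4)
reputation.record("agent-1", 5)
accepted = should_accept(identity, reputation, "agent-1", min_score=4.0)
digest = validation.anchor("agent-1", b"receipt for /report")
identity.freeze("agent-1")
accepted_after_freeze = should_accept(identity, reputation, "agent-1", min_score=4.0)
```

In this model a merchant consults identity and reputation before serving a request, while the validation registry only stores digests that can later be matched against a presented receipt.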

The combination of ERC-8004 and x402 constructs a dual trust structure of "identity proof + execution proof":
