
AI Agent Economic Infrastructure in Progress: The X402 Protocol and PayFi Revolution

Peking University Blockchain Research (北大区块链研究)
Invited Columnist
2026-04-23 10:39
AI Summary
  • Core Thesis: By embedding payments into the HTTP request flow, the X402 protocol enables AI agents to perform instant micropayments without human intervention. Combined with the ERC-8004 identity standard, it establishes a dual-layer trust system of "proof of identity + proof of execution," providing foundational infrastructure for the agent economy.
  • Key Elements:
    1. X402 is positioned as a payment trigger protocol, embedding payment capabilities into HTTP requests to form a "request-payment-delivery" closed loop. It does not maintain accounts or handle identities. By integrating with the ERC-8004 identity registry and reputation registry, it solves the trust problem.
    2. The core technical architecture separates the "Authorization Chain" from the "Execution Chain." The Authorization Chain addresses "whether payment is allowed," while the Execution Chain addresses "how to complete the payment." The Facilitator acts as a hub for identity verification, execution proof generation, and risk control.
    3. Business prospects focus on the API economy shifting from account-driven to request-driven models. It supports high-frequency, small-value micropayments, promotes instant settlement and credit decoupling in markets for computing power, data, and content, and turns payment behavior itself into a fundamental input for generating financial credit.
    4. In terms of financial compliance, it faces structural challenges such as fragmented AML/CFT responsibilities, the lack of legal personhood for AI agents, and the vague legal effect of algorithmic authorization. Code-level protection mechanisms (e.g., conditional release, dispute arbitration) need to be introduced as remedies.
    5. Legal regulation requires clarifying the "electronic agent" status of AI agents. By using limited authorization, auditable logs, and authorization layer protocols like AP2, vague intentions can be transformed into traceable cryptographic contracts to meet consumer protection and copyright compliance requirements.

Introduction: The Background of X402

In the spring of 2025, an unassuming standard quietly emerged. The X402 Payment Trigger Protocol, jointly launched by Coinbase and Cloudflare, extends the HTTP "402 Payment Required" status code so that machines can issue and answer payment requests instantly, without human confirmation. The protocol did not appear out of thin air; it responded to a reality that was taking shape quickly: AI agents are moving to the forefront of real economic activity.

By the second half of 2025, as the virtual agent ecosystem matured, X402 moved quickly from concept to real-world operation. By the end of 2025 the protocol had been upgraded: Version 2 supports multiple chains in parallel and session mechanisms, among other features, and has been integrated by mainstream infrastructure such as Google's Agent Payments Protocol (AP2), helping agents bridge payment behavior between on-chain and off-chain systems.

Entering 2026, discussions and practices surrounding autonomous agent economic behavior intensified rapidly. A wave of "agent mania" was sparked by a group of AI Agent platforms and frameworks represented by OpenClaw: developers used them to build autonomous task executors, content producers, automated service providers, and even attempted to enable these intelligent entities to earn rewards, pay fees, and purchase services autonomously. This phenomenon not only piqued the interest of entrepreneurs and developer communities but has also been integrated into the actual incentive systems of major ecosystems, such as the collaboration between AIsa and OpenClaw Demo Day, promoting the implementation of Agent payment infrastructure.

However, these practices also exposed a deep-seated problem: traditional payment systems are designed for human users. Each transaction requires identity verification, interactive confirmation, permission approval, and human-machine dialogue. These steps are markedly inadequate for agent payments that are high-frequency, on-demand, and triggered within milliseconds, and they severely restrict the large-scale development of agent collaboration, automated procurement, and the micropayment economy. Simply calling existing payment APIs is not enough to support such automated, precise, high-frequency transactions. Traditional systems emphasize human control and accountability, while agents need to complete payments freely and securely within authorized boundaries.

The introduction of X402 precisely addresses this contradiction. It is not a simple settlement tool or API, but a payment trigger protocol that embeds payment capabilities into the request flow, enabling machines to instantly initiate payments and complete operations in a verifiable manner within authorized scopes. The protocol itself does not maintain accounts or handle identities; instead, through one-time, request-bound payment capability proofs, it allows automated systems to conduct micropayments safely, efficiently, and without human intervention. This design not only bridges the gap between traditional payment systems and machine-autonomous payments but also provides the underlying payment semantics and infrastructure support for the emerging Agent economy.

I. Technical Architecture

1.1 Basic Logic of Agent Payments

1.1.1 Why the Payment Subject Shifts from "Human" to "Machine"

AI Agent payment is not a "new payment tool" but a systemic reconstruction problem resulting from the change in the payment decision-making subject. The key lies not in the settlement method, but in how authorization is expressed, how execution is constrained, and how risks are traced back.

In traditional payments, the initiation of every transaction involves human confirmation—clicking a button, entering a password, scanning for authorization. However, when an AI Agent becomes the subject initiating service calls, this premise no longer holds. Agents need to autonomously complete high-frequency API calls, data procurement, compute rental, and other low-value resource consumption without human intervention. The characteristics of such transactions are: high frequency, small amounts, long-tail dispersion, and strong real-time requirements.

The shift of the payment subject from human to machine brings four new risks. First, unpredictable triggers: the Agent might initiate payments without explicit instructions. Second, permission abuse: an Agent given excessive permissions can cause the user to lose control of funds. Third, accountability challenges: when problems occur, it is difficult to trace them back to the specific authorization source and decision logic. Fourth, audit deficiencies: transaction records lack structured proof of what happened, which hinders post-hoc audits. Because of these risks, a payment system for Agents cannot simply reuse traditional payment architectures; it must be rebuilt systematically along three dimensions: authorization, execution, and auditing.

1.1.2 Why Must "Authorization" and "Execution" Be Separated

In traditional payment systems, "whether payment is allowed" and "how payment is completed" are usually handled by the same system. That coupling breaks down in the AI Agent scenario. A reasonable approach, and one on which protocol-level designs appear to be converging, is to decouple Agent payments into two independent but composable logical chains:

• Authorization Path: Addresses "whether the Agent is permitted to pay on behalf of the user"—including the source, boundaries, validity period, and revocation mechanism of the authorization.

• Execution Path: Addresses "how a specific payment is completed and triggers resource delivery"—including payment initiation, verification, settlement, and proof generation.

The core reason for the separation: if the Agent is both the source of authorization and the party exercising discretion over execution, system risk becomes uncontrollable. Authorization must always originate from the user (the fund owner), while execution can be delegated to the Agent under strict constraints. This separation ensures that even if the Agent behaves abnormally, losses are confined within the authorized boundaries, and any payment can be traced back to a clear human authorization decision.

From a system design perspective, AI Agent payment involves at least five roles, each requiring clear boundaries of responsibility:

• User (Fund Owner): Provides initial authorization and bears ultimate responsibility.

• AI Agent (Execution Entity): Initiates payment requests under authorization constraints.

• Merchant/Resource Provider: Sets prices, verifies payment, and decides whether to release resources.

• Authorization/Trust Layer: Expresses and verifies "who can pay under what conditions."

• Payment Execution Layer: Completes a specific, verifiable, and settleable payment action.

1.1.3 How a Standardized Agent Payment Loop Forms

A standardized AI Agent payment follows the principle of "authorization upfront, conditional trigger, autonomous execution," and the entire process can be divided into four phases:

Authorization Phase: The user issues a Limited Mandate to the Agent, specifying limits, targets, and duration, establishing the legal boundaries for fund expenditure. The authorization statement needs to be recorded on-chain or stored in a verifiable manner.

Request Phase: The Agent sends a request to the resource provider, who returns machine-parsable payment instructions (including amount, currency, receiving address, expiration time, etc.), clarifying the transaction target and payment consideration.

Execution Phase: After internally verifying the request complies with authorization boundaries and risk control policies, the Agent automatically completes the payment and generates a one-time, non-replayable payment proof.

Settlement Phase: The resource provider verifies the authenticity and uniqueness of the payment proof, confirms receipt, releases the resource, and records the transaction for auditing.

These four phases form an automated closed loop of "conditional payment – conditional delivery." Its engineering significance lies in decoupling "payment intent" from "execution action," enabling Agents to efficiently complete high-frequency, low-value resource procurement without relying on subjective trust.
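The Authorization and Execution phases above can be illustrated with a minimal sketch of a Limited Mandate and the boundary check an Agent might run before paying. The class name `LimitedMandate`, its fields, and the `authorize` function are hypothetical, chosen for illustration only; the article does not define a concrete mandate format, and a real mandate would be signed and stored on-chain or in another verifiable form.

```python
from dataclasses import dataclass


@dataclass
class LimitedMandate:
    """Hypothetical user-issued mandate: limits, targets, and duration."""
    max_per_payment: int        # cap per payment, in the smallest currency unit
    total_budget: int           # cap across all payments under this mandate
    allowed_payees: frozenset   # resource providers the Agent may pay
    expires_at: float           # Unix timestamp after which the mandate is void
    revoked: bool = False       # the user can revoke the mandate at any time
    spent: int = 0              # running total of authorized payments


def authorize(mandate: LimitedMandate, payee: str, amount: int, now: float) -> bool:
    """Check a payment against the mandate; record spend only on success."""
    if mandate.revoked or now >= mandate.expires_at:
        return False
    if payee not in mandate.allowed_payees:
        return False
    if amount <= 0 or amount > mandate.max_per_payment:
        return False
    if mandate.spent + amount > mandate.total_budget:
        return False
    mandate.spent += amount
    return True


mandate = LimitedMandate(
    max_per_payment=500,
    total_budget=1_000,
    allowed_payees=frozenset({"api.example.com"}),
    expires_at=2_000.0,
)
results = [
    authorize(mandate, "api.example.com", 400, now=1_000.0),   # within all limits
    authorize(mandate, "evil.example.net", 100, now=1_000.0),  # payee not allowed
    authorize(mandate, "api.example.com", 400, now=1_000.0),   # still within budget
    authorize(mandate, "api.example.com", 400, now=1_000.0),   # would exceed budget
    authorize(mandate, "api.example.com", 100, now=3_000.0),   # mandate expired
]
```

Keeping the check as a pure function over the mandate means the Agent never decides its own limits: it can only ask whether a given payment falls inside boundaries the user set.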

The main risks of AI Agent payment are not concentrated in the transfer step but center on three aspects: Authorization Risk (Is the Agent over-authorized? Is the authorization revocable and auditable? Is there room for semantic ambiguity?); Execution Integrity Risk (Is the payment strongly bound to a specific resource request? Is there potential for replay or concurrent abuse?); Systemic and Compliance Risk (Can a complete evidence chain be formed? Is post-hoc reconciliation and compliance review supported?).

1.1.4 Criteria for Determining if a System is Commercially Viable

A scalable AI Agent payment system should at least meet the following constraints:

Security Dimension: All payments are traceable to a clear authorization source; payments have a one-to-one correspondence with a single resource request; payment credentials are non-reusable and non-transferable; the execution layer does not rely on subjective trust assumptions; all events within the process are auditable and reproducible. Authorization must be revocable and limitable.

Cost Dimension: Transaction friction is low enough to support high-frequency, small-value scenarios; integration costs are manageable for developers, merchants, and platforms; no complex upfront account system or subscription relationship is required.

Ecosystem Dimension: Compatible with existing payment networks and settlement systems; supports an open competitive landscape with multiple chains, assets, and Facilitators; the degree of protocol standardization is sufficient to support interoperability.

Compliance Dimension: Responsible entities are clear, with on-chain or otherwise structured records of who authorizes, who executes, and who bears ultimate responsibility; regulatory checks such as KYC/AML, cross-border sanctions screening, and KYT are supported; dispute resolution paths are clear and backed by evidence usable in arbitration.

Whether these conditions are met is the core criterion for evaluating different protocols and is the starting point for subsequently evaluating x402 and its supplementary mechanisms.

1.2 x402's Positioning, Mechanism, and Trust Supplementation

1.2.1 What x402 Solves and What It Doesn't Solve

The core positioning of x402 is a Payment Trigger Protocol – it sits between the HTTP API and the settlement network, acting as an information transmission bridge. x402 is not a payment system; it solves a precise problem: how to embed "payment" into the "HTTP request – resource delivery" chain, forming a standardized skeleton for per-request payment.

Specifically, x402 solves the following:

• Supplements the HTTP 402 status code with a minimum set of machine-understandable, executable, and verifiable information.

• Completes the "request – payment – delivery" loop without breaking the stateless nature of HTTP.

• Makes payment a capability proof attached to the request, rather than a prerequisite account relationship or subscription.

Equally important is understanding what x402 does not solve:

• Does not solve the full trust issues of identity and authorization – it doesn't care "who you are," only "whether this request has been paid for."

• Does not handle all payment networks and settlement – it is not bound to a specific chain or payment channel; settlement is handled by external networks.

• Does not cover complex business relationships – it is unsuitable for scenarios requiring long-term customer relationship management or complex billing strategies.

• Does not solve regulatory compliance and dispute arbitration – these need to be handled by upper-layer business systems or supplementary protocols.

This restrained positioning lets x402 avoid the trap of being over-designed and hard to implement. It admits that it is not a universal payment system, accepts the existence of external settlement networks, and does not try to cover all business relationships. For the same reason it emphasizes being "machine-oriented": if the interacting party is a human user, payment can be handled through page redirects, QR codes, or wallet pop-ups; only when the caller is a program does protocol-level expression become necessary.

A brief comparison with existing payment models is also useful. Web2 payments usually occur before the call (subscription or prepaid), Web3 payments are initiated by users through active transactions, and x402 embeds payment into the request flow, triggered by the server. Web2 maintains a large amount of account state, Web3 puts state on-chain, and x402 tries to compress state into a single request. From an automation perspective, x402's advantage is clear: it requires no interactive authorization and no long-term holding of private keys; as long as the Client can execute payment and attach a proof, the call can be completed.

1.2.2 From HTTP 402 to Request-Level Payment

HTTP 402 (Payment Required) was reserved in RFC 2616 but never defined with executable semantics. The design of HTTP deliberately avoided embedding payment mechanisms, leaving them for higher-level systems. There have been historical attempts – LSAT (Lightning Service Authentication Token), proposed by Lightning Labs in 2019, bound HTTP 402 with Lightning invoices, exchanging invoice payments for Bearer Tokens. LSAT proved the usability of 402 but exposed several issues: the system degenerated into session-based Tokens after payment; the server needed to maintain Token-permission mappings; payment was decoupled from the request, making it difficult to achieve "one request, one proof."

x402 explicitly abandons the "payment-for-identity" path, instead treating the payment result itself as a one-time capability proof. Its core design constraint is: no sessions introduced, no user accounts maintained, the server doesn't need to remember "who has already paid." Under this constraint, payment is redefined as a capability proof attached to the request – as long as the proof is verifiable, unforgeable, and bindable to the request, the server can release resources in a completely stateless manner.

A complete x402 payment execution path is as follows:

a. Client/Agent initiates an HTTP request to access a resource endpoint that requires payment.

b. The server returns a 402 response, including machine-parsable payment requirements (amount, currency, receiving address, network, expiration time, proof type, etc.). These fields allow the Client to make decisions without human intervention.

c. The Agent completes payment within authorized boundaries, generating a payment proof.

d. The Agent re-initiates the request, carrying the payment proof.

e. The server verifies the authenticity and uniqueness of the proof, releases the resource upon success, and records the audit trail.

The key point: payment is not a prerequisite step, but a branch path after a request fails. This design makes x402 naturally compatible with stateless HTTP architecture.
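Steps a to e can be sketched as an in-process simulation. This is an illustration under loud assumptions, not the x402 wire format: the field names (`amount`, `currency`, `pay_to`, `nonce`), the functions `handle`, `pay`, and `attest`, and the HMAC tag are all hypothetical. In particular, the shared `SETTLEMENT_KEY` and the `used_nonces` set are stand-ins for an external settlement network or Facilitator that would attest to a payment and enforce its uniqueness; expiration is not modeled.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Assumption: a secret key stands in for the settlement layer's attestation.
SETTLEMENT_KEY = secrets.token_bytes(32)
PRICE = {"amount": 100, "currency": "USDC", "pay_to": "merchant-address"}
used_nonces = set()  # stand-in for uniqueness enforced by the settlement layer


def request_digest(method: str, path: str) -> str:
    """Bind a proof to one specific request."""
    return hashlib.sha256(f"{method} {path}".encode()).hexdigest()


def attest(nonce: str, amount: int, method: str, path: str) -> str:
    """Stand-in for the settlement layer vouching for a payment."""
    message = f"{nonce}|{amount}|{request_digest(method, path)}"
    return hmac.new(SETTLEMENT_KEY, message.encode(), hashlib.sha256).hexdigest()


def pay(requirements: dict, method: str, path: str) -> dict:
    """Step c: the Agent pays and receives a one-time, request-bound proof."""
    nonce = requirements["nonce"]
    tag = attest(nonce, requirements["amount"], method, path)
    return {"nonce": nonce, "tag": tag}


def handle(method: str, path: str, proof: Optional[dict] = None):
    """Steps b and e: reply 402 with requirements, or verify and deliver."""
    if proof is None:
        return 402, {**PRICE, "nonce": secrets.token_hex(16)}
    expected = attest(proof["nonce"], PRICE["amount"], method, path)
    authentic = hmac.compare_digest(expected, proof["tag"])
    if not authentic or proof["nonce"] in used_nonces:
        return 402, {**PRICE, "nonce": secrets.token_hex(16)}
    used_nonces.add(proof["nonce"])
    return 200, {"data": "paid resource"}


status1, requirements = handle("GET", "/report")     # steps a and b
proof = pay(requirements, "GET", "/report")          # step c
status2, body = handle("GET", "/report", proof)      # steps d and e
status3, _ = handle("GET", "/report", proof)         # replayed proof is rejected
_, fresh = handle("GET", "/report")
wrong_target = pay(fresh, "GET", "/report")
status4, _ = handle("GET", "/other", wrong_target)   # proof bound to another request
```

The server keeps no record of who paid; it only checks that the attached proof is authentic, unused, and bound to the request in hand.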

The protocol evolution of x402 is also noteworthy. Early experimental implementations faced interoperability issues: payment instruction formats varied, proof verification logic was highly customized, and the Facilitator was tightly coupled to the Server. Version 2 addressed this fragmentation by standardizing the payment instruction format, introducing an independent Facilitator role, and abstracting the settlement layer as a replaceable backend. The real engineering significance of V2 was to show that payment can be a protocol state rather than a business prerequisite. This matters most in scenarios such as automated API calls by Agents, cross-organizational service access without account relationships, and pay-per-use instant settlement, where introducing an account system is itself a form of friction.

In engineering practice, x402 cannot elegantly solve every edge case, such as a payment succeeding while the request ultimately fails, a Client replaying requests, or the Facilitator being temporarily unavailable. Similar problems have long existed in Lightning, rollup bridges, and Webhook systems, where they are ultimately left to the implementation layer. x402 does not hide these complexities behind the protocol layer; it leaves them to implementers in the same way.
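One possible implementation-layer treatment of two of these edge cases (payment succeeded but delivery failed, and a Client replaying a request) is idempotent delivery keyed by the proof. The sketch below is an assumption for illustration, not part of x402: `deliver_once`, the `completed` cache, and the proof identifier are hypothetical names, and the cache is state that lives in the implementation rather than in the protocol.

```python
from typing import Callable, Dict

completed: Dict[str, str] = {}  # proof id -> response already delivered


def deliver_once(proof_id: str, produce: Callable[[], str]) -> str:
    """Run `produce` successfully at most once per proof id."""
    if proof_id in completed:
        return completed[proof_id]   # replay: same response, no new work
    response = produce()             # may raise; the proof then stays unconsumed
    completed[proof_id] = response
    return response


calls = {"count": 0}


def flaky_resource() -> str:
    """Hypothetical resource that fails on its first invocation."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("upstream failure after payment")
    return "resource-bytes"


try:
    deliver_once("proof-1", flaky_resource)
    first_attempt = "ok"
except RuntimeError:
    first_attempt = "failed"

second = deliver_once("proof-1", flaky_resource)  # retry with the same proof
third = deliver_once("proof-1", flaky_resource)   # replay served from the cache
```

Because a failed delivery does not consume the proof, the Client can retry without paying twice, and because a successful delivery is cached, a replay cannot obtain extra work for one payment.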

1.2.3 Why x402 Needs an Identity/Trust Supplementation Layer

x402 solves "how to embed the payment action into the request," but it has an inherent weakness: it does not solve identity and trust issues like "who is the trusted Agent," "who authorized it," and "how to revoke authorization." x402 cares about "whether this request has been paid for," not "who you are" or "whether you are authorized."

In simple scenarios, this design is sufficient – if the Agent has money, it can pay; if it pays, it can use. But in large-scale commercial scenarios, only being "able to pay" is far from enough. Merchants need to know if the counterparty is trustworthy and has a history of malicious activity; users need to ensure the Agent won't exceed its authorized boundaries; regulators need to be able to trace the complete identity chain of transactions.

Therefore, x402 requires the following supplementary capabilities:

• Identity Registration and Verification: Agents need on-chain verifiable identity identifiers, allowing merchants and Facilitators to decide whether to accept requests.

• Structured Expression of Authorization Boundaries: Not just "has money," but a clear specification of "who authorized, for what resources, under what conditions, and for how long."

• Definition of Execution Agent Responsibilities: A trusted intermediate execution layer (Facilitator) is needed between the Agent and the Merchant, responsible for identity verification, proof generation, and risk control.

• Evidence Basis for Accountability and Auditing: Transaction records need on-chain anchoring to support post-hoc dispute arbitration.

These capabilities are precisely the areas that mechanisms like ERC-8004 attempt to supplement.

1.2.4 What Mechanisms Like ERC-8004 Bring as Supplementation

ERC-8004, titled "Trustless Agents," is a standard proposed on Ethereum in August 2025 and officially launched on mainnet in January 2026. It does not modify Ethereum's core but extends the protocol capabilities between Agents through three on-chain registries:

• IdentityRegistry: Provides on-chain identity NFT identifiers for Agents, supporting identity validity queries, blacklists, and freezing mechanisms.

• ReputationRegistry: Provides multi-dimensional aggregated scores for Agents based on historical transaction performance, with transparent scoring rules and deterministic updates.

• ValidationRegistry: Stores execution proofs (Merkle proofs, transaction hashes, timestamps, etc.), ensuring transaction records are tamper-proof and available for auditing and dispute arbitration.
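How a merchant or Facilitator might consult such registries before accepting a request can be sketched with an in-memory toy model. This is not the ERC-8004 contract interface: the class `ToyRegistries`, its methods, the 0 to 100 score range, the simple averaging rule, and `should_accept` are all hypothetical, introduced only to show the identity check, reputation threshold, and execution record working together.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyRegistries:
    """In-memory stand-in for the three registries; not the ERC-8004 API."""
    identities: Dict[str, bool] = field(default_factory=dict)     # agent id -> active?
    feedback: Dict[str, List[int]] = field(default_factory=dict)  # agent id -> scores
    validations: List[dict] = field(default_factory=list)         # execution records

    def register(self, agent_id: str) -> None:
        self.identities[agent_id] = True
        self.feedback.setdefault(agent_id, [])

    def freeze(self, agent_id: str) -> None:
        self.identities[agent_id] = False

    def rate(self, agent_id: str, score: int) -> None:
        self.feedback[agent_id].append(score)

    def reputation(self, agent_id: str) -> float:
        """Assumed scoring rule: plain average of recorded scores."""
        scores = self.feedback.get(agent_id, [])
        return sum(scores) / len(scores) if scores else 0.0

    def record_execution(self, agent_id: str, tx_hash: str, timestamp: int) -> None:
        self.validations.append({"agent": agent_id, "tx": tx_hash, "ts": timestamp})


def should_accept(reg: ToyRegistries, agent_id: str, min_reputation: float) -> bool:
    """Accept only active identities whose reputation meets the threshold."""
    return reg.identities.get(agent_id, False) and reg.reputation(agent_id) >= min_reputation


reg = ToyRegistries()
reg.register("agent-a")
reg.rate("agent-a", 90)
reg.rate("agent-a", 70)
reg.record_execution("agent-a", tx_hash="0xabc", timestamp=1_700_000_000)

accepted = should_accept(reg, "agent-a", min_reputation=75.0)
unknown = should_accept(reg, "agent-b", min_reputation=0.0)
reg.freeze("agent-a")
after_freeze = should_accept(reg, "agent-a", min_reputation=75.0)
```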

The combination of ER
