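AI Agent Economic Infrastructure in Progress: The X402 Protocol and the PayFi Revolution

Peking University Blockchain Research (北大区块链研究)

Guest Columnist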

2026-04-23 10:39

Overview: The Background of X402

In the spring of 2025, an unassuming standard quietly emerged. The X402 Payment Trigger Protocol, co-introduced by Coinbase and Cloudflare, was initially designed to extend the HTTP "402 Payment Required" status code so that machines could exchange payment requests and responses instantly, without human confirmation. The proposal was not born in a vacuum; it responded to a reality that was rapidly taking shape: AI agents are moving to the forefront of actual economic activity.

By the second half of 2025, with the virtual agent ecosystem maturing, X402 quickly transitioned from concept to real-world operation. By the end of 2025, the X402 protocol had been upgraded. Version 2 (V2) supported features like multi-chain parallelism and session mechanisms, and was integrated by mainstream infrastructure such as Google's Agent Payments Protocol (AP2), helping agents bridge payment actions between on-chain and off-chain systems.

Entering 2026, discussions and practices around autonomous agent economic behavior intensified rapidly. A wave of AI Agent platforms and frameworks, represented by OpenClaw, sparked an "agent craze": developers used them to build autonomous task executors, content producers, and automated service providers, and even tried to let these agents earn rewards, pay fees, and purchase services independently. This phenomenon not only captured the interest of entrepreneurs and the developer community but was also integrated into the actual incentive systems of major ecosystems. For example, the collaboration between AIsa and the OpenClaw Demo Day promoted the implementation of Agent payment infrastructure.

However, these practices simultaneously revealed a deep-seated problem: existing traditional payment systems were designed for human users—every transaction requires identity verification, interactive confirmation, permission approval, and human-machine dialogue. These processes are clearly ill-suited for agent payment scenarios triggered at high frequency, on-demand, and in milliseconds, severely hindering the scaling of agent collaboration, automated procurement, and the micro-payment economy. To meet the demands of automated, precise, and high-frequency transactions, simply calling existing payment APIs is insufficient. Traditional systems emphasize human control and accountability, while agents seek to execute payments freely and securely within authorized boundaries.

The introduction of X402 precisely targets this contradiction. It is not a simple settlement tool or API, but a Payment Trigger Protocol that embeds payment capabilities into the request flow, enabling machines to instantly initiate payments and complete operations verifiably within authorized limits. The protocol itself does not maintain accounts or handle identities. Instead, it allows automated systems to conduct micropayments securely, efficiently, and without human intervention through one-time, request-bound proof of payment capability. This design not only bridges the gap between traditional payment systems and machine-autonomous payments but also provides the underlying payment semantics and infrastructure support for the emerging Agent economy.

1. Technical Architecture

1.1 The Basic Logic of Agent Payments

1.1.1 Why the Payer Shifts from "Human" to "Machine"

AI Agent payment is not a "new payment tool," but a systemic restructuring issue arising from the change in the decision-making entity for payments. The key lies not in the settlement method, but in how authorization is expressed, execution is constrained, and risks are traced.

In traditional payments, every transaction initiation requires human confirmation—clicking a button, entering a password, scanning a QR code. However, when an AI Agent becomes the entity initiating service calls, this premise no longer holds. Agents need to autonomously and frequently complete small-value resource consumption tasks like API calls, data purchases, and compute rental without human intervention. The characteristics of such transactions are: high frequency, small amounts, long-tail dispersion, and strong real-time requirements.

The shift of the payer from human to machine brings four new risks: unpredictable triggers (an Agent might trigger a payment without a clear instruction); permission abuse (excessive permissions granted to an Agent could lead to fund loss); accountability challenges (when problems arise, it is hard to trace the specific authorization source and decision logic); and audit deficiencies (transaction records lack structured proof, which hinders post-hoc audits). Because of these risks, a payment system for Agents cannot simply reuse traditional payment architectures; it must be restructured across the three dimensions of authorization, execution, and auditing.

1.1.2 Why "Authorization" and "Execution" Must Be Separated

In traditional payment systems, "whether payment is allowed" and "how to complete the payment" are typically handled by the same system. This approach is no longer viable in AI Agent scenarios. A reasonable path currently emerging, with consensus evident at the protocol level, is to decouple Agent payments into two independent but composable logical chains:

• Authorization Path: Addresses "whether an Agent is allowed to pay on behalf of a user"—including the source, boundaries, validity period, and revocation mechanism of authorization.

• Execution Path: Addresses "how a specific payment is completed and resource delivery triggered"—including payment initiation, verification, settlement, and proof generation.

The core reason for this separation is that if an Agent simultaneously serves as both the "authorization source" and the "execution arbiter," system risk becomes uncontrollable. The authorization source must always be the user (the fund owner), while execution can be delegated to the Agent under strict constraints. This separation ensures that even if an Agent behaves abnormally, losses are confined within the authorized boundaries; any payment can be traced back to a specific human authorization decision.
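The separation can be sketched in a few lines of Python. This is a minimal illustration rather than any protocol's actual data model: the `Mandate` fields and the `authorize` and `execute` functions are hypothetical names chosen for this example, and settlement is reduced to a bookkeeping update.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Mandate:
    """Hypothetical limited mandate issued by the user (the fund owner)."""
    user: str
    agent: str
    max_per_payment: Decimal
    max_total: Decimal
    allowed_payees: frozenset
    expires_at: float              # Unix timestamp
    revoked: bool = False
    spent: Decimal = Decimal("0")


def authorize(mandate: Mandate, payee: str, amount: Decimal, now: float) -> bool:
    """Authorization path: decide whether the Agent may pay. Moves no funds."""
    if mandate.revoked or now >= mandate.expires_at:
        return False
    if payee not in mandate.allowed_payees:
        return False
    if amount <= 0 or amount > mandate.max_per_payment:
        return False
    return mandate.spent + amount <= mandate.max_total


def execute(mandate: Mandate, payee: str, amount: Decimal, now: float) -> dict:
    """Execution path: runs only after the authorization check succeeds."""
    if not authorize(mandate, payee, amount, now):
        raise PermissionError("payment falls outside the mandate")
    mandate.spent += amount
    # A real system would call a settlement network here; this sketch only
    # returns a record that links the payment back to the authorizing user.
    return {"user": mandate.user, "agent": mandate.agent,
            "payee": payee, "amount": str(amount)}
```

Because `execute` can never succeed where `authorize` fails, a misbehaving Agent is capped by `max_total`, and every returned record names the user whose mandate permitted the payment.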

From a system design perspective, AI Agent payment involves at least five roles, each with clearly defined responsibilities:

• User (Fund Owner): Provides initial authorization and bears ultimate responsibility.

• AI Agent (Executing Entity): Initiates payment requests under authorization constraints.

• Merchant/Resource Provider: Sets pricing, verifies payments, and decides whether to release resources.

• Authorization/Trust Layer: Expresses and verifies "who can pay under what conditions."

• Payment Execution Layer: Completes a specific, verifiable, and settleable payment action.

1.1.3 How a Standardized Agent Payment Cycle Forms

A standardized AI Agent payment follows the principle of "authorization first, conditional trigger, autonomous execution," and the entire process can be divided into four stages:

Authorization Stage: The user issues a Limited Mandate to the Agent, specifying amount limits, targets, and validity, establishing the legal boundaries for fund expenditure. The authorization statement needs to be recorded on-chain or stored verifiably.

Request Stage: The Agent sends a request to the resource provider, who returns machine-readable payment instructions (including amount, currency, receiving address, expiration time, etc.), clarifying the transaction target and consideration.

Execution Stage: After internally verifying that the request complies with the authorization boundary and risk control policies, the Agent automatically completes the payment and generates a one-time, non-replayable payment proof.

Settlement Stage: The resource provider verifies the authenticity and uniqueness of the payment proof, confirms receipt, releases the resource, and records the transaction for auditing purposes.

These four stages form an automated cycle of "conditional payment — conditional delivery." Its engineering significance lies in decoupling "payment intent" from "execution action," enabling Agents to efficiently conduct high-frequency, small-value resource procurement without relying on subjective trust.
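One way to make the ordering of the four stages explicit is a small state machine that rejects skipped stages and keeps an audit trail. The sketch below is illustrative only; `Stage`, `NEXT_STAGE`, and `PaymentCycle` are hypothetical names, not part of any published protocol.

```python
from enum import Enum


class Stage(Enum):
    AUTHORIZED = 1
    REQUESTED = 2
    EXECUTED = 3
    SETTLED = 4


# Each stage may only advance to the one that follows it.
NEXT_STAGE = {
    Stage.AUTHORIZED: Stage.REQUESTED,
    Stage.REQUESTED: Stage.EXECUTED,
    Stage.EXECUTED: Stage.SETTLED,
}


class PaymentCycle:
    """Hypothetical tracker for one payment moving through the four stages."""

    def __init__(self, mandate_id: str):
        self.stage = Stage.AUTHORIZED
        self.audit_log = [(Stage.AUTHORIZED.name, {"mandate_id": mandate_id})]

    def advance(self, to: Stage, details: dict) -> None:
        """Move to the next stage, or raise if a stage would be skipped."""
        if NEXT_STAGE.get(self.stage) is not to:
            raise ValueError(f"cannot move from {self.stage.name} to {to.name}")
        self.stage = to
        self.audit_log.append((to.name, details))
```

A cycle that tries to jump from `REQUESTED` straight to `SETTLED` raises `ValueError`, so delivery cannot be recorded without a preceding execution entry in `audit_log`.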

The main risks of AI Agent payment are not concentrated in the transfer step itself, but rather in three areas: Authorization Risk (whether the Agent is over-authorized, whether authorization is revocable and auditable, whether there is semantic ambiguity); Execution Integrity Risk (whether the payment is strongly bound to the specific resource request, whether there is potential for replay or concurrent abuse); Systemic and Compliance Risk (whether a complete chain of evidence can be formed, whether post-hoc reconciliation and compliance audits are supported).

1.1.4 Criteria for Determining if a System is Commercially Viable

A scalable AI Agent payment system should at least meet the following constraints:

Security Dimension: All payments must be traceable to a clear authorization source; there must be a one-to-one correspondence between payment and a single resource request; payment credentials must be non-reusable and non-transferable; the execution layer must not rely on subjective trust assumptions; all process events must be auditable and reproducible. Authorization must be revocable and have limit settings.

Cost Dimension: Transaction friction must be low enough to support high-frequency, small-value scenarios; integration costs should be friendly for developers, merchants, and platforms; complex upfront account systems or subscription relationships should not be required.

Ecosystem Dimension: Must be compatible with existing payment networks and settlement systems; support an open competitive landscape for multiple chains, multiple assets, and multiple Facilitators; protocol standardization must be sufficient to support interoperability.

Compliance Dimension: The responsible entity must be clear—who authorized, who executed, who bears ultimate responsibility, all recorded on-chain or in structured records; must support regulatory checks like KYC/AML, cross-border sanctions, KYT; clear dispute resolution paths and evidence basis for arbitration must exist.

Meeting these criteria is the core standard for evaluating different protocols and the starting point for assessing x402 and its complementary mechanisms.

1.2 The Positioning, Mechanism, and Trust Supplement of x402

1.2.1 What x402 Solves and What It Doesn't

The core positioning of x402 is as a Payment Trigger Protocol—it sits between the HTTP API and the settlement network, acting as a bridge for information transfer. x402 is not a payment system; it solves a precise problem: how to embed "payment" into the "HTTP request — resource delivery" chain, forming a standardized skeleton for per-request billing.

Specifically, x402 solves the following:

• Provides the HTTP 402 status code with a minimal set of machine-readable, executable, and verifiable information.

• Completes the "request — payment — delivery" cycle without breaking the stateless nature of HTTP.

• Makes payment a capability proof attached to a request, rather than a prerequisite account relationship or subscription binding.

Equally important is understanding what x402 does not solve:

• Does not solve the complete trust issues of identity and authorization—it doesn't care "who you are," only "whether this request has been paid for."

• Does not handle all payment networks and settlement—it is not bound to a specific chain or payment channel; settlement is handled by external networks.

• Does not cover complex business relationships—it is unsuitable for scenarios requiring long-term customer relationship management or complex billing strategies.

• Does not handle regulatory compliance and dispute arbitration—these need to be handled by higher-level business systems or supplementary protocols.

This restrained positioning helps x402 avoid the trap of "over-designed, hard to implement." It acknowledges it is not a universal payment system, accepts the existence of external settlement networks, and does not seek to cover all business relationships. For this reason, it emphasizes being "machine-oriented"—if the interacting party is a human user, payment can be solved through page redirects, QR codes, or wallet prompts; protocol-level expression becomes necessary only when the caller is a program.

A brief comparison with existing payment models is also worthwhile. Web2 payment usually occurs before the call (subscription or prepaid), Web3 payment is actively initiated by the user, while x402 embeds payment into the request flow and is triggered by the server. Web2 maintains extensive account state, Web3 puts state on-chain, and x402 attempts to compress state into a single request. From an automation perspective, x402's advantage is clear: it requires neither interactive authorization nor long-term custody of private keys; as long as the client can make the payment and present proof, the call can be completed.

1.2.2 From HTTP 402 to Request-Level Payment

HTTP 402 (Payment Required) was reserved for future use in RFC 2616, and remains reserved in the current HTTP specification (RFC 9110), but it was never given executable semantics. HTTP's original design deliberately avoided payment mechanisms, leaving them to higher-level systems. There have been earlier attempts: LSAT (Lightning Service Authentication Token), proposed by Lightning Labs in 2019, bound HTTP 402 to Lightning invoices, exchanging payment for a Bearer Token. LSAT proved that 402 was usable but exposed several problems: the system reverted to session-based tokens after payment; the server had to maintain a mapping between tokens and permissions; and payment and request were decoupled, making it hard to achieve "one request, one proof."

x402 explicitly abandons the path of "pay for identity." Instead, it treats the payment result itself as a one-time capability proof. Its core design constraint is: no sessions, no user accounts maintained by the server, no need for the server to remember "who has paid." Under this constraint, payment is redefined as a capability proof attached to a request—as long as the proof is verifiable, non-forgeable, and bound to the request, the server can release resources in a completely stateless manner.
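To illustrate what "bound to the request" can mean, the sketch below ties a proof to a fingerprint of one specific request. It is a simplified stand-in, not the x402 wire format: the HMAC and the shared `FACILITATOR_KEY` substitute for whatever signature or on-chain evidence a real deployment verifies, and all function names are hypothetical. Replay protection is deliberately left out.

```python
import hashlib
import hmac
import json

# Assumption: the server and a facilitator share this key out of band.
FACILITATOR_KEY = b"demo-key-not-for-production"


def request_digest(method: str, path: str, nonce: str) -> str:
    """Fingerprint of the exact request the payment is meant to unlock."""
    canonical = json.dumps({"method": method, "path": path, "nonce": nonce},
                           sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()


def issue_proof(method: str, path: str, nonce: str, amount: str) -> dict:
    """Facilitator side: attest that `amount` was paid for this request."""
    payload = f"{request_digest(method, path, nonce)}|{amount}"
    tag = hmac.new(FACILITATOR_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"nonce": nonce, "amount": amount, "tag": tag}


def verify_proof(method: str, path: str, proof: dict, required_amount: str) -> bool:
    """Server side: needs only the key and the incoming request, no session."""
    payload = f"{request_digest(method, path, proof['nonce'])}|{proof['amount']}"
    expected = hmac.new(FACILITATOR_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, proof["tag"]) and proof["amount"] == required_amount
```

The server keeps no record of who paid: a proof issued for `GET /v1/quote` fails verification when presented with any other path or amount, because the recomputed tag no longer matches.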

The execution path of a complete x402 payment is as follows:

a. Client/Agent initiates an HTTP request to access a resource endpoint that requires payment.

b. The server returns a 402 response with machine-readable payment requirements (amount, currency, receiving address, network, expiry time, proof type, etc.). These fields allow the Client to make the decision without human intervention.

c. The Agent completes the payment within its authorized boundaries, generating a payment proof.

d. The Agent re-initiates the request carrying the payment proof.

e. The server verifies the authenticity and uniqueness of the proof, releases the resource upon success, and records it for audit.

The key point: payment is not a prerequisite step, but a branch path taken after a request fails. This design makes x402 naturally fit the stateless HTTP architecture.
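Steps a through e can be simulated in-process. The sketch below is a toy model under stated assumptions: the field names in `PRICE`, the placeholder asset and address, and the `SETTLED` set (standing in for a settlement network) are all invented for illustration and do not reproduce the actual x402 message format. Replay protection is omitted.

```python
import uuid
from decimal import Decimal

# Hypothetical payment requirements for one endpoint (placeholder values).
PRICE = {"amount": "0.01", "currency": "USDC",
         "pay_to": "0xMerchantAddress", "network": "example-network"}
SETTLED = set()   # stand-in for a settlement network's record of payments


def pay(requirements: dict) -> str:
    """Step c: settle the quoted amount and return a proof identifier."""
    proof_id = uuid.uuid4().hex
    SETTLED.add((proof_id, requirements["amount"], requirements["pay_to"]))
    return proof_id


def handle_request(path: str, payment_proof=None):
    """Server side (steps b and e): returns a (status, body) pair."""
    if payment_proof is None:
        return 402, {"payment_requirements": PRICE}
    if (payment_proof, PRICE["amount"], PRICE["pay_to"]) not in SETTLED:
        return 402, {"payment_requirements": PRICE, "error": "proof not verified"}
    return 200, {"resource": f"contents of {path}"}


def fetch_with_payment(path: str, budget: str):
    """Agent side (steps a, c, d): request, pay on 402 if affordable, retry once."""
    status, body = handle_request(path)
    if status != 402:
        return status, body
    requirements = body["payment_requirements"]
    if Decimal(requirements["amount"]) > Decimal(budget):
        return status, body          # quoted price exceeds what the Agent may spend
    return handle_request(path, payment_proof=pay(requirements))
```

The first call to `handle_request` fails with 402, and only that failure leads to the payment branch; an Agent whose budget is below the quoted amount simply receives the 402 and stops.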

The protocol evolution of x402 is also noteworthy. Early experiments faced interoperability issues: diverse payment instruction formats, highly customized proof verification logic, and tight coupling between Facilitator and Server. V2 addressed this fragmentation by standardizing the payment instruction format, introducing an independent Facilitator role, and abstracting the settlement layer as a replaceable backend. The real engineering significance of V2 is that it shows payment can be a protocol state rather than a business prerequisite. This matters most for scenarios such as automated API calls by Agents, service access across organizations without account relationships, and per-use instant settlement, where introducing an account system itself creates friction.

In engineering practice, x402 cannot "elegantly solve" all edge cases, such as payment success followed by request failure, replay requests from the Client, or temporary Facilitator unavailability. Similar issues have long existed in Lightning, rollup bridges, and webhook systems and are ultimately left to the implementation layer. x402 chooses not to pretend at the protocol layer that these complexities don't exist.
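One common implementation-layer answer to two of these cases (replayed proofs, and a request that fails after payment succeeded) is idempotent redemption. The sketch below is one possible approach and not something x402 prescribes; `ProofLedger` and its methods are hypothetical, and the in-memory dictionary would need durable storage in practice.

```python
class ProofLedger:
    """Hypothetical server-side record of redeemed proofs and their outcomes."""

    def __init__(self):
        self._results = {}   # proof_id -> (request_key, response)

    def redeem(self, proof_id: str, request_key: str, produce):
        """Serve a paid request at most once per proof.

        `produce` is a zero-argument callable that builds the response.
        """
        if proof_id in self._results:
            stored_key, response = self._results[proof_id]
            if stored_key != request_key:
                raise PermissionError("proof already used for a different request")
            return response            # safe retry: same request, same answer
        response = produce()           # if this raises, the proof stays unredeemed
        self._results[proof_id] = (request_key, response)
        return response
```

A client that paid but lost the response can retry with the same proof and get the stored result, while the same proof presented for a different request is refused. The trade-off is that the server now holds some state, which is why the protocol leaves this choice to implementers.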

1.2.3 Why x402 Needs an Identity/Trust Supplement Layer

x402 solves "how the payment action is embedded in the request," but it has an inherent shortcoming: it does not solve the identity and trust issues of "who is a trusted Agent," "who authorized it," or "how to revoke authorization." x402 concerns itself with "whether this request has been paid for," not "who you are" or "whether you are authorized."

In simple scenarios, this design is sufficient—an Agent with funds can pay, and after paying, it can use the service. However, in scaled commercial scenarios, merely being able to pay is far from adequate. Merchants need to know if the counterparty is trustworthy or has a history of malicious behavior; users need assurance that the Agent won't exceed authorization boundaries; regulators need to be able to trace the complete identity chain of transactions.

Therefore, x402 requires the supplementation of the following capabilities:

• Identity Registration and Verification: Agents need an on-chain verifiable identity identifier, allowing Merchants and Facilitators to assess whether to accept a request.

• Structured Expression of Authorization Boundaries: Beyond just "having funds," it requires clarity on "who authorized it, for what resources, under what conditions, and for how long."

• Defined Responsibilities of the Executing Agent: A trusted intermediary execution layer (Facilitator) is needed between the Agent and Merchant to handle identity verification, proof generation, and risk control.

• Accountable and Auditable Evidence Foundation: Transaction records need on-chain anchoring to support future dispute arbitration.

These capabilities are precisely the areas that mechanisms like ERC-8004 aim to supplement.

1.2.4 What Mechanisms Like ERC-8004 Bring as Supplements

ERC-8004, titled "Trustless Agents," is a standard proposed on Ethereum in August 2025 and officially launched on mainnet in January 2026. It does not modify Ethereum's core but extends protocol capabilities between Agents through three on-chain registries:

• IdentityRegistry: Provides on-chain identity NFT identifiers for Agents, supporting identity validity queries, blacklisting, and freezing mechanisms.

• ReputationRegistry: Provides multi-dimensional aggregate scores for Agents based on historical transaction performance, with transparent and deterministic update rules.

• ValidationRegistry: Stores execution proofs (Merkle proofs, transaction hashes, timestamps, etc.), ensuring transaction records are tamper-proof and available for audit and dispute arbitration.
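How a Merchant might consult the three registries before accepting a request can be modeled off-chain in plain Python. This is a toy stand-in and does not reflect the ERC-8004 contract interfaces: `AgentRecord`, `Registries`, the 0 to 100 score scale, and the simple-average acceptance rule are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    owner: str
    frozen: bool = False
    scores: list = field(default_factory=list)        # feedback scores, assumed 0-100
    validations: list = field(default_factory=list)   # e.g. transaction hashes


class Registries:
    """Toy in-memory stand-in for identity, reputation, and validation registries."""

    def __init__(self):
        self._agents = {}

    def register(self, agent_id: str, owner: str) -> None:
        """Identity: create a record for a new Agent."""
        if agent_id in self._agents:
            raise ValueError("agent already registered")
        self._agents[agent_id] = AgentRecord(owner)

    def freeze(self, agent_id: str) -> None:
        """Identity: mark an Agent as frozen."""
        self._agents[agent_id].frozen = True

    def add_feedback(self, agent_id: str, score: int) -> None:
        """Reputation: append one feedback score."""
        self._agents[agent_id].scores.append(score)

    def record_validation(self, agent_id: str, evidence_hash: str) -> None:
        """Validation: store a reference to execution evidence."""
        self._agents[agent_id].validations.append(evidence_hash)

    def should_accept(self, agent_id: str, min_score: float) -> bool:
        """Merchant policy: known, not frozen, average score at or above a floor."""
        record = self._agents.get(agent_id)
        if record is None or record.frozen or not record.scores:
            return False
        return sum(record.scores) / len(record.scores) >= min_score
```

Under this policy an unregistered or frozen Agent is rejected regardless of whether it can pay, which is the kind of check that payment proof alone cannot provide.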

The combination of ERC-8004 and x402 constructs a dual trust structure of "
