AI Agent Era On-Chain Security Report: When the Subject of Transactions, Payments, and Signatures Shifts from Humans to Agents, Where are the Security Boundaries of Web3?

特邀专栏作者

2026-07-01 02:08

บทความนี้มีประมาณ 24764 คำ การอ่านทั้งหมดใช้เวลาประมาณ 36 นาที

Over the past decade, the core issues of Web3 security have largely revolved around private keys, contract vulnerabilities, and phishing. The security boundaries were relatively clear: humans see a page, click on a wallet, confirm a transaction, and the on-chain result is executed.

สรุปโดย AI

ขยาย

Core Thesis: AI Agents are shifting Web3 on-chain operations from "human confirmation" to "model understanding and automatic execution." This expands the security boundary from protecting private keys and contract vulnerabilities to protecting intent, constraining execution, and managing the trust connections between components such as models, tools, and wallets.
Key Elements:
1. The 2026 Grok/Bankrbot Incident: The attacker did not steal private keys or attack the contract but merely had Grok translate Morse code. Bankrbot then executed this as a transfer instruction, leading to a loss of $440,000, exposing the fragility of trust boundaries between AI systems.
2. Attack Surface Complexity: The focus of attacks has shifted from single code vulnerabilities to cross-layer attacks, including prompt injection, memory poisoning, tool permission penetration, supply chain contamination, and x402 payment hijacking. The attacks target the "connections."
3. Layered Defense System: Companies like Cobo and Coinbase isolate private keys through MPC/custodial wallets; SlowMist and GoPlus provide tool security audits; KYA/ERC-8004 establishes a foundation for Agent identity and permission verification, forming a multi-layered defense.
4. Key Design Principle: Agents can only propose actions, while the rule system is responsible for authorization; private keys and high-level permissions must be kept away from Agents; all on-chain actions must be readable, verifiable, and traceable; the toolchain needs to be managed as supply chain assets.

Over the past decade, the core security issues in Web3 have largely revolved around private keys, contract vulnerabilities, and phishing. The security boundary was relatively clear: a human sees a page, clicks the wallet, confirms the transaction, and the result is executed on-chain. However, the emergence of AI Agents is changing these default assumptions. The entity performing on-chain operations is shifting from "human confirmation" to "model understanding intent, tool calling systems, and automatic execution by the wallet or payment layer." Consequently, security issues are evolving from single-point protection to a new phase of cross-layer collaboration.

The Grok and Bankrbot incidents in 2026 served as a wake-up call for the industry. The attacker didn't steal private keys or directly attack the contract; they simply had Grok translate a Morse code message. Bankrbot then interpreted this natural language as a transfer instruction, ultimately causing user asset losses. This case demonstrates that risks in the AI Agent era have extended from code vulnerabilities to the trust boundaries between model output, tool invocation, and wallet execution.

This change stems from the lengthening of the Web3 execution chain. Previously, a transaction typically involved a person viewing a page, clicking the wallet, and confirming the signature. Now, users might only propose a goal, and the Agent will read the context, call tools, use permissions, initiate payments, and complete on-chain execution. Errors at any step in this process—misguidance, contamination, or unauthorized actions—can turn a mistaken intent into a real transaction. Therefore, on-chain security needs to evolve from protecting keys to protecting intent and constraining execution.

This research report will analyze real-world security incidents like Grok / Bankr, PocketOS, and LiteLLM to deconstruct how attackers influence an Agent's understanding, memory, tools, wallet, and payment paths. It will then discuss which human roles AI is replacing on-chain. Finally, it will examine how projects or proposals like Cobo, Coinbase, OKX, Binance, SlowMist, KYA, and ERC-8004 are collectively rebuilding the on-chain security boundaries for the AI Agent era.

The report focuses on answering three key questions:

First, what do on-chain attacks look like in the AI Agent era?

Second, when Agents can autonomously trade, call tools, initiate payments, and manage assets, how should wallet, permission, signature, and identity systems be redesigned?

Third, as Agentic Wallets, x402 payments, MCP tools, Skills markets, and KYA identity systems mature, what new industry opportunities will emerge in the Web3 security sector?

This will be a systematic reassessment of the on-chain security paradigm in the AI Agent era. The real question is no longer whether AI can execute tasks on-chain, but whether Web3 is ready with a sufficiently trustworthy, controllable, and traceable execution boundary when AI begins to understand, judge, sign, and pay on behalf of humans.

Original Author: Jesse, Researcher at Web3Caff Research

Produced by: Web3Caff Research × SlowMist

1. Introduction: The On-Chain Security Boundary is Being Redrawn
2. Role Replacement: Which "Humans" is AI Replacing On-Chain?
3. Potential Attack Vectors and Classic Cases in the AI Era
Model Objective & Prompt Layer: Prompt Injection Turns Chat Content into "Execution Instructions"
Wallet & Signature Semantic Layer: Agents Amplify the Problem of Blind Signing
Memory & State Layer: Attackers Don't Need Immediate Success; They Can First Poison the Agent's Long-Term Memory
Tool, MCP & Skill Layer: Agent Permission Escalation Can String "Limited Permissions" into "Full Control"
Autonomous Authorization Risks: An Agent Doesn't Need Malice; Being "Helpful" Enough Can Be Disastrous
AI Middleware Attacks: Attackers Don't Touch Your Agent; They First Contaminate the Tools It Depends On
Wallet & Agentic Payment Layer: x402 Enables AI to Pay, Tying Web Requests to On-Chain Settlement
AI-Driven Social Engineering & Phishing: Targeting Your Acquaintances
Summary: Attackers Truly Target the Junctions
4. Defense Project Landscape: Who Are the Participants and What Have They Done?
Custodial MPC Wallet Providers: Outsourcing Signature Security to a Professional Wallet Layer
Self-Custodial MPC / Key Management Solutions: Keeping Control Within Your Own System
Smart Contract Wallets: Enabling Developers to Build Their Own Security Boundaries
Large Platform / Exchange Agentic Wallet as a Service: Making the Execution Layer a Platform Capability
Skills Markets & Security Audits: Securing the Agent's Toolbox
Identity, Permissions & Verifiable Execution: Solving "Who Exactly is This Agent"
5. Security Design Principles: Key Recommendations from Case Studies
Principle 1: Agents Propose, the Rules System Authorizes
Principle 2: Private Keys, Funds, and High-Privilege Credentials Must Be Kept Away from the Agent
Principle 3: All On-Chain Actions Must Be Readable, Verifiable, and Traceable
Principle 4: Tools, Plugins, and Skills Should Be Managed as Supply Chain Assets
Principle 5: Design Payments and Execution as "Bounded Automation"
Principle 6: Assume Failure Will Happen, Therefore Have Monitoring, Circuit Breakers, and Recovery
6. Conclusion: Industry Opportunities in the Security Track Itself
Key Points Structure Diagram
References

1. Introduction: The On-Chain Security Boundary is Being Redrawn

In May 2026, an attacker didn't steal private keys, attack a contract, or breach Bankr's servers. Instead, they had Grok translate a Morse code message. Grok, following its "helpful" model objective, output a plain-text transfer instruction. Bankrbot then interpreted this natural language as an executable financial command, verified the NFT permission, signed and broadcast the transaction, ultimately causing a loss of approximately $15-20,000 from the Grok-associated wallet. Two weeks later, the attacker expanded the attack using the same Agent trust-layer vulnerability, resulting in losses exceeding $440,000 from 14 user wallets [1]. The most valuable insight from this case is that Grok had no traditional bug, and Bankrbot had no traditional bug. What truly failed was the trust boundary between two automated systems. Each performed its designed task, but together they produced an erroneous financial result [1].

This is the turning point for on-chain security in the AI Agent era: the interaction flow is shifting from "a human clicking the wallet" to "the Agent first understanding intent, then calling tools within the system, and finally using the wallet or payment layer to execute transactions." Cobo defines an AI Wallet as an on-chain wallet integrated with artificial intelligence for automating and optimizing blockchain operations. It can execute autonomous trades based on market conditions, analyze network congestion to optimize Gas, manage DeFi positions across multiple protocols, detect fraudulent transactions in real-time, and rebalance portfolios based on risk parameters [2]. Such capabilities are valuable for standard-paced trading and strategy automation. However, once deployed in high-frequency, cross-protocol, cross-tool, and automated payment scenarios, security issues are no longer just about private key custody. Attention must be paid to whether the model understands the true intent, whether tools are contaminated, whether permissions are overly broad, and whether signatures are verifiable [3].

A typical Agent system includes user interaction, application logic, the model layer, a tool invocation layer, memory systems, and an underlying execution environment. Attackers often don't just target one module but influence the Agent's behavioral control layer by layer along the path of context, memory, tools, permissions, and execution [4].

However, the AI Agent era hasn't erased Web3's old risks. Private keys, signatures, authorizations, phishing, oracles, and contract vulnerabilities still exist. The truly new change is that the execution chain has lengthened: Previously it was "person sees webpage, clicks wallet, signs transaction." Now it might be "person sets goal, AI reads context, calls MCP tools, uses OAuth permissions, initiates an x402 payment, waits for the service provider's settlement verification, and finally writes the result back on-chain." The longer the chain, the more opportunities for intermediary steps to be misled, contaminated, impersonated, or reused, shifting risk from a single-point vulnerability to cross-layer coupling.

Consequently, the main battleground for on-chain security is expanding from the traditional "private key security + contract vulnerabilities + front-end phishing" to a composite attack surface spanning five layers: model objectives and prompts, memory and knowledge retrieval, tool and MCP/Skill supply chain, wallet and payment authorization, and on-chain execution and governance. Especially as AI Agents enter Web3 DeFi, the core risk exposure doesn't necessarily come from the model itself, but from the fact that existing authorization systems are still built on the old assumption of human-confirmed transactions. Traditional wallets assume a person who pauses, hesitates, and performs a second check in front of the screen. But once an Agent gains permission, it executes continuously according to its task objective. Precisely because of this, wallet authorization mechanisms designed for human interaction will be rapidly amplified into systemic risks in automated execution scenarios [5]. Furthermore, Agent security assessments cannot simply use evaluation methods for stateless LLMs, because multi-step tool calls, state memory, and persistent permissions generate new vulnerability combinations that traditional scanners struggle to capture. Therefore, on-chain security in the AI Agent era must upgrade from "protecting keys" to "protecting intent and constraining execution."

2. Role Replacement: Which "Humans" is AI Replacing On-Chain?

Compliance Note: The following content is purely an objective analysis of the roles AI Agents can play in on-chain transactions and their resulting characteristics. It does not constitute any proposal or offer. Please be aware that issuing or participating in investments in Tokens is subject to varying degrees of stringent regulatory requirements and restrictions in different countries and regions. Specifically, issuing Tokens in mainland China may involve the act of "illegal issuance of securities," and providing crypto-asset transaction matching services is also considered "illegal financial activity" (Mainland Chinese readers are strongly advised to read "Compilation and Key Points of Laws and Regulations Related to Blockchain and Virtual Assets in Mainland China"). Therefore, please do not use this information for decision-making, strictly comply with the laws and regulations of your country or region, and refrain from participating in any illegal financial activities.

The simplest way to understand why AI Agents change the on-chain security boundary is to see which "people" they are replacing. Previously, a typical on-chain operation had a clear human role: someone watched market reactions, someone decided on a trade, someone clicked the wallet, someone confirmed the signature, and someone performed post-trade audits. If something went wrong, you could at least ask who clicked confirm and authorized the action. But as Agents enter the picture, these roles are being unbundled and handed over to software systems, and risk is no longer concentrated on one person or one private key, but distributed among various components.

The first type replaced are traders and strategy executors. Previously, traders had to watch the market and make decisions themselves. Now, trading platform APIs, MCP tools, and Skills are handing these actions over to Agents. Bitget Agent Hub is a typical example; it's based on the Bitget API and official MCP tool suite, allowing Agents to access market data and execution capabilities through standardized interfaces [6]. Other AI basic operating systems (like the recently popular Agent Harness) go further by placing trading Agents within a more complete execution system: not just letting AI place orders, but organizing context, tools, permissions, risk controls, evaluations, tracking, and feedback so the Agent can act continuously in the market [7].

The security issue with this type of replacement is direct: trading errors immediately translate into real financial losses. A human trader making a wrong order still has risk control prompts and manual approval as safeguards. An Agent with overly broad permissions could execute erroneous strategies continuously in milliseconds. Therefore, when asking an Agent to trade, you can't just tell it "please use this wallet to get higher returns." You must proactively set boundaries in the architecture beyond the prompt, such as budget limits, maximum leverage, and audit trails. These rules must be system-level hard constraints, residing above the prompt layer.

The second type replaced are payment initiators and API buyers. Previously, for a human to buy a data service, model inference, or API, they typically had to register an account, top up, get an API Key, and manually control usage. Now, x402 and machine payment protocols aim to change this, allowing Agents to automatically sign and pay for paid resources, obtain the service, and continue their tasks. This means the Agent genuinely becomes a machine butler capable of spending money.

With the payment role replaced, the risk shifts from "did I authorize this spending?" to "will this machine keep spending, on whom, and why?" At the recent Stripe Sessions conference, Stripe also announced a push for a new economic form – Agentic Commerce. Founder John Collison stated that if an agent completes the entire process from research to order placement, with products arriving at home a few days later, the user won't go to another website to fill out personal information from scratch, even if that website's products might be slightly better. Once a shopping agent completes the search process, the next natural step is checkout [8].

Simultaneously, when payment shifts from human confirmation on a case-by-case basis to Agent-driven continuous consumption for tasks, the system must impose session limits, per-transaction limits, and carefully manage authorization. Otherwise, every data call and task execution by the Agent could become a continuous bleeding payment path.

The third type replaced are wallet operators and signature interpreters. Previously, the wallet confirmation page presumed a human was sitting at the screen, checking the recipient address and amount, and then deciding whether to sign. With AI wallets, a user might only say "help me move these assets cross-chain to a higher yield opportunity," and the Agent handles the routing selection, protocol calls, authorization parameters, and transaction generation in between [5]. What the user sees then is just the final execution result packaged by the Agent.

This highlights the critical importance of a verifiable interface in the process. The idea is similar to a "security version" of Deepseek's chain-of-thought. It aims to ensure that what the wallet interface shows genuinely corresponds to what is about to happen on-chain. In other words, the wallet must evolve from a mere signature channel into the final determinism checkpoint before execution, translating the transactions generated by the Agent into content that users can understand, systems can verify, and post-hoc processes can trace [3].

The fourth type replaced are identity principals and business participants. For natural person users, trading platforms and financial applications typically use KYC for identity verification. For corporate entities, KYB (Know Your Business) is used. But an independently running Agent is neither an individual nor a traditional company, yet it might initiate payments, call DEXs, purchase APIs, sign instructions, and trade with another Agent [9]. Here, the question becomes: who created this Agent, on whose behalf is it acting, and who gave it permission?