AI Agent Era On-Chain Security Report: When the Subject of Transactions, Payments, and Signatures Changes from Humans to Agents, Where Lies the Web3 Security Boundary?

Featured Columnist

2026-07-01 02:08

This article is approximately 24,764 characters long; reading it in full takes about 36 minutes.


AI Summary


Core Thesis: AI Agents are shifting Web3 on-chain operations from “human confirmation” to “model understanding and automatic execution,” expanding the security boundary from protecting private keys and contract vulnerabilities to protecting intent, constraining execution, and managing the trust connections between components such as models, tools, and wallets.
Key Elements:
1. The 2026 Grok/Bankrbot Incident: The attacker neither stole private keys nor attacked the contract. They merely had Grok translate Morse code, and Bankrbot executed the translated output as a transfer instruction, with losses ultimately exceeding $440,000. This exposed the fragility of trust boundaries between AI systems.
2. Attack Surface Complexity: The attack focus has shifted from single code vulnerabilities to cross-layer attacks, including prompt injection, memory poisoning, tool permission penetration, supply chain contamination, and x402 payment hijacking. The attacks target the “connection points.”
3. Layered Defense System: Cobo, Coinbase, and others isolate private keys via MPC/custodial wallets; SlowMist and GoPlus provide tool security audits; KYA/ERC-8004 establish a foundation for Agent identity and permission verification, forming a multi-layered defense.
4. Key Design Principles: Agents can only propose; the rule system is responsible for authorization. Private keys and high-level permissions must be kept away from Agents. All on-chain actions must be readable, verifiable, and auditable. The tool chain must be managed as supply chain assets.

Over the past decade, the core issues of Web3 security have largely revolved around private keys, contract vulnerabilities, and phishing. The security boundary was relatively clear: humans see a page, click on a wallet, confirm a transaction, and the result is executed on-chain. But the emergence of AI Agents is changing this set of default assumptions. The entity performing on-chain operations is shifting from "human manual confirmation" to "models understanding intent, tool invocation systems, and wallets or payment layers executing automatically." Consequently, security issues are evolving from single-point protection to a new phase of cross-layer coordination.

The Grok and Bankrbot incidents in 2026 served as a wake-up call for the industry. The attacker didn't steal private keys or directly attack the contract; they simply had Grok translate a piece of Morse code. Bankrbot then treated this natural language as a transfer instruction and executed it, ultimately causing asset losses for users. This case illustrates that risks in the AI Agent era have extended from code vulnerabilities to the trust boundary between model output, tool invocation, and wallet execution.

This change stems from the lengthening of the Web3 execution chain. In the past, a transaction typically involved a person viewing a page, clicking the wallet, and confirming the signature; now, a user might only propose a goal, and the Agent will read the context, call tools, use permissions, initiate payments, and complete on-chain execution. If any step in the middle is misled, poisoned, or over-authorized, it can turn a mistaken intent into a real transaction. Therefore, on-chain security needs to evolve from protecting keys to protecting intent and constraining execution.

This research report starts with actual security incidents such as Grok/Bankr, PocketOS, and LiteLLM, dissecting how attackers can influence an Agent's understanding, memory, tools, wallets, and payment paths. It then discusses which human roles AI is replacing on-chain. Finally, it examines how projects and proposals such as Cobo, Coinbase, OKX, Binance, SlowMist, KYA, and ERC-8004 are collectively working to rebuild on-chain security boundaries in the AI Agent era.

The report focuses on answering three key questions:

First, what do on-chain attacks look like in the AI Agent era?

Second, when Agents can autonomously trade, call tools, initiate payments, and manage assets, how should wallets, permissions, signatures, and identity systems be redesigned?

Third, as Agentic Wallets, x402 payments, MCP tools, Skills markets, and KYA identity systems mature, what new industrial opportunities will emerge in the Web3 security track?

This will be a systematic reassessment of the on-chain security paradigm in the AI Agent era. The real question is no longer whether AI can execute tasks on-chain, but whether Web3 is ready with a trustworthy, controllable, and auditable execution boundary when AI starts to understand, judge, sign, and pay on behalf of humans.

Original Author: Jesse, Researcher at Web3Caff Research

Produced by: Web3Caff Research × SlowMist

1. Introduction: The On-Chain Security Boundary is Being Redrawn
2. Role Replacement: Which "Humans" is AI Replacing On-Chain?
3. Potential Attack Vectors and Classic Cases in the AI Era
Model Objectives & Prompt Layer: Prompt Injection Turns Chat Content into "Execution Instructions"
Wallet & Signature Semantic Layer: Agents Amplify the Problem of Blind Signing
Memory & State Layer: Attackers Don't Need Immediate Success; They Can First Poison the Agent's Long-Term Memory
Tool & MCP/Skill Layer: Agent Privilege Escalation Strung "Limited Permissions" into "Full Control"
Autonomous Authorization Risk: An Agent Doesn't Need Malice; Being "Helpful" Enough Can Cause Disaster
AI Middleware Attacks: Attackers Don't Touch Your Agent; They First Poison the Tools It Depends On
Wallet & Proxy Payment Layer: x402 Enables AI to Pay, Tying the Web to On-Chain Settlement
AI-Driven Social Engineering & Phishing: Attacking Your Acquaintances
Summary: Attackers' Real Target is the Junctures
4. Defense Project Landscape: Who Are the Participants and What Are They Doing?
Custodial MPC Wallet Providers: Outsourcing Signature Security to a Professional Wallet Layer
Non-Custodial MPC/Key Management Solutions: Keeping Control Within Your Own System
Smart Contract Wallets: Allowing Developers to Build Their Own Security Boundaries
Large Platform/Exchange Agentic Wallet as a Service: Turning the Execution Layer into a Platform Capability
Skills Marketplaces & Security Audits: Protecting the Agent's Toolkit
Identity, Permissions, and Verifiable Execution: Solving "Who is This Agent, Really?"
5. Security Design Principles: Core Recommendations from the Case Studies
Principle 1: Agents Propose, Rule Systems Authorize
Principle 2: Keep Private Keys, Funds, and High-Privilege Credentials Away from Agents
Principle 3: All On-Chain Actions Must Be Readable, Verifiable, and Auditable
Principle 4: Manage Tools, Plugins, and Skills as Supply Chain Assets
Principle 5: Design Payment and Execution as "Bounded Automation"
Principle 6: Assume Incidents Will Happen – So Have Monitoring, Circuit Breakers, and Recovery
6. Conclusion: The Industrial Opportunities Within the Security Track Itself
Key Points Structure Diagram
References

1. Introduction: The On-Chain Security Boundary is Being Redrawn

In May 2026, an attacker didn't steal private keys, attack a contract, or breach Bankr's server. Instead, they had Grok translate a piece of Morse code; Grok, following its "helpful" model objective, output plaintext transfer instructions. Bankrbot then treated this natural language as an executable financial command. After verifying an NFT permission, it signed and broadcast the transaction, resulting in a loss of approximately $150,000–$200,000 from Grok's associated wallet. Two weeks later, the attacker exploited the same Agent trust layer vulnerability to expand the attack, causing losses exceeding $440,000 across 14 user wallets [1]. The most valuable lesson from this case is that Grok had no traditional "bug," nor did Bankrbot. What truly failed was the trust boundary between two automated systems; each performed its designed task perfectly, yet they collectively produced a wrong financial outcome [1].
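One way to picture the missing trust boundary is to tag every piece of text with its origin and refuse to parse commands out of anything that is not the account owner's own authenticated input. The sketch below is illustrative only: `Source`, `Message`, and `handle` are hypothetical names, and it makes no claim about how Grok or Bankrbot are actually built.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    OWNER = "owner"              # input authenticated as coming from the account holder
    MODEL_OUTPUT = "model"       # any text produced by an LLM, including translations
    THIRD_PARTY = "third_party"  # posts, replies, web pages, tool results


@dataclass(frozen=True)
class Message:
    text: str
    source: Source


def may_become_instruction(msg: Message) -> bool:
    """Only owner-authenticated text is eligible to be parsed as a command."""
    return msg.source is Source.OWNER


def handle(msg: Message) -> str:
    # Untrusted text is kept as data; it never reaches the command parser.
    if not may_become_instruction(msg):
        return "ignored: untrusted text is treated as data, not as a command"
    # Even owner text is only queued; a separate policy layer would authorize it.
    return f"queued for policy review: {msg.text}"
```

Under this assumption, a model's translation of Morse code would arrive tagged as `MODEL_OUTPUT` and be ignored, however much it looks like a transfer instruction.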

This marks a turning point for on-chain security in the AI Agent era: the interaction flow on-chain is shifting from "humans clicking wallets" to "Agents first understanding intent, then calling system tools, and finally using wallets or payment layers to execute transactions." Cobo defines an AI Wallet as an on-chain wallet integrated with artificial intelligence for automating and optimizing blockchain operations; it can execute autonomous trades based on market conditions, analyze network congestion to optimize Gas fees, manage DeFi positions across multiple protocols, detect fraudulent transactions in real-time, and rebalance portfolios based on risk parameters [2]. Such capabilities are valuable for normal-paced trading and strategy automation. However, once applied to high-frequency, cross-protocol, cross-tool, and automated payment scenarios, security concerns are no longer just about private key custody. Attention must be paid to whether the model understands the true intent, whether tools are compromised, whether permissions are too broad, and whether signatures are verifiable [3].

A typical Agent system includes a user interaction layer, an application logic layer, a model layer, a tool invocation layer, a memory system, and an underlying execution environment. Attackers rarely target just one module; instead, they progressively gain influence over the Agent's behavior along the path of context, memory, tools, permissions, and execution [4].

However, the AI Agent era hasn't erased Web3's old risks. Private keys, signatures, authorization, phishing, oracles, and contract vulnerabilities still exist. The truly new change is that the execution chain has lengthened. Previously, it was "human views webpage, clicks wallet, signs transaction." Now, it might be "human sets goal, AI reads context, calls MCP tools, uses OAuth permissions, initiates x402 payment, waits for service provider verification and settlement, and finally writes the result back on-chain." The longer the chain, the more points can be misled, poisoned, impersonated, or reused, shifting risk from single-point vulnerabilities to cross-layer coupling.

Consequently, the main battlefield of on-chain security is expanding from the traditional "private key security + contract vulnerabilities + front-end phishing" to a composite attack surface spanning five layers: model objectives & prompts, memory & knowledge retrieval, tool & MCP/Skill supply chains, wallet & payment authorization, and on-chain execution & governance. Especially once AI Agents enter Web3 DeFi, the core risk exposure may come not from the model itself but from the fact that the existing authorization system is still built on the old assumption of human-confirmed transactions. Traditional wallets assume that a person who pauses, hesitates, and double-checks is sitting in front of the screen. But once an Agent gains permission, it executes continuously according to the task objective. As a result, wallet authorization mechanisms originally designed for human interaction can quickly become systemic risks in automated execution scenarios [5]. Meanwhile, Agent security assessments cannot rely on evaluating stateless LLMs, because multi-step tool calls, state memory, and persistent permissions generate new vulnerability combinations that traditional scanners struggle to capture. Hence, on-chain security in the AI Agent era must evolve from "protecting keys" to "protecting intent and constraining execution."

2. Role Replacement: Which "Humans" is AI Replacing On-Chain?

Compliance Note: The following content is an objective analysis of the roles AI Agents can play and the characteristics they form in on-chain transactions. It does not constitute any proposal or offer. You should be aware that issuing or participating in investing in Tokens is subject to varying degrees of strict regulations and restrictions in different countries and regions. Specifically, issuing Tokens in Mainland China may constitute "illegal issuance of securities," and providing services related to cryptocurrency trading, such as order matching, is also considered "illegal financial activity." (Readers in Mainland China are strongly advised to read the "Compilation and Key Points of Laws and Regulations Related to Blockchain and Virtual Currencies in Mainland China"). Therefore, please do not use this information for related decisions, strictly comply with the laws and regulations of your country or region, and do not participate in any illegal financial activities.

To understand why AI Agents change the on-chain security boundary, the simplest approach is to look first at which "humans" they are replacing. In the past, an on-chain operation typically involved clear human roles: someone watched the market, someone decided on trades, someone clicked the wallet, someone confirmed the signature, and someone audited the result afterward. If something went wrong, you could at least ask who clicked confirm and who authorized the action. Once Agents enter the picture, these roles are decoupled and assigned to software systems, and risk is no longer concentrated in one person or one private key but distributed across components.

The first category being replaced is traders and strategy executors. Previously, traders had to watch charts and make judgments themselves; now, exchange APIs, MCP tools, and Skills are handing these actions over to Agents. The Bitget Agent Hub is a typical example; it's based on the Bitget API and official MCP tool suite, allowing Agents to access market data and trading capabilities through standardized interfaces [6]. Some basic AI operating systems (like the recently popular Agent Harness) further embed trading Agents into a more complete execution system: not just letting AI place orders, but organizing context, tools, permissions, risk control, evaluation, tracking, and feedback so the Agent can act continuously in the market [7].

The security problem with this type of replacement is immediate: trading errors instantly translate into real financial losses. A human trader who places a wrong order at least has risk control prompts and manual approval as safeguards. An Agent with overly broad permissions could keep executing a wrong strategy within milliseconds. Therefore, when asking an Agent to trade, it is not enough to say "Please help me achieve higher returns with this wallet." Boundaries such as budget limits, maximum leverage, and audit records must be established in the architecture in advance, and they must be hard system constraints that sit above the prompt layer rather than instructions written into the prompt.
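To make "hard constraints above the prompt layer" concrete, here is a minimal sketch in which the Agent can only submit a proposal and a rule object decides whether it passes. The class names, fields, and limit values are hypothetical and chosen for illustration; a production system would also need persistence, daily resets, and tamper-resistant logs.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class TradeProposal:
    symbol: str
    notional_usd: Decimal
    leverage: Decimal


@dataclass
class TradePolicy:
    daily_budget_usd: Decimal
    max_leverage: Decimal
    spent_today_usd: Decimal = Decimal("0")
    audit_log: list = field(default_factory=list)

    def _rejection_reason(self, p: TradeProposal) -> Optional[str]:
        if p.notional_usd <= 0:
            return "notional must be positive"
        if p.leverage > self.max_leverage:
            return "leverage above cap"
        if self.spent_today_usd + p.notional_usd > self.daily_budget_usd:
            return "daily budget exceeded"
        return None

    def authorize(self, p: TradeProposal) -> bool:
        """Approve or reject a proposal; every decision is recorded."""
        reason = self._rejection_reason(p)
        approved = reason is None
        if approved:
            self.spent_today_usd += p.notional_usd
        self.audit_log.append((p, approved, reason))
        return approved
```

The point of the design is that nothing the model says can change `daily_budget_usd` or `max_leverage`; those values live outside the prompt, and rejected proposals leave an audit record just as approved ones do.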

The second category being replaced is payment initiators and API buyers. Previously, to buy a data service, model inference, or API, a human had to register, deposit funds, get an API Key, and manually control usage. Now, x402 and machine payment protocols aim to change this, allowing Agents to automatically sign payments, acquire services, and continue their tasks when encountering paid resources. This means Agents are starting to become real machine stewards that can spend money to buy services.

Once the payment role is automated, the risk shifts from "Did I authorize this payment?" to "Will this machine keep spending money, to whom, and why?" At the recent Stripe Sessions conference, Stripe also announced plans to promote a new economic form: Agentic Commerce. Co-founder John Collison stated that if an Agent completes the entire process from research to ordering, and the product arrives at home a few days later, the user won't go to another website to re-enter personal information, even if that other site's product might be slightly better. Once a shopping Agent completes the search process, the next natural step is checkout [8].

At the same time, when payment evolves from human confirmation of each transaction to Agents spending continuously as tasks require, the system must set session limits and per-transaction limits, and must define what each authorization actually covers. Otherwise, every data call and task execution by the Agent could become a constant outflow of funds.
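The session and per-transaction limits described here can be sketched as a small guard that every Agent payment has to pass through. This is an illustration under stated assumptions, not part of x402 or any payment protocol: `PaymentSession`, its fields, and the payee allowlist are hypothetical, and amounts are integers in the token's smallest unit.

```python
from dataclasses import dataclass


@dataclass
class PaymentSession:
    per_tx_limit: int            # maximum amount for a single payment
    session_limit: int           # maximum total for the whole session
    allowed_payees: frozenset    # hypothetical allowlist answering "to whom?"
    spent: int = 0

    def try_pay(self, payee: str, amount: int) -> bool:
        """Return True and record the spend only if every limit holds."""
        if payee not in self.allowed_payees:
            return False
        if amount <= 0 or amount > self.per_tx_limit:
            return False
        if self.spent + amount > self.session_limit:
            return False
        self.spent += amount
        return True
```

For example, with a per-transaction limit of 500,000 units and a session limit of 1,000,000, an Agent could make two payments of 400,000 and 500,000, after which a further 200,000 would be refused because the session total would be exceeded.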

The third category being replaced is wallet operators and signature interpreters. In the past, the wallet confirmation page assumed a human was sitting at the screen, checking the recipient address, amount, etc., before deciding whether to sign. With the advent of AI wallets, a user might only say, "Help me bridge this asset to a place with higher yield," leaving the routing choices, protocol calls, authorization parameters, and transaction generation to the Agent [5]. What the user sees becomes an execution result packaged by the Agent.

This highlights the importance of a verifiable interface in the process. The idea resembles a "security version" of DeepSeek's chain of thought: it aims to ensure that what the wallet interface shows truly corresponds to what is about to happen on-chain. In other words, the wallet must upgrade from a mere signature channel to the final checkpoint of certainty before execution, translating the Agent-generated transaction into content the user can understand, the system can verify, and post-mortems can trace [3].
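As a rough illustration of "translating the Agent-generated transaction into content the user can understand," the sketch below decodes raw calldata for the two most common ERC-20 calls and flags unlimited approvals. The function selectors are the standard ones for `transfer(address,uint256)` and `approve(address,uint256)`; the function name and output wording are hypothetical, and a real wallet would rely on full ABI decoding and transaction simulation rather than this two-case decoder.

```python
TRANSFER_SELECTOR = "a9059cbb"  # transfer(address,uint256)
APPROVE_SELECTOR = "095ea7b3"   # approve(address,uint256)
MAX_UINT256 = 2**256 - 1
UNKNOWN = "UNKNOWN: not a plain ERC-20 transfer/approve; require manual review"


def describe_erc20_call(calldata_hex: str) -> str:
    """Render ERC-20 transfer/approve calldata as a human-readable sentence."""
    data = calldata_hex.lower()
    if data.startswith("0x"):
        data = data[2:]
    # 4-byte selector + two 32-byte arguments = 8 + 64 + 64 hex characters
    if len(data) != 136:
        return UNKNOWN
    selector, arg1, arg2 = data[:8], data[8:72], data[72:]
    try:
        amount = int(arg2, 16)
        int(arg1, 16)  # validates that the address word is hexadecimal
    except ValueError:
        return UNKNOWN
    # An address occupies the last 20 bytes; the 12 leading bytes must be zero.
    if arg1[:24] != "0" * 24:
        return UNKNOWN
    address = "0x" + arg1[24:]
    if selector == TRANSFER_SELECTOR:
        return f"transfer {amount} base units to {address}"
    if selector == APPROVE_SELECTOR:
        if amount == MAX_UINT256:
            return f"approve UNLIMITED spending by {address}"
        return f"approve {address} to spend up to {amount} base units"
    return UNKNOWN
```

Anything the decoder cannot explain falls through to `UNKNOWN`, which reflects the design choice that an unreadable transaction should be escalated rather than signed.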

The fourth category being replaced is identity subjects and commercial participants. For individual users, trading platforms and financial applications typically use KYC for identity verification; for corporate entities, it's usually KYB (Know Your Business) for verification. But an independently running Agent is neither a natural person nor a traditional company, yet it might initiate payments, call DEXes, purchase APIs, sign instructions, and trade with another Agent
