AI Agent Era On-Chain Security Report: When Transactions, Payments, and Signatories Shift from Humans to Agents, Where Lies the Web3 Security Boundary?

Guest Columnist

2026-07-01 02:08



AI Summary


Core Thesis: AI Agents are shifting Web3 on-chain operations from "human confirmation" to "model understanding and automated execution," thereby expanding the security boundary from protecting private keys and contract vulnerabilities to safeguarding intent, constraining execution, and managing the trust connections between components such as models, tools, and wallets.
Key Elements:
1. The 2026 Grok/Bankrbot Incident: The attacker neither stole private keys nor attacked the contract. Instead, they had Grok translate Morse code, which Bankrbot then executed as a transfer instruction, resulting in a $440,000 loss. This exposed the fragility of trust boundaries between AI systems.
2. Increased Attack Surface Complexity: The focus of attacks has shifted from single code vulnerabilities to cross-layer attacks, including prompt injection, memory poisoning, tool permission bypass, supply chain contamination, and x402 payment hijacking. Attacks target the "junctures."
3. Layered Defense System: Companies like Cobo and Coinbase isolate private keys via MPC/custodial wallets; SlowMist and GoPlus provide tool security audits; KYA/ERC-8004 establish a foundation for agent identity and permission verification, forming a multi-layered defense.
4. Key Design Principles: Agents can only propose actions, while the rule system is responsible for authorization; private keys and high privileges must be kept away from the Agent; all on-chain actions must be readable, verifiable, and auditable; the toolchain should be managed like supply chain assets.

Over the past decade, the core security issues in Web3 have largely revolved around private keys, contract vulnerabilities, and phishing. The security boundary was relatively clear: humans see a page, click on a wallet, confirm a transaction, and the results are executed on-chain. However, the emergence of AI Agents is changing this default premise. The entity initiating on-chain operations is shifting from "human personally confirming" to "the model understanding intent, the tool calling systems, and the wallet or payment layer executing automatically." Consequently, security issues are evolving from single-point protection to a new phase of cross-layer collaboration.

The 2026 Grok and Bankrbot incidents served as a wake-up call for the industry. The attacker didn't steal private keys or directly attack the contract; they simply had Grok translate a Morse code message. Bankrbot then treated the translated plaintext as a transfer instruction and executed it, ultimately resulting in the loss of user assets. This case illustrates that risks in the AI Agent era have extended from code vulnerabilities to the trust boundaries between model outputs, tool invocations, and wallet executions.

This change stems from the lengthening of the Web3 execution chain. In the past, a transaction typically involved a human viewing a page, clicking on a wallet, and confirming a signature. Now, a user might only propose a goal, and the Agent will read the context, call tools, use permissions, initiate payments, and complete on-chain execution. If any step in between is misled, contaminated, or overstepped, a false intent can be turned into a real transaction. Therefore, on-chain security needs to evolve from protecting keys to protecting intent and constraining execution.

This research report will start with real-world security cases like Grok/Bankr, PocketOS, and LiteLLM, dissecting how attackers influence the Agent's understanding, memory, tools, wallet, and payment paths. It will then discuss which human roles AI is replacing on-chain. Finally, it will explore how projects and proposals such as Cobo, Coinbase, OKX, Binance, SlowMist, KYA, and ERC-8004 are working together to rebuild the on-chain security boundaries in the AI Agent era.

The report focuses on answering three key questions:

First, what do on-chain attacks look like in the AI Agent era?

Second, when Agents can autonomously transact, call tools, initiate payments, and manage assets, how should wallet, permission, signature, and identity systems be redesigned?

Third, with the gradual maturation of Agentic Wallets, x402 payments, MCP tools, Skills markets, and the KYA identity system, what new industry opportunities will emerge in the Web3 security sector?

This represents a systematic reassessment of the on-chain security paradigm in the AI Agent era. The real question is no longer whether AI can execute tasks on-chain, but rather, when AI begins to understand, judge, sign, and pay on behalf of humans, whether Web3 is ready with a sufficiently trustworthy, controllable, and auditable execution boundary.

Original Author: Jesse, Researcher at Web3Caff Research

Produced by: A joint publication of Web3Caff Research and SlowMist

1. Introduction: The On-Chain Security Boundary is Being Redrawn
2. Role Replacement: Which "Humans" is AI Replacing On-Chain?
3. Potential Attack Vectors and Classic Cases in the AI Era
Model Objective & Prompt Layer: Prompt Injection Turns Chat Content into "Execution Instructions"
Wallet & Signature Semantic Layer: Agents Amplify the Blind Signing Problem
Memory & State Layer: Attackers Don't Need to Succeed Immediately; They Can First Poison the Agent's Long-Term Memory
Tools & MCP/Skill Layer: Agent Permission Chaining Can Link "Limited Permissions" into "Full Control"
Autonomous Authorization Risk: An Agent Doesn't Need to Be Malicious; Being "Helpful" Enough Can Cause Disaster
AI Middleware Attacks: Attackers Don't Target Your Agent First; They Contaminate the Tools It Depends On
Wallet & Proxy Payment Layer: x402 Enables AI to Pay, Tying the Web and On-Chain Settlement Together
AI-Driven Social Engineering & Phishing: Attacking Your Acquaintances
Summary: Attackers Are Truly Aiming at the Junctions
4. Defense Project Landscape: Who are the Participants and What Are They Doing?
Custodial MPC Wallet Providers: Outsourcing Signature Security to a Professional Wallet Layer
Self-Custodial MPC/Key Management Solutions: Keeping Control Within Your Own System
Smart Contract Wallets: Letting Developers Build Their Own Security Boundaries
Large Platform/Exchange Agentic Wallet as a Service: Turning the Execution Layer into a Platform Capability
Skills Markets & Security Audits: Guarding the Agent's Toolkit
Identity, Permissions, and Verifiable Execution: Solving "Who Exactly is This Agent?"
5. Security Design Principles: Key Recommendations Drawn from Case Studies
Principle 1: The Agent Can Only Propose; the Rule System Authorizes
Principle 2: Private Keys, Funds, and High-Privilege Credentials Must Stay Away from the Agent
Principle 3: All On-Chain Actions Must Be Readable, Verifiable, and Auditable
Principle 4: Treat Tools, Plugins, and Skills as Supply Chain Assets
Principle 5: Design Payment and Execution as "Bounded Automation"
Principle 6: Assume Things Will Go Wrong – Have Monitoring, Circuit Breakers, and Recovery
6. Conclusion: Industrial Opportunities Within the Security Track Itself
Key Structural Diagram
References

1. Introduction: The On-Chain Security Boundary is Being Redrawn

In May 2026, an attacker didn't steal private keys, attack a contract, or breach Bankr's servers. Instead, they had Grok translate a Morse code message. Grok, following its "helpful" model objective, output the transfer instruction in plaintext. Bankrbot then treated this natural-language output as an executable financial command, validated the NFT permissions, and signed and broadcast the transaction, resulting in a loss of approximately $150,000–$200,000 from the Grok-associated wallet. Two weeks later, the attacker expanded the attack using the same Agent trust layer vulnerability, causing losses exceeding $440,000 across 14 user wallets [1]. The most instructive aspect of this case is that neither Grok nor Bankrbot had a bug in the traditional sense. What truly failed was the trust boundary between two automated systems: each performed its designed task, yet together they produced a wrong financial result [1].
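The trust-boundary failure described above can be illustrated with a minimal sketch. It is not a description of how Grok or Bankrbot are actually built; the `Source` categories, the `Message` type, and the `may_trigger_transfer` function are all hypothetical names introduced here. The idea is simply that the execution side decides based on where a piece of text came from and who authenticated it, not on whether the text looks like a valid command.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Hypothetical provenance labels attached to every inbound message."""
    OWNER_AUTHENTICATED = "owner_authenticated"  # e.g. the wallet owner's signed-in session
    THIRD_PARTY_MODEL = "third_party_model"      # text produced by another AI system
    PUBLIC_POST = "public_post"                  # arbitrary public content


@dataclass(frozen=True)
class Message:
    text: str
    source: Source
    author_id: str


def may_trigger_transfer(msg: Message, wallet_owner_id: str) -> bool:
    """Allow a transfer only for an authenticated message from the wallet owner.

    The text itself is never inspected here: a perfectly formed transfer
    sentence from another model's output is still untrusted input.
    """
    return (
        msg.source is Source.OWNER_AUTHENTICATED
        and msg.author_id == wallet_owner_id
    )
```

Under this assumption, a decoded Morse message that reads like a transfer instruction but arrives as `THIRD_PARTY_MODEL` output is rejected before any parsing or signing takes place.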

This is the turning point for on-chain security in the AI Agent era: the on-chain interaction flow is shifting from "humans clicking on wallets" to "Agents first understanding intent, then calling tools within the system, and finally executing transactions using wallets or payment layers." Cobo defines an AI Wallet as an on-chain wallet integrated with artificial intelligence for automating and optimizing blockchain operations; it can execute autonomous trades based on market conditions, analyze network congestion to optimize Gas, manage DeFi positions across multiple protocols, detect fraudulent transactions in real-time, and rebalance portfolios based on risk parameters [2]. Such capabilities are valuable for normal-pace transactions and strategy automation, but once they enter high-frequency, cross-protocol, cross-tool, and automated payment scenarios, security issues are no longer just about private key custody. Attention must be paid to whether the model understands the true intent, whether tools are contaminated, whether permissions are too broad, and whether signatures are verifiable [3].

A typical Agent system includes a user interaction layer, application logic layer, model layer, tool calling layer, memory system, and underlying execution environment. Attackers often don't just attack one module; they progressively influence the Agent's behavioral control along the path of context, memory, tools, permissions, and execution [4].

However, the AI Agent era hasn't erased Web3's old risks. Private keys, signatures, authorizations, phishing, oracles, and contract vulnerabilities still exist. The real new change is that the execution chain has lengthened: previously it was "human browses page, clicks wallet, signs transaction"; now it might be "human gives goal, AI reads context, calls MCP tools, uses OAuth permissions, initiates x402 payment, waits for service verification and settlement, finally writes results on-chain." The longer the chain, the more points can be misled, contaminated, impersonated, or reused, turning risk from single-point vulnerabilities into cross-layer coupling.

Consequently, the main battlefield of on-chain security is expanding from the traditional "private key security + contract vulnerabilities + frontend phishing" to a composite attack surface spanning five layers: model objectives and prompts, memory and knowledge retrieval, tools and the MCP/Skill supply chain, wallet and payment authorization, and on-chain execution and governance. Especially when AI Agents enter Web3 DeFi, the core risk exposure may stem not from the model itself but from the fact that existing authorization systems are still built on the old assumption that a human confirms each transaction. Traditional wallets assume a human is in front of the screen, someone who pauses, hesitates, and second-guesses. An Agent that has obtained permissions does none of this; it executes continuously toward its task objective. For this reason, wallet authorization mechanisms originally designed for human interaction quickly become systemic risks in automated execution scenarios [5]. Furthermore, Agent security assessments cannot rely on evaluation methods designed for stateless LLMs, as multi-step tool calls, state memory, and persistent permissions generate new vulnerability combinations that traditional scanners struggle to capture. Therefore, on-chain security in the AI Agent era must evolve from "protecting keys" to "protecting intent and constraining execution."

2. Role Replacement: Which "Humans" is AI Replacing On-Chain?

Compliance Notice: The following content is an objective analysis of the roles AI Agents can play in on-chain transactions and their characteristics, and does not constitute any proposal or offer. Please be aware that issuing or participating in Token investments is subject to varying degrees of strict regulations and restrictions in different countries and regions. Notably, issuing Tokens in mainland China constitutes "illegal issuance of securities," and providing cryptocurrency-related services like Token matching also constitutes "illegal financial activities" (Readers in mainland China are strongly advised to read the "Compilation and Key Points of Laws and Regulations Related to Blockchain and Virtual Currencies in Mainland China"). Therefore, please do not use this information for decision-making, strictly abide by the laws of your country and region, and do not participate in any illegal financial activities.

To understand why AI Agents will change on-chain security boundaries, the simplest way is to look at which "humans" they are replacing. In the past, an on-chain operation usually had a clear human role: someone watched market reactions, someone decided on the trade, someone clicked the wallet, someone confirmed the signature, and someone audited afterward. If something went wrong, at least you could ask who clicked confirm. Once Agents enter the picture, these roles begin to be disaggregated and handed over to software systems. Risk is no longer concentrated in one person or one private key but is distributed among various components.

The first category being replaced is traders and strategy executors. Previously, traders had to watch the charts and make judgments themselves. Now, trading platform APIs, MCP tools, and Skills are delegating these actions to Agents. Bitget Agent Hub is a typical example; it is based on the Bitget API and official MCP tool suite, allowing Agents to access market data and execution capabilities through standardized interfaces [6]. Some foundational AI operating systems (like the recently popular Agent Harness) further integrate trading Agents into a more complete execution system: not just letting AI place orders, but organizing context, tools, permissions, risk control, evaluation, tracking, and feedback, allowing the Agent to act continuously in the market [7].

The security problem with this type of replacement is direct: trading errors immediately translate into real financial losses. A human trader making a wrong order still has safeguards like risk control prompts and manual approval. An Agent with excessive permissions could continuously execute wrong strategies at millisecond speed. Therefore, when asking an Agent to trade, you can't just say "Please use this wallet to get higher returns." You must establish boundaries in the architecture beyond the prompt, such as budget limits, maximum leverage, and audit trails. These rules must be hardcoded constraints in the system, overriding the prompt layer.
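To make "boundaries in the architecture beyond the prompt" concrete, here is a minimal sketch of a rule layer that sits between the Agent and the exchange. The class names, fields, and limit values are assumptions chosen for illustration and do not correspond to any platform's API. The Agent can only submit a `TradeProposal`; the policy object decides, and every decision is written to an audit trail regardless of the outcome.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class TradeProposal:
    """What the Agent is allowed to produce: a proposal, not an order."""
    symbol: str
    notional_usd: Decimal
    leverage: Decimal


@dataclass
class TradePolicy:
    """Hard limits enforced in code; no prompt text can change them."""
    daily_budget_usd: Decimal
    max_leverage: Decimal
    spent_today_usd: Decimal = Decimal("0")
    audit_log: list = field(default_factory=list)

    def _rejection_reason(self, p: TradeProposal) -> Optional[str]:
        if p.notional_usd <= 0:
            return "non-positive notional"
        if p.leverage > self.max_leverage:
            return "leverage above limit"
        if self.spent_today_usd + p.notional_usd > self.daily_budget_usd:
            return "daily budget exceeded"
        return None

    def authorize(self, p: TradeProposal) -> bool:
        reason = self._rejection_reason(p)
        approved = reason is None
        if approved:
            self.spent_today_usd += p.notional_usd
        # Record approvals and rejections alike so post-mortems can replay them.
        self.audit_log.append((p, approved, reason))
        return approved
```

The design choice is that the limits live in a component the model cannot rewrite: even an instruction such as "ignore the budget" in the context window has no path to `daily_budget_usd`.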

The second category being replaced is payment initiators and API buyers. Previously, a human wanting to buy a data service, model inference, or API had to register, deposit funds, get an API key, and manually control usage. Now, x402 and machine payment protocols aim to change this, allowing an Agent to encounter a paid resource, sign and pay for it itself, access the service, and continue its task. This means Agents are truly becoming butlers capable of spending money to buy services.

Once the payment role is replaced, the risk shifts from "Did I authorize this payment?" to "Will this machine keep spending money, to whom, and why?" At the recent Stripe Sessions conference, Stripe announced its push for a new economic form, Agentic Commerce. Founder John Collison stated that if an Agent completes the entire process from research to ordering, and the product arrives at your door days later, the user won't go to another website and fill in personal details again, even if that other site's product is slightly better. Once a shopping Agent completes its search process, the next natural step is checkout [8].

Furthermore, when payment shifts from humans confirming each purchase to Agents spending continuously in pursuit of a task, the system must impose hard limits, such as session limits and per-transaction limits, and scope every authorization tightly. Otherwise, every data call and task execution by the Agent could become a continuous payment path that drains funds.
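The session and per-transaction limits mentioned above can be sketched as a small spending guard. This is an illustrative sketch only: `PaymentSession` and its fields are hypothetical, the payee allowlist is an added assumption (addressing the "to whom" question), and nothing here reflects the x402 protocol's actual message format.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class PaymentSession:
    """Caps what an Agent can spend during one task session."""
    per_tx_limit: Decimal        # largest single payment allowed
    session_limit: Decimal       # total allowed across the whole session
    allowed_payees: frozenset    # assumption: payees are pre-approved by the user
    spent: Decimal = Decimal("0")

    def try_pay(self, payee: str, amount: Decimal) -> bool:
        """Return True and record the spend only if every limit holds."""
        if payee not in self.allowed_payees:
            return False
        if amount <= 0 or amount > self.per_tx_limit:
            return False
        if self.spent + amount > self.session_limit:
            return False
        self.spent += amount
        return True
```

With a session limit of 2.00 and calls priced at 0.25, an Agent stuck in a loop is cut off after eight payments, however many times it retries.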

The third category being replaced is wallet operators and signature interpreters. In the past, the wallet confirmation page assumed a human was sitting in front of the screen, checking the receiving address, amount, and other details before deciding whether to sign. With the advent of AI wallets, a user might only say, "Help me bridge this asset to wherever the yield is higher." The routing selection, protocol calls, authorization parameters, and transaction generation are all completed by the Agent [5]. What the user sees is an execution result packaged by the Agent.

This highlights the importance of a verifiable interface in the process, something like a "security version" of DeepSeek's chain of thought, whose purpose is to ensure that what the wallet interface displays truly corresponds to what is about to happen on-chain. In other words, the wallet must upgrade from a mere signature channel to the final deterministic checkpoint before execution, translating the Agent-generated transaction into content that users can understand, systems can verify, and post-mortems can trace [3].
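One way to picture such a deterministic checkpoint is to decode the raw transaction data and compare it with the intent the user stated, rather than trusting the Agent's own summary. The sketch below handles only a plain ERC-20 `transfer(address,uint256)` call, whose 4-byte selector is `0xa9059cbb`; the function names and the `max_amount` parameter are hypothetical, and a real wallet would need to cover many more call types.

```python
TRANSFER_SELECTOR = bytes.fromhex("a9059cbb")  # ERC-20 transfer(address,uint256)


def decode_erc20_transfer(calldata: bytes) -> tuple[str, int]:
    """Decode recipient and amount from a plain ERC-20 transfer call."""
    # Layout: 4-byte selector, 32-byte address word, 32-byte amount word.
    if len(calldata) != 4 + 32 + 32 or calldata[:4] != TRANSFER_SELECTOR:
        raise ValueError("not a plain ERC-20 transfer call")
    address_word = calldata[4:36]
    if any(address_word[:12]):
        raise ValueError("address word has non-zero padding")
    recipient = "0x" + address_word[12:].hex()
    amount = int.from_bytes(calldata[36:68], "big")
    return recipient, amount


def matches_intent(calldata: bytes, intended_recipient: str, max_amount: int) -> bool:
    """Check the bytes to be signed against what the user actually asked for."""
    try:
        recipient, amount = decode_erc20_transfer(calldata)
    except ValueError:
        return False  # anything the checkpoint cannot decode is refused
    return recipient == intended_recipient.lower() and amount <= max_amount
```

The check fails closed: an `approve` call, a different recipient, or an amount above the stated ceiling all return `False`, whatever the Agent's summary says.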

The fourth category being replaced is identity subjects and commercial participants. For natural person users, trading platforms and financial applications typically use KYC for identification. For corporate entities, KYB is used to verify the business entity. However, an independently operating Agent is neither a natural person nor a traditional company, yet it might initiate payments, call DEXs, purchase APIs, sign instructions, and trade with another Agent [9]. This raises the question: who created this Agent, on whose behalf does it act, and who gave it permission?
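The three questions above (who created the Agent, on whose behalf it acts, and who granted its permissions) can be pictured as fields in a delegation record that a counterparty checks before accepting a request. The sketch below is a simplified assumption, not the KYA or ERC-8004 design: the `Delegation` fields are hypothetical, and an HMAC over a shared key stands in for the asymmetric signature a real identity system would use.

```python
import hashlib
import hmac
import json
import time
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Delegation:
    agent_id: str     # which Agent this record describes
    creator: str      # who built or deployed it
    principal: str    # on whose behalf it acts
    scopes: tuple     # what it is permitted to do
    expires_at: int   # Unix seconds after which the grant is void


def _payload(d: Delegation) -> bytes:
    return json.dumps(asdict(d), sort_keys=True).encode()


def sign_delegation(d: Delegation, key: bytes) -> str:
    """Stand-in for a real signature: HMAC-SHA256 over the record."""
    return hmac.new(key, _payload(d), hashlib.sha256).hexdigest()


def is_authorized(d: Delegation, tag: str, key: bytes, scope: str,
                  now: Optional[int] = None) -> bool:
    """Accept only an untampered, unexpired record that covers the scope."""
    current = int(time.time()) if now is None else now
    untampered = hmac.compare_digest(sign_delegation(d, key), tag)
    return untampered and current < d.expires_at and scope in d.scopes
```

A record altered to name a different principal, presented after expiry, or used for a scope it does not list is rejected by the same check.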