AI Agent Era On-Chain Security Report: When Transactions, Payments, and Signing Entities Shift from Humans to Agents, Where Does the Web3 Security Boundary Lie?
- Core Thesis: AI Agents are shifting on-chain operations in Web3 from "human confirmation" to "model understanding and automatic execution." Consequently, the security boundary is expanding from protecting private keys and contract vulnerabilities to safeguarding intent, constraining execution, and managing the trust connections between components like models, tools, and wallets.
- Key Elements:
- The 2026 Grok/Bankrbot Incident: The attacker did not steal private keys or attack the contract. They merely had Grok translate Morse code, which Bankrbot then executed as a transfer instruction, resulting in a $440,000 loss. This exposed the fragility of trust boundaries between AI systems.
- Attack Surface Complexity: The focus of attacks is shifting from single code vulnerabilities to cross-layer attacks, including prompt injection, memory poisoning, tool permission bypass, supply chain contamination, and x402 payment hijacking. The attacks target the "connection points."
- Layered Defense System: Companies like Cobo and Coinbase isolate private keys using MPC/custodial wallets; SlowMist and GoPlus provide tool security auditing; KYA/ERC-8004 establishes identity and permission verification foundations for Agents, forming a multi-layered defense.
- Key Design Principles: Agents can only propose actions; the rule system is responsible for authorization. Private keys and high-level permissions must be kept away from Agents. All on-chain actions must be readable, verifiable, and auditable. The toolchain needs to be managed like supply chain assets.
Over the past decade, the core issues in Web3 security have largely revolved around private keys, contract vulnerabilities, and phishing. The security boundaries were relatively clear: a human sees a page, clicks the wallet, confirms the transaction, and the result is executed on-chain. The emergence of AI Agents is changing these default assumptions. The party initiating on-chain operations is shifting from "a human personally confirming each action" to "a model interpreting intent, a tool layer making calls, and a wallet or payment layer executing automatically." Consequently, security is evolving from single-point protection to a new phase of cross-layer coordination.
The Grok and Bankrbot incidents in 2026 served as a wake-up call for the industry. The attacker didn't steal private keys or directly attack the contract; they simply had Grok translate a Morse code message. Bankrbot then treated Grok's plain-text output as a transfer instruction and executed it, ultimately causing the loss of user assets. This case illustrates that risks in the AI Agent era have extended from code vulnerabilities to the trust boundaries between model outputs, tool calls, and wallet executions.
This change stems from the lengthening of the Web3 execution chain. Previously, a transaction typically involved a person viewing a page, clicking the wallet, and confirming the signature. Now, a user might only propose a goal, and the Agent will read the context, call tools, use permissions, initiate payments, and complete on-chain execution. If any step in this process is misled, polluted, or unauthorized, a mistaken intent can become a real transaction. Therefore, on-chain security needs to evolve from protecting keys to protecting intent and constraining execution.
This research report will start from actual security cases such as Grok / Bankr, PocketOS, and LiteLLM, dissecting how attackers influence the Agent's understanding, memory, tools, wallets, and payment paths. It will then discuss which human roles on-chain AI is replacing. Finally, it will examine how projects or proposals like Cobo, Coinbase, OKX, Binance, SlowMist, KYA, and ERC-8004 are collectively reconstructing the on-chain security boundaries in the AI Agent era.
The report focuses on answering three key questions:
First, what do on-chain attacks look like in the AI Agent era?
Second, how should wallets, permissions, signatures, and identity systems be redesigned when Agents can autonomously trade, call tools, initiate payments, and manage assets?
Third, what new industrial opportunities will emerge in the Web3 security sector as Agentic Wallets, x402 payments, MCP tools, Skills markets, and the KYA identity system mature?
This will be a systematic reassessment of the on-chain security paradigm in the AI Agent era. The real question is no longer whether AI can execute tasks on-chain, but whether Web3 is ready with a sufficiently trustworthy, controllable, and auditable execution boundary when AI starts understanding, judging, signing, and paying on behalf of humans.
Original Author: Jesse, Researcher at Web3Caff Research
Produced by: Web3Caff Research & SlowMist
Table of Contents
- 1. Introduction: On-Chain Security Boundaries Are Being Redrawn
- 2. Role Replacement: Which "Humans" on the Chain is AI Replacing?
- 3. Potential Attack Methods and Classic Cases in the AI Era
- Model Objective & Prompt Layer: Prompt Injection Turns Chat Content into "Execution Instructions"
- Wallet & Signature Semantics Layer: Agents Amplify the Problem of Blind Signing
- Memory & State Layer: Attackers Don't Need Immediate Success; They Can First Pollute the Agent's Long-Term Memory
- Tool & MCP / Skill Layer: Privilege Escalation Can Chain "Limited Permissions" into "Full Control"
- Autonomous Authorization Risk: An Agent Doesn't Need to Be Malicious; Being "Helpful" Enough Can Be Catastrophic
- AI Middleware Attack: Attackers Don't Target Your Agent Directly; They First Pollute the Tools It Depends On
- Wallet & Agent Payment Layer: x402 Enables AI to Pay, Also Ties Web Requests to On-Chain Settlement
- AI-Driven Social Engineering & Phishing: Attacking Your Acquaintances
- Summary: Attackers Are Truly Targeting the Connective Tissue
- 4. Defense Project Landscape: Who Are the Participants and What Are They Doing?
- Custodial MPC Wallet Providers: Outsourcing Signature Security to a Professional Wallet Layer
- Self-Custodial MPC / Key Management Solutions: Keeping Control Within Your Own System
- Smart Contract Wallets: Allowing Developers to Build Their Own Security Boundaries
- Major Platform / Exchange Agentic Wallet as a Service: Turning the Execution Layer into a Platform Capability
- Skills Market & Security Audits: Defending Against the Agent's Toolkit
- Identity, Permissions, and Verifiable Execution: Solving "Who Exactly is This Agent?"
- 5. Security Design Principles: Core Recommendations Drawn from Cases
- Principle 1: Agents Can Only Propose; the Rule System is Responsible for Authorization
- Principle 2: Private Keys, Funds, and High-Privilege Credentials Must Stay Away from the Agent
- Principle 3: All On-Chain Actions Must Be Readable, Verifiable, and Traceable
- Principle 4: Tools, Plugins, and Skills Must Be Managed as Supply Chain Assets
- Principle 5: Design Payments and Execution as "Bounded Automation"
- Principle 6: Assume Breach, So Implement Monitoring, Circuit Breakers, and Recovery
- 6. Conclusion: Industrial Opportunities Within the Security Sector Itself
- Key Architecture Diagram
- References
1. Introduction: On-Chain Security Boundaries Are Being Redrawn
In May 2026, an attacker didn't steal a private key, didn't attack a contract, and didn't break into Bankr's servers. Instead, they had Grok translate a Morse code message. Grok, following its "helpful" model objective, output a plain-text transfer instruction. Bankrbot then treated this natural language as an executable financial command, verified the NFT permission, signed, and broadcast the transaction on-chain. This resulted in a loss of approximately $150,000–$200,000 from the wallet associated with Grok. Two weeks later, the attacker exploited the same Agent trust layer vulnerability on a larger scale, causing losses of over $440,000 from 14 user wallets [1]. The most valuable lesson from this case is that Grok didn't have a traditional bug, and Bankrbot didn't have a traditional bug either. What truly failed was the trust boundary between two automated systems. Each performed its designed task perfectly, yet together they produced a wrong financial outcome [1].
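The trust-boundary failure described above can be pictured with a minimal Python sketch. It is not a reconstruction of Bankrbot's or Grok's actual code; the `Source`, `Message`, and `handle` names are hypothetical and exist only for illustration. The assumption is that every incoming message carries a provenance label, and that text produced by another model is treated as data rather than as a command.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Hypothetical provenance labels attached to incoming text."""
    OWNER_SIGNED = "owner_signed"   # authenticated as coming from the wallet owner
    MODEL_OUTPUT = "model_output"   # text generated by another AI system
    PUBLIC_POST = "public_post"     # arbitrary third-party content


@dataclass(frozen=True)
class Message:
    text: str
    source: Source


def may_trigger_transfer(msg: Message) -> bool:
    # Only owner-authenticated messages may be interpreted as financial commands.
    return msg.source is Source.OWNER_SIGNED


def handle(msg: Message) -> str:
    if not may_trigger_transfer(msg):
        # The text may look like a valid instruction, but its origin is untrusted.
        return "ignored: untrusted source"
    # Even trusted instructions go to a rule system instead of being signed directly.
    return "queued for policy review"
```

In this sketch, a transfer sentence that arrives as another model's translation output is dropped regardless of how well-formed it is, because the decision depends on where the text came from and not on what it says.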
This is the turning point for on-chain security in the AI Agent era: the interaction flow is shifting from "humans clicking the wallet" to "Agent first understanding the intent, then calling tools in the system, and finally using the wallet or payment layer to execute the transaction." Cobo defines an AI Wallet as an on-chain wallet integrated with artificial intelligence for automating and optimizing blockchain operations. It can execute autonomous trades based on market conditions, optimize Gas by analyzing network congestion, manage DeFi positions across multiple protocols, detect fraudulent transactions in real-time, and rebalance portfolios according to risk parameters [2]. Such capabilities are valuable for normal-paced trading and strategy automation. However, once deployed in high-frequency, cross-protocol, cross-tool, and automated payment scenarios, security issues are no longer just about private key custody. Attention must be paid to whether the model understands the true intent, whether tools are contaminated, whether permissions are too broad, and whether signatures are verifiable [3].
A typical Agent system comprises a user interaction layer, application logic layer, model layer, tool calling layer, memory system, and underlying execution environment. Attackers often don't just attack one module in isolation. Instead, they progressively influence the Agent's behavioral control along the path of context, memory, tools, permissions, and execution [4].
However, the AI Agent era hasn't erased the old risks of Web3. Private keys, signatures, authorizations, phishing, oracles, and contract vulnerabilities still exist. What has truly changed is that the execution chain has lengthened: previously, it was "human sees webpage, clicks wallet, signs transaction." Now, it might be "human gives goal, AI reads context, calls MCP tools, uses OAuth permissions, initiates an x402 payment, waits for service-side verification and settlement, and finally writes the result back on-chain." The longer the chain, the more opportunities exist for it to be misled, polluted, impersonated, or replayed. Consequently, risk shifts from single-point vulnerabilities to cross-layer coupling.
Therefore, the main battlefield of on-chain security is expanding from the traditional "private key security + contract vulnerabilities + front-end phishing" to a composite attack surface spanning five layers: model objectives and prompts, memory and knowledge retrieval, tools and the MCP/Skill supply chain, wallet and payment authorization, and on-chain execution and governance. Especially after AI Agents enter Web3 DeFi, the core of risk exposure may not originate from the model itself, but from the fact that existing authorization systems are still built on the old assumption of human-confirmed transactions. Traditional wallets default to a person in front of the screen who will pause, hesitate, and perform a secondary check. However, once an Agent obtains permission, it will execute continuously according to its task objective. For this reason, wallet authorization mechanisms designed for human interaction will be rapidly amplified into systemic risks in automated execution scenarios [5]. Furthermore, Agent security assessment cannot rely on evaluation methods designed for stateless LLMs, as multi-step tool calls, state memory, and persistent permissions generate novel vulnerability combinations that traditional scanners struggle to detect. Thus, on-chain security in the AI Agent era must evolve from "protecting keys" to "protecting intent and constraining execution."
2. Role Replacement: Which "Humans" on the Chain is AI Replacing?
Compliance Note: The following content is solely an objective analysis of the roles AI Agents can play in on-chain transactions and the characteristics they form. It does not constitute any proposal or offer. Please be aware that issuing or participating in investments in Tokens is subject to varying degrees of strict regulatory requirements and restrictions in different countries and regions. Notably, issuing Tokens in Mainland China may be considered "illegal issuance of securities," and providing cryptocurrency trading matching services or similar activities is also classified as "illegal financial activities" (Readers in Mainland China are strongly advised to read the "Compilation and Key Points of Laws and Regulations Related to Blockchain and Virtual Currencies in Mainland China"). Therefore, please do not use this information for related decisions. Strictly adhere to the laws and regulations of your country or region and do not participate in any illegal financial activities.
To understand why AI Agents change on-chain security boundaries, the simplest way is to look at which "humans" they are replacing. In the past, an on-chain operation typically involved clear human roles: someone watched market reactions, someone decided on the trade, someone clicked the wallet, someone confirmed the signature, and someone audited afterwards. If something went wrong, you could at least ask who clicked 'confirm' or 'authorize'. When an Agent comes in, these roles start to be broken apart and assigned to software systems. Risk is no longer concentrated on one person or one private key but distributed among various components.
The first category to be replaced is traders and strategy executors. Previously, traders had to watch the charts and make judgments themselves. Now, trading platform APIs, MCP tools, and Skills are handing these actions over to Agents. Bitget Agent Hub is a typical example. Based on the Bitget API and official MCP tool suite, it allows Agents to access market data and execute trading capabilities through standardized interfaces [6]. Some foundational AI operating systems (like the recently popular Agent Harness) go further by placing the trading Agent into a more complete execution system. It's not just about letting AI place orders, but organizing context, tools, permissions, risk control, evaluation, tracking, and feedback so the Agent can act continuously in the market [7].
The security problem created by this replacement is straightforward: trading errors immediately translate into real financial losses. When a human trader places a wrong order, risk-control prompts and manual approval at least act as safeguards. An Agent with overly broad permissions, by contrast, can keep executing a wrong strategy at millisecond speed. Therefore, when asking an Agent to trade, it is not enough to say "please use this wallet for higher returns." Boundaries such as budget limits, maximum leverage, and audit trails must be set at the architectural level, as hard system constraints that a prompt cannot override.
The second category to be replaced is payment initiators and API buyers. Previously, if a human wanted to buy a data service, model inference, or API, they typically had to register an account, top up funds, get an API Key, and manually control usage. Now, x402 and machine payment protocols aim to change this, allowing an Agent that encounters a paid resource to sign the payment, obtain the service, and continue its task. This means the Agent truly becomes a machine butler capable of spending money to buy services.
With the payment role replaced, the risk shifts from "did I authorize this payment?" to "will this machine keep spending money, who is it paying, and why?" At the recent Stripe Sessions conference, Stripe also announced its push for a new economic form: Agentic Commerce. Co-founder John Collison stated that if an Agent completes the entire process from research to placing an order, and the product arrives at the doorstep days later, the user won't go to another website to fill in all their personal information again, even if the other site's product might be slightly better. Once a shopping Agent completes the search process, the next natural step is checkout [8].
At the same time, when payments shift from humans confirming each time to Agents consuming continuously based on tasks, the system must enforce per-session and per-transaction limits and tightly scope what each authorization covers. Otherwise, every data call and task execution by the Agent could turn into a payment path that bleeds funds continuously.
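One way to picture constraints that sit outside the prompt is the minimal Python sketch below. It is an illustration under stated assumptions, not any platform's real API: `TradeProposal`, `TradePolicy`, and their fields are hypothetical names. The Agent can only submit a proposal; a deterministic policy object decides, and every decision is written to an audit log.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class TradeProposal:
    """What the Agent is allowed to produce: a proposal, never a signed order."""
    symbol: str
    notional_usd: Decimal
    leverage: Decimal


@dataclass
class TradePolicy:
    """Hypothetical rule system holding hard limits the prompt cannot change."""
    daily_budget_usd: Decimal
    max_leverage: Decimal
    spent_today_usd: Decimal = Decimal("0")
    audit_log: list = field(default_factory=list)

    def _rejection_reason(self, p: TradeProposal) -> Optional[str]:
        if p.notional_usd <= 0:
            return "non-positive notional"
        if p.leverage > self.max_leverage:
            return "leverage above limit"
        if self.spent_today_usd + p.notional_usd > self.daily_budget_usd:
            return "daily budget exceeded"
        return None

    def authorize(self, p: TradeProposal) -> bool:
        reason = self._rejection_reason(p)
        approved = reason is None
        if approved:
            self.spent_today_usd += p.notional_usd
        # Approvals and rejections are both recorded for later audit.
        self.audit_log.append((p, approved, reason))
        return approved
```

For example, with a daily budget of 1,000 USD and a leverage cap of 3, a first proposal of 600 USD at 2x is approved, a second identical proposal is rejected for exceeding the budget, and a 100 USD proposal at 5x is rejected for leverage, whatever the Agent's prompt says.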
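The session and per-transaction limits mentioned above could be sketched roughly as follows. This is not the x402 protocol or any vendor SDK; `PaymentSession`, its fields, and the payee names are hypothetical, and amounts are assumed to be integers in the smallest unit of the payment token.

```python
from dataclasses import dataclass


@dataclass
class PaymentSession:
    """Hypothetical spending envelope for one Agent task."""
    per_payment_cap: int        # largest single payment allowed
    session_cap: int            # total the session may ever spend
    allowed_payees: frozenset   # payees this task is permitted to pay
    spent: int = 0

    def try_pay(self, payee: str, amount: int) -> bool:
        if payee not in self.allowed_payees:
            return False        # answers "who is it paying?"
        if amount <= 0 or amount > self.per_payment_cap:
            return False        # per-transaction limit
        if self.spent + amount > self.session_cap:
            return False        # stops the continuous-bleed path
        self.spent += amount
        return True
```

The design choice here is that the session object, not the Agent, owns the running total, so a misled Agent can at most exhaust one bounded envelope.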
The third category to be replaced is wallet operators and signature interpreters. In the past, the wallet confirmation page assumed a human was sitting in front of the screen, checking the recipient address, amount, and other details before deciding whether to sign. With AI Wallets, a user might only say, "Help me move this asset cross-chain to wherever returns are higher." The Agent handles the routing, protocol calls, authorization parameters, and transaction generation, and the user sees only a packaged execution result from the Agent [5].
This highlights the importance of a verifiable interface in the process. The idea is similar to a security-oriented version of DeepSeek's chain of thought: it aims to ensure that what the wallet interface displays truly corresponds to what will happen on-chain. In other words, the wallet must evolve from a mere signing channel into the final deterministic checkpoint before execution, translating the Agent-generated transaction into content that the user can understand, the system can verify, and that can be traced back afterwards [3].
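As a rough illustration of such a deterministic checkpoint, the Python sketch below decodes the calldata of the two standard ERC-20 functions `transfer(address,uint256)` and `approve(address,uint256)` and compares the result with the stated intent. The function names `describe_erc20_call` and `matches_intent` are hypothetical, amounts are in token base units with no decimals handling, and a real wallet would need to cover far more call types.

```python
TRANSFER_SELECTOR = "a9059cbb"   # transfer(address,uint256)
APPROVE_SELECTOR = "095ea7b3"    # approve(address,uint256)
MAX_UINT256 = 2**256 - 1


def describe_erc20_call(calldata_hex: str) -> dict:
    """Turn raw calldata into fields a human or rule system can check."""
    data = calldata_hex.lower()
    if data.startswith("0x"):
        data = data[2:]
    # 4-byte selector plus two 32-byte arguments, hex encoded.
    if len(data) != 8 + 64 + 64:
        raise ValueError("unexpected calldata length")
    selector, arg0, arg1 = data[:8], data[8:72], data[72:136]
    names = {TRANSFER_SELECTOR: "transfer", APPROVE_SELECTOR: "approve"}
    if selector not in names:
        raise ValueError("unsupported selector")
    if arg0[:24] != "0" * 24:
        raise ValueError("address argument is not zero-padded")
    amount = int(arg1, 16)
    return {
        "action": names[selector],
        "counterparty": "0x" + arg0[24:],   # last 20 bytes are the address
        "amount": amount,
        "unlimited": amount == MAX_UINT256,  # flags infinite approvals
    }


def matches_intent(decoded: dict, action: str, counterparty: str, max_amount: int) -> bool:
    """Check the decoded call against what the user actually asked for."""
    return (
        decoded["action"] == action
        and decoded["counterparty"] == counterparty.lower()
        and decoded["amount"] <= max_amount
    )
```

If the Agent claims to be sending 1,000 units to a known address but the decoded calldata is an unlimited `approve` to a different address, `matches_intent` returns `False` and the wallet can refuse to sign.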
The fourth category to be replaced is identity subjects and business participants. For natural person users, trading platforms and financial applications typically complete identity verification through KYC. For corporate entities, this is usually done through KYB (Know Your Business). However, an independently operating Agent is neither a natural person nor a traditional company, yet it might initiate payments, call a DEX, purchase an API, sign instructions, or trade with another Agent


