On-Chain Security Report for the AI Agent Era: When Agents Replace Humans in Trading, Paying, and Signing, Where Is the Web3 Security Boundary?

Guest Columnist

2026-07-01 02:08

AI Summary

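Core claim: AI Agents are shifting Web3 on-chain operations from "human confirmation" to "model understanding and automatic execution." As a result, the security boundary is expanding from private key protection and contract vulnerabilities to intent protection, execution constraints, and management of the trust links between components such as models, tools, and wallets.
Key points:
1. The 2026 Grok/Bankrbot incident: The attacker neither stole private keys nor attacked a contract. They simply had Grok translate Morse code, which Bankrbot then executed as a transfer instruction, causing $440,000 in losses. This exposed the fragility of trust boundaries between AI systems.
2. Growing complexity of the attack surface: The focus of attacks is moving from single code vulnerabilities to cross-layer attacks. These include prompt injection, memory poisoning, tool permission escalation, supply chain contamination, and x402 payment hijacking, and their target is the "connection points."
3. Layered defenses: Cobo, Coinbase, and others isolate private keys through MPC/custodial wallets; SlowMist and GoPlus provide tool security audits; KYA/ERC-8004 lays the foundation for verifying Agent identity and permissions. Together they form a multi-layer defense.
4. Core design principles: Agents may only propose, and a rule system must authorize; private keys and high privileges must be kept away from Agents; every on-chain action must be readable, verifiable, and traceable; the toolchain must be managed like a supply chain asset.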

Over the past decade, the core issues of Web3 security have mostly revolved around private keys, contract vulnerabilities, and phishing. The security boundary was relatively clear: a human sees a page, clicks the wallet, confirms a transaction, and the result is executed on-chain. However, the emergence of AI Agents is changing this default premise. On-chain operations are shifting from "humans personally confirming" to a pipeline in which models interpret intent, tool invocation systems act on it, and wallets or payment layers execute automatically. Consequently, security is evolving from single-point protection to a new phase that requires cross-layer collaboration.

The 2026 Grok and Bankrbot incident served as a wake-up call for the industry. The attacker didn't steal private keys or directly attack a contract. They simply had Grok translate a Morse code message, which Bankrbot then interpreted as a transfer instruction and executed, ultimately causing user asset losses. This case demonstrates that risks in the AI Agent era have extended beyond code vulnerabilities to the trust boundary between model output, tool invocation, and wallet execution.

This change stems from the lengthened execution chain in Web3. Previously, a transaction usually involved a human viewing a page, clicking the wallet, and confirming the signature. Now, a user might only propose a goal, and the Agent reads the context, calls tools, uses permissions, initiates payments, and completes on-chain execution. If any step in this process is misled, polluted, or overstepped, a wrong intent can be turned into a real transaction. Therefore, on-chain security needs to evolve from protecting keys to protecting intent and constraining execution.

This research report will start with actual security cases like Grok/Bankr, PocketOS, and LiteLLM, dissecting how attackers influence an Agent's understanding, memory, tools, wallet, and payment path. It will then discuss which human roles AI is replacing on-chain. Finally, it will examine how projects and proposals like Cobo, Coinbase, OKX, Binance, SlowMist, KYA, and ERC-8004 are jointly rebuilding the on-chain security boundary in the AI Agent era.

The report focuses on answering three key questions:

First, what do on-chain attacks look like in the AI Agent era?

Second, when Agents can autonomously trade, call tools, initiate payments, and manage assets, how should wallets, permissions, signatures, and identity systems be redesigned?

Third, as Agentic Wallet, x402 payments, MCP tools, Skills markets, and KYA identity systems mature, what new industry opportunities will emerge in the Web3 security track?

This will be a systematic reassessment of the on-chain security paradigm in the AI Agent era. The real question is no longer whether AI can execute tasks on-chain, but whether Web3 is ready with a sufficiently trustworthy, controllable, and auditable execution boundary when AI starts to understand, judge, sign, and pay on behalf of humans.

Original Author: Jesse, Researcher at Web3Caff Research

Produced by: Web3Caff Research and SlowMist (joint production)

1. Introduction: The On-Chain Security Boundary is Being Redrawn
2. Role Replacement: Which "Humans" is AI Replacing on the Chain?
3. Potential Attack Vectors and Classic Cases in the AI Era
Model Goals & Prompt Layer: Prompt Injection Turns Chat Content into "Execution Instructions"
Wallet & Signature Semantic Layer: Agents Amplify the Problem of Blind Signing
Memory & State Layer: Attackers Don't Need Immediate Success; They Can First Pollute the Agent's Long-Term Memory
Tool & MCP/Skill Layer: Agent Permission Escalation Strings "Limited Permissions" into "Full Control"
Autonomous Authorization Risk: An Agent Doesn't Need Malice; Being "Helpful" Enough Can Cause Disaster
AI Middleware Attacks: Attackers Don't Touch Your Agent; They First Pollute the Tools It Depends On
Wallet & Proxy Payment Layer: x402 Enables AI to Pay, Tying Web and On-Chain Settlement Together
AI-Driven Social Engineering & Phishing: Attack Your Acquaintances
Summary: Attackers Really Aim at the Connections
4. Defense Project Landscape: Who are the Participants and What Have They Done?
Custodial MPC Wallet Providers: Outsourcing Signature Security to a Professional Wallet Layer
Self-Custodial MPC/Key Management Solutions: Keeping Control Within Your Own System
Smart Contract Wallets: Letting Developers Build Their Own Security Boundaries
Large Platforms/Exchanges' Agentic Wallet as a Service: Making the Execution Layer a Platform Capability
Skills Marketplaces & Security Audits: Protecting Against the Agent's Toolbox
Identity, Permissions, and Verifiable Execution: Solving "Who Exactly is This Agent"
5. Security Design Principles: Core Recommendations from the Cases
Principle 1: Agents Propose; the Rule System Authorizes
Principle 2: Private Keys, Funds, and High-Privilege Credentials Must Be Kept Away from Agents
Principle 3: All On-Chain Actions Must Be Readable, Verifiable, and Auditable
Principle 4: Manage Tools, Plugins, and Skills as Supply Chain Assets
Principle 5: Design Payments and Executions as "Bounded Automation"
Principle 6: Default to Failure; Therefore, Implement Monitoring, Circuit Breakers, and Recovery
6. Conclusion: Industry Opportunities within the Security Track Itself
Key Structure Diagram
References

1. Introduction: The On-Chain Security Boundary is Being Redrawn

In May 2026, an attacker didn't steal private keys, didn't attack a contract, and didn't breach Bankr's servers. Instead, they had Grok translate a Morse code message. Grok, following its "helpful" model goal, output a plaintext transfer instruction. Bankrbot then treated this natural language as an executable financial command, verified the NFT permissions, and signed and broadcast the transaction, ultimately causing approximately $15,000 to $20,000 in losses from the linked Grok wallet. Two weeks later, the attacker exploited the same Agent trust layer vulnerability to widen the attack, resulting in losses exceeding $440,000 across 14 user wallets [1]. The most valuable lesson from this case is that there was no traditional bug in either Grok or Bankrbot. The real failure was the trust boundary between two automated systems: each did what it was designed to do, but together they produced a wrong financial outcome [1].
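The trust boundary failure described above can be illustrated with a minimal Python sketch. It is not Bankrbot's actual logic; the `Source`, `Message`, and `handle` names are hypothetical and introduced only for illustration. The assumption is that every piece of text reaching the executor carries a provenance tag, and that only text from the authenticated wallet owner may ever be parsed as a command.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Hypothetical provenance tags for text reaching the executor."""
    OWNER = "owner"          # authenticated wallet owner
    MODEL_OUTPUT = "model"   # text produced by another AI system
    PUBLIC_POST = "public"   # any third-party social media content


@dataclass(frozen=True)
class Message:
    source: Source
    text: str


def may_trigger_transfer(msg: Message) -> bool:
    """Only an authenticated owner message may be parsed as a command."""
    return msg.source is Source.OWNER


def handle(msg: Message) -> str:
    # Untrusted text is treated as data, however command-like it looks.
    if not may_trigger_transfer(msg):
        return "ignored: untrusted text is data, not an instruction"
    # Even owner commands go on to a separate policy check before signing.
    return f"queued for policy check: {msg.text}"
```

Under this assumption, a translated Morse message would arrive tagged as `MODEL_OUTPUT` and be ignored regardless of its wording, while an owner message would still have to pass a separate policy check before anything is signed.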

This marks a turning point for on-chain security in the AI Agent era: the interaction flow is shifting from "humans clicking the wallet" to "Agents first understanding intent, then calling tools in the system, finally using the wallet or payment layer to execute transactions." Cobo defines an AI Wallet as an on-chain wallet that integrates artificial intelligence to automate and optimize blockchain operations. It can execute autonomous trades based on market conditions, analyze network congestion to optimize gas, manage DeFi positions across multiple protocols, detect fraudulent transactions in real-time, and rebalance portfolios based on risk parameters [2]. Such capabilities are valuable for routine transactions and strategy automation, but once they enter high-frequency, cross-protocol, cross-tool, and automated payment scenarios, security concerns extend beyond just private key custody. Attention must be paid to whether the model understands true intent, whether tools are contaminated, whether permissions are too broad, and whether signatures are verifiable [3].

A typical Agent system includes a user interaction layer, application logic layer, model layer, tool calling layer, memory system, and underlying execution environment. Attackers often do not stop at a single module; they progressively take control of the Agent's behavior along the path of context, memory, tools, permissions, and execution [4].

However, the AI Agent era hasn't erased Web3's old risks. Private keys, signatures, authorizations, phishing, oracles, and contract vulnerabilities still exist. The real new change is that the execution chain has lengthened. Previously, it was "human sees webpage, clicks wallet, signs transaction." Now, it might be "human gives goal, AI reads context, calls MCP tools, uses OAuth permissions, initiates x402 payment, waits for service verification/settlement, and finally writes the result back on-chain." The longer the chain, the more opportunities there are for intermediary steps to be misled, polluted, impersonated, or reused. Risk has shifted from single-point vulnerabilities to cross-layer coupling.

Therefore, the main battlefield of on-chain security is expanding from traditional "private key security + contract vulnerabilities + front-end phishing" to a composite attack surface spanning five layers: model goals and prompts, memory and knowledge retrieval, tools and MCP/Skill supply chain, wallets and payment authorization, and on-chain execution and governance. Especially after AI Agents enter Web3 DeFi, the core of risk exposure may not necessarily come from the model itself, but from the fact that the existing authorization system is still built on the old assumption of human-confirmed transactions. Traditional wallets assume a human who pauses, hesitates, and performs a secondary check is in front of the screen. But once an Agent gains permission, it executes continuously according to the task goal. Consequently, wallet authorization mechanisms designed for human interaction are rapidly amplified into systemic risks in automated execution scenarios [5]. Furthermore, Agent security assessment cannot simply adopt the evaluation methods of stateless LLMs, because multi-step tool calls, state memory, and persistent permissions generate new vulnerability combinations that traditional scanners struggle to capture. Therefore, on-chain security in the AI Agent era must evolve from "protecting keys" to "protecting intent and constraining execution."

2. Role Replacement: Which "Humans" is AI Replacing on the Chain?

Compliance Note: The following content is solely an objective analysis of the roles AI Agents can play in on-chain transactions and their characteristics. It does not constitute any offer or solicitation. Please be aware that issuing or participating in investments in Tokens is subject to varying degrees of stringent regulations and restrictions in different countries and regions. Particularly in mainland China, issuing Tokens may constitute "illegal issuance of securities," and providing cryptocurrency trading matching services also constitutes "illegal financial activities" (Readers in mainland China are strongly advised to read "Compilation and Key Highlights of Laws and Regulations Related to Blockchain and Virtual Currencies in Mainland China"). Therefore, please do not make any related decisions based on this information. Please strictly abide by the laws and regulations of your country or region and do not participate in any illegal financial activities.

To understand why AI Agents are changing on-chain security boundaries, the simplest way is to look at which "humans" they are replacing. In the past, an on-chain operation usually had a clear human role: someone watched market reactions, someone decided on the trade, someone clicked the wallet, someone confirmed the signature, and someone performed post-audit. If something went wrong, you could at least ask who clicked confirm and authorized it. However, with Agents entering the picture, these roles are being decoupled and assigned to software systems. Consequently, risk is no longer concentrated on one person or one private key but is dispersed among various components.

The first type of human replaced is the trader and strategy executor. Previously, traders had to watch the charts and make decisions themselves. Now, exchange APIs, MCP tools, and Skills are delegating these actions to Agents. Bitget Agent Hub is a typical example; it's based on Bitget's API and official MCP tool suite, allowing Agents to access market data and execution capabilities through standardized interfaces [6]. Some AI operating systems (like the recently popular Agent Harness) further place trading Agents within a more complete execution framework: not just letting AI place orders, but organizing context, tools, permissions, risk control, evaluation, tracking, and feedback so the Agent can act continuously in the market [7].

The security issue with this type of replacement is direct: trading errors immediately translate into real financial losses. A human trader making a wrong order still has risk control prompts and manual approval as safeguards. An Agent with excessive permissions could continuously execute a flawed strategy in milliseconds. Therefore, when asking an Agent to trade, you cannot simply say "Please help me get higher returns with this wallet." You must establish boundaries outside its prompts, such as setting a budget cap, maximum leverage, and audit logs. These rules must be system-level hard constraints above the prompt level.
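As a rough illustration of "hard constraints above the prompt level," the sketch below places a budget cap, a leverage cap, and an audit log in a policy engine that sits outside the model. The class names, fields, and thresholds are hypothetical and not taken from any platform mentioned in this report. The Agent can only propose an order; the engine decides and records every decision.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class TradePolicy:
    """Hypothetical hard limits configured by the user, not by the prompt."""
    daily_budget_usd: Decimal
    max_leverage: Decimal


@dataclass
class PolicyEngine:
    policy: TradePolicy
    spent_today_usd: Decimal = Decimal("0")
    audit_log: list = field(default_factory=list)

    def authorize(self, notional_usd: Decimal, leverage: Decimal) -> bool:
        """Approve or reject an order proposed by the Agent and log the decision."""
        reason = "ok"
        if leverage > self.policy.max_leverage:
            reason = "leverage above cap"
        elif self.spent_today_usd + notional_usd > self.policy.daily_budget_usd:
            reason = "daily budget exceeded"
        allowed = reason == "ok"
        if allowed:
            self.spent_today_usd += notional_usd
        # Rejected proposals are logged too, so that audits see every attempt.
        self.audit_log.append((notional_usd, leverage, allowed, reason))
        return allowed
```

Because the limits live in code that the model cannot rewrite, a misled or over-eager Agent can still propose a bad trade, but it cannot exceed the configured budget or leverage.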

The second type of human replaced is the payment initiator and API buyer. Previously, to buy a data service, model inference, or API, a human had to register an account, deposit funds, get an API key, and manually manage usage. Now, x402 and machine payment protocols aim to change this, allowing an Agent to encounter a paid resource, sign and pay for it, obtain the service, and continue its task. This means an Agent truly becomes a machine steward capable of spending money.

After the payment role is replaced, the risk shifts from "Did I click to authorize this payment?" to "Will this machine keep spending money, to whom, and why?" At the recent Stripe Sessions conference, Stripe also announced that it is promoting a new economic model: Agentic Commerce. Co-founder John Collison noted that if an Agent completes the entire process from research to order and the product arrives a few days later, the user won't go to another website and fill in personal information from scratch, even if that site's product might be slightly better. Once a shopping Agent completes the search process, the natural next step is checkout [8].

At the same time, when payment shifts from human step-by-step confirmation to an Agent's continuous consumption based on tasks, the system must impose session limits, per-transaction limits, and pay close attention to authorization issues. Otherwise, every data call and task execution by the Agent could become a continuously bleeding payment path.
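The session and per-transaction limits mentioned above might look like the following sketch. It does not implement the x402 protocol itself; `SessionSpendGuard`, the payee allowlist, and the example amounts are assumptions made for illustration. The idea is that a guard outside the Agent answers "how much, to whom" before any payment is signed.

```python
from dataclasses import dataclass
from decimal import Decimal


class PaymentRejected(Exception):
    """Raised when a payment request falls outside the session's bounds."""


@dataclass
class SessionSpendGuard:
    """Hypothetical guard that bounds what one Agent session may spend."""
    per_payment_cap: Decimal
    session_cap: Decimal
    allowed_payees: frozenset
    session_total: Decimal = Decimal("0")

    def approve(self, payee: str, amount: Decimal) -> Decimal:
        """Record an approved payment and return the remaining session budget."""
        if payee not in self.allowed_payees:
            raise PaymentRejected(f"payee not allowlisted: {payee}")
        if amount <= 0 or amount > self.per_payment_cap:
            raise PaymentRejected(f"amount outside per-payment cap: {amount}")
        if self.session_total + amount > self.session_cap:
            raise PaymentRejected("session cap would be exceeded")
        self.session_total += amount
        return self.session_cap - self.session_total
```

A rejected request leaves the running total unchanged, so a compromised task can at most drain the remaining session budget rather than the whole wallet.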

The third type of human replaced is the wallet operator and signature interpreter. Previously, the wallet confirmation page assumed a human was sitting in front of the screen, checking the recipient address, amount, etc., before deciding whether to sign. With the advent of AI wallets, a user might just say, "Help me bridge this asset to a higher-yielding place." The route selection, protocol calls, authorization parameters, and transaction generation are all handled by the Agent [5]. The user then only sees the final execution result packaged by the Agent.

This highlights the critical importance of a verifiable interface in the process, a "security version" of DeepSeek's chain of thought, so to speak. It aims to ensure that what is displayed on the wallet interface genuinely corresponds to what will happen on-chain. In other words, the wallet must upgrade from a mere signing channel to the final certainty checkpoint before execution, translating the Agent-generated transaction into content the user can understand, the system can verify, and post-event audits can trace [3].
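One way to picture such a checkpoint is a decoder that turns raw calldata into a sentence before anything is signed. The sketch below is a simplified assumption rather than any wallet's actual implementation: it recognizes only the two standard ERC-20 calls `transfer(address,uint256)` and `approve(address,uint256)`, reports amounts in raw base units, and refuses everything else. The function name `describe_erc20_call` is hypothetical.

```python
TRANSFER = "a9059cbb"  # selector of transfer(address,uint256)
APPROVE = "095ea7b3"   # selector of approve(address,uint256)
MAX_UINT256 = 2**256 - 1
UNKNOWN = "UNKNOWN CALL: refuse to sign without manual review"


def describe_erc20_call(calldata_hex: str) -> str:
    """Translate ERC-20 transfer/approve calldata into a readable sentence."""
    data = calldata_hex.lower().removeprefix("0x")
    # 4-byte selector plus two 32-byte arguments, expressed in hex characters.
    if len(data) != 8 + 64 + 64:
        return UNKNOWN
    selector, arg1, arg2 = data[:8], data[8:72], data[72:136]
    try:
        int(arg1, 16)
        amount = int(arg2, 16)
    except ValueError:
        return UNKNOWN
    address = "0x" + arg1[24:]  # last 20 bytes of the 32-byte word
    if selector == TRANSFER:
        return f"Transfer {amount} base units to {address}"
    if selector == APPROVE:
        if amount == MAX_UINT256:
            return f"WARNING: unlimited spending approval for {address}"
        return f"Approve {address} to spend up to {amount} base units"
    return UNKNOWN
```

The design choice here is to fail closed: anything the decoder cannot explain is surfaced as unknown instead of being passed through to signing. A production wallet would also need token decimals, contract verification, and simulation, which this sketch leaves out.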

The fourth type of human replaced is the identity entity and business participant. For natural persons, trading platforms and financial apps typically use KYC for identification. For corporate entities, KYB is used. However, an independently operating Agent is neither a natural person nor a traditional company, yet it might initiate payments, call DEXs, purchase APIs, sign instructions, and trade with another Agent [9]. The question then becomes: Who created this Agent, on whose behalf does it act, and who gave it permission?
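The three questions above can be read as fields of a delegation record. The sketch below is purely illustrative and does not reproduce the data model of KYA or ERC-8004; the `Delegation` class, its field names, and the scope strings are hypothetical. It shows only the minimum a counterparty would want to check before accepting an Agent's request.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Delegation:
    """Hypothetical record answering: who built it, for whom, authorized by whom."""
    agent_id: str
    creator: str        # who created or deployed the Agent
    principal: str      # on whose behalf the Agent acts
    granted_by: str     # who gave it permission
    scopes: frozenset   # actions the Agent may perform
    expires_at: datetime


def is_permitted(d: Delegation, action: str, now: datetime) -> bool:
    """An action is allowed only if it is in scope and the grant has not expired."""
    return now < d.expires_at and action in d.scopes
```

In practice such a record would have to be signed and verifiable by third parties, which is outside the scope of this sketch.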