Vitalik's latest long-form post: In the age of AI, how can code become more secure?

区块律动BlockBeats

特邀专栏作者

2026-05-19 09:52

บทความนี้มีประมาณ 12144 คำ การอ่านทั้งหมดใช้เวลาประมาณ 18 นาที

As AI begins to automatically find bugs, Vitalik's answer is "formal verification"

สรุปโดย AI

ขยาย

Core Argument: Vitalik believes that when facing the security challenges posed by AI, formal verification is not a panacea. However, by combining it with AI, a more reliable software security core can be built, especially for complex cryptographic systems like smart contracts and ZK proofs. This is the key path to achieving a defensive advantage in cybersecurity.
Key Elements:
1. Formal verification uses machine-checkable mathematical proofs to verify code correctness, but the barrier to entry is high; AI can assist in writing code and proofs, making it practically relevant again.
2. “Provable security” does not equal absolute security; it can fail due to overlooked assumptions, specification errors, hardware boundaries, or side-channel attacks. However, it still provides a more reliable security paradigm than traditional methods.
3. For complex components that Ethereum's future depends on, such as STARKs and ZK-EVM, their core security goals are relatively clear, making them the most suitable candidates for AI-assisted formal verification.
4. AI will generate a large amount of flawed code, but software will likely differentiate into a “security core” and “insecure edge components.” Formal verification can ensure the reliability of the security core.
5. AI and formal verification are highly complementary technologies: AI improves coding efficiency but reduces accuracy, while formal verification can restore accuracy, making it even more powerful than in the past.

Original Title: A Shallow Dive into Formal Verification

Original Author: Vitalik

Original Translation: Peggy, BlockBeats

Editor's Note: As AI programming capabilities rapidly improve, software security is facing a new paradox: AI can generate code more efficiently, but it can also discover vulnerabilities more efficiently. This issue is particularly critical for the crypto industry. Defects in smart contracts, ZK proofs, consensus algorithms, and on-chain asset systems often result not in simple software errors, but in irreversible financial losses and a collapse of trust.

What Vitalik discusses in this article is another path to code security in the age of AI: formal verification. Simply put, it doesn't rely on human auditors manually checking code line by line. Instead, it formulates the properties a program should satisfy as mathematical propositions, and then uses machine-checkable proofs to verify whether these properties hold. In the past, formal verification remained a relatively niche field of research and engineering due to its high barrier to entry and tedious process. However, as AI can now assist in writing code and proofs, this methodology is regaining practical relevance.

The core judgment of the article is not that "formal verification can solve all security problems." On the contrary, Vitalik repeatedly reminds us that so-called "provable security" does not equate to absolute security: proofs may omit critical assumptions, the specification itself may be written incorrectly, and unverified code, hardware boundaries, and side-channel attacks can all become new sources of risk. Yet it still offers a more reliable security paradigm: express the developer's intent in multiple ways, and then let the system automatically check whether these expressions are compatible with each other.

This is particularly important for Ethereum. The future Ethereum will increasingly rely on complex underlying components, including STARKs, ZK-EVMs, quantum-resistant signatures, consensus algorithms, and high-performance EVM implementations. The implementation of these systems is extremely complex, but their core security goals can often be formalized relatively clearly. It is precisely in such scenarios that AI-assisted formal verification can realize its greatest value: letting AI handle the writing of efficient code and proofs, while humans focus on verifying whether the finally proven propositions truly correspond to their desired security goals.

From a broader perspective, this article is also Vitalik's response to cybersecurity in the AI era. Facing stronger AI attackers, the answer is not to abandon open source, abandon smart contracts, or re-rely on a few centralized institutions. The answer is to compress critical systems into smaller, more verifiable, and more trustworthy "security cores." AI will lead to a massive increase in rough code, but it may also make truly important code safer than ever before.

The following is the original text:

Special thanks to Yoichi Hirai, Justin Drake, Nadim Kobeissi, and Alex Hicks for their feedback and review of this article.

Over the past few months, a new programming paradigm has been rapidly emerging within Ethereum's frontier research circles and many other corners of the computing world: developers write code directly in very low-level languages, such as EVM bytecode, assembly, or Lean, and verify its correctness by writing machine-checkable mathematical proofs in Lean.

When applied correctly, this approach has the potential to produce both extremely efficient code and software that is far more secure than past development methods. Yoichi Hirai calls it "the final form of software development." This article will attempt to explain the basic logic behind this approach: what formal verification of software can do, and what its shortcomings and boundaries are—both within the context of Ethereum and in the broader field of software development.

What is Formal Verification?

Formal verification refers to writing mathematical theorem proofs in a way that can be automatically checked by a machine.

To give a relatively simple but still interesting example, let's look at a basic theorem about the Fibonacci sequence: every third number is even, while the other two are odd.

A simple proof uses mathematical induction, moving forward three numbers at a time.

First, the base case. Let F1 = F2 = 1, F3 = 2. By direct observation, the proposition—"Fi is even when i is a multiple of 3, and odd otherwise"—holds up to x = 3.

Next, the inductive step. Assume the proposition holds up to 3k+3, meaning we already know the parity of F3k+1, F3k+2, and F3k+3 is odd, odd, and even, respectively. Then, we can calculate the parity of the next set of three numbers:

Thus, we have completed the derivation from "the proposition is known to hold up to 3k+3" to "confirming the proposition also holds up to 3k+6." By repeatedly applying this reasoning, we can be confident that this rule holds for all integers.

This argument is convincing to a human. But what if you wanted to prove something a hundred times more complex, and you really, really wanted to be sure you hadn't made a mistake? Then, you could construct a proof that a computer could also be convinced by.

It would look something like this:

This is the same reasoning, just expressed in Lean. Lean is a programming language often used for writing and verifying mathematical proofs.

It looks quite different from the "human version" I gave above, and for good reason: what is intuitive to a computer is completely different from what is intuitive to a human. The computer I'm referring to here is the old-fashioned kind—a "deterministic" program made of if/then statements, not a large language model.

In the proof above, you aren't just asserting the fact that fib(3k+4) = fib(3k+3) + fib(3k+2); you are asserting that fib(3k+3) + fib(3k+2) is odd, and Lean's aptly named "omega tactic" will automatically combine this with its understanding of the definition of fib(3k+4).

In more complex proofs, you sometimes have to specify at every step which specific mathematical law allows you to take that step, and these laws sometimes have obscure names like Prod.mk.inj. On the other hand, you can also expand massive polynomial expressions in a single step, completing the argument with just a line like `omega` or `ring`.

This unintuitiveness and tediousness are significant reasons why machine-verifiable proofs, despite existing for nearly 60 years, have remained a niche field. But simultaneously, many things that were nearly impossible in the past are rapidly becoming feasible thanks to the rapid progress of AI.

Verifying Computer Programs

At this point, you might think: Okay, so we can have computers verify mathematical theorem proofs, allowing us to finally determine which crazy new discoveries about prime numbers are real and which are just errors hidden in hundred-page PDFs. Maybe we could even figure out whether Shinichi Mochizuki's proof of the ABC conjecture is correct! But besides satisfying our curiosity, what practical use is this?

There can be many answers. For me, a very important one is verifying the correctness of computer programs, especially those involving cryptography or security. After all, a computer program is itself a mathematical object. Therefore, proving that a computer program will behave in a certain way is essentially proving a mathematical theorem.

For example, suppose you wanted to prove whether an encrypted communication software like Signal is actually secure. You could first mathematically define what "secure" means in this context. Broadly speaking, you would need to prove that, under certain cryptographic assumptions, only the person possessing the private key can learn any information about the content of the messages. In reality, there are, of course, many different types of crucial security properties.

And it turns out, a team is indeed trying to figure this out. One of the security theorems they proposed looks roughly like this:

Here is a summary of what this theorem means, provided by Leanstral:

The `passive_secrecy_le_ddh` theorem is a tight reduction showing that, under the Random Oracle Model, the passive message secrecy of X3DH is at least as hard to break as the DDH assumption.

In other words, if an attacker can break the passive message secrecy of X3DH, they can also break DDH. Since DDH is generally assumed to be a hard problem, X3DH can also be considered secure against passive attacks.

The theorem proves that if an attacker can only passively observe Signal's key exchange messages, they cannot distinguish the resulting session key from a random key with more than a negligible probability.

If combined with a proof that AES encryption is correctly implemented, you get a proof that Signal protocol's encryption mechanism is secure against passive attackers.

Similar verification projects already exist, used to prove the secure implementation of TLS and other in-browser cryptographic components.

If you can achieve end-to-end formal verification, you prove not just that a protocol description is secure in theory, but that the specific piece of code users are running is secure in practice. From a user's perspective, this greatly increases the degree of "trustlessness": to fully trust this code, you don't need to inspect the entire codebase line by line, only the statements that have been proven to hold.

Of course, there are some very important asterisks to add here, especially regarding what the keyword "secure" actually means. It's easy to forget to prove truly crucial claims; it's easy to encounter situations where the claim you need to prove doesn't have a description simpler than the code itself; it's easy to sneak in assumptions that ultimately don't hold true within the proof. It's also easy to decide that only a certain part of the system truly needs formal verification, only to be hit by a critical vulnerability in another part, or even at the hardware level. Even Lean's own implementation could have bugs. But before we dive into these headache-inducing details, let's first explore the potential of a near-utopian scenario if formal verification could be ideally and correctly realized.

Formal Verification for Security

Bugs in computer code are already a cause for concern.

When you put cryptocurrency into immutable on-chain smart contracts, code bugs become even more terrifying. Because once the code has a flaw, North Korean hackers can automatically drain all your funds, and you have almost no recourse.

When this is wrapped in zero-knowledge proofs, code bugs become still more terrifying. Because if someone successfully breaks the zero-knowledge proof system, they can withdraw all the funds, and we might have no idea what went wrong—or worse, we might not even know a problem has occurred.

When we have powerful AI models, code bugs become even more terrifying. Models like Claude Mythos, after two more years of improvement, might be able to automate the discovery of these vulnerabilities.

Faced with this reality, some people have begun to advocate abandoning the fundamental idea of smart contracts, even arguing that cyberspace cannot be a domain where defenders have an asymmetric advantage over attackers.

Here are some quotes:

To harden a system, you need to spend more tokens finding vulnerabilities than an attacker spends exploiting them.

And:

Our industry was built on deterministic code. Write code, test, deploy, and know it works—but in my experience, this contract is breaking down. Among the top operators of truly AI-native companies, codebases are beginning to become something you "trust will work," and the probability associated with that trust can no longer be precisely articulated.

Worse, some people believe the only solution is to abandon open-source.

This paints a bleak future for cybersecurity. It is an especially bleak future for those who care about internet decentralization and freedom. The entire cypherpunk ethos is fundamentally built on the idea that on the internet, the defender has the advantage. Building a digital "castle"—whether it takes the form of encryption, signatures, or proofs—is easier than tearing one down. If we lose this, then internet security can only come from economies of scale, from hunting down potential attackers worldwide, and more broadly, from a binary choice: either dominate everything or face destruction.

I disagree with this assessment. I have a much more optimistic vision for the future of cybersecurity.

I believe the challenge posed by powerful AI vulnerability discovery is indeed severe, but it is a transitional challenge. Once the dust settles and we reach a new equilibrium, we will arrive at a state more favorable to defenders than the past.

Mozilla agrees. To quote them:

You may need to reprioritize all other matters, devoting yourself to this task in a continuous and focused manner. But there is light at the end of the tunnel. We are very proud of how the team stepped up to this challenge, and others will too. Our work is not finished, but we have passed the inflection point and are beginning to see a better future that is far more than "barely keeping up." Defenders finally have a chance to achieve a decisive victory.

...

The number of flaws is finite, and we are entering a world where we can finally find them all.

Now, if you Ctrl+F search for "formal" and "verification" in Mozilla's article, you'll find neither word appears. A positive future for cybersecurity is not strictly dependent on formal verification, nor on any other single technology.

Then what does it depend on? Basically, the following chart:

Over the past few decades, many technologies have driven the trend of "declining vulnerability counts":

Type systems;

Memory-safe languages;

Software architecture improvements, including sandboxing, permission mechanisms, and more broadly, clearly distinguishing the "trusted computing base" from "other code";

Better testing methods;

Accumulated knowledge about secure and insecure coding patterns;

An increasing number of pre-written and audited software libraries.

AI-assisted formal verification should not be seen as an entirely new paradigm, but rather as a powerful accelerator: it is speeding up a trend and paradigm that has long been moving forward.

Formal verification is not a panacea. But it is particularly applicable in certain scenarios: where the goal is much simpler than the implementation. This is especially evident in some of the most challenging technologies needed for Ethereum's next major iteration, such as quantum-resistant signatures, STARKs, consensus algorithms, and the ZK-EVM.

STARKs are very complex pieces of software. However, the core security property they implement is easy to understand and formalize: if you see a proof pointing to the hash H of program P, input x, and output y, then either the hash algorithm used by the STARK has been broken, or P(x) = y. Hence, we have the Arklib project, which aims to create a fully formally verified STARK implementation. Another related project is VCV-io, which provides foundational oracle computation infrastructure for formally verifying various cryptographic protocols, many of which are themselves dependencies of STARKs.

More ambitious is evm-asm: a project aiming to build a complete EVM implementation and formally verify it. Here, the security property is less clear-cut. Simply put, the goal is to prove that this implementation is equivalent to another EVM implementation written in Lean, which can be optimized for intuitiveness and readability without considering runtime efficiency.

Of course, in theory, a situation could still arise where we have ten EVM implementations, all proven equivalent to each other, yet all share the same fatal flaw that somehow allows an attacker to drain all ETH from an address they shouldn't control. But the probability of this happening is much lower than such a flaw existing in just one EVM implementation today. Another security property—DoS resistance—whose importance we truly realized only after painful experiences, is relatively easy to specify clearly.

Two other important directions are:

Byzantine Fault Tolerant consensus. In this area, formally defining all desired security properties is also not easy. But given that related bugs have been very common, attempting formal verification is still worthwhile. Hence, there is ongoing work on Lean consensus implementations and their proofs.

Smart contract programming languages. Examples include formal verification work in Vyper and Verity.

In all these cases, a huge added value of formal verification is that these proofs are truly end-to-end. Many of the most insidious bugs are interaction bugs, emerging at the boundaries between two subsystems considered separately. End-to-end reasoning about the entire system is too difficult for humans; but automated rule-checking systems can do it.

Formal Verification for Efficiency

Let's look at evm-asm again. It's an EVM implementation.

But it's an EVM implementation written directly in RISC-V assembly.

Literally assembly.

Here is the ADD opcode:

The reason for choosing RISC-V is that the ZK-EVM provers currently being built typically work by proving RISC-V, compiling the Ethereum client to RISC-V. Therefore, if you write an EVM implementation directly in RISC-V, it should theoretically be the fastest implementation you can get. RISC-V can also be very efficiently emulated on ordinary computers

Vitalik

เทคโนโลยี