Beam Sync: A New Way to Sync Ethereum Nodes

Ethereum Enthusiasts

Guest Columnist

2020-03-06 07:00

Beam sync mode can provide better feedback and faster execution results.

Editor's Note: This article is from Ethereum Enthusiasts (ID: ethfans). Author: Jason Carver; translation & proofreading: Chen Liang & A Jian. Reprinted with authorization by Odaily.

Why improve the node sync experience?

It makes me cringe a little to think how many people still rely on Infura (via MetaMask, Gnosis Safe, etc.) to interact with on-chain applications. Infura's service is great, but something is clearly wrong if most users don't run their own nodes. Even the most capable and highly motivated developers cannot completely shed their dependence on Infura. In this respect, a significant part of Ethereum's "self-validating" vision remains unfulfilled.

Our team wants to do its part to reverse this trend. Our mission is to increase the number of nodes on the network as much as possible, especially those run by hobbyists, researchers, and developers. When we asked people why they didn't run their own node, the answer was usually something like: "I installed the client software and tried to sync the blockchain, but it never seemed to finish. So I stopped, because I have other things to do."

Existing Synchronization Methods

Full Sync

Full sync executes every block since the genesis block. The genesis block defines the initial state (account balances, contract bytecode, contract storage, and so on). "Executing a block" means that each time a block is downloaded, the node reads the previous state, generates a new state from the block's contents, and checks that new state against the state root in the block header, which verifies that the block is valid. Full sync is very slow on the Ethereum mainnet, and as the chain grows it takes longer and longer to reach the latest block this way. So people developed "fast sync".
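The execute-and-verify loop of full sync can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the state is a flat dictionary of balances, the "state root" is a SHA-256 hash of the whole state rather than a Merkle tree root, and the names `state_root`, `execute_block` and `full_sync` are hypothetical, not taken from any client.

```python
import hashlib
import json


def state_root(state: dict) -> str:
    """Toy stand-in for a state root: one hash over the whole state.
    (Assumption: real clients use a Merkle tree, not a flat hash.)"""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def execute_block(state: dict, block: dict) -> dict:
    """Read the previous state, apply the block's transfers, return the new state."""
    new_state = dict(state)
    for sender, recipient, amount in block["transfers"]:
        if new_state.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        new_state[sender] = new_state.get(sender, 0) - amount
        new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


def full_sync(genesis_state: dict, blocks: list) -> dict:
    """Execute every block since genesis, checking each claimed state root."""
    state = genesis_state
    for block in blocks:
        state = execute_block(state, block)
        if state_root(state) != block["state_root"]:
            raise ValueError("state root mismatch: invalid block")
    return state


genesis = {"alice": 10, "bob": 0}
blocks = [
    {
        "transfers": [("alice", "bob", 3)],
        # The block producer's claimed post-state root.
        "state_root": state_root({"alice": 7, "bob": 3}),
    }
]
final_state = full_sync(genesis, blocks)
```

The cost of this approach grows with the length of the chain, because every block ever produced passes through `execute_block`.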

Fast Sync

Executing the start block requires the state as it stood just before that block: contract bytecode, account data, and contract storage. Any of these values may be read while executing a transaction. Fast sync therefore obtains a snapshot of the state before the start block from other peers. A snapshot is identified by its state root hash, that is, the root hash of the Merkle tree built over all state content. The node uses this state root hash to verify that the state data downloaded from peers matches the state the miner claimed in the block.

Once fast sync has downloaded all of that state, the node has everything it needs to execute transactions. From this point on it can switch to full sync and execute blocks one by one starting from the start block, just like a node that had fully synced every block before it.
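As a rough illustration of how a state root protects a downloaded snapshot, the sketch below accepts a peer's snapshot only if it hashes to the root claimed in the block header. It is deliberately simplified: the root is a flat SHA-256 hash of the whole state, whereas a real client downloads and checks the Merkle tree node by node. The function names are hypothetical.

```python
import hashlib
import json


def state_root(state: dict) -> str:
    """Toy state root: one hash over the whole state (assumption, not a real trie)."""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def fast_sync(header_state_root: str, peer_snapshots: list) -> dict:
    """Return the first peer-supplied snapshot that matches the header's state root."""
    for snapshot in peer_snapshots:
        if state_root(snapshot) == header_state_root:
            return dict(snapshot)
    raise LookupError("no peer supplied a snapshot matching the state root")


honest = {"alice": 7, "bob": 3}
forged = {"alice": 7, "bob": 3000}  # a dishonest peer inflates bob's balance

# The root comes from the block header; the forged snapshot is rejected.
local_state = fast_sync(state_root(honest), [forged, honest])
```

After this step the node holds a verified state and could continue by executing blocks as in full sync.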

Other methods

Other sync methods include Warp Sync as well as some that are currently unproven. At a more abstract level, they are all variations on fast sync. Understanding how they work does not help in understanding beam sync, so they are not the focus of this article.

How fast is fast sync?

Fast sync faces challenges on today's mainnet because it has to download a lot of data, potentially more than 100GB, so a node can be stuck for a long time in the second step of fast sync, "Get All State".

If you cannot download all the state data within 30 minutes (spoiler alert: you really can't), you need to pivot, that is, switch to a newer start block and restart the sync. The restart is not from zero, but it still adds to the time spent downloading and verifying blocks.

Beam sync method

Overview

Beam sync is a direct improvement on fast sync. The difference is that beam sync executes the start block right away, requests only the state data that is missing from the local database, and saves both the input state and the output state locally. After executing a block, it moves on to the next block and repeats the process, requesting missing data as needed.

Over time, less and less data is missing. Note that if a piece of state is never accessed, the client never requests it (and so never obtains it), so we run another process in the background to fill these gaps. Through this backfilling process, beam sync eventually fetches all state data and saves it locally, and the node can then switch to full sync.
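The request-on-demand loop and the background backfill can be sketched as follows. This is a toy model, not Trinity's implementation: state is a dictionary of balances, the peer is a plain lookup function, nothing is verified against a state root, and `MissingState`, `beam_sync_block` and `backfill` are hypothetical names.

```python
class MissingState(Exception):
    """Raised when execution touches state that is not in the local database."""

    def __init__(self, key):
        super().__init__(key)
        self.key = key


def execute_block(db: dict, block: list) -> dict:
    """Run toy transfers against db; raise MissingState on the first gap."""
    pending = {}

    def read(account):
        if account in pending:
            return pending[account]
        if account not in db:
            raise MissingState(account)
        return db[account]

    for sender, recipient, amount in block:
        pending[sender] = read(sender) - amount
        pending[recipient] = read(recipient) + amount
    return pending


def beam_sync_block(db: dict, block: list, fetch_from_peer) -> int:
    """Execute one block, fetching missing state on demand. Returns request count."""
    requests = 0
    while True:
        try:
            outputs = execute_block(db, block)
        except MissingState as missing:
            db[missing.key] = fetch_from_peer(missing.key)  # save input state
            requests += 1
            continue  # retry the block with the gap filled
        db.update(outputs)  # save output state
        return requests


def backfill(db: dict, all_keys, fetch_from_peer) -> None:
    """Background pass: fetch state that no executed block ever touched."""
    for key in all_keys:
        if key not in db:
            db[key] = fetch_from_peer(key)


peer_state = {"alice": 10, "bob": 0, "carol": 5}
db = {}
requests_first = beam_sync_block(db, [("alice", "bob", 3)], peer_state.__getitem__)
requests_second = beam_sync_block(db, [("bob", "alice", 1)], peer_state.__getitem__)
backfill(db, peer_state.keys(), peer_state.__getitem__)
```

In this run the first block triggers two requests, the second block triggers none because its inputs are already local, and `carol` only arrives through the backfill.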

We refer to the data set required to execute each block as the "block witness". Thanks to the structure of the Merkle tree, we can prove that the witness data is really taken from this state without downloading a complete state (Translator's Note: It is the so-called "Merkle proof").
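To make the idea of a Merkle proof concrete, here is a minimal sketch that proves one leaf belongs to a tree using only the sibling hashes along its path. It assumes a binary tree hashed with SHA-256 purely for illustration; Ethereum's actual state tree has a different shape and hash function, and the helper names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list) -> list:
    """Return all tree levels, from the hashed leaves up to the root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pair an odd last node with itself
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels: list, index: int) -> list:
    """Collect (sibling_hash, sibling_is_on_right) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        sibling_hash = level[sibling] if sibling < len(level) else level[index]
        proof.append((sibling_hash, index % 2 == 0))
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof: list, root: bytes) -> bool:
    """Recompute the root from one leaf and its sibling hashes."""
    node = h(leaf)
    for sibling_hash, sibling_on_right in proof:
        node = h(node + sibling_hash) if sibling_on_right else h(sibling_hash + node)
    return node == root


leaves = [b"alice:7", b"bob:3", b"carol:5"]
levels = build_levels(leaves)
root = levels[-1][0]

proof_for_bob = make_proof(levels, 1)
bob_is_valid = verify_proof(b"bob:3", proof_for_bob, root)
forgery_is_valid = verify_proof(b"bob:3000", proof_for_bob, root)
```

The verifier needs only the root, the leaf, and one hash per tree level, which is why a witness can be checked without holding the complete state.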

For simplicity, we use "block witness data size" to mean the number of data elements required to execute a block. Such a data element may be a node in the main account state tree, a node in a contract storage tree, or the complete bytecode of a contract (Translator's Note: the "state tree" and "storage tree" here are both Merkle-tree-structured data).

Analyzing the block witness data size is key to understanding the performance of beam sync. Fast sync must download all state data before executing the first block (i.e., the start block), while beam sync only needs to download the witness data for one block. If the witness data for a block amounted to one third of the complete state, beam sync would be about three times as efficient as fast sync.

So, obviously, the next step is to see how big mainnet witness data actually is. It may be too early to draw firm conclusions, but early test results suggest that 3,000 state tree nodes is a reasonable estimate (90% confidence level). The full mainnet state has more than 300 million tree nodes.

Speed improvements for Beam sync

Let's define a new metric: "start to execute" time. This is the time from starting a node with an empty database to fully importing the most recent block.

If beam sync only needs to download 3,000 state tree nodes while fast sync needs to download 300 million, we can put an upper bound on the speedup: when syncing the mainnet, beam sync could improve "start to execute" time by up to 100,000x!

In practice, however, beam sync usually does not achieve a 100,000x improvement. Reasons include but are not limited to:

The block witness data is determined on demand, which means we cannot predict in advance which state data will be needed (only after receiving one piece of data can we tell which piece is needed next). So when we request data from peers, we can only request one piece of state data at a time. In contrast, fast sync can request up to 384 tree nodes at a time, which makes beam sync more sensitive to peer network latency.

Finding high-quality, low-latency peers to sync from takes time. To be honest, which peers you encounter is essentially random.

Unlike fast sync, beam sync keeps downloading block state after the start block, which also slows down block import. If you have some intuition for this, you might notice that the problem gets worse if the average time to collect a block witness is longer than the average time to produce a block.
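The effect of one-at-a-time requests can be estimated with a little arithmetic on the figures quoted in this article. The model below is a rough sketch: it counts network round trips only and assumes, as a simplification not made in the article, that requests are issued strictly one after another with no parallelism across peers.

```python
import math

# Figures quoted in the article.
TOTAL_STATE_NODES = 300_000_000  # tree nodes in the full mainnet state
WITNESS_NODES = 3_000            # estimated tree nodes in one block witness
FAST_SYNC_BATCH = 384            # tree nodes fast sync can request at a time

# Ratio of data volume: the theoretical upper bound on the speedup.
data_ratio = TOTAL_STATE_NODES / WITNESS_NODES

# Ratio of round trips, assuming strictly sequential requests (an assumption).
fast_sync_round_trips = math.ceil(TOTAL_STATE_NODES / FAST_SYNC_BATCH)
beam_sync_round_trips = WITNESS_NODES  # one node per request
round_trip_ratio = fast_sync_round_trips / beam_sync_round_trips

print(f"data ratio:       {data_ratio:,.0f}x")
print(f"round-trip ratio: {round_trip_ratio:,.0f}x")
```

Under these assumptions the advantage measured in round trips is a few hundred times rather than 100,000 times, which is one way to see why latency matters so much for beam sync.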

Beam sync lag

The delay in collecting block witness data may build up gradually, until you find that your beam sync node is lagging by 5 minutes. In other words, the latest block you have locally was actually produced by the network 5 minutes ago, so when you query an account balance from your node over RPC, it returns the balance as of five minutes ago.

In anecdotal testing, lag commonly varies widely, from 1 minute to 20 minutes. Luckily, we have some tricks for recovering from lag, and generally speaking, the further behind you fall, the better you recover. This leads to large swings in lag time: a gradual fall behind, followed by rapid catch-up, and the cycle repeats.

One reason we can catch up faster when we are behind is that we can collect witness data for several blocks in advance and in parallel. After all, you can only take advantage of these "future" blocks if you are behind. Of course, if collecting the data for a block always takes longer than the block time, you will fall further and further behind the chain. We hope this never happens, but we need to plan for it.

Beam Sync Pivot

Pivoting in beam sync works as it does in fast sync: your node skips a series of blocks and selects a new block header near the tip of the chain, then beam sync starts again. The node does not sync from scratch; it still has all the data from the previous sync.
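How lag builds up, and what a pivot does to it, can be sketched with a very small simulation. The model is an assumption made for illustration: each block adds its witness collection time to the lag and subtracts one block interval, the default block interval of 13 seconds is a placeholder value, a pivot is treated as resetting the lag to zero, and parallel witness collection is ignored. The function name `simulate_lag` is hypothetical.

```python
def simulate_lag(collect_times, block_time=13.0, pivot_threshold=None):
    """Return the lag in seconds after each imported block.

    collect_times:   seconds spent collecting the witness for each block
    block_time:      seconds between new blocks on the network (assumed value)
    pivot_threshold: if set, pivot (lag resets to 0) once lag exceeds it
    """
    lag = 0.0
    history = []
    for collect in collect_times:
        # Falling behind when collection is slower than block production,
        # catching up when it is faster; lag cannot go below zero.
        lag = max(0.0, lag + collect - block_time)
        if pivot_threshold is not None and lag > pivot_threshold:
            lag = 0.0  # pivot: jump to a header near the chain tip
        history.append(lag)
    return history


slow_then_fast = [20, 20, 20, 5, 5]
lag_without_pivot = simulate_lag(slow_then_fast)
lag_with_pivot = simulate_lag(slow_then_fast, pivot_threshold=15)
```

With these inputs the lag climbs while witnesses take 20 seconds to collect and shrinks once they take 5 seconds; with a threshold of 15 seconds the third block triggers a pivot instead.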

Alright, let's see what the real situation looks like.

Beam sync on Trinity client

Prototype announced

A new alpha version of the Trinity client was announced last week, and it includes a prototype of beam sync that runs on high-end hardware.

Note: Raising the block gas limit from 8M to 10M seems to have increased the average lag time, and the Istanbul upgrade may reduce the lag time because it increases the gas consumption required to write state data on the blockchain.

Trinity is still in alpha, and the latest version still has many problems. For example, syncing may stop suddenly after a day or two, installing the Trinity client requires extra work, and the command-line output is a mess. For now, Trinity is only suitable for developers and researchers who are curious and don't mind tinkering with it.

The unresolved issues are mostly the kind that surface in day-to-day development, that is, bugs in the backlog. At this point, it's safe to say that no one is worried that beam sync will turn out to be an unattainable dream. This confidence in beam sync is new. But we also expect that unknown problems may still appear suddenly and need solving, just as they did a month ago!

Remaining work

In addition to basic debugging and implementation work, we still have a lot left to do. Trinity does not yet implement a backfill mechanism (i.e., downloading old events, transactions, and data from before the start block). Currently the only way to trigger a pivot is to restart Trinity, and the minimum hardware required for beam sync is not yet known (we welcome help gathering the relevant data).

All of the above are under active research and development, and this is just one of many efforts on Trinity. Thanks to the Ethereum Foundation for sponsoring and supporting this work.

What's new about Beam Sync?

Beam sync, like most ideas, builds on earlier research. Speeding up sync by downloading the latest state and skipping the execution of old blocks (fast sync) was not invented by us. Speeding up execution by relying on witness data instead of the complete state was not invented by us either. For details, see the work on stateless clients.
