IOSG Ventures: Dismantling the Data Availability Layer, the Overlooked Lego Blocks of the Modular Future
星球君的朋友们
Odaily senior author
2022-08-10 06:00
This article is about 6589 words; reading it in full takes about 10 minutes.
In the modular blockchain, the execution layer and the consensus layer are already the red ocean market, and the value of the data availability layer is still to be discovered.

Original title: "IOSG Weekly Brief #136 | Dismantling the Data Availability Layer: Neglected Lego Blocks in a Modular Future"


tl;dr

  • For data availability for light clients, there is little disagreement about using erasure codes to solve the problem; the difference lies in how to ensure the erasure code is correctly encoded. KZG commitments are used in Polygon Avail and Danksharding, while fraud proofs are used in Celestia.

  • For Rollup's data availability, if a DAC is understood as a consortium chain, then what Polygon Avail and Celestia do is make the data availability layer more decentralized, in effect providing a "DA-specific" public chain to raise the level of trust.

  • In the next 3 to 5 years, blockchain architecture will inevitably evolve from monolithic to modular, with each layer loosely coupled. Providers of modular components such as Rollup-as-a-Service (RaaS) and Data-Availability-as-a-Service (DAaaS) may emerge to realize the composability of blockchain architecture. Modular blockchain is one of the important narratives underpinning the next cycle.

  • In the modular blockchain stack, the execution layer has already been carved up, leaving little room for latecomers; the consensus layer is still fiercely contested, with Aptos and Sui newly emerging. Although the competitive landscape of public chains has not settled, their narrative is old wine in new bottles, and it is hard to find reasonable investment opportunities. The value of the data availability layer is still to be discovered.

Modular Blockchain


Image source: IOSG Ventures, adapted from Peter Watts

There is no strict definition of the layering of modular blockchains. Some layering approaches start from Ethereum, while others are more general; it depends on the context of the discussion.

  • Execution layer: Two things happen at the execution layer. For a single transaction, the transaction is executed and the state changes; for a batch of transactions, the batch's state root is calculated. Part of the work of Ethereum's current execution layer is delegated to Rollups, namely the familiar StarkNet, zkSync, Arbitrum, and Optimism.

  • Settlement layer: This can be understood as the process by which the Rollup contract on the main chain verifies a validity proof (zkRollup) or a fraud proof (Optimistic Rollup).

  • Consensus layer: Whether PoW, PoS, or another consensus algorithm is used, the consensus layer exists to reach agreement on something in a distributed system, here on the validity of state transitions. In the modular context, the meanings of the settlement layer and the consensus layer overlap somewhat, so some researchers unify the two.

  • Historical state layer: Proposed by Polynya (for Ethereum only). After the introduction of Proto-Danksharding, Ethereum will only maintain immediate data availability within a certain time window and then prune the data, leaving long-term storage to others. For example, Portal Network or other third parties that store this data can be classified into this layer.


Image source: IOSG Ventures


DA in Nodes

 

Image source: https://medium.com/metamask/metamask-labs-presents-mustekala-the-light-client-that-seeds-data-full-nodes-vs-light-clients-3bc785307ef5

Let's first look at the concepts of full nodes and light clients.

Since full nodes download and verify every transaction in every block themselves, they need no honesty assumption to ensure that state is executed correctly, which gives them strong security guarantees. However, running a full node demands significant storage, computing power, and bandwidth, so apart from miners, ordinary users or applications have little motivation to run one. Moreover, if a node only needs to verify certain information on the chain, running a full node is clearly unnecessary.

This is where light clients come in. In the IOSG article "Multi-chain ecology: our current stage and future pattern" we briefly introduced light clients. Unlike full nodes, light clients usually do not interact with the chain directly; instead, they rely on neighboring full nodes as intermediaries and request the required information from them, such as downloading block headers or verifying account balances.

As a node, a light client can synchronize the whole chain quickly because it only downloads and verifies block headers; in the cross-chain bridge model, a light client takes the form of a smart contract: the target chain's light client only needs to verify that tokens on the source chain are locked, without verifying all of the source chain's transactions.
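As a toy illustration of header-only sync, the sketch below shows a light client checking only that each downloaded header links to its parent, without ever touching transaction bodies. The field names and SHA-256-over-a-string encoding are hypothetical simplifications; real chains hash an RLP/SSZ-encoded header.

```python
import hashlib

def header_hash(header: dict) -> str:
    # Toy header hash; real chains hash a canonical binary encoding.
    data = f"{header['parent_hash']}|{header['tx_root']}|{header['number']}"
    return hashlib.sha256(data.encode()).hexdigest()

def verify_header_chain(headers: list[dict]) -> bool:
    """A light client only checks that each header commits to its parent."""
    for prev, cur in zip(headers, headers[1:]):
        if cur["parent_hash"] != header_hash(prev):
            return False
    return True

# Build a tiny valid chain and sync it.
genesis = {"parent_hash": "0" * 64, "tx_root": "r0", "number": 0}
block1 = {"parent_hash": header_hash(genesis), "tx_root": "r1", "number": 1}
assert verify_header_chain([genesis, block1])
```

Note that the `tx_root` field is carried along but never opened: the light client trusts that the transactions behind it are valid, which is exactly the gap discussed next.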

What's the problem?

There is an implicit problem here: since light clients only download block headers from full nodes, rather than downloading and validating every transaction themselves, a malicious full node (block producer) can construct a block containing invalid transactions and send it to light clients to deceive them.

A natural answer is fraud proofs: only one honest full node is needed to monitor the validity of blocks and, upon finding an invalid one, construct a fraud proof and send it to light clients as a warning. Alternatively, after receiving a block, a light client can actively ask the network whether a fraud proof exists; if none arrives within a certain period, it can assume the block is valid. In this way, light clients achieve nearly the same level of security as full nodes (though still relying on an honesty assumption).

However, the discussion above implicitly assumed that block producers always publish all block data, which is the basic premise for generating fraud proofs. A malicious block producer may instead withhold part of the data when publishing a block. Full nodes can download such a block and determine that it is invalid, but light clients by their nature cannot; and, lacking the data, full nodes cannot generate a fraud proof to warn light clients either.

In another situation, some data may simply arrive late for network reasons. We cannot even tell whether missing data is caused by objective conditions or withheld intentionally by the block producer, so the reward-and-punishment mechanism of fraud proofs cannot take effect.

Data Availability Issues


Image source: https://github.com/ethereum/research/wiki/A-note-on-data-availability-and-erasure-coding

The figure above shows two situations. First, a malicious block producer publishes a block with missing data; an honest full node raises a warning, but the producer then releases the remaining data. Second, an honest block producer publishes a complete block, but a malicious full node raises a false warning. In both cases, the block data seen by the rest of the network after T3 is complete, yet someone has acted maliciously, and observers cannot tell who.

Solution

In September 2018, Mustafa Al-Bassam (now Celestia's CEO) and Vitalik Buterin proposed, in a co-authored paper, using multi-dimensional erasure codes to check data availability: a light client only needs to download a random portion of the data to verify that all of the data is available, and can reconstruct the full data if necessary.
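The power of this scheme comes from simple probability. Assuming samples are drawn independently and uniformly (a simplification of the real protocol), the sketch below estimates how quickly random sampling detects withheld data. With a 2D Reed-Solomon extension, a producer must withhold more than roughly 25% of the extended block to prevent reconstruction, so a light client needs only a handful of samples to catch withholding with high confidence.

```python
import math

def detection_probability(withheld_fraction: float, samples: int) -> float:
    """Probability that at least one of `samples` uniformly random
    chunk queries hits a withheld chunk (i.i.d. approximation)."""
    return 1 - (1 - withheld_fraction) ** samples

def samples_needed(withheld_fraction: float, confidence: float) -> int:
    """Smallest sample count reaching the target detection confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - withheld_fraction))

# If 25% of chunks are withheld, detection probability grows fast:
for s in (5, 15, 30):
    print(s, round(detection_probability(0.25, s), 4))

print(samples_needed(0.25, 0.99))  # 17 samples for 99% confidence
```

This is why sampling scales so well: the number of queries depends on the target confidence, not on the block size.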

There is almost no disagreement about using erasure codes to solve the light-client data availability problem: Reed-Solomon erasure codes are used in Polygon Avail, Celestia, and Ethereum's Danksharding.
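Reed-Solomon coding itself requires finite-field arithmetic, but the core idea, that redundancy lets missing chunks be rebuilt, can be shown with a single XOR parity chunk, the degenerate "any n-1 of n" case. This is a toy stand-in, not the codes the protocols above actually use.

```python
from functools import reduce

def xor_parity(chunks: list[bytes]) -> bytes:
    """XOR equal-length chunks together. With one parity chunk, any ONE
    missing data chunk can be rebuilt; Reed-Solomon generalizes this so
    that any k of n chunks suffice to recover the data."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), chunks)

data = [b"aaaa", b"bbbb", b"cccc"]
parity = xor_parity(data)

# Suppose chunk 1 is withheld: XOR the surviving chunks with the parity.
recovered = xor_parity([data[0], data[2], parity])
assert recovered == data[1]
```

A light client sampling an erasure-coded block relies on exactly this property: once enough chunks are known to exist somewhere in the network, the whole block is recoverable, so withholding becomes detectable.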

The difference lies in how to ensure that the erasure code is correctly encoded: KZG commitments are used in Polygon Avail and Danksharding, while fraud proofs are used in Celestia. Each has advantages and disadvantages: KZG commitments are not quantum-resistant, while fraud proofs rely on certain honesty and synchrony assumptions.

In addition to KZG commitments, there are also schemes using STARKs and FRI to prove the correctness of erasure coding.

(Note: The concepts of erasure coding and KZG commitments are covered in the IOSG article "The Merge is imminent: A detailed explanation of Ethereum's latest technical route"; due to space limitations, they are not explained again here.)

DA in Rollup



Image source: https://forum.celestia.org/t/ethereum-rollup-call-data-pricing-analysis/141

Let's look at the fee structure of Layer 2. Apart from fixed costs, the variables that scale with the number of transactions per batch are mainly the Layer 2 gas cost and the cost of on-chain data availability. The former is negligible, while the latter costs a constant 16 gas per byte of calldata and accounts for as much as 80%-95% of a Rollup's overall cost.
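As a back-of-the-envelope check on that cost structure, the sketch below prices a batch's calldata at the 16-gas-per-byte figure cited above. The ~12-byte compressed transfer size is an illustrative assumption, not a measured value.

```python
# Rough Rollup data availability cost model, assuming the 16 gas/byte
# calldata price cited above; all numbers are illustrative only.
GAS_PER_CALLDATA_BYTE = 16

def batch_da_cost_gas(tx_sizes_bytes: list[int]) -> int:
    """On-chain DA cost of posting one batch of transactions as calldata."""
    return GAS_PER_CALLDATA_BYTE * sum(tx_sizes_bytes)

# e.g. 1,000 compressed transfers of ~12 bytes each (hypothetical):
batch = [12] * 1000
gas = batch_da_cost_gas(batch)   # 192,000 gas for the whole batch
per_tx = gas / len(batch)        # 192 gas of DA cost per transaction
print(gas, per_tx)
```

Because this term grows linearly with batch size while fixed verification costs are amortized, calldata dominates a Rollup's marginal cost, which is exactly why cheaper data availability is so valuable.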

(On-chain) data availability is expensive, what to do?

One is to reduce the cost of storing data on-chain: this is what the protocol layer is doing. In the IOSG article "The Merge is imminent: A detailed explanation of Ethereum's latest technical route", we mentioned that Ethereum plans to introduce Proto-Danksharding and Danksharding to provide Rollups with "big blocks", that is, larger data availability space, and to adopt erasure coding and KZG commitments to address the resulting node burden. But from a Rollup's perspective, passively waiting for Ethereum to adapt itself is unrealistic.

The second is to put the data off-chain.

Image source: IOSG Ventures

(Note: Validium originally refers to the scaling scheme that combines zkRollup with off-chain data availability. For convenience, in this article Validium refers to off-chain data availability schemes in general and is included in the comparison.)


DA Provided by Rollup

In the simplest Validium scheme, data availability is maintained off-chain by a single operator, so users must trust that operator not to withhold data.

Then, in 2020, StarkEx proposed a Validium scheme maintained by a Data Availability Committee (DAC). DAC members are well-known individuals or organizations within legal jurisdictions, and the trust assumption is that they will not collude to act maliciously.

This year, Arbitrum proposed AnyTrust, which likewise relies on a data committee to ensure data availability; Arbitrum Nova is built on AnyTrust.

zkPorter proposes that Guardians (zkSync token holders) maintain data availability by staking zkSync tokens; if a data availability failure occurs, the staked funds are slashed.

All three provide a Volition mode.

Image source: https://blog.polygon.technology/from-rollup-to-validium-with-polygon-avail/

General DA Scenarios

The proposals above share one idea: since ordinary operators are not credible enough, a more authoritative committee is introduced to raise credibility.

But is a small committee secure enough? Two years ago the Ethereum community raised the ransomware-attack problem with Validium: if enough committee members' private keys are stolen to make off-chain data unavailable, users can be extorted, able to withdraw from Layer 2 only by paying a ransom. Given the lessons of the Ronin Bridge and Harmony Horizon Bridge thefts, we cannot ignore this possibility.

Since off-chain data availability committees are not secure enough, what if a blockchain itself is introduced as the trusted party to ensure off-chain data availability?

If the aforementioned DAC is understood as a consortium chain, then what Polygon Avail and Celestia do is make the data availability layer more decentralized: in effect, they provide a "DA-specific" public chain, with its own validators, block producers, and consensus mechanism, to raise the level of trust.


Image source: https://blog.celestia.org/celestiums/

Let's take Celestia's Quantum Gravity Bridge, applied to an Ethereum Rollup, as an example. L2 contracts on the Ethereum main chain verify validity proofs or fraud proofs as usual; the difference is that data availability is provided by Celestia. There are no smart contracts on the Celestia chain and no computation is performed on the data; Celestia only guarantees that the data is available.

The L2 Operator publishes the transaction data to the Celestia main chain; Celestia validators sign the Merkle root of a DA Attestation and send it to the DA Bridge Contract on the Ethereum main chain for verification and storage.

In this way, the Merkle root of the DA Attestation attests to the availability of all the underlying data. The DA Bridge Contract on the Ethereum main chain only needs to verify and store this Merkle root, which greatly reduces overhead.
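The mechanics can be sketched with a toy Merkle tree: the bridge contract stores only one 32-byte root, yet anyone holding a logarithmic-size proof can later show that a particular data chunk was committed to. This is a simplified model (SHA-256, duplicate-last-node padding), not Celestia's actual namespaced Merkle tree.

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Pairwise-hash leaf hashes up to a single 32-byte root."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:            # duplicate the last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def verify_inclusion(leaf: bytes, proof: list, root: bytes) -> bool:
    """Walk from the leaf hash to the root using sibling hashes,
    where each proof step says which side ('L'/'R') the sibling is on."""
    node = h(leaf)
    for sibling, side in proof:
        node = h(sibling + node) if side == "L" else h(node + sibling)
    return node == root

chunks = [b"tx-batch-0", b"tx-batch-1", b"tx-batch-2", b"tx-batch-3"]
root = merkle_root(chunks)
# Proof that chunk 2 is committed to by the stored root:
proof = [(h(chunks[3]), "R"), (h(h(chunks[0]) + h(chunks[1])), "L")]
assert verify_inclusion(chunks[2], proof, root)
```

The on-chain contract's job maps onto `verify_inclusion` plus storing `root`; publishing and serving the chunks themselves stays entirely off the Ethereum main chain.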


Summary

Image source: IOSG Ventures, adapted from Celestia Blog

Having discussed the schemes above one by one, let's compare them along two axes: security/decentralization and gas cost. Note that this chart reflects only the author's personal understanding, as a rough qualitative division rather than a quantitative comparison.

The Pure Validium in the lower left corner has the lowest security/decentralization and Gas cost.

The middle band contains the DAC schemes of StarkEx and Arbitrum Nova, zkPorter's Guardians validator-set scheme, and the general-purpose Celestia and Polygon Avail schemes. The author believes that zkPorter's use of Guardians as a validator set makes it slightly more secure/decentralized than a DAC, while a DA-specific blockchain ranks slightly higher than a single validator set. Gas costs rise accordingly. Of course, this is only a very rough comparison.

The box in the upper right corner holds the on-chain data availability solutions, with the highest security/decentralization and gas cost. Within the box, all three schemes are equally secure/decentralized, since their data availability is provided by the Ethereum main chain. A pure Rollup obviously costs less gas than monolithic Ethereum, and after the introduction of Proto-Danksharding and Danksharding, the cost of data availability will fall further.

Note: "Data availability" in this article is discussed mostly in the context of Ethereum. It should be noted that Celestia and Polygon Avail are general-purpose solutions and are not limited to Ethereum.

Image source: IOSG Ventures


Closing Thoughts

Having discussed the data availability schemes above, we find that all of them are essentially trade-offs under the mutual constraints of the trilemma; the schemes differ in the granularity of those trade-offs.

From a user's perspective, it makes sense for a protocol to offer both on-chain and off-chain data availability options, because users' sensitivity to security and cost differs across application scenarios and user groups.

The discussion above focused on data availability support for Ethereum and Rollups. In cross-chain communication, Polkadot's relay chain provides native data availability guarantees for its parachains, while Cosmos IBC relies on the light-client model, so ensuring that light clients can verify the data availability of both the source chain and the target chain is important.

The advantage of modularity lies in pluggability and flexibility, allowing protocols to adapt components as needed: for example, offloading Ethereum's data availability burden while maintaining security and trust, or raising the security of the light-client communication model in a multi-chain ecosystem and lowering its trust assumptions. Beyond Ethereum, data availability can also play a role in multi-chain ecosystems and even broader application scenarios in the future.

We believe that in the next 3 to 5 years, blockchain architecture will inevitably evolve from monolithic to modular, with each layer loosely coupled. Providers of modular components such as Rollup-as-a-Service (RaaS) and Data-Availability-as-a-Service (DAaaS) may emerge to realize the composability of blockchain architecture.

Among these layers, the execution layer has already been carved up, while the consensus layer (that is, each Layer 1) remains fiercely contested. Even after public chains such as Aptos and Sui began to emerge, the competitive landscape has not settled; but the narrative is old wine in new bottles, and it is difficult to find reasonable investment opportunities.

The value of the data availability layer is still to be discovered.
