Risk Warning: Beware of illegal fundraising in the name of 'virtual currency' and 'blockchain'. — Five departments including the Banking and Insurance Regulatory Commission
Information
Discover
Search
Login
简中
繁中
English
日本語
한국어
ภาษาไทย
Tiếng Việt
BTC
ETH
HTX
SOL
BNB
View Market
The pursuit of random numbers is the most simple justice of human beings
DfinityFun
特邀专栏作者
2019-07-24 06:41
This article is about 10156 words, reading the full article takes about 15 minutes
A random number that runs on the 01 machine and can be recognized by everything in the world has never been so charming before the blockchain.

foreword

Community: Nutshell Universe (ID: DfinityFun)

foreword

In the Eastern Continent 4,000 years ago, the shaman slowly picked out the tortoise shell from the ashes of the bonfire. The random cracks on the tortoise shell would determine the timing of the tribe's next hunt; Five hundred outstanding citizens are randomly selected to participate in the discussion of the city-state's government affairs; in the Middle East 2,000 years ago, merchants sat around a gambling table, nervously watching the four-sided dice that rolled on the table, and its final result would determine The attribution of a large fortune.

In the thousands of years of human history, we have been constantly searching for reliable random numbers and using random numbers in all aspects of life, the lottery system of democracy and republic, the transmission of military information during wars, the model calculation of production, and A trusted reference when gambling and entertaining.

first level title

true random number

In terms of the simple technology of our ancestors, humans have created many methods of generating random numbers, but the randomness and unpredictability of these random numbers are no higher than those of burning tortoise shells.

image description

The picture of "Millions of Random Tables" comes from the Internet

image description

Lava lamp pictures from the Internet

first level title

pseudo random number

We can get mysterious and infinite randomness from physical phenomena, but some phenomena cannot be precisely quantified. But at the same time, we need a huge amount of random numbers to encrypt data, train models, and make fair arbitrations. Natural random numbers alone are far from enough.

image description

ENIAC computer | Picture from the Internet

This algorithm is like this: first obtain a very short random value, such as the number of milliseconds at the time of operation, as the "seed" of this algorithm, then square the "seed", output the middle part of the square result, and then use it as the "seed" Repeat the above operation. After repeating enough times, a "random number" is obtained.

This is called the square method, and it is just one of countless pseudo-random number algorithms that output a sequence that depends only on the randomness of the initial seed. Due to the certainty of the machine, the same seed can calculate the same random number.

The number of digits of the seed determines the randomness of the random number. When you set a two-digit seed to generate a 10-digit random number, you can only get up to 100 available random numbers before the function generates a repeated cycle; while in nature , there should be ten billion possibilities for a 10-digit random number. The huge order of magnitude difference between the two is the difference between true and false random numbers.

The picture comes from the Internet

The picture comes from the Internet

Pseudo-random numbers have a regular cycle, and the concentration of points is expressed as veins

Of course, this kind of random number is enough to use. We must introduce a very important concept in computer science, the time boundary, that is, how long this random number is safe and non-repetitive. Just like we use a password car lock, we know that the correct password must be in 10,000 combinations, but it takes several days to try each one. It is considered safe within this time boundary, and the same is true for pseudo-random numbers.

The random strings we type on the keyboard with our eyes closed actually have rules to follow. This is the keyboard layout of QWER. After analysis, the possibility of cracking random numbers will be very high.

first level title

Why Blockchain Needs Random Numbers

In the traditional Internet, random numbers are the basis of cryptography and privacy security. By sharing random keys, encrypted private communications can be carried out between two nodes. In the blockchain, private key transmission is used Property; at the same time, random numbers are also widely used in multi-node communication under limited bandwidth. Random numbers can be used to determine the reasonable order of data transmission to coordinate multi-party nodes. On the blockchain, it is to use consensus algorithms based on random numbers to coordinate The transaction confirmer ensures that everyone only updates the messages of some nodes within a period of time, so as to obtain consensus when the number of network messages is limited.

Summarize

Summarize

  • Fairly determine the right to generate blocks and maintain a consistent consensus. Under some PoW and PoS mechanisms, random numbers are used to select block producers or block producers, including the order of circular block production under the DPoS mechanism, which is also determined by random numbers.

  • Generation of private keys. At present, as long as the private key is generated by the random number method determined by each wallet, there is a big security risk.

  • Random number source for on-chain applications. Ensure the fairness and justice of applications such as gambling, games, lottery draws, distribution, surveys, etc., which are easy to be hacked.

  • first level title

The Difficulty of On-Chain Random Numbers

Although the blockchain is still based on the past Internet technology, there is a big difference in the random number generation part.

The traditional random number generation method is centralized, and the random numbers produced are related to the state value and physical state of a specific machine. However, the random numbers obtained by the same random number algorithm on different nodes are different, and also It is impossible to verify each random number, so the traditional method cannot generate consistent randomness, which is incompatible with the blockchain.

secondary title

Principles of On-Chain Random Numbers

1. Unpredictable

Because the random number determines the fairness of the entire protocol layer including all the above layers, if the random number can be predicted in advance, then it can wait for an opportunity to launch an attack. Of course, this unpredictability has a time boundary. Generally, the blockchain time is used as the boundary, and the difficulty of prediction is increased by controlling the difficulty of calculation or setting waiting. In short, there are two options:

  • Guaranteed that the random number is serialized from the block time (VRF/VDF)

  • Guarantee the difficulty of random number generation, and adjust the difficulty according to the situation (hash collision)

2. Do not interfere

The random number determines the consensus confirmation of the block (in a non-Byzantine fault-tolerant system), so if you can interfere with the occurrence of the random number, you can indirectly control the future direction of the blockchain without mastering the nodes/computing power beyond the upper limit, although Interference can be small, but the accumulation of multiple interferences can lead to more serious problems ("amplification attacks"). There are two solutions:

  • Ensure that random number generation is non-interactive (threshold signature scheme), or calculated entirely based on the state of the node itself (hash collision)

  • Set the random number generation delay, you need to wait for a long time for complex calculations to get the random number, so that the interferer can not predict the impact of their own

3. Verifiable

secondary title

Some solutions that do not follow the principles above

In order to solve the difficulty of random numbers on the blockchain, many solutions have been spontaneously produced. The following is a brief introduction to the following solutions that do not meet the above three principles. Although they each have self-consistent logic and certain security, they are currently There are still problems in the stage, and it can still be used in places where the security requirements are not so high. This is not specific to a particular project, but points to issues with similar paradigms.

1. Use off-chain true random numbers

Instead of computing random numbers on-chain, use true random numbers derived from physical phenomena such as atmospheric noise, electrical pulses, and rely on chipsets containing thermally noisy circuits. But for the blockchain, the off-chain part is invisible, and it is untrustworthy by default. Obtaining random numbers off the chain must require a third party to upload it to the chain, which not only violates the spirit of decentralization of the blockchain , is completely unverifiable, and there are risks of tampering and advance prediction. Even if the oracle machine is used to obtain truly random numbers off-chain in a decentralized manner, there is still physical human interference. Therefore this scheme is very undesirable.

2. Use the data of the current block as a random source

Many gambling Dapps on the chain are used to directly refer to the hash root value of the latest block as the random seed of the contract, and then generate random numbers. Although the brief introduction relies on the guarantee of the computing power on the chain, the random numbers generated in this way seem to be credible enough, but multi-dimensional security needs to be considered. First of all, the data on the block is transparent, all nodes can obtain it, and attackers can also use it to attack the contract. The only obstacle is the random number algorithm that is not open source. The second point is that after the block producer is qualified to produce blocks, he can try to change the order of packaged transactions and try to package different transactions to generate the hash root value that is most beneficial to him, thereby expanding his chances of winning, which is beneficial to other participants unfair.

3. Generate random numbers with the help of distributed organizations

first level title

Three types of programs

secondary title

hash collision

In the PoW system, the client of the miner has a random number generator based on the state of the node, which can output a sequence of random numbers and then calculate its hash value. Once the generated hash value is within the specified size range, it is In order to obtain the qualification to produce a block, other nodes obtain this random value and calculate the hash value once for verification, so its verification is very simple.

Therefore, this random number has to experience hundreds of trillions of random number occurrences before it has a chance to obtain a random number that meets the requirements, so its degree of randomness is very high; It is almost impossible to predict this random number several blocks in advance; since the random number is completely output according to the state of the node itself, this random number will not interfere with other attackers; instant difficulty adjustment (adjust the size range of the hash value ), and will not be breached within the security time boundary.

advantage:

advantage:

  • The strongest randomness, security number, very suitable for driving the consensus layer;

  • shortcoming:

shortcoming:

  • Calculation redundancy, large calculation consumption, waste of resources;

  • Difficult to use at the application layer;

  • Not unique, multiple random numbers that meet the requirements can exist at the same time, which may lead to block forks;

  • secondary title

Verifiable Random Function (VRF)

The VRF (Verifiable Random Function) algorithm was proposed by Professor Mocali in 1999. Due to its better security and efficiency, it is used by more and more blockchain projects to optimize the consensus process, allowing the random number part of the consensus to occupy the calculation The resources are reduced, and more resources are occupied by the confirmation of transactions and the operation of contracts.

Set the above figure to briefly describe the generation steps:

0. Generate a public-private key pair dedicated to random number signatures;

1. Obtain a sufficiently random seed, which can directly use the results of the previous round and combine with variables such as block height and time;

2. Use the random number private key to sign it (jointly participate in the random number generation), or sign first and then combine;

3. Make a hash summary of the signed value to get the latest random number;

4. Check whether the random number is within the legal range, and judge whether the lottery has successfully entered the block production group (some projects do not need this process)

5. According to the input of the public key and round, etc., the proof is calculated, and the verifier uses the function to verify after receiving the random number.

Therefore, the random number generated in this way, the node can easily verify whether it conforms to the algorithm,"Verifiable"This is done; and after a sufficiently complex algorithm, coupled with the hash digest process, a sufficiently randomly distributed result is obtained, and "Random" is guaranteed.

VRF is mainly used in conjunction with the PoS mechanism to randomly select block producers (groups), and directly determine the consistent selection of blocks in individual projects. The random number generation process can be run completely off the chain of a node, and it can also be run entirely on the chain.

advantage:

advantage:

  • Low computing power requirements, high efficiency in generating random numbers;

  • Generate unique and deterministic random numbers, which are not prone to bifurcation;

  • Verification can lag behind random number generation, suitable for secret elections;

  • shortcoming:

shortcoming:

  • There are many verification steps, and multiple verifications are required under secret elections;

  • The uniformity of random number distribution is not good, because it is calculated according to the characteristic key;

  • The bandwidth usage is high and the delay is long, which is to allow the secretly elected nodes to confirm each other, and the BLS project is not a big problem;

Algorand

Each node obtains the random seed on the previous round of confirmation blocks, sets the round time, etc., runs the VRF function separately under the chain, and then performs a hash summary on the result, and the node compares the result with the hash value range in the network (Comparatively similar to PoW hash comparison), members within the range are eligible to participate in verification and block generation.

Because VRF is running at the network boundary, that is, the part of the node chain, it is unknown to the entire network which nodes are selected. The selected nodes do not know which nodes are also selected, and random numbers and proofs need to be sent to each other. And collect verification to judge the identity of other nodes, so this process is called "secret election".

As for the selection of the block, it is determined according to the "qualification" of the random number of the block-producing node. At the same time, the node needs to vote for other blocks. This process is an improved Byzantine fault-tolerant protocol, and voting requires two rounds , each round requires a "secret election".

advantage:

advantage:

  • "Secret election", it is impossible to know all the selected nodes at the first time, so it is impossible to collude or attack at a fixed point in advance;

  • shortcoming:

shortcoming:

  • It takes a long time for the selected nodes to confirm each other's identities and traverse the entire network. The bandwidth usage rate is too high, which prevents ordinary people from participating in the consensus and affects the degree of decentralization;

  • Since the size of the verification group is distributed according to the probability range, enough nodes cannot be selected when the network size is small, and when the network size is large, the verification group is too large and the performance is too low;

secondary title

DFINITY

The first time a node pledges to join the network, it will randomly generate a "random number key" exclusive to each node according to the distributed key generation protocol.

This distributed key agreement will generate a total key pair that meets the requirements, and according to the size of the committee group specified by the system, the total key pair will be divided and distributed to the nodes in the group. Nodes are given a private signing key, but no one knows what the master key is.

The allocation of groups is random, and a node can exist in multiple groups at the same time, but the members of the group are fixed and known in advance, which is different from Algorand's "secret election" - DFINITY is "a priori", Algorand is "posteriori".

DFINITY's random number generation process is completely on-chain, and you cannot control the random number process. The random number of the last round will select the committee group for this round from multiple groups, and the whole process is open. The committee group will fulfill the obligations of block generation and notarization. During the notarization process, nodes will continue to look for the block with the highest "weight". This "weight" is also determined by the random number of the previous round. When a block is found, it will be used. The "random number key" is signed and broadcast.

And there is a program running on the chain, called a random number beacon, which is constantly accepting the signature data of the current round. When the signature of a block exceeds 50% of the number of nodes in the verification group, it will send the signature Aggregate to generate a uniquely confirmed random number. The appearance of this random number heralds the end of this round of consensus and will continue to select participants for the next round of consensus.

advantage:

advantage:

  • In the stage of consensus and random number generation, the nodes do not need to interact, they are executed automatically, no Byzantine fault tolerance is required, the efficiency is high, and the bandwidth usage is low;

  • The random number generation process is inoperable and non-interactive, so it cannot be influenced, and the random number is more fair;

  • The threshold group has been determined and does not change with the consensus, which is naturally suitable for sharding, so the scalability is very good;

  • shortcoming:

shortcoming:

  • Posterior, the team members have been determined before the consensus, there may be the possibility of collusion;

  • Driven entirely by random numbers, although colluding nodes cannot control the blockchain, they can stop the system from shutting down;

  • The academic foundation of the BLS algorithm is sufficient, but there is a lack of engineering implementation and no open source cases.

Cardano

This project is the first implementation of the famous Ouroboros algorithm. The consensus phase is cycled by epoch. There are multiple slots in each round, and each slot will generate a block. If A slot can be skipped if something goes wrong. And who should generate blocks for each slot needs to use the random number seed given by the genesis block of this round, the mortgage rights of nodes eligible to participate in the consensus, the node public key, and the list of slots ( index) and other data, gather the public keys announced by the nodes, and calculate the final result. The number determines which nodes (slot leaders) can produce blocks.

Therefore, the key to Cardano is that in each era, the origin block contains the random number that determines the block producer. In fact, in the nodes under this state machine, the block generation of this era has already been determined at the beginning of the origin block generation. Who is the winner? This selection algorithm needs to be calculated for a while. The process of calculating the block producer can be regarded as the post-verification process of VRF, so Cardano can also be regarded as a "secret election".

The key lies in the generation of random numbers in the genesis block, which needs to be random enough to be recognized by nodes. Cardano uses the process of committing to reveal the secret to let each node upload their own random numbers generated from the machine state. After the secret is revealed, these random numbers are combined to generate a random string. This is the process of generating random numbers.

The process of revealing the secret is more complicated. First, before the slot block generation stage, all nodes selected according to the equity (according to the mortgage size) will secretly calculate a random string, and then perform complex encryption on this string, and the encrypted The cipher text is broadcast to the entire network (not broadcasting is considered abstention in the next era), and promises to explain the secret. Then, in the process of slot block generation, the node will disclose a revealing data Open, which contains the decryption and the original random number, which can be used to verify the random number of the node. In this way, it can be guaranteed that before the random number is generated, the node has already prepared the random number and has not modified it, instead of generating it after receiving other people's secrets, because that can indirectly interfere with the generation of random numbers, thereby indirectly controlling the area blockchain.

But there is a problem. Nodes can not send Open and random numbers during the revealing stage. This may cause the mobile phone to not have enough node random numbers, the genesis block cannot be generated, and eventually the blockchain will go down. In order to resist this risk, Cardano stipulates that nodes must divide Open into several shares according to the algorithm at the stage of sending encrypted ciphertext, and use the public key of the block producer of this era to encrypt. Once a node runs away, the block producing node can use its own private key to decrypt the fragments of Open, and then put together Open.

And this algorithm for splitting Open is called verifiable secret sharing VSS (verifiable secret sharing). After the Open is divided, it is impossible to see the whole picture of the Open by eating a fragment, so this random number cannot be obtained in advance even by the block producer. At the same time, the advantages of this VSS algorithm are similar to the BLS algorithm used by DFINITY. It does not need to put all the fragments together, and only part of the fragments can restore the complete Open.

advantage:

advantage:

  • Good randomness and high security;

  • shortcoming:

shortcoming:

  • The verification group is determined first and sorted according to the size of the stake, so the decentralization is slightly worse;

  • Collusion between nodes is possible;

  • first level title

VDF+Randao

As mentioned in Hash Collision, due to the transparency of block data, it is difficult for random numbers under the PoW mechanism to be used as seeds in upper-layer contracts. This problem also exists in Ethereum 1.0. Therefore, some people consider using a decentralized organization to collect seeds in a distributed manner, so as to realize a safe and unpredictable random number on the contract. This is Randao. As mentioned above, although its economic design limits collusion, there are still risks of "last participant attack" and "amplification attack".

Last participant attack: The last person participating in Randao can predict some future random numbers, so he can choose not to disclose the secret and interfere with it.

Amplification attack: Execute the last participant attack multiple times to amplify the disturbance to the random number, thereby briefly controlling the organization.

In the recently announced Ethereum 2.0 scheme, for performance considerations, PoS needs to be used instead of PoW, so random numbers are needed to select block qualifications. The random number runs on a single chain called the "beacon chain". The random number generated by it is used to randomly control the composition of the shards, thereby realizing multi-shard parallelism and expanding the performance of Ethereum. Therefore, random numbers have become an important basis for the upgrade of Ethereum, which involves the fairness of sharding and the possibility of final confirmation.

The sharding scheme of PoW is too complicated, so the sharding of Ethereum 2.0 is based on PoS, and the same is true for random numbers. It is no longer possible to use hash collisions to obtain random numbers at the consensus layer, so Ethereum 2.0 borrows ideas from Randao , but in order to solve the two problems mentioned above, a verifiable delay function VDF (Verifiable Delay Function) is introduced, so that the random numbers generated by Randao need to go through a long period of time to get the result, which is relatively difficult to calculate , and must wait for multiple blocks, it runs serially, so that when the secret is transmitted in Randao at the beginning, the attacker cannot predict the impact of eating his own secret on the final state-of-the-art technology, and cannot launch the above attack.

Let me introduce Randao first. Nodes on Ethereum 2.0 are eligible to join Randao by mortgaging 32 ETH. Every time they get a random number seed, the nodes participating in Randao will silently read a secret number in their hearts, and then encrypt it and broadcast it. After everyone has obtained other people's ciphertext, the nodes will disclose the secret one after another, and the random numbers are combined in a certain way, which can be used as a random number seed to generate the final random number. This is like playing bridge. First, the cards are dealt face down to the players, and then they are revealed one by one when the cards are opened. In order to prevent the last participant from predicting the result and exerting interference, VDF is chosen to generate the final random number.

And this VDF has the following characteristics:

1. The calculation is difficult and the result is delayed. The calculation process of VDF is relatively complicated, and it takes a long time to get the result, so it is very difficult to quickly predict the result if the delay time is set long enough;

2. Simple verification. The process of running a VDF can be difficult and time-consuming, but the results must be checked quickly;

3. Seriality. The calculation of VDF is performed step by step. To calculate the final result, it must be done step by step. Parallel acceleration cannot be performed. Even if the attacker holds multiple machines, he cannot predict the result in a short time.

For example, after taking a long integer to the 10th power continuously, find the remainder of a certain large prime number, and then perform the above steps hundreds of times continuously, this is a function of VDF. Therefore, the algorithm for calculating the power is difficult, and under multiple cycles, the previous result must be known before the next calculation can be performed. Therefore, the final result of this algorithm cannot be calculated quickly, and it must be done step by step honestly. carried out. At the same time, due to the human computer architecture, even if you are very different, the difficulty of this calculation will not change exponentially, and it is still very difficult to solve. This unpredictability ensures the security of random numbers.

In fact, the above part is just an algorithm for calculating the result. VDF should also include an algorithm for verifying the result. The result of VDF is not only a random number, but also a verification of the random number, which can be quickly verified by any node.

advantage:

advantage:

  • Currently the most difficult to predict pseudo-random number on the blockchain, the strength of the random number is very high, but weaker than POW;

  • Random numbers strictly follow the serial order, anti-parallel accelerated operation;

  • shortcoming:

shortcoming:

  • The VDF calculation process is very difficult, the efficiency of obtaining random numbers is very low, and there is a large waste of calculation performance;

  • Although the calculation is very difficult, it is not as resistant to quantum computing as PoW, and it may be broken;

  • Summarize

Summarize

Statistician Francis Galton wrote in Nature in 1890: "As an instrument for selecting randomness, I find nothing superior to dice."

Human beings are constantly trying to summarize randomness with laws, but new randomness will appear under new laws. We are surprised by the disorder and randomness of the quantum world, which stimulates our curiosity to explore. It can be said that the understanding of random numbers is the history of human progress. The pursuit of random numbers is not only an appeal to science, but also a refutation of fatalism.

At the same time, we have created a binary virtual world. We want to simulate and copy our universe as much as possible in this world, and also obtain the wonderful and mysterious random numbers in nature. This is of great significance—with the machine True random numbers, can we give life to the virtual world? Can we simulate a "free will"? Is it possible for us human beings to survive in virtualization?

The randomness in the universe is the greatest fairness to life, and the random numbers on the blockchain will become the most intuitive and fundamental pursuit of fairness for human beings. The past has been decided, the present is happening, the future is random in general, and so should the world on the blockchain.

A random number that runs on the 01 machine and can be recognized by everything in the world has never been so charming before the blockchain.

Dfinity
Algorand
PoS
Welcome to Join Odaily Official Community