Editors Note: This article comes fromNPC source planNPC source plan
"Randomness is something beyond human beings."
—— The Big Qiu
Preamble:
, Author: Qiu Feiyang, reproduced by Odaily with authorization.
secondary title
Preamble:
Randomness is uncertainty, and it is human nature to rely on certainty. The program is inherently deterministic, but after introducing the uncertain factor of random numbers, it brings more possibilities.
At the application layer, random numbers determine whether the rules of the game are fair or not. After repeated losses due to random number vulnerabilities in various EOS Dapps, more and more attention has been paid to them. At the protocol layer, random numbers can improve overall performance and determine whether a chain is trustworthy. , for projects such as Dfinity and Algorand that use random numbers as their selling point, it even determines their life and death.
——Ryan
about the author:
Not long ago, the first prize of the National Blockchain Competition was DnaRand a fair decentralized random number infrastructure, which realized the provision of decentralized, unmanipulable and unpredictable random numbers when no more than half of the malicious nodes . Its designer is the author of this article.This article is the first Overview and Construction of the NPC random number topic, enjoy.about the author:RandomnessThe acquisition of data is a very important topic in the blockchain. The acquisition of randomness here includes but is not limited to: how to introduce unpredictable random numbers in smart contracts; how to safely draw lots in consensus protocols. Clearly, the problem description above for randomness acquisition already shows why this topic is important. while inIt is very difficult to obtain random numbers in the blockchain,thisOn the one hand, it comes from the transparency of the blockchain system
——In a general sense, this feature will make the input, output and algorithm itself of all algorithms exposed to all system participants——Therefore, the pseudo-random number generator widely used in cryptography cannot be directly The way of encoding or the way of smart contract code is applied in the blockchain system to obtain secure random numbers. Because of the transparency, system participants can predict random numbers or even manipulate random numbers according to the code, so that the random source is not random . On the other hand, as a sub-protocol of the blockchain system, the random number acquisition protocol often has a close relationship with other protocols under the system, such as the consensus protocol, which means
As a result, the design of the random number acquisition protocol becomes very complicated, and specific analysis of specific issues is often required.
This article summarizes the current unpredictable random number acquisition protocols mainly applied in the blockchain, and extracts their design ideas, methodologies and dependent assumptions, and then compares them. This article is divided into two parts: the first part introduces the basic concepts, and explains from scratch the core of the random number protocol that is suitable for distributed systems; the second part introduces the current mainstream random number protocols used in blockchain projects, and analyzes How do they use one or several types of protocol cores introduced in the first part.
This article assumes that the reader already has basic blockchain knowledge, and has a general understanding of the fundamentals of Ethereum smart contracts and the fundamentals of the Bitcoin consensus protocol. (If you dont understand, you can choose to click the upper left corner of the screen)
secondary title
Definition of randomness
In daily life, we often hear words such as random selection, pseudo-random number, random model, random sequence, and words such as pseudo-random number and true random number. concept. To understand these words and concepts, it is necessary to figure out what randomness is. In fact, the opposite of random is deterministic. Therefore, we can intuitively understand random as uncertain-whether it is a random number or a random selection, we all hope that the result of this number or selection is uncertain to some extent. Therefore, if a number is given directly without giving the method of generating the number, it cannot be called a random number. For example, if a number 1 is given directly, we cannot say that 1 is a random number, but if this 1 is passed If it is determined by rolling a dice, it can be said that this 1 is random. Of course, these are very intuitive and broad understandings. More precisely, random, or we say random number, random sequence, has different definitions in different fields. In mathematics, the definition of random numbers is related to probability theory; in computational complexity theory, it is defined by the length of the program describing the random sequence (that is, the more random the sequence, the longer the length of the program describing it); the cryptography society combines statistics properties and cryptographic attacks to describe random numbers.
We first give a descriptive definition of a random sequence here. We can call a sequence a random sequence when it satisfies:
Uniformity: The sequence obeys a uniform distribution.
Independence: The elements of the sequence are independent of each other.
Unpredictability: Based on any segment of the sequence, the remainder of the sequence cannot be predicted.
To expand, we can first consider the following questions:
1. Consider an array in which each item is either 1 or 0. Assuming each of its entries is 1, it is clearly not a random sequence because uniformity is violated. Uniformity requires that 0 and 1 occur with the same probability.
2. Assume that each of its items is not equal to the previous item. For example, 0101010101, which satisfies uniformity, but is still not a random sequence because independence is violated.
We said that different fields have different definitions of randomness. For example, in the simulation we want to simulate the interval time between customer arrivals, it is sufficient to use a random sequence that only satisfies the first two. But in cryptography, such as generating a random key, it is not enough to satisfy the first two random sequences. Of course, a random sequence that may be predicted is used in key generation. Of course, there is a security problem. For example, we use the number of base e of natural logarithm as a random sequence. e can indeed be considered to be statistically uniform and independent (although not fully proven), which is sufficient for simulations, but cannot be used as a random seed in cryptography. Because the opponent may guess that e is being used through a known sequence of a certain length. Similarly, random numbers used in lottery draws and games are usually unpredictable. Given that the blockchain field mostly involves cryptography and game scenarios, the following content is a random sequence that satisfies all three properties.
Random sequences can be divided into True Random sequences and Pseudorandom sequences. Pseudo-random sequences, as the name suggests, are not really random, just seem random. Therefore, according to the concept proposed by Turing Award winner Yao Qizhi, roughly speaking, a pseudo-random sequence refers to a sequence that is computationally indistinguishable from a true random sequence. A true random sequence refers to a random sequence that cannot be reproduced, such as a random sequence generated by flipping a coin. We can see that under this definition, the statistical properties of the pseudo-random sequence should be indistinguishable from the true random sequence, that is to say, the pseudo-random sequence is also a random sequence that satisfies all three properties. Of course, such a definition is usually used in computational complexity theory and cryptography. In other fields, a random sequence that only satisfies the first two properties can also be called a pseudo-random sequence.
A generator that generates random sequences is called a Random Number Generator (RNG). According to the nature of the generated sequence, we can divide it into True Random Number Generator (TRNG) and Pseudorandom Number Generator (PRNG). In addition, there is another orthogonal classification method to classify from the implementation method of random number generators. Random number generators can be divided into hardware random number generators and software random number generators. The relationship between them is as follows: As shown in Figure 1:
(figure 1)
True random number generators usually take advantage of some non-deterministic phenomena and convert them into truly random sequences by physical means. Common non-deterministic phenomena include chaotic effects and quantum stochastic processes. The chaotic effect is characterized by the fact that physics can clearly explain its cause and effect, but the result cannot be accurately predicted because the result is too sensitive to the initial value. For example, random numbers generated by collecting atmospheric noise are examples of random numbers generated using the chaos effect. The quantum random process uses the uncertainty of the microscopic quantum state. This uncertainty has been recognized by the current physical theory. It can guarantee that even if the input values are exactly the same, the output values may be completely different. For example, the phase noise of a laser is used to generate random numbers. Hardware true random number generators are usually implemented using chips; software true random number generators usually use some non-deterministic phenomena that come with the system, such as hard disk seek time, RAM content, or user input. The /dev/random is a software true random number generator, which obtains sufficient randomness sources by collecting hardware noise data during machine operation, and generates random numbers accordingly.
The pseudo-random number generator is a program, which is a deterministic algorithm. It usually takes short true random numbers as input and expands to generate longer random number sequences that are very close to true random sequences. Its input is called a seed (Seed). It also has hardware and software implementations.
secondary title
How to get random numbers from the blockchain
In the previous section we gave a definition of randomness, which will also be used in the rest of this paper. And, we also give the concept of pseudo-randomness and pseudo-random number generator. In practical applications, there are many pseudo-random number generators that can be used in cryptography and they are already very mature, so we naturally think whether the pseudo-random number generator can be used directly in the blockchain, in the blockchain Randomness is added to the consensus process or application, so that such randomness satisfies the three properties we mentioned above?
In addition to pseudo-random number generators and true random number generators, there is another type of random number generator that directly utilizes the randomness naturally generated by the consensus process in the blockchain system. For example, use the Hash value of a future block or a previous block as one of the seeds to generate random numbers (other seeds can be the users address, the amount of Ethereum paid by the user, etc., but these are user-controllable parts, have no effect of increasing unpredictability). This practice is also common in various blockchain gambling games and fund board games, but such a random number acquisition process has a fatal loophole-users may control the random number in a direction that is beneficial to them by carefully choosing the transaction time. Direction generation; even if users cannot control it, miners can control the generation of random numbers, and such an attack does not require much computing power to participate. As long as the amount involved in the final random number is sufficient, attacks can be carried out by renting computing power or bribing miners.
So in the final analysis, where can randomness come from in a system like blockchain? That is to say, through the above analysis, we found that for the random number generator as above, no matter how complicated the rules or programs are designed, it is a deterministic algorithm. For a deterministic algorithm, the algorithm itself will not have any influence on the randomness of the output, only the input of the algorithm can affect the randomness of the final output. Therefore, in the blockchain system, we need to carefully select an input with sufficient randomness in a distributed, open and transparent environment.
So does the required input exist? The answer is yes, such input actually comes from our assumption that the participants in the blockchain system are not a whole.
Random Number Generation Protocol Model
We now start from the simplest case to gradually construct a fair random number generator that can be used on the blockchain. The protocols in the distributed environment mentioned below can be converted to the blockchain environment, so there is no distinction between distributed and blockchain.
image description
Next, we consider such a scenario: Alice and Bob pooled money online to buy a lottery ticket together, and they won the mysterious first prize. To their surprise, the prize turned out to be a Pikachu, as shown in Figure 3. But Pikachu is inseparable, and since the two are too far apart to meet and guess punches, the two of them decide to design a random number generation protocol that is fair to both of them to determine who gets the Pikachu.
(image 3)
secondary title
v1.0 The simplest random number generation protocol
(Figure 4)
So how does such an agreement work? In order to construct such a protocol, we need to determine what properties such a protocol needs to satisfy. Considering that the input between each person is independent of each other, such a protocol needs to ensure that each persons own input should also be independent of the output, but they jointly make a certain contribution to the output. Only in this way can it be ensured that no one can change the output simply by changing their choices. At the same time, the protocol also needs to ensure that as long as one persons input is evenly distributed, the result is evenly distributed. In reality, there are many construction methods that satisfy these conditions, one of which is the XOR operation, which XORs the input of the two people and then outputs: given that Bob chooses 1 (Alice does not know), Alice chooses 0 or 1 , the output results are both 0 and 1 with half the possibility; given that Bob chooses 0, the same is true.
secondary title
v2.0: version with promises
v1.0 seems to solve our problem, in fact it has a very large hole. The loophole is that we cannot guarantee that Alice and Bob will input simultaneously. If Bob waits for Alice to enter her choice into the protocol before making her choice, then since the interaction of the protocol is public to both, Bob can adjust his choice according to Alices choice. For example, if Alices choice is 0, Bob enters 0; if Alices choice is 1, he enters 1. In this way, no matter what Alice chooses, Bob can make the XOR result always be 1, and he can take this Pikachu away.
image description
(Figure 5)
In fact, simultaneous input is difficult to guarantee, and in order to prevent this kind of cheating, we need to ensure that the input from other people in the protocol should be temporarily confidential to the participants and will not reveal any of their choices information. Correspondingly, there should be an additional process of de-confidentialization to calculate the output of the protocol. In order to achieve such a requirement, we need to introduce a new mechanism: Commitment.
Such an agreement ensures that in the first stage, no ones choice will be known to anyone other than himself, and in the second stage, even if Bob first knows the choice value revealed by Alice and calculates it before revealing himself After the result, he cant change his choice, because the signature of the first stage has already made a commitment. Here, the digital signature can guarantee the immutability, non-repudiation and temporary confidentiality of the message. If the protocol is running on the blockchain, since the blockchain protocol usually digitally signs the transaction content, our protocol can also change the use of digital signatures to use Hash functions.The v2.0 version looks pretty fair for two people, but its still buggy for more than two people. Assume that Alice, Bob, and Clare share a Pikachu, other settings remain unchanged, and the v2.0 protocol is adopted. At this time, Bob came up with an idea: In the final second stage, I can calculate the output based on other peoples revealing results before inputting my own choice for revealing. If it is not an output that is beneficial to me, I will not reveal stage, pretending that the network cable was cut.”
The protocol just now cannot handle this situation. Do you do it all over again, or just take the input of the remaining two people? Both of these methods have problems: if the protocol is re-run, the attacker can use this re-run mechanism to force the protocol to re-run every time it is unfavorable; if only two people are left The attacker can also use this mechanism to choose whether to give up the input to seek advantages and avoid disadvantages. Therefore, we need a mechanism to ensure that participants will not give up at will, and the easiest way is to use economic penalties. As shown in Figure 6, when the participants commit in the first phase, they must lock a bitcoin (or other digital currency) to the agreement—if it is in the blockchain environment with smart contracts, This operation is very easy. If Bob doesnt reveal his options on time
image description
(Figure 6)
secondary title
In addition to economic punishment, there is another way, which we call the Threshold mechanism. The threshold mechanism refers to a mechanism in which a specific process can be executed when a certain indicator of the protocol reaches a certain threshold. Here we introduce the threshold mechanism, mainly to enable the normal output to be given when the number of participants in the protocol is missing. The role of the threshold mechanism is to enhance the robustness of the protocol, so that it can tolerate a certain degree of denial of service attacks. The threshold mechanism we will discuss next is the (t, n) threshold mechanism, which means that for a protocol with n participants, only the input of t participants is needed to complete the output of the protocol. Note that n here is not necessarily a protocol It is pre-specified, or must be known by the protocol, and it may also be an indeterminate number. If the protocol must specify the exact n to ensure correctness, then such a protocol can only be used in a permissioned environment (Permissioned Setting); if it does not need to specify an exact n, then such a protocol can be used in a non-permissioned environment ( Permissionless Setting). As shown in Figure 7 and Figure 8, we use the most basic v1.0 version of the Coin-Tossing protocol, and use the output of the threshold mechanism as the input of the Coin-Tossing protocol. Generally speaking, there are three kinds of such threshold mechanisms.
image description
image description
witch attack
(Figure 8)
witch attack
The second threshold mechanism is of the Distributor-less Secret Sharing series. This series of protocols requires a secret sharing every time a random number output is generated to ensure the threshold mechanism, which belongs to the stateful (Stateful) protocol. More intuitively, for example, there are n people, assuming that only t people can submit the random number we want, and at the same time, we need r such random numbers. Then, if we adopt the threshold mechanism of secret sharing series without distributor, we need these n people to interact with each other for at least r rounds. Another limitation is that the scheme needs to be implemented in a permissive environment, that is to say, the protocol must know the total number n in order to determine a reasonable threshold t.
image description
(Figure 10)
references
[1] Yao, Andrew C. "Theory and application of trapdoor functions." Foundations of Computer Science, 1982. SFCS08. 23rd Annual Symposium on. IEEE, 1982.
[2] Boneh, Dan, et al. "Verifiable delay functions." Annual International Cryptology Conference. Springer, Cham, 2018.
——————
About NPCs
The third method is Distributed Key Generation + Threshold Signature. Threshold signature can be understood as the application of secret sharing to digital signature schemes, but it is not simply a superposition of the two. The usual digital signature scheme is that after a person encrypts a message with his own private key to obtain a signature, the signature can be verified by public parameters such as the public key. The threshold signature scheme used in this scheme also has a pair of public and private keys, but each participant only has one fragment of the total private key and the corresponding public key fragment; these private key fragments can be combined to recover the complete private key. The same is true for the public key fragments; everyone can use their own private key fragments to sign to obtain signature fragments, and these signature fragments can be verified by the corresponding fragments of the public key; and any t of these signature fragments can be combined to calculate a total Signature, the total signature is equivalent to signing with the total private key, so it can also be verified by the total public key. Therefore, such a signature process does not require everyone to participate, and only needs the valid signatures of t people out of n people to complete the signature process. And no matter which t individuals participate in the signature, the final signature is the same. And the private key involved in the signing process will not be leaked, and everyone only shares the public key fragment and their own signature results, which makes multiple rounds of non-interactive signatures possible. Of course, the total public-private key pair and the corresponding fragments are not selected arbitrarily. Its generation requires all n people to run the distributed key generation protocol when the protocol is run for the first time to generate a public-private key pair with such cryptographic characteristics. and its fragments. Therefore, this scheme also has the problem that the protocol must know the total number of people n.