Editor's Note: This article comes from Editor's Note: This article comes from(ID:secbitlabs)Amby Laboratories
Due to the mistaken use of a certain zkSNARKs contract library in a large number of zero-knowledge proof projects, the "Input Aliasing" vulnerability is introduced, which can lead to forged certificates, double spending, replay and other attacks, and the attack cost is extremely low. Many open source projects in the Ethereum community are affected, including the three most commonly used zkSNARKs zero-knowledge development libraries snarkjs, ethsnarks, and ZoKrates, as well as three recently popular currency mixing (anonymous transfer) applications hopper, Heiswap, and Miximus. This is a murder caused by a piece of code posted by Chris, the father of the Solidity language, two years ago.
secondary title
Double Spending Vulnerabilities: Initial Exposures
Semaphore is an anonymous semaphore system using zero-knowledge proof technology, which evolved from the previous coin-mixing project of the famous developer barryWhiteHat.
The Russian developer poma first pointed out that the project may have a double-spend vulnerability [1].
The problem is in line 83 code [2], please look carefully.
This function requires the caller to construct a zero-knowledge proof that he can withdraw money from the contract. To prevent "double spending" from happening, the function also reads the "discard list" to check whether a specified element of the certificate is marked. If the certificate is in the abandoned list, the contract determines that the verification fails, and the caller cannot withdraw the money. The developer believes that in this way, the same certificate cannot be repeatedly submitted for profit, and believes that this can effectively prevent double-spending or replay attacks.
This problem can be traced back to 2017. Christian Reitwiessner, the inventor of the Solidity language, provided an example of zkSNARKs contract cryptography implementation [3]. Since then, almost all contracts using zkSNARKs technology on Ethereum have followed this implementation. Therefore, it is possible to be attacked by the following processes.
secondary title
Currency Mixing Application: The hardest hit area of this security issue
The earliest and most widespread application scenarios of zero-knowledge proof technology on Ethereum are currency mixing contracts, or anonymous transfers and private transactions. Since Ethereum itself does not support anonymous transactions, and the community's calls for privacy protection are getting stronger and stronger, so many popular projects have emerged. Here, taking the application scenario of the mixed currency contract as an example, we will introduce the security threat of the "input pseudonym" vulnerability to zero-knowledge projects.
Mixing contracts or anonymous transfers involves two main points:
1. Prove that you have money
2. Prove that the money has not been spent
For ease of understanding, here is a brief description of the process:
1. A will spend a sum of money.
2. A needs to prove that he owns the money. A presents a zkproof to prove that he knows the preimage of a hash (HashA), and this hash is on the leaf of the tree marked by root, and proves that another hash of this preimage is HashB. Among them, HashA is witness, and HashB is public statement. Since A does not need to expose HashA, it is anonymous.
3. The contract verifies zkproof and checks whether HashB is in the discard list. If not, it means that the money has not been spent and can be spent (this call of A is allowed).
4. If it can be spent, the contract needs to put HashB into the discard list, indicating that the money represented by HashB has been spent and cannot be spent again.
The core logic of many zkSNARKs contracts, especially the mixed currency contracts, is similar to lines 82 and 83, so they all have the same security issues and can be attacked by exploiting the "input pseudonym" vulnerability.
secondary title
Vulnerability Analysis: How to anonymously spend 5 times a sum of money?
The function of the above verifyProof(a, b, c, input) function is to perform calculation and verification on the elliptic curve according to the input value. The core uses a function named scalar_mul() to realize the scalar multiplication on the elliptic curve [4 ].
We know that Ethereum has built-in multiple pre-compiled contracts to perform cryptographic operations on the elliptic curve and reduce the gas consumption of zkSNARKs verification on the chain. The implementation of the function scalar_mul() calls the Ethereum precompiled No. 7 contract, and realizes the scalar multiplication on the elliptic curve alt_bn128 according to EIP 196 [5]. The figure below shows the definition of this operation in the Yellow Paper, which we often call ECMUL or ecc_mul.
In cryptography, the value domain of {x, y} of elliptic curve is a finite field based on mod p, and this finite field is called Zp or Fp. That is, a point {x,y} on an elliptic curve where x,y is the value in Fp. Some points on an elliptic curve form a larger cyclic group, and the number of these points is called the order of the group, denoted as q. Encryption based on elliptic curves is carried out in this cyclic group. If the order (q) of this cyclic group is a prime number, then encryption can be performed in a finite field mod q, which is denoted as Fq.
Generally, a larger cyclic group is selected as the basis of encryption calculation. In the cyclic group, randomly select a non-infinity point as the generator G (usually the order q of this group is a large prime number, so any non-zero point is equivalent), and all other points can pass through G+ G+... Generated. The number of elements in this group is q, that is, there are q points in total, then we can use 0,1,2,3,....q-1 to number each point. The 0th point here is the point at infinity, and the point 1 is the G mentioned just now, also called the base point. Point 2 is G+G, point 3 is G+G+G.
So when we want to represent a point, we have two ways. The first is to give the coordinates {x,y} of this point, where x,y belong to Fp. The second way is to give it in the form of n*G. Since G is public, it is only necessary to give n. n belongs to Fq.
Take a look at the scalar_mul(G1Point point, uint s) function signature, use point as the generator, calculate point+point+.....+point, and add a total of n points. This belongs to using the second method above to represent a point in a cyclic group.
In the implementation of Solidity smart contracts, the uint256 type needs to be used to encode Fq, but the maximum value of the uint256 type is greater than the value of q, then there will be such a situation: multiple numbers in uint256 will correspond to the same Fq after mod operation value in . For example, s and s + q represent the same point, that is, the sth point. This is because point q is actually equivalent to point 0 in the cyclic group (each point corresponds to 0, 1, 2, 3, ... q-1). Similarly, s + 2q etc. all correspond to point s. We call the phenomenon that multiple large integers can be input that correspond to the value in the same Fq as "input pseudonyms", that is, these numbers are pseudonyms for each other.
The elliptic curve implemented by Ethereum 7 contract is y^2 = ax^3+bx+c. p and q are as follows respectively.
The q value here is the order of the group mentioned above. Then in the range of uint256 type, there are total uint256_max / q, which means that there will be at most 5 integers representing the same point (5 "input aliases").
Therefore, when the user presents a zero-knowledge proof to the contract for withdrawal, the contract will put input[1] (that is, a certain s) into the invalidation list. The user (or other attacker) can also submit the attestation again with another 4 values. These 4 values have not been included in the "abandoned list" before, so the "forgery" proof can pass the verification smoothly, and a sum of money can be spent 5 times using 5 "input pseudonyms", and the attack cost is very low !
secondary title
And many more affected projects
Among them, several well-known zkSNARKs library or framework projects have the greatest impact, including snarkjs, ethsnarks, ZoKrates, etc. Many application projects will directly quote or refer to their code for development, thus burying security risks. As a result, the three aforementioned items were quickly updated with security fixes. In addition, a number of well-known currency mixing projects using zkSNARKs technology, such as hopper, Heiswap, and Miximus, also immediately carried out synchronous repairs. These projects are very popular in the community, among which Heiswap is called "Vitalik's favorite project".
secondary title
Solution to "Input Pseudonym" Vulnerability
Fortunately, the fix is simple. It is only necessary to add a check on the size of the input parameter in the validation function, forcing the input value to be smaller than the q value mentioned above. That is, it is strictly forbidden to "enter a pseudonym", and to prevent the use of multiple numbers to represent the same point.
secondary title
Exposed deep-seated problems are worth reflecting on
The security loopholes caused by this "input pseudonym" deserve serious reflection by the community. Let's review the whole story again. In 2017, Christian posted his own zkSNARKs contract calculation implementation on the Gist website. As a computing library, we can think that its implementation has no security issues, does not violate any common sense of cryptography, and perfectly completes the proof verification work in the contract. In fact, as the inventor of the Solidity language, Christian certainly won't make any low-level mistakes here. Today, two years later, this code has caused such a security storm. For more than two years, countless peers and experts may have seen or used this code with only more than two hundred lines, but they did not find any problems.
Where is the core problem? It may be that there is a deviation in the understanding of the program interface between the implementer of the underlying library and the user of the library. In other words: the implementers of the underlying library lack consideration for the improper use of application developers; while the upper-level application developers do not have a deep understanding of the underlying implementation principles and precautions during use, and make wrong security assumptions.
The "Input Pseudonym" vulnerability can't help but remind us of the "Integer Overflow" vulnerability that was frequently exposed before. There are quite a lot of similarities between the two: both stem from the wrong assumptions of a large number of developers; both are related to the uint256 type in Solidity; both have a wide impact; there are also many hidden tutorial codes or library contracts circulating on the Internet. But it is clear that the "input pseudonym" vulnerability is obviously more difficult to detect, the latency is longer, and more background knowledge is required (involving complex elliptic curves and cryptography theories). SECBIT Lab believes that with the rise of zkSNARKs, zero-knowledge proof applications, and privacy technologies, more new applications will emerge in the community, and more hidden security threats may be further exposed. I hope that in this wave of new technologies, the community can fully absorb the painful lessons of the past and pay attention to security issues.
references
