Guo Yu from Anbi Lab: When Deep Neural Networks Meet Zero-Knowledge Proofs
Editor's Note: This article comes from Wanxiang Blockchain (ID: gh_1b8639a25429), reprinted by Odaily with authorization.
This article is based on the twenty-ninth online open course of Wanxiang Blockchain Hive Academy, for which Guo Yu, founder of SECBIT Lab, was invited to share "When Deep Neural Networks Meet Zero-Knowledge Proofs".
Today, I will share with you "When Deep Neural Networks Meet Zero-Knowledge Proofs".
Zero-knowledge proofs
The concept of "zero-knowledge proof" is no stranger to anyone working on the underlying technical architecture of blockchains. It was proposed by Goldwasser, Micali and Rackoff in 1985 (as shown in the figure). Although zero-knowledge proofs were long confined to a small corner of computational theory research, their impact has been far-reaching.
There is the word "proof" in zero-knowledge proof, and this concept has undergone many paradigm shifts over the history of human civilization. In ancient Greece, a "proof" was an ingenious mathematical argument; by the beginning of the 20th century, mathematical proof was reexamined during the third mathematical crisis, formal logic was introduced to formalize it, and this in turn gave birth to the theory of computation and to computers.
By the 1970s, the wonderful correspondence between programs and proofs had been discovered, introducing many concepts of modern functional programming and automated theorem proving; by the 1980s, the concept of "proof" was extended to interactive systems. Since the 1980s, interactive proof systems have brought revolutionary new insights, both in the philosophical sense and in theory and technology.
When I first encountered the concept of zero-knowledge proof, my first reaction was that it is very counter-intuitive. Why "counter-intuitive"? Let me first briefly introduce the basic framework of zero-knowledge proofs.
In a zero-knowledge proof there is a person on the right, whom we call Bob, and on the left there may be an untrusted machine. Suppose Bob has a computation he wants to hand over to the machine on the left to run, but the machine itself may be controlled by someone else. Bob takes a function F() as the computation task, supplies the input X, and hands it to the machine to complete. Because the computation happens on the machine on the left and Bob never witnesses the process, why should Bob believe that the result satisfies Y = F(X, W)? Moreover, some secret data W known only to the machine is mixed into the computation; Bob does not have W, so he cannot check the result by simply redoing the computation himself. Under these circumstances, can Bob still trust that the result is correct?
In a word, a zero-knowledge proof can magically convince Bob that the result is correct. This sounds a bit unbelievable: trust in the result Y is built on distrust of the machine on the left. In other words, the zero-knowledge proof seems to create trust out of thin air, which is exactly what is counter-intuitive. It is very, very interesting that when two parties do not trust each other at all, there is still something that can build trust between them.
Let me summarize the counter-intuitive power of zero-knowledge proofs: they can guarantee both the integrity and the confidentiality of a remote computation.
1. Integrity. Although I have not seen the computation happen remotely and have not witnessed its process, I know the computation really took place and the result is correct, meaning the process has not been maliciously modified or forged.
2. Confidentiality. Although I cannot witness the computation, I also cannot learn any intermediate results or any information inside it, so the entire computation, including any secret inputs, remains confidential to me.
This is what counter-intuitive zero-knowledge proofs can achieve. What is the use of zero-knowledge proofs? I think the most direct use is so-called "data privacy protection". Let me briefly list some examples, starting with personal data: health data, bank statements, travel arrangements, communication records. We do not want anyone to know our health status, including our heartbeat and medical records, or how much money we spend every day. But we also want to enjoy third-party services, such as calling a taxi: we want certain people to know where we are but no one else, or we only want them to know our rough location, not the exact one.
Enterprise data: personnel files, warehousing and logistics, financial ledgers, customer information. Traditionally these are highly confidential, but to maximize collaboration between enterprises, some data must be shared. Zero-knowledge proofs provide a way to ensure data privacy while sharing data, so their biggest impact on our future life and work is to provide a very effective data privacy protection technology.
Zero-knowledge proofs also have many interesting uses in the underlying technology of blockchains. After zero-knowledge proofs were proposed in 1985, they remained at the level of theoretical results for a long time, until blockchain technology began to be applied at scale. Roughly from 2015 to 2020, zero-knowledge proofs became widely used on blockchains, including block compression, transaction anonymity, identity privacy, data sharing, and integrity of off-chain data storage. Technical friends in the blockchain industry should be familiar with these.
zkSNARK
Next, let me introduce two very basic concepts.
Non-interactive zero-knowledge proofs
As mentioned earlier, when zero-knowledge proofs were proposed, the "proof" paradigm was reinvented. In the past, a proof was something written on paper: for example, a teacher sets an exam question and the student writes out the proof for the teacher to check. A zero-knowledge proof, by contrast, is an interactive proof: through a dialogue with the teacher, I can convince the teacher without telling the teacher any of the proof process.
Why, then, are there non-interactive zero-knowledge proofs? They use special techniques to fold the interactive proof process into a single message. Non-interactive zero-knowledge proofs (NIZKs) are the more widely used form in the blockchain field.
zkSNARK is also a very common term, although it has several slightly inconsistent interpretations in academia. Strictly speaking, a zkSNARK is a special kind of NIZK, that is, a special non-interactive zero-knowledge proof. What is special about it?
Three points are listed below:
1. Succinct. The proof it generates is very small. How small? Usually on the order of kilobytes, and it can even be as small as 200 bytes or less. Although the proof is only a few hundred bytes, the computation it attests to can be enormous, and the data it protects can be on the order of terabytes.
2. Argument. Strictly speaking it is an "argument" rather than a proof, a type of zero-knowledge proof whose soundness holds only against computationally bounded provers; this will be touched on again later.
3. The "K" at the end of zkSNARK stands for Knowledge: the proof is a proof about "knowledge", and this "knowledge" has a precise mathematical definition.
Privacy protection, blockchain and zero-knowledge proof
Before discussing this, we must first cover a very important concept in cryptography: the commitment. As shown in the picture, there is a database on the left. This database is very, very large, perhaps more than 1TB, yet cryptography can convert it into a very short string, probably fewer than 100 bytes. No matter how big the database is, the final commitment can be compressed into data of no more than about 100 bytes.
Although it is only about 100 bytes, it establishes a unique binding relationship with the database: if anything in the database changes, the commitment must change with it. A commitment is like a "lock". As long as someone records the commitment, any modification to the database will change the commitment dramatically.
Commitments have two properties:
Property 1: Binding. A commitment binds an original data set, which can no longer be tampered with. The data set can be very large, yet it is bound by a very small commitment.
Property 2: Hiding. A commitment does not disclose information about the original data. Since it is only about 100 bytes, most of the information content has been lost, so no original data is leaked.
If there are several databases, say a travel-record database and a bank database, you can generate two commitments and put them on the chain. Once the commitments are on the blockchain, there is an important effect: you can no longer go back on your word by modifying the databases. After putting a commitment on the chain you can show it to everyone, and you can open the database through the commitment, of course selectively, or after processing the data, for example by deleting the sensitive parts. But because the commitment is locked on the chain, cheating is impossible: any processing done before handing data to another party must be honest and cannot be faked.
Commitments sound magical, but they are a very traditional and commonly used cryptographic tool that non-professionals rarely hear about. There are many ways to implement commitments; the simplest uses a hash function. Merkle trees, Pedersen commitments, polynomial commitments, and ElGamal encryption can all realize commitments.
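To make the idea concrete, here is a minimal hash-based commitment sketch in Python. SHA-256 over a random nonce plus the data is only the simplest of the schemes listed above, and the function names are illustrative:

```python
import hashlib
import secrets

def commit(data: bytes) -> tuple[bytes, bytes]:
    """Commit to `data`: returns (commitment, opening nonce)."""
    nonce = secrets.token_bytes(32)            # random blinding factor, for hiding
    digest = hashlib.sha256(nonce + data).digest()
    return digest, nonce                       # 32-byte commitment, whatever the data size

def open_commitment(commitment: bytes, data: bytes, nonce: bytes) -> bool:
    """Verify that (data, nonce) opens the commitment (the binding check)."""
    return hashlib.sha256(nonce + data).digest() == commitment

# A multi-gigabyte database and a one-line note both compress to 32 bytes.
c, r = commit(b"very large database contents ...")
assert open_commitment(c, b"very large database contents ...", r)
assert not open_commitment(c, b"tampered database", r)
```

A real deployment would pick a Pedersen or polynomial commitment from the list above when homomorphic properties or succinct proofs are needed.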
From the perspective of cryptographic theory, there are two key types of commitment. The first type: Computational Binding and Perfect Hiding. The second type: Perfect Binding and Computational Hiding.
Perfect Hiding means the commitment cannot leak even a single bit; the original data is perfectly hidden. But its binding is only Computational: imagine that someone builds a super quantum computer in the next 100 years, far exceeding today's computing power; it could forge a fake opening unrelated to the original data.
The other type is the reverse. Its binding is Perfect Binding, impossible to fake even with a super quantum computer, but its Hiding is Computational. That is, if a super quantum computer appears in 100 years, it could crack a commitment generated today and extract some useful information from it.
It has been proved theoretically that a commitment scheme perfect in both Binding and Hiding cannot exist. For privacy-preserving zero-knowledge proofs, we usually use the first type: Perfect Hiding with Computational Binding. Since we put commitments on the chain, they will remain visible to everyone 100 years from now, yet we need not worry that the hidden information will ever really be extracted. In addition, this type of commitment yields very succinct zero-knowledge proofs, which is the Argument mentioned earlier.
zkSNARK principle
Why is it possible to commit to a database with 100 bytes and at the same time prove that the data satisfies certain properties? It sounds incredible.
First, let us review what a zero-knowledge proof is. Bob, on the right, wants a computation done and sends it to a remote machine he does not trust. The machine may be controlled by a hacker, or its owner may misbehave, but it does not matter: give it the function and the input X, and after a while it returns Y. Although I do not trust the machine, and I have doubts about Y, I also check the accompanying zero-knowledge proof, and the proof tells me Y is correct, so I believe it. How does this work?
Let us start from the simplest idea. If a machine runs my program and I cannot watch it, the easiest remedy is to record the entire computation with a camera from beginning to end; once I get the video, I can check frame by frame whether every step of the computation is correct. This is silly, but in theory it might work.
But this solution is obviously impractical: while the machine runs, its memory state is very large, say 4GB, and the video file of the whole process would be far too large to verify.
Is there a way to improve this? We use the computation model of arithmetic circuits instead of the CPU-plus-memory model. The expressive power of the arithmetic circuit model is roughly equivalent to that of a CPU running a program; essentially all common computations can be expressed.
Arithmetic circuits consist of interconnected "gates", such as multiplication gates and addition gates. Inputs enter from the left side of the circuit, and after the computation the result appears on the output wires on the right. The whole process just computes "+" and "×". Do not underestimate these two operations: they can express most computations.
The advantage of an arithmetic circuit is that its computation can be photographed. The camera presses the shutter once and captures a panoramic view of the circuit, which is equivalent to recording the entire computation. Verification is then no longer watching a video frame by frame but checking a single photo, as long as every detail of the photo is verified.
How do we verify the circuit computation? Verify the operation of each gate: for each addition gate, check that the left input plus the right input equals the output; for each multiplication gate, check that the left input times the right input equals the output. As long as you patiently check every gate one by one, you are done.
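The "photograph plus gate-by-gate check" idea can be sketched in a few lines of Python. The circuit encoding and wire numbering here are invented for illustration:

```python
from operator import add, mul

# A circuit is a list of gates: (op, left_wire, right_wire, out_wire).
# The "photograph" is simply the full assignment of a value to every wire.
CIRCUIT = [
    (mul, 0, 0, 2),   # w2 = x * x
    (mul, 2, 0, 3),   # w3 = w2 * x
    (add, 3, 1, 4),   # w4 = w3 + w      (the circuit computes x^3 + w)
]

def evaluate(circuit, inputs):
    wires = dict(enumerate(inputs))
    for op, l, r, o in circuit:
        wires[o] = op(wires[l], wires[r])
    return wires  # the complete trace: one value per wire

def verify_trace(circuit, wires):
    # Check every gate: left (op) right must equal the recorded output.
    return all(op(wires[l], wires[r]) == wires[o] for op, l, r, o in circuit)

trace = evaluate(CIRCUIT, [3, 5])     # x = 3, secret w = 5
assert trace[4] == 32                  # 3^3 + 5
assert verify_trace(CIRCUIT, trace)
trace[3] = 99                          # tamper with one intermediate wire
assert not verify_trace(CIRCUIT, trace)
```

The gate-by-gate loop in `verify_trace` is exactly the naive verification whose cost the next paragraphs set out to reduce.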
If there are tens of billions of gates, verification may take more than a year. Is there a way to simplify the verification process of so many gates? This can be done with the mathematical tool "polynomial encoding".
For example, take some numbers y0, y1, y2, ..., place them as points on an XY plane, and find a curve that passes through all of them. This encodes all the values, from y0 to yn, with a single curve.
What is the benefit of encoding as a curve? If I change any one of the values yk, even slightly, the whole curve is visibly disturbed: the perturbed curve agrees with the original at no more than n points, while everywhere else it deviates. Polynomial encoding thus amplifies even the smallest modification, so we can check whether any encoded value has been modified by verifying the curve at a single random point.
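Here is a small Python sketch of this amplification effect, using Lagrange interpolation over a small prime field. The field size and sample range are illustrative only:

```python
import random

P = 2**31 - 1  # a small Mersenne prime field, for illustration

def lagrange_eval(points, x):
    """Evaluate the polynomial interpolating `points` at x, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # divide by den using the modular inverse (Fermat's little theorem)
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

values = [7, 1, 8, 2, 9]                      # y0..y4, encoded at x = 0..4
orig = [(i, v) for i, v in enumerate(values)]
tampered = [(i, v + (i == 2)) for i, v in enumerate(values)]  # bump y2 by 1

# Two distinct degree-4 curves agree on at most 4 points, and these two
# already agree at the 4 untouched encoding points, so every other x
# exposes the change. A single random sample therefore catches it.
x = random.randrange(5, P)
assert lagrange_eval(orig, x) != lagrange_eval(tampered, x)
```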
Next, we apply three different polynomial encodings to the left inputs, right inputs, and outputs of all the multiplication gates in the arithmetic circuit, forming three curves. It then suffices to verify that the multiplicative relation among the three curves (three polynomials) holds, and for that we only need to sample one random point on the X axis rather than verify ten billion gates. But a verifier who sees the three curves would see far too many secrets, and a zero-knowledge proof must not leak the secrets of the computation. How is that solved? Polynomial operations can be homomorphically mapped onto elliptic curve groups.
Addition over the finite field of integers maps to the group operation on an elliptic curve, and integer multiplication maps homomorphically to the bilinear pairing operation on elliptic curve groups. Only a single multiplication is available this way, but that is enough to verify the operation relations of the arithmetic circuit.
This process sounds simple, but zkSNARK research rests on the hard work of many cryptographers. The line of work can be traced back to IKO07 (2007); Groth had a preliminary idea in 2010, and the key breakthrough came in 2013. The improved scheme Groth16 is the most popular zero-knowledge proof algorithm in industry. There are also very recent improvements such as Marlin, Aurora, Spartan, and Fractal, all developed through continuous refinement of the GGPR13 algorithm.
The GGPR13 paper in 2013 was a major breakthrough, and many wonderful ideas came from it. The four authors of the paper are Rosario Gennaro, Craig Gentry, Bryan Parno, and Mariana Raykova.
Look at the GGPR13/Pinocchio/Groth16 framework. To express an arbitrary computation, we first write code in a high-level language, compile it into an arithmetic circuit, represent the circuit as an R1CS matrix, encode the matrices into a QAP via polynomials, and finally run a Trusted Setup to generate the Proving Key and Verifying Key. The Proving Key is used to prove the computation, and the Verifying Key is used to check that the proof π is correct.
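As a toy illustration of the R1CS stage of this pipeline, here is the textbook example circuit x^3 + x + 5 = out written as R1CS constraints in Python. The matrices are hand-written for this tiny circuit, not real compiler output:

```python
def dot(row, z):
    return sum(a * b for a, b in zip(row, z))

def r1cs_satisfied(A, B, C, z):
    """Check (A_i . z) * (B_i . z) == C_i . z for every constraint row i."""
    return all(dot(a, z) * dot(b, z) == dot(c, z) for a, b, c in zip(A, B, C))

# Witness vector layout: z = [1, x, out, v1, v2], with constraints
#   v1 = x * x,   v2 = v1 * x,   out = v2 + x + 5
A = [[0, 1, 0, 0, 0], [0, 0, 0, 1, 0], [5, 1, 0, 0, 1]]
B = [[0, 1, 0, 0, 0], [0, 1, 0, 0, 0], [1, 0, 0, 0, 0]]
C = [[0, 0, 0, 1, 0], [0, 0, 0, 0, 1], [0, 0, 1, 0, 0]]

x = 3
z = [1, x, x**3 + x + 5, x * x, x**3]   # [1, 3, 35, 9, 27]
assert r1cs_satisfied(A, B, C, z)
z[2] = 36                                # claim a wrong output
assert not r1cs_satisfied(A, B, C, z)
```

In the QAP step, each column of A, B, C is then interpolated into a polynomial, which is where the curves of the previous section come from.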
Zero-knowledge proof + deep neural network
With zero-knowledge proofs, we can compute a commitment to a database and publish it on the chain. After putting the database on the chain through a commitment, we can do many things: we can use the database to generate a lot of useful information without exposing personal privacy, and share that information with friends, businesses, and society as a whole. Next, let me introduce our team's staged results on deep neural networks.
Image recognition is widely used. This is a standard image classification model; standard test cases include airplanes, cars, birds, dogs, and so on, with a resolution of 224×224. The next step is privacy protection plus machine learning, that is, zero-knowledge proofs plus deep neural networks.
There is a database on the left containing a lot of private information, and an artificial intelligence model is produced from it by machine learning (training). Alice does not want to expose the model; after all, the model can reflect the private data.
But Alice feels the model can provide Bob with valuable information. Bob gives Alice a picture and asks her to classify it, without learning the details of her training set. Alice runs the picture through the model, produces the result, and attaches a zero-knowledge proof. If the model says it recognized a cat, the zero-knowledge proof certifies that the result really was produced by the secret model. Of course, cats and dogs can be recognized without a proof, but in some machine learning scenarios there is no other way to determine whether a prediction is credible.
For example, whether a medical scan shows early signs of cancer: this cannot be determined just by looking at the film, but the model can tell you that you are healthy. If you do not believe it, the zero-knowledge proof can tell you that this is indeed the result inferred from a very reliable X-ray film database.
The model's commitment can be put on the chain, making it impossible for Alice to cheat Bob. The key point is that you no longer need to trust whether Alice is honest: if a valid proof accompanies the result, the result is correct. You only care about the result, not who provides it.
Next, we did an image recognition experiment with the VGG16 neural network model. It is a 16-layer deep neural network with a large number of parameters: 14 convolutional layers, 2 fully connected layers, ReLU activations, and max-pooling.
Doing this is very challenging. First of all, there was no prior work for us to refer to.
Challenge 1: How to perform scientific calculations in arithmetic circuits.
Scientific computing involves a lot of decimal and floating-point arithmetic. We chose to encode fixed-point numbers onto the finite field, and normalized all the data of the VGG16 model so that no number exceeds 256. We use fixed-point numbers with 8 bits for the integer part and 6 bits for the fractional part, and implement the various fixed-point operations (squares, square roots, logarithms, etc.) with arithmetic circuits.
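A minimal Python sketch of this kind of fixed-point encoding, using the 6 fractional bits mentioned above. The helper names are illustrative, and a real circuit would also need range checks to enforce the truncation:

```python
FRAC_BITS = 6          # fractional bits, matching the encoding described above
SCALE = 1 << FRAC_BITS

def encode(x: float) -> int:
    """Map a real number to a scaled integer (a field element in the circuit)."""
    return round(x * SCALE)

def decode(n: int) -> float:
    return n / SCALE

def fxp_add(a: int, b: int) -> int:
    return a + b           # scales match, so addition is direct

def fxp_mul(a: int, b: int) -> int:
    # The raw product carries 2*FRAC_BITS fractional bits; shift back down.
    # (Inside an arithmetic circuit this truncation costs extra constraints.)
    return (a * b) >> FRAC_BITS

a, b = encode(1.5), encode(-2.25)
assert decode(fxp_add(a, b)) == -0.75
assert abs(decode(fxp_mul(a, b)) - (1.5 * -2.25)) < 1 / SCALE
```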
Challenge 2: The R1CS matrix is huge and would require PB-level memory.
The memory of an ordinary computer is measured in GB, and a server with hundreds of GB is already considered large. Naively encoding the arithmetic circuit as an R1CS matrix would require PB-level memory, which normal machines can hardly bear. We calculated that VGG16 image recognition requires 14.6 billion multiplication gates. To the best of our knowledge, this is the world's first zero-knowledge proof of a practical algorithm at such a scale. Most of the multiplications come from the convolutional layers: in deep neural networks, convolution operations account for more than 90% of them.
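A back-of-the-envelope calculation shows why convolutions dominate. As an illustration, take the standard VGG16 first-layer shape (224×224 output, 3 input channels, 64 output channels, 3×3 kernels); these shapes are assumed from the public VGG16 architecture, not taken from our circuit:

```python
# Multiplications in one convolutional layer:
#   out_h * out_w * out_channels * (kernel_h * kernel_w * in_channels)
def conv_muls(out_hw: int, in_ch: int, out_ch: int, k: int = 3) -> int:
    return out_hw * out_hw * out_ch * k * k * in_ch

# VGG16's first 3x3 convolution on a 224x224 RGB image, 64 output channels:
first_layer = conv_muls(224, 3, 64)
print(f"{first_layer:,} multiplications in the first layer alone")
```

Even this first (and cheapest) convolutional layer needs tens of millions of multiplication gates, so summing all layers quickly reaches the billions.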
Because the circuit matrix is too large, we had to invent a new zero-knowledge proof scheme, CLINK, to prove the image recognition process. Compared with other zero-knowledge proof schemes, CLINK adopts special techniques for handling large-scale circuits.
Technique 1: A huge circuit contains many repeated subcircuits. In image recognition there are many convolutional layers whose circuit structure is similar, and similar parts can be combined for processing.
Technique 2: A huge circuit can be split into many small circuits, so that each fits in memory and can be proved one by one, then linked together afterwards. After splitting, the wires connecting the subcircuits must still be protected: those values remain secret and cannot be exposed, so we protect all the exposed wires between subcircuits with commitments. This makes circuit splitting feasible.
Identical subcircuits can be combined for proving. In the end, many zero-knowledge proofs are generated for the huge circuit and, together with the commitments to intermediate values, form one complete zero-knowledge proof. In the process, a proof is generated for each subcircuit, commitments are computed for the exposed intermediate wires rather than using those values as public inputs or outputs, and the results are packaged and compressed.
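The splitting-and-linking idea can be sketched as follows. The per-layer function and commitment format are stand-ins, and the real scheme proves consistency with the boundary commitments in zero knowledge instead of opening them:

```python
import hashlib
import secrets

def commit(values):
    nonce = secrets.token_bytes(16)
    return hashlib.sha256(nonce + repr(values).encode()).digest(), nonce

def boundary_consistent(c, nonce, values):
    return hashlib.sha256(nonce + repr(values).encode()).digest() == c

def layer(xs):                 # stand-in for one subcircuit, e.g. a conv layer
    return [2 * x + 1 for x in xs]

# Prover: run the big computation layer by layer, committing to each boundary.
x0 = [1, 2, 3]
x1 = layer(x0)
x2 = layer(x1)
c1, r1 = commit(x1)            # commitment to the hidden intermediate wires

# In the real scheme, each per-subcircuit proof shows its inputs and outputs
# are consistent with the boundary commitments, so x1 itself is never revealed;
# here we just check the consistency relation directly.
assert boundary_consistent(c1, r1, layer(x0))   # subproof 1: x0 -> committed x1
assert x2 == layer(x1)                          # subproof 2: committed x1 -> x2
```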
We did two experiments:
Scheme 1 (parallel): exploit the multi-core CPU as much as possible and prove VGG16 in parallel. All 16 layers' circuits are proved simultaneously, along with the link proofs between the commitments of adjacent layers.
Scheme 2 (serial): proceed layer by layer, first proving the first layer and the second layer, then the link proof between them, and so on.
Here we are very grateful to the QTUM team for providing us with a super server equipped with 1TB of memory and a high-performance SSD. We first tried the parallel scheme, which still requires a lot of memory: not PB-level, but about 5TB, of which 4TB runs on the swap partition. Because the whole process uses the SSD swap partition, performance suffers significantly.
Parallel scheme: proving that a 224×224 picture contains a kitten takes 25 hours. On a machine with 4TB of real memory it might take only 5 hours, but such machines are hard to find. Verification takes 1 hour, the proof length is 80KB, and peak memory usage is 5TB.
Serial scheme: it looks slower, but it actually runs faster on the machine, because only 500GB of memory is used and no swap partition is needed. Proof time is 20 hours, verification time is 1 hour, and the proof length is 110KB.
During our experiments, Korean scholars released a draft paper doing similar work. The biggest difference is that they made some compromises to VGG16: max-pooling is very difficult to do in a circuit, so they used average pooling instead, and they introduced a very large 83GB CRS. In our scheme the CRS is very small, so practicality is better. This is the latest international research result we know of.
Application prospects of big data zero-knowledge proof
Zero-knowledge proofs have moved from theory to engineering and are gradually moving toward applications; in the future they will be able to handle even larger amounts of data. The CLINK algorithm we developed during this exploration supports big-data statistics, queries, and other complex operations, even including the inference process of deep neural networks. In the future, these technologies can be used in:


