Hardcore | In-depth analysis of the technical characteristics of zero-knowledge proof

November 7, 2020

If a proof system has “zero-knowledge”, then it is called a “zero-knowledge proof”.

Original Title: “Dry Goods | Introduction to Zero Knowledge Proof (1)”
Written by: Cyte

Following the last time Shor issued a comprehensive research on routing issues in payment networks , another Nervos partner Cyte who loves research did a detailed research on zero-knowledge proofs.

In this article, Cyte will introduce to you the definition of zero-knowledge proof (ZKP) , and distinguish zero-knowledge proof from the two concepts of SNARK and STARK.

These cryptographic concepts such as ZKP, SNARK and STARK have become popular with the recent rise of blockchain. However, they are often misunderstood and mixed. In fact, all these concepts belong to a broader category, called Proof System , or Cryptographic Proof . Zero-knowledge proof and SNARK and STARK have crossover parts, but they do not contain each other. The relationship between them can be represented by a diagram.

This article will first introduce the definition of the proof system, and discuss the various properties of the proof system, focusing on “zero knowledge”, “knowledge proof”, “simplicity” and “non-interactive”. In particular, if a proof system has “zero-knowledge”, then it is called a “zero-knowledge proof”. Finally, the article will discuss the definitions of SNARK and STARK and compare them.

Proof system

A proof system (Proof System) is an interactive protocol that includes two participants, Prover and Verifier, and an algorithm Setup. The function of the proof system is to allow Prover to convince Verifier to believe in one thing, which we call a statement .

Before the protocol starts, someone needs to call the Setup algorithm. The Setup algorithm accepts some public parameters as input and outputs the Setup information required by Prover and Verifier. The information learned by Verifier is recorded as Setup _P. The information learned by Prover is recorded as Setup _P. The common part of Setup _P and Setup _V is called Common Reference String (CRS) . Who calls the Setup algorithm and when it depends on the design of the proof system.

At the beginning of the agreement, Prover and Verifier get the input statement x at the same time. Prover must have some additional advantages over Verifier, such as more powerful computing power, or some additional input w. In addition, Prover and Verifier also learned about Setup _P and Setup _{V respectively} . The time to obtain the Setup information depends on the design of the certification system. For example, it may be that Prover and Verifier have been downloaded and stored in their respective hard drives for repeated use, or they may have been entered on the spot before the start of the agreement.

Then Prover and Verifier start to implement the protocol stipulated by the certification system. If Prover and Verifier are both honest, then they both strictly abide by the agreement. However, it is also possible that one party is malicious and does not execute in accordance with the agreement. What happens at this time depends on the security of the certification system. If both parties are malicious and neither follow the agreement, then it has nothing to do with this certification system.

Finally, Verifier outputs accept or reject, indicating whether to believe the statement x.

A proof system needs to satisfy two properties:

Completeness : If the statement x is correct and both Prover and Verifier comply with this protocol, then Verifier outputs accept with a probability of at least 1-ε _c , where ε _c is called the completeness error of the proof system. )

Soundness : If the statement x is incorrect, then Prover must be dishonest, and Verifier abides by the agreement, then any Prover cannot let the probability of Verifier output accept exceed ε _s , this ε _s is called a proof System reliability error (Soundness error)

These two requirements are the most basic requirements for a certification system to be established. Without any requirement, we can get a proof system that meets the conditions but is completely useless. For example, if we only require integrity, then no matter what Prover does, Verifier will always output accept only; if we only require reliability, then let Verifier always output reject only. In addition, it is generally hoped that ε _s and ε _c do not exceed, and the sum is less than, otherwise this proves that the system error is too large and it is almost useless.

If the reliability of a proof system is only valid for any Prover with limited computing power , that is, an adversary with unlimited computing power is likely to deceive Verifier. At this time, the proof system has only computational reliability (Computational Soundness) , so The system is also known as the Argument System (Argument System) . In contrast, the reliability that is safe for any Prover is called Statistical Soundness .

Prove other properties of the system

A proof system can satisfy some other (not necessary) properties

CRS model (CRS model) : If the Setup information is publicly visible to everyone, that is, Setup=Setup _P =Setup _V , it is said that this certification system is under the CRS model
Interactive/Non-interactive : If only Prover sends a message to Verifier during the entire interactive process, the system is called a non-interactive certification system; otherwise, the system is an interactive certification system
Transferable/Deniable : If the statement x is correct and the interaction process is sent to other Verifiers, it can also convince other Verifiers that the statement x is correct, the proof system is transferable; otherwise this Prove that the system is repudable
Public Verifiable/Designated Verifier : If Setup _V is publicly visible to everyone, that is, anyone can become a Verifier, this zero-knowledge proof system is publicly verifiable. Otherwise this system is for specific verifiers
Public coin : If the selection of all messages of Verifier are uniformly random and independent of Prover’s messages, the system is said to be publicly random
Zero-Knowledge : When the statement x is correct, if Verifier cannot obtain any other “knowledge” from the interaction except for the correctness of x, the system is said to be zero-knowledge
Succinctness : If the proof system is used to prove NP language, and the communication volume of the proof system is smaller than the proof w, then the proof system is concise

Example: Prove that the colors of the two balls are different

Setup message: There are two balls

Statement x: the two balls are of different colors

Verifier has limited computing power (blindfolded), Prover has normal vision

Verifier holds a ball in each hand and shows it to Prover.
Verifier puts his hands behind his back, and then (in his heart) randomly toss a coin. If it is heads up, exchange the balls in the left and right hands, otherwise not.
Verifier took the ball out and showed it to Prover.
Prover tells Verifier if the two balls have been exchanged.

Result: If the Prover guessed correctly, Verifier outputs accept, otherwise Verifier outputs reject.

Important nature

Above we have given the definition, examples and properties of the proof system. Next we discuss several important properties of the proof system.

Zero knowledge

Zero-knowledge is used to protect the honest Prover from being deceived by the malicious Verifier and reveal the secret evidence needed for the proof.

The zero-knowledge nature of the proof system has been mentioned above, which is simply described as Verifier cannot obtain any “knowledge” from the interaction. This description is inaccurate because it does not give a strict criterion.

The definition of zero-knowledge itself is counterintuitive: Prover clearly sent some bits of data to Verifier, why is this system “zero-knowledge”?

In fact, “information” is not the same as “knowledge.” If Verifier obtains the information, but obtaining the information does not allow Verifier to calculate more results, or that the information can be calculated by Verifier itself, then Verifier has not obtained any “knowledge”.

During the execution of a certification system, all information obtained by Verifier includes: Setup _V ; Verifier’s own random number r; all information sent by Prover to Verifier (denoted as p). We call this information Verifier’s “view”, denoted as View _V = (Setup _V ,r,p). This information is the source of all uncertainties in the Verifier calculation process. After confirming this information, everything else can be calculated deterministically.

Note that View _V is a random variable. After Verifier and Prover execute the proof system, Verifier will obtain a sample of this random variable. If Verifier can sample View _{V on its} own without Prover’s participation, then this system is zero-knowledge.

We call the algorithm for sampling this random variable the Simulator . Depending on how the simulator works, there are different definitions as follows:

Non-black box simulator, the corresponding zero knowledge is called non-black box zero knowledge. This zero-knowledge definition allows each Verifier to have an “exclusive custom” simulator. This definition allows the simulator to customize different sampling processes for different Verifier implementation details.
The black box simulator corresponds to the black box zero knowledge. The definition of this zero-knowledge requires a unique simulator, so that it can sample Setup _V for all Verifiers. It is impossible for this only simulator to know all the specific implementation details of Verifier, so it can only access Verifier through black box calls. However, the simulator has complete control over Verifier. The simulator can determine the random number r of Verifier and input any Prover message or Setup _V to Verifier. Therefore, in the eyes of the simulator, Verifier is a black box deterministic algorithm.
If the simulator is only for honest Verifier, it corresponds to Honest Verifier ZK. Because the honest Verifier’s behavior is completely expected, the simulator can naturally use this information, so this simulator’s access to Verifier is not a black box.

Non-black box simulators have access to more information, so non-black box zero knowledge is easier to establish than black box zero knowledge. And honest Verifier zero knowledge is the easiest to achieve.

Regarding the zero knowledge of honest Verifier, the honest Verifier here is more accurately semi-honest, or “honest but curious”. Such a Verifier will obey the agreement on the surface, but may try to extract knowledge from the message in private.

In contrast, malicious Verifier’s behavior is completely unrestricted. Verifier may crash, send messages that do not conform to the format, sample distributions that do not follow the protocol, and so on. To prove that a system satisfies zero knowledge of malicious Verifier, it is necessary to cover all these situations.

The simulator is a random algorithm, and its output value is also a random variable, denoted as Sim _V. Zero knowledge requires that the two random variables, View _V and Sim _V, are difficult to distinguish. However, there are many versions of the word “indistinguishable”, from which various definitions of zero-knowledge proof can be derived:

If the distribution of two random variables is statistically indistinguishable, that is, their statistical distance (Statistical Distance) is negligible, the proof system is said to be statistically zero-knowledge (Statistically Zero-Knowledge) ; if the statistical distance is 0, then It is called Perfect Zero-Knowledge ;
If the distributions of two random variables are computationally indistinguishable, that is, no random adversary in polynomial time can distinguish the two distributions, the proof system is said to be Computationally Zero-Knowledge .

Note that in the definition of zero-knowledge, only the correct statement x is required , and the distribution of View _V and Sim _V is difficult to distinguish. For false statements, we don’t care what knowledge Verifier can acquire, because in this case Prover itself is dishonest and there is no need to protect it. In other words, since Prover does not abide by the agreement, then our agreement is well designed Can not protect it.

However, even if it is an error, zero knowledge does not make any assumptions about the distribution of Sim _V , but if the probability of Sim _V obtained by inputting the wrong x sampling is verified by Verifier, there is a significant difference from the case where x is correct. We can use this to judge the correctness of x. This means that x can only come from an ordinary NP language. Therefore, for difficult NP problems, input the wrong x to the simulator, and the obtained Sim _V can be verified with the same probability.

In this way, zero knowledge and reliability are not contradictory? In other words, for the wrong x, why can’t Prover call the simulator to trick Verifier? In fact, Prover cannot control Verifier, and it cannot provide the simulator with the resources needed to sample Sim _V. Indeed, a malicious Prover can call the simulator, but this is useless for it, because the r in Sim _V output by the simulator is not the random number of the Verifier that is interacting with Prover. In addition, the Setup _V output by the simulator may be different from the Setup _V received by Verifier, which may cause the verification to fail. However, the simulator tuned by Prover cannot obtain the random number of Verifier, which is enough to ensure security, so even if the Setup _V is a fixed constant in the interactive proof, there is no problem.

Proof of knowledge

If Prover is required to “know” some information in order for Verifier to pass the verification, this system is called Proof of Knowledge . Proof of knowledge can be seen as an enhanced version of reliability. Knowledge proof also has a computational version called Argument of Knowledge .

Knowledge proof system is usually used to prove NP language. An NP language refers to a set L, so that the element x belonging to L can be proved by a piece of evidence w, that is, there is a polynomial time algorithm that can determine whether w is a legal evidence that x belongs to L.

Simplicity

In summary, for general NP languages, the concise proof system (logarithmic level) can only be an argument system.

Non-interactive

Non-Interactivity (Non-Interactivity) refers to the proof that all interactions in the system are only a message sent by Prover to Verifier. This message is called a proof and is denoted by π. Non-interactivity can bring a lot of convenience and bring more application scenarios for the proof system. For example, in a blockchain system, non-interactive zero-knowledge proofs can be attached to transactions for anyone to check at any time, without requiring the author of the transaction to interact with the verifier online at any time.

Any NP language naturally has a non-interactive proof protocol, that is, Prover sends proof directly to Verifier, and this proof is proof of knowledge. Therefore, constructing a proof system that is purely non-interactive is of little significance. Non-interactivity is only interesting when combined with the two properties discussed above, namely zero knowledge or simplicity.

Non-interactive + zero knowledge

Combining zero-knowledge and non-interactive, there is Non-Interactive Zero-Knowledge (NIZK) .

When we discussed zero-knowledge before, we mentioned that the reason why zero-knowledge is not inconsistent with reliability is that the probability of r in Sim _V sampled by the simulator is different from the random number of Verifier interacting with Prover. However, for non-interactive zero knowledge, we have to re-examine this reasoning process.

In the interactive proof, a Verifier with a random number of r ₁ can verify the passed Prover message p, and it is likely to fail the verification if it is directly moved to the Verifier with a random number of r ₂ . Therefore, in interactive proof, the correctness of p is not global, but depends on r.

In the non-interactive proof, Prover did not receive any message from Verifier, so the random number r of Verifier was not used in the calculation process of Prover. Therefore, in order to prove the integrity of the system, the p output by the honest Prover can be verified for most of the Verifier random numbers. Therefore, the correctness of p in non-interactive proof is global and does not depend on any r.

Non-interactive + simplicity

As mentioned above, a simple proof system must be an argument system. Combined with non-interactivity, there is a Succinct Non-interactive ARGument (SNARG) . In fact, the system that satisfies the definition of SNARG was constructed by Micali as early as 2000, and the name appeared later.

If a SNARG is also a knowledge argument, it is called Succinct Non-interactive ARgument of Knowledge (SNARK) . The name SNARK was pioneered by the paper BCCT12, and has now become one of the most popular concepts in the field of zero-knowledge proof. In fact, SNARK is only concise and non-interactive, not necessarily zero knowledge. If there is zero knowledge, it should be called zkSNARK.

Differentiation between STARK and SNARK

Another concept that is often mentioned with SNARK is STARK. It is only one word from SNARK, but there are many differences. Let’s compare these two concepts below.

Common points:

They are all argumentation of knowledge (ARK), that is, only the reliability in the sense of calculation, and the proof is knowledge

the difference:

It can be seen that the only restriction that SNARK has more than STARK is non-interactivity. Nevertheless, through the Fiat-Shamir transformation that will be introduced in subsequent articles, STARK can generally be transformed into a non-interactive proof, and the result of the transformation must be a SNARK. In this sense, STARK can be regarded as a subset of SNARK.

The above definitions of SNARK and STARK are the broad meanings of these two terms. In a narrow sense, they respectively refer to two specific structural schemes. SNARK refers to a series of zkSNARK construction schemes based on QAP and bilinear pairs represented by the Groth16 scheme. In a narrow sense, STARK specifically refers to the AIR and FRI-based solution proposed by Ben-Sasson et al. in 2018.

summary

This article introduces the definition of the proof system, and discusses the various properties of the proof system, focusing on “zero knowledge”, “knowledge proof”, “simplicity” and “non-interactive”, and explains how to use the simulator to Define zero knowledge and use extractors to define knowledge proof. Finally, the article discusses and compares SNARK and STARK.

Reference
1. Goldreich. Foundations of Cryptography, Volume 1. Basic Tools . 2001.
2. ZKProof Community. ZKProof Community Reference . 2019.
https://docs.zkproof.org/reference.pdf
3. Yehuda Lindell. How To Simulate It – A Tutorial on the Simulation Proof Technique . 2018.
4. Eli Ben-Sasson. A Cambrian Explosion of Crypto Proofs .
https://nakamoto.com/cambrian-explosion-of-crypto-proofs/

Source link: mp.weixin.qq.com