HashKey: Explain the technical solutions, challenges and future of secure multi-party computing MPC

August 29, 2020

Secure multi-party computing is of great value to the development of network information security and data market.

Written by: Qian Baijun, at HashKey Capital Research
Reviewer: Zou Chuanwei, Chief Economist of Wanxiang Blockchain, PlatON

Technology and application analysis made herein to secure multiparty computation. Conclusion is secure multiparty computation can solve the problem of protecting the privacy of collaborative computing mutual distrust between the parties. Secure multiparty computation expand the boundaries of the traditional areas of information security and distributed computing, to solve information security under the network environment of great value. Secure multiparty computation can be combined with multi-industries data fusion, data is important for the development of the market.

Data is a complex concept with many types and rich characteristics. With the transition from the era of the Internet to block chain, data is about to become assets to generate economic value. However, most companies take into account data security and privacy issues, data sharing are very cautious. For personal data, control and privacy protection are more important than ownership. Therefore, enterprises often encounter difficulties in balancing the privacy of data input and the results of output.

For example: Hospital patients need to share data with the insurance company, but can not divulge patient privacy. Secure Muti-party Computation (Secure Muti-party Computation) provides a technical solution that can safely perform multi-party collaborative calculations without a trusted third party.

This paper is divided into three parts: The first part discusses secure multi-party computation architecture. The second part of the study of secure multi-technology implementation. The third part analyzes secure multi-application and future development.

Definition and architecture of secure multi-party computing

definition

Secure multiparty computation can be defined as a trusted third party and does not exist in the case of a distributed network, the more participating entities each hold a secret input, and hope together to complete the calculation of a function and get results, provided that the requirements of each Participating entities cannot know any input information from other participating entities except themselves.

The following is the mathematical expression of safe multi-party calculation:

Secure multi-party computing architecture

Secure multiparty computation can be divided into two parties: the participating nodes and hub nodes. Each node involved in the same position, you can initiate collaborative computing tasks, you can also choose to participate computing tasks initiated by other parties. Hub node is not involved in the actual collaborative computing, mainly used to control the transmission network, addressing and routing computation logic transmission. In addition, in a decentralized network topology, hub nodes can be deleted, and participating nodes can connect with other participating nodes point-to-point to directly complete collaborative computing.

In the process of secure multi-party computing, each data holder can initiate a collaborative computing task, and route addressing through the hub node, and select other data holders of similar data types for secure collaborative computing. The participating nodes of multiple data holders participating in the collaborative computing will query the required data from the local database according to the computing logic, and jointly perform collaborative computing among the dense data streams for secure multi-party computing tasks. The whole process all parties plaintext data stored locally, and will not be available to other nodes. In the case of guarantee data privacy, the hub node outputs the calculation results to the entire computing tasks system, which the parties get the correct data results.

There are three main characteristics of secure multi-party computing:

Privacy . The primary purpose of secure multi-party computing is how each participant protects private data during collaborative computing, that is, it is necessary to ensure that each party’s private input is independent during the calculation process, and no local data is leaked during the calculation.
Correctness . The parties involved in multi-party computing initiate computing tasks and perform collaborative computing through an agreed secure multi-party computing agreement, and the results of the computing data are correct.
Decentralization . In secure multi-party computing, all participants have equal status, and there are no privileged participants or third parties, providing a decentralized computing model.

Figure 1: Framework diagram of secure multi-party computing technology, source: Institute of Cloud Computing and Big Data, China Academy of Information and Communications Technology

Secure multi-party computing trust environment

Trusted environment or secure multiparty computation of overall security is defined generally by the real – to express the ideal model (Real-Ideal Paradigm). In the real-ideal model, there is a virtual “ideal” environment to compare with the real environment. In an ideal environment, all parties involved will be their secret data to a trusted third party, to complete the calculation by a trusted third party. In a real environment, there is no such a trusted third party. All participants exchange information to complete collaborative calculations, and there will be situations where the adversary controls some of the participants.

A secure multi-party computing system satisfies the security under the real-ideal model, which means that the adversary in the real environment cannot cause more harm than the attacker in the ideal environment; in other words, if there is an adversary that can cause harm to the real environment, then There are also hazards that an adversary can have the same effect on the ideal environment. From the inverse proposition whether, in fact, there is no adversary can cause harm to an ideal environment, so it can be concluded: The adversary success in the real world does not exist.

Generally speaking, in secure multi-party computing, two different attacker-related security models can be defined according to the difference in the ability of the attacker.

Semi-honest model (Semi-Honest Adversaries’ Security) . In the semi-honest behavior model, it is assumed that the adversary will honestly participate in the specific agreement of secure multi-party computing, follow each step of the agreement, but will try to infer the privacy of the other party through the content obtained from the execution of the agreement.
Malicious behavior model (Malicious Adversaries’ Security). In the malicious behavior model, malicious nodes may not follow the protocol and take arbitrary actions (such as forging messages or refusing to respond) to gain the privacy of other parties.

There are many security improvements multiparty computation can reach safety in malicious behavior model, but need to pay a high performance cost, large-scale multi-party computation of security products, basically usually considered only semi-honest model, malicious behavior model solutions will seriously affect the efficiency and practicality.

Realization of secure multi-party computing

Secret sharing

Secret sharing is a technology that is often used in multi-party security signatures. It is mainly used to protect important information from being lost or tampered with. Through the secret sharing mechanism, the secret information will be split, and each participant only holds a part of the secret, and some fragments held by individuals cannot be used to recover the secret. A predetermined number (or threshold) of fragments must be collected.

Oblivious Transfer

Inadvertent transmission is a cryptographic protocol widely used in the field of secure multi-party computing. It solves the following problems:

For example, Alice has two sealed password combinations. Bob can only obtain one set of passwords and Bob hopes that Alice does not know which set of passwords he chooses. At this time, inadvertent transmission can be used to complete the transaction.

Figure 2: Schematic diagram of inadvertent transmission, source: Link News

Inadvertent transmission has two roles, sender and receiver. A feasible concrete realization process is divided into four steps:

Garbled Circuit

Confusion circuit is a secure computing concept multi Professor Yao made in the 1980s. The obfuscated circuit is a cryptographic protocol. According to this protocol, two participants can calculate a certain function without knowing each other’s data.

For Yao’s millionaires problem (Yao’s Millionaires’ Problem), for example, two millionaires Alice and Bob want to compare who’s more wealth value without knowing the precise wealth value of the other cases. For example, Alice’s wealth value is 20, and Bob’s wealth value is 15. By obfuscating the circuit, both Alice and Bob can know who is richer, but neither Alice nor Bob know each other’s wealth. Confusion core logic calculation is first converted by AND gates, OR gates, Boolean logic circuit consisting of NAND gate circuit and then to disrupt these values to mask information through public key cryptography, oblivious transfer technology, the entire process Both parties transmit passwords or random numbers, and no effective information will be leaked. Therefore, both parties have achieved the purpose of protecting private data while obtaining the calculation results.

Alice and Bob assumed that there are two sides to obfuscate circuit protocol. The implementation process of the obfuscated circuit is divided into four steps:

Alice generates the confusion circuit . It can be seen from Figure 4 that the confusion circuit generated by Alice will be connected to many logic gates. Each logic gate has an input line and an output line, and there is a set of Truth tables.
Alice communicates with Bob. Alice ranks the truth table of the logic gate truth table symmetric encryption and disrupted into confusion table (Garbled table) transmitted to Bob.
After receiving the encrypted Bob truth table, each row of the truth table to decrypt the encrypted ultimately only one line can successfully decrypt and extract the associated encrypted information. Among them, Bob obtains the corresponding decrypted string from Alice through the inadvertent transfer protocol. Inadvertent transmission can ensure that Bob obtains the corresponding decrypted string, and Alce cannot know which one Bob obtained.
Finally, Bob returns the calculation result to Alice, and both parties share the calculation result. Because the need for both the logic gate circuits for each of several symmetric key operation, and therefore the computational complexity of the scheme used to confuse circuit is also relatively high, and is further complicated when expanded to more participants computing scenarios.

Figure 3: General circuit, gate and truth table, source: secure computing and cryptography

Zero-knowledge proof

Zero-knowledge proof means that the prover can convince the verifier that a certain assertion is correct without providing any useful information to the verifier. Zero-knowledge proof of the presence of two or more parties Role: the prover (prover) and verifier (verifier). The demonstrator declares that a certain proposition is true, and the verifier confirms whether the proposition is true.

The classic zero-knowledge proof (Sigma protocol) usually consists of three steps:

Prover first discussed the proposition to send to the verifier according to the propositional content, this discussion must be treated converted into a dense state discussed (commonly known as “commitment”), and the propositional content can not be tampered with and the subsequent denial of a time.
Verifier randomly generated and issued a challenge to those who show evidence.
Prover based on the proposition discussed the challenges and generate information to prove that the verifier. The verifier uses the proof information to determine whether the prover passed the challenge.

Repeating these three steps many times can reduce the probability that the demonstrator will pass the challenge because of luck. The secret proposition provided by the prover has two functions. First, it can prevent the prover from temporarily falsifying the content of the proposition. Second, it can prevent the verifier from knowing all the information and maintain privacy.

Zero-knowledge proof has three attributes:

Completeness . If the argument is indeed true, then the honest verifier will be convinced by the honest prover.
Reliability , if the argument is false, then the prover can only deceive the honest verifier with a small probability.
Zero knowledge . The verifier can only know the result of whether the proposition is true, and cannot obtain any other useful information from the entire interactive proof process. Secure multi-party computing usually uses zero-knowledge proof as an auxiliary means, for example, verifying malicious nodes sending false data or doing node identity certification.

Secure multi-party computing applications and difficulties

For now, secure multiparty computation circuit mainly through the confusion and secret sharing two ways. The protocol based on the obfuscated circuit is more suitable for two-party logic operations, and the communication burden is lower, but the scalability is poor. The secure multi-party calculation based on secret sharing has strong scalability, supports unlimited multi-party participation in the calculation, and has high calculation efficiency, but the communication load is large.

Currently the secure multiparty computation can be divided into two parts: data integration and data capitalization.

Data Fusion

Two or more parties make data integration and cooperation is secure multiparty computation can play at the maximum value. For example, joint credit investigation . Banks have user behavior related financial data, and Internet companies typically have to use user data network, how to make data cooperation between the two parties work together to build a credit model, it is a key issue data collaboration. Using secure multi-party computing, a shared data set can be found while both parties retain privacy, and the credit model trained on the basis of multi-party data will be more accurate, thereby providing more reasonable predictions for unknown situations and reducing the externalities of data fusion .

In addition, data security is also a large storage applications. Enterprises can use secret sharing technology to store data in a secret manner, effectively preventing insiders from illegally using data. At the same time, other calculations can be performed on the stored data without decryption, which not only ensures security, but also improves calculation efficiency.

Data assetization

Secure multiparty computation have the opportunity to promote the future development of data assets and data market. Due to security issues to ensure that multi-party computation is indeed the right data from the technical level in the data transmission process, the ownership and right to use the data to draw the line, so companies or individuals will be calculated by the multi-party secure valuable data as an asset , And flow or trade in the market. The data provider can specify the use attributes of the data, such as the use, amount, and validity period, and the user of the data can only reasonably use the data within the authorized scope after receiving the data, and can quantify the use rights of the remaining data or make further circulation.

Secure multi-party nature of the data can be calculated by the data market data ownership steering the right to use, protect the interests of the owner of the original data, effectively curb the raw data leakage, reduce the risk of data leakage caused by the flow of data and promote large-scale application data.

Future challenges

With the gradual development of technologies such as blockchain and big data, our requirements for data and computing are relatively higher. For example: block chain of anonymity, data calculation requires privacy protection and so on. Therefore, in the actual use of cryptographic technologies such as secure multi-party computing, there will be problems of very high interpretation costs and low efficiency.

Secure multiparty computation involves a huge amount of computation and communications, particularly to public key operations. At present, a single operation of safe multi-party calculation can reach the millisecond level, which means that it can do hundreds of calculations per second. However, in the context of big data, a data application or model training often involves hundreds of thousands of units of data samples and feature quantities, and computational efficiency will be a problem. In addition, for certain online or real-time computing application scenarios with more complex computing tasks, secure multi-party computing may currently be difficult to afford.

HashKey: Explain the technical solutions, challenges and future of secure multi-party computing MPC