Privacy computing: the hot spot within the Web3.0 hot spot?

Last month, Gartner, the well-known information technology research and consulting firm, released its list of strategic technology trends to watch in 2021. Privacy computing was singled out as a major topic, whereas in the previous year's Gartner report it appeared only as a minor item. This suggests that the value of “privacy computing” is likely to rise over the next two years.

The ultimate goal of privacy computing is not privacy

Why has privacy computing been mentioned more and more frequently in recent years? In the past, when privacy came up, the discussion was mostly about data security, and the adversaries were hackers. Now the “enemy” has changed: the conflict is between users and enterprises.

Controversy over companies profiting from consumer data keeps growing. More consumers are realizing the value hidden in their personal data, and people increasingly want to control their data assets. This has also led governments to tighten their supervision of personal data, which ultimately puts organizations under growing pressure to protect and manage it.

Users themselves are also conflicted. One sentence captures the dilemma: “Chinese users are willing to sacrifice privacy for greater convenience.” We all know that 3G/4G networks, together with the development of the Internet and big data, have greatly improved the user experience. For example, we can find exactly the content we want to see and the products we want to buy, and we can hail the nearest and most convenient car. But these conveniences do come at the cost of privacy. On the one hand, users want to keep enjoying them; on the other hand, they don't want companies to have too much control over their private data. On the surface, this looks like a deadlock.

Given these tensions, protecting private data is the most direct starting point. But that does not capture the whole concept of privacy computing. Consumers and companies have different motives and needs when it comes to privacy. Consumers may only want to protect their identities, credit card information, or other sensitive data to prevent fraud or identity theft. Companies, however, rely on this private data to create business growth opportunities, and at the same time they bear the pressure of protecting user information and sensitive corporate information throughout the entire process of using it.

Therefore, privacy computing has two parts: first privacy, then computing. The ultimate goal is to preserve users' data sovereignty while still allowing the data to generate value through computation. Privacy computing is thus not just static protection of data, but protection throughout the process of using it.

The absolute prerequisite for the migration from Web2.0 to Web3.0

“Data is an asset” has become a key feature of the transition from Web2.0 to Web3.0. Once users have established sovereignty over their own data, another problem arises. Data held firmly in the user's hands has no value on its own. Value is generated only when the data circulates and is processed with algorithms and computing power.

But the question is: after I hand over my data, will my data sovereignty be compromised? Will the company use my data for something outside the scope I authorized? Without a secure solution, it is difficult for companies to convince users. The end result is that users hold data that has no value in their hands, while enterprises cannot obtain enough high-quality data to generate commercial value.

This is the problem privacy computing sets out to solve. Why is privacy computing an absolute prerequisite for the migration from Web2.0 to Web3.0? How can we find the right entry point into this fast-growing sector? Answering these questions requires a more comprehensive understanding of privacy computing technology.

The standard definition of privacy computing is: “the problem of collaborative computing among a group of mutually distrusting parties, under the premise of protecting private information and without a trusted third party”. The famous “millionaires' problem” illustrates it well. Two millionaires want to know which of them is richer. The most direct way is to compare their wealth openly, but neither party wants to disclose how much money they have. This amount is the equivalent of the personal data in our hands. So we design an algorithm or device into which the two millionaires each enter their wealth out of the other's sight. The sealed algorithm or device outputs only the answer to who has more money, and throughout the whole process neither learns how much the other has. That process, algorithm, or device is privacy computing.
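As a rough illustration of the sealed “algorithm or device” in the analogy, the sketch below models it as a black box that accepts two private inputs and releases only the comparison result. The class name `SealedComparator` and the sample amounts are hypothetical. This models the idealized box itself, which is a trusted party; the cryptographic protocols described later aim to achieve the same input/output behavior without anyone having to trust such a box.

```python
class SealedComparator:
    """Hypothetical stand-in for the closed 'algorithm or device' in the analogy."""

    def __init__(self):
        self._inputs = {}  # private amounts, never exposed to callers

    def submit(self, party, wealth):
        # Each party hands its private value to the box; nothing is echoed back.
        self._inputs[party] = wealth

    def richer(self):
        # Only the comparison result leaves the box, never the amounts.
        if len(self._inputs) != 2:
            raise ValueError("exactly two parties must submit first")
        (name_a, wealth_a), (name_b, wealth_b) = self._inputs.items()
        if wealth_a == wealth_b:
            return "tie"
        return name_a if wealth_a > wealth_b else name_b


box = SealedComparator()
box.submit("Alice", 7_000_000)
box.submit("Bob", 5_500_000)
print(box.richer())  # Alice
```

Both parties learn who is richer, and that single result is all the box ever outputs.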

Privacy computing covers all computing operations performed by information collectors, publishers, and users across the entire life cycle of information: generation, perception, release, dissemination, storage, processing, use, and destruction. In other words, from the moment the data is generated, no one except the data owner knows what the data is. Others can use the data with the owner's authorization, but they cannot see its specific contents.

This resolves the earlier contradiction, because companies may not need to know your specific information. What they really need is the value generated after the data is processed and computed.

Moreover, in the real market each company holds different amounts of data along different dimensions. For example, Tencent holds data on the social habits of Chinese users, while Taobao holds a large amount of shopping behavior data. No single-sided dataset can fully and accurately outline all the characteristics of a target. Data circulation is a rigid demand for enterprises in the digital economy era: more and more enterprises and organizations need to cooperate with upstream and downstream partners in their industry chain on data circulation and transactions, because only collaborative computation over data from all parties can release the greater value of data, improve production efficiency, and promote industrial innovation. However, because of factors such as data ownership, the risk of data leakage, and their own business interests, companies and organizations holding large amounts of data are extremely cautious about opening up their internal data, especially core data. This is the “millionaires' problem” between companies.

For example, an e-commerce company wants to use user data to build a recommendation algorithm, but it has no way to get user data from Taobao or JD.com. If it buys data directly from users, it worries about people gaming the offer with fake or duplicated data. Connecting to a blockchain changes this. I can sell only my Taobao and JD shopping data, without providing my delivery address, name, Alipay account, and so on. Because every sale is recorded on the blockchain, I can neither sell the same data to this company twice nor sell it fake data. In return, the company pays me and obtains real data to improve its algorithm, killing two birds with one stone.

Privacy computing protects data while it is in use, preserving its confidentiality and privacy.

As mentioned earlier, one of the core features of Web3.0 is the confirmation of data ownership, so users can be said to have complete sovereignty only when the data always stays in the owner's hands and remains protected. Therefore, in moving from the platform-controlled traffic economy of the Web2.0 era to the individual-dominated data economy of Web3.0, privacy computing is the absolute key to the transformation.

The technology landscape of privacy computing

Over the past year or two, we may have heard of zero-knowledge proofs, TEE, and homomorphic encryption here and there, but it is hard to form a clear picture of privacy computing as a whole. Broadly speaking, privacy computing is divided into two major directions: trusted hardware and cryptography.

Trusted hardware direction: TEE

Trusted hardware means establishing a secure area inside a piece of hardware that we can trust, and then performing all data processing and computation inside this closed secure area. Security, however, depends entirely on the hardware and on whoever builds the hardware's secure environment, usually Intel or another chip manufacturer (for example Intel's SGX, ARM's TrustZone, and AMD's SEV).

The most representative example is the TEE (trusted execution environment), and other technologies such as the “neutral country” and the “security sandbox” are based on the same idea. The advantage of trusted hardware is its high versatility: it can handle almost all types of privacy computing tasks, with low development difficulty and higher performance. The biggest problem is that staking security on a single vendor is tantamount to leaving a potential backdoor in the hands of the trusted hardware provider and designer, which from the outside implies a huge risk. Trusted hardware solutions are therefore more likely to be used in scenarios that do not demand extreme security but do require speed and a good user experience.

Cryptography direction: secure multi-party computation

Secure multi-party computation is commonly known as MPC. It means that, without a trusted third party, multiple parties can securely complete a collaborative computation of an agreed function. Each party provides its own data secretly, and each party learns only the final result of the function.
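One simple instance of this idea is a secure sum built on additive secret sharing. The sketch below is a single-process simulation under assumed parameters: the names `MODULUS`, `make_shares`, and `secure_sum` are illustrative, and the parties are assumed to follow the protocol honestly. Each party splits its input into random-looking shares, so no individual share reveals anything about the input, yet the shares combine into the correct total.

```python
import secrets
from typing import List

# Assumed public modulus; it only needs to be larger than the true sum.
MODULUS = 2**61 - 1


def make_shares(value: int, n_parties: int) -> List[int]:
    """Split value into n shares that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_inputs: List[int]) -> int:
    n = len(private_inputs)
    # Step 1: every party splits its input and sends share j to party j.
    distributed = [make_shares(x, n) for x in private_inputs]
    # Step 2: party j adds up the shares it received, one from each party.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Step 3: the partial sums are published and combined into the result.
    return sum(partial_sums) % MODULUS


salaries = [52_000, 61_500, 48_250]
print(secure_sum(salaries))  # 161750
```

Every party ends up knowing the total, but none of them ever sees another party's individual input.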

Secure multi-party computation is the direction with the longest history and the deepest technical foundation. Nearly every other privacy computing technology or solution borrows some MPC techniques. It is also the direction that has been most closely integrated with blockchain in the past few years. Blockchains can use MPC to improve data confidentiality and so adapt to more application scenarios, while MPC can use blockchain technology to perform redundant computation and thereby become verifiable. Blockchain + MPC has developed extremely fast recently and leaves a lot of room for imagination.

Secure multi-party computation can be regarded as a protocol or a collection of technologies. The terms we often hear, such as zero-knowledge proofs, homomorphic encryption, oblivious transfer, and garbled circuits, are all tools for secure multi-party computation. Because of its relatively low versatility and moderate performance, MPC is often used in specific scenarios, such as joint data analysis between large institutions and enterprises, or blacklist deduplication between banks. In recent years, however, its performance has improved rapidly, and its application scope and value have grown dramatically.

Garbled circuits and oblivious transfer are the core technologies of mainstream MPC frameworks. During the computation, the logic or function is converted into a circuit, the circuit is encrypted (“garbled”), and the garbled circuit and related information are sent to the other participant. The receiver then obtains the labels for its own inputs through oblivious transfer and uses them to decrypt the garbled circuit and obtain the result.
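The label-selection step relies on 1-out-of-2 oblivious transfer: the sender holds two messages, the receiver obtains exactly one of them, and the sender does not learn which. The sketch below simulates both roles in one process using a textbook RSA-based construction. That construction is chosen here purely for illustration and is not named in the text above; the key is a toy built from two small known primes, and the function name `oblivious_transfer` is hypothetical. It is not secure for real use.

```python
import secrets

# Toy RSA key owned by the sender; far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1          # two known Mersenne primes
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (Python 3.8+)


def oblivious_transfer(m0: int, m1: int, choice: int) -> int:
    """Simulate both roles of a 1-out-of-2 oblivious transfer."""
    # Sender: publish two random values, one per message.
    x = [secrets.randbelow(N), secrets.randbelow(N)]

    # Receiver: blind the chosen x with a random k; v hides the choice.
    k = secrets.randbelow(N)
    v = (x[choice] + pow(k, E, N)) % N

    # Sender: unblind v against both x values. Exactly one result equals k,
    # but the sender cannot tell which one.
    k0 = pow((v - x[0]) % N, D, N)
    k1 = pow((v - x[1]) % N, D, N)
    masked = [(m0 + k0) % N, (m1 + k1) % N]

    # Receiver: only the chosen message unmasks correctly with k.
    return (masked[choice] - k) % N


print(oblivious_transfer(1111, 2222, choice=1))  # 2222
```

In a garbled-circuit protocol, `m0` and `m1` would be the two labels of one input wire, and `choice` would be the receiver's private input bit.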

A garbled circuit builds on the fact that any computation can be expressed as a circuit. For example, computing a sum corresponds to an addition circuit. The garbled circuit hides the key information by encrypting and scrambling the values on the circuit's wires, so that no one can see the actual underlying information during the process, while the task the circuit encodes is still completed. Oblivious transfer means that sender A holds several messages and receiver B obtains exactly one of them: A does not learn which message B chose, and B learns nothing about the others.
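To make the idea concrete, here is a minimal sketch of garbling and evaluating a single AND gate. It is a simplified teaching version under stated assumptions: the helper names (`kdf`, `garble_and_gate`, `evaluate`) are hypothetical, SHA-256 serves as the row-encryption pad, and an all-zero tag marks the row that decrypts correctly. Production frameworks use more efficient and more carefully analyzed constructions.

```python
import hashlib
import secrets

LABEL_BYTES = 16
TAG = b"\x00" * 8  # zero padding lets the evaluator recognise the right row


def kdf(label_a: bytes, label_b: bytes) -> bytes:
    # One-time pad derived from the two input labels.
    return hashlib.sha256(label_a + label_b).digest()[: LABEL_BYTES + len(TAG)]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def garble_and_gate():
    # Two random labels per wire: index 0 encodes bit 0, index 1 encodes bit 1.
    wires = {
        name: (secrets.token_bytes(LABEL_BYTES), secrets.token_bytes(LABEL_BYTES))
        for name in ("a", "b", "out")
    }
    table = []
    for bit_a in (0, 1):
        for bit_b in (0, 1):
            out_label = wires["out"][bit_a & bit_b]
            pad = kdf(wires["a"][bit_a], wires["b"][bit_b])
            table.append(xor(pad, out_label + TAG))
    # Shuffle so that row position does not leak the input bits.
    secrets.SystemRandom().shuffle(table)
    return wires, table


def evaluate(table, label_a: bytes, label_b: bytes) -> bytes:
    # The evaluator holds one label per input wire and can open only one row.
    pad = kdf(label_a, label_b)
    for row in table:
        plain = xor(pad, row)
        if plain.endswith(TAG):
            return plain[:LABEL_BYTES]
    raise ValueError("no row decrypted correctly")


wires, table = garble_and_gate()
# In a real protocol the evaluator would get its own input label through
# oblivious transfer; here the labels are handed over directly.
result = evaluate(table, wires["a"][1], wires["b"][1])
print(result == wires["out"][1])  # True
```

The evaluator sees only random-looking labels, and only the garbler knows which output label stands for 0 and which for 1.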

Then there is the star technology of recent years, the zero-knowledge proof. The principle can be explained with a door: A and B are separated by a locked door, and B needs to prove to A that B holds the key. How can B convince A, standing on the other side, without handing over the key? B simply lets A watch as B opens the locked door and walks through it. A can then trust that B holds the key to the door.

A zero-knowledge proof lets one party prove to another that a statement is true without revealing any information beyond the truth of the statement itself. The prover does not tell you the answer directly, but demonstrates it in another way until you are convinced that the prover really does know the answer. There are many types of zero-knowledge proof technologies, such as zk-SNARKs, used by Zcash, and zk-STARKs, frequently mentioned by Vitalik Buterin.
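The door analogy can be mapped onto a small interactive protocol. The sketch below uses the Schnorr identification protocol, chosen here as an illustration and not named in the text above: the prover shows knowledge of a secret exponent `x` behind a public value `y` without revealing `x`. The group parameters are toy-sized assumptions and the function names are hypothetical. Note that this interactive form is zero-knowledge only against an honest verifier and is much simpler than zk-SNARKs or zk-STARKs.

```python
import secrets

# Toy group: P = 2*Q + 1 with P, Q prime, and G = 4 has order Q modulo P.
P, Q, G = 2039, 1019, 4


def keygen():
    x = secrets.randbelow(Q - 1) + 1   # the secret "key to the door"
    return x, pow(G, x, P)             # (secret x, public value y = G^x mod P)


def prove_round(x, challenge_fn):
    r = secrets.randbelow(Q)           # fresh one-time nonce
    t = pow(G, r, P)                   # commitment sent to the verifier
    c = challenge_fn(t)                # verifier replies with a random challenge
    s = (r + c * x) % Q                # response; the nonce r masks x
    return t, c, s


def verify(y, t, c, s):
    # Accept if G^s == t * y^c (mod P), which holds when s = r + c*x.
    return pow(G, s, P) == (t * pow(y, c, P)) % P


secret, public = keygen()
t, c, s = prove_round(secret, lambda commitment: secrets.randbelow(Q))
print(verify(public, t, c, s))  # True
```

The verifier sees only `t`, `c`, and `s`. Because `r` is random and used once, `s` on its own does not reveal `x`.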

Zero-knowledge proofs are currently widely regarded as the “perfect” solution. Although the technology is the most difficult to implement, it is considered the most effective at protecting privacy and security on the Ethereum network.

Finally, there are two basic building blocks: secret sharing and homomorphic encryption. Secret sharing splits a secret among n participants so that the secret can be computed or recovered only when at least a threshold number t of participants cooperate, while fewer than t participants learn nothing about it. Homomorphic encryption encrypts the information before sending it to the other party. The difference from ordinary encryption is that the agreed computation can be carried out directly on the ciphertext, and decrypting the result gives the same answer as performing the computation on the original plaintext. Computing on ciphertext in a distributed environment has a wide range of applications, such as secure cloud computing and delegated computing, multi-party confidential computing, anonymous voting, file storage, and ciphertext retrieval. On a blockchain, homomorphic encryption allows smart contracts to process ciphertext without knowing the underlying data, which can greatly improve privacy and security.
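As one concrete example of computing on ciphertext, the sketch below implements a toy version of the Paillier cryptosystem, which is additively homomorphic. Paillier is an illustrative choice not named in the text above; the primes are deliberately tiny and the function names are hypothetical. Multiplying two ciphertexts yields an encryption of the sum of the plaintexts, so a party without the private key can add numbers it cannot read.

```python
import math
import secrets

# Toy primes; real deployments use primes of 1024 bits or more.
P, Q = 1009, 1013
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1), private
MU = pow(LAM, -1, N)  # valid because the generator is fixed to N + 1


def encrypt(m: int) -> int:
    # Pick a random r coprime to N so equal plaintexts encrypt differently.
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    u = pow(c, LAM, N2)
    return ((u - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    # Multiplying ciphertexts adds the underlying plaintexts.
    return (c1 * c2) % N2


def scale_encrypted(c: int, k: int) -> int:
    # Raising a ciphertext to the power k multiplies the plaintext by k.
    return pow(c, k, N2)


total = add_encrypted(encrypt(20), encrypt(22))
print(decrypt(total))  # 42
```

Only the holder of `LAM` and `MU` can decrypt. Whoever performs `add_encrypted` sees nothing but ciphertexts, and plaintext results must stay below `N` to be recovered correctly.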

Federated computing and differential privacy are relatively new approaches that have drawn attention in the past two years, and we will not go into them here.

In general, privacy computing has become popular only in recent years, but behind it lies a pursuit of encryption and privacy protection that has never stopped. Privacy computing builds on static data security to meet new market needs. As the big data industry keeps developing and the demand for secure data sharing grows, privacy computing technology will find more and more real-world applications.

At present, the main application field of privacy computing is still the financial industry. Many financial institutions combine data from various sources to build risk profiles of customers or investments without revealing customer information, in order to achieve risk control. Sensitive business scenarios such as health care are another important direction. For example, the previously controversial health code system may involve disputes about privacy leakage. Conflicts like this will intensify when Web3.0 arrives, and privacy computing is the key to resolving them.