Dialogue with Amazon CTO: How to design a distributed system?

Amazon CTO Werner Vogels talks about the design principles of large distributed systems.

Original title: “Don’t lock yourself in your own structure”
Written by: tshi

As early as 2006, Jim Gray, the transaction processing pioneer and Turing Award winner, held a “first” conversation with Werner Vogels. Its theme was “learning from Amazon’s technology platform”. The coincidence is that the transaction processing Jim Gray pioneered is the technical foundation of Amazon’s e-commerce.

Recently, Akamai director Tom Killalea held a “second” conversation with Amazon CTO Werner Vogels. This time the topic was the evolutionary design of S3, the large-scale Simple Storage Service. The coincidence is that Filecoin, the largest blockchain storage project and one that is often benchmarked against S3, had launched just a month earlier.

“I think it’s important to realize that Amazon is a technology company.” In the “first” conversation, Werner Vogels repeatedly explained to Jim Gray that Amazon should be regarded not as an online bookstore but as a technology company. It was also in this conversation that Amazon first disclosed S3, the Simple Storage Service.

The name “Amazon.Com Books” did not reflect our ambitions, Tom Killalea said. When he joined Amazon in 1998 (he became a director at Akamai in March 2018), the company was just a website selling books: a simple C application called Obidos, a key-value store built on Berkeley DB, and a relational database named “ACB” (short for “Amazon.Com Books”), all deployed on 5 servers.

A constantly growing number of customers and orders led Amazon to abandon its monolithic architecture and move to a decentralized, service-oriented architecture. When Jim Gray asked about the biggest lessons Amazon had learned, Werner Vogels said:

  • The first and most important lesson is a meta-lesson: service orientation. Strict service orientation is an excellent technique for achieving isolation, and it brings a level of ownership and control not seen before. Services improved not only the technical side; development and business processes also benefited greatly. The service model is a key enabler for creating customer-centric teams that innovate rapidly. Each service has a team associated with it, and that team is fully responsible for the service, from scoping its functionality to architecting, building, and operating it.

  • The second lesson is that prohibiting clients from accessing the database directly lets you improve the scalability and reliability of a service’s state without involving the clients. Other lessons concern how services are accessed: if you want to aggregate services easily, or to insert advanced infrastructure technologies such as distributed request routing or distributed request tracking, you need a unified service access mechanism.

  • The third lesson: giving developers operational responsibility greatly improves the quality of services, from both the customer’s and the technical point of view. In the traditional model, you carry the software to the wall that separates development from operations, throw it over, and forget about it. Not at Amazon: you build it, you run it. This brings developers into contact with the day-to-day operation of their software, and with customers, every day. That customer feedback loop is essential for improving service quality.
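The second lesson can be made concrete with a small sketch. It is only an illustration under stated assumptions, not Amazon's code: the names `CatalogService`, `DictStore`, and `ShardedStore` are hypothetical. Clients talk to the service interface only, so the storage behind it can be swapped for a scaled-out version without any client change.

```python
from typing import Dict, List, Optional, Protocol


class Store(Protocol):
    """Private storage backend; only the service may touch it (hypothetical)."""

    def read(self, key: str) -> Optional[str]: ...

    def write(self, key: str, value: str) -> None: ...


class DictStore:
    """A single in-memory table, standing in for the original database."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def read(self, key: str) -> Optional[str]:
        return self._rows.get(key)

    def write(self, key: str, value: str) -> None:
        self._rows[key] = value


class ShardedStore:
    """A scaled-out replacement that spreads keys over several shards."""

    def __init__(self, shard_count: int) -> None:
        self._shards: List[DictStore] = [DictStore() for _ in range(shard_count)]

    def _shard_for(self, key: str) -> DictStore:
        # Deterministic toy placement: byte sum of the key modulo shard count.
        return self._shards[sum(key.encode()) % len(self._shards)]

    def read(self, key: str) -> Optional[str]:
        return self._shard_for(key).read(key)

    def write(self, key: str, value: str) -> None:
        self._shard_for(key).write(key, value)


class CatalogService:
    """The only entry point clients are allowed to use."""

    def __init__(self, store: Store) -> None:
        self._store = store  # never exposed to clients

    def get_title(self, item_id: str) -> Optional[str]:
        return self._store.read(item_id)

    def set_title(self, item_id: str, title: str) -> None:
        self._store.write(item_id, title)


if __name__ == "__main__":
    # The client code is identical for both backends.
    for backend in (DictStore(), ShardedStore(4)):
        service = CatalogService(backend)
        service.set_title("book-1", "Transaction Processing")
        print(type(backend).__name__, service.get_title("book-1"))
```

Because every caller goes through `CatalogService`, the same single entry point is also where request routing or request tracking could later be added.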

“Technology is useless if it is not used for the greater good of serving customers. We are a strongly customer-oriented company, and we often use the ‘working backwards from the customer’ approach. This means that your thinking starts with the customer and works backwards until you arrive at the simplest, minimal technology needed to meet the customer’s needs. It is important for engineers who come to work at Amazon to understand that we do not develop technology for technology’s sake, but to support customers.”

“Service-oriented architecture, the way we scale, the way we serve customers: I think our greatest success is that Amazon has become a platform from which other companies can benefit.”

By turning its technology and business into services, Amazon built a fast feedback loop with its users and set a fast-growing flywheel in motion.

When S3 launched in March 2006, it consisted of only 8 services. By 2019, it had grown to 262 services. In the conversation with Tom Killalea, Werner Vogels said: “I totally agree that this scale is unprecedented. Even today, even though Internet services have reached incredible scale, I think S3 is still two or three generations ahead.”

In the 2006 S3 announcement, Amazon listed the following ten principles of distributed system design that it had adopted to meet the requirements of Amazon S3:

  • Decentralization: Use completely decentralized technology to eliminate scaling bottlenecks and single points of failure.

  • Asynchrony: The system keeps making progress under all circumstances.

  • Autonomy: A single component can make decisions based on local information.

  • Local responsibility: Each component is responsible for achieving its own consistency, which is by no means the responsibility of other peer nodes.

  • Controlled concurrency: Operations are designed to require no or limited concurrency control.

  • Fault tolerance: Component failure is considered a normal mode of operation, and the system continues operating with no or minimal interruption.

  • Controlled parallelism: The abstractions used in the system are of such granularity that parallelism can be used to improve performance and the robustness of recovery or of introducing new nodes.

  • Break it down into small, easy-to-understand building blocks: Don’t try to provide a single service that does everything, but build small components that can be used as building blocks for other services.

  • Symmetry: The nodes in the system are identical in terms of functionality and require no, or at most minimal, node-specific configuration to operate.

  • Simplicity: The system should be as simple as possible, but no simpler.

The ten principles above are how Amazon builds a large-scale distributed system. S3 is just an example of these design principles.
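Two of these principles, symmetry and fault tolerance, can be shown in a toy sketch. It says nothing about how S3 is actually implemented; `Node`, `NodeDown`, `replicated_put`, and `replicated_get` are hypothetical names chosen for illustration. Every node runs the same code, and a failed node is simply skipped rather than treated as a fatal error.

```python
from typing import Dict, List


class NodeDown(Exception):
    """Raised by a node that is currently unavailable."""


class Node:
    """Every node runs identical code and needs no special role (symmetry)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        if not self.up:
            raise NodeDown(self.name)
        self._data[key] = value

    def get(self, key: str) -> bytes:
        if not self.up:
            raise NodeDown(self.name)
        return self._data[key]  # KeyError if this node never saw the key


def replicated_put(nodes: List[Node], key: str, value: bytes) -> int:
    """Write to every reachable node and return how many copies were stored."""
    stored = 0
    for node in nodes:
        try:
            node.put(key, value)
            stored += 1
        except NodeDown:
            continue  # failure is a normal mode of operation
    return stored


def replicated_get(nodes: List[Node], key: str) -> bytes:
    """Return the value from the first reachable node that holds the key."""
    for node in nodes:
        try:
            return node.get(key)
        except (NodeDown, KeyError):
            continue  # try the next peer instead of failing the request
    raise KeyError(key)


if __name__ == "__main__":
    cluster = [Node("a"), Node("b"), Node("c")]
    replicated_put(cluster, "photo", b"bytes")
    cluster[0].up = False  # one node fails; reads still succeed
    print(replicated_get(cluster, "photo"))
```

A real system would also have to repair nodes that missed writes while they were down; the sketch leaves that out on purpose.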

Principles are gray, but customer needs are evergreen. Building on the principles above, Werner Vogels proposed an evolutionary architecture.

At the time, most technology companies delivered everything at once as a “platform”. They would hand customers a thick book and 10 different partners, and then tell them how to use the technology. Amazon did not lock itself into its own technology and took a different path. As Jeff Bezos said many years ago, the goal is to build tools, not platforms. Platforms are the old way in which large software companies delivered technology.

“Before we started S3, we began to realize that what we were doing might fundamentally change the way software is built and how services are used. But we did not know how this would develop, so it was more important to build small, flexible tools that customers could build on (or that we could build on ourselves) than to have everything ready as one “platform” at a specific moment. Timing was not the issue. Much more, we firmly believed that whatever we added to the S3 interface or to the functionality of S3 should be driven by our customers, and by how the next generation of customers would begin to build their systems.”

“In the past five to ten years, software has undergone fundamental changes. We need to build the right tools to support that rate of change. And with that, you cannot predict everything; you have to work with your customers. Wait and see how they use your tools, especially tools of a kind that has never been built before, and observe what they do. Then we sit down and ask ourselves: what is the minimum set?”

“You have to design the API consciously and carefully. An API is forever. Once you put an API out there, you may be able to provide a new version, but you can’t take it away from your customers. Being conservative and minimal in API design helps you build fundamental tools. You can add more features to these tools, partners can build new layers on top of them, or you can combine different building blocks. That was our idea from the start: be minimalist, so that we can let our customers drive what happens next instead of sitting in a back room deciding what the world should look like.”
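The API advice above can be sketched as follows. This is a hypothetical in-memory example, not the S3 interface: `ObjectStore`, its methods, and `copy_prefix` are invented for illustration. The first release exposes a minimal set of primitives, later features are purely additive, and higher layers are built outside the core.

```python
from typing import Dict, List, Optional, Tuple


class ObjectStore:
    """Hypothetical minimal storage API, kept deliberately small."""

    def __init__(self) -> None:
        self._objects: Dict[str, Tuple[bytes, Dict[str, str]]] = {}

    # --- first release: the minimum set of primitives ---

    def put(self, key: str, data: bytes, *,
            metadata: Optional[Dict[str, str]] = None) -> None:
        # `metadata` stands for a later addition: keyword-only with a default,
        # so every call written against the first release keeps working.
        self._objects[key] = (data, dict(metadata or {}))

    def get(self, key: str) -> bytes:
        return self._objects[key][0]

    def delete(self, key: str) -> None:
        self._objects.pop(key, None)

    def list_keys(self, prefix: str = "") -> List[str]:
        return sorted(k for k in self._objects if k.startswith(prefix))

    # --- later release: a new method is added, nothing is removed ---

    def get_metadata(self, key: str) -> Dict[str, str]:
        return dict(self._objects[key][1])


def copy_prefix(store: ObjectStore, src: str, dst: str) -> int:
    """A higher layer built only from the primitives, outside the core API."""
    copied = 0
    for key in store.list_keys(src):  # snapshot taken before any writes
        store.put(dst + key[len(src):], store.get(key))
        copied += 1
    return copied


if __name__ == "__main__":
    store = ObjectStore()
    store.put("logs/1", b"first")                        # first-release call
    store.put("logs/2", b"second", metadata={"owner": "ops"})
    print(copy_prefix(store, "logs/", "backup/"), store.list_keys("backup/"))
```

The design choice is that `copy_prefix` needs no change to `ObjectStore`, so a customer or partner could write it without waiting for the provider.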

These design decisions are reflected in Amazon’s data lake. Because it is built from building blocks and tools, S3 goes far beyond a data lake: around the data sits a huge toolbox (175 different services).

In the interview, S3’s design decisions also included:

Don’t lock yourself into your own structure. Werner Vogels said this when reviewing the design principles of S3. A complex system that works has always evolved from a simple system that worked. A complex system designed from scratch never works, and it cannot be patched to make it work. You must start with a simple system that works.

Readers may not need to read the full text of the two interviews, but the points summarized in this article are worth remembering and thinking about: service orientation, the ten principles of distributed system design, building tools instead of platforms, and not locking yourself into your own structure.

Source link: mp.weixin.qq.com