DeFi’s next important primitive: exploring feedback control in applications such as Ampleforth and THORChain

November 18, 2020

Emerging DeFi protocols vary greatly in function and use, but some primitives have become common components. Feedback control systems are one possible way to improve protocol incentives, efficiency and flexibility.

Original Title: “Feedback Control: The Next Important Primitive of DeFi”
Written by: Hsien-Tang Kao and Tarun Chitra, working at Gauntlet
Compilation: free and easy

The original authors are Hsien-Tang Kao and Tarun Chitra from Gauntlet. In this article, they use Ampleforth’s rebase mechanism, RAI’s reflex index, EIP-1559’s fee market proposal, and THORChain’s incentive pendulum mechanism to illustrate how feedback controllers are used in different mechanisms. They also show how feedback control makes it possible to price derivatives on-chain.


This year, we have seen a large number of new DeFi protocols that provide new mechanisms to support trading, lending, and other financial activities. Although these protocols differ greatly in function and usage, some primitives have become common components of many new protocols. Among them, constant function market makers (CFMMs) and automatic interest rate curves are the two most popular DeFi components, and they appear in many DeFi products (such as Uniswap and Compound). As the industry converges on these primitives, a question arises: is there a better choice?

In fact, a feedback control system is one possible way to improve protocol incentives, efficiency and flexibility.

What is feedback control?

“Feedback is a central feature of life. The process of feedback governs how we grow, respond to stress and challenge, and regulate factors such as body temperature, blood pressure and cholesterol level. The mechanisms operate at every level, from the interaction of proteins in cells to the interaction of organisms in complex ecologies.”
Mahlon Hoagland and Bert Dodson, “The Way Life Works”, 1995

Control theory has been extensively studied in applied mathematics, electrical engineering and robotics.

It has a wide range of applications in many industries, including aerospace systems, autonomous vehicles and Internet of Things devices. In their classic textbook “Feedback Systems”, Karl Johan Åström and Richard M. Murray define control as the use of algorithms and feedback in engineered systems.

[1] Open loop system

[2] Closed loop system

Figures [1] and [2] illustrate the difference between open-loop and closed-loop control systems. In an open-loop system, the controller output is independent of the system output. In contrast, the controller of a closed-loop (feedback) system uses the system output as an additional input. In a closed-loop system, the system dynamics depend on the controller dynamics and the controller dynamics depend on the system dynamics, which couples the two. Because of this circular dependence, it is important to analyze a feedback system carefully.
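The difference between the two architectures can be sketched with a toy simulation. Everything here is an assumption made for illustration: the scalar state, the constant unmodelled drift, the gain and the target are hypothetical values, not taken from the article. The open-loop controller follows a plan computed in advance, while the closed-loop controller reads the system output at every step.

```python
def simulate(closed_loop: bool, steps: int = 50) -> float:
    """Drive a scalar state x toward a target and return the final state."""
    target = 10.0
    x = 0.0
    disturbance = -0.2              # assumed drift the open-loop plan does not know about
    gain = 0.5                      # assumed closed-loop proportional gain
    planned_push = target / steps   # open-loop plan: equal pushes, ignoring drift

    for _ in range(steps):
        if closed_loop:
            u = gain * (target - x)  # controller uses the system output
        else:
            u = planned_push         # controller ignores the system output
        x = x + u + disturbance
    return x


open_loop_final = simulate(closed_loop=False)
closed_loop_final = simulate(closed_loop=True)
print(f"open loop final state:   {open_loop_final:.2f}")
print(f"closed loop final state: {closed_loop_final:.2f}")
```

In this toy model the open-loop plan is cancelled out by the drift, while the closed-loop controller settles near the target. The small remaining gap is the steady-state error typical of a purely proportional controller.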

A brief history of feedback control and reinforcement learning

The proportional-integral-derivative (PID) controller is the most common feedback controller. It uses the difference between the desired system state and the observed state to continuously calculate the control signal. In 1922, Nicolas Minorsky published the first theoretical analysis of a PID controller, developed for the automatic steering of US Navy ships. In the 1950s, commercial digital computers arrived, which led to the rapid development of optimal control theory. The central problem of optimal control is to find a control law that produces an optimal state trajectory, one that minimizes or maximizes some measure of the dynamical system’s behavior. Richard E. Bellman’s principle of optimality (the Bellman equation), dynamic programming and Markov decision processes were all developed in this era to solve optimal control problems.

In the late 1980s and early 1990s, early work at the intersection of optimal control and artificial intelligence gave rise to reinforcement learning. Reinforcement learning solves optimal control problems through trial-and-error learning or approximation, without full knowledge of the system. In the past two decades, advances in computing and deep learning have produced a new wave of successful deep reinforcement learning algorithms. Deep reinforcement learning extends reinforcement learning with deep neural networks, so the state space no longer has to be designed explicitly. DeepMind has used these algorithms to create artificial agents that play Atari games and Go better than humans.

PID controller

An intuitive way to understand feedback control and the PID controller is to start with the proportional controller (P controller):

u(t) = K_p e(t)

where K_p is a constant. In a proportional controller, the control input u(t) is proportional to the error e(t) between the observed output and the desired system output.

Here we show how a thermostat uses a feedback mechanism to control room temperature. Suppose the current temperature is 90°F and the thermostat is set to 70°F, so the error is 20°F. With K_p = 0.1 kW/°F, the thermostat drives the air conditioner at u(t) = 2 kW to cool the room.

When the temperature drops to 80°F, the error shrinks to 10°F and the air conditioner’s output falls to 1 kW. In this example, the thermostat outputs a control signal that changes the air conditioner’s power and lowers the temperature. The thermostat then measures the new temperature error and updates the control signal. This feedback loop makes the room temperature gradually converge to the desired temperature.
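The thermostat numbers can be reproduced with a short sketch. The gain and setpoint come from the example; the room model (each kW of cooling removes 1°F per time step, with no heat leaking back in) is a hypothetical simplification added only so the loop can be simulated.

```python
K_P = 0.1               # proportional gain in kW per °F (from the example)
SETPOINT_F = 70.0       # desired temperature (from the example)
COOLING_F_PER_KW = 1.0  # assumed: °F removed per step for each kW of cooling


def cooling_power(temp_f: float) -> float:
    """Proportional control law u = K_p * e, with e = observed - desired."""
    error = temp_f - SETPOINT_F
    return max(0.0, K_P * error)  # an air conditioner cannot heat the room


def simulate(start_temp_f: float, steps: int) -> list:
    """Return the temperature trajectory under proportional control."""
    temps = [start_temp_f]
    for _ in range(steps):
        u = cooling_power(temps[-1])
        temps.append(temps[-1] - COOLING_F_PER_KW * u)
    return temps


temps = simulate(90.0, 30)
print(cooling_power(90.0))  # cooling power at 90°F
print(cooling_power(80.0))  # cooling power at 80°F
print(round(temps[-1], 2))  # temperature after 30 steps
```

Under these assumptions the error shrinks by 10% per step, so the temperature approaches 70°F from above without ever undershooting it.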

PID controller block diagram (Source: Wikipedia)

The PID controller extends the concept of a proportional controller. In addition to the current error e(t), it also uses the accumulated error \int_0^t e(\tau) d\tau and the rate of change of the error \frac{de(t)}{dt} to calculate the control input:

u(t) = K_p e(t) + K_i \int_0^t e(\tau) d\tau + K_d \frac{de(t)}{dt}

where K_p, K_i and K_d are all constants.
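A discrete-time version of the PID law is short enough to sketch. This is a minimal illustration, not a production controller: the integral is approximated by a running sum, the derivative by a backward difference, and the gains in the usage example are arbitrary. The class name `PIDController` is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PIDController:
    kp: float  # proportional gain K_p
    ki: float  # integral gain K_i
    kd: float  # derivative gain K_d
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, error: float, dt: float = 1.0) -> float:
        """Return u = K_p*e + K_i*sum(e*dt) + K_d*(e - e_prev)/dt."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no previous sample yet
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


pid = PIDController(kp=0.5, ki=0.1, kd=0.05)
for error in [1.0, 0.5, 0.25]:
    print(round(pid.update(error), 4))
```

The controller keeps only two numbers between updates (the running integral and the previous error), so its memory footprint does not grow with the length of the history.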

Feedback control and DeFi

Feedback control is a simple and powerful idea, and it has been widely used in the real world. In addition to existing applications, feedback control is also an important part of DeFi applications.

Suppose a protocol has a high-level goal. The protocol measures the distance between the current state and that goal, and uses a feedback mechanism to update protocol parameters so that market participants are encouraged to push the system toward the desired state. For example, a stablecoin protocol wants to peg its token to 1 U.S. dollar. The protocol continuously adjusts the interest rate based on the stablecoin price. When the stablecoin price is above 1 U.S. dollar, the protocol lowers the interest rate and encourages participants to issue more stablecoins. Otherwise, the protocol raises the interest rate and incentivizes participants to repay debt. Through this algorithmic adjustment of interest rates, the market can reach a balance between supply and demand with the stablecoin priced around $1.
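The stablecoin example can be sketched as a proportional update of the interest rate. This is a hypothetical illustration and does not describe any specific protocol: the gain `K_P`, the rate floor `MIN_RATE` and the function name `next_rate` are all assumptions.

```python
TARGET_PRICE = 1.00  # peg in USD (from the text)
K_P = 0.5            # hypothetical gain: rate change per unit of price error
MIN_RATE = 0.0       # hypothetical floor on the interest rate


def next_rate(rate: float, price: float) -> float:
    """Lower the rate when price is above the peg, raise it when below."""
    error = price - TARGET_PRICE
    return max(MIN_RATE, rate - K_P * error)


print(round(next_rate(0.05, 1.02), 4))  # price above peg: rate falls
print(round(next_rate(0.05, 0.98), 4))  # price below peg: rate rises
```

The controller only sets the incentive. Whether the price actually returns to the peg depends on how market participants respond to the new rate.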

Many DeFi applications already use this pattern, implicitly or explicitly, in their protocol design. Here we use Ampleforth’s rebase mechanism, RAI’s reflex index, EIP-1559’s fee market proposal and THORChain’s incentive pendulum mechanism to illustrate how feedback controllers are used in different mechanisms. We also show how feedback control makes it possible to price derivatives on-chain.

Volatility-dampening assets

Ampleforth and RAI pioneered the concept of uncorrelated, low-volatility crypto assets. At first glance, these protocols seem to have different underlying mechanisms. AMPL dynamically adjusts supply to address the problem of supply inelasticity, while RAI uses a dynamic redemption rate mechanism to minimize the volatility of its reflex index. However, both protocols are essentially feedback control systems designed to create a volatility-dampening asset. The main difference between them is that they use different control inputs. We will use the feedback control framework to show the similarities and differences between the two protocols.

Ampleforth Rebase mechanism

AMPL is a digital asset that dynamically adjusts its supply according to its market price. When the price of AMPL is above $1, the supply expands; when it is below $1, the supply contracts. This expansion and contraction of the token supply encourages rational AMPL traders to step in and push the AMPL price toward the $1 target.

To formulate the rebase mechanism, we first define the error as the difference between the observed value and the target value:

e(t) = observed(t) - target(t)

Assuming the target value is $1 and the observed value is the current price p(t), the error term is:

e(t) = p(t) - 1

When the magnitude of the price deviation |e(t)| exceeds the deviation threshold d_t, the supply of AMPL S(t) is adjusted to:

S(t+1) = S(t) \left(1 + \frac{e(t)}{rebase\_lag}\right)

From this equation, we can express rebase as a proportional controller, where:

K_p = \frac{1}{rebase\_lag}

The control law is:

u(t) = K_p e(t), \quad S(t+1) = S(t) (1 + u(t))

As can be seen from this example, rebase lag is a key parameter that determines system behavior.

Choosing an appropriate rebase lag is equivalent to tuning the proportional gain of the controller. The effect of proportional gain on system behavior has been studied extensively in control theory: a high proportional gain (a low rebase lag) reduces steady-state error and shortens the rise time, but it increases overshoot and makes the system more oscillatory.
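The role of the rebase lag as an inverse gain can be sketched as follows. This is a simplified illustration, not Ampleforth’s contract code: it assumes a fixed $1 target, and the default lag and threshold values are illustrative assumptions. The function name `rebase` is hypothetical.

```python
def rebase(supply: float, price: float, target: float = 1.0,
           rebase_lag: float = 10.0, threshold: float = 0.05) -> float:
    """Return the new token supply after one rebase step."""
    deviation = (price - target) / target  # relative price error
    if abs(deviation) < threshold:
        return supply  # inside the dead band: leave supply unchanged
    # Proportional correction with gain K_p = 1 / rebase_lag.
    return supply * (1.0 + deviation / rebase_lag)


# Same price error, different lags: a smaller lag means a larger gain.
print(round(rebase(1000.0, 1.20, rebase_lag=10.0), 2))
print(round(rebase(1000.0, 1.20, rebase_lag=1.0), 2))
print(round(rebase(1000.0, 1.02), 2))  # within the threshold, no change
```

With a lag of 10, only a tenth of the price deviation is corrected per rebase; with a lag of 1, the whole deviation is applied at once, which is the high-gain regime where overshoot and oscillation become more likely.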

Source: Matlab and Simulink control tutorial

RAI reflex index

A reflex index is an asset with lower volatility than its collateral. The system uses collateralized debt positions (CDPs), similar to MakerDAO, to issue the asset. When the redemption price of the reflex index deviates from the market price, the protocol adjusts the redemption rate (the rate of change of the redemption price) to incentivize CDP holders to generate more debt or repay outstanding debt.

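The RAI reflex index is the first protocol that explicitly references the PID controller in its protocol design. The error term for the reflex index is the difference between the market price p_m(t) and the redemption price p_r(t):

e(t) = p_m(t) - p_r(t)

The redemption rate r(t) is the control input and is set by a proportional controller:

r(t) = -K_p e(t)

The negative sign means that the redemption rate turns negative when the market price is above the redemption price. The redemption price is then updated as:

p_r(t+1) = p_r(t) (1 + r(t))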

Both examples above (Ampleforth and RAI) are feedback control systems. These protocols target a specific reference price, but they use different economic mechanisms to influence token supply. Ampleforth directly changes the total supply of the system to motivate participants to perform “supply discovery” or “market cap discovery”, thereby pushing the AMPL price toward $1. RAI changes the redemption price and encourages participants to rebalance the total outstanding debt, which reduces price volatility.
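A proportional redemption-rate controller of the kind described for RAI can be sketched in a few lines. This is a deliberately simplified, hypothetical model: the sign convention (a market price above the redemption price gives a negative rate), the per-step compounding and the gain value are assumptions for illustration and do not reproduce RAI’s deployed controller.

```python
def redemption_rate(market_price: float, redemption_price: float,
                    kp: float) -> float:
    """Proportional control: the rate opposes the price error."""
    error = market_price - redemption_price
    return -kp * error


def step_redemption_price(redemption_price: float, rate: float) -> float:
    """Apply one period of the redemption rate to the redemption price."""
    return redemption_price * (1.0 + rate)


# Market trades above the redemption price, so the rate is negative
# and the redemption price drifts down.
rate = redemption_rate(market_price=3.10, redemption_price=3.00, kp=0.1)
new_redemption_price = step_redemption_price(3.00, rate)
print(round(rate, 4), round(new_redemption_price, 4))
```

In this sketch a falling redemption price makes outstanding debt cheaper to carry, which is the incentive for CDP holders to generate more debt when the market price is too high.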

EIP-1559: Ethereum fee market change proposal

The current Ethereum fee market uses a simple first-price auction to price transaction fees. This auction mechanism is sub-optimal. It imposes considerable overhead on bidders, because each bidder has to bid based on the expected bids of the other competitors. EIP-1559 solves this problem through an adaptive fee mechanism, so that the total fee charged can exceed the social cost of the network.

The proposed transaction fee consists of a dynamically adjusted base fee and an additional tip for miners. Block usage is the main factor that determines the base fee.

When block usage is above the target usage, the base fee increases, and vice versa. This fee adjustment algorithm seeks a game-theoretic equilibrium and establishes a lower bound on fees. This proposal may be the most significant change in Ethereum 1.0, and it will greatly change the user experience and monetary policy.

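Not surprisingly, EIP-1559 can be described as a feedback control problem. With b(t) the base fee, g(t) the gas used in a block and g^* the target gas usage, its base fee adjustment algorithm is:

b(t+1) = b(t) \left(1 + \frac{1}{8} \cdot \frac{g(t) - g^*}{g^*}\right)

The error term in the algorithm is:

e(t) = g(t) - g^*

The base fee adjustment algorithm is also a proportional controller, where:

K_p = \frac{1}{8 g^*}

The control input is:

u(t) = K_p e(t)

and the base fee is updated as:

b(t+1) = b(t) (1 + u(t))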
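The base fee update can be sketched with integer arithmetic. This is a simplified illustration in the spirit of the EIP-1559 specification: the maximum change per block is one eighth of the current base fee, while the gas target and starting fee used below are hypothetical values chosen for readability.

```python
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8  # caps each update at 1/8 = 12.5%


def next_base_fee(base_fee: int, gas_used: int, gas_target: int) -> int:
    """Proportional base fee update driven by block usage."""
    if gas_used == gas_target:
        return base_fee
    gas_delta = abs(gas_used - gas_target)  # magnitude of the error
    fee_delta = (base_fee * gas_delta // gas_target
                 // BASE_FEE_MAX_CHANGE_DENOMINATOR)
    if gas_used > gas_target:
        return base_fee + max(fee_delta, 1)  # always move up by at least 1
    return base_fee - fee_delta


GAS_TARGET = 10_000_000  # hypothetical target gas usage per block
fee = 100                # hypothetical starting base fee
for gas_used in [20_000_000, 20_000_000, 10_000_000, 0]:
    fee = next_base_fee(fee, gas_used, GAS_TARGET)
    print(gas_used, fee)
```

Two full blocks in a row raise the fee by 12.5% each time, a block exactly at target leaves it unchanged, and an empty block lowers it by 12.5%.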

THORChain’s incentive pendulum mechanism

THORChain is a decentralized network that facilitates cross-chain asset swaps. To keep the network secure, the protocol requires the total bonded capital to be greater than the total pooled capital. In THORChain, a bonded-to-pooled capital ratio of 2:1 is considered the optimal system state. The incentive pendulum keeps the system balanced: it redistributes the total inflation rewards and transaction fees among participants so that the system gradually converges to the optimal state. In particular, the share of system income allocated to liquidity providers is:

\frac{b - s}{b + s}

where b and s are the total bonded capital and the total pooled capital; the rest goes to the bonders. In the optimal state (b = 2s), the incentive pendulum allocates 33% of system income to liquidity providers and 67% to bonders. If the system has only bonded capital (s = 0), the incentive pendulum allocates 100% of system income to liquidity providers.
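The income split can be checked numerically. The sketch below assumes the liquidity provider share is (b - s) / (b + s), with b the bonded capital and s the pooled capital, which is consistent with the 33% and 100% figures quoted in the text. Clamping the share at zero when pooled capital exceeds bonded capital is an added assumption, and the function name `lp_share` is hypothetical.

```python
def lp_share(bonded: float, pooled: float) -> float:
    """Fraction of system income paid to liquidity providers."""
    total = bonded + pooled
    if total <= 0:
        raise ValueError("bonded + pooled capital must be positive")
    # Assumed clamp: liquidity providers never receive a negative share.
    return max(0.0, (bonded - pooled) / total)


for bonded, pooled in [(2.0, 1.0), (1.0, 0.0), (3.0, 1.0), (1.0, 1.0)]:
    share = lp_share(bonded, pooled)
    print(f"bonded={bonded} pooled={pooled} "
          f"LP share={share:.2%} bonder share={1 - share:.2%}")
```

When the network is over-bonded (3:1) liquidity providers earn more than a third of income, which attracts pooled capital; when it is under-bonded (1:1) bonders earn everything, which attracts bonded capital. Both effects push the ratio back toward 2:1.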

THORChain’s incentive pendulum uses a fixed formula to calculate how system income is distributed. Although it does not use the PID controller formula, the incentive pendulum and the PID controller are conceptually very similar:

The mechanism attempts to minimize the error over time, that is, to make the system state converge to the optimal state;
The control signal is a function of the error, where the error is the difference between the measured bonded-to-pooled capital ratio and the optimal bonded-to-pooled capital ratio.

On-chain derivatives pricing

One of the biggest surprises of 2020 is that spot DEXs can handle trading volume of the same order of magnitude as centralized exchanges.

However, the most actively traded crypto product, the perpetual contract, has not yet been successfully decentralized.

Although there have been some attempts at decentralized futures products, such as FutureSwap and McDEX, these protocols have so far not lived up to their promise. One of the main reasons is that futures trading is much more sensitive to latency than spot trading, because oracle price updates need to be very fast to avoid front-running and back-running. In addition, because lower margin requirements let users place large bets with little collateral, liquidity tends to be added to and removed from derivatives venues at a faster rate. However, there are many new mechanisms that can replicate derivative payoffs without requiring high liquidity velocity. These methods involve automated market makers (such as Uniswap) with dynamic curves. A foundational result in this direction is a theorem by Alex Evans, which states that if a Balancer pool adjusts its weights according to a modified PID controller (as shown below), then it can replicate any unlevered payoff.

In the above equation, the Balancer pool weight w* follows a governing equation that is a function of the desired payoff g. Generating arbitrary derivative payoffs then becomes a matter of adding leverage: if someone can borrow against shares of a Balancer pool that pays g(x,t) and use the borrowed funds to create new pool shares, they can lever their exposure to a constant multiple of g. On-chain lending platforms like Aave and Compound are well suited to this kind of operation. What does this have to do with perpetual contract trading?

We can think of a perpetual contract as a function that maps the index price p(t) to a positive or negative payoff. Constant function market makers (CFMMs) such as Balancer allow p(t) to be expressed in terms of a vector of quantities, and the pool weights control the mapping from quantities to price. Therefore, we can view an alternative construction of a perpetual product (in financial terms, a replicating portfolio) as a CFMM whose shape is continually adjusted to maintain the payoff. Although weight updates can still be front-run and back-run, doing so is much harder than manipulating a price. This is because an attacker needs to manipulate the quantities held by the market maker (x in the equation above) to change the payoff g. Unlike price manipulation, which targets a single scalar, the attacker must move the amount of collateral x (a pair of spot assets locked by many LPs). As we pointed out in Appendix D of the Uniswap paper, this manipulation becomes harder as the total value locked grows (the difficulty increases linearly).

This example shows that an appropriate proportional controller, coupled with a dynamically adjusted market maker, can bring many derivative products on-chain. Research on designing such controllers is still in its infancy, but CFMM designs from teams such as Yield and Opyn reflect a growing trend: control theory is making on-chain derivatives possible.

Ethereum has limited computing and storage capacity

In the history of feedback control and reinforcement learning, algorithmic progress is usually seen as the main driver of success. However, people often overlook the fact that paradigm shifts in computing and storage also enabled these breakthroughs. Without the commercial computers of the 1950s, dynamic programming would not have been a practical way to solve optimal control problems. Without GPU clusters and huge storage capacity, DeepMind could not have efficiently trained deep reinforcement learning models on Atari games.

We know that the computing and storage capacity of Ethereum is limited. Currently, most DeFi protocols overcome these limitations by using simple feedback algorithms, which do not require a large amount of storage to track changes in historical state. Therefore, PID controllers or other constant space and time complexity algorithms (run time and space requirements will not increase with the increase of input size) are very suitable for resource-constrained computing environments.

The natural next step in applying control theory on-chain is to formulate DeFi protocol feedback mechanisms as optimal control problems, for two reasons: optimal control already has a large body of theoretical work, and it does not rely on massive computing power. Another possible approach is to introduce more complex algorithms that optimize on-chain parameters through the protocol’s governance process. Neutral third parties can process blockchain data and external data sources off-chain, run complex algorithms, and submit optimized parameters through governance votes to improve protocol efficiency.

Final thoughts

The proportional controller is the most common form of controller in industry. It takes the current error as its input and solves most problems well. To further improve existing feedback systems, protocols can consider adding the “past error” (integral term) and the “expected future error” (derivative term) as controller inputs.
A bonding curve or interest rate curve is a mechanism that incentivizes specific user behavior. Parameterizing these curves is hard because the design space is very large. For example, curves with different shapes may produce very similar results, and it is difficult to assert that one curve is strictly better than another. Curve-based methods also suffer from the curse of dimensionality: parameterizing a 3D or higher-dimensional surface is a challenging task. Protocol development teams may consider using feedback control to simplify design and parameterization. Instead of designing an entire curve that describes the relationship between a range of parameter values, developers only need to focus on the “rate of change” of the parameter values.
Given the high stakes of smart contracts and the dynamic nature of feedback systems, designing a smart contract based on feedback control is challenging. Simulation is widely used for parameter tuning in industry, and Gauntlet can help protocol designers stress test their protocols by simulating a wide range of protocol parameters and market conditions. Building a safe and efficient DeFi ecosystem has always been our top priority.

Thanks to John Morrow and Rei Chiang for their helpful editing, comments and suggestions for this article.

Source link: medium.com