# Vitalik Buterin: The application of Gini coefficient to cryptocurrency has limitations, here are alternatives


The harm of concentration depends not only on the size of the participants, but also heavily on the relationship between the participants and their ability to collude with each other.

Original title: “Vitalik Buterin: Opposing the Excessive Use of Gini Coefficient in Cryptocurrency”
Written by: Vitalik Buterin, co-founder of Ethereum
Compiler: Kyle
Source: Babbitt

The Gini coefficient (also known as the Gini index) is by far the most popular and well-known measure of inequality, and is usually used to measure income or wealth inequality in a country, region, or other community. It is popular because it is easy to understand and its mathematical definition can be easily visualized on a graph.

However, as one would expect from any scheme that tries to reduce inequality to a single number, the Gini coefficient has its limitations. This is true even in its original context of measuring income and wealth inequality across countries, and it becomes even more true when the Gini coefficient is transplanted into other contexts (particularly cryptocurrencies). In this article, I will discuss some limitations of the Gini coefficient and propose some alternatives.

### What is the Gini coefficient?

The Gini coefficient is a measure of inequality introduced by Corrado Gini in 1912. It is usually used to measure the inequality of national income and wealth, but it is also increasingly used in other situations.

There are two equivalent definitions of the Gini coefficient:

Area-above-the-curve definition: draw the graph of a function where f(p) is the share of total income earned by the lowest-earning fraction p of the population (for example, f(0.1) is the share of total income earned by the lowest-earning 10%). The Gini coefficient is the area between that curve and the y=x line, as a fraction of the whole triangle under the y=x line.

Average-difference definition: the Gini coefficient is half the average income difference between all possible pairs of individuals, divided by the average income.

For example, with four incomes [1, 2, 4, 8], the 16 possible differences are [0, 1, 3, 7, 1, 0, 2, 6, 3, 2, 0, 4, 7, 6, 4, 0]. The average difference is therefore 2.875 and the average income is 3.75, so the Gini coefficient = 2.875/(2*3.75)≈0.3833.

It turns out that the two are mathematically equivalent (proving this is an exercise for the reader)!
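The equivalence can at least be checked numerically. The following is a minimal sketch, not a proof: the function names are made up for illustration, and the Lorenz curve is assumed to be piecewise linear between population points, so its area can be summed with trapezoids.

```python
from itertools import product


def gini_mean_difference(incomes):
    """Half the mean absolute difference over all ordered pairs, divided by the mean."""
    n = len(incomes)
    mean_income = sum(incomes) / n
    mean_diff = sum(abs(a - b) for a, b in product(incomes, repeat=2)) / (n * n)
    return mean_diff / (2 * mean_income)


def gini_lorenz_area(incomes):
    """Area between y = x and the Lorenz curve, as a fraction of the triangle (area 1/2)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    area_under_curve = 0.0
    cumulative_share = 0.0
    for x in xs:
        previous_share = cumulative_share
        cumulative_share += x / total
        # Trapezoid of width 1/n between two consecutive points of the curve.
        area_under_curve += (previous_share + cumulative_share) / 2 / n
    return (0.5 - area_under_curve) / 0.5


incomes = [1, 2, 4, 8]
print(gini_mean_difference(incomes))  # approximately 0.3833
print(gini_lorenz_area(incomes))      # approximately 0.3833
```

The pairwise version is quadratic in the population size, so it is only practical for small examples like this one.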

### What is the problem with the Gini coefficient?

The Gini coefficient is attractive because it is a fairly simple and easy-to-understand statistic. It may not look simple, but believe me, almost everything in statistics that deals with populations of arbitrary size is this bad, and often worse. Here, look at the formula for something as basic as the standard deviation:
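A standard textbook way to write it, for a population of n values (the symbols x_i, μ and σ are notation assumed here):

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu\right)^2}, \qquad \mu = \frac{1}{n}\sum_{i=1}^{n} x_i$$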

The following is the Gini coefficient:
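One common closed form, assuming the incomes are sorted in ascending order so that x_1 ≤ x_2 ≤ … ≤ x_n (again, the notation is an assumption, and other equivalent forms exist):

$$G = \frac{2\sum_{i=1}^{n} i\,x_i}{n\sum_{i=1}^{n} x_i} - \frac{n+1}{n}$$

For the incomes [1, 2, 4, 8] this gives 2·49/(4·15) − 5/4 ≈ 0.3833, matching the value computed from the average-difference definition.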

So, what’s the problem? Well, there are a lot of problems, and people have written a lot of articles about various issues with the Gini coefficient. In this article, I will focus on one specific problem that I think is under-discussed for the Gini coefficient in general, but that is particularly relevant to analyzing inequality in Internet communities (such as blockchains). The Gini coefficient combines into a single inequality index two problems that actually look quite different: suffering due to lack of resources, and concentration of power.

In order to understand the difference between these two issues more clearly, let’s take a look at two dystopias:

• Dystopia A: Half of the population shares all resources equally, and the others have nothing
• Dystopia B: One person owns half of the resources, and the others share the remaining half equally

Here are the Lorenz curves of the two dystopias (the fancy diagram we saw above):

Clearly, neither dystopia is a good place to live, but they are bad in very different ways. Dystopia A gives each inhabitant a coin flip between unimaginably horrific mass starvation (if they end up in the left half of the distribution) and egalitarian harmony (if they end up in the right half). If you are Thanos, you might really like it! If not, it is worth avoiding as strongly as possible. Dystopia B, on the other hand, is similar to Brave New World: everyone leads a decent life (at least at the moment the snapshot of everyone’s resources is taken), but at the high price of an extremely undemocratic power structure, where the best you can hope for is a good overlord. If you are Curtis Yarvin, you might really like it! If you are not, it is also well worth avoiding.
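To make the conflation concrete, the two dystopias can be scored numerically. This is an illustrative sketch under assumed numbers (a population of 1,000 and arbitrary resource units); the `gini` helper uses the standard closed form over sorted values and is not taken from the original article.

```python
def gini(values):
    """Gini coefficient via the sorted closed form.

    Assumes non-negative values with a positive sum.
    """
    xs = sorted(values)
    n = len(xs)
    weighted_sum = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted_sum / (n * sum(xs)) - (n + 1) / n


n = 1000  # assumed population size

# Dystopia A: half the population shares everything equally, the rest have nothing.
dystopia_a = [0.0] * (n // 2) + [2.0] * (n // 2)

# Dystopia B: one person owns half of all resources, the others share the rest equally.
dystopia_b = [1.0] * (n - 1) + [float(n - 1)]

print(gini(dystopia_a))  # approximately 0.5
print(gini(dystopia_b))  # approximately 0.499
```

Under these assumptions the two societies receive almost the same Gini coefficient, even though one is defined by mass deprivation and the other by a single dominant holder.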

These two problems are different enough that they deserve to be analyzed and measured separately. And the difference is not just theoretical. The chart below shows the share of total income earned by the bottom 20% of the population (a decent proxy for avoiding dystopia A) against the share of total income earned by the top 1% (a decent proxy for being close to dystopia B):

Sources: https://data.worldbank.org/indicator/SI.DST.FRST.20 (combined 2015 and 2016 data) and http://hdr.undp.org/en/indicators/186106

The two are clearly correlated (coefficient -0.62), but far from perfectly correlated (statisticians apparently consider 0.7 to be the lower threshold for “highly correlated”, and we are even below that). There is an interesting second dimension in the chart that could be analyzed: what is the difference between a country where the top 1% earn 20% of total income and the bottom 20% earn 3%, and a country where the top 1% earn 20% of total income and the bottom 20% earn 7%? However, this kind of exploration is best left to other enterprising data and culture explorers who are more experienced than me.

### Why is the Gini coefficient a problem in non-geographical communities (such as Internet/crypto communities)?

The concentration of wealth in the blockchain space in particular is an important problem, and one worth measuring and understanding. It matters for the blockchain space as a whole, because many people (and US Senate hearings) are trying to figure out to what extent cryptocurrency is truly anti-elitist, and to what extent it simply replaces old elites with new ones. It also matters when comparing different cryptocurrencies with each other.

The share of tokens explicitly allocated to specific insiders in a cryptocurrency’s initial supply is one type of inequality. Please note that the Ethereum data is slightly wrong: the insider and foundation shares should be 12.3% and 4.2% instead of 15% and 5%.

Given the level of attention to these issues, it is not surprising that many people try to calculate the Gini index of cryptocurrencies:

• Observed Gini Index of Staking EOS Tokens (2018)
• The Gini Coefficient of Cryptocurrencies (2018)
• Measuring the decentralization of Bitcoin and Ethereum using multiple metrics and granularities (2021, includes the Gini coefficient and 2 other metrics)
• Nouriel Roubini (“Dr. Doom”) compares Bitcoin’s Gini coefficient with North Korea’s (2018)
• On-chain insights into the cryptocurrency market (2021, uses Gini to measure concentration)

Even earlier, there was this sensational 2014 article:

In addition to the common methodological mistakes that such analyses frequently make (confusing income and wealth inequality, confusing users and accounts, or both), there is a deep and subtle problem in using the Gini coefficient for this kind of comparison. The problem lies in the key difference between a typical geographic community (e.g. city, country) and a typical Internet community (e.g. blockchain):

A typical resident of a geographic community spends most of their time and resources in that community, so inequality measured within a geographic community reflects inequality in the total resources available to people. In an Internet community, however, measured inequality can come from two sources: (i) inequality in the total resources available to different participants, and (ii) inequality in the level of interest in participating in the community.

The average person with \$15 in fiat currency is poor and is missing out on the ability to lead a good life. The average person with \$15 in cryptocurrency is a hobbyist who once opened a wallet for fun. Inequality in the level of interest is a healthy thing; every community has its dabblers and its full-time hardcore fans. Therefore, if a cryptocurrency has a very high Gini coefficient, but it turns out that much of this inequality comes from inequality in the level of interest, then the reality the number points to is far less scary than the headlines imply.

Cryptocurrencies, even highly plutocratic ones, will not turn any part of the world into anything close to dystopia A. But badly distributed cryptocurrencies may well look like dystopia B, a problem that is compounded if coin voting governance is used to make protocol decisions. Therefore, to detect the problems that cryptocurrency communities worry about most, we need a metric that more specifically captures proximity to dystopia B.
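The effect of interest-level inequality can be sketched with a toy example. All numbers below are hypothetical (100 people with identical total resources, of whom 90 are dabblers holding \$15 of a token and 10 are enthusiasts holding \$500), and the `gini` helper is the standard closed form over sorted values.

```python
def gini(values):
    """Gini coefficient via the sorted closed form.

    Assumes non-negative values with a positive sum.
    """
    xs = sorted(values)
    n = len(xs)
    weighted_sum = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted_sum / (n * sum(xs)) - (n + 1) / n


# Hypothetical community: everyone has the same total resources...
total_wealth = [1000] * 100
# ...but 90 dabblers hold $15 of the token and 10 enthusiasts hold $500.
token_holdings = [15] * 90 + [500] * 10

print(gini(total_wealth))    # 0: no inequality in overall resources
print(gini(token_holdings))  # approximately 0.69, driven purely by interest levels
```

In this toy community nobody is deprived of anything, yet the token’s Gini coefficient is higher than the income Gini coefficient usually reported for any country.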

### Another option: measure the Dystopia A problem and the Dystopia B problem separately

Another way to measure inequality is to directly estimate the suffering caused by the uneven distribution of resources (i.e., the “Dystopia A” problem). First, start with some utility function representing the value of having a certain amount of money. log(x) is popular because it captures the intuitively appealing approximation that doubling one’s income is about equally useful at any level: going from \$10,000 to \$20,000 adds the same utility as going from \$5,000 to \$10,000 or from \$40,000 to \$80,000. The score is then a matter of measuring how much utility is lost compared to everyone simply receiving the average income:

$$\log\left(\frac{\sum_{i=1}^{n} x_i}{n}\right) - \frac{\sum_{i=1}^{n} \log(x_i)}{n}$$

The first term (the log of the average) is the utility everyone would have if money were perfectly redistributed, so that everyone received the average income. The second term (the average of the logs) is the average utility in the economy as it is today. If you narrowly view resources as things used for personal consumption, this difference represents the utility lost to inequality. There are other ways to define this formula, but they end up being close to equivalent (for example, Anthony Atkinson’s 1969 paper proposed an “equally distributed equivalent level of income” metric, which in the logarithmic-utility case is just a monotonic function of the above, and the Theil L index is mathematically equivalent to the above formula).
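As a minimal sketch of this score, assuming natural logarithms and strictly positive incomes (the function name `log_utility_loss` is made up for illustration):

```python
import math


def log_utility_loss(incomes):
    """Log of the average income minus the average of the log incomes.

    Zero under perfect equality, larger as incomes become more unequal.
    Requires strictly positive incomes, since log(0) is undefined.
    """
    if any(x <= 0 for x in incomes):
        raise ValueError("all incomes must be strictly positive")
    n = len(incomes)
    log_of_average = math.log(sum(incomes) / n)
    average_of_logs = sum(math.log(x) for x in incomes) / n
    return log_of_average - average_of_logs


print(log_utility_loss([1, 2, 4, 8]))  # approximately 0.282
print(log_utility_loss([5, 5, 5, 5]))  # 0.0
```

Because the measure depends only on income ratios, scaling every income by the same factor leaves it unchanged. It also diverges as any income approaches zero, which is why it reacts far more strongly to situations resembling Dystopia A than to concentration at the top.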

To measure concentration (the “Dystopia B” problem), the Herfindahl-Hirschman index is a good starting point, and it is already used to measure economic concentration in industries:

$$\frac{\sum_{i=1}^{n} x_i^2}{\left(\sum_{i=1}^{n} x_i\right)^2}$$

Or for visual learners:

Herfindahl-Hirschman index: green area divided by total area

There are alternatives to this; the Theil T index has some similar properties, though also some differences. A simpler and dumber alternative is the Nakamoto coefficient: the minimum number of participants needed to add up to more than 50% of the total. Note that all three concentration indices focus heavily on what happens near the top (and deliberately so): a large number of participants with small amounts of resources contribute little or nothing to the index, while two top participants merging can change the index substantially.

For cryptocurrency communities, where concentration of resources is one of the biggest risks to the system but where someone holding only 0.00013 coins is no evidence that they are actually starving, adopting indices like these is the obvious approach. But even for countries, it is probably worth discussing and measuring concentration of power and suffering from lack of resources more separately.

That said, at some point we have to move beyond even these indices. The harms of concentration do not depend only on the size of the participants; they also depend heavily on the relationships between the participants and their ability to collude with each other. Similarly, resource allocation is network-dependent: a lack of formal resources may not be so harmful if the person lacking them has an informal network to tap into. But dealing with these issues is a much harder challenge, so we do also need the simpler tools while we still have less data to work with.
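Two of these indices are easy to sketch in code. The example below is illustrative only: the holdings are made-up numbers, and the Herfindahl-Hirschman index is computed in its usual form as the sum of squared shares of the total.

```python
def herfindahl_hirschman(holdings):
    """Sum of squared shares of the total; 1/n for n equal holders, 1 for a single holder."""
    total = sum(holdings)
    return sum((h / total) ** 2 for h in holdings)


def nakamoto_coefficient(holdings):
    """Minimum number of top holders whose combined holdings exceed 50% of the total."""
    total = sum(holdings)
    running = 0
    for count, h in enumerate(sorted(holdings, reverse=True), start=1):
        running += h
        if running * 2 > total:
            return count
    raise ValueError("holdings must have a positive total")


base = [50, 30, 10, 10]            # hypothetical holdings
with_dust = base + [0.001] * 1000  # add 1,000 tiny holders
merged = [80, 10, 10]              # the two largest holders merge

print(herfindahl_hirschman(base), nakamoto_coefficient(base))            # ~0.36, 2
print(herfindahl_hirschman(with_dust), nakamoto_coefficient(with_dust))  # ~0.353, 2
print(herfindahl_hirschman(merged), nakamoto_coefficient(merged))        # ~0.66, 1
```

Adding a thousand dust holders barely moves either number, while merging the two largest holders nearly doubles the Herfindahl-Hirschman index and drops the Nakamoto coefficient to 1, which is the top-heavy behavior described above.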

Special thanks to Barnabe Monnot and Tinazhen for their feedback and review.