This page does **not** provide an overview of the myriad applications of the Pareto principle with regard to management, manufacturing flaws, software glitches, sales portfolios, online contributions, etc. This page is intended to shed some light on the principle's sketchy association with the eponymous economist.

"The Pareto principle states that for many outcomes, roughly 80% of consequences come from 20% of causes (the "vital few"). Other names for this principle are the 80/20 rule, the law of the vital few, or the principle of factor sparsity." (Abridged from Wikipedia)

The Pareto principle is frequently quoted, and often overstretched, in economics, social sciences, psychology and popular culture. It is sometimes employed to substantiate the assertion that "**with 20% of the effort, one can achieve 80% of the maximum result**", which, of course, in real life rarely plays out verbatim.

The Pareto principle is pervasive on the internet. However, online resources teem with misleading abridgements and outright factual errors.

Myths are perpetuated, with some authors obviously copying from others without making any effort whatsoever to access the source material. It does not help that Pareto's "Cours d'Économie Politique" is unavailable in English translation.

A typical confabulation can be found in a book by venture capitalist Peter Thiel ("Zero To One", Crown 2014):

„In 1906, economist Vilfredo Pareto discovered what became the “Pareto principle,” or the 80-20 rule, when he noticed that 20% of the people owned 80% of the land in Italy—a phenomenon that he found just as natural as the fact that 20% of the peapods in his garden produced 80% of the peas.“

These unsourced assertions are happily copied (obviously unchecked) by authors on investopedia.com and other economics websites. Even a moderate amount of fact-checking would uncover that:

- Pareto published a 900 page tome on economics, which extensively deals with socioeconomic inequality, in 1896/1897.
- Pareto did not discover the Pareto principle.
- Pareto derived his distribution from data on several countries, without any emphasis on Italy.
- The apocryphal pea pod anecdote cannot be substantiated.

**Often, the Pareto principle is mashed up with the mathematical distribution discovered by Pareto, leading to erroneous representations of both.**

I have endeavored, in a first step, to disentangle the Pareto principle from the Pareto distribution, before somewhat reuniting them, with the distribution informing a more flexible application of the principle.

**Vilfredo Federico Damaso Pareto**, an Italian citizen, born as Wilfried Fritz Pareto in 1848 in Paris, died at the age of 75 years in Switzerland. He can chiefly be regarded as an economist and sociologist. Owing to his range of interests and his broad education, he also contributed to the fields of political science, philosophy and statistics. He lectured at the University of Lausanne, Switzerland.

His most famous writings are

- "**Trattato di sociologia generale**" (1916), which is regarded as his main work, and
- "**Cours d'Économie Politique**" (1896–97), which pertains to our topic.

In Book III of that work, a whole chapter is devoted to "The Income Curve".

"La répartition de la richesse peut dépendre de la nature des hommes dont se compose la société, de l’organisation de celle-ci, et aussi, en partie, du hasard (les conjonctures de Lassalle), c'est-à-dire de cet ensemble de causes inconnues, agissant tantôt dans un sens, tantôt dans un autre, auxquelles, dans notre ignorance de leur vraie nature, nous donnons le nom de hasard."

"The distribution of wealth may depend on the nature of the men of which society is composed, on its organization, and also, in part, on chance (Lassalle's conjunctures), that is to say, on this set of unknown causes, acting sometimes in one direction, sometimes in another, to which, in our ignorance of their true nature, we give the name of chance."

It is from his enquiries into the distribution of wealth, that the proverbial 80/20 relation was deduced.

On the one hand, Pareto is sometimes seen as the father of welfare economics; on the other hand, he is sometimes vilified for his cynical views on democracy and his leanings towards totalitarianism. Gottfried Eisermann published an article in 1947, titled "Pareto, Godfather of Fascism", which seems a bit strong. In 1993, an article on Pareto in the German magazine 'Zeit' was titled "The Marx of the Bourgeoisie". In 1968, Johannes Agnoli wrote:

"Pareto hatte 1922 Mussolini den Rat erteilt, um der Stabilisierung der Macht willen das Parlament in gewandelter Form weiter am Leben zu lassen: Massen, die demokratischen Gefühlen zuneigen, seien am besten durch ein Organ neutralisierbar, das ihnen die Illusion einer Beteiligung an der staatlichen Macht vermittelt."

"Pareto had advised Mussolini in 1922 to keep parliament alive in a modified form for the sake of stabilizing power: masses inclined to democratic sentiments could best be neutralized by an organ that gave them the illusion of participation in state power."

Pareto was multi-talented, multi-faceted and self-opinionated. Please feel free to delve further into these matters, and make up your own mind.

Vilfredo Pareto neither coined the term, nor did he invent the principle for which posterity holds him famous. Yet there still is ample reason to hold him famous, indeed, from the works rightfully attributable to him.

In a nutshell, management consultant **Joseph M. Juran** developed the concept in the 1940s in the context of quality control and improvement, after reading the works of Pareto, and later broadened the scope of application.

We will have a closer look at how Juran's principle is related to Pareto's own work, and also in what ways this relation might not hold true.

Famous mathematician **Benoit Mandelbrot** (1924-2010) describes Pareto's main economic finding with regard to our topic:

"One of Pareto's equations achieved special prominence, and controversy. He was fascinated by problems of power and wealth. How do people get it? How is it distributed around society? How do those who have it use it? The gulf between rich and poor has always been part of the human condition, but Pareto resolved to measure it. He gathered reams of data on wealth and income through different centuries, through different countries: the tax records of Basel, Switzerland, from 1454 and from Augsburg, Germany, in 1471, 1498 and 1512; contemporary rental income from Paris; personal income from Britain, Prussia, Saxony, Ireland, Italy, Peru. What he found – or thought he found – was striking. When he plotted the data on graph paper, with income on one axis, and number of people with that income on the other, he saw the same picture nearly everywhere in every era. Society was not a "social pyramid" with the proportion of rich to poor sloping gently from one class to the next. Instead it was more of a "social arrow" – very fat on the bottom where the mass of men live, and very thin at the top where sit the wealthy elite. Nor was this effect by chance; the data did not remotely fit a bell curve, as one would expect if wealth were distributed randomly. 'It is a social law', he wrote: something 'in the nature of man' " Benoit B. Mandelbrot & Richard L. Hudson: The (Mis)behavior of Markets: A Fractal View of Risk, Ruin and Reward. Perseus, Basic Books, 2004.

From an 1898 book review of Pareto's "Cours d'Économie Politique" by **Fred D. Merrit** in the *Journal of Political Economy*:

" 'La courbe des revenus' is the most original and suggestive chapter in the work. It presents the results of an exhaustive statistical study of the distribution of wealth in different epochs and countries. A striking similarity runs through this distribution; and from this the deduction is made that the causes which determine this distribution are to be sought in the very nature of man, not in variations of environment. [...] The author's facile use of history, statistics, and biology show his breadth of learning, and the fitness of the examples drawn from these subjects bears witness to his grasp of the subject. One cannot help admiring the skillful use of statistics, as premises and tests of the theories found in the work. [...] The volumes are a refutation of the idea that mathematical modes of thought are unprofitable in economic science."

The discussion on this page does not hinge on the mathematical equations. We will explain things verbally and visually. However, the basis of our discussion is the math behind the graphs. Thus we will not omit it.

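*The properties of the Pareto distribution are*

**PDF** (Probability Density Function):

$$f(x) = \frac{\alpha \, x_{min}^{\alpha}}{x^{\alpha + 1}} \quad \text{for } x \ge x_{min}$$

**CDF** (Cumulative Distribution Function):

$$F(x) = 1 - \left(\frac{x_{min}}{x}\right)^{\alpha} \quad \text{for } x \ge x_{min}$$

$\alpha$ denotes the **shape** parameter, being a real number > 0

$x_{min}$ denotes the **scale** parameter, being a real number > 0

Please bear in mind for further discussion that the functions return *zero* for all values of *x* smaller than $x_{min}$, which must be a positive real number.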
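As a quick plausibility check of the two functions, the following minimal sketch integrates the density numerically and compares the result with the closed-form CDF. It assumes the standard Pareto forms, density α·x_min^α / x^(α+1) and cumulative 1 − (x_min/x)^α. The helper names, the trapezoidal rule, and the value α = 1.5 are illustrative choices, not part of the original notebook.

```python
def pareto_pdf_sketch(x, alpha, x_min=1.0):
    """Density alpha * x_min**alpha / x**(alpha + 1) for x >= x_min, else 0."""
    if x < x_min:
        return 0.0
    return alpha * x_min**alpha / x**(alpha + 1)


def pareto_cdf_sketch(x, alpha, x_min=1.0):
    """Cumulative probability 1 - (x_min / x)**alpha for x >= x_min, else 0."""
    if x < x_min:
        return 0.0
    return 1.0 - (x_min / x)**alpha


def integrate_pdf(upper, alpha, x_min=1.0, steps=20_000):
    """Trapezoidal estimate of the area under the density from x_min to upper."""
    width = (upper - x_min) / steps
    heights = [pareto_pdf_sketch(x_min + i * width, alpha, x_min)
               for i in range(steps + 1)]
    return width * (sum(heights) - 0.5 * (heights[0] + heights[-1]))


# Both numbers should agree to several decimal places
print("integrated PDF up to 10:", integrate_pdf(10.0, alpha=1.5))
print("CDF at 10:              ", pareto_cdf_sketch(10.0, alpha=1.5))
```

If the two printed values agree, the CDF is indeed the running integral of the PDF, which is all the later graphs rely on.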

In [1]:

```
# Import libraries
from matplotlib import pyplot as plt
import matplotlib.ticker as mtick
import numpy as np
%matplotlib inline
```

In [2]:

```
# This value for alpha is said to exemplify the 80/20 axiom
alpha = np.log(5)/np.log(4)
print("alpha=", alpha)
```

alpha= 1.160964047443681
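One way to see why this particular value is tied to 80/20 is the Lorenz curve of the Pareto distribution. The sketch below assumes the standard textbook result that, for α > 1, the richest fraction p of the population holds the share p^(1 − 1/α) of the total. That formula and the helper name `pareto_top_share` are not taken from the notebook.

```python
import math


def pareto_top_share(top_fraction, alpha):
    """Share of the total held by the richest `top_fraction` (requires alpha > 1)."""
    return top_fraction ** (1 - 1 / alpha)


alpha_8020 = math.log(5) / math.log(4)

# With alpha = log(5)/log(4), the top 20% hold (up to rounding) 80% of the total
print(pareto_top_share(0.2, alpha_8020))

# A different shape parameter gives a different split
print(pareto_top_share(0.2, 2.0))
```

Under this assumption, 80/20 is just one point on a continuum of possible splits.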

In [3]:

```
# Set the graph limits
x_min = 1
x_max = 21
```

In [4]:

```
# Define Probability Density Function (uses the global x_min set above)
def pareto_PDF(x, alpha):
    y = (alpha*(x_min**alpha))/(x**(alpha+1))
    return y
```

In [5]:

```
plt.figure(figsize=(8, 6))
ax = plt.subplot(111)
ax.set_xlim(left=1, right=(x_max/2))
#plt.grid(visible=True, which='both', axis='both', color='darkgrey', linestyle='-', linewidth=0.25)
plt.xticks(np.arange(0, ((x_max/2)+1), step=1), fontsize = 10) # Set label locations.
plt.title("Pareto Distribution: Probability Density", color = "b")
# Plot linewise
x=np.linspace(x_min, (x_max/2) , num=1000)
y = pareto_PDF (x, alpha)
ax.plot(x, y, color='b', linewidth=2.0)
ax.set(yticklabels=[])
ax.set(xticklabels=[])
plt.show()
```

This graph nicely corresponds to Pareto's 1890s illustration *(note that Pareto plotted income on the x-axis, and number of individuals on the y-axis, which is horizontal in his graph)*:

"On parle souvent de la pyramide sociale, dont les pauvres forment la base, les riches le sommet. À vrai dire, ce n’est pas d’une pyramide qu'il s’agit, mais bien, plutôt, d’un corps ayant la forme de la pointe d'une flèche ou, si l'on préfère, de la pointe d’une toupie."

"We often talk about the social pyramid, of which the poor form the base, the rich the top. To tell the truth, it is not a pyramid at all, but rather a body in the shape of the tip of an arrow or, if one prefers, the tip of a spinning top."

Pareto seemed to believe that he had found an immutable economic 'law of nature':

"[...] la répartition de la richesse varie peu pour des contrées, des époques, des organisations différentes, il nous faudra conclure que, sans vouloir négliger les autres causes, nous devons chercher dans la nature de l’homme la cause principale qui détermine le phénomène."

"[...] the distribution of wealth varies little for different countries, epochs, and organizations, we must conclude that, without wishing to neglect the other causes, we must seek in the nature of man the principal cause which determines the phenomenon."

It is ironic from a contemporary perspective, when a steady increase in inequality has been documented for more than three decades by the OECD, that the data available to Pareto firmly suggested to him:

"Il n’est donc pas vrai que, dans les circonstances actuelles, l'inégalité des fortunes aille en augmentant, et toutes les déductions qu'on a voulu tirer de cette proposition erronée, tombent dans le néant. Mais, d’autre part, rien ne nous assure que la diminution de l'inégalité des fortunes ou des revenus doive continuer indéfiniment. On a pu, de nos jours, observer cette diminution, parce que, grâce aux découvertes qui ont été faites dans les sciences, les arts et l’industrie, la richesse a reçu un accroissement qui a été plus considérable et plus rapide que la destruction qui en était faite par la protection douanière, les vols des politiciens et le socialisme d'État."

"It is therefore not true that, in the present circumstances, the inequality of fortunes is increasing, and all the inferences that have been drawn from this erroneous proposition fall into nothingness. But, on the other hand, there is no guarantee that the decline in wealth or income inequality must continue indefinitely. This decline has been observed in the present day because, thanks to the discoveries which have been made in the sciences, the arts and industry, wealth has received an increase which has been greater and faster than the destruction which was made of it by customs protection, the theft of politicians and state socialism."

Rather than state socialism (long collapsed in most of the world), and customs protection, both feared by Pareto, neo-liberalism and globalization are much more plausible culprits for the modern-day increase in inequality, besides the ever-present nepotism and theft by ruling elites.

Before elaborating on the above "arrow shape" of the distribution, Pareto explains how he derived his equation.

Pareto analyzed a treasure trove of historical and contemporary income data from a broad variety of countries, taken from public registries and other researchers' empirical studies. As an example, table D gives a comparison of income distribution for Great Britain and Ireland respectively.

Pareto noted that, when the data are plotted on a **log-log** scale (i.e. both axes logarithmic), with income on the x-axis and probability (the normalized number of people with said income) on the y-axis, the empirical distribution is best modeled by a **straight line** (Fig. 47).

"Nous sommes tout de suite frappé du fait que les points ainsi déterminés, ont une tendance très marquée à se disposer en ligne droite. Disons immédiatement que nous allons retrouver cette tendance dans les nombreux exemples que nous aurons encore à examiner. Un autre fait, tout aussi, et même plus remarquable, c’est que les courbes de la répartition des revenus, en Angleterre et en Irlande, présentent un parallélisme à peu près complet. Ce fait est à rapprocher d’un autre, que nous allons bientôt constater : les inclinaisons des lignes mn, pq obtenues pour différents pays sont peu différentes."

"We are immediately struck by the fact that the points thus determined have a very marked tendency to arrange themselves in a straight line. Let us say immediately that we will find this trend in the many examples that we will still have to examine. Another fact, equally and even more remarkable, is that the income distribution curves in England and Ireland show an almost complete parallelism. This fact is to be compared with another, which we will soon see: the inclinations of the lines mn, pq obtained for different countries are little different."

In [24]:

```
# Source data
incomes = [150, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 10000]
citizens = [400648, 234185, 121996, 74041, 54419, 42072, 34269, 29311, 25033, 22896, 9880, 6069, 4161, 3081, 1104]
# Prepare graph
fig, axs = plt.subplots(1, 2, sharey=False, figsize=(9,4))
fig.suptitle("1893/1894 Income (Great Britain)")
# Linear plot
axs[0].plot(incomes, citizens, 'o', color='r')
axs[0].plot(incomes, citizens, '-', color='g', alpha=0.6, linewidth=3)
axs[0].set_title("linear scale", color = "b")
axs[0].grid(visible=True, which='both', axis='both', color='darkgrey', linestyle='-', linewidth=0.25)
axs[0].set_xlabel("income", color='#999999')
axs[0].set_ylabel("number of citizens", color='#999999')
# Log-log plot
axs[1].set_yscale('log')
axs[1].set_xscale('log')
axs[1].plot(incomes, citizens, 'o', color='r')
axs[1].plot(incomes, citizens, '-', color='g', alpha=0.6, linewidth=3)
axs[1].set_title("log-log scale", color = "b")
axs[1].grid(visible=True, which='both', axis='both', color='darkgrey', linestyle='-', linewidth=0.25)
axs[1].set_xlabel("income", color='#999999')
axs[1].set_ylabel("number of citizens", color='#999999')
plt.show()
```
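The visual impression of a straight line in the log-log panel can also be checked numerically. The following minimal sketch fits an ordinary least-squares line to the logarithms of the same income data and reports the slope and the coefficient of determination. Least squares is used here only as an illustration. It is not claimed to be the interpolation method Pareto himself applied.

```python
import math

# Same source data as in the plot above
incomes = [150, 200, 300, 400, 500, 600, 700, 800, 900, 1000,
           2000, 3000, 4000, 5000, 10000]
citizens = [400648, 234185, 121996, 74041, 54419, 42072, 34269, 29311,
            25033, 22896, 9880, 6069, 4161, 3081, 1104]

log_x = [math.log10(v) for v in incomes]
log_y = [math.log10(v) for v in citizens]

# Ordinary least squares on the log-transformed values
n_points = len(log_x)
mean_x = sum(log_x) / n_points
mean_y = sum(log_y) / n_points
s_xy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
s_xx = sum((lx - mean_x) ** 2 for lx in log_x)
s_yy = sum((ly - mean_y) ** 2 for ly in log_y)

slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
r_squared = s_xy ** 2 / (s_xx * s_yy)

print("slope of the log-log line:", round(slope, 3))
print("intercept:                ", round(intercept, 3))
print("R squared:                ", round(r_squared, 4))
```

A value of R squared close to 1 supports the straight-line reading, and the negative slope plays the role of the exponent in the power law.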

**Chapeau!** As far as empirical data go, this is a nice correspondence to theory, indeed.

"C'est-à-dire que la courbe réelle est interpolée par une droite dont l'équation est: $\log N = \log A - \alpha \log x$"

"That is, the real curve is interpolated by a line whose equation is: $\log N = \log A - \alpha \log x$"

If we plot our general formulaic PDF on a log-log scale, we get:

In [6]:

```
plt.figure(figsize=(8, 6))
ax = plt.subplot(111)
ax.set_xlim(left=1, right=(x_max/2))
ax.set_yscale('log')
ax.set_xscale('log')
plt.grid(visible=True, which='both', axis='both', color='darkgrey', linestyle='-', linewidth=0.25)
plt.title("Pareto Distribution: Log-Log", color = "b")
# Plot linewise
x=np.linspace(x_min, (x_max/2) , num=1000)
y = pareto_PDF (x, alpha)
ax.plot(x, y, color='b', linewidth=2.0)
ax.set(yticklabels=[])
ax.set(xticklabels=[])
plt.show()
```
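Why the log-log plot of the density is a straight line can be read off directly from the PDF. As a short worked step, assuming the density $f(x) = \alpha \, x_{min}^{\alpha} / x^{\alpha + 1}$ implemented in `pareto_PDF` above, taking logarithms gives

$$\log f(x) = \log\left(\alpha \, x_{min}^{\alpha}\right) - (\alpha + 1)\,\log x$$

so on log-log axes the curve is a line with slope $-(\alpha + 1)$ and an intercept fixed by $\alpha$ and $x_{min}$. The same step applied to the complement of the CDF, $1 - F(x) = (x_{min}/x)^{\alpha}$, yields a line with slope $-\alpha$.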

Pareto goes on to examine different socioeconomic subsets:

"La formule générale qui donne les répartions : 1° du revenu total, 2° de la fortune, 3° du produit du travail, est

"The general formula which gives the breakdowns: 1° of total income, 2° of wealth, 3° of work product, is

La constante α est négative, quand il s’agit du produit du travail ; elle est positive quand il s'agit de la répartition de la fortune ; elle est nulle, ou généralement assez petite, quand il s'agit du revenu total."

The constant α is negative when it comes to the product of labor; it is positive when it comes to the distribution of wealth; it is zero, or usually quite small, when it comes to total income."

Now, to what extent does the proverbial **80 / 20** hold up against the Pareto distribution?

In [8]:

```
# Define Cumulative Distribution Function (uses the global x_min set above)
def pareto_CDF(x, alpha):
    y = 1 - ((x_min/x)**(alpha))
    return y
```

In [9]:

```
plt.figure(figsize=(10, 6))
ax = plt.subplot(111)
ax.set_xlim(left=1, right=x_max)
plt.grid(visible=True, which='both', axis='both', color='darkgrey', linestyle='-', linewidth=0.25)
plt.xticks(np.arange(0, (x_max+1), step=1), fontsize = 10) # Set label locations.
plt.title("Pareto Distribution: CDF", color = "b")
plt.xlabel("arbitrary units")
plt.axhline(y=0.8, linestyle=":", color='g')
plt.axvline(x=4, linestyle=":", color='r')
plt.axvline(x=20, linestyle=":", color='r')
ax.yaxis.set_major_formatter(mtick.PercentFormatter(1.0, None,'%'))
# Plot
x=np.linspace(x_min, x_max , num=1000)
y = pareto_CDF (x, alpha)
ax.plot(x, y, color='b', linewidth=2.0)
plt.show()
```

If we approach the famous **80/20** from the cumulative distribution, we see that, strictly speaking, it does not make any sense to say that "80% of the result is achieved with 20% of the effort":

The Pareto distribution is a so-called long-tailed distribution: its cumulative distribution function approaches the maximum asymptotically, but never reaches it for any finite value.

Since maximum effort is infinite, "20% of infinity" is not applicable in the real world.

At four arbitrary units of effort, we achieve 80%. Thus, our four units should correspond to the proverbial 20% effort. An effort of 5 × 4 = 20 units should then correspond to the proverbial 100% effort, and hence to the maximum result.

**We know that no 100% result is achievable with finite effort, but we will see how close we come:**

In [10]:

```
print("4: ", pareto_CDF (4, alpha))
print("20: ", pareto_CDF (20, alpha))
```

4: 0.8

20: 0.9691289820417199
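The same question can be turned around: how much effort is needed to reach a given fraction of the maximum? A minimal sketch, assuming the CDF 1 − (x_min/x)^α used above, inverts it to x = x_min · (1 − p)^(−1/α). The function name `pareto_quantile` and the chosen target fractions are illustrative.

```python
import math


def pareto_quantile(p, alpha, x_min=1.0):
    """Smallest x at which the CDF reaches the fraction p (0 <= p < 1)."""
    return x_min * (1.0 - p) ** (-1.0 / alpha)


alpha_8020 = math.log(5) / math.log(4)

# Effort (in arbitrary units) required for increasingly ambitious targets
for target in (0.80, 0.90, 0.99, 0.999):
    effort = pareto_quantile(target, alpha_8020)
    print(f"{target:.1%} reached at x = {effort:.1f}")
```

The required effort grows without bound as the target approaches 100%, which is the asymptote described above expressed in numbers.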

The (in)famous **80/20** figures were derived from Pareto's modeling of some of his empirical socioeconomic data, and are conventionally rendered by **α = log(5) / log(4)** [~1.16]. However, the Pareto distribution can take many forms, depending on the numeric value of its shape parameter **α**:

In [11]:

```
plt.figure(figsize=(10, 6))
ax = plt.subplot(111)
ax.set_xlim(left=1, right=x_max)
plt.grid(visible=True, which='both', axis='both', color='darkgrey', linestyle='-', linewidth=0.25)
plt.xticks(np.arange(0, (x_max+1), step=1), fontsize = 10) # Set label locations.
plt.title("Pareto Distribution: Shape Parameters", color = "b")
plt.xlabel("arbitrary units")
plt.axhline(y=0.8, linestyle=":", color='g')
plt.axvline(x=4, linestyle=":", color='r')
plt.axvline(x=20, linestyle=":", color='r')
ax.yaxis.set_major_formatter(mtick.PercentFormatter(1.0, None,'%'))
alphas_list = np.array([ 0.5, 0.75, 1.16, 2])
# Plot one CDF curve per shape parameter
x=np.linspace(x_min, x_max , num=1000)
for a in alphas_list:
    y = pareto_CDF (x, a)
    ax.plot(x, y, label=('alpha='+str(a)))
plt.legend()
plt.show()
```

Depending on **α**, the cumulative distribution curve can take very different forms, which will affect any numeric abridgement (i.e., instead of 80/20, a 90/10, a 70/5, or any other relationship might ensue).
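To put numbers on this, the following sketch evaluates the CDF at the two reference points marked in the graph (x = 4 and x = 20) for the same four shape parameters. It assumes the CDF 1 − (x_min/x)^α with x_min = 1. The helper name `cdf_at` is illustrative.

```python
def cdf_at(x, alpha, x_min=1.0):
    """Pareto CDF 1 - (x_min / x)**alpha, valid for x >= x_min."""
    return 1.0 - (x_min / x) ** alpha


shape_parameters = [0.5, 0.75, 1.16, 2.0]

# Fraction of the maximum reached at 4 and at 20 units, per shape parameter
for a in shape_parameters:
    print(f"alpha={a:<4}  CDF(4)={cdf_at(4, a):6.1%}  CDF(20)={cdf_at(20, a):6.1%}")
```

Only for α near 1.16 does the value at x = 4 land near 80%. Smaller values of α fall well short of it, larger ones overshoot.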

**As a heuristic, the Pareto principle is rather useful. As such, it should be taken with a grain of salt.**

Although originally inspired by Vilfredo Pareto's work, the principle was proposed and coined by Joseph M. Juran in the context of quality control. Unfortunately, the principle has often been subject to indiscriminate application and overgeneralization.

Pareto's own work with regard to the power-law distribution discovered by him was strictly related to the mathematical modelling of socioeconomic inequality. He never proposed a general application, nor did he suggest applications in cause/effect or input/output relations, which much later were proposed by Juran.

The synonym "80/20 principle" should best be eradicated, as there is no finite 100% to which these numbers could be related. Furthermore, even glossing over the asymptote, the Pareto distribution, as well as real-life applications, can take many different numeric forms. By using the misnomer "80/20 principle", one might be enticed to overfit reality to one's conceptual expectations.

Examples from the business consulting company *Attain Partners | Juran*'s website:

- The top 15 percent of our customers account for 68 percent of our total revenues
- Our top five products or services account for 75 percent of our total sales
- A few employees account for the majority of absences
- In a typical meeting, a few people tend to make the majority of comments, while most people are relatively quiet.

Thus, Juran's own term *vital few / useful many* better captures the scope of application. Even Juran was adamant that, although the principle is helpful for prioritizing efforts, the *useful many* should not be completely discarded.
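Statements like the ones in the list above are easy to derive from raw data. As a minimal sketch, the function below ranks a list of contributions and returns the share of the total that comes from the largest fraction of items. The function name `empirical_top_share` and the revenue figures are hypothetical and serve only as an illustration.

```python
def empirical_top_share(values, top_fraction):
    """Fraction of the total contributed by the largest `top_fraction` of items."""
    ranked = sorted(values, reverse=True)
    # Always count at least one item, even for very small fractions
    n_top = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:n_top]) / sum(ranked)


# Hypothetical revenue per customer (illustrative numbers, not real data)
revenues = [520, 310, 90, 40, 25, 20, 15, 10, 8, 5]

for fraction in (0.1, 0.2, 0.5):
    share = empirical_top_share(revenues, fraction)
    print(f"top {fraction:.0%} of customers -> {share:.1%} of revenue")
```

Reporting the split that the data actually show, rather than presupposing 80/20, is in keeping with the *vital few / useful many* wording.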

Applying the **Pareto principle** at face value entails the danger of oversimplifying complex problems, tasks and relationships. As a flexible starting point, as a reminder on the necessity for **prioritizing**, or simply as a wake-up call against wasting limited resources on unimportant details, it is quite helpful. Bearing in mind the asymptotic nature of the cumulative distribution function, it might even serve as a reminder that **perfection** can never be attained, and that striving for perfection will incur an ever increasing **effort**, that verges on the infinite.

Although Vilfredo Pareto did not dream up the principle named after him, he has left behind an impressive body of work of unusually broad scope, which is a controversial, thought-provoking and rewarding read, even 125 years later.

Pareto argued that, if the distribution of wealth were due to **chance**, as some postulated on theoretical or political grounds, it should follow a **normal distribution** ('la courbe des erreurs'), which it obviously did not.

In order to emphasise the striking properties of the socioeconomic data studied by Pareto, we will plot both distributions side by side for clarity.

In [12]:

```
# Import normal distribution method
from scipy.stats import norm
mean, var, skew, kurt = norm.stats(moments='mvsk')
fig, axs = plt.subplots(2, 2, sharey=True, figsize=(7,6))
fig.suptitle("Pareto Distribution vs. Normal Distribution", color = "r")
# Plot upper left: PDF Pareto
axs[0,0].set_xlim(left=0, right=(x_max/2))
axs[0,0].grid(visible=True, which='both', axis='both', color='darkgrey', linestyle='-', linewidth=0.25)
plt.xticks(np.arange(0, ((x_max/2)+1), step=1), fontsize = 10) # Set label locations.
axs[0,0].set_title("PDF: Pareto", color = "b")
# Function
x=np.linspace(x_min, (x_max/2) , num=1000)
y = pareto_PDF (x, alpha)
axs[0,0].plot(x, y, 'b-', lw=2, alpha=1, label='pareto pdf')
# Plot upper right: PDF Normal
axs[0,1].set_xlim(left=-3, right=3)
axs[0,1].grid(visible=True, which='both', axis='both', color='darkgrey', linestyle='-', linewidth=0.25)
plt.xticks(np.arange(-3*var, 3*var, step=var), fontsize = 10) # Set label locations.
axs[0,1].set_title("PDF: Normal", color = "g")
# Function
x = np.linspace(norm.ppf(0.01), norm.ppf(0.99), 100)
axs[0,1].plot(x, norm.pdf(x), 'g-', lw=2, alpha=1, label='norm pdf')
#Plot lower left: CDF Pareto
axs[1,0].set_xlim(left=0, right=x_max/2)
axs[1,0].grid(visible=True, which='both', axis='both', color='darkgrey', linestyle='-', linewidth=0.25)
plt.xticks(np.arange(0, ((x_max/2)+1), step=1), fontsize = 10) # Set label locations.
axs[1,0].set_title("CDF: Pareto", color = "b")
# Function
x=np.linspace(x_min, (x_max/2) , num=1000)
y = pareto_CDF (x, alpha)
axs[1,0].plot(x, y, 'b-', lw=2, alpha=1, label='pareto cdf')
#Plot lower right: CDF Normal
axs[1,1].set_xlim(left=-3, right=3)
axs[1,1].set_ylim(bottom=0, top=1.05)
axs[1,1].grid(visible=True, which='both', axis='both', color='darkgrey', linestyle='-', linewidth=0.25)
plt.xticks(np.arange(-3*var, 3*var, step=var), fontsize = 10) # Set label locations.
axs[1,1].set_title("CDF: Normal", color = "g")
# Function
x = np.linspace(norm.ppf(0.01), norm.ppf(0.99), 1000)
axs[1,1].plot(x, norm.cdf(x), 'g-', lw=2, alpha=1, label='norm cdf')
for ax in axs.flat:
    ax.set(xlabel=None, ylabel=None)
    ax.set(yticklabels=[])
    ax.set(xticklabels=[])
plt.show()
```
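The difference between the two shapes is most drastic in the tails. The sketch below compares the probability of exceeding a value under the Pareto model (with x_min = 1 and α = log(5)/log(4)) with the corresponding probability for a standard normal variable. The two distributions live on different scales, so this is only a rough illustration of how fast each tail decays. The helper names are illustrative.

```python
import math


def pareto_tail(x, alpha, x_min=1.0):
    """Probability of exceeding x under the Pareto model (x >= x_min)."""
    return (x_min / x) ** alpha


def normal_tail(z):
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


alpha_8020 = math.log(5) / math.log(4)

for k in (2, 5, 10):
    print(f"x={k:>2}: Pareto tail {pareto_tail(k, alpha_8020):.2e}, "
          f"standard normal tail {normal_tail(k):.2e}")
```

The Pareto tail shrinks only polynomially, while the normal tail collapses towards zero within a few units. This is why very large incomes are incompatible with a bell curve.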

This notebook is published under the Creative Commons CC BY-NC-SA 4.0 license by mathias.elsner.

You may share and adapt this material for non-commercial purposes under the same license, giving attribution.