When I was looking at data on the wealth of UK universities recently, I plotted the value of institutions' net assets on the y-axis using a log scale. On a log scale, equal distances correspond to a constant multiplicative factor rather than, as is more usually the case, a constant additive amount. For example, the distance on the y-axis from 500000 to 1000000 is the same as the distance from 1000000 to 2000000. A log scale is commonly used when a significant number of observations lie more than three standard deviations above the mean. Quite a few variables have this type of distribution, with lots of low values and only a few high values, including household wealth, city populations and the number of citations for academic papers.
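As a quick check of the log-scale property described above, here is a minimal sketch in Python. The helper name `log_distance` is purely illustrative, and base 10 is an assumption; any base gives the same conclusion.

```python
import math

def log_distance(a: float, b: float) -> float:
    """Distance between a and b as measured along a base-10 log axis."""
    return math.log10(b) - math.log10(a)

# Both steps are a doubling, so they cover the same distance on a log axis.
d1 = log_distance(500_000, 1_000_000)
d2 = log_distance(1_000_000, 2_000_000)
print(d1, d2)

# On a linear axis the same two steps differ by a factor of two.
print(1_000_000 - 500_000, 2_000_000 - 1_000_000)
```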
Distributions that have a long tail are typically the result of processes that are multiplicative rather than additive. Phenomena that result from multiplicative processes can be modelled using a power law distribution, which has the form:
$$P[X > x] \sim c\,x^{-\alpha}$$
where c and α are constants. In the social sciences, variables that have a power law distribution are often produced by what are termed processes of cumulative advantage. These are processes in which patterns of cumulative causation lead to small initial differences between people or between institutions becoming magnified over time. Multiplicative phenomena may also be described, however, using the lognormal distribution. The lognormal distribution is the natural way to model phenomena, like investment funds, that grow by a small random multiplicative factor each period: the logarithm of such a quantity is a sum of many small independent terms and is therefore approximately normal.
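To illustrate how repeated multiplicative growth can produce a lognormal shape, here is a small simulation sketch. Everything in it is an assumption chosen for illustration: the function name `simulate_fund`, the growth factor drawn uniformly between 0.95 and 1.15, the 100 periods and the 5000 funds. If the result is roughly lognormal, the raw values should be right-skewed (mean above median) while their logarithms should be roughly symmetric (mean close to median).

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

def simulate_fund(periods: int) -> float:
    """Grow a fund of 1.0 by a random factor between 0.95 and 1.15 each period."""
    value = 1.0
    for _ in range(periods):
        value *= rng.uniform(0.95, 1.15)
    return value

values = [simulate_fund(100) for _ in range(5000)]
logs = [math.log(v) for v in values]

# Skewed on the raw scale: the mean is pulled above the median by a few large funds.
print("mean / median of values:", statistics.mean(values) / statistics.median(values))
# Roughly symmetric on the log scale: mean and median nearly coincide.
print("mean - median of logs:  ", statistics.mean(logs) - statistics.median(logs))
```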
The difference between the types of process that lead to power law and lognormal distributions can be illustrated with an example. If we put a given sum of money in the bank and reinvest the interest, the amount by which the sum grows depends on the initial investment. If the interest rate is 10 percent then a sum of £1000 grows by £100 in the first year but a £10000 sum grows by £1000. The difference in the value of the two investments grows exponentially over time, from £9000 at the start to about £14500 after 5 years and about £23300 after 10 years. The ratio of the value of the two investments remains constant over time (at 10:1), however. If instead the interest rate depends on the size of the sum (for example, 5 percent interest on £1000 but 10 percent interest on £10000) then both the difference and the ratio of the value of the two investments grow over time.
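The arithmetic in this example can be recomputed with a short sketch. It assumes annual compounding, and the helper names `grow` and `compare` are illustrative only. The first scenario applies the same 10 percent rate to both sums; the second gives the smaller sum 5 percent and the larger sum 10 percent.

```python
def grow(principal: float, rate: float, years: int) -> float:
    """Value of a sum after compounding annually at a fixed rate."""
    return principal * (1 + rate) ** years

def compare(rate_small: float, rate_large: float, years: int) -> tuple[float, float]:
    """Return (difference, ratio) between a £10000 and a £1000 investment."""
    small = grow(1000, rate_small, years)
    large = grow(10000, rate_large, years)
    return large - small, large / small

for years in (0, 5, 10):
    same_diff, same_ratio = compare(0.10, 0.10, years)
    pref_diff, pref_ratio = compare(0.05, 0.10, years)
    print(f"{years:>2} years | same rate: £{same_diff:,.0f}, ratio {same_ratio:.2f}"
          f" | preferential rate: £{pref_diff:,.0f}, ratio {pref_ratio:.2f}")
```

With a common rate the ratio stays fixed while the gap widens; with a preferential rate for the larger sum, both the gap and the ratio widen.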
One way of determining whether a variable follows a power law or a lognormal distribution is to plot P[X > x] on a log-log scale. A power law follows a straight line on such a plot. A lognormal distribution with a large variance can also look close to a straight line over much of its range, but it bends downward in the upper tail, whereas the longer tail of the power law keeps the line straight (see Mitzenmacher 2005). For an empirical example, I decided to look at whether the size of universities' assets followed either a power law or a lognormal distribution, which might tell us something about the kind of processes influencing inequalities in higher education. I used data from 2013 for the value of the endowment funds of US and Canadian higher education institutions (www.nacubo.org). The US institution with the largest endowment fund is Harvard ($32 billion), followed by Yale ($20 billion) and the University of Texas ($20 billion). In contrast, over 90 percent of institutions have an endowment fund with a value of less than $1 billion. The figure above shows a log-log plot of P[X > x] for the endowment funds of US institutions, with fitted lines for both a power law distribution and a lognormal distribution. The figure suggests that the lognormal distribution is a better fit than the power law distribution. Although the universities with the largest endowments are wealthier than other institutions, inequalities in endowments between institutions seem to be described by differences in inherited wealth and constant proportional growth, rather than by the ability of the wealthiest institutions to benefit preferentially from their financial assets.
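The log-log diagnostic can be sketched in Python as follows. This is not the analysis behind the figure and does not use the NACUBO data: it draws synthetic samples from a Pareto (power law) and a lognormal distribution, with parameters, sample size and probability ranges that are all assumptions chosen for illustration. The helpers `empirical_ccdf` and `loglog_slope` are hypothetical names. The sketch estimates the slope of log P[X > x] against log x in the body of each distribution and in its upper tail.

```python
import math
import random

def empirical_ccdf(values):
    """Return (x, P[X > x]) pairs in ascending x, omitting the maximum (where P = 0)."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (n - i - 1) / n) for i, x in enumerate(xs[:-1])]

def loglog_slope(points, p_lo, p_hi):
    """Least-squares slope of log P[X > x] on log x, using points with p_lo <= P <= p_hi."""
    pts = [(math.log(x), math.log(p)) for x, p in points if p_lo <= p <= p_hi]
    mx = sum(u for u, _ in pts) / len(pts)
    my = sum(v for _, v in pts) / len(pts)
    sxy = sum((u - mx) * (v - my) for u, v in pts)
    sxx = sum((u - mx) ** 2 for u, _ in pts)
    return sxy / sxx

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
pareto = empirical_ccdf([rng.paretovariate(1.5) for _ in range(n)])
lognormal = empirical_ccdf([rng.lognormvariate(0.0, 1.5) for _ in range(n)])

# Body: 10% to 50% of observations lie above x. Tail: only 0.1% to 1% lie above x.
pareto_body = loglog_slope(pareto, 0.1, 0.5)
pareto_tail = loglog_slope(pareto, 0.001, 0.01)
lognormal_body = loglog_slope(lognormal, 0.1, 0.5)
lognormal_tail = loglog_slope(lognormal, 0.001, 0.01)

print(f"Pareto:    body slope {pareto_body:.2f}, tail slope {pareto_tail:.2f}")
print(f"Lognormal: body slope {lognormal_body:.2f}, tail slope {lognormal_tail:.2f}")
```

In this sketch the Pareto slopes should be similar in the body and the tail, while the lognormal slope should become clearly steeper in the tail. With real data, a more careful comparison would fit both distributions formally rather than rely on slopes alone.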