## Wednesday, July 20, 2011

### Power Law in Wealth Distribution? NO...

Recently I have been teaching myself to estimate power laws from empirical data using the improved techniques developed in Clauset et al (2009) with the hopes of applying them in my own research.  What follows is an example where I apply these techniques to some data on the wealth of the 400 Richest Americans from 2003.  This example is partially covered in one of the co-author's of Clauset et al (2009) (i.e., Cosma Shalizi) lecture notes.

Below, I expand on the example given in those lecture notes by
1. Estimating the power-law threshold parameter, xmin, from the data
2. Constructing standard errors and confidence intervals for the scaling parameter, α, and for the threshold parameter, xmin (that take into account the fact that I have estimated both α and xmin).
3. Testing an alternative hypothesis for the distribution of wealth (i.e, log-normal)
Always plot your data! I think that it is a good idea to start any data analysis with a plot of some description.  I will start this one with a plot of the probability of an individual having wealth greater than wealth (W), against wealth (W)...or more succinctly: a Survival Plot.
If the data follow a power law, then this relationship should be roughly linear.  The converse, however, is not true!  Linearity is not a sufficient condition for a power law.  Also, lots of things might look linear on a log-log plot even when they aren't.

Recipe for Fitting Power Laws:

Parameter Estimation: Following Clauset et al (2009), I estimated the scaling parameter, α, using the method of maximum likelihood whilst using the suggested KS minimization procedure for estimating the threshold parameter, xmin.  Results:
• α: 2.34
• xmin: 9e8 ($900 million) • obs.above.threshold: 302 (out of 400) The code for estimating power laws can be found on Aaron Clauset's website. There are two different implementations of the estimation procedure in R: pareto.fit() and plfit(). I have experimented with both, and although they give identical results for the above estimation, this is not true in general. From what I can tell, the (slight) differences in estimation results stem from differing implementations of the KS test. The pareto.fit() method uses R's ks.test() method, while plfit() hand-codes the KS test. I have not (yet) conducted a Monte-Carlo study to see if the differences between the implementations are worth worrying about. Standard Errors and Confidence Intervals: To calculate the standard errors and confidence intervals for the parameters I used a non-parametric procedure outlined in Cosma's lecture notes. Cosma also graciously provides the code for his lecture notes as well! To calculate the standard errors and 95% confidence intervals for α and xmin, I had to alter (trivially) the provided code to take into account the fact that I was not given the threshold parameter and instead estimated it from the data. Results: • α.se: 0.10 • xmin.se: 364e6 ($364 million)
• α.ci: (2.13, 2.53)
• xmin.ci: (-2e8, 1.05e9)
Obvious one cannot really have negative wealth (at least not at the level of \$-200 million), so just note that the 95% confidence interval for xmin includes zero.

Survival Plot (fitted power-law included):
Looks pretty good!  But while you might be able to rule out a particular model just by looking at it, I think you would be hard pressed to rule a model "in" by "eye-balling it" (unfortunately, I am sure that people do).

Goodness-of-fit Test: Continuing to follow Clauset et al (2009), I again used a Monte Carlo KS test procedure to conduct a goodness-of-fit test for the power law model.  The procedure basically generates synthetic data by re-sampling the data itself below xmin, and then generating simulated data from a power law with the estimated parameters above xmin.   A power-law is then re-estimated to each synthetic data set and the KS statistic is stored.  The result is a p-value that represents the fraction of simulations whose KS statistic was at least as large as the one obtained from fitting the power law model to the wealth data.  A small p-value (i.e., a p-value < 0.10) indicates that only a very small fraction of the simulations had a KS stat as large as the wealth data and is evidence against the plausibility of the power-law model.  Results: p-value of 0.0012!  It would seem that the data reject the power-law as a plausible model.  This is perhaps not surprising...

Testing Alternative Hypotheses:  So the wealth distribution does not follow a power law!  What about log-normal? I test this using the likelihood ratio procedure recommended in Clauset et al (2009)

The null hypothesis for comparing two non-nested distributions is that both distributions are equally far (in a Kullback-Leibler relative entropy sense...note to self: understand what the hell Kullback-Leibler relative entropy means!) from one another.  I report the Vuong test statistic and the two-sided p-value:
• Vuong Statistic: 0.21
• p-value: 0.83
Conclusion: fail to reject the null hypothesis that the power-law and the log-normal distributions are equally bad (again in the Kullback-Leibler relative entropy sense).

Other alternatives (for which  Clauset et al (2009) publish results) are the exponential, stretched exponential (Weibull), and the power-law with exponential cut-off.  I continue to have trouble getting the code for fitting the exponential distribution to work (keep getting an error in the nlm() method for minimizing the negative log-likelihood function...it keeps returning non-finite values...which makes me think that it is probably user error on my part).  I have also not yet figured out how to get the code for estimating the power-law with exponential cutoff to work.  This is at the top of my to-do list as I feel it is a very plausible alternative distribution (perhaps not necessarily for wealth...but definitely for some other data sets that I am working with at the moment).

I can however report results for the stretched exponential (Weibull).  I was able to get the code to work after I learned how to suppress warnings in R (i.e., using the suppressWarnings() method).  Again I report the Vuong test statistic and the two-sided p-value:
• Vuong Statistic: 8.05
• p-value: 0.00
The fact that the sign on the Vuong Statistic is positive indicates that the log-likelihood for the power-law model was greater than the log-likelihood for the stretched exponential (Weibull).   The test statistic of 8.05 is so large that the p-value is actually 8.88e-16!  Thus we can reject the stretched exponential in favor of the power-law for this data set (but given that we have already rejected the power law...this test is somewhat superfluous).  I hope I have interpreted the above test correctly.  If anyone feels otherwise definitely speak up!

The real objective of this post was to replicate results from fitting a power-law to empirical data (in a rigorous way that does not abuse linear regression), so that I feel confident in applying these tools in my own research.  Thus I purposefully did not stray too far from Clauset et al (2009).  I think I succeeded (at least in replicating the work of Clauset et al (2009) for this data set).  Hopefully, I have not made any glaring errors...but if I did, they are my own!