## Sunday, February 26, 2012

### Can a better choice of threshold save the power-law?

In my previous post, I provided evidence against the power-law as a plausible model for the tails of equity returns.  In my analysis I took as given that the "true" value of the threshold parameter for the power-law model was 2 in all three cases.  In this post I am going to do away with the assumption of a threshold of 2 and see if it changes the analysis in any way.  After all, it could be that Gabaix et al (Nature, 2003) consulted a sub-standard Oracle who gave them bad information regarding the true power-law threshold!

I will now re-estimate α while choosing the threshold parameter to minimize the KS distance between the data and a true power-law, as suggested in Clauset et al (SIAM, 2009).  As a baseline, the KS distances for the best-fit power-law models from my previous post, where xmin = 2, were D = 0.01994, 0.0293, and 0.0199 for the combined, positive, and negative tails, respectively.
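To make the threshold search concrete, here is a minimal sketch of the idea, not the code behind the numbers in this post. It assumes strictly positive, continuous data; the function names and the `min_tail` cut-off are illustrative choices of mine, and the scan is quadratic in the sample size, so it only suits small samples as written.

```python
import math
import random

def fit_alpha(tail, xmin):
    """MLE of the scaling exponent for a continuous power law above xmin."""
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)

def ks_distance(sorted_tail, xmin, alpha):
    """Largest gap between the empirical CDF of the tail and the fitted CDF."""
    n = len(sorted_tail)
    gap = 0.0
    for i, x in enumerate(sorted_tail):
        model_cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        gap = max(gap, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return gap

def select_threshold(data, min_tail=50):
    """Try each distinct observation as xmin; keep the smallest KS distance."""
    xs = sorted(data)
    best = None
    for k in range(len(xs) - min_tail + 1):
        if k > 0 and xs[k] == xs[k - 1]:
            continue  # same candidate threshold as the previous one
        xmin, tail = xs[k], xs[k:]
        alpha = fit_alpha(tail, xmin)
        d = ks_distance(tail, xmin, alpha)
        if best is None or d < best[2]:
            best = (xmin, alpha, d)
    return best  # (xmin, alpha, D)

if __name__ == "__main__":
    # Synthetic power-law sample (xmin = 1, alpha = 3.5) via inverse transform.
    rng = random.Random(1)
    sample = [(1.0 - rng.random()) ** (-1.0 / 2.5) for _ in range(1500)]
    print(select_threshold(sample))
```

The `min_tail` guard is there because very small tails can produce a deceptively small KS distance by chance; the value 50 is arbitrary.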

The following are my new parameter estimates for α and xmin along with the minimal KS distances, D, using the Clauset et al (SIAM, 2009) procedure to choose the threshold parameter:

α = 4.51(2); xmin = 4.282; D = 0.00836
α+ = 5.19(7); x+min = 6.267; D = 0.0141
α- = 4.25(2); x-min = 4.112; D = 0.00758

Again, the numbers in parentheses indicate the amount of uncertainty in the final digit, and again I use a parametric bootstrap to estimate 95% confidence intervals for the above estimates of the scaling exponent, this time taking the estimated value of the threshold parameter as given.1

95% CI for α is (4.477, 4.551);  α+: (5.067, 5.325); α-: (4.207, 4.298)
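For readers who have not seen a parametric bootstrap before, the following is a rough sketch of how such an interval can be computed with the threshold held fixed. It is an illustration under my own assumptions (percentile intervals, inverse-transform sampling, hypothetical function names), not the exact routine used for the intervals above.

```python
import math
import random

def fit_alpha(tail, xmin):
    """MLE of the scaling exponent for a continuous power law above xmin."""
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)

def draw_power_law(rng, alpha, xmin):
    """One draw from a continuous power law via the inverse CDF."""
    return xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))

def bootstrap_ci(alpha_hat, xmin, n, reps=1000, level=0.95, seed=0):
    """Percentile confidence interval for alpha, taking xmin as given."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        # Simulate a tail of the same size from the fitted model, then refit.
        synthetic = [draw_power_law(rng, alpha_hat, xmin) for _ in range(n)]
        estimates.append(fit_alpha(synthetic, xmin))
    estimates.sort()
    lower = estimates[int((1.0 - level) / 2.0 * reps)]
    upper = estimates[int((1.0 + level) / 2.0 * reps) - 1]
    return lower, upper

if __name__ == "__main__":
    # Illustrative inputs only: the tail size n here is made up.
    print(bootstrap_ci(alpha_hat=4.25, xmin=4.112, n=2000, reps=200))
```

Because the threshold is never re-estimated inside the loop, intervals produced this way ignore threshold uncertainty.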

These estimates for the scaling exponents are wildly different than those reported in Gabaix et al (Nature, 2003) and there is now a huge difference between the estimated scaling exponents for the positive and negative tails of the combined return distributions!  Looks like the Oracle gave Gabaix et al (Nature, 2003) a value for the threshold parameter that was much too low!  The result: significantly biased estimates of the scaling exponents in all three cases.  Here is a plot of the best-fit power-law model for the negative tail...

The next, and perhaps most important, question to ask is whether or not this improved estimate of the scaling exponent alters the results of the goodness-of-fit tests. For the combined tails using 2500 replications, the p-value for the KS goodness-of-fit test is 0.0004; for the positive tail only the p-value is 0.1472; finally, for the negative tail the p-value is 0.0644.  Thus the power-law is rejected as plausible for the combined tails, remains plausible for the positive tail of the distribution, and is border-line rejected for the negative tail of the distribution.2

Finally, when I test the power-law model against a log-normal alternative using likelihood ratio tests, I fail to reject the two-sided null hypothesis (i.e., that both the power-law and the log-normal are "equally far" from the truth) when I combine both the positive and negative tails into a single data set, and when I analyze the positive tail separately.  More plainly, given the data at hand I simply cannot distinguish between the power-law model and the log-normal in either of these cases.

However, for the negative tail of the return distribution, I am still able to distinguish between the power-law and the log-normal (Vuong statistic: -2.17, two-sided p-value: 0.03), and reject the power-law in favor of the log-normal (one-sided p-value: 0.015).
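As a rough illustration of where the Vuong statistic and its two p-values come from, here is a sketch that takes the pointwise log-likelihoods of two non-nested models as inputs. Fitting the truncated log-normal itself is omitted, and the function names are hypothetical; I am also assuming the 1/n variance convention.

```python
import math
from statistics import NormalDist, fmean, pstdev

def vuong_pvalues(stat):
    """Two-sided and one-sided p-values from a standard normal reference."""
    std_normal = NormalDist()
    two_sided = 2.0 * std_normal.cdf(-abs(stat))  # models equally far from truth
    one_sided = std_normal.cdf(stat)              # small when model A is worse
    return two_sided, one_sided

def vuong_test(loglik_a, loglik_b):
    """Normalized log-likelihood ratio of model A (e.g. power law) over model B."""
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    n = len(diffs)
    # Sum of differences divided by sqrt(n) times their standard deviation.
    stat = math.sqrt(n) * fmean(diffs) / pstdev(diffs)
    two_sided, one_sided = vuong_pvalues(stat)
    return stat, two_sided, one_sided

if __name__ == "__main__":
    # Plugging in the statistic reported above gives p-values of about
    # 0.03 (two-sided) and 0.015 (one-sided).
    print(vuong_pvalues(-2.17))
```

A negative statistic means the second model (here the log-normal) has the higher likelihood on average.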

At this point I feel like I need to reiterate that combining all of the returns from all equities listed on the Russell 1000 index and analyzing the distribution as if all of the observations came from a single company is of questionable value, and that the use of plots of the combined tails in Gabaix et al (Nature, 2003) and Gabaix et al (QJE, 2006) encourages readers of those papers to conclude that the support for the power-law model is much stronger than is justified (at least by my analysis).  In future posts I will assess support for the power-law model by analyzing each tail of each equity separately.  You might guess that the power-law does not come out well...and you would be right!

1Note that these confidence intervals are likely too narrow as they ignore the uncertainty in my estimate of the threshold parameter (it would be ideal to use a non-parametric bootstrap to derive standard errors and confidence intervals for the above estimates, but alas my 4-year-old MacBook is too slow to handle so much data!).
2It is worth noting that although the power-law is a plausible model for the positive tail of the return distribution, there are comparatively few, only 4178 observations, above the optimal threshold. For comparison, the best-fit power-law for the negative tail had almost 19,000 observations above its optimal threshold.

## Thursday, February 23, 2012

### Power-laws, equity returns, and the fallacy of division...

This recent post from the Three-Toed Sloth, which led me to this Science article by Mason Porter et al, provided a much-needed incentive to break from preparing to present my first-year research paper and write this post questioning the "stylized fact" of power-law tails in equity return distributions...

The plot above shows the empirical upper cumulative distribution of the absolute values of normalized daily returns taken from CRSP for the constituents of the Russell 1000 index (i.e., the largest 1000 U.S. companies as measured by market capitalization) at the end of Q2 2011 (roughly 5.7 million observations).  I normalize the returns of each constituent stock so that the normalized returns have a mean of 0 and a standard deviation of 1.  Specifically, I first construct a daily logarithmic return series for stock i using the closing price (adjusted for splits and dividends!).  I then define the normalized return on date t as follows:

$$r'_{i,t} = \frac{r_{i,t} - \bar{r}_i}{\sigma_i}$$

where $r_{i,t}$ is the logarithmic return of stock $i$ on date $t$, and $\bar{r}_i$ and $\sigma_i$ are the mean and standard deviation of the daily logarithmic return series for stock $i$.  This plot is my version, using CRSP data instead of TAQ data (only because my university does not subscribe to TAQ!), of Figure 1 in Gabaix et al (Nature, 2003) and Figure I in Gabaix et al (QJE, 2006).
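The normalization can be sketched in a few lines. This is only an illustration: the price list is made up, the function names are mine, and I am assuming the population (1/n) standard deviation, which the definition above does not pin down.

```python
import math
from statistics import fmean, pstdev

def log_returns(prices):
    """Daily logarithmic returns from a series of adjusted closing prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def normalize(returns):
    """Subtract the stock's own mean and divide by its own standard deviation."""
    mu = fmean(returns)
    sigma = pstdev(returns)  # assumption: population standard deviation
    return [(r - mu) / sigma for r in returns]

if __name__ == "__main__":
    adjusted_close = [100.0, 102.0, 101.0, 105.0, 103.0]  # made-up prices
    print(normalize(log_returns(adjusted_close)))
```

Each stock is normalized with its own mean and standard deviation before the series are pooled.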

I have several issues with this plot.  First, Gabaix et al (Nature, 2003) and Gabaix et al (QJE, 2006) lay out a theory about the tail of the return distribution of individual stocks, not a population of stocks.  The above plot suggests that the tails of the return distribution of a population of stocks (in some average sense?) might be well approximated by a power-law.  It suggests nothing about how well the power-law model approximates reality for each of the constituent stocks individually.

Second, in plotting the empirical upper cumulative distribution of the absolute value of returns, one is assuming that the negative and positive tails are well approximated by the same probability distribution.  But why should this be true?  My prior would be that the economic mechanism generating large negative returns, although perhaps related, would be fundamentally different from the mechanism generating large positive returns.  Specifically, my prior would be that there would be, on average, negative skewness in the return distribution: the negative tail of stock returns would be "heavier" than the positive tail (on average at least).1

Gabaix et al (QJE, 2006) makes the claim that the power-law holds for both the positive and negative tails separately based on previous empirical research: Gopikrishnan et al (Physical Review E, 1999) and Plerou et al (Physical Review E, 1999) among others.  Gopikrishnan et al (Physical Review E, 1999) analyse the return distributions of various global stock market indices and find evidence of power-law tails with estimated scaling exponents of ζ ≈ 3 (or α ≈ 4) for both tails.  Plerou et al (Physical Review E, 1999) analyse the positive and negative tails of the distribution of combined normalized returns for 1000 U.S. companies and report significantly different estimates of the scaling exponent for the positive and negative tails.  Plerou et al (Physical Review E, 1999) also report significant variation in estimated scaling exponents across stocks for both the positive and negative tails around an "average" value of ζ ≈ 3 (or α ≈ 4).  N.B. the estimates of the scaling exponents relied predominantly on OLS methods, with occasional use of the Hill estimator popular in the quantitative finance literature, and came with no tests of goodness-of-fit of the power-law model.

Finally, it seems a bit odd to simply combine all of the normalized returns for all stocks listed on the Russell 1000 index and analyse the distribution of these returns "as if" they represented returns of a single company, especially given that you cannot treat returns from different companies as being independent: there are likely complex dependencies between returns of different stocks on any given day t.  Plerou et al (Physical Review E, 1999) seems to be a source of this practice (which they justify based on the visual similarities between empirical distribution functions and in order to get "better statistics").

Just for kicks, I decided to apply the now familiar Clauset et al (SIAM, 2009) method for fitting a power-law model to the above data.  First, I consult the Oracle (which in this case was the same Oracle consulted by Gabaix et al (Nature, 2003)), and am told that 2 is a reasonable threshold for the power-law model.  Maximum likelihood estimation (AKA the Hill estimator) yields the following estimates of the scaling exponent:

α = 3.943(5); α+ = 3.976(7); α- = 3.908(7)

where the numbers in parentheses indicate the amount of uncertainty in the final digit.  The figure below shows the best-fit power-law using the Gabaix et al (Nature, 2003) threshold.
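For a fixed threshold, the maximum likelihood estimate has a closed form, sketched below. I am assuming here that the uncertainty quoted in parentheses corresponds to the usual asymptotic standard error (α - 1)/√n, where n is the number of observations at or above the threshold; the function name and the toy data are illustrative only.

```python
import math

def hill_mle(data, xmin):
    """Continuous power-law MLE of alpha and its asymptotic standard error."""
    tail = [x for x in data if x >= xmin]
    n = len(tail)
    alpha = 1.0 + n / sum(math.log(x / xmin) for x in tail)
    std_err = (alpha - 1.0) / math.sqrt(n)  # leading-order approximation
    return alpha, std_err, n

if __name__ == "__main__":
    # Toy data: four tail points with log(x / xmin) = 1, two below threshold.
    toy = [2.0 * math.e] * 4 + [1.0, 1.5]
    print(hill_mle(toy, xmin=2.0))  # alpha = 2, standard error = 0.5, n = 4
```

Observations below the threshold play no role in the estimate, which is why the choice of threshold matters so much.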

I use a parametric bootstrap to estimate 95% confidence intervals for the above estimates (taking a value of the threshold parameter of 2 as given):

95% CI for α is (3.932, 3.953);  α+: (3.963, 3.988); α-: (3.894, 3.922)

All three confidence intervals exclude α=4. The estimated scaling exponent for the negative tail is less than the estimate for the positive tail, and the confidence intervals for α+ and α- are disjoint (a sufficient condition to ensure that the estimates are significantly different), implying that the negative tail of the return distribution is significantly "heavier" than the positive tail.  Are these differences in parameter estimates economically meaningful?  I argue they are.

Suppose that you are a portfolio manager.  You hold all 1000 stocks on the Russell 1000 in your portfolio, and are interested in estimating the probability that at least one of the stocks in your portfolio will experience a negative return of at least 80 standard deviations in magnitude.2  First suppose that you are well versed in the economics/econo-physics literature on power-laws and believe that the scaling exponent is definitely α = 4. Using an estimate for the scaling exponent of α = 4, the probability of such an event is roughly 8.63e-7.  Now suppose you are well versed in the statistical literature on actually estimating power-laws from data (which means that you also know better than to abuse OLS!) and are able to estimate α = 3.943.  In this case the probability of such an event occurring is roughly 1.06e-6.  Finally, suppose that you know enough stats to estimate the scaling exponent correctly and enough economics to suspect that you should treat negative and positive returns differently.  You estimate a scaling exponent for the negative tail of α- = 3.908, and from this you estimate the probability of observing a negative 80 standard deviation event to be roughly 1.21e-6.  The probability of observing a negative 80 standard deviation return using α- = 3.908 is roughly 14% larger than the estimated probability obtained using α = 3.943, and roughly 40% larger than the estimated probability obtained using α = 4 (see footnote 3).  The point is that small changes in the estimate of the scaling parameter can lead to large changes in out-of-sample forecasting, and that these large differences in predicted probabilities are economically meaningful.
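The sensitivity to α can be checked with a short sketch. It computes only the probability of exceeding 80 conditional on being above the threshold of 2; the unconditional figures quoted above also depend on the share of observations above the threshold, which I do not reproduce here. That share cancels when taking ratios, so the 14% and 40% comparisons can still be recovered. The function name is mine.

```python
def tail_prob(x, alpha, xmin=2.0):
    """P(X > x | X >= xmin) under a continuous power law with exponent alpha."""
    return (x / xmin) ** (1.0 - alpha)

p_generic = tail_prob(80.0, 4.0)     # "textbook" exponent
p_combined = tail_prob(80.0, 3.943)  # estimate for the combined tails
p_negative = tail_prob(80.0, 3.908)  # estimate for the negative tail

if __name__ == "__main__":
    print(p_negative / p_combined)  # roughly 1.14
    print(p_negative / p_generic)   # roughly 1.40
```

The ratio between two exponents is just 40 raised to the difference in α, which is why a change in the second decimal place of the exponent moves the tail probability by double-digit percentages.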

So I have fit a power-law model to the data and used the results to argue against applying a "generic" power-law model to both tails of the return distribution.  Fine.  But I have said nothing about goodness-of-fit.  Perhaps the power-law is a rubbish fit to the data, in which case the above probabilities are likely over-estimates of the true probability of a negative 80 standard deviation event. Using the KS goodness-of-fit testing procedure advocated in Clauset et al (SIAM, 2009), I find p-values of 0, 0, and 0 using 2500 replications in each case.  Looking into the bowels of the goodness-of-fit test, under the power-law null hypothesis the typical KS distance between the model and the synthetic data is over an order of magnitude smaller than the KS distance between the observed data and the best-fit power-law model (which was D=0.01994).  It would seem that the power-law is just not a good fit.
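The logic of the goodness-of-fit test can be sketched as follows for the case where the threshold is taken as given. This is a simplified illustration with hypothetical function names: the full Clauset et al (SIAM, 2009) procedure also resamples the data below the threshold and re-estimates the threshold on every synthetic data set, which this sketch does not do.

```python
import math
import random

def fit_alpha(tail, xmin):
    """MLE of the scaling exponent for a continuous power law above xmin."""
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)

def ks_distance(sorted_tail, xmin, alpha):
    """Largest gap between the empirical CDF of the tail and the fitted CDF."""
    n = len(sorted_tail)
    gap = 0.0
    for i, x in enumerate(sorted_tail):
        model_cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        gap = max(gap, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return gap

def gof_pvalue(tail, xmin, reps=200, seed=0):
    """Share of synthetic power-law samples that fit worse than the data."""
    rng = random.Random(seed)
    tail = sorted(tail)
    n = len(tail)
    alpha_hat = fit_alpha(tail, xmin)
    d_observed = ks_distance(tail, xmin, alpha_hat)
    worse = 0
    for _ in range(reps):
        # Simulate from the fitted model, refit, and measure the KS distance.
        synthetic = sorted(
            xmin * (1.0 - rng.random()) ** (-1.0 / (alpha_hat - 1.0))
            for _ in range(n)
        )
        d_synthetic = ks_distance(synthetic, xmin, fit_alpha(synthetic, xmin))
        if d_synthetic >= d_observed:
            worse += 1
    return worse / reps

if __name__ == "__main__":
    # A shifted exponential sample is not a power law, so p should be tiny.
    rng = random.Random(7)
    not_power_law = [1.0 + rng.expovariate(1.0) for _ in range(1000)]
    print(gof_pvalue(not_power_law, xmin=1.0, reps=100))
```

A p-value of 0 simply means that none of the synthetic samples had a KS distance as large as the observed one.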

If the power-law model is a poor fit to the data, then what better models might there be?  How about a log-normal?  Using likelihood ratio tests I am able to reject the power-law in favour of the log-normal (Vuong statistic: -12.85; p-value: 8.09e-38!).

In case you are wondering, the probability of observing an 80 standard deviation event under the best-fit log-normal distribution is 6.89e-9.  Using the negative tail of the return distribution, an 80 standard deviation event is 175 times more likely under the best-fit power-law model than under the best-fit log-normal!

At some level, my biggest complaint about the figures in Gabaix et al (Nature, 2003) and Gabaix et al (QJE, 2006) is that they unintentionally encourage readers to engage in a kind of fallacy of division, and in doing so make it seem that the support for the power-law hypothesis is stronger than is clearly justified.  I plan to follow with additional posts related to my own work in this area (much of which forms the first chapter of my PhD thesis).  The next post will see how (if!) the results change when I apply methods from Clauset et al (SIAM, 2009) to select the optimal threshold (instead of consulting the Oracle).  Comments and constructive criticism are always welcome...

1One plausible economic mechanism for generating asymmetry in return distributions would be some sort of leverage effect.  For example, suppose that an investor is trading on margin using his current asset holdings as collateral.  If the value of his collateral drops, he may be faced with a margin call which might force him to sell some of his assets to reduce his leverage.  This "fire-sale" of assets will likely further depress asset prices.  Meanwhile, if the value of the underlying collateral rises then this investor is relatively un-constrained: he may choose to borrow more (against his now more valuable collateral) or he may simply choose to consume some of the proceeds, etc.  The basic idea is that if collateral values drop, and you are a highly levered investor, then your behaviour may be significantly constrained; while if you are highly levered, and the value of your underlying collateral rises, then you are relatively un-constrained.  I would suspect that this type of behavioural asymmetry would generate asymmetry in the return distributions of assets.
2Gabaix et al (QJE, 2006) reports observing returns of over 80 standard deviations in magnitude in the TAQ data set.  I just wanted some reasonable number that was larger than any return I observe in my data.
3Although I think that these differences are suggestive, it would be better if I could translate these differences in probabilities into differences in something like bank capital requirements.