The plot above shows the empirical upper cumulative distribution of the absolute values of normalized daily returns, taken from CRSP, for the constituents of the Russell 1000 index (i.e., the largest 1000 U.S. companies as measured by market capitalization) at the end of Q2 2011 (roughly 5.7 million observations). I normalize the returns of each constituent stock so that the normalized returns have a mean of 0 and a standard deviation of 1. Specifically, I first construct a daily logarithmic return series for stock i using the closing price (adjusted for splits and dividends!). I then define the normalized return on date t as follows:
$$r'_{i,t} = \frac{r_{i,t} - \bar{r}_i}{\sigma_i}$$
where $r_{i,t}$ is the logarithmic return of stock i on date t, and $\bar{r}_i$ and $\sigma_i$ are the mean and the standard deviation of the daily logarithmic return series for stock i. This plot is my version, using CRSP data instead of TAQ data (only because my university does not subscribe to TAQ!), of Figure 1 in Gabaix et al (Nature, 2003) and Figure I in Gabaix et al (QJE, 2006).
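A minimal sketch of the normalization described above, for a single stock. The function name and the price list are hypothetical, and the use of the population standard deviation (rather than the sample one) is my assumption; with millions of observations the difference is negligible.

```python
import math
import statistics


def normalized_log_returns(adjusted_closes):
    """Standardize one stock's daily log returns to mean 0 and std 1."""
    # Daily logarithmic returns from consecutive adjusted closing prices.
    log_returns = [
        math.log(p1 / p0)
        for p0, p1 in zip(adjusted_closes, adjusted_closes[1:])
    ]
    mean = statistics.fmean(log_returns)
    # Assumption: population std, so the output has std exactly 1.
    std = statistics.pstdev(log_returns)
    return [(r - mean) / std for r in log_returns]


# Hypothetical adjusted closes, for illustration only.
prices = [100.0, 101.5, 99.8, 102.3, 101.1, 103.0]
z = normalized_log_returns(prices)
print(len(z), statistics.fmean(z), statistics.pstdev(z))
```

Pooling the absolute values of these series across all constituents gives the kind of sample plotted above.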
I have several issues with this plot. First, Gabaix et al (Nature, 2003) and Gabaix et al (QJE, 2006) lay out a theory about the tail of the return distribution of individual stocks, not a population of stocks. The above plot suggests that the tails of the return distribution of a population of stocks (in some average sense?) might be well approximated by a power-law. It suggests nothing about how well the power-law model approximates reality for each of the constituent stocks individually.
Second, in plotting the empirical upper cumulative distribution of the absolute value of returns, one is assuming that the negative and positive tails are well approximated by the same probability distribution. But why should this be true? My prior would be that the economic mechanism generating large negative returns, although perhaps related, would be fundamentally different from the mechanism generating large positive returns. Specifically, my prior would be that there would be, on average, negative skewness in the return distribution: the negative tail of stock returns would be "heavier" than the positive tail (on average at least).[1] Gabaix et al (QJE, 2006) makes the claim that the power-law holds for both the positive and negative tails separately based on previous empirical research: Gopikrishnan et al (Physical Review E, 1999) and Plerou et al (Physical Review E, 1999) among others. Gopikrishnan et al (Physical Review E, 1999) analyse the return distributions of various global stock market indices and find evidence of power-law tails with estimated scaling exponents of ζ ≈ 3 (or, equivalently, α = ζ + 1 ≈ 4) for both tails. Plerou et al (Physical Review E, 1999) analyse the positive and negative tails of the distribution of combined normalized returns for 1000 U.S. companies and report significantly different estimates of the scaling exponent for the positive and negative tails. Plerou et al (Physical Review E, 1999) also report significant variation in estimated scaling exponents across stocks for both the positive and negative tails around an "average" value of ζ ≈ 3 (or α ≈ 4). N.B. the estimates of the scaling exponents relied predominantly on OLS methods, with occasional use of the Hill estimator popular in the quantitative finance literature, and there were no tests of goodness-of-fit of the power-law model.
Finally, it seems a bit odd to simply combine all of the normalized returns for all stocks in the Russell 1000 index and analyse the distribution of these returns "as if" they represented returns of a single company. This is especially so given that you cannot treat returns from different companies as independent: there are likely complex dependencies between the returns of different stocks on any given day t. Plerou et al (Physical Review E, 1999) seems to be a source of this practice (which they justify based on the visual similarities between empirical distribution functions and in order to get "better statistics").
Just for kicks, I decided to apply the now familiar Clauset et al (SIAM, 2009) method for fitting a power-law model to the above data. First, I consult the Oracle (which in this case was the same Oracle consulted by Gabaix et al (Nature, 2003)), and am told that 2 is a reasonable threshold for the power-law model. Maximum likelihood estimation (AKA the Hill estimator) yields the following estimates of the scaling exponent:
α = 3.943(5); α+ = 3.976(7); α- = 3.908(7)
where the numbers in parentheses indicate the amount of uncertainty in the final digit. The figure below shows the best-fit power-law using the Gabaix et al (Nature, 2003) threshold.
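For readers who want to see the estimator itself, here is a minimal sketch of the closed-form maximum likelihood (Hill) estimate for a continuous power-law with density proportional to x^(-α) above a fixed threshold, together with its asymptotic standard error (α - 1)/√n. The function names are hypothetical, and the data are synthetic draws generated by inverse-transform sampling, not the CRSP returns.

```python
import math
import random


def fit_power_law_tail(data, xmin):
    """MLE of alpha for p(x) ~ x^(-alpha) on x >= xmin, with its std error."""
    tail = [x for x in data if x >= xmin]
    n = len(tail)
    alpha = 1.0 + n / sum(math.log(x / xmin) for x in tail)
    std_err = (alpha - 1.0) / math.sqrt(n)  # asymptotic standard error
    return alpha, std_err, n


def sample_power_law(alpha, xmin, size, rng):
    """Inverse-transform draws from a continuous power-law above xmin."""
    return [
        xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
        for _ in range(size)
    ]


rng = random.Random(42)
# Synthetic data with a known exponent, for illustration only.
synthetic = sample_power_law(alpha=4.0, xmin=2.0, size=50_000, rng=rng)
alpha_hat, std_err, n_tail = fit_power_law_tail(synthetic, xmin=2.0)
print(f"alpha = {alpha_hat:.3f} +/- {std_err:.3f} (n = {n_tail})")
```

Running the same function on only the positive or only the negative normalized returns (in absolute value) is how one would obtain separate estimates such as α+ and α-.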
I use a parametric bootstrap to estimate 95% confidence intervals for the above estimates (taking a value of the threshold parameter of 2 as given):
95% CI for α is (3.932, 3.953); α+: (3.963, 3.988); α-: (3.894, 3.922)
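A minimal sketch of a parametric bootstrap with the threshold held fixed, as in the text: fit the exponent, simulate many synthetic tails of the same size from the fitted power-law, refit each one, and read off percentiles of the refitted exponents. The helper names, the sample size, and the number of bootstrap draws are illustrative assumptions, and the "observed" tail here is itself synthetic.

```python
import math
import random


def hill_alpha(tail, xmin):
    """Closed-form MLE of the power-law exponent for data above xmin."""
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)


def sample_power_law(alpha, xmin, size, rng):
    """Inverse-transform draws from a continuous power-law above xmin."""
    return [
        xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
        for _ in range(size)
    ]


def bootstrap_ci(tail, xmin, n_boot, rng, level=0.95):
    """Percentile confidence interval from a parametric bootstrap."""
    alpha_hat = hill_alpha(tail, xmin)
    draws = sorted(
        hill_alpha(sample_power_law(alpha_hat, xmin, len(tail), rng), xmin)
        for _ in range(n_boot)
    )
    lo_idx = int((1.0 - level) / 2.0 * n_boot)
    hi_idx = int((1.0 + level) / 2.0 * n_boot) - 1
    return alpha_hat, draws[lo_idx], draws[hi_idx]


rng = random.Random(7)
# Stand-in for the observed tail (synthetic, for illustration only).
observed_tail = sample_power_law(alpha=3.9, xmin=2.0, size=5_000, rng=rng)
alpha_hat, ci_low, ci_high = bootstrap_ci(observed_tail, 2.0, n_boot=200, rng=rng)
print(f"alpha = {alpha_hat:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

The interval narrows roughly like 1/√n, which is why intervals based on hundreds of thousands of tail observations can be as tight as those reported above.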
All three confidence intervals exclude α=4. The estimated scaling exponent for the negative tail is less than the estimate for the positive tail, and the confidence intervals for α+ and α- are disjoint (a sufficient condition to ensure that the estimates are significantly different), implying that the negative tail of the return distribution is significantly "heavier" than the positive tail. Are these differences in parameter estimates economically meaningful? I argue they are.
Suppose that you are a portfolio manager. You hold all 1000 stocks on the Russell 1000 in your portfolio, and are interested in estimating the probability that at least one of the stocks in your portfolio will experience a negative return of at least 80 standard deviations in magnitude.[2] First suppose that you are well versed in the economics/econo-physics literature on power-laws and believe that the scaling exponent is definitely α = 4. Using an estimate for the scaling exponent of α = 4, the probability of such an event is roughly 8.63e-7. Now suppose you are well versed in the statistical literature on actually estimating power-laws from data (which means that you also know better than to abuse OLS!) and are able to estimate α = 3.943. In this case the probability of such an event occurring is roughly 1.06e-6. Finally, suppose that you know enough stats to estimate the scaling exponent correctly and enough economics to suspect that you should treat negative and positive returns differently. You estimate a scaling exponent for the negative tail of α- = 3.908, and from this you estimate the probability of observing a negative 80 standard deviation event to be roughly 1.21e-6. The probability of observing a negative 80 standard deviation return using α- = 3.908 is roughly 14% larger than the estimated probability obtained using α = 3.943 and is roughly 40% larger than the estimated probability obtained using α = 4.[3] The point is that small changes in the estimate of the scaling parameter can lead to large changes in out-of-sample forecasts, and that these large differences in predicted probabilities are economically meaningful.
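The relative sizes of these probabilities can be checked without knowing the fraction of observations above the threshold. A minimal sketch, assuming the continuous power-law tail P(X > x | X > xmin) = (x/xmin)^(-(α-1)) and assuming the same threshold of 2 and the same tail mass in each scenario, so that the tail mass cancels in the ratio. The function name is hypothetical.

```python
def relative_tail_probability(alpha_a, alpha_b, x, xmin):
    """Ratio P_a(X > x) / P_b(X > x) for two power-laws sharing xmin.

    Each tail probability is proportional to (x / xmin) ** -(alpha - 1),
    so the common tail mass cancels and only the exponent gap matters.
    """
    return (x / xmin) ** (alpha_b - alpha_a)


# Negative-tail estimate versus the pooled estimate and versus alpha = 4.
vs_pooled = relative_tail_probability(3.908, 3.943, x=80.0, xmin=2.0)
vs_four = relative_tail_probability(3.908, 4.0, x=80.0, xmin=2.0)
print(f"{vs_pooled:.3f} {vs_four:.3f}")  # roughly 1.14 and 1.40
```

Because the event sits 40 thresholds out in the tail, a gap of only 0.035 in the exponent is amplified into a difference of about 14% in the predicted probability.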
So I have fit a power-law model to the data and used the results to argue against applying a "generic" power-law model to both tails of the return distribution. Fine. But I have said nothing about goodness-of-fit. Perhaps the power-law is a rubbish fit to the data, in which case the above probabilities are likely over-estimates of the true probability of a negative 80 standard deviation event. Using the KS goodness-of-fit testing procedure advocated in Clauset et al (SIAM, 2009), I find p-values of 0, 0, and 0 using 2500 replications in each case. Looking into the bowels of the goodness-of-fit test, under the power-law null hypothesis the typical KS distance between the model and the synthetic data is over an order of magnitude smaller than the KS distance between the observed data and the best-fit power-law model (which was D=0.01994). It would seem that the power-law is just not a good fit.
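A minimal sketch of a KS-based goodness-of-fit test in the spirit of Clauset et al (SIAM, 2009), simplified by holding the threshold fixed so that only the tail is simulated and refitted. The helper names and the small replication count are illustrative assumptions, and the "observed" data are deliberately drawn from a shifted exponential (not a power-law) so that the test has something to reject.

```python
import math
import random


def hill_alpha(tail, xmin):
    """Closed-form MLE of the power-law exponent for data above xmin."""
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)


def sample_power_law(alpha, xmin, size, rng):
    """Inverse-transform draws from a continuous power-law above xmin."""
    return [
        xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
        for _ in range(size)
    ]


def ks_distance(tail, alpha, xmin):
    """Largest gap between the empirical CDF and the power-law CDF."""
    xs = sorted(tail)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        model_cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        # Compare against the empirical CDF just before and just after x.
        d = max(d, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return d


def gof_p_value(tail, xmin, n_rep, rng):
    """Share of synthetic power-law samples that fit worse than the data."""
    alpha_hat = hill_alpha(tail, xmin)
    d_obs = ks_distance(tail, alpha_hat, xmin)
    exceed = 0
    for _ in range(n_rep):
        synth = sample_power_law(alpha_hat, xmin, len(tail), rng)
        if ks_distance(synth, hill_alpha(synth, xmin), xmin) >= d_obs:
            exceed += 1
    return d_obs, exceed / n_rep


rng = random.Random(0)
xmin = 2.0
# Deliberately non-power-law tail (shifted exponential), illustration only.
non_power_law_tail = [xmin + rng.expovariate(1.0) for _ in range(2_000)]
d_obs, p_value = gof_p_value(non_power_law_tail, xmin, n_rep=100, rng=rng)
print(f"D = {d_obs:.4f}, p-value = {p_value:.2f}")
```

A p-value near zero means that essentially none of the synthetic power-law samples fit as badly as the observed data, which is the situation described above.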
If the power-law model is a poor fit to the data, then what better models might there be? How about a log-normal? Using likelihood ratio tests I am able to reject the power-law in favour of the log-normal (Vuong statistic: -12.85; p-value: 8.09e-38!).
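A minimal sketch of how a normalized log-likelihood ratio (Vuong) statistic is computed from pointwise log-likelihoods of two non-nested models. To keep the example self-contained I swap in a shifted exponential as the alternative, because its maximum likelihood fit is closed-form; the log-normal comparison in the text would instead require numerically maximizing a truncated log-normal likelihood. The function name and the synthetic data are assumptions.

```python
import math
import random


def vuong_test(loglik_a, loglik_b):
    """Vuong z-statistic and two-sided p-value for model A versus model B.

    A negative statistic favours model B.
    """
    n = len(loglik_a)
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / n
    z = math.sqrt(n) * mean / math.sqrt(var)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


rng = random.Random(1)
xmin = 2.0
# Synthetic tail drawn from the stand-in alternative, illustration only.
tail = [xmin + rng.expovariate(1.0) for _ in range(5_000)]

# Closed-form MLEs for both models above the fixed threshold.
alpha_hat = 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)
rate_hat = len(tail) / sum(x - xmin for x in tail)

ll_power = [
    math.log((alpha_hat - 1.0) / xmin) - alpha_hat * math.log(x / xmin)
    for x in tail
]
ll_expon = [math.log(rate_hat) - rate_hat * (x - xmin) for x in tail]

z, p = vuong_test(ll_power, ll_expon)
print(f"Vuong statistic = {z:.2f}, p-value = {p:.3g}")
```

As with the statistic reported above, a large negative value means the data favour the alternative over the power-law.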
In case you are wondering, the probability of observing an 80 standard deviation event under the best-fit log-normal distribution is 6.89e-9. Using the negative tail of the return distribution, an 80 standard deviation event is 175 times more likely under the best-fit power-law model than under the best-fit log-normal!
At some level, my biggest complaint about the figures in Gabaix et al (Nature, 2003) and Gabaix et al (QJE, 2006) is that they unintentionally encourage readers to engage in a kind of fallacy of division, and in doing so make it seem that the support for the power-law hypothesis is stronger than is clearly justified. I plan to follow with additional posts related to my own work in this area (much of which forms the first chapter of my PhD thesis). The next post will see how (if!) the results change when I apply methods from Clauset et al (SIAM, 2009) to select the optimal threshold (instead of consulting the Oracle). Comments and constructive criticism are always welcome...
[1] One plausible economic mechanism for generating asymmetry in return distributions would be some sort of leverage effect. For example, suppose that an investor is trading on margin using his current asset holdings as collateral. If the value of his collateral drops, he may be faced with a margin call which might force him to sell some of his assets to reduce his leverage. This "fire-sale" of assets will likely further depress asset prices. Meanwhile, if the value of the underlying collateral rises then this investor is relatively un-constrained: he may choose to borrow more (against his now more valuable collateral) or he may simply choose to consume some of the proceeds, etc. The basic idea is that if collateral values drop, and you are a highly levered investor, then your behaviour may be significantly constrained; while if you are highly levered, and the value of your underlying collateral rises, then you are relatively un-constrained. I would suspect that this type of behavioural asymmetry would generate asymmetry in the return distributions of assets.
[2] Gabaix et al (QJE, 2006) reports observing returns of over 80 standard deviations in magnitude in the TAQ data set. I just wanted some reasonable number that was larger than any return I observe in my data.
[3] Although I think that these differences are suggestive, it would be better if I could translate these differences in probabilities into differences in something like bank capital requirements.