Thursday, July 28, 2011

Some Confused Hypothesis Testing...

Returning to a previous post on fitting a power-law to Metropolitan Statistical Area (MSA) population data. First, I reported the following inferences based on Vuong LR tests (see the previous post for the Vuong statistics and p-values; a sketch of how the statistic can be computed follows the list below):
  • Two Sided Tests (Null hypothesis is that both distributions are equally far from the "truth"):
    • Fail to reject null hypothesis for power-law and log-normal.
    • Reject the null hypothesis for the Weibull.
    • Reject the null hypothesis for the exponential.
  • One-sided Tests (Null hypothesis is a power-law):
    • Fail to reject the null hypothesis of power-law compared with Weibull.
    • Fail to reject the null hypothesis of a power-law compared with an exponential.
  • Vuong LR test for nested models (used only to compare a power-law and a power-law with exponential cut-off):
    • Reject null hypothesis of a power-law in favor of a power-law with exponential cut-off.
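
As a rough illustration of how the non-nested tests work, here is a minimal sketch of the Vuong statistic computation. This is not the code from my original analysis; ll_a and ll_b are assumed to be arrays holding the pointwise log-likelihoods of the same data under the two fitted candidate distributions:

    import numpy as np
    from scipy.stats import norm

    def vuong_test(ll_a, ll_b):
        """Normalized log-likelihood ratio (Vuong) statistic for two
        non-nested candidate models, with two- and one-sided p-values."""
        diff = np.asarray(ll_a) - np.asarray(ll_b)  # pointwise log-likelihood ratios
        n = diff.size
        lr = diff.sum()                             # raw log-likelihood ratio R
        sigma = diff.std(ddof=1)                    # std. dev. of the pointwise ratios
        v = lr / (sigma * np.sqrt(n))               # normalized Vuong statistic
        p_two = 2.0 * norm.sf(abs(v))               # H0: both models equally close to the "truth"
        p_one = norm.cdf(v)                         # small value favors model B over model A
        return v, p_two, p_one

The two-sided p-value corresponds to the "equally far from the truth" null above, while the one-sided p-value is small when the evidence favors the second model over the first.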
Additionally, if you compare the following log-likelihoods for the various models, you would also select the power-law with exponential cut-off as the preferred model (the nested comparison is sanity-checked in the sketch after the list):
  • Power-law with exponential cut-off: -2143.809
  • Log-normal: -2144.513
  • Power-law: -2146.368
  • Weibull: -2173.089
  • Exponential: -2182.766
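
For the nested comparison in particular, the reported result can be checked directly from the log-likelihoods above using the standard nested likelihood-ratio (Wilks) approximation: twice the difference between the cut-off model and the plain power-law is compared against a chi-squared distribution with one degree of freedom, since the cut-off model has one extra parameter. A minimal sketch, using the values listed above:

    from scipy.stats import chi2

    ll_powerlaw = -2146.368   # log-likelihood of the plain power-law (from the list above)
    ll_cutoff = -2143.809     # log-likelihood of the power-law with exponential cut-off

    lr_stat = 2.0 * (ll_cutoff - ll_powerlaw)   # about 5.1
    p_value = chi2.sf(lr_stat, df=1)            # roughly 0.02
    print(lr_stat, p_value)

A p-value of roughly 0.02 is consistent with rejecting the plain power-law in favor of the cut-off model, as reported above.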
Later (after staring at the plot for quite a while and convincing myself that the log-normal "looks" better than the power-law with cut-off) I came back to my work and decided to write some code to conduct a non-nested Vuong LR test comparing a power-law with exponential cut-off and a log-normal distribution. I then reported that I was able to reject the null hypothesis for the two-sided test (that both the power-law with exponential cut-off and the log-normal distributions are equally far from the "truth"), and to reject the null hypothesis of a power-law with exponential cut-off for the one-sided test in favor of the log-normal.

I may have been a bit hasty in that conclusion. Upon further review, I think there is either a bug in my code or a conceptual error in how I implemented the test. For one, I certainly should have noticed that the p-values for both tests were suspiciously small (they were both 0.00!) given that the difference between the two models' log-likelihoods is less than 1!
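
To see why those numbers looked off, here is a back-of-the-envelope check. The sample size and standard deviation below are purely hypothetical placeholders, not values from the MSA fit; the point is that with a raw log-likelihood ratio below 1, the normalized Vuong statistic should be small and the two-sided p-value nowhere near zero unless the pointwise variance were implausibly tiny:

    import numpy as np
    from scipy.stats import norm

    lr = 0.7      # hypothetical raw log-likelihood ratio (below 1, as in the post)
    n = 300       # hypothetical number of observations
    sigma = 0.5   # hypothetical std. dev. of the pointwise log-likelihood ratios

    v = lr / (sigma * np.sqrt(n))
    print(v, 2 * norm.sf(abs(v)))   # about 0.08 and 0.94 -- a long way from p = 0.00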

As always, comments are very much encouraged...
