This is my first cut of lecture notes on the Geometry of Linear Regression. (FYI: b and beta are the same; I'm having HTML issues.) Hopefully I have not made any egregious errors…
The Geometry of Linear Regression
Suppose we have the following system of equations:
y = Xb
Here the dependent variable y is a vector of length m, X is our (m x n) matrix (i.e., m rows and n columns, typically m > n) of independent variables, and b is a vector of coefficients of length n. Why are we going to start by talking about the geometry of solutions to systems of linear equations? Well, because at a fundamental level linear regression is really all about "solving" a system of linear equations when there is no true solution. Linear regression finds a coefficient vector b that is the "best" because it makes Xb "closest", in a very specific way, to the vector y.
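To make this concrete, here is a minimal numerical sketch (using NumPy, which is not part of the original notes; the numbers are made up for illustration). With more equations than unknowns there is generally no exact solution, so np.linalg.lstsq returns the b that makes Xb as close as possible to y:

    import numpy as np

    # m = 5 observations, n = 2 independent variables: an overdetermined system y = Xb.
    X = np.array([[1.0, 2.0],
                  [1.0, 3.0],
                  [1.0, 5.0],
                  [1.0, 7.0],
                  [1.0, 8.0]])
    y = np.array([2.1, 2.9, 5.2, 6.8, 8.1])

    # No exact solution exists in general; lstsq returns the b minimizing ||y - Xb||.
    b, residual_ss, rank, _ = np.linalg.lstsq(X, y, rcond=None)
    print(b)            # the "closest" coefficient vector
    print(residual_ss)  # squared distance from y to its projection onto col(X)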
Now our system of m equations with n unknowns (the n coefficients which comprise the vector b) tells us that the vector y (our dependent variable) is a linear combination of the columns of X (our independent variables):
y = b1x1 + b2x2 + … + bnxn
Here xi, i = 1, …, n, are the column vectors of length m that make up the matrix X. This means that the vector y is in the column space, col(X), of our matrix X. In pictures, with 2 independent variables, notice that our dependent variable, the vector y, lies in the plane corresponding to col(X).
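A quick numerical illustration of this point (a NumPy sketch with made-up numbers, not part of the original notes): multiplying X by b is exactly the same as taking the linear combination of the columns of X with weights b1, …, bn.

    import numpy as np

    X = np.array([[1.0, 4.0],
                  [2.0, 5.0],
                  [3.0, 6.0]])   # columns x1 and x2, each of length m = 3
    b = np.array([2.0, -1.0])    # coefficients b1 and b2

    # The matrix-vector product Xb vs. the explicit combination b1*x1 + b2*x2.
    print(np.allclose(X @ b, b[0] * X[:, 0] + b[1] * X[:, 1]))  # True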
Remember from its definition that col(X) is the vector space spanned by the column vectors of X, which is simply a fancy way of saying that col(X) includes all linear combinations of the column vectors of X (which includes y at this point). If the column vectors, our independent variables, also happen to be linearly independent of one another, then our column vectors form a basis for col(X). Normally this will be the case…but it is crucial that our set of independent variables be linearly independent of one another!
Suppose we have the nice case: X is an (m x n) matrix with m > n, and the columns of X, which span col(X) by definition, are linearly independent of one another and thus also form a basis for col(X). This implies that the rank of X (which, as you will remember, is simply the number of linearly independent columns of X) and the dimension of col(X) (which is simply the number of vectors needed to form a basis for col(X)) are both equal to n. Our matrix has full column rank! We are off to a good start…
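If you want to check full column rank on a computer, here is a small sketch (again NumPy, with made-up matrices, not part of the original notes): np.linalg.matrix_rank counts the linearly independent columns.

    import numpy as np

    # Columns are linearly independent, so rank = n = 2: full column rank.
    X_full = np.array([[1.0, 0.0],
                       [1.0, 1.0],
                       [1.0, 2.0]])
    print(np.linalg.matrix_rank(X_full))       # 2

    # Here the second column is twice the first, so the rank drops to 1.
    X_deficient = np.array([[1.0, 2.0],
                            [1.0, 2.0],
                            [1.0, 2.0]])
    print(np.linalg.matrix_rank(X_deficient))  # 1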
Let’s Talk About Correlation…
Geometrically, the correlation between two variables (which we are representing as vectors) is related to the angle between their two corresponding vectors via the following formula (with each variable demeaned, i.e., measured as deviations from its mean):

corr(x1, x2) = cos(theta) = (x1 · x2) / (||x1|| ||x2||)
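For the numerically inclined, here is a quick sanity check of that formula (a NumPy sketch with made-up numbers, not part of the original notes): after demeaning, the cosine of the angle between the two vectors is exactly the sample correlation.

    import numpy as np

    x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

    # Demean each variable so the angle between the vectors measures correlation.
    x1c, x2c = x1 - x1.mean(), x2 - x2.mean()

    cos_theta = x1c @ x2c / (np.linalg.norm(x1c) * np.linalg.norm(x2c))
    print(cos_theta)                  # cosine of the angle between the vectors
    print(np.corrcoef(x1, x2)[0, 1])  # the sample correlation: the same number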
Cosine! Theta! Dot products and Euclidean norms! Boo! Let's draw pictures… In this first picture our two independent variables are positively (negatively) correlated because the angle between their two corresponding vectors in col(X) is acute (obtuse). I draw the positively correlated case below…
In this second picture, the two vectors are at right angles to one another and are therefore uncorrelated. This is an extremely important case… when you learn about OLS, IV and GLS, the question of whether or not your error term is uncorrelated with your explanatory (i.e., independent) variables will come up again and again… remember, geometrically, uncorrelated means vectors at right angles!
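To see why this orthogonality matters so much for OLS, here is a small sketch (NumPy, made-up data, not part of the original notes): the least-squares residual vector is at right angles to every column of X, i.e. X'e = 0.

    import numpy as np

    # 6 made-up observations, an intercept plus one regressor.
    X = np.column_stack([np.ones(6), np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])])
    y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 5.8])

    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b  # residual vector

    # Each entry is (numerically) zero: the residuals are orthogonal to the columns of X.
    print(X.T @ e)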
Finally, what does it look like if the two vectors are perfectly positively (negatively) correlated with one another? Although I will leave it up to you to draw your own picture, for the perfectly positively correlated case look at the picture of the acute case and think about what happens as the angle gets really, really small. Once you figure that out and get your picture, the perfectly negatively correlated case is simply the 180-degree (hint) opposite…