
Dummy variables regression- avoiding multicollinearity

I am currently running a completely flexible regression, i.e. a regression of Y only on dummy variables. It is of very high dimension (more than 200 dummy variables). A snapshot of my data looks like the following:

Code:
clear
input float(Y D1 D2 D3 D4 D5 D6)
 1 1 0 0 0 0 0
 2 0 1 0 0 0 0
 3 1 0 0 0 0 0
 4 0 0 0 0 1 0
56 1 0 0 0 0 0
 1 0 0 0 0 0 1
21 0 0 1 0 0 0
end

where Y is the regressand and the dummies D1 through D200 form a full set. I have thousands of observations. To avoid the dummy variable trap, I have dropped the constant. Something strange happens: when I force the constant to be dropped, the coefficients are identified, since (X'X) has full rank. However, when I estimate the model with FE and no constant, it still drops one variable! The thing is, because there are so many observations, even when I drop a variable, the determinant is still close to 0. But a variable should only be dropped when the determinant is exactly 0. Any ideas? Thanks!
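For concreteness, a minimal sketch of the no-constant regression described above, run against the data snippet (the post does not show the exact commands, so the varlist is limited to the six dummies in the snippet; with the full data it would be D1-D200):

Code:
* OLS of Y on the full set of dummies with the constant suppressed,
* so no base category needs to be omitted
regress Y D1-D6, noconstant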
