Channel: Statalist

How to first difference a panel data set with many dummy variables?

Dear all,

I am analyzing the impact of (1) 3rd and 4th division soccer teams, (2) their stadiums, and (3) their affiliation with 1st and 2nd division teams versus independence on per capita GDP at the county level. My data set (strongly balanced) covers 266 counties from 1995 to 2012 with around 30 independent variables (many of them dummies). I am using a linear reduced-form model:
y_it = β1 X_it + β2 Z_it + ϑ_i + μ_t + ε_it

y_it is the per capita GDP in county i at time t
X_it is a vector of local market variables for county i at time t; β1 is the corresponding vector of parameters to be estimated
Z_it is a vector of third and fourth league team as well as stadium variables in county i at time t; β2 is the corresponding vector of parameters to be estimated
ϑ_i is a county-specific fixed effect
μ_t is a time-specific fixed effect
ε_it is a random disturbance

Since the data are heteroskedastic, autocorrelated, and contemporaneously correlated, and the model includes a lagged dependent variable, I thought that taking first differences would eliminate the autocorrelation, the explicit fixed effects, and the correlation of the lagged dependent variable with the disturbances. I would then run xtpcse, which, I think, accounts for heteroskedasticity and contemporaneous correlation. Since first-differencing (and then simplifying) the model above doesn't change the parameters, I would interpret them just as before first-differencing.
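
To make the last point concrete, here is a sketch of the differencing step as I understand it. The Δ shorthand for the change from t-1 to t is my own notation and not part of the model above. Subtracting the equation for t-1 from the equation for t, the county effect cancels, the time effect survives as a differenced time effect, and the two parameter vectors are unchanged:

```latex
\begin{align*}
y_{it} - y_{i,t-1}
  &= \beta_1 (X_{it} - X_{i,t-1}) + \beta_2 (Z_{it} - Z_{i,t-1})
     + (\vartheta_i - \vartheta_i) + (\mu_t - \mu_{t-1})
     + (\varepsilon_{it} - \varepsilon_{i,t-1}) \\
\Delta y_{it}
  &= \beta_1 \Delta X_{it} + \beta_2 \Delta Z_{it}
     + \Delta \mu_t + \Delta \varepsilon_{it}
\end{align*}
```

Note that for a dummy variable the differenced regressor can only take the values -1, 0, and 1.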

Questions:
(a) Is there anything objectionable about my approach from an econometric (and/or statistical) point of view?
(b) Can first-differencing be done with binary variables? Intuitively, this isn't as easy as it seems. I did some research but couldn't find an entirely satisfying answer.
(c) What are the Stata commands to get first differences? Everything I found seems to violate the boundaries of each panel; i.e., the last year of county 1 seems to be subtracted from the first year of county 2, and so on.
(d) Concerning xtpcse, which of the options correlation(ar1) and correlation(psar1) is suitable for which type of data? The Stata manual wasn't much help to me here.

Best regards,
Alex


Note: I am using Stata 12.
