Channel: Statalist

Regressing separately on state, or combined in one regression gives different estimates (Diff-in-Diff)

Dear all,

I'm doing research on the effect of coffeeshops. The problem I encounter is the following:

When I use the following code, I expect to get a separate treatment effect for every municipality:
Code:
//INPUT:
generate MunicipalityXTreatment = MunicipalityCode if coffeeshop1 == 1
replace MunicipalityXTreatment = 0 if missing(MunicipalityXTreatment)

set matsize 800
set more off
local housecharacteristics i.HouseSubType OpenPorch LNHouseSize nRooms nFloors Pool Garden i.ParkFacility i.MaintenanceIn i.MaintenanceEx Monument ///
CostsForSeller ConstrPer_1906_1930 ConstrPer_1931_1944 ConstrPer_1945_1959 ConstrPer_1960_1970 ConstrPer_1971_1980 ConstrPer_1981_1990 ///
ConstrPer_1991_2000 ConstrPer_after2000

areg LNPrice i.MunicipalityXTreatment `housecharacteristics' i.Year i.Month if control5000 == 1, absorb(PC4) cluster(PC4)

where 'coffeeshop1' is the treatment indicator.

The results are the following:
Code:
//OUTPUT:
Linear regression, absorbing indicators                    Number of obs   =     416052
                                                           F( 115,    313) =     951.73
                                                           Prob > F        =     0.0000
                                                           R-squared       =     0.9021
                                                           Adj R-squared   =     0.9020
                                                           Root MSE        =     0.1889

                                             (Std. Err. adjusted for 314 clusters in PC4)
-----------------------------------------------------------------------------------------
                       |               Robust
               LNPrice |      Coef.  Std. Err.        t   P>|t|      [95% Conf. Interval]
-----------------------+-----------------------------------------------------------------
MunicipalityXTreatment |
                   114 |   .1578126   .0245783     6.42   0.000      .109453     .2061723
                   202 |   .0160582    .024453     0.66   0.512    -.0320548     .0641713
                   268 |  -.0124073   .0126619    -0.98   0.328    -.0373206      .012506
                   307 |    .040688   .0224847     1.81   0.071    -.0035523     .0849282
                   344 |  -.0128316   .0166819    -0.77   0.442    -.0456545     .0199913
                   363 |   -.083449   .0200483    -4.16   0.000    -.1228955    -.0440025
                   518 |  -.0376285   .0114111    -3.30   0.001    -.0600807    -.0151764
                   687 |   .0131509   .0097345     1.35   0.178    -.0060024     .0323043
                   748 |   .0434363   .0139777     3.11   0.002     .0159342     .0709384
                   855 |    .077392   .0228927     3.38   0.001      .032349     .1224351
                   983 |   .1277879   .0208445     6.13   0.000     .0867748     .1688011
                  1674 |   .0602467   .0103444     5.82   0.000     .0398933     .0806001

//I have not inserted the whole list of estimates

                 _cons |   7.610939   .1025506    74.22   0.000     7.409163     7.812714
-----------------------+-----------------------------------------------------------------
                   PC4 |   absorbed                                      (314 categories)

But when I run the regression for each municipality separately, the results are very different, as can be seen below:

Code:
//INPUT:
areg LNPrice i.MunicipalityXTreatment `housecharacteristics' i.Year i.Month if control5000 == 1 & MunicipalityCode == 363, absorb(PC4) cluster(PC4)

//OUTPUT:
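Linear regression, absorbing indicators                        Number of obs   =     122129
                                                               F(  62,     62) =          .
                                                               Prob > F        =          .
                                                               R-squared       =     0.9318
                                                               Adj R-squared   =     0.9318
                                                               Root MSE        =     0.1652

                                                  (Std. Err. adjusted for 63 clusters in PC4)
---------------------------------------------------------------------------------------------
                           |               Robust
                   LNPrice |      Coef.  Std. Err.        t   P>|t|      [95% Conf. Interval]
---------------------------+-----------------------------------------------------------------
363.MunicipalityXTreatment |  -.0273632   .0174325    -1.57   0.122    -.0622103     .0074839

//Again not all estimates are copied, for simplicity reasons. Controls are the same for every regression.

                     _cons |   7.186877   .1328596    54.09   0.000     6.921294     7.452459
---------------------------+-----------------------------------------------------------------
                       PC4 |   absorbed                                       (63 categories)

//INPUT: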
areg LNPrice i.MunicipalityXTreatment `housecharacteristics' i.Year i.Month if control5000 == 1 & MunicipalityCode == 1674, absorb(PC4) cluster(PC4)

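//OUTPUT:
note: 2.MaintenanceEx omitted because of collinearity

Linear regression, absorbing indicators                    Number of obs   =       5615
                                                           F(   9,      9) =          .
                                                           Prob > F        =          .
                                                           R-squared       =     0.9082
                                                           Adj R-squared   =     0.9064
                                                           Root MSE        =     0.1621

                                              (Std. Err. adjusted for 10 clusters in PC4)
-----------------------------------------------------------------------------------------
                       |               Robust
               LNPrice |      Coef.  Std. Err.        t   P>|t|      [95% Conf. Interval]
-----------------------+-----------------------------------------------------------------
MunicipalityXTreatment |
                  1674 |  -.0060665    .010335    -0.59   0.572    -.0294459      .017313

//Control variables not shown

                 _cons |   7.520199   .2312275    32.52   0.000     6.997126     8.043272
-----------------------+-----------------------------------------------------------------
                   PC4 |   absorbed                                       (10 categories)

//INPUT: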
areg LNPrice i.MunicipalityXTreatment `housecharacteristics' i.Year i.Month if control5000 == 1 & MunicipalityCode == 748, absorb(PC4) cluster(PC4)

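//OUTPUT:
Linear regression, absorbing indicators                        Number of obs   =       6693
                                                               F(  10,     10) =          .
                                                               Prob > F        =          .
                                                               R-squared       =     0.9012
                                                               Adj R-squared   =     0.8995
                                                               Root MSE        =     0.1669

                                                  (Std. Err. adjusted for 11 clusters in PC4)
---------------------------------------------------------------------------------------------
                           |               Robust
                   LNPrice |      Coef.  Std. Err.        t   P>|t|      [95% Conf. Interval]
---------------------------+-----------------------------------------------------------------
748.MunicipalityXTreatment |  -.0371785   .0133704    -2.78   0.019    -.0669695    -.0073875

//Again control variables not copied into this post

                     _cons |   7.784693   .1013415    76.82   0.000      7.55889     8.010496
---------------------------+-----------------------------------------------------------------
                       PC4 |   absorbed                                       (11 categories)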

So the separate regressions give negative estimates for municipalities 748 and 1674 (the latter not statistically significant), whereas the first, pooled regression gives positive and significant estimates for both. My question is: how can this be the case, and how can I solve it? If it cannot be solved, what is the best approach?

Many thanks in advance,

With kind regards,

Jeroen
