Dear all,
Brief overview: I'm trying to estimate the impact of intrawar presence (1), interwar presence (2), and economic sanctions (3) on exports, using a gravity model.
When it comes to estimating the gravity equation, PPML is the new benchmark. All previous studies on this topic, however, use OLS, so it might be interesting to see whether the conventional wisdom holds under this new approach. That is why, before doing any inference on my three main variables of interest, I am running a sensitivity analysis comparing different OLS specifications with different PPML specifications.
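To make the comparison concrete, here is a minimal sketch of the two estimators side by side (variable names are illustrative placeholders, not my actual specification; the key difference is that log-linear OLS drops zero trade flows, while PPML keeps them and is robust to heteroskedasticity in the multiplicative error):

```stata
* Minimal sketch of the OLS vs PPML comparison; variable names are
* placeholders for illustration only.
gen lnexport = ln(export)        // missing when export == 0
* Log-linear OLS: zero flows drop out of the estimation sample
regress lnexport lndist contig comlang, vce(cluster dyad)
* PPML (Santos Silva & Tenreyro): same gravity equation in levels, zeros kept
ppml export lndist contig comlang, cluster(dyad)
```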
What's the problem?
My main concern is the PPML specification with time-varying country dummies.
To be more specific, I use dummies for every origin country and every destination country on a three-year basis (following a previous paper by Ruiz and Villarubia, which also uses OLS, not PPML). To be explicit: Germany has 14 dummies in total: Germany as exporter for the years 1989-1991, Germany as importer for the years 1989-1991, Germany as exporter for the years 1992-1994, and so on.
I need three-year country dummies because my dataset consists of 89 countries (covering 92% of world exports) over a 21-year span, from 1989 to 2009, resulting in a balanced panel of 164,472 observations. Yearly dummies would require 89 x 21 x 2 = 3,738 dummies, far too many for the computational power at my disposal.
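For reference, the three-year period index can be built along these lines (a sketch, assuming the dataset has a variable year running from 1989 to 2009; the exact code may differ):

```stata
* Sketch of one way to build the three-year period index year3
* (assumes a variable "year" running from 1989 to 2009).
gen year3 = int((year - 1989)/3) + 1
* 1989-1991 -> 1, 1992-1994 -> 2, ..., 2007-2009 -> 7
```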
What's my Stata code?
I create the dummies using:

Code:
* year3 is a categorical variable from 1 to 7 indexing the three-year periods
* origin is the origin country id; destination is the destination country id
xi, prefix(_G) noomit i.origin*i.year3 i.destination*i.year3

I then drop the time-invariant country dummies and the time dummies automatically created by the previous command, and run PPML:

Code:
drop _Gorigin* _Gyear* _Gdestin*
ppml export2 lndistwces contig comlang_off colony _G* if year < 2010, cluster(dyad)
* export2 is exports in billions of 2005 US$ (to allow quicker computation), from Feenstra/UN Comtrade
* lndistwces is the (log) weighted distance, from CEPII
* contig is 1 for contiguity, from CEPII
* comlang_off is 1 for a common official language, from CEPII
* colony is 1 for previous colonial ties, from CEPII

Then I run a RESET test:

Code:
predict XB, xb
gen XB2 = XB^2
quietly ppml export2 lndistwces contig comlang_off colony XB2 _G* if year < 2010, keep cluster(dyad)
test XB2 = 0

Results are as follows:

Code:
Number of parameters: 1243
Number of observations: 164472
Pseudo log-likelihood: -61261.918
R-squared: .91971331
Option strict is: off
                                 (Std. Err. adjusted for 7,832 clusters in dyad)
--------------------------------------------------------------------------------
             |               Robust
     export2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+------------------------------------------------------------------
  lndistwces |  -.7634372   .0258945   -29.48   0.000    -.8141894     -.712685
      contig |   .3082213   .0659453     4.67   0.000     .1789708     .4374718
 comlang_off |   .2199701   .0614091     3.58   0.000     .0996105     .3403298
      colony |  -.0989539   .1018637    -0.97   0.331     -.298603     .1006952

. test XB2 = 0
 ( 1)  XB2 = 0
           chi2(  1) =    6.23
         Prob > chi2 =    0.0125

From a qualitative point of view the results are in line with previous studies, but the RESET test p-value is a bit too low.

My plan is to run the same model including my variables of interest (intrawar presence, interwar presence, and economic sanctions), and then to repeat everything on subsets of Homogeneous, Reference-Priced, and Differentiated products following the Rauch classification, to see which products are most sensitive to unstable conditions.

My questions are:
Can the RESET test result alone undermine the reliability of my results?
Could the RESET tests of the other models undermine the reliability of those results too?
Am I overthinking this?

Any comment on the code, on the RESET test in particular, and on the project in general would be much appreciated.
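PS: one possible workaround for the computational constraint, sketched here as an alternative I have not actually run: the user-written ppmlhdfe command (Correia, Guimaraes, and Zylkin; install via ssc install ppmlhdfe) absorbs high-dimensional fixed effects, which might make even yearly exporter-time and importer-time dummies feasible. Variable names follow my dataset above.

```stata
* Sketch (untested): PPML with high-dimensional fixed effects via ppmlhdfe.
* The absorb() option replaces the explicit _G* dummies, so yearly
* country-time effects become computationally feasible.
ssc install ppmlhdfe
ppmlhdfe export2 lndistwces contig comlang_off colony if year < 2010, ///
    absorb(i.origin#i.year i.destination#i.year) vce(cluster dyad)
```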