I need to estimate a production function using the Olley-Pakes semiparametric estimation procedure. I am using the routine -opreg- (developed by Yasar, Raciborski, and Poi, Stata Journal 2008; YRP from now on). Since I am using Compustat as my dataset, I tried to replicate the OLS and OP estimation results that YRP report on page 228, but I get very different numbers. The problem is that the do-file "opreg.do" provided by YRP contains no information on how to select the sample or how to construct the relevant variables. In particular, it is not clear how to construct the variables lnm (log of materials) and firm age that are used in the paper. My code is below (after my own variable definitions, I simply copied and pasted the code provided by YRP):
MY CODE TO DEFINE THE VARIABLES
clear all
use "volatility annual.dta" // Compustat annual
** Compustat data cleaning
destring gvkey, replace
bys gvkey fyear: gen duplicate = _N
drop if duplicate>1
drop duplicate
rename fyear year
* Firm age
bys gvkey (year): gen first_year = year[1] // first year observed for the firm; gen ... = min(year) would only copy year row by row
gen age = year-first_year
replace age = age+1
keep if year>=1995 & year<=2002
xtset gvkey year
* Labor, capital, investment, age: all variables except age are in logs
drop if sale<=0
drop if cogs<=0
drop if emp<=0
drop if xsga<0
drop if dp<0
replace xsga=0 if xsga==.
gen materials = cogs+xsga+dp
drop if materials>=.
gen lny = log(sale)
gen lnm = log(materials)
gen lnm1 = log(cogs)
gen lnkop =log(ppent)
gen lninv = log(capx)
gen lnl = log(emp)
xtset gvkey year
YRP'S CODE TO CREATE EXIT DUMMY AND DO THE ESTIMATION
gen firmid=gvkey
sort firmid year
by firmid : gen count = _N
gen survivor = count == 8
gen has90 = 1 if year == 2002
sort firmid has90
by firmid : replace has90 = 1 if has90[_n-1] == 1
replace has90 = 0 if has90 == .
sort firmid year
by firmid : gen has_gaps = 1 if year[_n-1] != year-1 & _n != 1
sort firmid has_gaps
by firmid : replace has_gaps = 1 if has_gaps[_n-1] == 1
replace has_gaps = 0 if has_gaps == .
sort firmid year // this line was missing in the original article
by firmid : generate exit = survivor == 0 & has90 == 0 & has_gaps != 1 & _n == _N
replace exit=0 if exit==1 & year==2002
* Time trend
gen t=year-1994
** ESTIMATION ******************************************************************
*** OLS Regression
reg lny lnl lnm lnkop age t
*** Olley and Pakes (1996) with OPREG command
xtset gvkey year
opreg lny, exit(exit) state(age lnkop) proxy(lninv) free(lnl lnm) cvars(t) second vce(bootstrap, seed(1) rep(2))
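One thing worth checking in the variable construction is the firm-age step: -generate- with min() is evaluated row by row, whereas -egen- with min() takes the minimum within each by-group. The following is only a minimal sketch on a made-up panel, not the Compustat extract; the data values and the names first_year_gen and first_year_egen are hypothetical and chosen purely for illustration.

```stata
* Toy panel standing in for the Compustat extract (values are made up)
clear
input gvkey year
1 1993
1 1994
1 1995
2 1995
2 1996
end

* generate + min(): evaluated observation by observation, so it just copies year
bysort gvkey (year): gen first_year_gen = min(year)

* egen + min(): minimum of year within each gvkey
bysort gvkey: egen first_year_egen = min(year)

* Age counted from the first year the firm appears (first year = age 1)
gen age = year - first_year_egen + 1

list, sepby(gvkey)
```

With the -generate- version, year minus first year is zero in every row, so age would be constant across all observations. As in the code above, age has to be computed before the sample is restricted to 1995-2002, otherwise the first observed year is truncated at 1995.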