Channel: Statalist

Multivariate imputation for a panel negative binomial outcome variable

Hello there,
I carried out a complete-case analysis with a negative binomial GLM/GEE model to evaluate the impact of a policy restricting the number of hours of alcohol sale on the prevention of road traffic deaths in the districts of the provinces of Lima and Callao, Peru.
The outcome variable is the yearly road traffic death rate (RTDR) of 49 districts from 2003 to 2010. Of the 49 districts, 24 did not implement the alcohol control policy, 18 implemented it in 2007, one in 1999, one in 2004, two in 2005, one in 2006, and two in 2008. I dropped the Santa Rosa district because it implemented the policy in 1999 and its RTDRs are only available from 2003 to 2010, that is, starting four years after the policy was launched.
Because the largest group of districts that adopted the policy (18) did so in 2007, I took 2007 as the year of the intervention. For the districts whose implementation year differs from 2007, I moved the RTDR of the implementation year to 2007 and shifted the remaining RTDRs accordingly. For instance, for a district that implemented the policy in 2008, I moved the RTDR of 2008 to 2007, and likewise for the other years. As a consequence, all of its RTDRs move one cell to the left, which leaves a missing value for the RTDR of 2010. The same applies to every other district whose implementation year differs from 2007.
Because these missing values depend on observed data (the year of implementation of the alcohol policy and the available RTDRs from 2003 to 2010), I considered them Missing at Random (MAR).
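For illustration only, the realignment described above could be sketched in Stata roughly as follows. This is not the code I actually ran. It assumes a long-form dataset with one row per district and year, and the variable names district, year, rtdr and policy_year are hypothetical.

```stata
* Hypothetical long-form layout: one row per district-year.
* Assumed variable names (not from the original dataset):
*   district     district identifier
*   year         calendar year, 2003-2010
*   rtdr         road traffic death rate
*   policy_year  year the policy started (missing if never implemented)

* Number of years each series must move so that the implementation
* year lands on 2007 (0 for districts that never implemented).
generate shift = cond(missing(policy_year), 0, policy_year - 2007)

* Aligned year: a 2008 implementer has shift = 1, so 2008 becomes 2007.
generate aligned_year = year - shift

* Keep only the aligned years inside the 2003-2010 study window.
keep if inrange(aligned_year, 2003, 2010)

* Index the aligned years 0..7 to match yy0 ... yy7.
generate t = aligned_year - 2003

* Reshape to wide form; cells with no observation become missing.
keep district policy_year t rtdr
rename rtdr yy
reshape wide yy, i(district) j(t)
```

Under these assumptions a district that implemented in 2008 keeps aligned years 2003 to 2009 only, so its yy7 is missing, which matches the example in the text.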
In this situation, I am trying to study the effect of these missing values on my estimates.
Because the panel outcome variable RTDR (yy0, yy1, ..., yy7) follows a negative binomial distribution, is stored in wide form, and has 12% missing values, I performed a multivariate imputation with "mi impute chained" as follows:

. mi set wide
. mi register imputed yy0 yy1 yy2 yy7
. mi impute chained (nbreg) yy0 yy1 yy2 yy7 = yy3 yy4 yy5 yy6 policy weekdayshr weekendshr pbnu, add(20) rseed(2232) dots

Where:
yy0, yy1, ..., yy7 is the panel outcome variable (RTDR) in wide form. It follows a negative binomial distribution.
policy is a dichotomous (yes/no) variable indicating whether the district implemented the alcohol policy.
weekdayshr is the number of hours of alcohol sale restriction from Monday to Thursday.
weekendshr is the number of hours of alcohol sale restriction from Friday to Sunday.
pbnu is the district's percentage of poverty measured by unmet basic needs, such as not having enough food.

After running the above mi impute chained syntax with nbreg, I got the following error:
" yy2: missing imputed values produced
This may occur when imputation variables are used as independent variables or when independent variables contain missing values.
You can specify option force if you wish to proceed anyway.
r(498);"
When I force mi impute chained with nbreg, I get imputed values for only some of the missing values.
However, when I run mi impute chained with regress, I do not have any problem:
. mi impute chained (regress) yy0 yy1 yy2 yy7 = yy3 yy4 yy5 yy6 policy weekdayshr weekendshr pbnu, add(20) rseed(2232) dots replace
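The error message names two possible causes, and one way to check them is sketched below using the variable names from my commands. The last check is only a guess on my part: nbreg is a count model, so it may matter whether the yy variables hold integers or non-integer rates, but I have not confirmed that this is what triggers the error.

```stata
* Count missing values in the imputed variables and in the predictors;
* option all also lists the variables that have no missing values.
misstable summarize yy0 yy1 yy2 yy7 yy3 yy4 yy5 yy6 policy weekdayshr weekendshr pbnu, all

* Show which of yy0, yy1, yy2 and yy7 are missing together.
misstable patterns yy0 yy1 yy2 yy7

* Inspect the range of the outcome variables.
summarize yy?

* Assumption to verify: count observed yy2 values that are not integers.
count if yy2 != floor(yy2) & !missing(yy2)
```

If the predictors show no missing values, the second cause named in the error message can be ruled out.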

Could you please tell me what is going on? I was thinking that maybe my Stata 12 does not support multivariate imputation of a negative binomial distributed variable with mi impute chained, but I am not sure. I would also like to know whether it is correct to carry out multiple imputation for an outcome variable, and whether you agree that my missing values are MAR.
I have attached the Stata dataset just in case; please keep the data private.
Thank you,

Victor Cruz-Campos
