Fix autocorrelation in 2sls

August 25, 2016, 12:34 pm

I'm running a 2SLS model that includes a demand equation and a supply equation. To correct for autocorrelation, I would like to include an AR(1) term in the demand equation.
Could someone help me with how to do this?

I know the command to run the 2SLS; I don't know how to include the AR process.

Regards
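
One possible starting point, offered only as a sketch: instead of adding an AR(1) term, a single equation can be estimated by 2SLS with HAC (Newey-West type) standard errors, which are robust to autocorrelation. This is a different technique from the AR(1) correction asked about. The data below are simulated, and the names q, p, x1 and z are hypothetical stand-ins for quantity, price, an exogenous demand shifter and a supply-side instrument.

```stata
* Sketch under stated assumptions: simulated data, hypothetical variable names.
clear
set seed 1
set obs 200
gen t = _n
tsset t                           // HAC standard errors need time-series data
gen z  = rnormal()                // instrument (supply shifter)
gen x1 = rnormal()                // exogenous demand shifter
gen u  = rnormal()                // demand shock
gen p  = 1 + z + 0.5*u + rnormal()
gen q  = 10 - 2*p + x1 + u

* 2SLS for the demand equation with Bartlett-kernel HAC standard errors, 1 lag
ivregress 2sls q x1 (p = z), vce(hac bartlett 1)
```

The lag length of 1 is an arbitrary choice for illustration and would need to be justified for real data.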

↧

Identifying each subject's first visit

August 25, 2016, 1:29 pm

In the simulated data below, I wish to tag each subject's first visit (by date).
The last six lines of code show two methods:
- method 1: doesn't properly identify the first visit date for each subject, and
- method 2: flags all visits as the first.
I'd like to understand what I have done incorrectly.

*simulate data
clear
input id str9 datetext
1 01Feb2014
2 01Jan2014
3 01Mar2014
2 01Jun2014
2 01May2014
2 01Apr2014
3 01Jul2014
3 01Sep2014
3 01Aug2014
4 01Oct2014
4 01Nov2014
4 01Dec2014
end
gen day = regexs(0) if regexm(datetext, "^[0-9]+") // leading digits: day of month
gen month = regexs(0) if regexm(datetext, "[a-zA-Z]+") // letters: month abbreviation
gen year = regexs(0) if regexm(datetext, "[0-9]*$") // trailing digits: four-digit year
destring day year, replace
gen mo=1 if month=="Jan"
replace mo=2 if month=="Feb"
replace mo=3 if month=="Mar"
replace mo=4 if month=="Apr"
replace mo=5 if month=="May"
replace mo=6 if month=="Jun"
replace mo=7 if month=="Jul"
replace mo=8 if month=="Aug"
replace mo=9 if month=="Sep"
replace mo=10 if month=="Oct"
replace mo=11 if month=="Nov"
replace mo=12 if month=="Dec"
gen edate=mdy(mo,day,year) // all vars must be numeric
format edate %d
drop datetext day month year mo
l, noo sepby(id)

*identify first occurrence (by date) of each id, method 1
sort edate
bys id: gen byte firstid=_n==1
l id edate firstid, noo sepby(id)
*identify first occurrence (by date) of each id, method 2
bys id edate: gen byte Firstid=_n==1
l id edate Firstid, noo sepby(id)
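
A sketch of one way to get the intended flag. In method 1, `bys id:` re-sorts the data by id alone, so the earlier sort on edate is not guaranteed to survive within each id. In method 2, `bys id edate:` forms one group per id-date pair, so every observation is the first in its own group. Putting edate in parentheses sorts on it within id without making it part of the group. The shortened data set and the variable name `first` are only for illustration.

```stata
* Sketch: build the daily date with date() and tag the earliest visit per id.
clear
input id str9 datetext
1 01Feb2014
2 01Jan2014
2 01Jun2014
2 01Apr2014
3 01Mar2014
3 01Jul2014
end
gen edate = date(datetext, "DMY")   // string -> Stata daily date
format edate %td

* Group by id, sort by edate within id, flag the first row of each group
bysort id (edate): gen byte first = _n == 1
list id edate first, noobs sepby(id)
```

If a subject can have two visits on the same earliest date, only one of them is flagged by this approach.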

↧

Tobit with multiple right censored data

August 25, 2016, 1:47 pm

I want to know whether it is possible to fit a tobit regression with multiple right-censoring limits and only one left-censoring limit (zero). I was reviewing mvtobit, but I'm not sure whether it is the answer. Please let me know if you have any ideas. I have also read about survival analysis with interval-censored data.
Thanks to everyone, and sorry for such a general question.
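
A hedged sketch of one option: Stata's `intreg` takes a lower and an upper bound per observation, so censoring limits can differ across observations. The data are simulated, and the names x, ystar, cap, lo and hi are hypothetical.

```stata
* Sketch under stated assumptions: simulated data, hypothetical variable names.
clear
set seed 42
set obs 500
gen x     = rnormal()
gen ystar = 1 + 2*x + rnormal(0, 2)        // latent outcome
gen cap   = cond(runiform() < 0.5, 3, 5)   // observation-specific upper limit
gen y     = min(max(ystar, 0), cap)        // observed, censored at 0 and at cap

* intreg convention: uncensored lo = hi = y;
* left-censored lo = ., hi = limit; right-censored lo = limit, hi = .
gen lo = y
gen hi = y
replace lo = .   if y <= 0
replace hi = 0   if y <= 0
replace lo = cap if y >= cap
replace hi = .   if y >= cap

intreg lo hi x
```

Whether this normal-errors model suits the real data is a separate question that the sketch does not address.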

↧

Creating Propensity Scores after imputation

August 25, 2016, 1:55 pm

I have a dataset containing 4,601 records (3,242 controls and 1,359 treatments) with 15 covariates. Approximately half of the covariates have missing data, ranging from 8% to 25% missing. After running several models, the variables were eventually imputed using chained equations (M=30). My data are stored in wide format.

Now that my covariates have been imputed, I want to create propensity scores to match control and treatment subjects and estimate treatment effects. The literature I have reviewed suggests creating propensity scores and matching within each completed data set, resulting in m estimates of the treatment effect, and then averaging these effects (the within approach). The alternative (the between approach) involves averaging each subject's propensity score across the m data sets and matching treated and control subjects in a single data set.

My question is: how do I create propensity scores and then run the matching strategy within the framework of mi set data? My intuition tells me to create 30 separate datasets by re-running my chained-equation imputation in flongsep format, and then to run the propensity scores and matching in each dataset. I would then compute an average outside of Stata. Is there a better way? Thanks in advance for your help!

↧

Mice

August 25, 2016, 2:13 pm

I am trying to use multiple imputation by chained equations for my missing-at-random data.
I have a dataset of 15,500 children with hearing and sight measurements and questionnaires taken at different timepoints, and from these I have derived variables, for example 'any sight problem'.
I understand the MICE process in Stata, but my question is a basic starting question: do I need to impute the missing data on the original sight measurements, or on the derived 'any sight problem' variable?

↧

melogit versus country dummies

August 25, 2016, 3:38 pm

Hi all,

I am working on a project using a 20-country survey of some 20,000 respondents. My dependent variable is a binary measure. I have between 10 and 12 independent variables, depending upon the estimation.

As a first take at the data, I ran a logit regression, with 19 country dummies (one omitted) to control for country-level effects. I got some interesting results.

I then tried a mixed effect (melogit) regression, with the individuals identified by country.

The results are basically the same, with only one variable that was initially significant at the 90 percent level becoming insignificant in the second estimation. What I am wondering is whether, given the absence of country-level variables and of over-time variation for the individuals in the sample, melogit is worthwhile, or whether I am basically capturing the same thing with the country dummies that I tried initially.

This is my first foray into mixed effect models of this sort, so I would really appreciate everyone's thoughts.

Sincerely,

Eric
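
For readers who want to see the two specifications side by side, here is a minimal sketch on simulated data. The variable names y, x, country and u are hypothetical, and the simulated country effect is an assumption made only so that the example runs.

```stata
* Sketch under stated assumptions: 20 simulated countries, 100 respondents each.
clear
set seed 20160825
set obs 20
gen country = _n
gen u = rnormal(0, 0.5)            // country-level effect
expand 100                         // 100 respondents per country
gen x = rnormal()
gen byte y = runiform() < invlogit(-0.5 + 0.8*x + u)

* Country effects as 19 dummies (one omitted)
logit y x i.country

* Country effects as a random intercept
melogit y x || country:
```

With only individual-level covariates, the two models are expected to give similar slopes on x; the random-intercept model additionally reports the estimated between-country variance.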

↧

Where can I download the 'tkcomp group' function?

August 25, 2016, 4:17 pm

Hi all,

I am trying to do some simple post-hoc comparisons and have decided that Tukey-Kramer is most appropriate for my sample size. Running 'findit tkcomp group' produced:

Web resources from Stata and other users

(contacting http://www.stata.com)
http://www.stata.com/websee.cgi?r=2&...k=tkcomp+group could not be opene
> d for read by copytextfile

r(601);

I have also tried 'findit tkcomp' and pasting the web address directly into Internet Explorer, to no avail. Does anyone know where I can download this function?

Many thanks!!
Hannah
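
As a possible alternative that avoids the download, and only as a sketch: official Stata (version 12 and later) has `pwmean`, which can apply Tukey's adjustment to all pairwise comparisons of group means. Whether its output matches what tkcomp would report has not been checked here. The example uses the shipped auto data, with mpg and rep78 standing in for the real outcome and grouping variable.

```stata
* Sketch: pairwise comparisons of means with Tukey's multiple-comparison adjustment
sysuse auto, clear
pwmean mpg, over(rep78) mcompare(tukey) effects
```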

↧

Comparing coefficients after -qregpd-

August 25, 2016, 4:18 pm

Hello dear forum members,

I am running a quantile regression analysis for panel data using the -qregpd- command:

Code:

qregpd y x1 x2 x3 x4 x5, id(id_name) fix(year) optimize(mcmc) noisy draws(1000) burn(100) arate(.5) quantile(0.99) instruments (x1 x2 x3 z1 z2 z3 z4 z5)
qregpd y x1 x2 x3 x4 x5, id(id_name) fix(year) optimize(mcmc) noisy draws(1000) burn(100) arate(.5) quantile(0.95) instruments (x1 x2 x3 z1 z2 z3 z4 z5)
qregpd y x1 x2 x3 x4 x5, id(id_name) fix(year) optimize(mcmc) noisy draws(1000) burn(100) arate(.5) quantile(0.75) instruments (x1 x2 x3 z1 z2 z3 z4 z5)
qregpd y x1 x2 x3 x4 x5, id(id_name) fix(year) optimize(mcmc) noisy draws(1000) burn(100) arate(.5) quantile(0.50) instruments (x1 x2 x3 z1 z2 z3 z4 z5)
qregpd y x1 x2 x3 x4 x5, id(id_name) fix(year) optimize(mcmc) noisy draws(1000) burn(100) arate(.5) quantile(0.25) instruments (x1 x2 x3 z1 z2 z3 z4 z5)

What would be the proper way to test whether the regression coefficients on x1 through x5 are significantly different across quantiles? I do observe differences in the unstandardized coefficients, but I suspect that comparing them directly might be misleading.

Thank you in advance for help.

↧

Can the contents of a Stata dataset (or a -list- from that dataset) be saved to a do-file?

August 25, 2016, 4:34 pm

Can the contents of a Stata dataset (or a -list- of that dataset) be saved to a do-file?
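
One option, sketched with hedging: the `dataex` command writes the listed observations as an `input` block that can be pasted into a do-file. It was originally distributed through SSC and is bundled with newer Stata releases, so the install line below may or may not be needed. The auto data and the chosen variables are only for illustration.

```stata
* ssc install dataex    // only needed if dataex is not already installed
sysuse auto, clear
dataex make price mpg in 1/5
```

The output appears in the Results window, so copying it into the do-file (or capturing it with a log) is a manual step in this sketch.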

↧

Converting a string variable to a time

August 25, 2016, 5:10 pm

Hi all,
Sorry, I'm still quite new to Stata so bear with me!
I have a variable with a large series of times in the format HH:MM:SS,milliseconds. For example, 08:56:01,070. I'm trying to convert this variable to a format recognisable by Stata for use in a time series.
I've tried variations on:
generate double time2 = clock(time, "hms")
but I just get a series of dots (missing values) in the new variable.
I would appreciate any help you could give me on this!
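
A minimal sketch of one workaround, assuming every value has exactly the layout HH:MM:SS,mmm. The likely cause of the missing values is the trailing ",070", which the "hms" mask does not describe. The sketch converts the first eight characters with `clock()` and adds the three millisecond digits separately. The two sample values are made up for illustration.

```stata
* Sketch under stated assumptions: fixed-width strings of the form HH:MM:SS,mmm
clear
input str12 time
"08:56:01,070"
"13:05:59,999"
end

* clock() returns milliseconds, so the millisecond digits can simply be added
gen double time2 = clock(substr(time, 1, 8), "hms") + real(substr(time, 10, 3))
format time2 %tcHH:MM:SS.sss
list
```

Storing the result as a double matters, because float precision is too coarse for millisecond clock values.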

↧

Controlling the smoothing function in Generalized Additive Model (GAM) and add a linear term into the model

August 25, 2016, 8:36 pm

Hi. I am studying the Generalized Additive Model (GAM) and trying to practice with Stata. I can fit the GAM with my data, but I do not know how to switch the smoothing function from cubic smoothing splines (the default) to either a kernel smoother or LOWESS. If possible, I would also like to know how to change the window size for LOWESS.

My second question is: how can I add another linear term (one that does not use a smoother) to the GAM in Stata?

I will appreciate your help so much. Thank you!

↧

cannot use the stcurve command after stcox with tvc option

August 26, 2016, 1:04 am

I am estimating a Cox model and handling non-proportionality by allowing treatment to interact with time, e.g. stcox treatment x1 x2, tvc(treatment)

The problem is that I cannot use the postestimation command "stcurve, hazard" afterwards, as it does not work in combination with the tvc option.

Is there a way to plot the hazards so that they take into account the interaction between time and treatment? I am using Stata 13.

Thanks

↧

Plot divergence from projected trend in time series data

August 26, 2016, 1:41 am

Hello everyone,

I have hourly time-series data of dependent variable Y and independent variable X.

a) I expect the relationship between X and Y to change gradually over time
b) I am interested in the evolution of the residuals of Y and the predicted values of Y, specifically after time t₁ in the relationship
c) While the data is hourly, I collapse the residuals after running the regression by day/ week (to decrease some noise as I am not interested in hourly residuals per se but rather divergence from a trend)

I regress Y on X (without year fixed effects or a trend) for the period before t₁ and calculate the residuals for the whole period to analyse their evolution. The residuals do not satisfy the Gauss-Markov assumptions but follow a trend over time (as expected, see a)). While I can visually spot divergence of the residuals from the clearly visible trend, I would prefer to Thus, I cannot observe b).

My question to you is how to manipulate the plotted residuals to address b).
The trend must be calculated for the period until t₁ and then applied to the whole period. After detrending the residuals, I want to be able to observe how the residuals diverge from the trend after t₁.

Thank you very much for your help.

Kind regards,

Lukas
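
One way to read the request, sketched on simulated data: fit the regression and a linear trend in its residuals using only the period up to t₁, extend both to the full sample, and plot the difference. The linear form of the trend, the cut-off of 100 and the variable names t, x, y, res, trend and res_detr are all assumptions made for illustration.

```stata
* Sketch under stated assumptions: simulated hourly index t, cut-off t1 = 100.
clear
set seed 12345
set obs 200
gen t = _n
gen x = rnormal()
gen y = 1 + 2*x + 0.01*t + rnormal()
local t1 = 100

* Step 1: fit Y on X before t1, then get residuals for the whole sample
regress y x if t <= `t1'
predict double res, residuals

* Step 2: estimate the residual trend before t1 and extend it to all periods
regress res t if t <= `t1'
predict double trend, xb

* Step 3: detrended residuals; departures after t1 show divergence from the trend
gen double res_detr = res - trend
twoway line res_detr t, xline(`t1')
```

Collapsing to daily or weekly means, as described in c), could be done on res_detr afterwards.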

↧

Identify the first nonmissing value within a varlist by groups

August 26, 2016, 8:04 am

Dear Statalist,

I'm having trouble with a data set containing the movements (deposits and withdrawals) of several bank accounts over a period of 2 years. A regular deposit is made every 2 months to each of the accounts (the amount of the deposit varies), and I want to know the percentage of this deposit that is withdrawn in the first withdrawal after the deposit is made. An example of the data looks like this:

account number	balance	date	withdrawal	deposit
67122127	2260	06jan2014		2260
67122127	2260.11	13jan2014
67122127	2267.01	24jan2014
67122127	2199.89	06-feb-14
67122127	2253.09	06-feb-14	13.92
67122127	68.56993	06-feb-14	2131.32
67122127	69.62993	11-feb-14
67122127	1919.63	05-mar-14		1850
67122127	1919.72	12-mar-14
67122127	88.39999	10apr2014	1831.32
67122127	89.40998	11apr2014
67122127	-25.97001	14apr2014
67122127	1824.03	02-may-14		1850
67122127	1824.13	12-may-14
67122127	786.8101	05-jun-14	1037.32
67122127	41.57003	05-jun-14	13.92
67122127	55.49003	05-jun-14	731.32
67122127	-28.52997	11-jun-14
67122127	-29.42997	11-jun-14
67122127	1821.47	04-jul-14		1850
67122127	1821.56	11-jul-14

Any ideas on how to get the percentage of the bimonthly deposit that is withdrawn in the first withdrawal of each two-month period?

Thanks a lot!!

Santiago.

↧

`xtivreg2, cueoptions() cueinit()` not working?

August 26, 2016, 8:17 am

Dear Statalist

I am confused about some options for the SSC packages `xtivreg2` and `ivreg2`. The helpfile for `xtivreg2` mentions the `cueinit()` and `cueoptions()` options in the syntax description. However, they are not described anywhere else in the helpfile. The helpfile for `ivreg2` mentions neither option (but mentions the `b0()` option). In the 2007 Baum/Schaffer/Stillman paper on ivreg2 in the Stata Journal, the `cueinit()` and `cueoptions()` options are mentioned in section 4.1, p 479. However, none of the options seem to work for me. For example:

Code:

use http://fmwww.bc.edu/ec-p/data/macro/abdata.dta, clear

// works fine
xtivreg2 ys k (n=l2.n l3.n), fe

// works fine
xtivreg2 ys k (n=l2.n l3.n), fe cue

// gives error "option cueinit() not allowed", r(198)
matrix b=e(b)
xtivreg2 ys k (n=l2.n l3.n), fe cue cueinit(b)

// gives error "option cueoptions() not allowed", r(198)
xtivreg2 ys k (n=l2.n l3.n), fe cue cueoptions(difficult)

// ivreg2
use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta, clear

// also r(198)
ivreg2 lw s expr tenure rns smsa i.year (iq=med kww age mrt), cue cueoptions(difficult)

Is this a mistake in the documentation, or is something else going on? When searching in the code for `ivreg2` and `xtivreg2`, there are no mentions of the options either, so it seems that they simply don't exist anymore.

I'm using Stata version 12.1. My version of xtivreg2 is 1.0.17 19Feb2015. My version of ivreg2 is 4.1.10 9Feb2016. I've retried removing and reinstalling both packages.

↧

Egen calculation, means/sd

August 26, 2016, 9:05 am

Hi everyone,

I'm trying to manually calculate a value for each household in a survey. Using three dummy variables in the dataset (X, Y, Z), I'd like to generate a new variable that takes on the following value for each household:

Code:

egen index = 0.02*((X-mean(X))/sd(X)) + 0.05*((Y-mean(Y))/sd(Y)) + 0.04*((Z-mean(Z))/sd(Z))

where 0.02, 0.05, and 0.04 are constants I've defined; X, Y, and Z are either 0 or 1, depending on the household; and the means and standard deviation should be the weighted ones obtained using sum with analytic weights:

Code:

summarize X Y Z [aw=wgt]

However, the egen code returns "unknown function" errors.

Thank you.

EDIT:

As an aside, the reason that I’m doing this is because I used factor analysis on an older dataset to calculate the first principal components (which are the 0.02, 0.05, and 0.04).

I'm now trying to calculate index scores (the equation above) for households in a more recent survey, but using the principal components I obtained for the older survey. In the older survey I simply used the "predict" command after running the PCA to calculate the scores.

Please let me know if there is an easier solution to this problem. I'm trying to avoid having to merge the two files, which would be a bit tedious given how differently the datasets are organized.
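
A sketch of one way around the "unknown function" error: `egen` functions cannot be combined inside a larger expression, but the weighted mean and standard deviation can be taken from `summarize` and used in `generate`/`replace`. The simulated data are only there to make the example runnable; X, Y, Z, wgt and the three constants come from the post.

```stata
* Sketch under stated assumptions: simulated dummies and weights.
clear
set seed 2016
set obs 50
gen byte X = runiform() < 0.3
gen byte Y = runiform() < 0.5
gen byte Z = runiform() < 0.7
gen wgt = 1 + runiform()

local vars X Y Z
local wts  0.02 0.05 0.04
gen double index = 0
forvalues i = 1/3 {
    local v : word `i' of `vars'
    local w : word `i' of `wts'
    quietly summarize `v' [aw=wgt]      // weighted mean and sd
    local m = r(mean)
    local s = r(sd)
    replace index = index + `w' * (`v' - `m') / `s'
}
summarize index
```

The loop standardizes each dummy with the current survey's weighted moments; if the scores should instead use the older survey's means and standard deviations, those values would have to be typed in place of `m' and `s'.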

↧

Creating normally distributed random variable given mean and standard deviation

August 26, 2016, 9:07 am

Hi all
I would like to create a random variable (X) with 100 observations. The variable should follow the normal distribution with mean 15 and standard deviation 5. Can anybody suggest a code for this? Thanks.
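
A minimal sketch using Stata's `rnormal(m, s)` function; the seed value is arbitrary and only makes the draw reproducible.

```stata
clear
set seed 2016
set obs 100
gen double X = rnormal(15, 5)   // mean 15, standard deviation 5
summarize X
```

The sample mean and standard deviation reported by `summarize` will be close to, but not exactly, 15 and 5.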

↧

Divide a matrix

August 26, 2016, 9:12 am

Hello,

How can I divide a matrix by another matrix (not a division by a scalar)?
By the way, I tried: mat xxx = zz * 1/yy, without result.

Thank you for your answer
Marc
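
"Dividing" by a matrix can mean two different things, so this sketch shows both: multiplication by the inverse (for a square, invertible yy) and element-by-element division through Mata. The 2x2 matrices and the result names A and B are made up for illustration.

```stata
matrix zz = (2, 4 \ 6, 8)
matrix yy = (1, 2 \ 3, 4)

* Matrix "division" in the linear-algebra sense: zz times the inverse of yy
matrix A = zz * inv(yy)
matrix list A

* Element-by-element division, using Mata's colon operator
mata: st_matrix("B", st_matrix("zz") :/ st_matrix("yy"))
matrix list B
```

The attempt `zz * 1/yy` fails because Stata's matrix `/` operator only divides a matrix by a scalar.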

↧

how to create a squared variable in Stata?

August 26, 2016, 10:42 am

Hello folks,
I have an age variable and I want to create its square. I would be very grateful if someone could explain how to do that. Thank you so much.
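
A minimal sketch using the shipped nlsw88 data, which has a variable called age; the new name age2 is arbitrary.

```stata
sysuse nlsw88, clear
gen age2 = age^2        // square of age
summarize age age2
```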

↧

How to run a quadratic OLS regression?

August 26, 2016, 10:58 am

I have been asked to run a quadratic OLS regression. Is there such a regression in Stata? I would be very grateful if someone could help me out.
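
A hedged sketch of what is usually meant: ordinary `regress` with a linear and a squared term of the regressor. Factor-variable notation adds the squared term without creating a new variable. The nlsw88 data and the choice of wage and age are only for illustration.

```stata
sysuse nlsw88, clear

* c.age##c.age expands to age and age squared
regress wage c.age##c.age
```

Using the factor-variable form rather than a hand-made squared variable lets postestimation commands such as `margins` treat both terms as functions of the same variable.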

↧