Channel: Statalist

Fix autocorrelation in 2sls

I'm running a 2SLS model that includes a demand equation and a supply equation. To correct for autocorrelation, I would like to include an AR(1) term in the demand equation.
Could someone help me with how to do it?

I know the command to run 2SLS; I don't know how to include the AR process.

Regards

Identifying each subject's first visit

In the simulated data below, I wish to tag each subject's first visit (by date).
The last six lines of code show two methods:
- method 1: doesn't properly identify the first visit date for each subject, and
- method 2: flags all visits as the first
I'd like to understand what I have done incorrectly.

*simulate data
clear
input id str9 datetext
1 01Feb2014
2 01Jan2014
3 01Mar2014
2 01Jun2014
2 01May2014
2 01Apr2014
3 01Jul2014
3 01Sep2014
3 01Aug2014
4 01Oct2014
4 01Nov2014
4 01Dec2014
end
gen day = regexs(0) if regexm(datetext, "^[0-9]+") // ^ anchors at the start: extracts the leading digits (the day)
gen month = regexs(0) if regexm(datetext, "[a-zA-Z]+") // extracts the letters (the month abbreviation)
gen year = regexs(0) if regexm(datetext, "[0-9]*$") // $ anchors at the end: extracts the trailing digits (the year)
destring day year, replace
gen mo=1 if month=="Jan"
replace mo=2 if month=="Feb"
replace mo=3 if month=="Mar"
replace mo=4 if month=="Apr"
replace mo=5 if month=="May"
replace mo=6 if month=="Jun"
replace mo=7 if month=="Jul"
replace mo=8 if month=="Aug"
replace mo=9 if month=="Sep"
replace mo=10 if month=="Oct"
replace mo=11 if month=="Nov"
replace mo=12 if month=="Dec"
gen edate=mdy(mo,day,year) // all vars must be numeric
format edate %d
drop datetext day month year mo
l, noo sepby(id)

*identify first occurrence (by date) of each id, method 1
sort edate
bys id: gen byte firstid=_n==1
l id edate firstid, noo sepby(id)
*identify first occurrence (by date) of each id, method 2
bys id edate: gen byte Firstid=_n==1
l id edate Firstid, noo sepby(id)
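
A possible explanation, offered as a sketch that assumes the data are as simulated above. In method 1, `bys id:` re-sorts the data by id, and Stata's sort is not guaranteed to preserve the earlier edate order within id. In method 2, every id-edate pair forms its own by-group, so each observation is the first in its group. Putting edate in parentheses sorts on it within id without making it part of the group definition. The variable name first_visit is introduced here for illustration only.

```stata
* sort by id, then by edate within id; by-groups are defined by id alone
bysort id (edate): gen byte first_visit = _n == 1
list id edate first_visit, noobs sepby(id)
```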

Tobit with multiple right censored data

I want to know whether it is possible to estimate a tobit regression with multiple right-censored data and only one left-censored value (zero only). I was reviewing mvtobit, but I'm not sure whether it could be the answer. Please let me know if you have any ideas about it. I also read about survival analysis with censored intervals.
Thanks to everyone, and sorry for my overly general question.

Creating Propensity Scores after imputation

I have a dataset containing 4,601 records (3,242 controls and 1,359 treatments) with 15 covariates. Approximately half of the covariates have missing data, ranging from 8% to 25% missing. After running several models, the variables were eventually imputed using chained equations (M=30). My data are stored in wide format. Now that my covariates have been imputed, I want to create propensity scores to match control and treatment subjects and estimate treatment effects. The literature I have reviewed suggests creating propensity scores and matching within each completed dataset, resulting in m estimates of treatment effects, and then averaging these effects (the within approach). The alternative (the between approach) involves averaging each subject's propensity score across the m datasets and matching treated and controls from a single dataset. My question is: how do I create propensity scores and then run the matching strategy within the framework of mi set data? My intuition tells me to create 30 separate datasets by re-running my chained equation imputation in flongsep format, and then running the propensity scores and matching in each dataset. I would then compute an average outside of Stata. Is there a better way? Thanks in advance for your help!

Mice

I am trying to use multiple imputation by chained equations for my missing-at-random data.
I have a dataset of 15,500 children with hearing and sight measurements and questionnaires taken at different time points, and from these I have derived variables, for example 'any sight problem'.
I understand the MICE process in Stata, but my question is a basic starting one: do I need to impute the missing data on the original sight measurements, or on the derived 'any sight problem' variable?

melogit versus country dummies

Hi all,

I am working on a project using a 20-country survey of some 20,000 respondents. My dependent variable is a binary measure. I have between 10 and 12 independent variables, depending on the estimation.

As a first take at the data, I ran a logit regression, with 19 country dummies (one omitted) to control for country-level effects. I got some interesting results.

I then tried a mixed effect (melogit) regression, with the individuals identified by country.

The results are basically the same, with only one variable that was initially significant at the 90 percent level becoming insignificant in the second estimation. What I am wondering is whether, given the absence of country-level variables and of over-time variation in the individuals in the sample, melogit is worthwhile, or whether I am basically just capturing the same thing with the country dummies that I tried initially.

This is my first foray into mixed effect models of this sort, so I would really appreciate everyone's thoughts.

Sincerely,

Eric



Where can I download the 'tkcomp group' function?

Hi all,

I am trying to do some simple post-hoc comparisons and have decided that Tukey-Kramer is most appropriate for my sample size. Running 'findit tkcomp group' produced:

Web resources from Stata and other users

(contacting http://www.stata.com)
http://www.stata.com/websee.cgi?r=2&...k=tkcomp+group could not be opene
> d for read by copytextfile

r(601);


I have also tried 'findit tkcomp' and copying and pasting the web address directly into Internet Explorer, to no avail. Does anyone know where I can download this function?

Many thanks!!
Hannah

Comparing coefficients after -qregpd-

Hello dear forum members,

I am running a quantile regression analysis with panel data considerations using -qregpd- command:

Code:
qregpd y x1 x2 x3 x4 x5, id(id_name) fix(year) optimize(mcmc) noisy draws(1000) burn(100) arate(.5) quantile(0.99) instruments (x1 x2 x3 z1 z2 z3 z4 z5)
qregpd y x1 x2 x3 x4 x5, id(id_name) fix(year) optimize(mcmc) noisy draws(1000) burn(100) arate(.5) quantile(0.95) instruments (x1 x2 x3 z1 z2 z3 z4 z5)
qregpd y x1 x2 x3 x4 x5, id(id_name) fix(year) optimize(mcmc) noisy draws(1000) burn(100) arate(.5) quantile(0.75) instruments (x1 x2 x3 z1 z2 z3 z4 z5)
qregpd y x1 x2 x3 x4 x5, id(id_name) fix(year) optimize(mcmc) noisy draws(1000) burn(100) arate(.5) quantile(0.50) instruments (x1 x2 x3 z1 z2 z3 z4 z5)
qregpd y x1 x2 x3 x4 x5, id(id_name) fix(year) optimize(mcmc) noisy draws(1000) burn(100) arate(.5) quantile(0.25) instruments (x1 x2 x3 z1 z2 z3 z4 z5)
What would be the proper way to test whether the regression coefficients on x1 through x5 are significantly different across quantiles? I do observe differences in the unstandardized coefficients, but I suspect that comparing them by eye might be misleading.

Thank you in advance for help.

Can the contents of a Stata dataset (or a -list- from that dataset) be saved to a do-file?

Can the contents of a Stata dataset (or a -list- of that dataset) be saved to a do-file?

Converting a string variable to a time

Hi all,
Sorry, I'm still quite new to Stata so bear with me!
I have a variable with a large series of times in the format HH:MM:SS,milliseconds, for example 08:56:01,070. I'm trying to convert this variable to a format that Stata recognises for use in a time series.
I've tried variations on:
generate double time2 = clock(time, "hms") but I just get missing values (dots) in the new variable.
I would appreciate any help you could give me on this!
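
One possible approach, offered as a sketch: the "hms" mask probably fails because of the comma before the milliseconds. Assuming every value has exactly the layout HH:MM:SS,mmm, the clock part and the milliseconds can be parsed separately and added together, since Stata clock (%tc) values are counted in milliseconds. The names time and time2 follow the post; the fixed character positions are an assumption about the data.

```stata
* characters 1-8 hold HH:MM:SS; characters 10-12 hold the milliseconds
generate double time2 = clock(substr(time, 1, 8), "hms") + real(substr(time, 10, 3))

* display hours, minutes, seconds, and milliseconds
format time2 %tcHH:MM:SS.sss
```

Because no date is supplied, the resulting values fall on 01jan1960; storing them as double avoids losing the millisecond precision.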

Controlling the smoothing function in Generalized Additive Model (GAM) and add a linear term into the model

Hi. I am studying the Generalized Additive Model (GAM) and trying to practice with Stata. I can run the GAM with my data, but I do not know how to switch the smoothing function from cubic smoothing splines (the default) to either a kernel smoother or LOWESS. If possible, I would also like to know how to change the window size for LOWESS.

My second question is: how can I add another linear term (one that does not use a smoother) to the GAM in Stata?

I will appreciate your help so much. Thank you!

cannot use the stcurve command after stcox with tvc option

I am estimating a Cox model and handling non-proportionality by allowing the model to interact time and treatment, e.g. stcox treatment x1 x2, tvc(treatment)

The problem is that I cannot use the postestimation command "stcurve, hazard" afterwards, as it does not work in combination with the tvc option.

Is there a way I can plot the hazards so that they take into account the interaction between time and treatment? I am using Stata 13.

Thanks

Plot divergence from projected trend in time series data

Hello everyone,

I have hourly time-series data of dependent variable Y and independent variable X.

a) I expect the relationship between X and Y to change gradually over time
b) I am interested in the evolution of the residuals of Y and the predicted values of Y, specifically after time t1 in the relationship
c) While the data is hourly, I collapse the residuals after running the regression by day/ week (to decrease some noise as I am not interested in hourly residuals per se but rather divergence from a trend)

I regress Y on X (without year fixed effects or a trend) for the period before t1 and calculate the residuals for the whole period to analyse their evolution. One sees that the residuals do not satisfy the Gauss-Markov assumptions but follow a trend over time (as expected; see a)). While I can visually spot divergence of the residuals from the clearly visible trend, I would prefer to Thus, I cannot observe b).

My question to you is how to manipulate the plotted residuals to address b).
The trend must be calculated for the period until t1 and then applied to the whole period. After detrending the residuals, I want to be able to observe how the residuals diverge from the trend after t1.
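
The detrending described above might be sketched as follows. This is only an illustration under stated assumptions: the trend is taken to be linear, hour is a hypothetical numeric time variable, and a scalar named t1 is assumed to hold the cutoff already. Y and X are the names used in the post; res, trend, and res_detrended are made up here.

```stata
* fit the model on the pre-t1 period only
regress Y X if hour < scalar(t1)
predict double res, residuals          // residuals for the whole sample

* estimate a linear trend in the residuals, again on the pre-t1 period only
regress res hour if hour < scalar(t1)
predict double trend, xb               // trend projected over the whole sample

* residuals net of the projected pre-t1 trend
generate double res_detrended = res - trend
twoway line res_detrended hour, xline(`=scalar(t1)')
```

Since predict is run without an if condition, both the residuals and the fitted trend are computed out of sample as well, so divergence after t1 shows up as res_detrended drifting away from zero.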

Thank you very much for your help.

Kind regards,

Lukas

Identify the first nonmissing value within a varlist by groups

Dear Statalist,

I'm having trouble with a data set in which I have the movements (deposits and withdrawals) of several bank accounts over a period of 2 years. A regular deposit is made every 2 months to each of the accounts (the amount of the deposit varies), and I want to know the percentage of this deposit that is withdrawn in the first withdrawal after the deposit is made. An example of the data looks like this:
account number balance date withdrawal deposit
67122127 2260 06jan2014 2260
67122127 2260.11 13jan2014
67122127 2267.01 24jan2014
67122127 2199.89 06-feb-14
67122127 2253.09 06-feb-14 13.92
67122127 68.56993 06-feb-14 2131.32
67122127 69.62993 11-feb-14
67122127 1919.63 05-mar-14 1850
67122127 1919.72 12-mar-14
67122127 88.39999 10apr2014 1831.32
67122127 89.40998 11apr2014
67122127 -25.97001 14apr2014
67122127 1824.03 02-may-14 1850
67122127 1824.13 12-may-14
67122127 786.8101 05-jun-14 1037.32
67122127 41.57003 05-jun-14 13.92
67122127 55.49003 05-jun-14 731.32
67122127 -28.52997 11-jun-14
67122127 -29.42997 11-jun-14
67122127 1821.47 04-jul-14 1850
67122127 1821.56 11-jul-14
Any ideas on how to get the percentage of the bimonthly deposit that is withdrawn in the first withdrawal of each two-month period?

Thanks a lot!!

Santiago.

`xtivreg2, cueoptions() cueinit()` not working?

Dear Statalist

I am confused about some options for the SSC packages `xtivreg2` and `ivreg2`. The help file for `xtivreg2` mentions the `cueinit()` and `cueoptions()` options in the syntax description. However, they are not described anywhere else in the help file. The help file for `ivreg2` mentions neither option (but mentions the `b0()` option). In the 2007 Baum/Schaffer/Stillman paper on ivreg2 in the Stata Journal, the `cueinit()` and `cueoptions()` options are mentioned in section 4.1, p 479. However, neither option seems to work for me. For example:

Code:
use http://fmwww.bc.edu/ec-p/data/macro/abdata.dta, clear

// works fine
xtivreg2 ys k (n=l2.n l3.n), fe

// works fine
xtivreg2 ys k (n=l2.n l3.n), fe cue

// gives error "option cueinit() not allowed", r(198)
matrix b=e(b)
xtivreg2 ys k (n=l2.n l3.n), fe cue cueinit(b)

// gives error "option cueoptions() not allowed", r(198)
xtivreg2 ys k (n=l2.n l3.n), fe cue cueoptions(difficult)

// ivreg2
use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta, clear

// also r(198)
ivreg2 lw s expr tenure rns smsa i.year (iq=med kww age mrt), cue cueoptions(difficult)
Is this a mistake in the documentation, or is something else going on? When searching in the code for `ivreg2` and `xtivreg2`, there are no mentions of the options either, so it seems that they simply don't exist anymore.

I'm using Stata version 12.1. My version of xtivreg2 is 1.0.17 19Feb2015. My version of ivreg2 is 4.1.10 9Feb2016. I've retried removing and reinstalling both packages.

Egen calculation, means/sd

Hi everyone,

I'm trying to manually calculate a value for each household in a survey. Using three dummy variables in the dataset (X, Y, Z), I'd like to generate a new variable that takes on the following value for each household:

Code:
egen index = 0.02*((X-mean(X))/sd(X)) + 0.05*((Y-mean(Y))/sd(Y)) + 0.04*((Z-mean(Z))/sd(Z))
where 0.02, 0.05, and 0.04 are constants I've defined; X, Y, and Z are either 0 or 1, depending on the household; and the means and standard deviation should be the weighted ones obtained using sum with analytic weights:

Code:
summarize X Y Z [aw=wgt]
However, the egen code returns "unknown function" errors.
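
A likely cause, offered as a sketch: egen expects a single egen function on the right-hand side rather than an arbitrary expression, so the leading `0.02*(` is read as a function name. One way around this, assuming the weighted mean and standard deviation from summarize with analytic weights are the ones wanted, is to standardize each dummy with the stored results r(mean) and r(sd) and then combine them with generate. The names zX, zY, and zZ are made up for this illustration.

```stata
* standardize each dummy using its weighted mean and sd
quietly summarize X [aw=wgt]
generate double zX = (X - r(mean)) / r(sd)

quietly summarize Y [aw=wgt]
generate double zY = (Y - r(mean)) / r(sd)

quietly summarize Z [aw=wgt]
generate double zZ = (Z - r(mean)) / r(sd)

* weighted sum using the constants from the post
generate double index = 0.02*zX + 0.05*zY + 0.04*zZ
```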

Thank you.


EDIT:

As an aside, the reason that I’m doing this is because I used factor analysis on an older dataset to calculate the first principal components (which are the 0.02, 0.05, and 0.04).

I'm now trying to calculate index scores (using the equation above) for households in a more recent survey, but with the principal components I obtained from the older survey. In the older survey I simply used the "predict" command after running the PCA to calculate the scores.

Please let me know if there is an easier solution to this problem. I'm trying to avoid having to merge the two files, which would be a bit tedious given how differently the datasets are organized.

Creating normally distributed random variable given mean and standard deviation

Hi all
I would like to create a random variable (X) with 100 observations. The variable should follow the normal distribution with mean 15 and standard deviation 5. Can anybody suggest code for this? Thanks.
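
A minimal sketch of one way to do this, assuming it is acceptable to clear the data in memory. The seed value is arbitrary and is set only so that the draws are reproducible.

```stata
clear
set obs 100
set seed 2016                        // arbitrary seed, for reproducibility only
generate double X = rnormal(15, 5)   // mean 15, standard deviation 5
summarize X
```

With only 100 draws, the sample mean and standard deviation reported by summarize will be close to, but not exactly, 15 and 5.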

Divide a matrix

Hello,

How can I divide a matrix by another matrix (not a division by a scalar)?
By the way, I tried: mat xxx = zz * 1/yy, without result...
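
Two possible readings of "dividing" one matrix by another, sketched with the matrix names zz and yy from the post. Which one applies is an assumption: the first multiplies by the inverse and needs yy to be square and nonsingular, with as many rows as zz has columns; the second divides element by element and needs both matrices to have the same dimensions. The name xxx_elem is made up here.

```stata
* "division" as multiplication by the inverse of yy
matrix xxx = zz * inv(yy)

* element-by-element division, done in Mata and returned as a Stata matrix
mata: st_matrix("xxx_elem", st_matrix("zz") :/ st_matrix("yy"))
```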

Thank you for your answer
Marc

how to create a squared variable in Stata?

Hello folks,
I have an age variable and I want to create its square. I would be very grateful if someone could explain how I can do that. Thank you so much.
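
A minimal sketch, assuming the variable is numeric and named age; the name age2 is made up here.

```stata
* square of age; missing values of age stay missing
generate age2 = age^2
```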

How to run a quadratic OLS regression?

I have been asked to run a quadratic OLS regression. Is there such a regression in Stata? I would be very grateful if someone could help me out.
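
A quadratic OLS regression is usually just an ordinary regression that includes a squared term. A sketch, with y and x as hypothetical variable names and assuming a Stata version that supports factor-variable notation:

```stata
* regress y on x and x squared using factor-variable notation
regress y c.x##c.x

* equivalent fit with an explicitly generated squared term (x2 is a made-up name)
generate x2 = x^2
regress y x x2
```

The factor-variable form has the advantage that postestimation commands such as margins know that the two terms involve the same variable.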