Is it possible to change Stata's default factor-variable coding from {1 0} to {1 -1}?

November 17, 2016, 11:51 pm

Hi all,

I would like to know whether it is possible to change Stata's default factor-variable coding from {1 0} to {1 -1}. The reason is that I want to fit a factor-effects ANOVA model with the regress command. For example, for a two-factor study with factor A at 3 levels and factor B at 2 levels, I would usually generate the indicator variables manually:
X1 = {1 if case is level 1 for factor A, -1 if case is level 3, 0 otherwise}
X2 = {1 if case is level 2 for factor A, -1 if case is level 3, 0 otherwise}
X3 = {1 if case is level 1 for factor B, -1 if case is level 2}

The model is Y = μ.. + α1 X1 + α2 X2 + β1 X3 + (αβ)11 X1X3 + (αβ)21 X2X3 + ε

The problem with coding the indicator variables manually is that I cannot make full use of the margins command. Therefore I would like to know whether Stata's default factor-variable coding can be changed.

Thank you in advance!
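
For readers of this digest, a minimal sketch of two routes, assuming the outcome and factors are stored in variables named Y, A (levels 1 to 3) and B (levels 1 and 2); these names are hypothetical. The first route keeps factor-variable notation and asks contrast (Stata 12 or later) for deviations from the grand mean after estimation, which may be enough for a factor-effects interpretation. The second reproduces the manual {1, 0, -1} indicators described in the post.

```stata
* Route 1: fit with the default {1 0} coding, then request
* deviation-from-grand-mean contrasts (the g. operator)
regress Y i.A##i.B
contrast g.A g.B g.A#g.B

* Route 2: build the effect-coded indicators by hand
* (the if qualifier leaves the indicator missing when the factor is missing)
gen X1 = (A == 1) - (A == 3) if !missing(A)
gen X2 = (A == 2) - (A == 3) if !missing(A)
gen X3 = (B == 1) - (B == 2) if !missing(B)
regress Y X1 X2 X3 c.X1#c.X3 c.X2#c.X3
```

Route 1 leaves margins fully usable because the model is still written with i.A and i.B. Whether the g. contrasts match the hand-coded coefficients in an unbalanced design should be checked against the data.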

↧

Maptile geography problem

November 18, 2016, 12:14 am

Hi,

I recently installed Maptile, but I can't seem to get any of the templates to load. I tried "maptile_install <file>", which installed the template, and I ran the accompanying do-file, which produced the test maps as expected. But every time I run maptile_geolist, I get that no geographies are found, and when I try to make a map, it keeps saying that the geography I want to run is specified but not installed. I've tried moving the various map files to the folder where I'm working, or moving my data to the folder where the geography was unzipped, but I keep getting the same problem, and I'm unsure what to try next. Thanks for any help!

John

↧

When is editing disabled on Statalist?

November 18, 2016, 1:14 am

I still don't understand the editing rules on this forum.
At first I thought that a post can no longer be edited once a new message has been posted in the thread, but I have since noticed that this is not a general rule.
Sometimes I can still edit a post that is not the last one in the thread; sometimes I cannot even edit the last one...

Can anyone explain the actual editing rules?

Best,
Charlie

↧

Interactive term in non-linear model (Code problem and interpretation)

November 18, 2016, 1:25 am

Hi, Experts:

I have questions about interaction terms in nonlinear models. After reading Norton (2004) and Greene (2010), I realized that the interpretation of an interaction term is different in a nonlinear model. The Stata command "inteff" gives a correct way to see the effect of the interaction term, along with a graphical illustration. I am doing research on the impact of SES on heart treatment choices. Since my outcome variable is treatment choice (binary), I used a logit model. I also want to look at the interaction between two variables (whether the patient talked to a health provider, and gender). The following is my code for "inteff" after running the logit model. I am using Stata 12.

Code:

inteff treat i.healthprovider i.gender inter1 age i.edu i.religion,savedata(C:\Users\GMSGNI\Desktop\acp,replace) savegraph1(C:\Users\GMSGNI\Desktop\figure1,replace) savegraph2(C:\Users\GMSGNI\Desktop\figure2,replace)

I have three questions:

1. The code does not run; it shows "factor variable and time-series operators not allowed". Does that mean "inteff" does not allow categorical variables? Is there a recommended way to solve this issue? Is there another command you would suggest?

2. I first ran the logit model without the interaction term and then ran it with the interaction term. I noticed that my main effect (whether the patient talked to a health provider) changed from significant to insignificant after adding the interaction term. Some people say that the "main effect" in a model with an interaction term is no longer a main effect but a conditional effect. Is that right?

3. In this research, I also ran other logit models with different binary outcomes. I assume the errors of those models are correlated with each other, so I plan to use generalized SEM. If I use this model, can I still use "inteff" to look at the interaction term? In other words, is "inteff" applicable after a GSEM model? If not, do you have any suggestions?

Thank you in advance!

Connie
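
One possible alternative to "inteff", sketched here under the assumption that healthprovider and gender are both coded 0/1, is to keep the factor-variable notation in logit and let margins report the effect of healthprovider on the predicted probability separately by gender. The variable names come from the post; the coding is an assumption, and the inter1 variable is replaced by the ## operator.

```stata
* Sketch only: assumes healthprovider and gender are 0/1 indicators
logit treat i.healthprovider##i.gender age i.edu i.religion

* Average change in Pr(treat) when healthprovider goes from 0 to 1,
* computed separately for each gender
margins gender, dydx(healthprovider)

* Difference between those two effects, i.e. the interaction
* expressed on the probability scale
margins r.gender, dydx(healthprovider)
```

This gives an average interaction effect rather than the observation-level effects that "inteff" plots, so it is a complement and not a drop-in replacement.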

↧

Propensity score matching - output variable pre-treatment

November 18, 2016, 1:32 am

Hi all,

I am supposed to use propensity score matching to compare a treatment group and a control group. However, the outcome variable in question is not measured after the treatment; it is already in place pre-treatment. I am not interested in whether the outcome variable changes between pre- and post-treatment, but in whether the treatment and control groups differ with respect to the pre-treatment variable. Is PSM a valid approach in that case?

Thanks in advance,
Felix

↧

Error when trying to import all txt files from a catalog

November 18, 2016, 1:35 am

Dear all,

I am trying to import all txt files from the folder and save them as Stata files using Stata 14. I am using the following code:

local files : dir "C:\......." files "*.txt"
cd "C:\......."
foreach file in `files' {
import delimited `file', clear
save `file'
}

As a result, I get an error message:
using required
r(100);

How can I fix the code?
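
A hedged sketch of one possible fix follows. It assumes the error arises because a file name contains a space, so that import delimited reads the first word as a variable list and then expects "using"; that diagnosis is a guess, not something the post confirms. The folder path is elided in the post and is left as a placeholder.

```stata
* Sketch only: the folder path below is the elided placeholder from the post
cd "C:\......."
local files : dir "." files "*.txt"

foreach file of local files {
    * compound double quotes keep a name with spaces together as one token
    import delimited using `"`file'"', clear

    * drop the .txt extension so each dataset is saved as <name>.dta
    local stub = subinstr(`"`file'"', ".txt", "", .)
    save `"`stub'.dta"', replace
}
```

Saving under a .dta name also avoids writing the Stata dataset over the original text file, which save `file' would attempt.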

↧

How to perform the Rivers and Vuong (2002) test?

November 18, 2016, 1:52 am

Dear all,
Does anyone know how to perform the Rivers and Vuong (2002) generalized test for comparing non-nested models? I use nonlinear GMM, and there are three equations in the model. If anyone has carried out a similar test, any help would be appreciated.
Here's the original article for the test: http://onlinelibrary.wiley.com/doi/1...1-1-00071/full
Thanks in advance
Best
Antonis Rezitis

↧

Storing the list of database table names after "odbc query <my dsn>, dialog(complete)"

November 18, 2016, 2:06 am

I would like to store the names of the tables in an Access database that I am accessing from Stata through ODBC. I want a workaround that does not require granting Admin permissions on mysystables in Access. Since Stata lists the table names, I am thinking there may be a way to store that list. Any pointers?

↧

Using adjustrcspline after mixed

November 18, 2016, 2:48 am

Hi!
I want to graph the results after having run a linear mixed model for repeated measures that includes cubic splines for age. I have done:

Code:

mkspline2 agespline = age, cubic knots(0 18 36 60 84 96)
mixed bmi agespline* cov1 cov2 ....... || ID: age, mle cov(un) residuals(exp, t(time))
adjustrcspline, link(identity)

However, I get the following message:
variable _cons not found

What should I do to overcome this "problem"?

Kjell Weyde

↧

New version of cprdlinkutil on SSC

November 18, 2016, 3:46 am

Thanks to Kit Baum, a new version of the cprdlinkutil package (described as below on my website) is now available for download from SSC. In Stata, use the ssc command to do this, or adoupdate if you already have an old version of cprdlinkutil. The new version fixes a few confusing typos in the online help.

Best wishes

Roger

---------------------------------------------------------------------------

TITLE
cprdlinkutil: Inputting CPRD linkage-source datasets into Stata

DESCRIPTION/AUTHOR(S)
The cprdlinkutil package is designed for use with the cprdutil
package, which creates Stata datasets from text data files
produced by the Clinical Practice Research Datalink (CPRD).
cprdlinkutil is a suite of utility programs for inputting
linkage-source text data files produced by CPRD for linkage to
one or more non-CPRD sources of data on the same patients, and
creating equivalent Stata datasets in the memory. Possible
linkage-data sources include the Hospital Episodes System (HES)
for data on hospitalisations, and the Office of National
Statistics (ONS) for data on deaths. CPRD can carry out data
retrievals to provide linkage datasets, with data about
patients known to CPRD, and with one observation per event of
the type recorded by the linkage-data source. Datasets
produced by cprdlinkutil contain information on patients and
practices known to CPRD, and on the times at which patients in
these practices can be said to be at risk of experiencing
recorded events of the types recorded by each linkage-data
source. cprdlinkutil uses the SSC packages keyby and lablist,
which need to be installed for cprdlinkutil to work.

Author: Roger Newson
Distribution-Date: 14november2016
Stata-Version: 13

INSTALLATION FILES
cprdlink_linkage_eligibility.ado
cprdlink_linkage_coverage.ado
cprdlink_linked_practices.ado
cprdlink_linkage_eligibility.sthlp
cprdlink_linkage_coverage.sthlp
cprdlink_linked_practices.sthlp
cprdlinkutil.sthlp
---------------------------------------------------------------------------

↧

Boxplots

November 18, 2016, 4:08 am

Hi, I am relatively new to Stata.

I was wondering if anyone had any advice on how to get box plots next to each other.

I have 4 repeated-measures scores over time.

I would like the box plots of score 1 for smokers and non-smokers next to each other, then the box plots of score 2 for smokers and non-smokers next to each other, and so on, all on the same plot.

I have tried -graph box tscore1 tscore2 tscore3 tscore4, over(msmoke)- but this displays scores 1, 2, 3 and 4 for non-smokers next to each other, then a new graph of the same for smokers.

thanks
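
A minimal sketch of one way to get that layout, assuming the data have one row per person with variables tscore1 to tscore4 and msmoke as in the post. The id and occasion variables are hypothetical names introduced here. The idea is to reshape to long form so that measurement occasion becomes a grouping variable.

```stata
* Sketch only: id and occasion are hypothetical names
preserve
gen long id = _n
reshape long tscore, i(id) j(occasion)

* the first over() is the inner grouping, so the smoker and non-smoker
* boxes sit side by side within each occasion
graph box tscore, over(msmoke) over(occasion)
restore
```

Wrapping the steps in preserve and restore brings the wide data back once the graph has been drawn.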

↧

xtpmg (panel error correction model) error messages

November 18, 2016, 1:18 pm

I am trying to run xtpmg

xtpmg d_ln_imports d_ln_REER d_ln_exports d_ln_domesticdemand gfc gfc_ln_dd t, ec(ec) lr(l.ln_TM_R l.ln_REER l.ln_dd l.ln_TX_R) replace mg

I am trying to see how changes in the real effective exchange rate, exports, and domestic demand affect imports in the short run and the long run.
I have panel data.

I get an error message saying
expression (-_b[L.ln_REER]/_b[L.ln_TM_R]) evaluates to missing

When I try to run "pmg", I get the following error message:
initial values not feasible

Why is this happening? The regression using the DFE estimator does run, though.

Thanks

↧

Pedroni test for cointegration in panel (xtpedroni) with variables with quadratic time trend

November 18, 2016, 1:31 pm

Dear all,
I ran the -xtpedroni- test for two I(1) variables. The panel is balanced and heterogeneous, with N=27 and T=24. My first I(1) variable has a linear time trend and the other has a quadratic time trend. The -trend- option of -xtpedroni- accounts only for a linear time trend. Would detrending both variables be a good solution? Currently, when I run the -xtpedroni- test with the variables in levels and with the -trend- option, the results are inconclusive.
Thanks for your time,
Anat

↧

enter dummy vs. continuous measure after entering ---is this similar to piecewise regression?

November 18, 2016, 1:54 pm

Dear all,

I coded a variable X1 that measures the proportion of new entrants in a group, taking continuous values from 0 to 1. However, since I want to capture the effect of this variable GIVEN that there are any new entrants, I coded a dummy variable X2 that equals one if there are NO new entrants in a group. Therefore, X1==0 if X2==1, and X2==0 if X1>0. The correlation between these two variables is -0.88. When I included X2 without X1 in the regression, X2 had a significant negative effect; when I included X1 in the model as well, X2 became insignificant. This is not surprising given the high correlation. However, my questions are:

1. X2 is a dummy variable coded from X1. Usually X1 and X2 are just two different ways (dummy vs. continuous) of measuring a similar thing. Is it OK to include both variables in the same model? If so, can you give me a citation?
2. Is including X2 the correct way to ensure that we capture the effect of continuous X1 "GIVEN that there are any new entrants"? I wonder whether this is similar to piecewise regression, although not exactly the same.
3. I want to test a hypothesis about X2 as well. However, due to the flip of the sign and its high correlation with X1, I don't know how to interpret the effect of X2 in a model that includes both X1 and X2.

Thanks a lot!

↧

scatter plot for panel time series correlation between variables

November 18, 2016, 2:07 pm

I have panel time-series data.

Some of the variables include exports, reer, etc.

Let's say I want to see a scatter plot of the correlation between exports and reer by group.

Could you help me with the code, please?

Thanks
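
A minimal sketch, assuming the panel (group) identifier is stored in a variable called country; that name is hypothetical, while exports and reer come from the post.

```stata
* Sketch only: country is a hypothetical name for the group identifier
* one scatter panel per group
twoway scatter exports reer, by(country)

* the corresponding correlation within each group
bysort country: correlate exports reer
```

With many groups the by() panels become small, in which case restricting the plot to a few groups with an if qualifier may be easier to read.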

↧

Conformability Problem with synth

November 18, 2016, 2:20 pm

I am using the synth package (Hainmueller et al. 2011) and getting a 'conformability error' when the code reaches the line where it applies the row labels (which I believe are the unit IDs of the non-treated units) to the X matrices, immediately before the program tries to optimize. I thought this might be related to missing observations, but the problem is not resolved when I fill in any missing values with 0.

I'm happy to provide code, if it would help. However, I don't think it's a problem with my specification of the synth command, as it has worked on other data. Any ideas? Thanks very much.

↧

Referencing a value's label

November 18, 2016, 2:38 pm

I have a race variable that's coded 1-6 and has labels. Is there any way to store the labels in a local?

I want a series of .log files containing some tabulations for each race, and I'd like the file names to reflect which race they contain tables for. I'm currently using:

Code:

forvalues i = 1/6 {
    use `temp', clear
    log using clean/`i'.log, replace
    keep if race == `i'
    foreach y of local vars {
        tab `y'
    }
    log close
}

I'd like to use something like

Code:

forvalues i = 1/6 {
    use `temp', clear
    keep if race == `i'
    local r = race[1]
    log using clean/`r'.log, replace
    foreach y of local vars {
        tab `y'
    }
    log close
}

But this would just give me files named 1.log, 2.log etc like the code above. Is there a way to alter race[1] to refer to the value's label?

I know I could just replace the values with a string containing the text but that seems messy if there's a way to more easily reference the labels.
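
One way this might be done is with the extended macro function for value labels, which looks up the label text attached to a given value of a variable. The sketch below reuses the loop from the post and assumes, as there, that the tempfile macro temp and the local vars are already defined; passing the label through strtoname() is an added precaution, not part of the original code.

```stata
forvalues i = 1/6 {
    use `temp', clear
    keep if race == `i'

    * look up the label text attached to this value of race
    local r : label (race) `i'

    * replace spaces and other awkward characters so the text is a safe file name
    local r = strtoname("`r'")

    log using "clean/`r'.log", replace
    foreach y of local vars {
        tab `y'
    }
    log close
}
```

If a value has no label, the extended function returns the number itself, so the file name falls back to something based on the numeric code.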

↧

Panel data or Pooled Cross Section?

November 18, 2016, 3:41 pm

Hello. I am not sure whether the data I am working on are pooled cross sections or panel data. The database includes 60 national elections in 14 countries in which non-residents were allowed to vote. The elections were held between 1993 and 2016. Variables include aggregate data, such as the electoral turnout of citizens living abroad, for each election in each country. My ID variable is country and my time variable is year. In some countries emigrants could vote only once during that period, while in other countries they could vote more than ten times. For most of the countries my database includes every election in which emigrants could vote, while for others I have missing data. Knowing whether I have a panel or pooled cross sections is important for deciding whether I have to deal with serially correlated residuals. Can anyone help me?

↧

Bootstrapping standard errors with eteffects - one specific replication has missing values, does not converge

November 18, 2016, 3:55 pm

Hello Statalist,

I have recently upgraded to Stata 14.2 and am using the new eteffects command. I have been bootstrapping standard errors with eteffects successfully to this point; however, this specific model repeatedly will run for about 250 iterations and then never move forward for an hour or more, although Stata's "working" indicator (the spinning wheel at the bottom right) runs.

Code:

eteffects (uscnoer AGE3X i.female i.race i.ratehealth K6SUM4 i.degree i.marriedyes i.household i.poverty i.region i.urban i.employment i.recession, probit) (switch i.ivfour i.ivfive), vce(bootstrap, reps(1000) seed(51113))

I checked it step by step using the noisily command and found that the problem iteration returns the following notes:

Code:

note: 6790 missing values returned for equation 5 at initial values
note: 6790 missing values returned for equation 6 at initial values
note: 6790 missing values returned for equation 7 at initial values
note: 6790 missing values returned for equation 8 at initial values
note: 6790 missing values returned for equation 9 at initial values

Iteration 0:   EE criterion =  3.2510106  (not concave)
Iteration 1:   EE criterion =  3.0020691  (not concave)
Iteration 2:   EE criterion =  2.8847664  (not concave)
Iteration 3:   EE criterion =  2.7962922  (not concave)
Iteration 4:   EE criterion =  2.7957563  (not concave)
Iteration 5:   EE criterion =   2.795542  (not concave)
Iteration 6:   EE criterion =  2.7953705  (not concave)
Iteration 7:   EE criterion =  2.7953534  (not concave)
Iteration 8:   EE criterion =  2.7953517  (not concave)
Iteration 9:   EE criterion =  2.7953516  (not concave)
Iteration 10:  EE criterion =  2.7953516  (not concave)

If left to its own devices, it will keep repeating the last value (2.7953516) for more than 100 iterations. The preceding runs of this model converge in one or two iterations at most, and I strongly suspect that the problem is the 6790 missing values alluded to in the notes - they never show up until the problem run. However, I'm not sure why these values are missing in this one case, or indeed what it refers to when it says that they are missing in equations 5, 6, 7, etc. The sample in the other runs of the model is 8,466 individuals, and there does not seem to be any variable that is missing for anywhere near 6,790 respondents.

Here is a sample of my dataset:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(uscnoer switch) byte AGE3X float(female race ratehealth) byte K6SUM4 float(degree marriedyes household poverty region urban employment recession switch ivfour ivfive)
0 0 55 0 0 3  0 0 0 2 5 2 1 2 2 0 0 0
1 1 58 1 1 2 -1 2 0 2 4 1 1 0 2 1 0 0
1 0 47 0 0 3  8 1 0 1 5 2 0 2 2 0 0 0
1 0 20 0 2 3 17 1 0 2 3 3 1 0 2 0 0 0
1 0 44 1 1 2  2 1 0 2 3 4 1 2 2 0 0 0
1 0 45 1 0 5 20 1 0 2 1 1 1 2 2 0 0 0
1 1 60 0 0 3  1 1 1 2 4 4 1 2 2 1 0 0
1 0 47 1 2 4 18 0 0 2 1 1 1 0 2 0 0 0
1 0 58 0 2 3  2 0 1 2 4 4 1 0 2 0 0 0
0 1 52 0 0 4  0 1 1 2 4 2 1 2 2 1 0 0
end
label values AGE3X H1560135X
label values K6SUM4 H1561688X
label def H1561688X -1 "-1 INAPPLICABLE", modify

When I set a different seed, the endlessly-iterating problem still occurs, but after fewer replications.

I would greatly appreciate any feedback the list can provide on strategies for diagnosing, understanding, and/or fixing this issue.

Sincerely,

Liz Wood

↧

Extract digits from a number

November 18, 2016, 5:46 pm

We have a variable total_response_time formatted in an hour_minute_second way. For example, 12301 means 1 hour, 23 minutes and 1 second; 423 means 4 minutes and 23 seconds; 113456 means 11 hours, 34 minutes and 56 seconds. How can we convert these values into seconds? For example, 12301 = 1*3600 + 23*60 + 1. The tricky part is that some of them have 6 digits, some 5 digits, and so on, so the following code does not work:

gen x = total_response_time
tostring x, gen(str_x)
gen xhour = substr(str_x, 1, 2)    // hours: start at character 1, take 2 characters
gen xminutes = substr(str_x, 3, 2) // minutes: start at character 3, take 2 characters
gen xseconds = substr(str_x, 5, 2) // seconds: start at character 5, take 2 characters
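
A sketch of an arithmetic alternative that does not depend on the number of digits, assuming the variable is numeric, is named total_response_time, and always encodes hours, minutes and seconds as HHMMSS with leading zeros dropped. The new variable names are hypothetical.

```stata
* Sketch only: peel off two digits at a time from the right
gen long hours   = floor(total_response_time / 10000)
gen long minutes = mod(floor(total_response_time / 100), 100)
gen long seconds = mod(total_response_time, 100)

* combine into total seconds
gen long total_seconds = 3600*hours + 60*minutes + seconds
```

For 12301 this gives hours = 1, minutes = 23 and seconds = 1, so total_seconds = 3600 + 1380 + 1 = 4981; for 113456 it gives 11*3600 + 34*60 + 56 = 41696.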

↧