Channel: Statalist

How to Create Maps in Stata

Hi All,

I am trying to use the attached data to create two maps in Stata. The data consist of two variables: student ZIP codes (HOMEZIP) and student types (StudentType). I would like to "map" the ZIP codes as dots onto the state of Texas to see whether student locations cluster, and I would like to do this for two student types: gtex students in gtex.dta and TCEDH students in tcedh.dta. What commands are necessary to construct these two maps (again, just ZIP-code-based dots mapped onto Texas)? Here are links to my data (I was not able to upload them to Statalist):

https://drive.google.com/file/d/0Bz-...ew?usp=sharing

https://drive.google.com/file/d/0Bz-...ew?usp=sharing
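For what it's worth, a minimal sketch of one way to do this with the user-written shp2dta and spmap packages (both on SSC); texas.shp, zip_coords.dta, and its variables xcoord/ycoord are assumptions (a Texas boundary shapefile and a ZIP-to-coordinate crosswalk), not part of the posted files:

Code:
ssc install spmap
ssc install shp2dta
shp2dta using texas.shp, database(txdb) coordinates(txcoord) genid(id)

use gtex, clear
merge m:1 HOMEZIP using zip_coords, keep(match) nogenerate
save gtex_xy, replace

use txdb, clear
spmap using txcoord, id(id) ///
    point(data(gtex_xy) xcoord(xcoord) ycoord(ycoord))
* repeat the merge and the spmap call with tcedh.dta for the second map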

Thanks in advance,

Adam



Source of Stata's "school" dataset

Does anyone know the source of the dataset that is loaded by typing
Code:
webuse school
I tried typing "notes", but none were provided. I am interested in definitions of variables as well as the original source (if possible).
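Not an answer on provenance, but as a small sketch, the variable labels at least give rough working definitions:

Code:
webuse school, clear
describe        // variable labels give rough definitions
notes list      // would list any notes attached to the data (none here)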

year, quarter, month, and week: Creating a date? Sample Data provided.

Dear Statalist participants

I have year, quarter, month, and week in my data, and I want to create a Stata date variable on a weekly basis and sort my data accordingly.

The quarter variable runs from 1 to 4, but the month variable runs from 1 to 3 (i.e., within the quarter) and the week variable runs from 1 to 4 (i.e., within the month).

I need:
1- to create a date variable
2- to sort my data weekly using the date variable in order to run weekly regressions

Another related question: in order to run my analysis on a weekly basis, would it be easier if the month variable ran from 1 to 12 and the week variable from 1 to 48 (weeks are stylized, so there are indeed 48 weeks in each year), or does it make no difference?

I have uploaded a sample using dataex:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int year byte(quarter month week) float earnings
1987 1 1 1   1.7705997
1987 1 1 2    .4728113
1987 1 1 3    .3941394
1987 1 1 4   .11440884
1987 1 2 1     .160652
1987 1 2 2    .3961538
1987 1 2 3   .45841575
1987 1 2 4    .5288332
1987 1 3 1    .8589164
1987 1 3 2    .8497481
1987 1 3 3    .8618858
1987 1 3 4     .961287
1987 2 1 1    1.345534
1987 2 1 2   1.1195575
1987 2 1 3    .3546848
1987 2 1 4    .4636721
1987 2 2 1    .7540874
1987 2 2 2    .8403442
1987 2 2 3    .8561469
1987 2 2 4    .8706375
1987 2 3 1    1.383695
1987 2 3 2   1.3878306
1987 2 3 3   1.3569516
1987 2 3 4   1.2848142
1987 3 1 1    1.087582
1987 3 1 2    .7876818
1987 3 1 3     .200033
1987 3 1 4   .16993275
1987 3 2 1    .2436485
1987 3 2 2    .2782035
1987 3 2 3   .32997075
1987 3 2 4    .3608698
1987 3 3 1    .5761423
1987 3 3 2    .6065645
1987 3 3 3    .6020274
1987 3 3 4    .6339163
1987 4 1 1   1.3598517
1987 4 1 2   .56407243
1987 4 1 3    .3157399
1987 4 1 4   .22502215
1987 4 2 1    .3535852
1987 4 2 2    .3888265
1987 4 2 3     .425247
1987 4 2 4    .4753116
1987 4 3 1    .7817153
1987 4 3 2    .7864803
1987 4 3 3    .7770564
1987 4 3 4     .810554
1988 1 1 1   1.9881122
1988 1 1 2    .9616702
1988 1 1 3    .9560576
1988 1 1 4  -.02433528
1988 1 2 1   .06566841
1988 1 2 2    .1756358
1988 1 2 3    .2455934
1988 1 2 4   .29687575
1988 1 3 1    .4343888
1988 1 3 2    .4354226
1988 1 3 3    .4378992
1988 1 3 4    .4880208
1988 2 1 1   2.0682137
1988 2 1 2    .7863678
1988 2 1 3   .16332495
1988 2 1 4    .1944389
1988 2 2 1    .3070929
1988 2 2 2    .4097694
1988 2 2 3    .4642496
1988 2 2 4    .4707479
1988 2 3 1     .755556
1988 2 3 2    .7392501
1988 2 3 3    .7286718
1988 2 3 4    .7465883
1988 3 1 1   1.2585707
1988 3 1 2    .4756329
1988 3 1 3   .17472464
1988 3 1 4  -.10320017
1988 3 2 1   -.1050039
1988 3 2 2  -.06223335
1988 3 2 3   -.0396569
1988 3 2 4 -.001310848
1988 3 3 1  .010440214
1988 3 3 2   .03693339
1988 3 3 3   .03542653
1988 3 3 4   .07352171
1988 4 1 1   1.9921095
1988 4 1 2   1.0118054
1988 4 1 3   -.1854238
1988 4 1 4   .08538733
1988 4 2 1   .20397782
1988 4 2 2    .2374128
1988 4 2 3   .28562233
1988 4 2 4   .28901428
1988 4 3 1   .47351405
1988 4 3 2    .4622011
1988 4 3 3    .4750741
1988 4 3 4      .48286
1989 1 1 1   2.0214815
1989 1 1 2   1.0627728
1989 1 1 3    .6280478
1989 1 1 4  -.03181221
end
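For what it's worth, a minimal sketch of one way to build a running weekly index from these four variables; with 48 stylized weeks per year no official Stata %t format applies, and the base year 1987 is taken from the excerpt above:

Code:
gen int weekofyear = (quarter - 1)*12 + (month - 1)*4 + week   // 1..48 within the year
gen int wdate = (year - 1987)*48 + weekofyear                  // running weekly index
sort wdate
tsset wdate

Whether month is coded 1-3 within the quarter or 1-12 within the year (and week 1-4 or 1-48) should make no substantive difference; the two codings are just relabellings of the same 48 within-year positions.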


Thanks
Mike




survival analysis with overlapping records probable error

Dear Stata team,
Hi again,
I am trying to run a survival analysis. I have multiple records per subject: the participant comes to the clinic and a specimen is taken to measure the outcome (detection takes many days), and sometimes the participant comes to the next visit, where a second sample is collected, before the result of the first sample is available.
This overlap between visit 2 and the outcome of visit 1 should not matter, meaning it does not matter whether the result of visit 1 arrives before sampling at visit 2; what matters is the duration we observe until each sample's outcome is detected.
Is there a way to get rid of this overlapping issue? Does it affect the analysis?

I have these variables:
date of sampling
date of detection
detection result (0 for positive and 1 for negative)

I used the following
stset date of detection, id(id) failure(detection result==1) time0(date of sampling)
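For illustration only, a minimal sketch of one way to sidestep the overlap, with assumed variable names date_sample, date_detect, and result: analyse each sample on its own time-from-sampling scale, so records within a subject no longer need to be non-overlapping, and handle within-subject dependence in the model.

Code:
gen long recid = _n                          // one record id per sample
gen double dur = date_detect - date_sample   // days from sampling to detection
stset dur, id(recid) failure(result == 1)
* within-subject dependence can then be handled at the modelling stage,
* e.g. stcox x1 x2, vce(cluster id)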

any help on this ?

thank you very much

best regards
Umama

Transforming Time Series Data

I have a time series of coffee prices with 450 observations. Is there any particular reason that I keep seeing articles log-transform the data?
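Not an answer to the "why" on its own, but for concreteness a minimal sketch (price and the time variable t are assumed names): differences of the log are approximately proportional changes, which is one common reason for the transformation.

Code:
tsset t                    // t is an assumed time variable
gen lprice = ln(price)     // log of the coffee price
gen growth = D.lprice      // approximate proportional change per period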

Thanks in advance

New package descgen on SSC

Thanks once again to Kit Baum, a new package descgen is now downloadable from SSC. In Stata, use the ssc command to do this.
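For example, a minimal install sketch:

Code:
ssc install xdir       // companion package that builds the file list
ssc install descgen    // the package announced here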

The descgen package is described below, as on my website. It is designed to be used with the existing SSC package xdir to create meta-datasets, with one observation for each of a list of Stata dataset files and data on an assortment of Stata dataset attributes. Meta-datasets can be very useful for documenting large databases concisely.

Best wishes

Roger

---------------------------------------------------------------------------
package descgen from http://www.imperial.ac.uk/nhli/r.newson/stata10
---------------------------------------------------------------------------

TITLE
descgen: Add Stata dataset attribute variables to a xdir resultsset

DESCRIPTION/AUTHOR(S)
descgen is intended for use in output datasets (or resultssets)
produced by the xdir package, which have one observation for each
of a list of files, and data on directory names and file names.
It inputs the file name variable and (optionally) the directory
name variable, and generates a list of new variables, containing,
in each observation, Stata dataset attributes describing the
Stata dataset stored in the corresponding file. These attributes
include numbers of variables and observations and sizes of an
observation or of the dataset, and (optionally) lists of
variables in the dataset. If the corresponding file is not a
Stata dataset, then these dataset attribute variables will have
missing values.

Author: Roger Newson
Distribution-Date: 02march2017
Stata-Version: 10

INSTALLATION FILES
descgen.ado
descgen.sthlp
---------------------------------------------------------------------------

Which code for my mixed effects model should I be using

Hi All,

I have longitudinal data (16 cycles) and I am trying to estimate changes in an outcome (padif) with four variables: limp himp girl income2. Note: the max shown in the output is 15 because my outcome is computed as the change from the previous cycle to the current cycle (2-1, 3-2, 4-3, ...), which excludes cycle 1 from estimation.
Here are my questions:
1) Model 1: as far as I understand, this part of my code, || ID, allows for a random intercept for each individual (i.e., Stata computes the within variance [in the output, var(_cons)]) and therefore also computes the between variance (in the output it is called var(Residual)).
2) Model 1, by including cycle as a covariate, provides an estimate of changes over time. My interpretation of that coefficient would therefore be: a one-cycle increase is associated with a 0.016 decrease in padif, adjusting for all covariates of course. But I am wondering whether my syntax in model 1 is correct and whether it actually answers my question. Reading the Stata documentation for xtreg and mixed made me consider the second model that I present below.
3) Does the use of || ID: cycle adjust appropriately for time trends? In the output for model 2, what does each reported random-effects parameter represent? And why does Stata not report standard errors or confidence intervals?
4) Which syntax would you use, and why? Do you recommend anything else?

Thank you for your time and help


Code:
*Model 1
. mixed padif limp himp girl income2 cycle ||ID:, covariance(uns)
Note: single-variable random-effects specification in ID equation; covariance structure set to identity

Performing EM optimization: 

Performing gradient-based optimization: 

Iteration 0:   log likelihood = -11352.708  
Iteration 1:   log likelihood = -11334.197  
Iteration 2:   log likelihood = -11333.829  
Iteration 3:   log likelihood = -11333.829  

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =      5,965
Group variable: ID                              Number of groups  =        745

                                                Obs per group:
                                                              min =          1
                                                              avg =        8.0
                                                              max =         15

                                                Wald chi2(5)      =      19.32
Log likelihood = -11333.829                     Prob > chi2       =     0.0017

------------------------------------------------------------------------------
       padif |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        limp |  -.0367974   .0130298    -2.82   0.005    -.0623352   -.0112595
        himp |   .0282593   .0241522     1.17   0.242    -.0190782    .0755968
        girl |   .0146252   .0427318     0.34   0.732    -.0691277     .098378
     income2 |   1.23e-06   2.49e-06     0.50   0.621    -3.65e-06    6.12e-06
       cycle |  -.0168879    .005265    -3.21   0.001    -.0272071   -.0065686
       _cons |   .1108534   .1045692     1.06   0.289    -.0940983    .3158052
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
ID: Identity                 |
                  var(_cons) |   5.76e-22   9.01e-22      2.69e-23    1.23e-20
-----------------------------+------------------------------------------------
               var(Residual) |   2.617536   .0479297      2.525261    2.713183
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 0.00          Prob >= chibar2 = 1.0000

. 
end of do-file

. do "C:\Users\ppa3381\AppData\Local\Temp\STD00000000.tmp"

*Model 2
mixed padif limp himp girl income2 || ID: cycle, covariance(uns)

Performing EM optimization: 

Performing gradient-based optimization: 

Iteration 0:   log likelihood = -11375.475  
Iteration 1:   log likelihood = -11339.859  
Iteration 2:   log likelihood = -11338.969  
Iteration 3:   log likelihood = -11338.969  

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =      5,965
Group variable: ID                              Number of groups  =        745

                                                Obs per group:
                                                              min =          1
                                                              avg =        8.0
                                                              max =         15

                                                Wald chi2(4)      =       9.02
Log likelihood = -11338.969                     Prob > chi2       =     0.0606

------------------------------------------------------------------------------
       padif |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        limp |  -.0387707   .0130264    -2.98   0.003    -.0643021   -.0132393
        himp |   .0366838   .0240297     1.53   0.127    -.0104135    .0837811
        girl |   .0128198    .042765     0.30   0.764     -.070998    .0966376
     income2 |   4.80e-07   2.49e-06     0.19   0.847    -4.39e-06    5.35e-06
       _cons |  -.0009103   .0986782    -0.01   0.993     -.194316    .1924953
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
ID: Unstructured             |
                  var(cycle) |   8.40e-11          .             .           .
                  var(_cons) |   8.62e-10          .             .           .
            cov(cycle,_cons) |  -7.73e-11          .             .           .
-----------------------------+------------------------------------------------
               var(Residual) |   2.622051          .             .           .
------------------------------------------------------------------------------
LR test vs. linear model: chi2(3) = 0.00                  Prob > chi2 = 1.0000

Note: LR test is conservative and provided only for reference.

. 
end of do-file
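Not an answer to which model is right, but for reference a minimal sketch of a third specification that keeps cycle in the fixed part while also giving it a random slope, which is a common way of combining the two models above:

Code:
mixed padif limp himp girl income2 c.cycle || ID: cycle, covariance(unstructured)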

oaxaca command for the change in the gap between two years

Hello,

I ran oaxaca on cross-sectional data. At this stage, I want the command for the following differences:

1/ the change in the gender pay gap between two years (2000 and 2005):
(wage F − wage M)2000 − (wage F − wage M)2005 = (... decomposed into the other variables)

2/ the difference in the gender pay gap between two regimes, for example union and non-union firms:
(wage F − wage M)union − (wage F − wage M)non-union = (... decomposed into the other variables)

I found the theoretical background, but I do not see the command for this.
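For what it's worth, a minimal sketch of one way to piece the two-period change together from separate year-by-year decompositions (it assumes oaxaca from SSC, a log-wage variable lnwage, a group indicator female, and covariates educ exper, none of which are from the post):

Code:
ssc install oaxaca
oaxaca lnwage educ exper if year == 2000, by(female)
estimates store gap2000
oaxaca lnwage educ exper if year == 2005, by(female)
estimates store gap2005
* the change in the overall gap and in each component is the difference
* between the corresponding elements of e(b) in the two stored models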

With regards,
Tahani

The least tedious method for calculating many autocorrelations?

Hi!

Let's say that I have time-series data of returns for one stock over a period of 30 years and want to calculate the first autocorrelation (in yearly stock returns) by running an AR(1) regression. I could then create a new variable (which I will name "persistence in stock returns") using the coefficient from the regression. This is the first autocorrelation figure.

Now let's assume that I have data for 1,000 stocks instead. For each stock, I want to calculate this persistence/autocorrelation measure from above. The thing is that I want to store the results neatly in a variable, because later I'm going to use it as the dependent variable in a cross-sectional regression.

So, how do I do this? Do I really have to run 1,000 regressions?
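Effectively yes, one regression per stock, but Stata can automate it; a minimal sketch with statsby, assuming panel variables stockid, year, and ret (names are mine, not from the post):

Code:
xtset stockid year
* one AR(1) regression per stock, keeping only the slope on lagged returns
statsby persistence=_b[L.ret], by(stockid) clear: regress ret L.ret
* the result is one observation per stock; merge it back into the
* stock-level data for the cross-sectional regression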

/Alex

not estimable margins after OLS on panel data

Dear Statalisters,

I have a problem computing the margins of an OLS regression on panel data that I have not managed to solve by looking at previous posts - I apologize in case I overlooked similar issues that have already been discussed.

Background
I have panel data on crime incidence (from now on, "main_rate") observed for each of the 600 municipalities of a country across 13 years, and I have successfully evaluated the impact of a reform that occurred in the 6th year. Now I would like to explore some heterogeneous effects, and in particular how the magnitude of the treatment coefficient changes with the size of each of the police districts the country is divided into. By "size" I mean the number of municipalities composing the police district.

The dummy Treat is equal to 1 for those municipalities where the reform was implemented, and the dummy d turns on for the years during which the reform was enforced.

Code:
gen treatment = Treat*d
(using the factor-variable interaction ## instead would not change what follows)

The distribution of police-district size (variable name "sizeZP") by the value of the treatment variable is the following:

Code:
           |       treatment
    sizeZP |         0          1 |     Total
-----------+----------------------+----------
         1 |       504        120 |       624
         2 |       858        208 |     1,066
         3 |     1,446        192 |     1,638
         4 |       996        512 |     1,508
         5 |       915        320 |     1,235
         6 |       306        240 |       546
         7 |        91          0 |        91
         8 |        80        128 |       208
         9 |       135        216 |       351
        10 |       230        160 |       390
-----------+----------------------+----------
     Total |     5,561      2,096 |     7,657
Because there are some imbalances in the distribution (in particular, police districts with 7 municipalities appear only in the control group), and for other more conceptual reasons, I decided to regroup the police districts as follows (however, I don't believe this is actually the source of my problem; even using sizeZP I would encounter the problem described below):

Code:
gen sizePD=1 if sizeZP==1
replace sizePD=2 if sizeZP>1 & sizeZP <=4
replace sizePD=3 if sizeZP>4 & sizeZP<=7
replace sizePD=4 if sizeZP>7
The interaction term
I then run the regression including the interaction term:

Code:
xtreg main_rate Treat d treatment##i.sizePD $controls i.year, fe vce(cluster INS)
These are the results I get:

Code:
Fixed-effects (within) regression               Number of obs      =      7655
Group variable: INS                             Number of groups   =       589

R-sq:  within  = 0.0943                         Obs per group: min =        11
       between = 0.2051                                        avg =      13.0
       overall = 0.1403                                        max =        13

                                                F(21,588)          =     17.38
corr(u_i, Xb)  = -0.6185                        Prob > F           =    0.0000

                                      (Std. Err. adjusted for 589 clusters in INS)
----------------------------------------------------------------------------------
                 |               Robust
       main_rate |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
             WAL |          0  (omitted)
               d |  -.0441572   .0209879    -2.10   0.036    -.0853775   -.0029369
     1.treatment |  -.1234831   .0265136    -4.66   0.000     -.175556   -.0714102
                 |
          sizePD |
              2  |          0  (omitted)
              3  |          0  (omitted)
              4  |          0  (omitted)
                 |
treatment#sizePD |
            1 2  |   .0595935   .0275074     2.17   0.031     .0055688    .1136182
            1 3  |   .0610467   .0297444     2.05   0.041     .0026285    .1194648
            1 4  |   .0920392   .0321078     2.87   0.004     .0289793    .1550992
                 |
             pop |  -2.26e-06   1.91e-06    -1.18   0.237    -6.02e-06    1.49e-06
         density |  -.0000497   .0000186    -2.67   0.008    -.0000863   -.0000132
     meanyxdecla |  -3.00e-06   3.94e-06    -0.76   0.446    -.0000107    4.73e-06
           unemp |   .0016622   .0049068     0.34   0.735    -.0079748    .0112993
         edu_low |   .0080593   .0069817     1.15   0.249    -.0056527    .0217713
                 |
            year |
           2001  |  -.0124893   .0083754    -1.49   0.136    -.0289387      .00396
           2002  |   .0212187   .0107612     1.97   0.049     .0000836    .0423537
           2003  |  -.0054469   .0142556    -0.38   0.703    -.0334449    .0225512
           2004  |  -.0272049   .0168966    -1.61   0.108      -.06039    .0059802
           2005  |          0  (omitted)
           2006  |  -.0055406   .0066912    -0.83   0.408    -.0186822    .0076009
           2007  |   .0142893   .0107971     1.32   0.186    -.0069164    .0354949
           2008  |   .0312251   .0146257     2.13   0.033     .0025001      .05995
           2009  |   .0184093   .0173059     1.06   0.288    -.0155796    .0523983
           2010  |   .0089712    .019733     0.45   0.650    -.0297845     .047727
           2011  |   .0410702   .0246737     1.66   0.097    -.0073891    .0895295
           2012  |   .0293179   .0309947     0.95   0.345    -.0315559    .0901918
                 |
           _cons |   3.700322   .2984158    12.40   0.000     3.114231    4.286412
-----------------+----------------------------------------------------------------
         sigma_u |  .51188845
         sigma_e |  .14790359
             rho |  .92294798   (fraction of variance due to u_i)
----------------------------------------------------------------------------------
The margins... and the problem
In computing margins, I tried several combinations of

Code:
margins treatment##sizePD
and obtained this:

Code:
. margins treatment##sizePD

Predictive margins                                Number of obs   =       7655
Model VCE    : Robust

Expression   : Linear prediction, predict()

----------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
       treatment |
              0  |          .  (not estimable)
              1  |          .  (not estimable)
                 |
          sizePD |
              1  |          .  (not estimable)
              2  |          .  (not estimable)
              3  |          .  (not estimable)
              4  |          .  (not estimable)
                 |
treatment#sizePD |
            0 1  |          .  (not estimable)
            0 2  |          .  (not estimable)
            0 3  |          .  (not estimable)
            0 4  |          .  (not estimable)
            1 1  |          .  (not estimable)
            1 2  |          .  (not estimable)
            1 3  |          .  (not estimable)
            1 4  |          .  (not estimable)
----------------------------------------------------------------------------------
I have the feeling I am missing some really basic detail, but I have been digging into this so much that I can no longer step back and find a solution.

Does any of you have a solution to this oddity? If you need more information about the data, please do not hesitate to ask below.
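One workaround that is sometimes suggested when fixed effects make the default margins formally not estimable is to skip the estimability check; a sketch only, to be interpreted with care:

Code:
margins treatment#sizePD, noestimcheck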

Thank you in advance!

Andrea

How to account for large number of FE when using suest command

I am attempting to use suest with 10-20 linear equations, each of which has a large number of fixed effects, and, not surprisingly, I am running into matsize problems. I believe one viable alternative would be to first apply a "within transformation" and then estimate the models on the transformed X and Y without the FE. I have done this in the past and realize that, in order to conduct inference, one must adjust for degrees of freedom. In past work, I have simply adjusted the standard errors manually, multiplying by a factor of {(NT-K)/[N(T-1)-K]}^0.5.

But I'm not entirely clear what, if anything, I should additionally do in the context of suest.

I *suspect* that I should manually adjust the covariance matrix after each equation, and if/when these adjusted covariance matrices are fed into suest, all will be good. But this might not be correct. Also, I am not particularly facile at Mata or other matrix commands in Stata so I'm not sure how to do this manual adjustment.

Any suggestions appreciated.
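For what it's worth, a minimal sketch of the within-transformation step described above, with illustrative names (panel identifier id, variables y x1 x2); the question of how exactly to adjust the covariance matrices fed to suest is left open:

Code:
foreach v of varlist y x1 x2 {
    egen double `v'_bar = mean(`v'), by(id)   // group means
    gen double `v'_w = `v' - `v'_bar          // demeaned (within) values
}
regress y_w x1_w x2_w                         // fixed-effect dummies no longer needed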

Multiple commands within a loop (clonevar, recode, label define...)

Hello!

Here is my setup:

Code:
foreach var of varlist homeschlx schoicex-ssamsc seadplcx serepeat ///
sesusout-seexpel snetcrs sinstfee fssportx-fscounslr fsnotesx-fsphonchx ///
fhplace fostory2x-fohistx folibrayx-fosprtevx hdlearnx-hddeviepx hdspcled ///
hdlearn-hdfrnds cenglprg p1scint p1wrmtl p1hispan p1enrl p2guard p2scint ///
p2wrmtl p2hispan p2enrl p2lkwrk hwelftan-hsecn8 hvintrnt {
    clonevar `var' = clone_`var'
    recode clone_`var' = (1==2) (2==1)
    label define clone_`var' 1 "No" 2 "Yes"
    label value clone_`var' clone_`var'
}

I'm attempting to clone, recode, and create a new label for the cloned variables, all within one loop. How close is my setup, and what am I missing (I'm pretty sure a lot)? I'm new to looping over multiple commands - there is much to learn, but thank you for your help!
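For comparison, a minimal corrected sketch, assuming the intention is to create clone_* copies of each original variable with codes 1 and 2 swapped and labelled "No"/"Yes"; one shared value label, defined once, is enough:

Code:
label define yesno 1 "No" 2 "Yes"
foreach var of varlist homeschlx schoicex-ssamsc seadplcx serepeat /* ...rest of the varlist as above... */ hvintrnt {
    clonevar clone_`var' = `var'         // new variable = copy of the original
    recode clone_`var' (1=2) (2=1)       // swap the codes
    label values clone_`var' yesno       // attach the shared value label
}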

Use local in global

Dear statalists,

I am not entirely sure whether "use local in global" is an appropriate description for my problem but here is what I would like to do:

I am running several regressions for different dependent variables. The dependent variables differ in the year they refer to.

Code:
        foreach year of numlist 1950(10)1990 {

            global control_dist1_`year'     weighted_dist_y`year'
            global control_dist2_`year'     weighted_dist_y`year' c.weighted_dist_y`year'#c.weighted_dist_y`year'
            global control_dist3_`year'     weighted_dist_y`year' c.weighted_dist_y`year'#c.weighted_dist_y`year' c.weighted_dist_y`year'#c.weighted_dist_y`year'#c.weighted_dist_y`year'

        }

        foreach year of numlist 1950(10)1990 {

            reg mig_total_`year' weighted_sc_initial10_y`year' $control_dist1_`year' $control_pop_`year', robust            // linear dist control
            reg mig_total_`year' weighted_sc_initial10_y`year' $control_dist2_`year' $control_pop_`year', robust            // quadratic d control
            reg mig_total_`year' weighted_sc_initial10_y`year' $control_dist3_`year' $control_pop_`year', robust            // cubic dist. control
            
        }
First, I would like to define globals for different control variables. With a foreach loop, I define these for every decadal year between 1950 and 1990. Then, when running the actual regressions, I would like to use the controls that correspond to the year of the dependent variable.

The result is that I try to reference a local (from the foreach loop) inside the name of a global (previously defined). For example,
Code:
$control_dist1_`year'
This does not work. Stata gives the error:
1950 [i.e., the first year in the foreach loop] invalid name

Did I miss something obvious? Is this possible at all?
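For what it's worth, this is usually solved by wrapping the global's name in braces, ${ }, so that the local expands inside it; a minimal sketch:

Code:
foreach year of numlist 1950(10)1990 {
    reg mig_total_`year' weighted_sc_initial10_y`year' ${control_dist1_`year'} ${control_pop_`year'}, robust
}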

Many thanks,
Milan


How do you make a pie chart by combining two variables in Stata?


How do you make a pie chart by combining two variables in Stata? For example, make pie charts by sex (male, female) and studies (yes, no).
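A minimal sketch, assuming variables named sex and studies: over() defines the slices and by() draws one pie per group.

Code:
graph pie, over(sex) by(studies)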

Markov switching model for panel data--Help Please

Dear Statalists,

I need help with a Markov switching model for panel data. Does anyone know how I can do this?

thanks

arshad

marginsplot after mimrgns

Hello everyone,
I was trying to use the marginsplot command after mimrgns, but it shows the error "previous command was not margins". Does anyone know where I went wrong? Thanks.
Lei

Interpreting poisson regression coefficients

Hi,

I would like to understand how to interpret the coefficients generated by Poisson regression (and by zero-inflated Poisson, if different). Is exp(beta) simply the multiplicative factor applied to the mean of the dependent variable? The regression equation and results are as follows:

dependent variable=treatment + after + treatment*after + error

Both treatment and after are binary indicators, for being in the treatment group and for the period after a service introduction. Thanks.
Weekly Visits: 2.6403*** (0.0794)
Weekly Quantity: 2.6168*** (0.1049)
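For what it's worth, a minimal sketch (the variable names are illustrative, not from the post): with the irr option, Stata reports exp(b) directly, which is the multiplicative factor on the expected count; the same exponentiation applies to the count equation of zip.

Code:
poisson visits i.treatment##i.after, irr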

Filling missing values

Hi dear all,

I'm new to Stata and I'm having a problem filling in missing data. I'm looking into the relation between crime rates and the share of the young / the share of young males, and I have dozens of series like this:

year   cr_uk     y       ym
1800    0.9    0.2073     .
1810    1.1      .        .
1820    0.9      .        .
1830    1.1    0.1518   0.0837
1840    1.0      .        .
1850    0.9    0.2985   0.1236
1860    1.0    0.3060   0.1209
1870    0.9    0.3566     .
1880    0.8      .        .
1890    0.8      .        .
1900    0.6      .        .
1910    0.6      .        .
1920    1.0      .        .
1930    0.9      .        .
1940    0.7      .        .
1950    0.5    0.2140   0.1053
1960    0.8    0.1910   0.0955
1970    1.5    0.1997   0.1009
1980    1.7    0.2142   0.1083
1990    1.6    0.2327   0.1166
2000    1.6    0.2071   0.1030
2010    1.1    0.1989   0.1001


or like this:

year   cr_dk     y       ym
1800     .     0.2385   0.1194
1810     .       .        .
1820     .       .        .
1830     .       .        .
1840     .       .        .
1850     .     0.2096   0.1042
1860     .     0.2596   0.1229
1870     .       .        .
1880     .       .        .
1890     .       .        .
1900     .       .        .
1910     .       .        .
1920    0.6      .        .
1930    0.7      .        .
1940    0.9      .        .
1950    0.8    0.2169   0.1079
1960    0.6    0.1895   0.0947
1970    0.8    0.2218   0.1136
1980    1.2    0.2262   0.1161
1990    1.2    0.2274   0.1170
2000    0.9    0.2106   0.1071
2010    0.8    0.1782   0.0899

where cr stands for the crime rate, uk and dk for the two countries, y for the share of those aged 20-29 in the total population, and ym for the share of males aged 20-29 in the total population.

I need to fill in the blanks so that I can subsequently run a simple regression (reg cr y; reg cr ym). I was told to use some kind of interpolation, so I read about mi impute and linear interpolation, but only got more confused by the various alternatives. Which method and which options would be best here, for example for the two datasets above? Should I treat the two kinds differently, or would it be simpler to pool all the countries together and interpolate? I'm totally lost.

For y and ym, I have complete UN data for 1950-2015 at five-year intervals. To match the timeline of cr, only every tenth year is listed above. Somehow I feel this is wrong, but if I included all the y and ym data, I would have more gaps to fill in the crime rates (or is that better?).

I'm terribly sorry for my English; if anything is unclear, please do point it out. By the way, I am using Stata 12 for Windows, and some syntax I found online sadly does not work on my machine.

Can anyone drop a hint? Many thanks in advance!
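Not a full answer, but a minimal sketch of plain linear interpolation within each country (simpler than mi impute and perhaps enough for a first pass); it assumes the series are stacked in long form with a country identifier and a single crime-rate variable cr:

Code:
* one row per country-year, variables: country year cr y ym
foreach v of varlist cr y ym {
    bysort country (year): ipolate `v' year, gen(`v'_ip)
}
* -ipolate- fills gaps between observed points; add the epolate option
* only if extrapolation beyond the observed range is really wanted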



What I have found
http://wlm.userweb.mwn.de/Stata/wstamiss.htm
http://www.stata.com/manuals13/mimii...pdf#mimiimpute
http://www.stata.com/manuals13/mimixeq.pdf
http://stats.idre.ucla.edu/stata/sem...stata_pt1_new/
http://www.stata.com/support/faqs/da...issing-values/

continuous variable in logistic regression

Hi Statalist.
I'm running a mixed-effects logistic regression (melogit) with a continuous predictor (which is a volume).

Using the untransformed variable, the OR I get is the OR for a one-unit increase in this volume. If I want a 10-unit increase, I include variable/10 in my model.
The fact is that I need to log-transform this variable.
So my question has two aspects:
- In Stata, how can I get the OR at different points of this continuous variable? I know the xblc command, but it seems to allow only "real" values of the variable.
- How can I combine rescaling and transforming the variable? My understanding is that I should rescale (i.e., x/10) before transforming. But how can I back-transform then?
I'm using Stata 14.1.
Thanks
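For what it's worth, a minimal sketch (variable names vol, y, x2, and id are mine): after log-transforming, the OR for a k-fold increase in volume is exp(b*ln k), so dividing by 10 before taking logs is not needed; lincom can exponentiate it directly.

Code:
gen double logvol = ln(vol)
melogit y c.logvol x2 || id:
* OR for a 10-fold increase in volume; 2.302585 = ln(10)
lincom 2.302585*logvol, or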
