
stcrreg: proportionality assumption

Hi all,
I have a question regarding competing-risks regression (stcrreg). Like the Cox model, stcrreg relies on a proportionality assumption. The Stata Survival Analysis Reference Manual describes how to detect a violation of the assumption (using tvc()) but does not discuss how to correct one. I therefore assumed that, as with the Cox model, I should include an interaction between the offending variable and log(_t). However, this may not be the right way to correct the violation: once I include such interaction terms, the stcrreg estimation sometimes runs forever. So my question is: when estimating stcrreg, how should a violation of the proportionality assumption be corrected? Thanks in advance.
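For reference, a minimal sketch of the detection step the manual describes, with hypothetical variable names (x1, x2) and failure codes:

Code:
stset time, failure(status == 1)
stcrreg x1 x2, compete(status == 2) tvc(x2) texp(ln(_t))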

Heteroskedasticity and finding the number of deals over a time period - panel data

Dear Forum,

We are struggling with the following issue: we have panel data on acquirers and deal announcement dates, and we need to create subsamples in Stata of occasional (>= 2 deals over a 3-year period) and frequent (>= 5 deals over a 3-year period) acquirers. We are not sure which approach to take or which code to implement.

Moreover, we are also having trouble checking for heteroskedasticity in our xtreg with year fixed effects.

We also run into collinearity issues, as we have quite a few dummies (five, including interaction effects between dummies).

Any help/advice?

P.S. We are trying to replicate the paper "Extraordinary acquirers" by Golubov et al. (2015).
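A possible starting point for the deal counts (a sketch, assuming one observation per announced deal, with acquirer_id and deal_year as hypothetical variable names; rangestat is from SSC):

Code:
ssc install rangestat
* count deals in the 3-year window ending in the current year
rangestat (count) n_deals = deal_year, interval(deal_year -2 0) by(acquirer_id)
gen byte occasional = n_deals >= 2
gen byte frequent   = n_deals >= 5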

Pseudo panel models and the measurement error problem ?

I am currently working on research that uses 10 repeated cross-sectional waves. I have panelized the data by constructing cohorts (state of residency by gender), calculating their group means in each available cross-section, and treating these cohort averages as observations in the panel. I am trying to address the measurement-error problem that arises when calculating the efficient Wald estimator, adopting the mechanics of Devereux (2007) and Verbeek (2008), with Stata as the statistical software. Are there any guidelines or code for the aforementioned models?
It is important to note that FE/RE models would still carry the measurement-error problem, which is why I have adopted the above authors' approaches as errors-in-variables estimators. I am not quite sure whether I have adopted the correct methodology. Please advise me on the model selection and the suitable Stata code to use.

Gravity models, ppmlhdfe and different sets of HDFEs

To whom it may concern,

I have a panel dataset of exports between country pairs at the industrial-sector level of disaggregation.
On it, I am performing two different gravity estimations using ppmlhdfe in Stata 16:
Code:
ppmlhdfe export c.DWP_jt##b24.sector1, absorb(hcountry_year pcountry_year sector_year, savefe) vce(cluster hcountry)

ppmlhdfe export c.DWP_jt##b24.sector1, absorb(hcountry_pcountry hcountry_year pcountry_year sector_year, savefe) vce(cluster hcountry)
The variable DWP_jt varies at the pcountry*time level (pcountry is the importer, in my case).
The two specifications differ because the second adds one more fixed effect in absorb(): the "individual" flow FE hcountry_pcountry (exporter-importer pair).
However, the results (and the linear predictions, xb) of the two models are exactly the same.
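For reference, a quick way to confirm that two fits coincide coefficient by coefficient is to store and tabulate them (a minimal sketch; spec1 and spec2 are hypothetical names):

Code:
ppmlhdfe export c.DWP_jt##b24.sector1, absorb(hcountry_year pcountry_year sector_year) vce(cluster hcountry)
estimates store spec1
ppmlhdfe export c.DWP_jt##b24.sector1, absorb(hcountry_pcountry hcountry_year pcountry_year sector_year) vce(cluster hcountry)
estimates store spec2
estimates table spec1 spec2, b se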

For the first specification, the output is:
Code:
ppmlhdfe export c.DWP_jt##b24.sector1, absorb(hcountry_year pcountry_year sector_year, savefe) vce(cluster hcountry)
note: 24 variables omitted because of collinearity: DWP_jt 1bn.sector1 2bn.sector1 3bn.sector1 4bn.sector1 5bn.sector1 6bn.sector1 7bn.sector1 8bn.sector1 9bn.sector1 10bn.sector1 11bn.sector1 12bn.sector1 13bn.sector1 14bn.sector1 15bn.sector1 16bn.sector1 17bn.sector1 18bn.sector1 19bn.sector1 20bn.sector1 21bn.sector1 22bn.sector1 23bn.sector1
Iteration 1:   deviance = 3.3655e+05  eps = .         iters = 6    tol = 1.0e-04  min(eta) =  -8.33  P  
Iteration 2:   deviance = 1.9181e+05  eps = 7.55e-01  iters = 4    tol = 1.0e-04  min(eta) =  -9.41      
Iteration 3:   deviance = 1.5619e+05  eps = 2.28e-01  iters = 3    tol = 1.0e-04  min(eta) = -10.33      
Iteration 4:   deviance = 1.4721e+05  eps = 6.10e-02  iters = 3    tol = 1.0e-04  min(eta) = -12.86      
Iteration 5:   deviance = 1.4501e+05  eps = 1.52e-02  iters = 3    tol = 1.0e-04  min(eta) = -15.34      
Iteration 6:   deviance = 1.4450e+05  eps = 3.52e-03  iters = 3    tol = 1.0e-04  min(eta) = -17.49      
Iteration 7:   deviance = 1.4438e+05  eps = 8.18e-04  iters = 2    tol = 1.0e-04  min(eta) = -19.45      
Iteration 8:   deviance = 1.4436e+05  eps = 1.82e-04  iters = 2    tol = 1.0e-04  min(eta) = -21.32      
Iteration 9:   deviance = 1.4435e+05  eps = 3.66e-05  iters = 2    tol = 1.0e-04  min(eta) = -23.09      
Iteration 10:  deviance = 1.4435e+05  eps = 6.55e-06  iters = 2    tol = 1.0e-05  min(eta) = -24.98      
Iteration 11:  deviance = 1.4435e+05  eps = 1.02e-06  iters = 2    tol = 1.0e-06  min(eta) = -26.86   S  
Iteration 12:  deviance = 1.4435e+05  eps = 1.34e-07  iters = 2    tol = 1.0e-06  min(eta) = -28.54   S  
Iteration 13:  deviance = 1.4435e+05  eps = 1.38e-08  iters = 2    tol = 1.0e-07  min(eta) = -29.86   S  
Iteration 14:  deviance = 1.4435e+05  eps = 1.03e-09  iters = 2    tol = 1.0e-09  min(eta) = -30.57   S O
------------------------------------------------------------------------------------------------------------
(legend: p: exact partial-out   s: exact solver   h: step-halving   o: epsilon below tolerance)
Converged in 14 iterations and 38 HDFE sub-iterations (tol = 1.0e-08)

HDFE PPML regression                              No. of obs      =   13593600
Absorbing 3 HDFE groups                           Residual df     =        209
Statistics robust to heteroskedasticity           Wald chi2(23)   =     262.89
Deviance             =  144348.8241               Prob > chi2     =     0.0000
Log pseudolikelihood =  -181068.746               Pseudo R2       =     0.6349

Number of clusters (hcountry)=       210
                                 (Std. Err. adjusted for 210 clusters in hcountry)
----------------------------------------------------------------------------------
                 |               Robust
          export |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
          DWP_jt |          0  (omitted)
                 |
         sector1 |
             10  |          0  (omitted)
             11  |          0  (omitted)
             12  |          0  (omitted)
             13  |          0  (omitted)
             14  |          0  (omitted)
             15  |          0  (omitted)
             16  |          0  (omitted)
             17  |          0  (omitted)
             18  |          0  (omitted)
             19  |          0  (omitted)
             20  |          0  (omitted)
             21  |          0  (omitted)
             22  |          0  (omitted)
             23  |          0  (omitted)
             24  |          0  (omitted)
             25  |          0  (omitted)
             26  |          0  (omitted)
             27  |          0  (omitted)
             28  |          0  (omitted)
             29  |          0  (omitted)
             30  |          0  (omitted)
             31  |          0  (omitted)
             32  |          0  (omitted)
                 |
sector1#c.DWP_jt |
             10  |   -.007955   .0024418    -3.26   0.001    -.0127409   -.0031691
             11  |   -.004336   .0035532    -1.22   0.222    -.0113001    .0026281
             12  |    -.00851   .0074887    -1.14   0.256    -.0231877    .0061676
             13  |  -.0072332   .0040025    -1.81   0.071     -.015078    .0006116
             14  |   .0017449   .0025176     0.69   0.488    -.0031895    .0066793
             15  |   .0059696   .0037561     1.59   0.112    -.0013923    .0133314
             16  |   .0008984   .0055685     0.16   0.872    -.0100156    .0118124
             17  |  -.0016787   .0047734    -0.35   0.725    -.0110345     .007677
             18  |   -.016464   .0045652    -3.61   0.000    -.0254116   -.0075165
             19  |  -.0013528   .0029942    -0.45   0.651    -.0072213    .0045157
             20  |  -.0003031   .0016259    -0.19   0.852    -.0034899    .0028837
             21  |  -.0080486   .0025083    -3.21   0.001    -.0129647   -.0031325
             22  |  -.0056654   .0020844    -2.72   0.007    -.0097507   -.0015801
             23  |  -.0060115   .0016355    -3.68   0.000    -.0092171    -.002806
             24  |  -.0006525   .0018593    -0.35   0.726    -.0042967    .0029917
             25  |    -.00685   .0017327    -3.95   0.000     -.010246    -.003454
             26  |   .0037718   .0011331     3.33   0.001      .001551    .0059926
             27  |   .0021261   .0022674     0.94   0.348    -.0023178    .0065701
             28  |  -.0002189    .001518    -0.14   0.885    -.0031941    .0027562
             29  |    .001217   .0031422     0.39   0.699    -.0049416    .0073755
             30  |   -.019625    .005028    -3.90   0.000    -.0294797   -.0097703
             31  |  -.0007644   .0043387    -0.18   0.860    -.0092681    .0077393
             32  |  -.0005853   .0016847    -0.35   0.728    -.0038872    .0027166
                 |
           _cons |  -1.026944   .0195372   -52.56   0.000    -1.065236   -.9886519
----------------------------------------------------------------------------------

Absorbed degrees of freedom:
-------------------------------------------------------+
   Absorbed FE | Categories  - Redundant  = Num. Coefs |
---------------+---------------------------------------|
 hcountry_year |      2720        2720           0    *|
 pcountry_year |      2720           0        2720     |
   sector_year |       312          13         299     |
-------------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation
For the second specification, instead:

Code:
ppmlhdfe export c.DWP_jt##b24.sector1, absorb(hcountry_pcountry hcountry_year pcountry_year sector_year, savefe) vce(cluster hcountry)
(dropped 3442104 observations that are either singletons or separated by a fixed effect)
note: 24 variables omitted because of collinearity: DWP_jt 1bn.sector1 2bn.sector1 3bn.sector1 4bn.sector1 5bn.sector1 6bn.sector1 7bn.sector1 8bn.sector1 9bn.sector1 10bn.sector1 11bn.sector1 12bn.sector1 13bn.sector1 14bn.sector1 15bn.sector1 16bn.sector1 17bn.sector1 18bn.sector1 19bn.sector1 20bn.sector1 21bn.sector1 22bn.sector1 23bn.sector1
Iteration 1:   deviance = 1.8677e+05  eps = .         iters = 6    tol = 1.0e-04  min(eta) =  -7.77  P  
Iteration 2:   deviance = 1.0176e+05  eps = 8.35e-01  iters = 4    tol = 1.0e-04  min(eta) =  -9.14      
Iteration 3:   deviance = 8.2568e+04  eps = 2.32e-01  iters = 4    tol = 1.0e-04  min(eta) = -10.82      
Iteration 4:   deviance = 7.8456e+04  eps = 5.24e-02  iters = 4    tol = 1.0e-04  min(eta) = -12.47      
Iteration 5:   deviance = 7.7570e+04  eps = 1.14e-02  iters = 3    tol = 1.0e-04  min(eta) = -15.00      
Iteration 6:   deviance = 7.7355e+04  eps = 2.77e-03  iters = 3    tol = 1.0e-04  min(eta) = -16.84      
Iteration 7:   deviance = 7.7293e+04  eps = 8.04e-04  iters = 3    tol = 1.0e-04  min(eta) = -18.26      
Iteration 8:   deviance = 7.7275e+04  eps = 2.39e-04  iters = 2    tol = 1.0e-04  min(eta) = -19.88      
Iteration 9:   deviance = 7.7270e+04  eps = 6.76e-05  iters = 2    tol = 1.0e-04  min(eta) = -21.54      
Iteration 10:  deviance = 7.7268e+04  eps = 1.78e-05  iters = 2    tol = 1.0e-05  min(eta) = -22.80      
Iteration 11:  deviance = 7.7268e+04  eps = 4.32e-06  iters = 2    tol = 1.0e-05  min(eta) = -24.12   S  
Iteration 12:  deviance = 7.7268e+04  eps = 9.37e-07  iters = 2    tol = 1.0e-06  min(eta) = -25.25   S  
Iteration 13:  deviance = 7.7268e+04  eps = 1.74e-07  iters = 2    tol = 1.0e-07  min(eta) = -26.42   S  
Iteration 14:  deviance = 7.7268e+04  eps = 2.53e-08  iters = 2    tol = 1.0e-07  min(eta) = -27.16   S  
Iteration 15:  deviance = 7.7268e+04  eps = 2.23e-09  iters = 2    tol = 1.0e-09  min(eta) = -27.55   S O
------------------------------------------------------------------------------------------------------------
(legend: p: exact partial-out   s: exact solver   h: step-halving   o: epsilon below tolerance)
Converged in 15 iterations and 43 HDFE sub-iterations (tol = 1.0e-08)

HDFE PPML regression                              No. of obs      =   10151496
Absorbing 4 HDFE groups                           Residual df     =        209
Statistics robust to heteroskedasticity           Wald chi2(23)   =     262.89
Deviance             =  77267.78601               Prob > chi2     =     0.0000
Log pseudolikelihood =  -147528.227               Pseudo R2       =     0.6886

Number of clusters (hcountry)=       210
                                 (Std. Err. adjusted for 210 clusters in hcountry)
----------------------------------------------------------------------------------
                 |               Robust
          export |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
          DWP_jt |          0  (omitted)
                 |
         sector1 |
             10  |          0  (omitted)
             11  |          0  (omitted)
             12  |          0  (omitted)
             13  |          0  (omitted)
             14  |          0  (omitted)
             15  |          0  (omitted)
             16  |          0  (omitted)
             17  |          0  (omitted)
             18  |          0  (omitted)
             19  |          0  (omitted)
             20  |          0  (omitted)
             21  |          0  (omitted)
             22  |          0  (omitted)
             23  |          0  (omitted)
             24  |          0  (omitted)
             25  |          0  (omitted)
             26  |          0  (omitted)
             27  |          0  (omitted)
             28  |          0  (omitted)
             29  |          0  (omitted)
             30  |          0  (omitted)
             31  |          0  (omitted)
             32  |          0  (omitted)
                 |
sector1#c.DWP_jt |
             10  |   -.007955   .0024418    -3.26   0.001    -.0127409   -.0031691
             11  |   -.004336   .0035532    -1.22   0.222    -.0113001    .0026281
             12  |    -.00851   .0074887    -1.14   0.256    -.0231877    .0061676
             13  |  -.0072332   .0040025    -1.81   0.071     -.015078    .0006116
             14  |   .0017449   .0025176     0.69   0.488    -.0031895    .0066793
             15  |   .0059696   .0037561     1.59   0.112    -.0013923    .0133314
             16  |   .0008984   .0055685     0.16   0.872    -.0100156    .0118124
             17  |  -.0016787   .0047734    -0.35   0.725    -.0110345     .007677
             18  |   -.016464   .0045652    -3.61   0.000    -.0254116   -.0075165
             19  |  -.0013528   .0029942    -0.45   0.651    -.0072213    .0045157
             20  |  -.0003031   .0016259    -0.19   0.852    -.0034899    .0028837
             21  |  -.0080486   .0025083    -3.21   0.001    -.0129647   -.0031325
             22  |  -.0056654   .0020844    -2.72   0.007    -.0097507   -.0015801
             23  |  -.0060115   .0016355    -3.68   0.000    -.0092171    -.002806
             24  |  -.0006525   .0018593    -0.35   0.726    -.0042967    .0029917
             25  |    -.00685   .0017327    -3.95   0.000     -.010246    -.003454
             26  |   .0037718   .0011331     3.33   0.001      .001551    .0059926
             27  |   .0021261   .0022674     0.94   0.348    -.0023178    .0065701
             28  |  -.0002189    .001518    -0.14   0.885    -.0031941    .0027562
             29  |    .001217   .0031422     0.39   0.699    -.0049416    .0073755
             30  |   -.019625    .005028    -3.90   0.000    -.0294797   -.0097703
             31  |  -.0007644   .0043387    -0.18   0.860    -.0092681    .0077393
             32  |  -.0005853   .0016847    -0.35   0.728    -.0038872    .0027166
                 |
           _cons |  -.5855637   .0195372   -29.97   0.000    -.6238559   -.5472715
----------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------------+
       Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------------+---------------------------------------|
 hcountry_pcountry |     32653       32653           0    *|
     hcountry_year |      2720        2720           0    *|
     pcountry_year |      2720           0        2720     |
       sector_year |       312          13         299     |
-----------------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation
Could someone explain why the two regressions give me exactly the same results?

Thank you very much,
Davide

Substring matching function?

Is there a function that can look for a string within a larger string variable, and generate a 1 if it finds a match and 0 if it does not?
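A minimal sketch using strpos(), which returns the position of the first occurrence (0 if none); textvar and "target" are hypothetical names:

Code:
gen byte found = strpos(textvar, "target") > 0
* for pattern matching rather than literal substrings, see regexm() and strmatch()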

Thanks

Paul

Encoding string variable

Dear Statalist members,

I have a problem and hope you can help me.

I have a variable giving the name of the hospital, which is of course a string. I want it in numeric form, which I can get with encode; that is fine. The problem is that this is longitudinal data with several repeated measurements, so the same hospital name appears several times in the dataset. Is there a command that encodes this variable while assigning the same number to every occurrence? For instance, if the first occurrence of a hospital is coded 23, I want its second, third, ..., tenth occurrences to be 23 as well. Am I clear enough?
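For what it is worth, encode already works this way within a single dataset: identical strings always receive the same numeric code. A minimal sketch (hospital is a hypothetical variable name):

Code:
encode hospital, gen(hospital_id)
* to keep codes consistent across several datasets, attach a shared value label:
* encode hospital, gen(hospital_id) label(hosplbl)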

Generating new column given ONE time observation for every ID

Dear everyone,
ID   time         Revenue   Revenue_at_t2
1    01/01/2000   5         6
1    02/01/2000   6         6
1    03/01/2000   7         6
2    01/01/2000   5         8
2    02/01/2000   8         8
2    03/01/2000   7         8

I have a question about generating a variable while using panel data. To put it more specifically, I want Revenue_at_t2 to hold, for every ID, the revenue observed at one specific time (say 02/01/2000).

Informally, the code should look like: for each ID, gen Revenue_at_t2 = Revenue at time == 02/01/2000.

However, I don't want to reshape my whole dataset. I looked into the commands reshape and egen, but I can't find the right code. Can someone help me with this?
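A minimal sketch using egen with cond() (assuming time is a Stata daily date and the target date is 1 February 2000; adjust the td() literal to your date convention):

Code:
egen Revenue_at_t2 = max(cond(time == td(01feb2000), Revenue, .)), by(ID)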

Thank you very much in advance!

Rename observation help

Hi,

I created my own age-at-marriage variable using year of birth, month of birth, year of marriage, and month of marriage.
However, observations where year or month of marriage is recorded as "don't know" are distorting the mean/max/min values, because the response is stored as a numeric code around 8000, so I need to get rid of these qualitative observations.
How can I replace all of the "don't know" observations with either 0 or . ?
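A minimal sketch (assuming the "don't know" code is the numeric value 8000; check the actual code with tabulate first. The variable names are hypothetical, and missing (.) is preferable to 0 so that summaries skip these cases):

Code:
mvdecode marriage_year marriage_month, mv(8000)
* equivalently, one variable at a time:
* replace marriage_year = . if marriage_year == 8000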

Plotting ORs/HRs on the y-axis against a continuous variable on the x-axis

Hello,

I am using Stata 14.1 and am having trouble creating a certain type of figure; I have not been able to work it out in Stata.

I want to display ORs from a logistic regression model on the y-axis against a continuous variable on the x-axis, separately by a secondary dichotomous variable.

For example, I want to display the ORs from logistic regression models of the relationship between the development of tuberculosis and BCG vaccination, over a third variable that is continuous, say age. Age would be on the x-axis and ORs on the y-axis. The reference would be "No vaccination", and the ORs would show the increased odds of tuberculosis among BCG-vaccinated individuals across age. I understand this would need to be done with some sort of smoothed curve.

To give you an example of what I am trying to create, see below:

[Figure omitted from the original post]

From the article "Strength of the Association of Elevated Vitamin B12 and Solid Cancers: An Adjusted Case-Control Study", J. Clin. Med. 2020, 9, 474.

Another example is:

[Figure omitted from the original post]

From the article: Cai, X., Li, X., Tang, M., Liang, C., Xu, Y., Zhang, M., Yu, W. and Li, X., 2019. Dietary carbohydrate intake, glycaemic index, glycaemic load and digestive system cancers: an updated dose-response meta-analysis. British Journal of Nutrition, 121(10), pp. 1081-1096.




I'm able to show multiple odds ratios for differing age groups using the coefplot command (from SSC); here are my code and the resulting graph.

Code:
* Run a logistic model for each age group and save the estimates
melogit death_merg bcg_merg age_merg sex_cont_merg prosp_retro_merg if all_dis_merg<5 & age_merg<15 & age_merg>=10 || study_new:, or
estimates store death_bcg_3
melogit death_merg bcg_merg age_merg sex_cont_merg prosp_retro_merg if all_dis_merg<5 & age_merg<10 & age_merg>=5 || study_new:, or
estimates store death_bcg_2
melogit death_merg bcg_merg age_merg sex_cont_merg prosp_retro_merg if all_dis_merg<5 & age_merg<5 || study_new:, or
estimates store death_bcg_1
melogit death_merg bcg_merg age_merg sex_cont_merg prosp_retro_merg if all_dis_merg<5 & age_merg<150 & age_merg>=15 || study_new:, or
estimates store death_bcg_4
melogit death_merg bcg_merg age_merg sex_cont_merg prosp_retro_merg if all_dis_merg<5 || study_new:, or
estimates store death_bcg_5

* coefplot based on the saved estimates
coefplot ///
    (death_bcg_5, label("All Participants") mlabels(bcg_merged=1 "N=18175") mlabsize(small) mlabcolor(black) lpatt(solid) lcol(black) msym(d) mcol(black) ciopts(recast(. rcap) lcol(black)) drop(_cons age_merged sex_cont_merged prosp_retro_merged) xlabel(0.2 "0.2" 0.5 "0.5" 1 "1.0" 2 "2.0" 3.0 "3.0")) ///
    (death_bcg_1, label("<5 years old") mlabels(bcg_merged=1 "N=4567") mlabsize(small) mlabcolor(black) lpatt(solid) lcol(brown) msym(d) mcol(brown) ciopts(recast(. rcap) lcol(brown)) drop(_cons age_merged sex_cont_merged prosp_retro_merged) xlabel(0.2 "0.2" 0.5 "0.5" 1 "1.0" 2 "2.0" 3.0 "3.0")) ///
    (death_bcg_2, label("5-9 years old") mlabels(bcg_merged=1 "N=6139") mlabsize(small) mlabcolor(black) lpatt(solid) lcol(maroon) msym(d) mcol(maroon) ciopts(recast(. rcap) lcol(maroon)) drop(_cons age_merged sex_cont_merged prosp_retro_merged) xlabel(0.2 "0.2" 0.5 "0.5" 1 "1.0" 2 "2.0" 3.0 "3.0")) ///
    (death_bcg_3, label("10-14 years old") mlabels(bcg_merged=1 "N=4632") mlabsize(small) mlabcolor(black) xlabel(0.2 "0.2" 0.5 "0.5" 1 "1.0" 2 "2.0" 3.0 "3.0") lpatt(solid) lcol(dkgreen) msym(d) mcol(dkgreen) ciopts(recast(. rcap) lcol(dkgreen)) drop(_cons age_merged sex_cont_merged prosp_retro_merged)) ///
    (death_bcg_4, label("≥15 years old") mlabels(bcg_merged=1 "N=2837") mlabsize(small) mlabcolor(black) xlabel(0.2 "0.2" 0.5 "0.5" 1 "1.0" 2 "2.0" 3.0 "3.0") lpatt(solid) lcol(navy) msym(d) mcol(navy) ciopts(recast(. rcap) lcol(navy)) drop(_cons age_merged sex_cont_merged prosp_retro_merged)) ///
    , eform xline(1, lcolor(black) lwidth(thin) lpattern(dash)) xtitle(Odds Ratio) levels(95) msymbol(d) mfcolor(white) ciopts(recast(. rcap)) legend(rows(8) ring(0) pos(2) col(1) size(3.5) region(fcolor(gs15))) graphregion(fcolor(white))

Figure. Risk of Death Among BCG-Vaccinated and Unvaccinated Children, Stratified by Age Group

[Figure omitted from the original post]

But I would like to create a graph that treats age as a continuous variable and, if possible, smooths the relative-risk estimates over it. I have been unable to find any leads on how to do this so far.
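One possible approach (a sketch only, not the exact models above: it assumes a linear age interaction, which splines could replace, and it reuses the post's variable names):

Code:
* model the BCG effect as a function of age via an interaction
melogit death_merg i.bcg_merg##c.age_merg sex_cont_merg prosp_retro_merg if all_dis_merg<5 || study_new:, or
* OR for BCG at a given age = exp(b_bcg + age * b_interaction)
gen or_bcg = exp(_b[1.bcg_merg] + _b[1.bcg_merg#c.age_merg]*age_merg)
twoway line or_bcg age_merg, sort yline(1, lpattern(dash)) ytitle("Odds ratio (BCG vs none)") xtitle("Age (years)")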

I'd appreciate any help you can offer.

Best
Leo

Transform panel data with all vars in a column to individual variables

Hello! I downloaded a panel database from the FAO, but it has all variables stacked in a single column (value) instead of an individual column for each variable. I need to reorganize it for the usual things in my research: regressions, descriptive statistics, etc. I would really appreciate any advice on how to do this (a reshape sketch follows the data excerpt below).

My data looks like this:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long commodity_code str28 commodity_description str30 country_name int market_year byte month int attribute_id str30 attribute_description double value
577400 "Almonds, Shelled Basis" "Afghanistan" 2010 10  20 "Beginning Stocks"        0
577400 "Almonds, Shelled Basis" "Afghanistan" 2010 10 125 "Domestic Consumption"    0
577400 "Almonds, Shelled Basis" "Afghanistan" 2010 10 176 "Ending Stocks"           0
577400 "Almonds, Shelled Basis" "Afghanistan" 2010 10  88 "Exports"                 0
577400 "Almonds, Shelled Basis" "Afghanistan" 2010 10  57 "Imports"                 0
577400 "Almonds, Shelled Basis" "Afghanistan" 2010 10  28 "Production"              0
577400 "Almonds, Shelled Basis" "Afghanistan" 2010 10 178 "Total Distribution"      0
577400 "Almonds, Shelled Basis" "Afghanistan" 2010 10  86 "Total Supply"            0
577400 "Almonds, Shelled Basis" "Algeria"     2001 10  20 "Beginning Stocks"        0
577400 "Almonds, Shelled Basis" "Algeria"     2001 10 125 "Domestic Consumption"  300
577400 "Almonds, Shelled Basis" "Algeria"     2001 10 176 "Ending Stocks"           0
577400 "Almonds, Shelled Basis" "Algeria"     2001 10  88 "Exports"                 0
577400 "Almonds, Shelled Basis" "Algeria"     2001 10  57 "Imports"               300
577400 "Almonds, Shelled Basis" "Algeria"     2001 10  28 "Production"              0
577400 "Almonds, Shelled Basis" "Algeria"     2001 10 178 "Total Distribution"    300
577400 "Almonds, Shelled Basis" "Algeria"     2001 10  86 "Total Supply"          300
577400 "Almonds, Shelled Basis" "Algeria"     2002 10  20 "Beginning Stocks"        0
577400 "Almonds, Shelled Basis" "Algeria"     2002 10 125 "Domestic Consumption" 1200
577400 "Almonds, Shelled Basis" "Algeria"     2002 10 176 "Ending Stocks"           0
577400 "Almonds, Shelled Basis" "Algeria"     2002 10  88 "Exports"                 0
577400 "Almonds, Shelled Basis" "Algeria"     2002 10  57 "Imports"              1200
577400 "Almonds, Shelled Basis" "Algeria"     2002 10  28 "Production"              0
577400 "Almonds, Shelled Basis" "Algeria"     2002 10 178 "Total Distribution"   1200
577400 "Almonds, Shelled Basis" "Algeria"     2002 10  86 "Total Supply"         1200
577400 "Almonds, Shelled Basis" "Algeria"     2003 10  20 "Beginning Stocks"        0
577400 "Almonds, Shelled Basis" "Algeria"     2003 10 125 "Domestic Consumption" 1600
577400 "Almonds, Shelled Basis" "Algeria"     2003 10 176 "Ending Stocks"           0
577400 "Almonds, Shelled Basis" "Algeria"     2003 10  88 "Exports"                 0
577400 "Almonds, Shelled Basis" "Algeria"     2003 10  57 "Imports"              1600
577400 "Almonds, Shelled Basis" "Algeria"     2003 10  28 "Production"              0
577400 "Almonds, Shelled Basis" "Algeria"     2003 10 178 "Total Distribution"   1600
577400 "Almonds, Shelled Basis" "Algeria"     2003 10  86 "Total Supply"         1600
577400 "Almonds, Shelled Basis" "Algeria"     2004  8  20 "Beginning Stocks"        0
577400 "Almonds, Shelled Basis" "Algeria"     2004  8 125 "Domestic Consumption"  800
577400 "Almonds, Shelled Basis" "Algeria"     2004  8 176 "Ending Stocks"           0
577400 "Almonds, Shelled Basis" "Algeria"     2004  8  88 "Exports"                 0
577400 "Almonds, Shelled Basis" "Algeria"     2004  8  57 "Imports"               800
577400 "Almonds, Shelled Basis" "Algeria"     2004  8  28 "Production"              0
577400 "Almonds, Shelled Basis" "Algeria"     2004  8 178 "Total Distribution"    800
577400 "Almonds, Shelled Basis" "Algeria"     2004  8  86 "Total Supply"          800
577400 "Almonds, Shelled Basis" "Algeria"     2005  8  20 "Beginning Stocks"        0
577400 "Almonds, Shelled Basis" "Algeria"     2005  8 125 "Domestic Consumption" 1600
577400 "Almonds, Shelled Basis" "Algeria"     2005  8 176 "Ending Stocks"           0
577400 "Almonds, Shelled Basis" "Algeria"     2005  8  88 "Exports"                 0
577400 "Almonds, Shelled Basis" "Algeria"     2005  8  57 "Imports"              1600
577400 "Almonds, Shelled Basis" "Algeria"     2005  8  28 "Production"              0
577400 "Almonds, Shelled Basis" "Algeria"     2005  8 178 "Total Distribution"   1600
577400 "Almonds, Shelled Basis" "Algeria"     2005  8  86 "Total Supply"         1600
577400 "Almonds, Shelled Basis" "Algeria"     2006 10  20 "Beginning Stocks"        0
577400 "Almonds, Shelled Basis" "Algeria"     2006 10 125 "Domestic Consumption" 2200
577400 "Almonds, Shelled Basis" "Algeria"     2006 10 176 "Ending Stocks"           0
577400 "Almonds, Shelled Basis" "Algeria"     2006 10  88 "Exports"                 0
577400 "Almonds, Shelled Basis" "Algeria"     2006 10  57 "Imports"              2200
577400 "Almonds, Shelled Basis" "Algeria"     2006 10  28 "Production"              0
577400 "Almonds, Shelled Basis" "Algeria"     2006 10 178 "Total Distribution"   2200
577400 "Almonds, Shelled Basis" "Algeria"     2006 10  86 "Total Supply"         2200
577400 "Almonds, Shelled Basis" "Algeria"     2007 10  20 "Beginning Stocks"        0
577400 "Almonds, Shelled Basis" "Algeria"     2007 10 125 "Domestic Consumption" 2800
577400 "Almonds, Shelled Basis" "Algeria"     2007 10 176 "Ending Stocks"           0
577400 "Almonds, Shelled Basis" "Algeria"     2007 10  88 "Exports"                 0
577400 "Almonds, Shelled Basis" "Algeria"     2007 10  57 "Imports"              2800
577400 "Almonds, Shelled Basis" "Algeria"     2007 10  28 "Production"              0
577400 "Almonds, Shelled Basis" "Algeria"     2007 10 178 "Total Distribution"   2800
577400 "Almonds, Shelled Basis" "Algeria"     2007 10  86 "Total Supply"         2800
577400 "Almonds, Shelled Basis" "Algeria"     2008 10  20 "Beginning Stocks"        0
577400 "Almonds, Shelled Basis" "Algeria"     2008 10 125 "Domestic Consumption" 7600
577400 "Almonds, Shelled Basis" "Algeria"     2008 10 176 "Ending Stocks"           0
577400 "Almonds, Shelled Basis" "Algeria"     2008 10  88 "Exports"                 0
577400 "Almonds, Shelled Basis" "Algeria"     2008 10  57 "Imports"              7600
577400 "Almonds, Shelled Basis" "Algeria"     2008 10  28 "Production"              0
577400 "Almonds, Shelled Basis" "Algeria"     2008 10 178 "Total Distribution"   7600
577400 "Almonds, Shelled Basis" "Algeria"     2008 10  86 "Total Supply"         7600
577400 "Almonds, Shelled Basis" "Algeria"     2009 10  20 "Beginning Stocks"        0
577400 "Almonds, Shelled Basis" "Algeria"     2009 10 125 "Domestic Consumption" 3800
577400 "Almonds, Shelled Basis" "Algeria"     2009 10 176 "Ending Stocks"           0
577400 "Almonds, Shelled Basis" "Algeria"     2009 10  88 "Exports"                 0
577400 "Almonds, Shelled Basis" "Algeria"     2009 10  57 "Imports"              3800
577400 "Almonds, Shelled Basis" "Algeria"     2009 10  28 "Production"              0
577400 "Almonds, Shelled Basis" "Algeria"     2009 10 178 "Total Distribution"   3800
577400 "Almonds, Shelled Basis" "Algeria"     2009 10  86 "Total Supply"         3800
577400 "Almonds, Shelled Basis" "Algeria"     2010 10  20 "Beginning Stocks"        0
577400 "Almonds, Shelled Basis" "Algeria"     2010 10 125 "Domestic Consumption" 6600
577400 "Almonds, Shelled Basis" "Algeria"     2010 10 176 "Ending Stocks"           0
577400 "Almonds, Shelled Basis" "Algeria"     2010 10  88 "Exports"                 0
577400 "Almonds, Shelled Basis" "Algeria"     2010 10  57 "Imports"              6600
577400 "Almonds, Shelled Basis" "Algeria"     2010 10  28 "Production"              0
577400 "Almonds, Shelled Basis" "Algeria"     2010 10 178 "Total Distribution"   6600
577400 "Almonds, Shelled Basis" "Algeria"     2010 10  86 "Total Supply"         6600
577400 "Almonds, Shelled Basis" "Algeria"     2011 10  20 "Beginning Stocks"        0
577400 "Almonds, Shelled Basis" "Algeria"     2011 10 125 "Domestic Consumption" 7500
577400 "Almonds, Shelled Basis" "Algeria"     2011 10 176 "Ending Stocks"           0
577400 "Almonds, Shelled Basis" "Algeria"     2011 10  88 "Exports"                 0
577400 "Almonds, Shelled Basis" "Algeria"     2011 10  57 "Imports"              7500
577400 "Almonds, Shelled Basis" "Algeria"     2011 10  28 "Production"              0
577400 "Almonds, Shelled Basis" "Algeria"     2011 10 178 "Total Distribution"   7500
577400 "Almonds, Shelled Basis" "Algeria"     2011 10  86 "Total Supply"         7500
577400 "Almonds, Shelled Basis" "Algeria"     2012 10  20 "Beginning Stocks"        0
577400 "Almonds, Shelled Basis" "Algeria"     2012 10 125 "Domestic Consumption" 6700
577400 "Almonds, Shelled Basis" "Algeria"     2012 10 176 "Ending Stocks"           0
577400 "Almonds, Shelled Basis" "Algeria"     2012 10  88 "Exports"                 0
end
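A minimal reshape sketch (assuming commodity, country, year, and month uniquely identify rows within each attribute; attribute_id becomes the variable suffix, and the value### names are then replaced with readable ones):

Code:
keep commodity_code country_name market_year month attribute_id value
reshape wide value, i(commodity_code country_name market_year month) j(attribute_id)
rename (value20 value125 value176 value88 value57 value28 value178 value86) ///
       (beg_stocks dom_consumption end_stocks exports imports production total_distribution total_supply)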
Many thanks!

"inv(): matrix has missing values r(504);" error

Hello,
I am doing a network meta-analysis.
When I execute netweight to produce the contribution plot for the direct and indirect evidence, I get the error "inv(): matrix has missing values r(504)".

The code I run is:
Code:
netweight ES seES t1 t2, aspect(0.9) notable
The netweight command is part of the mvmeta and network packages:
Code:
ssc install mvmeta
ssc install network
My data look like this:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
* dataex id t1 t2 ES seES year rob _BD1 _BD2
input float(id t1 t2 ES seES) int year str7 rob float(_BD1 _BD2)
 1 3 1   -.450053   .2575157 1983 "high"    1 1
10 3 1 -.17532563  .08648849 1996 "unclear" 3 .
11 3 1  -.3528744   .4098247 1982 "high"    . .
15 3 1  -.4058854  .28737193 1977 "high"    4 2
22 3 1 -.05726908  .12128844 1991 "unclear" . .
21 3 1 -.15116373  .10245428 1991 "unclear" 5 3
20 3 1 -1.0296195   .8807464 1980 "high"    2 4
19 3 1 .007722046  .33268955 1988 "high"    . .
18 3 1 -.19956337   .3266024 1983 "high"    . .
17 3 1  .03738753  .28755212 1978 "high"    . .
16 3 1  .08554983    .212983 1987 "high"    . .
 1 4 1 -.51234347  .25925958 1983 "high"    . .
11 4 1 -.18152188   .4012824 1982 "high"    . .
10 4 1  -.4697811   .0912347 1996 "unclear" . .
 9 4 1  -.4406041    .105911 1996 "unclear" . .
23 5 1  -.2573939  .14490347 1989 "high"    . .
 8 3 2   .1841851  .10752216 2007 "low"     . .
 6 5 2  .06790633 .070427805 2004 "low"     . .
 7 5 2 -.04800922   .4647679 2003 "low"     . .
 1 4 3 -.06229049  .27819666 1983 "high"    . .
14 4 3 -.10626426   .1730147 1985 "high"    . .
13 4 3 -.27857122  .11656784 2006 "low"     . .
12 4 3   .8082165   .7232321 1990 "unclear" . .
11 4 3   .1713525   .4354255 1982 "high"    . .
10 4 3 -.29445547   .0933331 1996 "unclear" . .
 2 5 3   .2051014  .13786088 2003 "low"     . .
 5 5 3  -.6807467   .3773143 2000 "low"     . .
 4 5 3  -.0775357  .08348966 1989 "high"    . .
 3 5 3  -.1010332  .07061027 1996 "unclear" . .
end
I would be grateful if anyone could help me solve this problem.
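Two diagnostic checks that often localize this kind of error (a sketch; networkplot is from the network graphs suite by Chaimani and Salanti, if installed):

Code:
* missing effect sizes or standard errors make the weight matrix non-invertible
misstable summarize ES seES
* verify that all treatments form a connected network
networkplot t1 t2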

Dif-in-Dif: treatment group and continuous time varying variable

I am trying to examine the change caused by an event (a change of location; it does not necessarily happen in the same year for every plant, so I create a dummy that is zero before the change and one after it).
I also have industry-level imports (a continuous variable), which change at the industry level on a yearly basis.

What I want to know is the impact of imports on plants: whether plants with a location change and plants without one show differences.

I tried this: regress plant_profit i.changelocation##c.import i.year
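A common refinement (a sketch, assuming a plant identifier plant_id and yearly data) adds plant fixed effects and clusters the standard errors at the plant level:

Code:
xtset plant_id year
xtreg plant_profit i.changelocation##c.import i.year, fe vce(cluster plant_id)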

Could you please explain whether my identification is correct?
Thank you very much!

A question about interpreting interaction terms

Hi! I have a question about interpreting an interaction term. Would you please give me some advice?

When I ran an HLM model with an interaction term, I found that the coefficient of the interaction term had the opposite sign to the main-effect term.

For example:
Math score = cons + 0.97*SES + 1.84*Migrant + (-1.31)*SES*Migrant (all coefficients significant at the .01 level)

Based on my understanding, the effect of SES on a migrant's math score should then be (0.97 - 1.31)*SES. I used the code test mibystses2+stses2=0 and found chi2 = 0.73, which means the sum is not significantly different from 0. In other words, migrants' math scores cannot benefit from an increase in SES.
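For reference, lincom reports the combined slope with a standard error and confidence interval directly (a sketch using factor-variable coefficient names; substitute the actual names from the model output):

Code:
lincom SES + 1.Migrant#c.SES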

Am I understanding this case correctly?

Thank you for the help!


Screening variables for excess missing data

I have a large data set in which various missing-value codes from -8 to -2 have been established. Examining variables in the browser shows that very few have actual responses; tallying them with tabulate gives frequencies such as 37 good responses in 4,900 cases. The missing codes mostly indicate that the question was not asked. Can I write a routine that cycles through all variables, computes the mean of each, and discards those whose mean is, say, less than -7.5? I know how to set up the foreach loop, but I am not sure how to feed the mean (perhaps from the mean or summarize command) into it. Roughly:

foreach var of varlist v1-v3000 {
    quietly summarize `var', meanonly
    if r(mean) < -7.5 drop `var'
}

I do not know the syntax well enough to be sure I have written this test correctly.

Thanks!

Estimating betas for each id and storing them in a variable - panel data

Dear Stata users,

I have an unbalanced panel dataset with 4,329 observations and about 400 variables. The time periods range from 2003 to 2010, with gaps.

I would like to run a time-series regression to estimate beta coefficients and store them in a variable for each mfiid over its time period, taking the panel structure (gaps) into account. _roe is the dependent and _sp500 the independent variable. The dependent variable varies by mfiid and year, while the independent variable is the same for every mfiid in a given year.

Sorted by mfiid and year:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long mfiid int year float(_roe _sp500)
100000 2007    .2913083 -.010829791
100000 2008   .17713334   -.4216229
100000 2009   .14123334     .201956
100001 2004   .04955833    .0472585
100001 2005 -.008699998  -.01277183
100001 2006   .02328333   .08855005
100001 2007   .21060835 -.010829791
100001 2009  .015933335     .201956
100001 2010  .017058332   .09580792
100004 2008   .20683333   -.4216229
100008 2006   -.4828166   .08855005
100008 2007   -.4197917 -.010829791
100008 2008  -2.9300666   -.4216229
100008 2009    9.721633     .201956
100012 2005       .3659  -.01277183
100012 2006   -.4431167   .08855005
100012 2007   1.6969082 -.010829791
100012 2008   1.2119334   -.4216229
100012 2009    .9820334     .201956
100012 2010   .16185834   .09580792
100013 2009  -.10906667     .201956
100013 2010  -.13484167   .09580792
100016 2005      -.0101  -.01277183
100016 2006 .0035833344   .08855005
100016 2007  .016408335 -.010829791
100016 2008   .05173333   -.4216229
100016 2009   .07153333     .201956
100016 2010   .08925833   .09580792
100017 2003      .17895    .2237602
100017 2004   .16935833    .0472585
100017 2006   .09988333   .08855005
100020 2007  .003608331 -.010829791
100020 2008  .010133334   -.4216229
100020 2009   .09823333     .201956
100020 2010  .016058333   .09580792
100021 2005      -.0768  -.01277183
100021 2006  -.05961667   .08855005
100021 2007 -.017191667 -.010829791
100021 2008  -.04696666   -.4216229
100021 2009 -.023466665     .201956
100021 2010 -.014741667   .09580792
100024 2006    .9799833   .08855005
100024 2009   .51533335     .201956
100024 2010  .014058333   .09580792
100026 2004   17.870058    .0472585
100026 2005       -5.09  -.01277183
100026 2006    9.523084   .08855005
100026 2007  -2.4865916 -.010829791
100026 2008   .54033333   -.4216229
100026 2009   1.4377333     .201956
100026 2010    .4943583   .09580792
100027 2003   1.0200499    .2237602
100027 2004    .8740584    .0472585
100027 2005       .7555  -.01277183
100027 2006   .16448334   .08855005
100029 2006  -.16821668   .08855005
100029 2007   .27910832 -.010829791
100029 2008   .07793334   -.4216229
100029 2009  .005633332     .201956
100030 2005      -.0131  -.01277183
100030 2006  .024183333   .08855005
100030 2007   .02750833 -.010829791
100030 2008   .05113334   -.4216229
100031 2005        -.02  -.01277183
100031 2006  -.03771666   .08855005
100031 2007 -.033691667 -.010829791
100031 2009   .14953333     .201956
100031 2010   .19615833   .09580792
100032 2004   .06425834    .0472585
100032 2005      -.0216  -.01277183
100032 2008   .17953333   -.4216229
100032 2010   .17735833   .09580792
100033 2005    .9403999  -.01277183
100033 2006    .7986833   .08855005
100033 2007   .07110833 -.010829791
100033 2008   .05233334   -.4216229
100033 2009   .03813333     .201956
100033 2010   .23085834   .09580792
100036 2003     1.58285    .2237602
100036 2004    16.50846    .0472585
100036 2005       .0053  -.01277183
100036 2006    .6539834   .08855005
100036 2007   .11130834 -.010829791
end
Sorted by year, the data look as follows:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long mfiid int year float(_roe _sp500)
100324 2003       .13935 .2237602
100118 2003      -.05345 .2237602
100296 2003       .18615 .2237602
100092 2003      -.05935 .2237602
100720 2003       .14455 .2237602
100104 2003      -.02805 .2237602
101107 2003   .015449997 .2237602
100648 2003       .08945 .2237602
100302 2003       .16415 .2237602
100467 2003       .40555 .2237602
100303 2003       .14535 .2237602
100281 2003       .23025 .2237602
100523 2003       .08065 .2237602
100326 2003       .10955 .2237602
100673 2003       .71915 .2237602
100211 2003  -.066250004 .2237602
100096 2003      -.11025 .2237602
100492 2003      -.22445 .2237602
100312 2003       .06345 .2237602
100431 2003       .02765 .2237602
100565 2003       .55315 .2237602
100477 2003       .08275 .2237602
100713 2003      -.04965 .2237602
100547 2003  -.036650002 .2237602
100684 2003       .07855 .2237602
100556 2003       .05095 .2237602
100404 2003       .07495 .2237602
100280 2003       .25795 .2237602
100080 2003       .19575 .2237602
100288 2003       .04715 .2237602
100309 2003       .01885 .2237602
100253 2003     -2.49875 .2237602
100127 2003      -.46735 .2237602
100749 2003  -.012650002 .2237602
100357 2003       .09915 .2237602
100617 2003      -.18485 .2237602
100027 2003    1.0200499 .2237602
100255 2003   .011749998 .2237602
100346 2003       .22665 .2237602
100233 2003   .007949997 .2237602
100348 2003       .08205 .2237602
100350 2003       .05175 .2237602
100813 2003       .14445 .2237602
100741 2003       .14795 .2237602
100155 2003      -.02585 .2237602
100243 2003       .09685 .2237602
100173 2003      -.05955 .2237602
100733 2003  -.011250002 .2237602
100664 2003       .31375 .2237602
100705 2003       .16505 .2237602
100678 2003      -.02375 .2237602
100590 2003      -.22435 .2237602
100378 2003      -.01945 .2237602
100188 2003 -.0014500022 .2237602
100700 2003       .36955 .2237602
100420 2003       .05695 .2237602
100500 2003       .04465 .2237602
100402 2003       .23905 .2237602
100298 2003       .19165 .2237602
100716 2003       .19895 .2237602
100542 2003    .26064998 .2237602
100339 2003       .39035 .2237602
100641 2003      -.02645 .2237602
100168 2003  -.072850004 .2237602
100158 2003      -.27875 .2237602
100097 2003       .39195 .2237602
100611 2003   -.28904998 .2237602
100512 2003       .46595 .2237602
100299 2003       .50835 .2237602
100225 2003       .50175 .2237602
100036 2003      1.58285 .2237602
100386 2003       .15465 .2237602
100329 2003       .01035 .2237602
100017 2003       .17895 .2237602
100576 2003   -.29035002 .2237602
100442 2003       .12485 .2237602
100831 2003       .06045 .2237602
100089 2003      -.37675 .2237602
100601 2003    .27354997 .2237602
100769 2003      -.16215 .2237602
100747 2003       .11985 .2237602
100515 2003       .08925 .2237602
100222 2003      -.06445 .2237602
100765 2003      -.02665 .2237602
100681 2003      -.03485 .2237602
100217 2003   .005849998 .2237602
100098 2003      -.08055 .2237602
100540 2003   .018349998 .2237602
100464 2003       .01055 .2237602
100660 2003  -.005450003 .2237602
100574 2003      -.02425 .2237602
100236 2003       .17965 .2237602
100205 2003  -.018150002 .2237602
100369 2003      -.02675 .2237602
100714 2003       .33685 .2237602
100497 2003      -.05555 .2237602
100376 2003       .11915 .2237602
100325 2003       .00925 .2237602
100666 2003   .009949997 .2237602
100242 2003       .11325 .2237602
end
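One possible route (a sketch using asreg from SSC, which runs the regression separately within each group and stores the coefficients as new variables; rangestat, also from SSC, offers similar functionality):

Code:
ssc install asreg
bys mfiid: asreg _roe _sp500
* asreg adds coefficient variables prefixed _b_ (here _b__sp500, the beta,
* and _b_cons), plus _R2, _Nobs, etc., repeated within each mfiid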

Help is highly appreciated. Thanks a lot in advance!

Kind regards,
Rafael

How to deal with Char and Numeric mixed variable

Hi there -

I have a "Sales Volume" variable defined in $range (character) and real number(numeric) as you can see below. I want to group them following the $ range. I was able to group them by $range (Second column, current stage). However, I am having a difficult time to group numeric obs. For example, $43,630,200,000 is between 20 to 50 million, so it will be categorized in "7". Is there any way I can create new variable with numeric numbers only so that I can proceed this process easy? Thank you.
Sales Volume               Sales_Category (current stage)   Sales_Category (wanted)
$43,630,200,000                                             7
$67,420,000,000                                             8
$70,373,400,000                                             8
$70,395,000,000                                             8
$112,373,000,000                                            9
LESS THAN $500,000         1                                1
$500,000 TO $1 MILLION     2                                2
$1 TO 2.5 MILLION          3                                3
$2.5 TO 5 MILLION          4                                4
$5 TO 10 MILLION           5                                5
$10 TO 20 MILLION          6                                6
$20 TO 50 MILLION          7                                7
$50 TO 100 MILLION         8                                8
$100 TO 500 MILLION        9                                9
$500 MILLION TO $1 BILLI   10                               10
$1 TO 2.5 MILLION          3                                3
$5 TO 10 MILLION           5                                5
$50 TO 100 MILLION         8                                8
$100 TO 500 MILLION        9                                9
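A minimal sketch of one approach (variable names are hypothetical; real() returns missing for the range strings, which already have categories, and the thresholds assume the values are comparable to the bin labels; rescale sales_num first if the file stores a different unit):

Code:
* strip "$" and "," and convert; non-numeric range strings become missing
gen double sales_num = real(subinstr(subinstr(sales_volume, "$", "", .), ",", "", .))
replace sales_cat = 7 if inrange(sales_num, 20e6, 50e6)
replace sales_cat = 8 if inrange(sales_num, 50e6, 100e6)
replace sales_cat = 9 if inrange(sales_num, 100e6, 500e6)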

Creating a subset of an Indicator variable

Hi Everyone,

I have an indicator variable that is coded as follows:

codebook NumAP

------------------------------------------------------------------------------------------------
NumAP                                                                  Type of Assurance Provider
------------------------------------------------------------------------------------------------

                 type: numeric (byte)
                label: NumAP

                range: [1,4]                    units: 1
        unique values: 4                        missing .: 0/2,100

           tabulation: Freq.   Numeric  Label
                         424         1  Accountant
                          95         2  Engineering firm
                       1,450         3  None
                         131         4  Small consultancy/ boutique firm


I want to create a new variable that excludes the "None" category but includes all the others. In this new variable, "4 Small consultancy/boutique firm" should be recoded to 3, and all the other codes should remain the same.

I used this recode command, but it does not eliminate the "None" category: recode NumAP (3=0 1=1 2=2 4=3), gen(acct_engg_bout)
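Recoding "None" to 0 keeps those observations in the sample; mapping it to missing (.) excludes them instead (a sketch):

Code:
recode NumAP (3 = .) (4 = 3), gen(acct_engg_bout)
label define acct_engg_bout 1 "Accountant" 2 "Engineering firm" 3 "Small consultancy/boutique firm"
label values acct_engg_bout acct_engg_bout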

I would greatly appreciate some help with this.

Impulse Response Function shock size?

I was wondering how to interpret the shock to the impulse variable when computing impulse-response functions in Stata. The output does not specify whether it is a one-unit shock or a one-standard-deviation shock. Thanks!
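For reference, a sketch contrasting the two flavours after a VAR (to my understanding, the simple irf traces a one-unit shock to a reduced-form error, while oirf traces a one-standard-deviation orthogonalized shock; y and x are hypothetical variables):

Code:
var y x, lags(1/2)
irf create est1, set(myirfs, replace)
irf graph irf, impulse(x) response(y)    // one-unit shock
irf graph oirf, impulse(x) response(y)   // one-SD orthogonalized shock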

Data Conversion from Excel to Stata

I have the following Excel data set (sorry if the image is blurry). I want to convert it to long format in Stata, but I am not sure how. I initially loaded the file with import excel, listing the cells I wanted, but that produced the Stata format shown below. One specific problem: the Excel file lists average drinking as mean, upper CI, and lower CI for each borough for the years 2003-2019, but Stata does not recognize each year as a different variable, so it creates the variables mean, upcl, and lcl for 2003 and then defaults to the Excel column names. I hope this makes sense; sorry for any confusion, English is not my native language.


[Screenshots of the Excel sheet and of the imported Stata data omitted from the original post]
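Without the screenshots it is hard to be specific, but the usual pattern for repeated year-by-measure column blocks is to skip the header rows, rename the columns to stub+year, and then reshape long (a sketch with a hypothetical file name, cell range, and column letters):

Code:
import excel using drinking.xlsx, cellrange(A3:AZ40) clear
rename A borough
rename (B C D) (mean2003 upcl2003 lcl2003)
rename (E F G) (mean2004 upcl2004 lcl2004)
* ...repeat for the remaining years...
reshape long mean upcl lcl, i(borough) j(year)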

float to string format error results in wrong values -- why?

I need to create the FIPS code for census tracts that were formatted incorrectly.

The error occurs when I try to make a string variable from the newly created trtid90 variable. For some reason, lines 204 and 205 have the same fipscode even though censusTract and trtid90 both show that they are different. Do you have any idea why this would occur?

In the second screenshot you can see that lines 163 and 164 do not have this issue. I'm very confused; I hope you can see the error.
Here is the code I used:
Code:
format state %02.0f
format county %03.0f
codebook censusTract
format censusTract %6.2f
gen trtid90 = censusTract * 100
label variable trtid90 "1990 census tracts w/o decimal"

gen str_state = string(int(state), "%02.0f")
gen str_county = string(int(county), "%03.0f")
gen str_censusTract = string(int(trtid90), "%06.0f")
egen fipscode = concat(str_state str_county str_censusTract)
sort fipscode activityYear
duplicates report fipscode
duplicates list fipscode, nolabel sepby(fipscode) // 13
duplicates tag fipscode, gen(dup_fipscode)
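A likely culprit (an educated guess, not a confirmed diagnosis): censusTract is stored as float, so censusTract * 100 can come out as, say, 12345.9999 rather than 12346, and int() then truncates it to the wrong tract for some rows. Using round() and a double avoids the truncation; a sketch replacing the two affected lines above:

Code:
gen double trtid90 = round(censusTract * 100)
gen str_censusTract = string(trtid90, "%06.0f")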