number of types within groups

July 16, 2018, 3:22 am

Hi

I have the following dataset.

ClientID Nurses
1 A
1 B
1 A
2 A
2 A
3 C
3 A
3 A

Now I need to find the number of distinct nurses for each client during their episode of care. Therefore, I would like to see the following.

ClientID Nurses n_nurses
1 A 2
1 B 2
1 A 2
2 A 1
2 A 1
3 C 2
3 A 2
3 A 2

What command should I use?

I tried
sort ClientID Nurses
by ClientID Nurses: gen n_nurses=_n

But it doesn't work.

Thanks in advance.
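
The attempt above numbers the observations within each client-nurse pair, which is not the same as counting distinct nurses per client. One common approach, sketched here under the assumption that the variables are named ClientID and Nurses as in the listing, is to flag one observation per client-nurse pair and then total the flags within client. The helper name first_pair is made up for illustration.

```stata
* Sketch: number of distinct nurses per client.
* first_pair is a hypothetical helper variable, not from the original post.
egen byte first_pair = tag(ClientID Nurses)        // 1 on exactly one row per (client, nurse) pair
egen n_nurses = total(first_pair), by(ClientID)    // sum of the flags within each client
drop first_pair
list ClientID Nurses n_nurses, sepby(ClientID)
```

Because both steps use egen, the original row order is left as it was.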

↧

Why does _optimize() issue an error message and abort?

July 16, 2018, 3:56 am

Dear Statalist,

I am writing an estimation command and would like to modify the error message that optimize() issues when the numerical derivative cannot be calculated. Specifically, I would like to add a recommendation for the user on how to resolve the error.

I've tried to solve my problem by using _optimize(), which is supposed to return a nonzero value in case of error rather than abort. If there is an error, I could then print the error text and a recommendation for the user. However, _optimize() aborts when it "could not calculate numerical derivatives -- discontinuous region with missing values encountered".

A minimal working example is:

Code:

mata:
mata clear

void myFunction(real scalar todo, real scalar p, score, g, H) {
    score = 2
}

S  = optimize_init()
optimize_init_params(S, 0)
optimize_init_evaluator(S, &myFunction())
_optimize(S)

if (ec = optimize_result_errorcode(S)) {
    errprintf("{p}\n")
    errprintf("%s\n", optimize_result_errortext(S))
    errprintf("\n Print some recommendation")
    errprintf("{p_end}\n")
    exit(optimize_result_returncode(S))
    /*NOTREACHED*/
}

result = optimize_result_params(S)

printf("Result is: %5.2f", result)

end

Why does _optimize() abort? What am I doing wrong?

I'm using Stata 14 with Windows 7.

Thank you very much for your help!

Christoph

↧

Reshape long NLSY97

July 16, 2018, 4:20 am

Dear all,

I am having difficulties with reshape long for the NLSY97 data in Stata 14.0.
I have read all the previous posts and guides, but I cannot figure out how to reshape my data.
I have downloaded over 6,000 variables from the Investigator webpage. I do not plan to use them all, but I want to explore them in the appropriate format.
I am uploading a sample of the variables. Some have the year in the middle of the name, others have it at the end, and others have the year in a two-digit format, e.g. CVC_HOURS_WK_YR_SE_00_XRND (00 refers to 2000).

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(PUBID_1997 EMP_STATUS_1994_01_XRND EMP_STATUS_1994_02_XRND EMP_STATUS_1995_01_XRND EMP_STATUS_1995_02_XRND YEMP_101104_01_2000 YEMP_101104_02_2000 YEMP_101106_01_2000 YEMP_101106_02_2000 CVC_HOURS_WK_YR_SE_00_XRND CVC_HOURS_WK_YR_SE_01_XRND)
  1 -4 -4 -4 -4 -4 -4 -4 -4    0    0
  2 -4 -4 -4 -4 -4 -4 -4 -4    0    0
  3 -4 -4 -4 -4 -4 -4 -4 -4    0    0
  4 -4 -4 -4 -4 -4 -4 -4 -4    0    0
  5 -4 -4 -4 -4 -4 -4 -4 -4    0    0
  6 -4 -4 -4 -4 -4 -4 -4 -4    0    0
  7 -4 -4 -4 -4 -4 -4 -4 -4    0    0
  8 -4 -4 -4 -4 -4 -4 -4 -4    0    0
  9 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 10 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 11 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 12 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 13 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 14 -4 -4  5  5 -5 -5 -5 -5    0    0
 15 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 16 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 17 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 18 -4 -4 -4 -4 -4 -4 -4 -4 5194 5096
 19 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 20 -4 -4  5  5 -4 -4 -4 -4    0    0
 21 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 22 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 23 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 24 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 25 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 26 -4 -4  5  5 -4 -4 -4 -4    0    0
 27 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 28 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 29 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 30 -4 -4 -4 -4 -5 -5 -5 -5   -4   -4
 31 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 32 -4 -4 -4 -4 -5 -5 -5 -5    0    0
 33 -4 -4  5  5 -4 -4 -4 -4    0    0
 34 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 35 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 36 -4 -4  5  5 -4 -4 -4 -4    0    0
 37 -4 -4  5  5 -4 -4 -4 -4    0    0
 38 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 39 -4 -4  5  5 -4 -4 -4 -4    0    0
 40 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 41 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 42 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 43 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 44 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 45 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 46 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 47 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 48 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 49 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 50 -4 -4 -4 -4 -4 -4 -4 -4    0   -3
 51 -4 -4  5  5 -4 -4 -4 -4    0    0
 52 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 53 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 54 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 55 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 56 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 57 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 58 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 59 -4 -4 -4 -4 -4 -4 -4 -4    0   -3
 60 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 61 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 62 -4 -4  5  5 -4 -4 -4 -4    0    0
 63 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 64 -4 -4  5  5 -4 -4 -4 -4    0    0
 65 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 66 -4 -4  0  0 -4 -4 -4 -4    0    0
 67 -4 -4  1  1 -4 -4 -4 -4    0    0
 68 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 69 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 70 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 71 -4 -4  5  5 -4 -4 -4 -4    0    0
 72 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 73 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 74 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 75 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 76 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 77 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 78 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 79 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 80 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 81 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 82 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 83 -4 -4 -4 -4 -4 -4 -4 -4    0 1160
 84 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 85 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 86 -4 -4  5  5 -4 -4 -4 -4    0    0
 87 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 88 -4 -4 -4 -4 -4 -4 -4 -4    0 1698
 89 -4 -4 -4 -4 -4 -4  1 -4  782  150
 90 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 91 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 92 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 93 -4 -4 -4 -4 -4 -4 -4 -4    0    0
 94 -4 -4  4  4 -4 -4 -4 -4    0    0
 95 -4 -4 -4 -4 -5 -5 -5 -5   31   52
 96 -4 -4 -4 -4 -5 -5 -5 -5    0    0
 97 -4 -4  5  5 -4 -4 -4 -4    0    0
 98 -4 -4  5  5 -4 -4 -4 -4    0    0
 99 -4 -4 -4 -4 -4 -4 -4 -4    0    0
100 -4 -4 -4 -4 -4 -4 -4 -4    0    0
end
label values PUBID_1997 vlR0000100
label values EMP_STATUS_1994_01_XRND vlE0011401
label values EMP_STATUS_1994_02_XRND vlE0011402
label values EMP_STATUS_1995_01_XRND vlE0011501
label def vlE0011501 0 "0: No information reported to account for week; job dates indeterminate", modify
label def vlE0011501 1 "1: Not associated with an employer, not actively searching for an employer job", modify
label def vlE0011501 4 "4: Unemployed", modify
label def vlE0011501 5 "5: Out of the labor force", modify
label values EMP_STATUS_1995_02_XRND vlE0011502
label def vlE0011502 0 "0: No information reported to account for week; job dates indeterminate", modify
label def vlE0011502 1 "1: Not associated with an employer, not actively searching for an employer job", modify
label def vlE0011502 4 "4: Unemployed", modify
label def vlE0011502 5 "5: Out of the labor force", modify
label values YEMP_101104_01_2000 vlR4768500
label values YEMP_101104_02_2000 vlR4768600
label values YEMP_101106_01_2000 vlR4769300
label def vlR4769300 1 "RUN BUSINESS FROM HOME", modify
label values YEMP_101106_02_2000 vlR4769400
label values CVC_HOURS_WK_YR_SE_00_XRND vlZ9071800
label def vlZ9071800 0 "0", modify
label values CVC_HOURS_WK_YR_SE_01_XRND vlZ9071900
label def vlZ9071900 0 "0", modify

So far my problem is:
1. I have too many variables with different name formats, so I think I need to rename my variables.

I would be grateful if you could give me some guidance on how to proceed with reshape for these types of variables. If you have used NLSY97 before, I would also appreciate learning from your experience reshaping this data.

Thank you in advance for your help.

Alejandra.
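
A rough sketch of the renaming idea for one family of variables, using only the EMP_STATUS names from the dataex example: a loop with regexm() moves the year to the end of each name so that reshape long can pick it up as j. The new stub names (emp_status_01_, emp_status_02_) and the reading of the two-digit part as a within-year sequence number are assumptions, and the other name patterns would each need their own rule.

```stata
* Sketch only: put the year last in the EMP_STATUS names, then reshape.
* Stub names and the meaning of the two-digit part are assumptions.
foreach v of varlist EMP_STATUS_* {
    if regexm("`v'", "^EMP_STATUS_([0-9][0-9][0-9][0-9])_([0-9][0-9])_XRND$") {
        local yr  = regexs(1)      // four-digit year, e.g. 1994
        local seq = regexs(2)      // two-digit part, e.g. 01
        rename `v' emp_status_`seq'_`yr'
    }
}
* Variables are now emp_status_01_1994, emp_status_02_1994, and so on.
reshape long emp_status_01_ emp_status_02_, i(PUBID_1997) j(year)
```

Variables not named in the reshape command are carried along unchanged and repeated for every year of the same respondent.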

↧

Access p-value using @pval e.g. for Coefplot

July 16, 2018, 4:33 am

Hello!
I'm trying to add p-values from a margins command to a graph using internally stored variables, as suggested in the help file for coefplot (http://repec.sowi.unibe.ch/stata/coe...tlog-1-tempvar). It seems that Stata (SE 15.1) doesn't save the p-values, although it does save all the other variables I've tried.

Here is a very simple example to illustrate my problem:

sysuse auto.dta
reg price foreign
coefplot, mlabel(@pval) mlabpos(12)

coefplot, mlabel(@b) mlabpos(12)
coefplot, mlabel(@V) mlabpos(12)
coefplot, mlabel(@se) mlabpos(12)

The first call, coefplot, mlabel(@pval) mlabpos(12), produces the error

@pval not found
invalid syntax
r(111);

whereas all other variables work.

Can someone tell me what the reason for this is? Thanks in advance.

Kristina
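
One thing that may be worth checking, offered as a guess rather than a diagnosis: since @b, @V, and @se are recognized but @pval is not, the installed copy of coefplot could be older than the version whose help file documents @pval. The commands below only report which copy is installed, refresh it from SSC, and rerun the example.

```stata
which coefplot                    // shows the path and version line of the installed ado-file
ssc install coefplot, replace     // reinstall the current version from SSC
sysuse auto, clear
regress price foreign
coefplot, mlabel(@pval) mlabpos(12)
```

If the error persists after updating, the cause lies elsewhere and this sketch does not address it.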

↧

Automatic omission of category of categorical variable without specification in OLS

July 16, 2018, 4:36 am

Dear Stata experts,

I could use some help with the following problem. I am running a multivariate OLS regression with (standardized) test scores as the dependent variable, and a set of continuous and categorical variables as independent variables. For some of the factor variables, I added an extra category for 'missings'. This works fine for most categorical variables. However, for the variable mum_age_deliv_cat (maternal age at delivery), this category is dropped from the Stata output automatically, without any stated reason (multicollinearity etc.).

Code for multivariate regression is the following:

Code:

regress zks4_GCSE_tot mum_smokes##c.zea1_pgs i.sex ib3.mum_age_deliv_cat zdepression ib3.mum_SES ib3.marital_st_mum ib3.mum_ed_add ib6.cig_change, robust allbaselevels

Linear regression                               Number of obs     =      5,627
                                                F(28, 5598)       =     156.00
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1924
                                                Root MSE          =     .85361

------------------------------------------------------------------------------------------
                         |               Robust
           zks4_GCSE_tot |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
              mum_smokes |
          doesn't smoke  |          0  (base)
                 smokes  |  -.1532702   .0622202    -2.46   0.014    -.2752459   -.0312945
                         |
                zea1_pgs |   .0896178   .0126652     7.08   0.000     .0647892    .1144464
                         |
   mum_smokes#c.zea1_pgs |
          doesn't smoke  |          0  (base)
                 smokes  |    .055319   .0326839     1.69   0.091    -.0087542    .1193922
                         |
                     sex |
                   Male  |          0  (base)
                 Female  |   .2701045   .0228218    11.84   0.000     .2253649    .3148442
                         |
       mum_age_deliv_cat |
                    <20  |  -.1404631   .0941917    -1.49   0.136    -.3251154    .0441892
                  20-24  |   -.110315    .036715    -3.00   0.003    -.1822907   -.0383393
                  25-29  |          0  (base)
                  30-34  |   .0396163   .0277931     1.43   0.154    -.0148689    .0941014
                    35+  |   .1217735   .0380242     3.20   0.001     .0472314    .1963156
                         |
             zdepression |  -.0516424   .0123182    -4.19   0.000    -.0757908    -.027494
                         |
                 mum_SES |
                      I  |   .1243631   .0588214     2.11   0.035     .0090503    .2396759
                     II  |   .0022687   .0313728     0.07   0.942     -.059234    .0637715
III (non-manual labour)  |          0  (base)
    III (manual labour)  |  -.1506617    .049965    -3.02   0.003    -.2486125    -.052711
                     IV  |  -.1566356   .0502407    -3.12   0.002    -.2551268   -.0581443
                      V  |   -.380365   .1006184    -3.78   0.000    -.5776161   -.1831139
                Missing  |  -.2539358   .0404962    -6.27   0.000    -.3333241   -.1745475
                         |
          marital_st_mum |
          Never married  |  -.1206388   .0385989    -3.13   0.002    -.1963076   -.0449699
              Separated  |  -.1357422   .0572148    -2.37   0.018    -.2479053    -.023579
           Ever married  |          0  (base)
                Missing  |  -.1130427   .1802733    -0.63   0.531    -.4664482    .2403628
                         |
              mum_ed_add |
             CSE / None  |  -.2884407   .0391122    -7.37   0.000    -.3651157   -.2117657
             Vocational  |  -.1568851   .0447715    -3.50   0.000    -.2446547   -.0691156
               O-levels  |          0  (base)
               A-levels  |   .1809204   .0312505     5.79   0.000     .1196574    .2421835
                 Degree  |   .4228745   .0440691     9.60   0.000      .336482    .5092671
                Missing  |  -.0050217   .0820701    -0.06   0.951     -.165911    .1558676
                         |
              cig_change |
            Went off it  |  -.1052319   .0450578    -2.34   0.020    -.1935626   -.0169012
               Cut down  |   .0007196   .0611025     0.01   0.991    -.1190651    .1205042
            Craved more  |  -.0448434   .2700721    -0.17   0.868    -.5742895    .4846027
               Had more  |  -.4333814   .0764357    -5.67   0.000    -.5832251   -.2835377
              NO Change  |  -.0952739   .0793533    -1.20   0.230     -.250837    .0602893
         Never has this  |          0  (base)
                         |
                   _cons |   .1129289   .0281212     4.02   0.000     .0578005    .1680574
------------------------------------------------------------------------------------------

The missing category for mum_age_deliv_cat isn't omitted until I include zdepression or mum_smokes in the regression.

For example:

Code:

regress zks4_GCSE_tot i.mum_age_deliv_cat sex i.marital_st_mum i.mum_ed_add i.mum_SES

      Source |       SS           df       MS      Number of obs   =    11,904
-------------+----------------------------------   F(20, 11883)    =    134.48
       Model |  2197.07793        20  109.853896   Prob > F        =    0.0000
    Residual |  9707.18812    11,883   .81689709   R-squared       =    0.1846
-------------+----------------------------------   Adj R-squared   =    0.1832
       Total |   11904.266    11,903  1.00010636   Root MSE        =    .90382

------------------------------------------------------------------------------------------
           zks4_GCSE_tot |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
       mum_age_deliv_cat |
                  20-24  |   .0309369   .0453468     0.68   0.495    -.0579502     .119824
                  25-29  |   .1836477   .0453645     4.05   0.000     .0947259    .2725695
                  30-34  |    .246943     .04722     5.23   0.000     .1543841     .339502
                    35+  |   .2855106    .052452     5.44   0.000      .182696    .3883251
                Missing  |   .6171147    .064453     9.57   0.000     .4907763    .7434531
                         |
                     sex |   .2518333   .0165871    15.18   0.000     .2193198    .2843467
                         |
          marital_st_mum |
              Separated  |  -.0312996   .0435235    -0.72   0.472    -.1166129    .0540136
           Ever married  |   .2037728   .0250902     8.12   0.000     .1545919    .2529536
                Missing  |  -.0736996   .0423146    -1.74   0.082    -.1566431    .0092439
                         |
              mum_ed_add |
             Vocational  |   .1721054   .0345848     4.98   0.000     .1043134    .2398973
               O-levels  |   .3589388   .0260158    13.80   0.000     .3079437     .409934
               A-levels  |   .5817595    .030404    19.13   0.000     .5221626    .6413564
                 Degree  |   .9064845   .0398891    22.73   0.000     .8282955    .9846736
                Missing  |   .2245926   .0366603     6.13   0.000     .1527325    .2964527
                         |
                 mum_SES |
                     II  |  -.1239549   .0526468    -2.35   0.019    -.2271513   -.0207585
III (non-manual labour)  |  -.0957971   .0546101    -1.75   0.079    -.2028419    .0112477
    III (manual labour)  |  -.2680993   .0634493    -4.23   0.000    -.3924703   -.1437283
                     IV  |   -.316245   .0617112    -5.12   0.000     -.437209   -.1952809
                      V  |   -.420445   .0846138    -4.97   0.000    -.5863018   -.2545882
                Missing  |  -.3964954   .0566912    -6.99   0.000    -.5076195   -.2853714
                         |
                   _cons |  -.8257235   .0739581   -11.16   0.000    -.9706935   -.6807534
------------------------------------------------------------------------------------------

This shows the missing category for mum_age_deliv_cat correctly.

I (manually) checked in data browser whether the missings for mum_age_deliv are the same observations as mum_smokes or zdepression, however this is not the case. Also see:

Code:

tab mum_age_deliv_cat

     Age of |
  mother at |
  delivery, |
    grouped |      Freq.     Percent        Cum.
------------+-----------------------------------
        <20 |        656        4.21        4.21
      20-24 |      2,705       17.38       21.59
      25-29 |      5,440       34.95       56.54
      30-34 |      3,878       24.91       81.46
        35+ |      1,397        8.98       90.43
    Missing |      1,489        9.57      100.00
------------+-----------------------------------
      Total |     15,565      100.00

Code:

tab mum_smokes if mum_age_deliv_cat==6

mother smokes |
any amount of |
  cigs during |
    pregnancy |      Freq.     Percent        Cum.
--------------+-----------------------------------
doesn't smoke |        312       78.00       78.00
       smokes |         88       22.00      100.00
--------------+-----------------------------------
        Total |        400      100.00

Code:

tab mum_age_deliv_cat if missing(mum_smokes)

     Age of |
  mother at |
  delivery, |
    grouped |      Freq.     Percent        Cum.
------------+-----------------------------------
        <20 |        177        6.20        6.20
      20-24 |        510       17.85       24.05
      25-29 |        572       20.02       44.07
      30-34 |        350       12.25       56.32
        35+ |        159        5.57       61.88
    Missing |      1,089       38.12      100.00
------------+-----------------------------------
      Total |      2,857      100.00

Finally, when I try to run the regression with the missing category set as the baselevel, this is the response I get:

Code:

. regress zks4_GCSE_tot mum_smokes ib6.mum_age_deliv_cat
note: 5.mum_age_deliv_cat omitted because of collinearity
note: 6b.mum_age_deliv_cat identifies no observations in the sample

      Source |       SS           df       MS      Number of obs   =     9,936
-------------+----------------------------------   F(5, 9930)      =    161.76
       Model |  729.671972         5  145.934394   Prob > F        =    0.0000
    Residual |  8958.40157     9,930  .902155244   R-squared       =    0.0753
-------------+----------------------------------   Adj R-squared   =    0.0749
       Total |  9688.07354     9,935  .975145802   Root MSE        =    .94982

-----------------------------------------------------------------------------------
    zks4_GCSE_tot |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------+----------------------------------------------------------------
       mum_smokes |  -.4360266   .0240728   -18.11   0.000    -.4832141   -.3888391
                  |
mum_age_deliv_cat |
             <20  |  -.6483968   .0581204   -11.16   0.000    -.7623246    -.534469
           20-24  |   -.466057   .0384945   -12.11   0.000    -.5415141   -.3905999
           25-29  |  -.2004997   .0344241    -5.82   0.000     -.267978   -.1330215
           30-34  |   -.055606   .0358554    -1.55   0.121    -.1258899    .0146779
             35+  |          0  (omitted)
         Missing  |          0  (empty)
                  |
            _cons |   .3428121   .0312033    10.99   0.000     .2816473    .4039768
-----------------------------------------------------------------------------------

I am at a loss as to why this happens, and it now states that there are no observations in the sample. Hope someone can help me!

PS: This is my first post, so I hope I formatted everything the right way. Apologies upfront if not!

Kind regards,
Wouter
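
A small diagnostic sketch, using the variable names from the post: the note "identifies no observations in the sample" refers to the estimation sample, which only contains observations that are nonmissing on the outcome and on every regressor. Tabulating the factor variable restricted to e(sample), and counting the complete cases in category 6, shows whether any "Missing" observations survive that restriction.

```stata
regress zks4_GCSE_tot mum_smokes ib6.mum_age_deliv_cat
* Categories present among the observations regress actually used
tabulate mum_age_deliv_cat if e(sample)
* Observations in the "Missing" category (coded 6 in the post) that are
* complete on the outcome and on the added regressors
count if mum_age_deliv_cat == 6 & !missing(zks4_GCSE_tot, mum_smokes, zdepression)
```

A count of zero would mean the category disappears through listwise deletion rather than collinearity.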

↧

Generating dummy variable if another dummy variable occurs over time.

July 16, 2018, 4:47 am

Hello Statalisters,

I'm currently writing my master's thesis in finance and I want to do a persistence test. I have found a few ways to do this, but they require dummy variables that I do not know how to create.

I have a data set containing excess return data for 254 funds over a period of 12 years. I flagged winners and losers for each fund observation by generating dummy variables: Winner (W) takes the value 1 if the excess return is above the median, and Loser (L) takes the value 1 if the excess return is below the median.

For the persistence test I want to generate a new dummy that takes the value 1 if a fund is a winner in month t and a winner in month t+1. This dummy will be called WW. The same kind of dummy variables are needed for Winner-Loser (WL) and Loser-Loser (LL). I tried to use tsspell but did not manage to get the results I need.
I hope it is clear what I am trying to do.
Thanks in advance for any answer or suggestion!

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input long time int(_j date) float(excessReturn median_excessReturn W L)
1   1 16802 3.4313195 5.486612 0 1
1   2 16802  6.981278 5.486612 1 0
1   3 16802         . 5.486612 . .
1   4 16802         . 5.486612 . .
1   5 16802         . 5.486612 . .
1   6 16802         . 5.486612 . .
1   7 16802  8.563093 5.486612 1 0
1   8 16802         . 5.486612 . .
1   9 16802         . 5.486612 . .
1  10 16802  7.770547 5.486612 1 0
1  11 16802         . 5.486612 . .
1  12 16802         . 5.486612 . .
1  13 16802         . 5.486612 . .
1  14 16802  7.977136 5.486612 1 0
1  15 16802         . 5.486612 . .
1  16 16802  6.994602 5.486612 1 0
1  17 16802 4.5341887 5.486612 0 1
1  18 16802         . 5.486612 . .
1  19 16802         . 5.486612 . .
1  20 16802   3.94336 5.486612 0 1
1  21 16802         . 5.486612 . .
1  22 16802         . 5.486612 . .
1  23 16802         . 5.486612 . .
1  24 16802         . 5.486612 . .
1  25 16802         . 5.486612 . .
1  26 16802  3.907271 5.486612 0 1
1  27 16802         . 5.486612 . .
1  28 16802         . 5.486612 . .
1  29 16802 3.7624195 5.486612 0 1
1  30 16802         . 5.486612 . .
1  31 16802         . 5.486612 . .
1  32 16802         . 5.486612 . .
1  33 16802         . 5.486612 . .
1  34 16802         . 5.486612 . .
1  35 16802         . 5.486612 . .
1  36 16802         . 5.486612 . .
1  37 16802         . 5.486612 . .
1  38 16802         . 5.486612 . .
1  39 16802         . 5.486612 . .
1  40 16802         . 5.486612 . .
1  41 16802         . 5.486612 . .
1  42 16802         . 5.486612 . .
1  43 16802         . 5.486612 . .
1  44 16802         . 5.486612 . .
1  45 16802         . 5.486612 . .
1  46 16802         . 5.486612 . .
1  47 16802         . 5.486612 . .
1  48 16802         . 5.486612 . .
1  49 16802         . 5.486612 . .
1  50 16802         . 5.486612 . .
1  51 16802         . 5.486612 . .
1  52 16802         . 5.486612 . .
1  53 16802         . 5.486612 . .
1  54 16802  8.679188 5.486612 1 0
1  55 16802         . 5.486612 . .
1  56 16802         . 5.486612 . .
1  57 16802  8.707003 5.486612 1 0
1  58 16802         . 5.486612 . .
1  59 16802  8.579934 5.486612 1 0
1  60 16802         . 5.486612 . .
1  61 16802         . 5.486612 . .
1  62 16802         . 5.486612 . .
1  63 16802         . 5.486612 . .
1  64 16802 3.4701564 5.486612 0 1
1  65 16802         . 5.486612 . .
1  66 16802         . 5.486612 . .
1  67 16802  3.168994 5.486612 0 1
1  68 16802         . 5.486612 . .
1  69 16802         . 5.486612 . .
1  70 16802         . 5.486612 . .
1  71 16802         . 5.486612 . .
1  72 16802         . 5.486612 . .
1  73 16802         . 5.486612 . .
1  74 16802         . 5.486612 . .
1  75 16802         . 5.486612 . .
1  76 16802         . 5.486612 . .
1  77 16802         . 5.486612 . .
1  78 16802         . 5.486612 . .
1  79 16802         . 5.486612 . .
1  80 16802         . 5.486612 . .
1  81 16802         . 5.486612 . .
1  82 16802         . 5.486612 . .
1  83 16802         . 5.486612 . .
1  84 16802         . 5.486612 . .
1  85 16802         . 5.486612 . .
1  86 16802         . 5.486612 . .
1  87 16802         . 5.486612 . .
1  88 16802 2.7816935 5.486612 0 1
1  89 16802 2.7989986 5.486612 0 1
1  90 16802         . 5.486612 . .
1  91 16802         . 5.486612 . .
1  92 16802         . 5.486612 . .
1  93 16802         . 5.486612 . .
1  94 16802  2.790924 5.486612 0 1
1  95 16802 3.9475884 5.486612 0 1
1  96 16802         . 5.486612 . .
1  97 16802         . 5.486612 . .
1  98 16802         . 5.486612 . .
1  99 16802         . 5.486612 . .
1 100 16802         . 5.486612 . .
end
format %tdnn/dd/CCYY date
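
A minimal sketch of the transition dummies using time-series operators. It assumes that _j identifies the fund and that time counts consecutive months, which is how the example data appear to be organised but is not stated in the post.

```stata
* Sketch: assumes _j is the fund identifier and time is a consecutive month counter.
xtset _j time
* F.W and F.L are next month's values for the same fund.
* Each dummy is left missing when either month's status is missing.
gen byte WW = (W == 1 & F.W == 1) if !missing(W, F.W)
gen byte WL = (W == 1 & F.L == 1) if !missing(W, F.L)
gen byte LL = (L == 1 & F.L == 1) if !missing(L, F.L)
```

Declaring the panel first means the lead never reaches across funds, which a plain [_n+1] subscript would not guarantee.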

↧

Using a loop to generate a new variable based on other variable with many distinct values

July 16, 2018, 4:55 am

Hi,

In my data I have a variable ID with values such as 220, 240, 470, 1120, 2840, 8710, etc. I want to generate a simpler identifying variable, say ID2, with numbers running from 1 to [number of distinct values in ID].

I know I can do this by hand using a number of replace commands, but as my data contains 60+ distinct values in ID, I'd prefer to do this using a loop. Unfortunately I am not very experienced with Stata, so I could use some help with this.
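
For what it is worth, a loop may not be needed here. A minimal sketch with egen's group() function, using the names ID and ID2 from the description above:

```stata
* group() numbers the distinct values of ID as 1, 2, 3, ... in sorted order
egen ID2 = group(ID)
* Check: within each ID, the new identifier takes a single value
bysort ID (ID2): assert ID2[1] == ID2[_N]
```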

↧

Transposing esttab

July 16, 2018, 5:46 am

Esttab normally places stored estimates horizontally (one next to the other). I would like to transpose the table and present the stored estimates vertically (one under the other). How can I do this?

I know that I can transpose r(coefs), but I would like to present confidence intervals instead of t and p statistics, so that option is not what I am looking for.

I am using Stata 15.

↧

Reporting Baselevels Gologit2

July 16, 2018, 6:25 am

Hi All,

I have been using the user-written gologit2 command with factor variables. Is it at all possible to report the base levels of the variables? I tried adding the baselevels option, but the model runs as if the option had not been specified, and the same happens with allbaselevels.

Regards,

Tiarnán

↧

problem with substr

July 16, 2018, 7:06 am

Hello,

I have a problem with the substr() function.
I want to extract the title from a file name that looks like this:

B09520003465-this is the title.csv

I use the following code

Code:

 gen title= substr(filename, 14, length(filename) - 4)

and I get this

this is the title.csv

What am I missing?

Thanks,

Ylenia
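
A short sketch of what is probably going on: the third argument of substr() is the number of characters to return, not an end position, so length(filename) - 4 asks for more characters than remain and the result runs to the end of the string. The sketch assumes every file name has the same 13-character prefix as the example; title2 is a hypothetical variable name.

```stata
* substr(s, start, len): len is a count of characters, not a position.
* Skip the 13-character prefix and the 4 characters of ".csv":
* len = length(filename) - 13 - 4 = length(filename) - 17
gen title2 = substr(filename, 14, length(filename) - 17)
```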

↧

How is this possible: A time series with a stationary level but non-stationary first differences ???

July 16, 2018, 7:19 am

It's the following time series that has this strange property:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float ln_CPI_AUT
 4.388754
 4.388754
 4.388754
4.3907385
 4.388754
 4.388754
4.3896227
4.3896227
 4.393708
 4.394449
4.4012165
 4.398269
4.4050107
 4.407938
 4.407207
 4.407938
4.4157033
4.4165487
4.4165487
4.4194427
4.4213676
4.4249663
4.4268804
4.4279556
4.4318876
 4.433789
 4.438761
 4.440649
4.4426513
 4.443592
4.4415917
 4.444532
 4.445588
 4.444532
 4.446526
4.4494514
 4.450386
 4.452252
  4.45609
4.4589877
4.4599133
4.4589877
 4.460953
 4.460953
   4.4628
 4.461877
 4.464758
4.4665976
 4.467631
 4.470381
 4.469465
 4.469465
4.4714103
 4.470381
4.4732375
4.4751754
4.4732375
4.4751754
 4.476996
 4.477905
4.4816456
4.4853725
4.4853725
 4.490096
4.4937916
4.4918895
4.4955783
4.4955783
4.4992537
 4.501142
4.5056815
4.5065646
4.5096498
4.5146985
 4.512616
 4.513603
 4.516667
 4.515683
 4.515683
4.5176497
4.5186315
 4.515683
4.5196123
4.5186315
 4.521571
4.5246105
4.5295844
4.5305543
4.5315237
4.5305543
 4.533459
4.5315237
4.5305543
4.5315237
4.5344257
4.5344257
4.5374265
4.5422306
 4.547117
4.5499744
end

When I apply an augmented Dickey-Fuller test, the level appears stationary but the first difference of that series appears non-stationary:

Code:

dfuller   ln_CPI_AUT, lags(12) trend

Code:

dfuller d.ln_CPI_AUT, lags(12) trend

I should admit, if I apply a Phillips-Perron test to the same time series, the result is consistent, in the sense that the level is non-stationary and the first difference is stationary.

How would you interpret this ADF result ???

Thanks for any comments!
Nora
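
The dataex listing has no time variable, so the two dfuller calls above will not run on it as posted. The sketch below adds one (t is a hypothetical name) and repeats the tests with a shorter lag length for comparison. The choice of 4 lags is arbitrary and only meant to show how sensitive the result is; with about 100 observations, 12 augmentation lags use up a large share of the sample.

```stata
* Sketch: t is a made-up time index, since the example data contain none
gen t = _n
tsset t
* The tests as posted, with the underlying regression shown
dfuller ln_CPI_AUT, lags(12) trend regress
dfuller D.ln_CPI_AUT, lags(12) trend regress
* Same tests with fewer lags (4 is an arbitrary choice for comparison)
dfuller ln_CPI_AUT, lags(4) trend
dfuller D.ln_CPI_AUT, lags(4)
* Phillips-Perron counterparts
pperron ln_CPI_AUT, trend
pperron D.ln_CPI_AUT
```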

↧

How to visualize the marginal effect of interaction term?

July 16, 2018, 7:27 am

Dear all, I would like to ask how to visualize the "marginal effect" of one variable across the values of another variable it is interacted with. I am using a screenshot from the article by Brambor2005:

[Image not preserved: screenshot of the marginal-effect plot]

Which is based on this reg. model:

[Image not preserved: screenshot of the regression model]

I am one hundred percent positive it can be done in Stata, but I was not able to find any recommendation or guide on how to do so. I would be glad of any help.
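
A minimal sketch of this kind of plot with margins and marginsplot. It uses the auto dataset shipped with Stata, with price, mpg, and weight standing in for the variables in the poster's model, which are not shown.

```stata
sysuse auto, clear
* Interaction of two continuous variables via factor-variable notation
regress price c.mpg##c.weight
* Marginal effect of mpg on price, evaluated over a grid of weight values
margins, dydx(mpg) at(weight=(2000(500)4500))
* Plot the effect with its confidence interval and a reference line at zero
marginsplot, yline(0)
```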

↧

Creating dummy variables by percentiles

July 16, 2018, 3:24 pm

Hello All,

I'm using a corruption index as an explanatory variable. This index ranks countries from 0 to 100, where 100 is the least corrupt. Instead of using the index in its current form, I'm interested in creating a dummy variable for the 25th percentile. The dummy would take the value of 1 if the country is ranked among the most corrupt (bottom 25%) and 0 otherwise.

I've found different codes online, but they all seem to be more complex than needed and don't quite work with my data.

I'd highly appreciate any help.

Best wishes,
Henry
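
A minimal sketch of one way to do this, assuming the index is a numeric variable (here called corruption, a made-up name filled with simulated values) and that ties at the cutoff should count as "most corrupt":

```stata
* Simulated index on a 0-100 scale; replace with the real variable
clear
set seed 42
set obs 100
generate corruption = floor(101*runiform())

* summarize, detail leaves the 25th percentile in r(p25)
quietly summarize corruption, detail

* 1 if at or below the 25th percentile, 0 otherwise; missing stays missing
generate byte most_corrupt = corruption <= r(p25) if !missing(corruption)
tabulate most_corrupt
```

In a panel, the percentile would probably need to be computed within each year (for example with egen and pctile() under by year) rather than once over all observations.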

↧

Pseudo R2 for MI Ologit

July 16, 2018, 4:40 pm

≫ Next: gsem: how to include error terms to a multinomial logit model

≪ Previous: Creating dummy variables by percentiles

I have tried applying the commands listed in the two forum posts https://www.statalist.org/forums/for...ple-imputation and https://www.statalist.org/forums/for...t keep getting. I used the command:

Code:

local vars "i.female i.aframer i.asian i.latino i.other c.pared i.fedgrant i.sf_loansyest1"
noi mi estimate, or saving(miest, replace): ologit arts `vars', cluster(schoolid)
qui mi query
local M=`r(M)'
scalar r2=0
scalar cstat=0
qui mi xeq 1/`M': ologit arts `vars'; scalar r2=r2+e(r2_p); lroc, nog; scalar cstat=cstat+r(area)
scalar r2=r2/`M'
scalar cstat=cstat/`M'
noi di "Pseudo R-squared over imputed data = " r2
noi di "C statistic over imputed data = " cstat

and received the error message:

Code:

 qui mi xeq 1/`M': ologit arts `vars'; scalar r2=r2+e(r2_p); lroc, nog; scalar cstat=cstat+r(area)
last estimates not found

I have tried the line local M=`r(M)' with and without the quotes around r(M).

I had the command structure work for me last week on a series of logits, but for some reason, the ologit is not giving me an output.

Thanks for any help!

↧

gsem: how to include error terms to a multinomial logit model

July 16, 2018, 5:34 pm

≫ Next: adjusting for disease duration

≪ Previous: Pseudo R2 for MI Ologit

Dear Statalisters,

I am attempting to fit a generalised structural equation model using panel data, with y as a categorical response.

I wanted to constrain the error term in the structural equation to 0, as follows:
var(e.y@0)

Stata returns the message that "invalid covariance specification; e.y does not identify an error for an uncensored gaussian response with the identity link"

Does gsem include an error term by default? sem estimates error terms. I read the manual and couldn't find out whether gsem estimates error terms by default. Should I include a latent variable as the disturbance?

Thank you very much for your very kind help!!!

Boshuo
PhD student, Imperial College London

↧

adjusting for disease duration

July 16, 2018, 5:44 pm

≫ Next: Mixed AR Residuals

≪ Previous: gsem: how to include error terms to a multinomial logit model

Hi,
I have a population of people with diagnosed end-stage renal disease (ESRD). The data comes from a national registry of people with ESRD, established in 1960.
I am estimating annual rates of amputations in this population between 2000 and 2015, age-standardised.
Below I provide the code I have used to obtain this data.

stset dox1, fail(lea1==1) origin(born) entry (entry) scale(365.25) id(usrds_id)
stsplit _year, after(time=d(1/1/2000)) at(0(1)15) trim
replace _year=2000 + _year
gen pop = 1
gen agecat = .
replace agecat = 1 if _t0>17 & _t0<45
replace agecat = 2 if _t0>=45 & _t0<65
replace agecat = 3 if _t0>=65 & _t0<75
replace agecat = 4 if _t0>=75 & _t0<.
collapse (sum) pop lea1, by(_year agecat)

where dox1 is the date of exit (date of amputation, date of death, or 31 December 2015, whichever occurred first)
lea1 = amputation event
entry = 01 January 2000, or date of ESRD registration if later

Once I have counts of amputations and 'population at risk' by year and age group, I run the following command to obtain age-standardised results by year:

foreach x of varlist lea1 {
set more off
qui dstdize `x' pop agecat, by(_year) using("2000_pop")
putexcel set lea.xlsx, sheet("`x'", replace) modify
matrix C = r(Nobs)', r(crude)'*1000, r(adj)'*1000, r(lb_adj)'*1000, r(ub_adj)'*1000, r(se)'*1000
matrix rowname C = 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015
matrix list C
putexcel A1=("year") B1=("pop") C1=("`x'crude") D1=("`x'Rate") E1=("`x'LL") F1=("`x'UL") G1=("`x'SE") A2=matrix(C, rownames)
}

Overall, I show that between 2000 and 2009 rates of amputation declined, but thereafter they did not change. I am trying to explore possible reasons for this. One element I want to explore is disease duration, i.e. with increased survival in this population (leading to increased disease duration), is the 'population at risk' in more recent years different from that in earlier years, such that amputations are less likely?

My question: people in my dataset have varying degrees of disease duration (measured from date of ESRD diagnosis to dox1). On average, disease duration increased by 2 years between 2000 and 2015. Is there a way to 'adjust/standardize' for disease duration in this dataset?

Many thanks
Jess

↧

Mixed AR Residuals

July 16, 2018, 6:24 pm

≫ Next: Time series survey data analysis - beginner questions

≪ Previous: adjusting for disease duration

Hi Folks, I am using mixed in Stata/SE 15.

I have a dataset with weekly sales from a farmers market (there are many gaps in the data, especially as the market is closed several months each year). I also have sales by individual vendor type. I have one time series analysis that looks at aggregated sales (of all vendor types), and it is clear there is an autoregressive process that must be included. With the multilevel analysis, I want to consider the random effects of the vendor type. When I do this and include the residual structure, convergence is very slow (in fact, I have not been able to see any results from the analysis). Here is the code for that, followed by sample data. Does anybody have any suggestions for improving the convergence? Maybe there is something I am overlooking.

Thank you!

-Steve

Code:

tsset vend_id date3
 mixed lnsales_type10v lnspec_index   ///
||vend_type: , mle residuals(ar 1, by(vend_type) t(date3))

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(lnsales_type10v lnspec_index) str9 vend_type float date3
 3.505557 1.6897603 "spec"      19230
 3.987673 1.6897603 "nonedible" 19230
4.1076307 1.6897603 "plants"    19230
 4.795667  1.776106 "plants"    19181
 4.824479  1.609438 "nonedible" 19811
 4.922168  1.670123 "nonedible" 19573
 4.928991 1.7769105 "nonedible" 19552
 5.108971  1.684198 "nonedible" 19608
 5.161642 1.7881165 "plants"    19972
 5.166727  1.609438 "produce"   19811
 5.192368 1.5377253 "produce"   19762
 5.200264 1.8057457 "nonedible" 19895
 5.201256 1.7337543 "produce"   19321
  5.20396  1.609438 "plants"    19811
 5.228194  1.764295 "plants"    19209
 5.232668 1.7796197 "plants"    19202
 5.232712 1.5377253 "nonedible" 19762
 5.236389 1.7176515 "produce"   19790
 5.243749 1.6897603 "value"     19230
 5.296916  1.609438 "spec"      19811
 5.327488 1.5145186 "produce"   19398
 5.336576 1.6397433 "produce"   19426
 5.339828  1.789192 "plants"    19216
 5.380818 1.7403235 "nonedible" 19615
 5.400657  1.764015 "plants"    19272
 5.414137 1.7466246 "plants"    19188
 5.414655 1.7120484 "nonedible" 19937
  5.43238  1.728827 "plants"    19223
 5.452424   1.79868 "nonedible" 19916
 5.457228 1.7917595 "plants"    19195
 5.487449  1.670238 "nonedible" 19601
 5.488313 1.6897603 "meat"      19230
 5.498344 1.8590926 "plants"    19545
  5.50289 1.7026595 "produce"   19447
 5.505159  1.771281 "nonedible" 19174
 5.510217 1.7940716 "nonedible" 19517
 5.514417  1.754488 "plants"    19643
 5.521911 1.7149657 "nonedible" 19496
 5.530543 1.7574022 "nonedible" 19559
 5.544161  1.628456 "produce"   19454
 5.547513 1.7149657 "plants"    19496
 5.570927   1.78413 "plants"    19251
 5.574079  1.688008 "produce"   19818
 5.577326   1.74204 "plants"    19958
 5.579881 1.8174162 "nonedible" 19888
 5.588661 1.7403235 "plants"    19615
 5.596269 1.7881165 "nonedible" 19972
 5.600256 1.7897793 "plants"    19944
 5.618566  1.655423 "plants"    19657
 5.626764  1.628456 "nonedible" 19454
   5.6287  1.704262 "nonedible" 19587
 5.645447  1.686756 "nonedible" 19489
 5.646227  1.684198 "plants"    19608
 5.658646  1.776106 "nonedible" 19181
 5.663717 1.7120484 "plants"    19937
 5.677672  1.704262 "plants"    19587
 5.677943  1.754019 "plants"    19930
 5.678123 1.6464264 "nonedible" 19580
 5.683178 1.6945958 "nonedible" 19909
 5.686351 1.7278557 "nonedible" 19244
 5.691782 1.7412496 "plants"    19951
 5.709225 1.7013754 "plants"    19636
 5.710659  1.754019 "nonedible" 19930
 5.714527  1.609438 "meat"      19811
 5.721739 1.7721362 "plants"    19650
 5.721868 1.7278557 "nonedible" 19265
 5.744672 1.5377253 "plants"    19762
 5.746185 1.7796197 "nonedible" 19202
 5.747989  1.762772 "plants"    19279
 5.748619 1.6945958 "plants"    19909
 5.760365  1.688008 "plants"    19293
 5.762837 1.7574022 "plants"    19559
 5.765787  1.754488 "nonedible" 19643
 5.766636  1.699146 "plants"    19160
 5.767956 1.6897603 "produce"   19230
 5.770072  1.789192 "nonedible" 19216
 5.771064  1.727043 "plants"    19622
 5.778542 1.7562047 "nonedible" 19860
 5.786503  1.781418 "nonedible" 19146
 5.789114  1.764295 "nonedible" 19209
 5.792663  1.670123 "plants"    19573
 5.793485 1.7769105 "plants"    19552
 5.800488  1.670238 "nonedible" 19461
 5.805037 1.7512915 "nonedible" 19139
 5.806499 1.6028227 "nonedible" 19965
 5.812849  1.776106 "plants"    19566
 5.821338 1.7647308 "spec"      19832
 5.827877 1.4319825 "nonedible" 19797
 5.831261 1.6464264 "plants"    19580
 5.844341 1.6846718 "plants"    19629
 5.845448  1.670238 "plants"    19601
 5.846124 1.7466246 "nonedible" 19188
 5.857612 1.6479865 "plants"    19034
 5.861122 1.7104533 "nonedible" 19104
 5.869641   1.79868 "plants"    19916
 5.870623  1.720624 "plants"    19111
 5.872681 1.6672574 "produce"   19825
 5.876374 1.7647308 "nonedible" 19832
 5.891682 1.7940716 "plants"    19517
 5.892911 1.7278557 "plants"    19265
end

↧

Time series survey data analysis - beginner questions

July 16, 2018, 10:42 pm

≫ Next: Multinomial logit or xtreg?

≪ Previous: Mixed AR Residuals

I am working on a data set with 70 variables and 440 observations of different types, categorical (ordinal and nominal) and numerical. The data set is from two knowledge, attitude, and practice surveys conducted in three villages (one control, one with a single intervention, one with two interventions), the first conducted in 2016 and the other in 2018. The 2018 survey contained a few repeat questions, but also collected variables that were not included in the first survey (demographic variables and others). The data was in the long format with the first survey respondents on top of the second survey respondents with the surveys identified by a time variable that is 0 if conducted in 2016 and 1 if conducted in 2018. The same participants were interviewed both times and a case ID was generated that identifies the participants as being the same in both surveys.

The question is, how do I approach analysis of this data? Do I reshape the data to wide with i(caseid) and j(time)? Is it possible to compare the same individuals through time while comparing the villages to each other?

Any guidance would be appreciated.
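
As a rough sketch of one possible setup: with the same respondents observed in both years, the data can stay in long format, and a village-by-time interaction with a respondent-level random intercept compares change over time across villages. Everything below is simulated; the outcome score, the village coding (1 = control) and the effect sizes are assumptions, not the survey data.

```stata
* Simulated two-wave panel: 220 respondents, 3 villages
clear
set seed 12345
set obs 220
generate caseid = _n
generate byte village = 1 + mod(_n, 3)     // 1 = control (assumed coding)
generate u = rnormal()                     // respondent-level effect
expand 2
bysort caseid: generate byte time = _n - 1 // 0 = 2016, 1 = 2018

* Hypothetical outcome with an extra gain in the intervention villages
generate score = u + rnormal() + 0.5*time*(village > 1)

* The interaction terms estimate how the change differs from the control village
mixed score i.village##i.time || caseid:
```

Variables collected only in 2018 would be missing in the 2016 rows, so they could enter such a model only as time-invariant characteristics copied across both waves, if that is substantively defensible.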

↧

Multinomial logit or xtreg?

July 17, 2018, 2:13 am

≫ Next: Creating a new date variable common to each ID

≪ Previous: Time series survey data analysis - beginner questions

Good morning users!
I have an issue and I hope you can help me: I have a panel dataset, and I am interested in studying how people adjust their risk aversion during a financial crisis. I have a categorical variable (riskavers) which can take three different values (1=not taking any financial risk, 2=willing to take a medium risk, 3=willing to take a great risk). However, I do not know which model better captures its relationship with the dummy variable "crisis". Here I post a multinomial logit regression (even though I am not completely sure I can use it on a panel dataset as a pooled estimation) and the xtreg command. Can you please help me decide which one is better, or whether I should use another one?
Thank you so much
Luke Brown

Code:

xtreg riskavers hhsex age educ race logsaving crisis logincome, vce(cluster YY1)

Random-effects GLS regression                   Number of obs      =     31879
Group variable: YY1                             Number of groups   =      6551

R-sq:  within  = 0.2245                         Obs per group: min =         1
       between = 0.2259                                        avg =       4.9
       overall = 0.2260                                        max =         6

                                                Wald chi2(7)       =  11592.78
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

                                 (Std. Err. adjusted for 6551 clusters in YY1)
------------------------------------------------------------------------------
             |               Robust
   riskavers |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       hhsex |  -.0983859    .007495   -13.13   0.000    -.1130758    -.083696
         age |  -.0046486   .0001839   -25.27   0.000     -.005009   -.0042881
        educ |   .0414245   .0010909    37.97   0.000     .0392863    .0435627
        race |  -.0267025   .0031038    -8.60   0.000    -.0327859   -.0206191
   logsaving |   .0056858   .0005938     9.57   0.000     .0045219    .0068497
      crisis |  -.0392017   .0055675    -7.04   0.000    -.0501137   -.0282897
   logincome |   .0899202   .0021053    42.71   0.000     .0857939    .0940464
       _cons |   .6533855   .0272177    24.01   0.000     .6000397    .7067313
-------------+----------------------------------------------------------------
     sigma_u |  .05883202
     sigma_e |  .48710283
         rho |  .01437794   (fraction of variance due to u_i)
------------------------------------------------------------------------------

Code:

 mlogit riskavers hhsex age educ race logsaving crisis logincome, baseoutcome(1) vce(cluster YY1)

Iteration 0:   log pseudolikelihood = -26224.432  
Iteration 1:   log pseudolikelihood =  -21636.01  
Iteration 2:   log pseudolikelihood = -21376.921  
Iteration 3:   log pseudolikelihood = -21374.942  
Iteration 4:   log pseudolikelihood = -21374.941  

Multinomial logistic regression                   Number of obs   =      31879
                                                  Wald chi2(14)   =    5973.95
                                                  Prob > chi2     =     0.0000
Log pseudolikelihood = -21374.941                 Pseudo R2       =     0.1849

                                 (Std. Err. adjusted for 6551 clusters in YY1)
------------------------------------------------------------------------------
             |               Robust
   riskavers |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |  (base outcome)
-------------+----------------------------------------------------------------
2            |
       hhsex |  -.3511427   .0340069   -10.33   0.000     -.417795   -.2844904
         age |   -.020097   .0008849   -22.71   0.000    -.0218314   -.0183627
        educ |   .2007731   .0058042    34.59   0.000      .189397    .2121492
        race |   -.173934   .0142654   -12.19   0.000    -.2018938   -.1459743
   logsaving |   .0406524   .0033108    12.28   0.000     .0341633    .0471415
      crisis |  -.2012526   .0283804    -7.09   0.000    -.2568771   -.1456281
   logincome |   .5681068    .015723    36.13   0.000     .5372903    .5989234
       _cons |   -6.04118     .18116   -33.35   0.000    -6.396247   -5.686113
-------------+----------------------------------------------------------------
3            |
       hhsex |  -.3277998   .0828562    -3.96   0.000    -.4901949   -.1654046
         age |  -.0365087   .0021477   -17.00   0.000    -.0407182   -.0322993
        educ |   .1516283   .0124127    12.22   0.000     .1272998    .1759568
        race |  -.0236554    .026387    -0.90   0.370    -.0753731    .0280622
   logsaving |  -.0086072   .0064052    -1.34   0.179    -.0211612    .0039468
      crisis |   -.183439   .0579029    -3.17   0.002    -.2969265   -.0699515
   logincome |   .7799633   .0246435    31.65   0.000     .7316628    .8282637
       _cons |   -9.82149   .2998162   -32.76   0.000    -10.40912   -9.233861
------------------------------------------------------------------------------
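
Since the three categories of riskavers are ordered, a random-effects ordered logit might be a third option to compare against the two models above. The sketch below runs xtologit on simulated data only; the variable names mirror the post, but the data-generating process, cutpoints and the timing of the crisis dummy are all made up.

```stata
* Simulated panel: 500 households, 6 waves
clear
set seed 1
set obs 500
generate id = _n
generate u = rnormal()                   // household-level random effect
expand 6
bysort id: generate wave = _n
generate byte crisis = inrange(wave, 3, 4)

* Logistic error term built from a uniform draw
generate double v = runiform()
generate e = ln(v/(1 - v))

* Latent propensity and the observed three-category outcome
generate ystar = -0.5*crisis + u + e
generate byte riskavers = 1 + (ystar > 0) + (ystar > 2)

xtset id wave
xtologit riskavers i.crisis
```

Unlike xtreg, this does not treat the distance between categories 1, 2 and 3 as equal, and unlike pooled mlogit it uses both the ordering and the panel structure.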

↧

Creating a new date variable common to each ID

July 17, 2018, 4:30 am

≫ Next: Cox model with competing risk error message

≪ Previous: Multinomial logit or xtreg?

Hello

I have a database of visitors to a clinic. Each visitor has an id and I have made the variable dup which numbers each visitor's visits in chronological order.

I want to create the variables datefirst, datesecond etc. which give the date of the first visit, second visit etc. so that each case would contain all of the information on dates of visitors' visits.

I tried the code below, but it is not working. Any help would be much appreciated.

bysort id (dup) : gen datefirst = dateofvisit[1]
bysort id (dup) : gen datesecond = dateofvisit[2]
bysort id (dup) : gen datethird = dateofvisit[3]
bysort id (dup) : gen datefourth = dateofvisit[4]
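
For what it is worth, the subscripting approach itself should run if dateofvisit is a numeric daily date; the new variables will simply display as raw numbers until they are given a date format. The sketch below uses invented data and assumes the original dates arrive as strings in year-month-day order (the variable visit is hypothetical).

```stata
* Invented example data with string dates
clear
input id str10 visit
1 "2018-01-05"
1 "2018-02-10"
2 "2018-03-01"
end

* Convert to a numeric daily date and number the visits
generate dateofvisit = date(visit, "YMD")
format dateofvisit %td
bysort id (dateofvisit): generate dup = _n

* Same pattern as in the post, followed by a date format
bysort id (dup): generate datefirst  = dateofvisit[1]
bysort id (dup): generate datesecond = dateofvisit[2]
format datefirst datesecond %td
list, sepby(id)
```

Visitors with fewer visits than the subscript get a missing value (here datesecond for id 2). If dateofvisit is a string variable, the generate lines would create string copies or fail on a type mismatch, which may be the source of the problem.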

↧