Help with expanding a dataset and merging with a new dataset?

December 27, 2019, 5:52 pm

Hi folks! I have a list of J respondents, each with an identifier and a zip code. I also have a separate dataset made up of N different sites, and their lat/long. I want to expand each respondent observation by the number of sites, and then append the site dataset onto the respondent dataset J times, so that each respondent has one observation for each site.

For context, here is the structure of my respondent data:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte personid long zipcode
1 60606
2 66206
3 60602
4 48912
5 43877
6 24595
end

Here is my site data:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte siteID float(lat lon)
1 42.34538 -82.9744
2 42.57205 -82.7986
3 42.67822 -82.7337
4  41.6858 -83.3781
end

And here is what I'd like the end result to look like:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str8 personID str7 zipcode str6 siteID str8(lat lon)  
"1"        "60606"   "1"      "42.34538" "-82.9744"
"1"        "60606"   "2"      "42.57205" "-82.7986"
"1"        "60606"   "3"      "42.67822" "-82.7337"
"1"        "60606"   "4"      "41.6858"  "-83.3781"
"2"        "66206"   "1"      "42.34538" "-82.9744"
"2"        "66206"   "2"      "42.57205" "-82.7986"
"2"        "66206"   "3"      "42.67822" "-82.7337"
"2"        "66206"   "4"      "41.6858"  "-83.3781"
"3"        "60602"   "1"      "42.34538" "-82.9744"
"3"        "60602"   "2"      "42.57205" "-82.7986"
"3"        "60602"   "3"      "42.67822" "-82.7337"
"3"        "60602"   "4"      "41.6858"  "-83.3781"
"4"        "48912"   "1"      "42.34538" "-82.9744"
"4"        "48912"   "2"      "42.57205" "-82.7986"
"4"        "48912"   "3"      "42.67822" "-82.7337"
"4"        "48912"   "4"      "41.6858"  "-83.3781"
"5"        "43877"   "1"      "42.34538" "-82.9744"
"5"        "43877"   "2"      "42.57205" "-82.7986"
"5"        "43877"   "3"      "42.67822" "-82.7337"
"5"        "43877"   "4"      "41.6858"  "-83.3781"
"6"        "24595"   "1"      "42.34538" "-82.9744"
"6"        "24595"   "2"      "42.57205" "-82.7986"
"6"        "24595"   "3"      "42.67822" "-82.7337"
"6"        "24595"   "4"      "41.6858"  "-83.3781"
end

Apologies for the formatting; I'm pretty new here. I've been trying to figure this out for a while, with no luck, so any help would be greatly appreciated!
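
One possible approach, sketched with the two example datasets above: Stata's -cross- command forms every pairwise combination of the observations in memory with those in a second file, which gives one row per respondent per site. The tempfile name `sites' is only an illustrative choice.

```stata
* Site data from the post, saved to a temporary file
clear
input byte siteID float(lat lon)
1 42.34538 -82.9744
2 42.57205 -82.7986
3 42.67822 -82.7337
4  41.6858 -83.3781
end
tempfile sites
save `sites'

* Respondent data from the post
clear
input byte personid long zipcode
1 60606
2 66206
3 60602
4 48912
5 43877
6 24595
end

* Pair every respondent with every site (6 x 4 = 24 observations)
cross using `sites'
sort personid siteID
list, sepby(personid)
```

Unlike the desired result shown in the post, this keeps the variables numeric rather than converting them to strings, which is usually more convenient for distance calculations.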

↧

traffic accident analyze by logistic regression or binomial regression with did

December 27, 2019, 10:45 pm

Hi everyone, I want to analyze annual road traffic accident data. Because the people who cause car accidents differ from year to year, this is not panel data. I am currently thinking of using logistic regression or negative binomial regression for the analysis.
I want to focus on the effect of a policy that added a road driving test to the requirements for passing the driver's license test. I want to compare traffic accidents before and after the law was revised.
These are the outcomes I want to analyze:
1. The number of deaths, injuries, and total deaths and injuries per road traffic accident
2. Injury severity: no serious injuries, minor injuries, serious injuries, deaths
Here are my questions:
1. If I use logistic regression or negative binomial regression, can I use the difference-in-differences (DID) method to compare the control and treatment groups before and after the law revision?
2. In addition to the above two methods, is there any other feasible method?
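
A DID specification can be written for a count model by interacting a treatment-group indicator with a post-revision indicator. The following is only a minimal sketch on simulated data: the variable names (deaths, treated, post, region) and all simulated values are assumptions for illustration, not anything taken from the poster's data.

```stata
* Simulated, purely illustrative data (all names and values are hypothetical)
clear
set seed 12345
set obs 2000
gen byte treated = runiform() < .5          // 1 = group affected by the law
gen byte post    = runiform() < .5          // 1 = after the law revision
gen int  region  = ceil(10*runiform())      // hypothetical cluster variable
gen double nu    = rgamma(2, .5)            // multiplicative noise, mean 1 (overdispersion)
gen int deaths   = rpoisson(nu*exp(-1 + .2*treated + .1*post - .3*treated*post))

* Negative binomial DID: the treated#post interaction is the DID term,
* reported as an incidence-rate ratio because of the irr option
nbreg deaths i.treated##i.post, vce(cluster region) irr
```

Because the model is nonlinear, the interaction is a ratio of ratios rather than a difference in means, so it should be interpreted on the multiplicative scale. For the ordered severity outcome, the same interaction could be placed in -ologit-, which is one further option alongside logistic and negative binomial regression.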

↧

dummy variable in the two-way fixed effect model

December 28, 2019, 12:23 am

I want to estimate a two-way fixed-effects model with a dummy variable, using first differences.
The model looks like

D.lnEm = a + b D.lnX₁ + f

Here, f is a dummy variable for female.
But I'm not sure that my dataset is organized properly for this, because when I generate a time variable there are 4 time points for each occupation. For example:

time year sex
1 2014 0
2 2014 1
3 2016 0
4 2016 1

In this case, if I use reg D.lnEm D.lnX₁, the estimation result might not be appropriate.
My question is whether I can run the regression as usual.
I have attached the dataset and commands below.

Thank you always, sir

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str54 occupation int year long 근로자수 double(lnMW WMW RII lnMWRIIWMW) byte(dum_male dum_female trend) long occ float(time lemp)
"가사 및 육아 도우미"                            2014    252 8.558335134747413 1.8181631811064782 1.3936567164179103 21.685925421035577 1 0 1  1 1  5.529429
"가사 및 육아 도우미"                            2014   7544 8.558335134747413 1.6029385234629385 1.3936567164179103 19.118858876664508 0 1 1  1 2  8.928508
"가사 및 육아 도우미"                            2015     17 8.626944055375356 1.1608015640273706 1.3936567164179103 13.956315492043816 1 0 2  1 3  2.833213
"가사 및 육아 도우미"                            2015  14352 8.626944055375356  1.408982506406359 1.3936567164179103  16.94019459618348 0 1 2  1 4  9.571645
"경비원 및 검표원"                                2014 107613 8.558335134747413 1.3282946688893464 2.0087185725871857  22.83509452223576 1 0 1  2 1 11.586297
"경비원 및 검표원"                                2014   4422 8.558335134747413 1.8967881712730512 2.0087185725871857  32.60822932903531 0 1 1  2 2  8.394347
"경비원 및 검표원"                                2015 109891 8.626944055375356 1.2634214229623928 2.0087185725871857 21.893959653427974 1 0 2  2 3 11.607244
"경비원 및 검표원"                                2015   2475 8.626944055375356 1.4579579042004305 2.0087185725871857  25.26510232517308 0 1 2  2 4  7.813996
"계기검침수금 및 주차관련 종사원"          2014  20222 8.558335134747413 2.1115378785129137 2.0042857142857144  36.21994583725714 1 0 1  3 1  9.914526
"계기검침수금 및 주차관련 종사원"          2014  12159 8.558335134747413 1.8454744301318164 2.0042857142857144 31.656066691303064 0 1 1  3 2  9.405825
"계기검침수금 및 주차관련 종사원"          2015  21018 8.626944055375356 2.0388977025677186 2.0042857142857144  35.25429621400446 1 0 2  3 3  9.953135
"계기검침수금 및 주차관련 종사원"          2015  13690 8.626944055375356 1.7921146953405018 2.0042857142857144 30.987205605969343 0 1 2  3 4  9.524421
"공예 및 귀금속 세공원"                         2014   1347 8.558335134747413  2.924311239142827 1.4082278481012658  35.24405016514967 1 0 1  4 1  7.205635
"공예 및 귀금속 세공원"                         2014    467 8.558335134747413 2.0519121417717425 1.4082278481012658 24.729821330605265 0 1 1  4 2  6.146329
"공예 및 귀금속 세공원"                         2015   1102 8.626944055375356 2.7660075329566856 1.4082278481012658  33.60340363413303 1 0 2  4 3  7.004882
"공예 및 귀금속 세공원"                         2015    766 8.626944055375356 1.6559178978269262 1.4082278481012658 20.117254505877238 0 1 2  4 4  6.641182
"기타 서비스 관련 단순 종사원"               2014   6050 8.558335134747413   2.20828786342301 2.1068376068376065  39.81768774068604 1 0 1  5 1  8.707813
"기타 서비스 관련 단순 종사원"               2014   6044 8.558335134747413  1.811149755386547 2.1068376068376065   32.6568816530181 0 1 1  5 2  8.706821
"기타 서비스 관련 단순 종사원"               2015   6309 8.626944055375356  1.772102247598006 2.1068376068376065 32.208968745997566 1 0 2  5 3  8.749732
"기타 서비스 관련 단순 종사원"               2015   6743 8.626944055375356 1.5683642151213106 2.1068376068376065 28.505913840837533 0 1 2  5 4   8.81626
"농림어업관련 단순 종사원"                    2014   1578 8.558335134747413  2.050491383740809 2.0044576523031203  35.17581132102656 1 0 1  6 1  7.363914
"농림어업관련 단순 종사원"                    2014    722 8.558335134747413 1.3391063696826317 2.0044576523031203 22.972129203882215 0 1 1  6 2  6.582025
"농림어업관련 단순 종사원"                    2015   2183 8.626944055375356 2.3797592669737715 2.0044576523031203  41.15161594782715 1 0 2  6 3  7.688456
"농림어업관련 단순 종사원"                    2015   1012 8.626944055375356 1.3968359270623139 2.0044576523031203  24.15456740113578 0 1 2  6 4  6.919684
"방문노점 및 통신판매 관련직 (중)"          2014  21539 8.558335134747413  3.037961185753892              1.276   33.1758595813505 1 0 1  7 1   9.97762
"방문노점 및 통신판매 관련직 (중)"          2014  48348 8.558335134747413  2.223240224759497              1.276  24.27875176882079 0 1 1  7 2  10.78618
"방문노점 및 통신판매 관련직 (중)"          2015  22235 8.626944055375356  2.322396838525871              1.276 25.564899378038028 1 0 2  7 3 10.009423
"방문노점 및 통신판매 관련직 (중)"          2015  31639 8.626944055375356 2.1009978999889474              1.276 23.127744154517504 0 1 2  7 4 10.362145
"섬유 및 가죽관련 기능 종사자"               2014  10002 8.558335134747413  2.742577416098783 1.4120171673819744  33.14272103488336 1 0 1  8 1  9.210541
"섬유 및 가죽관련 기능 종사자"               2014  25551 8.558335134747413 1.8208168871288501 1.4120171673819744 22.003691050427072 0 1 1  8 2 10.148432
"섬유 및 가죽관련 기능 종사자"               2015  11641 8.626944055375356 2.3694662404339826 1.4120171673819744  28.86339973141534 1 0 2  8 3  9.362288
"섬유 및 가죽관련 기능 종사자"               2015  29138 8.626944055375356 1.5547676016412504 1.4120171673819744 18.939235347538034 0 1 2  8 4 10.279799
"세탁관련 기계조작원"                           2014   1468 8.558335134747413 1.7624739830089817                1.7 25.642533121868773 1 0 1  9 1  7.291656
"세탁관련 기계조작원"                           2014   1686 8.558335134747413 1.5036899422131442                1.7   21.8774401891554 0 1 1  9 2  7.430114
"세탁관련 기계조작원"                           2015   1391 8.626944055375356  1.843948757230372                1.7 27.042992708329066 1 0 2  9 3  7.237778
"세탁관련 기계조작원"                           2015   1910 8.626944055375356 1.3993356533353407                1.7  20.52238367322738 0 1 2  9 4  7.554859
"식품가공관련 기능 종사자"                    2014  23239 8.558335134747413 2.3497995729815995 1.5372767857142855  30.91520840440806 1 0 1 10 1 10.053587
"식품가공관련 기능 종사자"                    2014  29107 8.558335134747413  1.736538506719893 1.5372767857142855  22.84682083306577 0 1 1 10 2 10.278734
"식품가공관련 기능 종사자"                    2015  27579 8.626944055375356 2.1382157133486266 1.5372767857142855 28.357018560838718 1 0 2 10 3  10.22481
"식품가공관련 기능 종사자"                    2015  38112 8.626944055375356   1.66135147085109 1.5372767857142855 22.032844582000237 0 1 2 10 4 10.548285
"어업 숙련직 (중)"                                 2014    591 8.558335134747413 2.7385561251500543              1.909  44.74215142820309 1 0 1 11 1  6.381816
"어업 숙련직 (중)"                                 2014     15 8.558335134747413  1.928392768842361              1.909 31.505814499919794 0 1 1 11 2   2.70805
"어업 숙련직 (중)"                                 2015    767 8.626944055375356 1.3555372336161782              1.909 22.324120665746047 1 0 2 11 3  6.642487
"어업 숙련직 (중)"                                 2015     30 8.626944055375356 1.5061096902529125              1.909  24.80387379058574 0 1 2 11 4 3.4011974
"음식관련 단순 종사원"                          2014  11721 8.558335134747413 1.7154321243188502 2.1242544731610336 31.186696158567447 1 0 1 12 1  9.369138
"음식관련 단순 종사원"                          2014  51210 8.558335134747413  1.652896202303043 2.1242544731610336   30.0497879875856 0 1 1 12 2  10.84369
"음식관련 단순 종사원"                          2015  16710 8.626944055375356 1.5971985538336106 2.1242544731610336 29.269980388156135 1 0 2 12 3  9.723763
"음식관련 단순 종사원"                          2015  62603 8.626944055375356  1.543293824205536 2.1242544731610336 28.282131773307608 0 1 2 12 4 11.044568
"의복 제조관련 기능 종사자"                   2014    871 8.558335134747413 3.1173677673918343 1.2487520798668885  33.31605375664688 1 0 1 13 1  6.769642
"의복 제조관련 기능 종사자"                   2014   3253 8.558335134747413 1.9079062164222282 1.2487520798668885   20.3902493423626 0 1 1 13 2  8.087333
"의복 제조관련 기능 종사자"                   2015   1516 8.626944055375356 2.0498781738304217 1.2487520798668885 22.083161957804528 1 0 2 13 3  7.323831
"의복 제조관련 기능 종사자"                   2015   2498 8.626944055375356 1.8730556022522327 1.2487520798668885 20.178267542220752 0 1 2 13 4  7.823246
"이미용예식 및 의료보조 서비스직 (중)"    2014  21118 8.558335134747413 2.2736072096414124              1.395  27.14431798852033 1 0 1 14 1  9.957881
"이미용예식 및 의료보조 서비스직 (중)"    2014 144348 8.558335134747413 1.8289506007866514              1.395 21.835617200069553 0 1 1 14 2 11.879982
"이미용예식 및 의료보조 서비스직 (중)"    2015  20923 8.626944055375356  2.100880459901832              1.395 25.283228581473068 1 0 2 14 3  9.948605
"이미용예식 및 의료보조 서비스직 (중)"    2015 184053 8.626944055375356 1.5650483919799203              1.395 18.834710965584474 0 1 2 14 4  12.12298
"제조관련 단순 종사원"                          2014 110099 8.558335134747413 2.1611761279192216 2.6528225806451613  49.06679105609388 1 0 1 15 1 11.609136
"제조관련 단순 종사원"                          2014 100933 8.558335134747413  1.769042261567682 2.6528225806451613  40.16388386693562 0 1 1 15 2 11.522212
"제조관련 단순 종사원"                          2015 101769 8.626944055375356  2.028379816308244 2.6528225806451613  46.42099742173532 1 0 2 15 3  11.53046
"제조관련 단순 종사원"                          2015 101493 8.626944055375356 1.6862043400948734 2.6528225806451613  38.59005433535032 0 1 2 15 4 11.527745
"조리 및 음식 서비스직 (중)"                   2014  54377 8.558335134747413 2.2274105950497303              1.657 31.587268970438394 1 0 1 16 1 10.903697
"조리 및 음식 서비스직 (중)"                   2014 117593 8.558335134747413 1.7035213551608082              1.657  24.15791114666435 0 1 1 16 2 11.674985
"조리 및 음식 서비스직 (중)"                   2015  56366 8.626944055375356  2.073030759845505              1.657 29.633656086659883 1 0 2 16 3  10.93962
"조리 및 음식 서비스직 (중)"                   2015 125250 8.626944055375356 1.6156633584505402              1.657 23.095659381199614 0 1 2 16 4 11.738067
"직물 및 신발 관련 기계조작원 및 조립원" 2014   8646 8.558335134747413   2.28119890091089 2.3097826086956523  45.09449727598078 1 0 1 17 1  9.064852
"직물 및 신발 관련 기계조작원 및 조립원" 2014  12107 8.558335134747413 1.6370949390242147 2.3097826086956523  32.36191865552417 0 1 1 17 2  9.401539
"직물 및 신발 관련 기계조작원 및 조립원" 2015   7498 8.626944055375356  2.294879903988063 2.3097826086956523  45.72861539044473 1 0 2 17 3  8.922392
"직물 및 신발 관련 기계조작원 및 조립원" 2015   8938 8.626944055375356 1.6151983566862902 2.3097826086956523  32.18503256045329 0 1 2 17 4  9.098067
"청소원 및 환경 미화원"                         2014  50447 8.558335134747413  1.971261088343622 1.7921259842519686 30.234443197741435 1 0 1 18 1 10.828678
"청소원 및 환경 미화원"                         2014 129760 8.558335134747413 1.4935258429433318 1.7921259842519686  22.90712403843568 0 1 1 18 2 11.773442
"청소원 및 환경 미화원"                         2015  50211 8.626944055375356 1.8230966889930662 1.7921259842519686  28.18611508233687 1 0 2 18 3  10.82399
"청소원 및 환경 미화원"                         2015 132109 8.626944055375356 1.3432689479360722 1.7921259842519686 20.767704412851195 0 1 2 18 4 11.791383
"판매관련 단순 종사원"                          2014  35681 8.558335134747413  2.010153775234008 1.8492509363295877 31.813717340486853 1 0 1 19 1 10.482373
"판매관련 단순 종사원"                          2014  37742 8.558335134747413  1.732603775746985 1.8492509363295877  27.42106970311667 0 1 1 19 2 10.538528
"판매관련 단순 종사원"                          2015  42539 8.626944055375356 1.8694443477823788 1.8492509363295877 29.823964242358233 1 0 2 19 3 10.658176
"판매관련 단순 종사원"                          2015  44004 8.626944055375356 1.5804560652553223 1.8492509363295877  25.21362309218094 0 1 2 19 4 10.692036
end
label values occ occ
label def occ 1 "가사 및 육아 도우미", modify
label def occ 2 "경비원 및 검표원", modify
label def occ 3 "계기검침수금 및 주차관련 종사원", modify
label def occ 4 "공예 및 귀금속 세공원", modify
label def occ 5 "기타 서비스 관련 단순 종사원", modify
label def occ 6 "농림어업관련 단순 종사원", modify
label def occ 7 "방문노점 및 통신판매 관련직 (중)", modify
label def occ 8 "섬유 및 가죽관련 기능 종사자", modify
label def occ 9 "세탁관련 기계조작원", modify
label def occ 10 "식품가공관련 기능 종사자", modify
label def occ 11 "어업 숙련직 (중)", modify
label def occ 12 "음식관련 단순 종사원", modify
label def occ 13 "의복 제조관련 기능 종사자", modify
label def occ 14 "이미용예식 및 의료보조 서비스직 (중)", modify
label def occ 15 "제조관련 단순 종사원", modify
label def occ 16 "조리 및 음식 서비스직 (중)", modify
label def occ 17 "직물 및 신발 관련 기계조작원 및 조립원", modify
label def occ 18 "청소원 및 환경 미화원", modify
label def occ 19 "판매관련 단순 종사원", modify

------------------ copy up to and including the previous line ------------------

encode occupation, gen(occ)

. sort occ, stabel
option stabel not allowed
r(198);

. sort occ, stable

. by occ: gen time=_n

. gen lemp=log( 근로자수)

. tsset occ time
panel variable: occ (strongly balanced)
time variable: time, 1 to 4
delta: 1 unit

. reg D.lemp D.lnMWRIIWMW

      Source |       SS           df       MS      Number of obs   =        57
-------------+----------------------------------   F(1, 55)        =      0.62
       Model |  2.26400386         1  2.26400386   Prob > F        =    0.4332
    Residual |  199.781152        55  3.63238458   R-squared       =    0.0112
-------------+----------------------------------   Adj R-squared   =   -0.0068
       Total |  202.045156        56  3.60794921   Root MSE        =    1.9059

------------------------------------------------------------------------------
      D.lemp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  lnMWRIIWMW |
         D1. |  -.0270511   .0342644    -0.79   0.433    -.0957185    .0416162
             |
       _cons |   .0055426   .2735975     0.02   0.984    -.5427589    .5538442
------------------------------------------------------------------------------
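
With time running from 1 to 4 inside each occupation, D. takes differences between consecutive rows, so it also subtracts a male row from a female row within the same year. One way around this, sketched here on the assumption that the -dataex- example above is in memory, is to treat each occupation-by-sex cell as its own panel and use year as the time variable. The name panel is just an illustrative choice.

```stata
* Assumes the dataex example from the post is already loaded
egen panel = group(occ dum_female)      // one panel per occupation-sex cell
xtset panel year                        // time variable is the survey year

* D. now differences 2015 minus 2014 within the same occupation and sex
regress D.lemp D.lnMWRIIWMW i.dum_female
```

With two years per cell this leaves one differenced observation per occupation-sex cell, and the female dummy then shifts the intercept of the differenced equation.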

↧

Error r(900) - no room to add variables, up to 2048 variables are allowed, but I'm only using 340 variables

December 28, 2019, 4:25 am

Hello,

I'm trying to add 9 variables (w_racel) for each wave of data (waves a to i) to a large dataset that contains all the other data I'm using. But it keeps coming up with r(900): 'no room to add more variables. Up to 2,048 variables are allowed with this version of Stata. Versions are available that allow up to 120,000 variables.'

However, the data only contains 340 variables

Code:

 describe

Contains data from C:\Users\iz9\w_all.dta
  obs:        86,094                          Substantive data for responding adults (16+)
 vars:           340                          28 Dec 2019 11:42
                                              (_dta has notes)

The merge command I am attempting is, for example for wave a:

Code:

merge 1:1 pidp using "C:\Users\iz9\a_indresp.dta", keepusing(a_racel) nogenerate

Many thanks for any help,

Iz
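
A possible workaround, offered without certainty about the cause: if the using file itself is wider than the 2,048-variable limit, it may help to read only the needed variables from it first, save that small extract, and merge the extract instead. The file paths and variable names below are the ones quoted in the post; the tempfile name `wave_a' is an illustrative choice.

```stata
* Read only the key and the wanted variable from the wave file
use pidp a_racel using "C:\Users\iz9\a_indresp.dta", clear
tempfile wave_a
save `wave_a'

* Merge the two-variable extract into the main dataset
use "C:\Users\iz9\w_all.dta", clear
merge 1:1 pidp using `wave_a', nogenerate
```

The same three steps could be wrapped in a loop over the wave prefixes a to i.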

↧

Loops within loops

December 28, 2019, 7:18 am

Dear everyone,

I am having trouble creating a loop to generate a new variable in my dataset. My data are presented below.
For the subset of data where Y = 1, I want to create a variable new_var for each industry (sic) such that new_var in year t = the total value of X from year t+1 to t+3.
I am a beginner in Stata and I think that I should work with a loop. I have tried but still cannot find a solution.

Can anyone help me? I would really appreciate it. Thank you so much
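
A loop may not be needed here. The sketch below uses Stata's lead operator on made-up toy data, under the assumptions that there is one observation per industry and year and that sic is numeric; none of the values come from the poster's dataset.

```stata
* Hypothetical toy data: one row per industry (sic) and year
clear
input int sic int year double X byte Y
100 2000  1 1
100 2001  2 1
100 2002  3 1
100 2003  4 1
100 2004  5 0
200 2000 10 1
200 2001 20 0
200 2002 30 1
200 2003 40 1
end

* Declare the panel so that F1., F2., F3. refer to years t+1, t+2, t+3
xtset sic year
gen double new_var = F1.X + F2.X + F3.X if Y == 1
list, sepby(sic)
```

new_var is missing whenever any of the three following years is absent for that industry. If sic is stored as a string, egen with group() would be needed first to create a numeric panel identifier.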

↧

Time-series regression loop (approximately 250 regressions) and saving their coefficients

December 28, 2019, 9:00 am

Hi all,

I've performed a time-series regression loop by using the following code (approximately 250 regressions):
gen sic_2 = real(substr(sic,1,3))
egen SIC_id = group(sic_2)
drop if SIC_id==.
bysort SIC_id: gen n_obs = _N
keep if n_obs>50

forval i = 1(1)258 {
    if inlist(`i', 65,66,67,170,195,232,236,249,250,251) continue
    reg WD_WC WCFO_m1 WCFO_0 WCFO_p1 if `i' == SIC_id, r
    estimates store reg`i'
}
where WD_WC means change in working capital, and WCFO_m1, WCFO_0, and WCFO_p1 are past, current, and future cash flows, respectively (model of Dechow and Dichev of 2002).
This gives me all the coefficients of the 250 regressions. However, I want to save their coefficients as 'one'. So, actually, I want to know the mean of these 250 coefficients. Specifically, I want the mean of alpha (the constant), B0, B1, and B2.

Does anyone know how I can do this? I would be really really happy to receive an answer.

Roy
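
One way to get the averages is to let -statsby- run the regression group by group and collect the coefficients in a new dataset, then summarize them. This is a sketch that assumes the data from the post are in memory with SIC_id already created.

```stata
* Replaces the data in memory with one row of coefficients per SIC_id
statsby _b, by(SIC_id) clear: regress WD_WC WCFO_m1 WCFO_0 WCFO_p1, vce(robust)

* Means (and spread) of the constant and the three slopes across industries
summarize _b_cons _b_WCFO_m1 _b_WCFO_0 _b_WCFO_p1
```

Because -statsby- loops over the groups that actually exist, the hand-written list of SIC_id values to skip should no longer be necessary.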

↧

How to omit one category of margins from marginsplot

December 28, 2019, 9:53 am

Dear Statalisters,

I have the following easy issue to solve. I would like to plot margins of employ2, which has four levels and a missing category.

svy: mlogit uniontype2 i.durationtype i.agesm2_f##i.mage_gap_cat i.wave5 i.employ2 ib0.evunionpr i.shared i.educ2_f i.homog if marital==3&duratyrs3>0&duratyrs3<6&homo3==0&sampst ==1&ivfioall==1&(minagentry>18&minagentry<37)&sex!=.&educ2_f!=4&educ2_m!=4&agesm2_f!=., level(90) rrr base

margins i.employ2 , level(90) pr(out(1)) pr(out(2))

mplotoffset , xdim(employ2) legend( order(4 "Marriage" 3 "Dissolution" ) ring(5)) ytitle("Probability") title("Employment") offset(0.15) recast(scatter) plotregion(lcol(black)) saving(employ2, replace) xlab(,labsize(small) alternate) xtitle("") plot2opts(msymbol(D))

I would like to plot the four categories, but not the missing one.

However, I would like to avoid restricting with "if" in the following way,

margins i.employ2 if employ2<5 , level(90) pr(out(1)) pr(out(2))
because this gives slightly different estimates than computing margins for all the categories.

Thank you and best,
Lydia

↧

Generate sum of preceding values conditional on dates, id, and illness type

December 28, 2019, 9:58 am

Dear Sir/Madam,

I am a Master's student working with Stata 14.2, provided by my university.

I am attempting to measure the magnitude of antibiotic usage prior to a new incidence of a repeated illness. I need to calculate two variables, namely the direct and indirect effects of using prescribed antibiotics on a set of illnesses.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str23 ID str9 Illness float(YearofContracting DosageOfAntibiotic ClassofIllness)
"Patient 1" "Illness 1" 2010  37 1
"Patient 1" "Illness 1" 2008 100 1
"Patient 1" "Illness 1" 2011  49 1
"Patient 1" "Illness 2" 2001  15 1
"Patient 1" "Illness 2" 2012  60 1
"Patient 1" "Illness 3" 2006  26 2
"Patient 1" "Illness 4" 2018  80 2
"Patient 2" "Illness 1" 2013  89 1
"Patient 2" "Illness 1" 1999  15 1
"Patient 2" "Illness 2" 2016  90 1
"Patient 3" "Illness 1" 2001  18 1
"Patient 3" "Illness 4" 1987  10 3
"Patient 4" "Illness 7" 2002  30 3
"Patient 5" "Illness 8" 2004  60 4
end

I would like to explain with an example:

Direct effect
Taking Patient 1 as an example, Illness 1 struck in 2008, 2010 and 2011. Each occurrence of the illness and its prescribed dosage has some impact on the patient's next occurrence of the same illness. We believe that the impact depends on the time gap, expressed through a constant multiple (let's call it gamma).

For the relationship between time lapse and the constant multiple, please refer to the table below

Time lapse	constant scale multiple (=gamma)
1	.1
2	.05
3	.007
4	.001
5 and above	.0003

For Direct effect calculation
Incident1 of Illness 1: Direct effect (in year 2008)= 0
Incident2 of Illness 1: Direct effect (in 2010) = (corresponding gamma for time lapse between 2010-2008 )* DosageofAntibiotic (in 2008) * Count of same illness in year 2008
Incident3 of Illness 1: Direct effect (in 2011) = (corresponding gamma for time lapse between 2011 and 2008)* DosageofAntibiotic (in 2008)*Count of same illness in year 2008 + (corresponding gamma for time lapse between 2011-2010)* DosageofAntibiotic (in 2010)* Count of same illness in year 2010

Essentially: [attached formula image not available]

Indirect effect
For Patient 1, the impact of Illness 1 in year n depends on the prior incidents of Illness 2 (because they are in the same "Class of illness", a variable in my data), that is, incidents of Illness 2 in year X (where X < n).

Time lapse	constant scale multiple (=alpha)
1	.08
2	.021
3	.0078
4	.0004
5 and above	.000065

Incident1 of Illness 1 (happens to be in year 2008): Indirect effect (of illness 2 which occured in year 2001)= (corresponding alpha for time lapse between 2008 and 2001 )* DosageofAntibiotic (in 2008) * Count of illness in same "Class of ui
Incident2 of Illness 1: Direct effect (in 2010) = (corresponding gamma for time lapse between 2010-2008 )* DosageofAntibiotic (in 2008) * Count of same illness in year 2008
This is my first post in this community and I would appreciate all the help.

Regards,
Ram.
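
The direct effect described above can be sketched by pairing each incident with every earlier incident of the same illness in the same patient and summing gamma times the earlier dosage. This is only one possible reading of the description: it assumes that each row is a single incident (so the "count of same illness" is 1 per row), and the names obs, prior_year, prior_dose, lapse, contrib and direct_effect are invented for the illustration.

```stata
clear
input str23 ID str9 Illness float(YearofContracting DosageOfAntibiotic ClassofIllness)
"Patient 1" "Illness 1" 2010  37 1
"Patient 1" "Illness 1" 2008 100 1
"Patient 1" "Illness 1" 2011  49 1
"Patient 1" "Illness 2" 2001  15 1
"Patient 1" "Illness 2" 2012  60 1
"Patient 1" "Illness 3" 2006  26 2
"Patient 1" "Illness 4" 2018  80 2
"Patient 2" "Illness 1" 2013  89 1
"Patient 2" "Illness 1" 1999  15 1
"Patient 2" "Illness 2" 2016  90 1
"Patient 3" "Illness 1" 2001  18 1
"Patient 3" "Illness 4" 1987  10 3
"Patient 4" "Illness 7" 2002  30 3
"Patient 5" "Illness 8" 2004  60 4
end
gen long obs = _n                        // identifies each incident

* Copy of the data holding every incident as a potential prior incident
preserve
keep ID Illness YearofContracting DosageOfAntibiotic
rename (YearofContracting DosageOfAntibiotic) (prior_year prior_dose)
tempfile prior
save `prior'
restore

* Pair each incident with all incidents of the same patient and illness
joinby ID Illness using `prior'

* Look up gamma from the time-lapse table; only earlier incidents contribute
gen lapse = YearofContracting - prior_year
gen double gamma = cond(lapse == 1, .1, cond(lapse == 2, .05, ///
    cond(lapse == 3, .007, cond(lapse == 4, .001, .0003))))
gen double contrib = gamma*prior_dose if lapse > 0

* Back to one row per incident, summing the contributions
collapse (sum) direct_effect = contrib, ///
    by(obs ID Illness YearofContracting DosageOfAntibiotic ClassofIllness)
list, sepby(ID)
```

For Patient 1, Illness 1 in 2010 this gives .05 × 100 = 5, and first incidents get 0. The indirect effect could follow the same pattern by pairing on ID and ClassofIllness, excluding the same illness, and using the alpha table.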

↧

create dummy variable based on condition of other variables

December 28, 2019, 11:31 am

Dear Statalist,

I am struggling with a problem that is difficult to explain. I will do my best to describe it here and would really appreciate any help.

My dataset is based on a survey of the employment status of parents. It has 3 variables: id, year_of_survey, employment_status (the variable "employment_status" is a dummy with two categories: 1 = employed and 0 = unemployed).
I want to create another dummy variable called "unemployment_experience" that indicates whether a mother/father has been through unemployment by the time she/he was interviewed (a parent who was once unemployed is considered to have unemployment experience from then on, even if he/she is employed afterwards). It should take the value 1 if, looking backward from the interview (including the current one), employment_status = 0 at least once, and 0 otherwise (1 = has experience of unemployment, 0 = no experience of unemployment).

For example, I want to generate this following dataset:

id	year_of_survey	employment_status	unemployment_experience
1	2000	1	0
1	2001	1	0
1	2002	0	1
1	2003	1	1
2	2000	1	0
2	2001	1	0
2	2002	1	0
2	2003	0	1
3	2000	0	1
3	2001	1	1
3	2002	1	1
3	2003	1	1
3	2004	1	1
3	2005	1	1

Thank you so much for your time.

Best regards,
Cameron.
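
A running sum within each person is one way to build this indicator. The sketch below enters the first three variables of the example table and creates the fourth.

```stata
clear
input byte id int year_of_survey byte employment_status
1 2000 1
1 2001 1
1 2002 0
1 2003 1
2 2000 1
2 2001 1
2 2002 1
2 2003 0
3 2000 0
3 2001 1
3 2002 1
3 2003 1
3 2004 1
3 2005 1
end

* sum() is a running sum: it counts unemployed interviews so far, by person
bysort id (year_of_survey): gen byte unemployment_experience = sum(employment_status == 0) > 0
list, sepby(id)
```

This reproduces the unemployment_experience column in the table above. An interview with missing employment_status is not counted as unemployment.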

↧

Rename many variables

December 28, 2019, 11:54 am

Dear all,

I have 53 variables, with names like: zmale, zschool, zincome,......zhospitalratio.
How can I rename them as male, school, income,...... hospitalratio, using more advanced coding methods or loops? I know I can get what I want like this:

Code:

rename zmale male

But there are 53 variables and I think Stata can be more clever than this. Thanks for your help!

Happy new year!
Jack
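
In Stata 12 or later, -rename- accepts wildcards, so the leading z can be stripped from all matching variables in one line. A small sketch with three made-up variables:

```stata
* Toy dataset with three z-prefixed variables
clear
set obs 1
gen zmale   = 1
gen zschool = 2
gen zincome = 3

* Strip the leading "z" from every variable whose name starts with z
rename z* *
describe
```

The command stops with an error if a stripped name would collide with an existing variable, so nothing is overwritten silently.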

↧

Plotting a likelihood surface

December 28, 2019, 2:28 pm

Dear all

Let's say I have a linear model with two predictors:

Code:

   
program linear
        args lnf beta sigma
        quietly replace `lnf' = ln(normalden($ML_y1,`beta',`sigma'))
end
ml model lf linear (coeffs: y = x1 x2) (sigma: )
ml maximize

I would like to plot the likelihood surface as a heatmap conditional on the estimated sigma (i.e. possible values of both coefficients on the axes, colors indicating likelihood). Is there a convenient (built-in?) way to compute the likelihood for all possible combinations of coefficient values (within a certain range)? I am using Stata 16.

Thank you in advance

↧

Can PPML coefficients be analyzed as elasticities? Gravitational Model

December 28, 2019, 4:50 pm

Dear,

I am using PPML to analyze the impact of institutions on exports.

My question is: can the coefficients be interpreted as elasticities, just as in the traditional logarithmic model?

For example, if I get 0.10 for the Income variable using PPML, is the interpretation that a 1% increase in income causes a 0.10% increase in exports?

And for binary variables, which take the value 0 or 1, such as border and language, is the interpretation the same? Or do I need to apply some transformation to analyze the results?

Sorry for my poorly written English; I am Brazilian and still have difficulties with the language.

Thanks

Maitê
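
PPML models the conditional mean as exp(xb), so the coefficient on a regressor that enters in logs can be read as an elasticity, while the coefficient b on a 0/1 variable implies a percentage change of 100*(exp(b) - 1). A short sketch of that transformation, using a hypothetical dummy coefficient of 0.10 that does not come from the poster's results:

```stata
* Hypothetical coefficient on a 0/1 regressor from a PPML fit
scalar b_dummy = 0.10

* Percentage change in expected exports when the dummy goes from 0 to 1
display "Percent change: " 100*(exp(b_dummy) - 1)
```

For small coefficients the result is close to 100 times the coefficient (here about 10.5 rather than 10), and the gap widens as the coefficient grows.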

↧

Define a binary variable using and / or

December 28, 2019, 5:04 pm

Dear all,
I have a panel data set. Each observation is identified by firmId and year. Other variables include the firm’s industry classification code (SIC), stored as an integer.

I need to define a binary variable equal to one if a firm’s SIC code is between 2833 and 2836, 3570 and 3577, or 3600 and 3674, and zero otherwise.

I plan to use
gen sic_dum=0
replace sic_dum =1 if (sic>=2833 & sic<=2836) | (sic>=3570 & sic<=3577) | (sic>=3600 & sic<=3674)

But I am not sure whether & takes precedence over | in these conditional statements.

Thank you,

Rochelle
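
In Stata, & is evaluated before |, so the three ranges combine as intended even without the inner parentheses. The precedence question can also be sidestepped with inrange(), as in this sketch on a few made-up SIC codes:

```stata
* Toy SIC codes around the range boundaries, plus a missing value
clear
input int sic
2832
2833
2836
3570
3577
3578
3600
3674
3675
.
end

* inrange(x, a, b) is 1 if a <= x <= b and 0 otherwise (including for missing x)
gen byte sic_dum = inrange(sic, 2833, 2836) | inrange(sic, 3570, 3577) | inrange(sic, 3600, 3674)
list
```

Only 2833, 2836, 3570, 3577, 3600 and 3674 are flagged in this toy list; the missing code gets 0.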

↧

How to keep whole group with a variable including 3 values

December 28, 2019, 5:31 pm

Hi
I have panel data by firm and year.
I want to keep all the observations of a firm if the firm is headquartered in the US and has values for at least 2000, 2001, and 2002.

Thanks
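
One way to do this is to flag, for each firm, whether each of the three years is present, and then keep the firms that pass all the checks. The sketch uses made-up data; the variable names firm, year and us_hq, and the reading of "have values" as "an observation exists for that year", are assumptions.

```stata
* Hypothetical toy panel: us_hq = 1 if the firm is headquartered in the US
clear
input int firm int year byte us_hq
1 1999 1
1 2000 1
1 2001 1
1 2002 1
2 2000 1
2 2002 1
3 2000 0
3 2001 0
3 2002 0
end

* Firm-level flags: does the firm have an observation in each required year?
bysort firm: egen byte has2000 = max(year == 2000)
bysort firm: egen byte has2001 = max(year == 2001)
bysort firm: egen byte has2002 = max(year == 2002)

keep if us_hq == 1 & has2000 & has2001 & has2002
list, sepby(firm)
```

Only firm 1 survives here: firm 2 lacks 2001 and firm 3 is not US-based. If "values" refers to a particular variable being non-missing, the conditions inside max() would need an extra !missing() term for that variable.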

↧

Extracting a month/year variable from a date

December 28, 2019, 5:56 pm

I'm new to Stata and I have a dumb question. I have a date variable (indexdate) with the format %tdD_m_Y. I want to create a year/month variable from the date variable so I can examine what's happening to another variable (% imprisoned) month by month. Can anyone help?

regards Don
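
The function mofd() converts a daily date into a monthly date, which can then be displayed with a %tm format. A small sketch on three made-up dates; rawdate and mdate are illustrative names, while indexdate and its format are as described in the post.

```stata
* Toy daily dates built from strings
clear
input str11 rawdate
"15 Jan 2019"
"03 Feb 2019"
"28 Feb 2019"
end
gen indexdate = date(rawdate, "DMY")
format indexdate %tdD_m_Y

* Monthly date: all days in the same month map to the same value
gen mdate = mofd(indexdate)
format mdate %tm
list
```

The new variable can then be used as a grouping variable, for example with -collapse- or -tabstat- using by(mdate), to follow the share imprisoned month by month.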

↧

Append all .dta files in directory

December 28, 2019, 7:29 pm

Many people have asked this question about 'batch appending' in many ways. Some of the answers are obscure, so I tried the simpler ones, including fs (SSC), but I am not getting anywhere. Does anyone have a concrete solution for this?

Problem: I have 39 datasets in the directory, and I want to 1) append them in one shot, 2) keeping only selected variables from each data file.

Thanks in advance...
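
A sketch using only built-in commands: the dir extended macro function lists the .dta files, and -append- has a keep() option that restricts which variables are read from each file. The variable names id, year and income are placeholders, and the code assumes the current working directory holds only the 39 files to be combined.

```stata
* List all .dta files in the current directory
local files : dir "." files "*.dta"

* Start from an empty dataset and append each file, keeping selected variables
clear
foreach f of local files {
    append using "`f'", keep(id year income)
}
```

The combined data should be saved outside this directory (or under a different extension pattern), otherwise a rerun would append the combined file as well.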

↧

issue with "svy: biprobit"

December 28, 2019, 7:56 pm

Hi everyone,
I have longitudinal panel data in person-spell format, where a spell is defined as the time span between waves (2 years). I also have pweight, strata and psu variables. When I use the pweight (respondent weight), the results become really unusual and inconsistent with previous studies. In contrast, when I drop the pweight and run the model with only the strata and psu variables, the results are completely reasonable. Is there something wrong with using the respondent weight to fit a biprobit model when the data are in person-spell format?
Thanks for your advice!
Best,
Nader

↧

semi-parametric quantile model

December 29, 2019, 1:34 am

Hi,
How can I run a semi-parametric quantile regression in Stata?

Semi-parametric regression can be run with the semipar command, and quantile regression with qreg.

But what if I want to run a semi-parametric quantile regression with 3 variables entering non-parametrically? (With semipar, only 1 variable can enter non-parametrically.)

Thanks in advance!

↧

Panel Data Analysis: Growth rates or levels?

December 29, 2019, 2:01 am

Hi Statalist-community!

I am currently writing a seminar paper. I am estimating the effect of the share of left-wing government members in Swiss cantons on the public expenses (in total and by category) of these cantons.

The data is balanced panel data with N (cantons) = 26 and T(years) = 28.

The dependent variable is: public expenses in category j in canton i in year t.
The main independent variable is: share of left-wing members in the government of canton i in year t.

(Category means for example: health, social security, culture, education, etc.)

As control variables I have:

- GDP per canton
- unemployment rate per canton
- ratio of people aged > 64 per canton
- debt per canton
- population per canton
- and a lagged variable of the proportion of left-wing politicians in the parliament

Here is a sample of my main data for one canton:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int jahr long kanton double(links total BIP) long bev double(alt schulden Alquote)
1990 1  .2 2103595.49679  22258.61295548331 498035               .126 4118853.07339                .0021
1991 1  .2  2418372.7013 23247.730242745427 506818               .126 4258530.37578 .0055000000000000005
1992 1  .2 2622013.84704  23731.27671978167 511979               .126 3995904.96175   .01648507796021844
1993 1  .2 2752637.76148 24251.730934888055 518945               .126 4020907.67093   .03353853343994975
1994 1  .2 2866913.15304  24850.55070690186 523114               .127 4312582.93007  .032740127331169364
1995 1  .2 2871594.22682  25154.26712852215 528887 .12800000000000003  4261014.9935   .02906240833847551
1996 1   0  3033372.5041 25331.135728586345 531665               .129 4566428.24205   .03839377973484641
1997 1   0 3065249.40363  25817.39558783384 534028                .13  4734656.4063  .046613346283090946
1998 1   0 3092004.07619  26558.17208651548 536462               .132 4797195.89518  .030217723885365436
1999 1   0 3184180.88853 27039.352615536056 540639 .13195681406631782 4907276.22711  .021088996722396877
2000 1   0 3288728.78046 28525.767132413897 544306 .13364541269065564 4901166.57689  .013503121668950816
2001 1   0 3448772.61443  29194.49554783396 550298 .13510134508938793  4762340.0067   .01214187822228023
2002 1   0 3559261.63535  29167.80471578718 555782  .1365679349097308 4489301.77899   .02123115577889447
2003 1   0       3771771 29508.175608019385 560674 .13796074010922568 4366114.90529  .033257738911005245
2004 1   0 3868402.48256 30431.545233146422 565122 .14008302631998046 5305308.83576  .034339989993256326
2005 1   0 3987495.90309   31596.1406982345 569344 .14222684352517986 4762380.78077   .03251647849637799
2006 1   0 4304953.29254 33544.994154576845 574813 .14474272502535607 4510972.05203   .02857251626095847
2007 1   0 4505096.31065  35767.62753463508 581562   .147655795942651 4886900.96703  .023552013313319846
2008 1 .25 3995789.07397        37774.52858 591632 .15066122184060363 4832603.85848  .022927679523156913
2009 1  .4 4107370.48387        36943.54089 600040  .1535130991267249 4584035.66422  .033850257782418576
2010 1  .4 4146058.66746        37664.96637 608299 .15526129007990633 4696771.97718   .03128852310932995
2011 1  .4 4336101.42091        38505.31274 618298 .15872281650595668 4311897.59296  .025658837672748246
2012 1  .4 4577403.78175        38719.76854 627340 .16137022985940638 3900244.74448   .02685290486325758
2013 1  .4 4703960.67493        39488.54672 636362  .1640371360954928 3744099.61901  .028494090775842893
2014 1  .4 4770144.18455        40139.27705 645277  .1663657623005314 4051689.93526  .027857224266495634
2015 1  .4 4892710.96136        40647.58003 653675 .16873522775079358 4254435.63835  .029879852698914817
2016 1  .2 4973509.49627           40813.49 663462  .1705282291977536 4838286.30867  .031555575596547404
2017 1  .2 4944214.42435        41592.47817 670988 .17351129975498816 4990070.88585  .030368726678589565
end
label values kanton kanton1
label def kanton1 1 "AG", modify
label var jahr "jahr"
label var kanton "kanton"
label var links "Anteil links in Regierung"
label var total "Total Ausgaben"
label var BIP "BIP"
label var bev "Total Wohnbevölkerung"
label var alt "Anteil Bevölkerung >64"
label var schulden "Bruttoschulden"
label var Alquote "Arbeitslosenquote in Dezimalzahlen"

My question to you is now: Should I use the growth rates of the variables (the dependent variables as well as the control variables) or their levels? I decided to estimate an LSDVC model (xtlsdvc in Stata). When I use levels, I see some effects, but when I use growth rates, virtually all the variables become insignificant.

I use the following code in Stata for the analysis in levels:

Code:

xtlsdvc kat03 links lagd_mlp bev alt schulden Alquote BIP, initial(ab) vcov(50) first

And the following code for the analysis in growth rates:

Code:

xtlsdvc gln_kat03 links lagd_mlp gln_alt gln_schulden gln_Alquote gln_BIP, initial(ab) vcov(50) first
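
The gln_ variables are not part of the data sample above. Assuming the prefix denotes the first difference of the natural log (a guess from the name, not something stated in the post), they could be constructed along these lines:

```stata
* Declare the panel structure so that the lag operator L. works within canton
xtset kanton jahr

* Log growth rate: ln(x_t) - ln(x_{t-1}); missing for each canton's first year
foreach v of varlist BIP alt schulden Alquote {
    generate gln_`v' = ln(`v') - ln(L.`v')
}
```

Under this construction each canton loses its first year, so the growth-rate model is estimated on one fewer period than the levels model.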

I'm unsure because almost all the scientific papers examining the same hypothesis use growth rates, but I don't really see why.

The data are stationary, and the Hausman test result suggested using fixed effects.

Also: Do you think I am estimating the right model?

Thank you so much, your answer would help me an awful lot!!

Regards,
Lara Knuchel

↧

Comparing two probit models with clustered standard errors

December 29, 2019, 2:49 am

Hey everyone,

I would like to compare two nested probit models to see whether adding further variables improves the model. As I need clustered standard errors, a likelihood-ratio test is not possible. What other options do I have? Is it appropriate to compare the AIC/BIC, or should I use another test?

Looking forward to some advice. Thanks! :-)
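
One commonly used alternative is a joint Wald test on the added coefficients, which remains valid with cluster-robust standard errors. A minimal sketch, where y, x1 to x4, and clusterid are hypothetical names and x3 and x4 are the additional variables:

```stata
* Fit the larger (unrestricted) model with cluster-robust standard errors
probit y x1 x2 x3 x4, vce(cluster clusterid)

* Joint Wald test of H0: the coefficients on the added variables are all zero
test x3 x4
```

Only the larger model needs to be fitted, because the test uses its cluster-robust variance estimate directly.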

↧