Channel: Statalist
Viewing all 72832 articles

Subgroup differences with metaprop?


Hi,

I am running a meta-analysis to obtain pooled estimates of attrition using metaprop, and have generated the attached forest plot. How do I test for subgroup differences within this analysis, i.e. is there a significant difference between the pooled attrition estimates for men and women?

Many thanks,
Carla



Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str20 Study float Year int size byte attrition int completed float(sex totalbysex _ES _seES _LCI _UCI _WT)
"Aufses"     1998   88 11   52 0   63        .125 .035254754   .07125274  .21011756  1.951692
"Bergen"     1998  132 11   92 0  103   .08333334  .02405626   .04716797  .14306453  3.186563
"Brown"      2014   85  9   40 0   49   .10588235  .03337334   .05671241   .1891352 2.1106992
"Carter"     2018   88 11   52 0   63        .125 .035254754   .07125274  .21011756  1.951692
"Dodson"     2004  120 11   76 0   87   .09166667  .02634133   .05195718   .1567085 2.8715415
"Gifford"    2014  371 48  128 0  176   .12938005 .017424528    .0989906   .1673659 4.3181467
"Nadeem"     2014  106 29   55 0   84    .2735849  .04329977    .1977593  .36524725   1.42592
"Sullivan"   2013 2033 81 1236 0 1317    .0398426 .004337868  .032172184  .04924871  6.828648
"Symer"      2018  792 84  421 0  505    .1060606  .01094129   .08648507  .12943918  5.662387
"Yaghoubian" 2012  348 29  191 0  220   .08333334 .014815813   .05864695  .11711818  4.844076
"Yeo"        2017  836 90  438 0  528    .1076555 .010719666   .08841137  .13048883  5.708778
"Yeo"        2010 6303 50 2641 0 2691   .00793273 .001117399  .006022631  .01044226  7.082663
"Aufses"     1998   88  8   17 1   25    .0909091  .03064545   .04678642   .1692539 2.3720863
"Bergen"     1998  132  7   22 1   29    .0530303 .019504873   .02592242   .1054179  3.928538
"Brown"      2014   85  7   29 1   36   .08235294 .029817274   .04046373  .16035984 2.4594166
"Carter"     2018   88  8   17 1   25    .0909091  .03064545   .04678642   .1692539 2.3720863
"Dodson"     2004  120  9   24 1   33        .075  .02404423   .03995712  .13640918 3.1883204
"Gifford"    2014  371 39   73 1  112    .1051213  .01592357    .0778562     .14048  4.616435
"Nadeem"     2014  106 12   10 1   22   .11320755 .030774835   .06595682   .1875127 2.3587935
"Sullivan"   2013 2033 50  666 1  716  .024594195 .003435107  .018704975 .032276634  6.927888
"Symer"      2018  792 69  218 1  287    .0871212  .01002088  .069419935  .10880835  5.853562
"Yaghoubian" 2012  348 26  102 1  128   .07471264 .014094372   .05149312  .10721887   4.99494
"Yeo"        2017  836 74  234 1  308   .08851675   .0098239    .0710965  .10970127  5.893878
"Yeo"        2010 6303 27 1241 1 1268 .0042836745 .000822626 .0029457496 .006225475  7.091249
end
label values sex sex
label def sex 0 "Male", modify
label def sex 1 "Female", modify

Listed 24 out of 24 observations
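A hedged sketch of one way to test the male/female difference, assuming the _ES and _seES columns above are the study-level proportions and standard errors that metaprop leaves behind: obtain the subgroup estimates with metaprop's by() option, then run a meta-regression of the effect sizes on sex with the user-written metareg.

```stata
* sketch, not a definitive recipe: subgroup estimates by sex, then a
* formal test of the subgroup difference via meta-regression
* (ssc install metaprop; ssc install metareg)
metaprop attrition totalbysex, by(sex) random

* _ES and _seES hold each study-arm's proportion and its standard error
* (as in the data listed above); a significant coefficient on sex
* indicates that the pooled estimates differ between men and women
metareg _ES sex, wsse(_seES)
```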

reclink function

Hi guys,

I hope you are enjoying the festivities.

So basically I am trying to perform a fuzzy match between the two following databases, where the variable "prd" in the master database is the equivalent of the variable "drug_name" in the using database:

Master DataBase:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 atc3no0 str18 prd float Year
"A7F"  "ACIDOPHILUS"        2013
"A7F"  "ACIDOPHILUS"        2014
"A7F"  "ACIDOPHILUS"        2015
"M5X"  "ARTHRI-FLEX"        2013
"M5X"  "ARTHRI-FLEX"        2014
"M5X"  "ARTHRI-FLEX"        2015
"A12A" "CALCIUM 600"        2014
"A12A" "CALCIUM 600"        2015
"A12A" "CALCIUM 600+D3"     2014
"A12A" "CALCIUM 600+D3"     2015
"A12A" "CALCIUM 600+D3 PLU" 2014
"A12A" "CALCIUM 600+D3 PLU" 2015
"A12A" "CALCIUM CIT/VIT D"  2013
"A12A" "CALCIUM CIT/VIT D"  2014
"A12A" "CALCIUM CIT/VIT D"  2015
"A12A" "CALCIUM/VIT D"      2008
"A12A" "CALCIUM/VIT D"      2009
"A12A" "CALCIUM/VIT D"      2010
"A12A" "CALCIUM/VIT D"      2011
"A12A" "CALCIUM/VIT D"      2012
"A12A" "CALCIUM/VIT D"      2013
"A12A" "CALCIUM/VIT D"      2014
"A12A" "CALCIUM/VIT D"      2015
"V3X"  "CINNAMON"           2008
"V3X"  "CINNAMON"           2009
"V3X"  "CINNAMON"           2010
"V3X"  "CINNAMON"           2011
"V3X"  "CINNAMON"           2012
"V3X"  "CINNAMON"           2013
"V3X"  "CINNAMON"           2014
"V3X"  "CINNAMON"           2015
"C1B"  "FISH OIL"           2012
"C1B"  "FISH OIL"           2013
"C1B"  "FISH OIL"           2014
"C1B"  "FISH OIL"           2015
"V3X"  "FLAXSEED OIL"       2013
"V3X"  "FLAXSEED OIL"       2014
"V3X"  "FLAXSEED OIL"       2015
"B3X"  "FOLIC ACID"         2014
"B3X"  "FOLIC ACID"         2015
"M5X"  "GLUCOSAMINE/CHONDR" 2008
"M5X"  "GLUCOSAMINE/CHONDR" 2009
"M5X"  "GLUCOSAMINE/CHONDR" 2010
"M5X"  "GLUCOSAMINE/CHONDR" 2011
"M5X"  "GLUCOSAMINE/CHONDR" 2012
"M5X"  "GLUCOSAMINE/CHONDR" 2013
"M5X"  "GLUCOSAMINE/CHONDR" 2014
"B3A1" "IRON"               2014
"B3A1" "IRON"               2015
"C1B"  "KRILL OIL OMEGA"    2012
"C1B"  "KRILL OIL OMEGA"    2013
"C1B"  "KRILL OIL OMEGA"    2014
"C1B"  "KRILL OIL OMEGA"    2015
"S1M"  "LUTEIN"             2014
"S1M"  "LUTEIN"             2015
"H4X"  "MELATONIN"          2008
"H4X"  "MELATONIN"          2009
"H4X"  "MELATONIN"          2010
"H4X"  "MELATONIN"          2011
"H4X"  "MELATONIN"          2012
"H4X"  "MELATONIN"          2013
"H4X"  "MELATONIN"          2014
"H4X"  "MELATONIN"          2015
"A12C" "MGO"                2013
"A12C" "MGO"                2014
"A12C" "MGO"                2015
"A11X" "NIACIN FLUSH FREE"  2011
"A11X" "NIACIN FLUSH FREE"  2012
"A11X" "NIACIN FLUSH FREE"  2013
"A11X" "NIACIN FLUSH FREE"  2014
"A11X" "NIACIN FLUSH FREE"  2015
"A11A" "PRENATAL"           2010
"A11A" "PRENATAL"           2011
"A11A" "PRENATAL"           2012
"A11A" "PRENATAL"           2013
"A11A" "PRENATAL"           2014
"A11A" "PRENATAL"           2015
"V3X"  "PRENATAL DHA"       2012
"V3X"  "PRENATAL DHA"       2013
"V3X"  "PRENATAL DHA"       2014
"V3X"  "PRENATAL DHA"       2015
"A11A" "PRENATAL VIT/DHA"   2013
"A11A" "PRENATAL VIT/DHA"   2014
"A11A" "PRENATAL VIT/DHA"   2015
"G2X9" "SOY ISOFLAVONES EX" 2011
"C1B"  "TRIPLE OMEGA CMPLX" 2014
"C1B"  "TRIPLE OMEGA CMPLX" 2015
"A11F" "VIT B12"            2013
"A11F" "VIT B12"            2014
"A11F" "VIT B12"            2015
"D8A"  "AVAGARD"            2004
"D8A"  "AVAGARD"            2005
"D8A"  "AVAGARD"            2006
"D8A"  "AVAGARD"            2007
"D8A"  "AVAGARD"            2008
"D8A"  "AVAGARD"            2009
"D8A"  "AVAGARD"            2010
"D8A"  "AVAGARD"            2011
"D8A"  "AVAGARD"            2012
"D8A"  "AVAGARD"            2013
end

Using DataBase:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 atc3no0 str100 drug_name int priority_year byte(priorityUS finalphase)
"A1C" "PEGylated oral insulin, Biocon"                                   2001 1 5
"A1C" "insulin, human"                                                   1982 0 9
"A1C" "insulin lispro"                                                   1994 1 9
"A1C" "AFREZZA"                                                          1994 1 9
"A1C" "drug delivery system, oral insulin, Generex Biotechnology"        1998 1 .
"A1C" "ORMD 0801"                                                        2005 1 5
"A1C" "ORMD 0801"                                                        2005 1 5
"A1C" "insulin aspart"                                                   1985 0 9
"A1C" "insulin glulisine"                                                2001 0 9
"A1C" "insulin lispro"                                                   1989 1 9
"A1C" "drug delivery system, oral insulin, Generex Biotechnology"        1998 1 .
"A1C" "insulin, human"                                                   1990 0 9
"A1C" "insulin degludec"                                                 2003 1 9
"A1C" "drug delivery system, oral insulin, Generex Biotechnology"        1998 1 .
"A1C" "NASULIN"                                                          2003 1 5
"A1C" "insulin lispro"                                                   1994 1 9
"A1C" "drug delivery system, TRANSFERSOMES transdermal insulin"          1990 0 5
"A1C" "insulin lispro"                                                   1989 1 9
"A1C" "insulin glulisine"                                                2001 0 9
"A1C" "ALBULIN"                                                          2000 1 2
"A1C" "ORMD 0801"                                                        2005 1 5
"A1C" "insulin lispro"                                                   1994 1 9
"A1C" "drug delivery system, oral insulin, Generex Biotechnology"        1998 1 .
"A1C" "insulin aspart"                                                   1985 0 9
"A1C" "AI 401"                                                           1990 1 5
"A1C" "drug delivery system, MEDUSA, human insulin, FLAMEL Technologies" 1995 0 5
"A1C" "NASULIN"                                                          2001 1 5
"A1C" "insulin degludec"                                                 2003 1 9
"A1C" "insulin glargine"                                                 1988 0 9
"A1C" "insulin aspart"                                                   1996 0 9
"A1C" "insulin lispro"                                                   1989 1 9
"A1C" "insulin lispro"                                                   1994 1 9
"A1C" "ALBULIN"                                                          2000 1 2
"A1C" "drug delivery system, inhaled insulin, Pfizer"                    1994 1 9
"A1C" "insulin lispro"                                                   1989 1 9
"A1C" "insulin lispro"                                                   1994 1 9
"A1C" "ORMD 0801"                                                        2005 1 5
"A1C" "insulin degludec + liraglutide"                                   2007 0 9
"A1C" "insulin peglispro"                                                2008 1 6
"A1C" "insulin glulisine"                                                1997 0 9
"A1C" "insulin lispro"                                                   1989 1 9
"A1C" "AFREZZA"                                                          1994 1 9
"A1C" "NASULIN"                                                          2003 1 5
"A1C" "insulin aspart"                                                   1996 0 9
"A1C" "NASULIN"                                                          2001 1 5
"A1C" "insulin aspart + insulin degludec"                                2003 1 9
"A1C" "drug delivery system, inhaled insulin, Pfizer"                    1994 1 9
"A1C" "ORMD 0801"                                                        1995 0 5
"A1C" "insulin lispro"                                                   1989 1 9
"A1C" "AI 401"                                                           1990 1 5
"A1C" "insulin detemir"                                                  1993 0 9
"A1C" "ORMD 0801"                                                        1995 0 5
"A1C" "NASULIN"                                                          2003 1 5
"A1C" "insulin glulisine"                                                1997 0 9
"A1C" "NASULIN"                                                          2003 1 5
"A1C" "NASULIN"                                                          2001 1 5
"A1C" "insulin lispro"                                                   1994 1 9
"A1C" "PEGylated oral insulin, Biocon"                                   2001 1 5
"A1C" "insulin peglispro"                                                2008 1 6
"A1C" "insulin aspart + insulin degludec"                                2003 1 9
"A1C" "NASULIN"                                                          2001 1 5
"A1H" "drug delivery system, GITS glipizide"                             1989 1 9
"A1H" "drug delivery system, GITS glipizide"                             1989 1 9
"A1H" "glimepiride"                                                      1979 0 9
"A1H" "drug delivery system, modified release gliclazide, Servier"       1999 0 9
"A1J" "metformin + dapagliflozin"                                        1999 1 8
"A1J" "metformin"                                                        1998 1 9
"A1J" "GLUMETZA"                                                         1997 1 9
"A1J" "glibenclamide + metformin"                                        1998 0 9
"A1J" "metformin + dapagliflozin"                                        1999 1 8
"A1J" "GLUMETZA"                                                         1997 1 9
"A1J" "canagliflozin + metformin"                                        2003 1 8
"A1J" "canagliflozin + metformin"                                        2003 1 8
"A1J" "fenofibrate + metformin"                                          2002 0 6
"A1J" "metformin"                                                        1998 1 9
"A1J" "drug delivery system, extended-release metformin, Actavis"        1998 1 9
"A1J" "drug delivery system, extended-release metformin, Actavis"        1998 1 9
"A1K" "pioglitazone + glimepiride"                                       2003 0 9
"A1K" "balaglitazone"                                                    1996 1 6
"A1K" "pioglitazone"                                                     1985 0 9
"A1K" "rosiglitazone"                                                    1987 0 9
"A1K" "pioglitazone"                                                     1985 0 9
"A1K" "metformin + rosiglitazone"                                        1997 0 9
"A1K" "pioglitazone + metformin extended-release"                        2002 1 9
"A1K" "rosiglitazone"                                                    1992 0 9
"A1K" "englitazone"                                                      1985 1 2
"A1K" "lobeglitazone"                                                    2002 0 .
"A1K" "englitazone"                                                      1987 1 2
"A1K" "englitazone"                                                      1987 1 2
"A1K" "pioglitazone"                                                     1978 0 9
"A1K" "englitazone"                                                      1985 1 2
"A1K" "rosiglitazone"                                                    1991 0 9
"A1K" "NC 2100"                                                          1994 0 2
"A1K" "rosiglitazone"                                                    1987 0 9
"A1K" "rosiglitazone"                                                    1991 0 9
"A1K" "pioglitazone + metformin"                                         1995 0 9
"A1K" "pioglitazone + metformin extended-release"                        2002 1 9
"A1K" "rosiglitazone"                                                    1992 0 9
"A1K" "rosiglitazone"                                                    1991 0 9
"A1K" "rosiglitazone"                                                    1992 0 9
end
My aim is to perform a fuzzy match between the two databases using atc3no0 and the drug name as matching variables, with a lower weight on the product name. I have read about reclink, but I am not able to work out whether it fits my needs.

Can you please help me?


Edit: I have tried something along these lines, but I get the error ") required":
Code:
reclink atc3no0 prd using output_estrazionedatiDS13_US, gen(myscore) idm(id_mas) idu(id_us) wmatch(10 2)
Federico
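One hedged guess at the ") required" error: reclink's id options are spelled idmaster() and idusing(), and the matching variables must carry the same names in both files, so drug_name in the using data would first be renamed to prd. A sketch under those assumptions (the file names master_db.dta and US_renamed.dta are hypothetical):

```stata
* sketch: prepare the using file so its match variable is named like
* the master's, and give each file a record id for reclink
use output_estrazionedatiDS13_US, clear
rename drug_name prd
gen id_us = _n
save US_renamed, replace           // hypothetical file name

use master_db, clear               // hypothetical name for the master file
gen id_mas = _n
reclink atc3no0 prd using US_renamed, gen(myscore) ///
    idmaster(id_mas) idusing(id_us) wmatch(10 2)
```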

The occurrence of autocorrelation and heteroscedasticity in a panel-data OLS regression

Hello everyone,

I am currently running an OLS panel-data regression with industry and time fixed effects, and I am testing the assumptions of linear regression. I have a sample of 10,000 data points over the period 2009-2018. When I conduct my tests, however, normality is violated according to the Jarque-Bera test, which is strange because when I plot my residuals I see a distribution that looks almost perfect (Normality plot.pdf). Furthermore, when I test for heteroscedasticity and autocorrelation, both tests give me rather high chi-squared and F statistics (Tests for autocorrelation and heteroscedasticity.pdf).

According to several forums and articles, I could address this by running my regressions with cluster(FirmID) at the end, and so I did. But once I run my regressions this way, I can no longer test for autocorrelation and heteroscedasticity. So how can I be sure that clustering the standard errors on the firm IDs resolves the heteroscedasticity and autocorrelation?

Yours sincerely,

Gyman van der Tol.
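A hedged sketch of the workflow described above, with hypothetical variable names (DV, X1, X2, FirmID): diagnose on the plain fixed-effects model with the user-written xttest3 (groupwise heteroskedasticity) and xtserial (serial correlation). Clustering at the firm level then gives standard errors that are robust to both problems, so the tests do not need to be re-run afterwards.

```stata
* sketch with hypothetical variable names
xtset FirmID year
xtreg DV X1 X2 i.year, fe          // plain FE model for diagnostics
xttest3                            // Wald test, groupwise heteroskedasticity (ssc)
xtserial DV X1 X2                  // Wooldridge test, serial correlation (ssc)

* cluster-robust SEs remain valid under heteroskedasticity and
* within-firm autocorrelation, so re-testing is not required
xtreg DV X1 X2 i.year, fe vce(cluster FirmID)
```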

Making a trajectory plot directly from WIDE-format data, without reshaping to long format

Dear all,

Is there an easy command or way to make a trajectory plot using WIDE-format data, without reshaping it to long format?

So far I found the user-written command profileplot which is easy and convenient by simply inputting:
profileplot time1 time2 time3 time4, by(id)

However, profileplot has limited graph options; for example, it cannot adjust the axis scale or the axis-label font.

The other choices, such as xtline, graph twoway line, and the traj package, require long-format data.
I have tried to avoid reshaping, since each reshape increases the risk of error.
The profileplot command is really a smart creation; why did Stata not continue to develop more comprehensive graph options for it?

Thank you, and looking forward to any solution.
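If the worry is mainly that reshaping might corrupt the data, one hedged workaround is to reshape only inside preserve/restore, so the wide-format dataset in memory is untouched afterwards (variable names follow the profileplot call above):

```stata
* sketch: temporary long reshape, full xtline/twoway graph options available
preserve
reshape long time, i(id) j(t)
xtset id t
xtline time, overlay ///
    xlabel(1(1)4, labsize(small)) ylabel(, labsize(small))
restore                            // wide data back, unchanged
```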

5-class latent class model and memory problems r(3900)

Dear Statalister,

I am running the following 5-class latent class model on a dataset of about 19,000 observations.

gsem (ictcomp_d <-, ologit) (ictapp ictrob itprodimp itperfmon ecommerce <-, logit) (C<- i.country i.est_size i.broad_sector i.est_type), lclass(C 5) from(b) nonrtolerance listwise

Everything goes fine until the end of the EM iterations (20), with memory consumption at about 40%. As the NR iterations begin, memory consumption jumps to 96% and fluctuates around that value. I have 16GB of RAM and no other applications running.

It is at this point that either Stata crashes or I get the r(3900) message.

I ran this model after having run the 2-class, 3-class, and 4-class models, and the idea is to get to a 7-class solution.

Note that a colleague of mine ran the same model, from a 2-class to a 10-class solution, in Latent Gold without a problem (estimation time under 10 minutes for the most complex solution, no convergence issues) on a laptop with 8GB of RAM and other applications running.

Any idea on what could be going on?

I take this opportunity to wish the list a serene holiday break and a happy new year

Kind regards

Giovanni

Fisher's exact test does not reproduce the results in the literature

Hello,

I am trying to reproduce the results in the literature using fisher's exact test to compare the distribution of two independent samples.

Here is the data description:

Let's call the data from one paper "sample one" and the data from another paper "sample two".
Both papers measure the same thing.
In both samples, 6 types of subjects are identified: level 0, level 1, level 2, level 3, level 4, and unidentified.
In sample one, there are 116 subjects, the proportions of types are 5.17%, 23.28%, 26.72%, 21.55%, 22.41%, 0.86% , respectively.
In sample two, there are 179 subjects, the proportions of types are 3.91%, 14.53%, 27.93%, 21.23%, 17.32%, 15.08% respectively.

The paper itself says "If the unidentified subjects are excluded, the Fisher's exact test comparing these two categorical distributions yields a p-value of 0.926, suggesting that they are statistically not different."

Thus, I assume that Fisher's exact test will reject the null when unidentified subjects are included, which I am able to get. But I am not able to get the "p-value of 0.926" (i.e. fail to reject the null) when excluding the unidentified, so I suspect the command I am using is not right.

Here is the code I am using:
Code:
set obs 179
gen jin = 0 in 1/7
replace jin = 1 in 8/33
replace jin = 2 in 34/83
replace jin = 3 in 84/121
replace jin = 4 in 122/152
replace jin = -1 in 153/179 //unidentified
proportion jin

gen k=-1 in 1 //unidentified
replace k=0 in 2/7
replace k =1 in 8/34
replace k=2 in 35/65
replace k =3 in 66/90
replace k = 4 in 91/116
proportion k

tabulate jin k , all exact //reject the null
tabulate jin k if jin!=-1 & k != -1, all exact// reject
My question is what the right way is to reproduce the results. I am also wondering whether the sample sizes matter: if I do not use the -missing- option, the table looks as if the larger sample has been truncated, and if I, for example, shuffle the data, the larger sample is truncated in a different way. So should we account for missing values when the two samples are not balanced?
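One likely issue with the code above is that -tabulate jin k- cross-tabulates two variables within the same observations, which is not a comparison of two independent samples. A hedged sketch using the immediate command -tabi-, with cell counts back-computed here from the posted percentages (rows = samples, columns = levels 0-4):

```stata
* sketch: counts reconstructed from the reported proportions
* sample one (n=116): 6 27 31 25 26, plus 1 unidentified
* sample two (n=179): 7 26 50 38 31, plus 27 unidentified

* the comparison the paper describes (unidentified excluded)
tabi 6 27 31 25 26 \ 7 26 50 38 31, exact

* including the unidentified column
tabi 6 27 31 25 26 1 \ 7 26 50 38 31 27, exact
```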

I also tried other tests to compare two samples which give different results:

Code:
 set obs 295
 gen group = 1 in 1/179
replace group =0 in 180/295
gen jin_k=jin in 1/179
forvalues i = 1(1)116{
replace jin_k = k[`i'] if _n == `i'+179
 }
 ranksum jin_k, by(group)//not reject at 5%
median jin_k, by(group) exact//not reject
ksmirnov jin_k, by(group) exact //not reject
Further, I have just realised from this topic that level 0 through level 4 are likely ordered categories. (I am not sure, actually; the categories in the paper are like education, taking the values high school, undergraduate, postgraduate.) If they are indeed ordered categories, then Fisher's exact test is not appropriate; what about the other tests I have used?

Finally, I have my own data measuring the same thing, with 157 subjects. When comparing my sample to either sample one or sample two, I cannot reject the null using Fisher's exact test, but I can reject it using all the other tests: -ranksum-, -median-, -ksmirnov-, and -ttest-. All of these give results different from Fisher's exact test or the chi-squared test, whether I compare samples one and two or my sample against sample one or two. I am really confused by these differing results.

Thanks for any help!!

Staggered diff in diff with continuous treatment

My data involve treatment exposures in multiple groups and multiple time periods, so I used a staggered (two-way fixed-effects) difference-in-differences model with the following commands:

Code:
xtset country
xtreg DV postreatment i.year, fe vce(cluster country)
I also want to measure the treatment intensity, but I am not sure which regression I should run.

This:

Code:
xtreg DV postreatment##c.intensity i.year, fe vce(cluster country)
which only returns a coefficient for the postreatment#intensity interaction,


or this:
Code:
xtreg DV postreatment intensity i.year, fe vce(cluster country)
which returns two coefficients.

I am a little confused because some papers report the two coefficients separately, for instance in Table 3 of this paper: https://www.gwern.net/docs/ai/2018-brynjolfsson.pdf
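A hedged sketch of the interaction approach: with ##, Stata includes the main effects automatically, and any of them that are collinear with the fixed effects or year dummies (for example, a time-invariant intensity) are dropped, which may be why only the interaction coefficient appears. The effect at different intensities can then be read off with margins (the intensity values below are illustrative):

```stata
* sketch: interaction model with cluster-robust SEs
xtreg DV i.postreatment##c.intensity i.year, fe vce(cluster country)

* treatment effect of postreatment at illustrative intensity levels
margins, dydx(postreatment) at(intensity = (0 0.5 1))
```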

Loop over file names

Hi, I have been trying to loop over some file names.

I have something like this:

foreach s of num 2007/2015{
clear all
use "$path/Matricula Universitaria/`s'_1C.dta", clear
merge m:m mrun using "$path/Matricula Universitaria/`s+1'_2A.dta", keep(match master) force
}


For instance, I want to merge the database 2007_1C with 2008_2A, but I can't get `s+1' to work: as written, it merges 2007_1C with 2007_2A.

If I write
merge m:m mrun using "$path/Matricula Universitaria/`s'+1_2A.dta", keep(match master) force


I get the following error message:
file C:.../Matricula Universitaria/2007+1_2A.dta not found

Any help is welcome! Thanks
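The macro expression `s+1' is not something Stata evaluates; a hedged sketch of the usual fix is to compute the incremented year into its own local first (inline expansion as `=`s'+1' also works):

```stata
* sketch: compute the next year before building the file name
forvalues s = 2007/2015 {
    local next = `s' + 1
    use "$path/Matricula Universitaria/`s'_1C.dta", clear
    merge m:m mrun using "$path/Matricula Universitaria/`next'_2A.dta", ///
        keep(match master) force
}
```

As an aside, merge m:m rarely does what one expects (it pairs observations by their order within groups); joinby or a checked m:1/1:m merge is usually safer.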

Colour coded scatter graph

Hi,
I would like to create a scatter plot. I have count data for both variables in question, and most observations take values between 1 and 5, so many points overlap. I would like to show, through a colour-coding scheme, how many observations sit at each point of the scatter graph.
I know it is possible to create a scatter plot with weighted markers; however, I don't want that: I want all points to stay the same size.
Is this possible?
Thanks
Anthony
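A hedged sketch with hypothetical variables x and y: count the observations at each point, keep one marker per distinct point, and overlay several scatters whose colours encode the count while every marker keeps the same size. (In Stata 18, the colorvar() option of twoway scatter does this directly.)

```stata
* sketch: colour encodes the number of observations at each (x, y) point
bysort x y: gen freq = _N          // observations sharing this point
egen tag = tag(x y)                // one marker per distinct point
twoway (scatter y x if tag & freq <= 5,            mcolor(gs12))   ///
       (scatter y x if tag & inrange(freq, 6, 20), mcolor(orange)) ///
       (scatter y x if tag & freq > 20,            mcolor(red)),   ///
    legend(order(1 "1-5 obs" 2 "6-20 obs" 3 ">20 obs"))
```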

Bootstrapping with SUEST

Hi everyone,

I am using Stata for a mediator analysis. I used -suest-, and I now need to bootstrap the indirect effects for the mediation analysis.
My command is the following:


capture program drop bootmm
program bootmm, rclass
reg ca_zsm g_comp g_coop tam_bedienbarkeit alter beruf female nettoeinkommen bildungsabschluss
estimates store cognitive_absoprtion

reg C_pc_risiko C_bedenken_zsm C_pc_nutzen tam_bedienbarkeit alter beruf female nettoeinkommen bildungsabschluss
estimates store risiko

reg C_pc_nutzen C_bedenken_zsm C_pc_risiko tam_bedienbarkeit alter beruf female nettoeinkommen bildungsabschluss
estimates store nutzen

poisson sm_infos_zsm_cv c.C_bedenken_zsm c.C_pc_risiko c.C_pc_nutzen c.C_ca_zsm i.g_comp i.g_coop c.C_pc_risiko#c.C_ca_zsm c.C_pc_nutzen#c.C_ca_zsm c.C_bedenken_zsm#c.C_ca_zsm c.tam_bedienbarkeit c.alter c.female c.bildungsabschluss
estimates store social_media_infos

poisson persönliche_infos_cv c.C_bedenken_zsm c.C_pc_risiko c.C_pc_nutzen c.C_ca_zsm i.g_comp i.g_coop c.C_pc_risiko#c.C_ca_zsm c.C_pc_nutzen#c.C_ca_zsm c.C_bedenken_zsm#c.C_ca_zsm c.tam_bedienbarkeit c.alter c.female c.bildungsabschluss
estimates store persönliche_infos

poisson sensible_infos_cv c.C_bedenken_zsm c.C_pc_risiko c.C_pc_nutzen c.C_ca_zsm i.g_comp i.g_coop c.C_pc_risiko#c.C_ca_zsm c.C_pc_nutzen#c.C_ca_zsm c.C_bedenken_zsm#c.C_ca_zsm c.tam_bedienbarkeit c.alter c.female c.bildungsabschluss
estimates store sensible_infos

suest cognitive_absoprtion risiko nutzen social_media_infos persönliche_infos sensible_infos

return scalar indav1a = _b[cognitive_absoprtion_mean:g_comp]*_b[social_media_infos_sm_infos_zsm_:C_ca_zsm]
return scalar indav2a = _b[cognitive_absoprtion_mean:g_comp]*_b[persönliche_infos_persönliche_in:C_ca_zsm]
return scalar indav3a = _b[cognitive_absoprtion_mean:g_comp]*_b[sensible_infos_sensible_infos_cv:C_ca_zsm]
return scalar indav1b = _b[cognitive_absoprtion_mean:g_coop]*_b[social_media_infos_sm_infos_zsm_:C_ca_zsm]
return scalar indav2b = _b[cognitive_absoprtion_mean:g_coop]*_b[persönliche_infos_persönliche_in:C_ca_zsm]
return scalar indav3b = _b[cognitive_absoprtion_mean:g_coop]*_b[sensible_infos_sensible_infos_cv:C_ca_zsm]
return scalar indtotal = _b[cognitive_absoprtion_mean:g_comp]*_b[social_media_infos_sm_infos_zsm_:C_ca_zsm] + ///
_b[cognitive_absoprtion_mean:g_comp]*_b[persönliche_infos_persönliche_in:C_ca_zsm] + ///
_b[cognitive_absoprtion_mean:g_comp]*_b[sensible_infos_sensible_infos_cv:C_ca_zsm] + ///
_b[cognitive_absoprtion_mean:g_coop]*_b[social_media_infos_sm_infos_zsm_:C_ca_zsm] + ///
_b[cognitive_absoprtion_mean:g_coop]*_b[persönliche_infos_persönliche_in:C_ca_zsm] + ///
_b[cognitive_absoprtion_mean:g_coop]*_b[sensible_infos_sensible_infos_cv:C_ca_zsm]
end

bootstrap r(indav1a) r(indav2a) r(indav3a) r(indav1b) r(indav2b) r(indav3b) r(indtotal), bca reps(500): bootmm



But it did not work, and I do not know how to solve this problem.
This is the error message that appears every time:

. bootstrap r(indav1a) r(indav2a) r(indav3a) r(indav1b) r(indav2b) r(indav3b) r(indtotal), bca reps(500): bootmm
(running bootmm on estimation sample)

Jackknife replications (458)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn 50
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn 100
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn 150
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn 200
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn 250
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn 300
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn 350
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn 400
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn 450
nnnnnnnn
insufficient observations to compute jackknife standard errors
no results will be saved
r(2000);
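The failure above occurs in the jackknife pass that the bca option requires (note the "Jackknife replications" header), not in the bootstrap itself. A hedged workaround is to drop bca and report percentile or bias-corrected intervals instead; it may also help to restrict all six regressions to a common estimation sample first, since differing samples across equations can starve the replications of observations.

```stata
* sketch: same program, but without the bca acceleration step
bootstrap r(indav1a) r(indav2a) r(indav3a) r(indav1b) r(indav2b) ///
    r(indav3b) r(indtotal), reps(500): bootmm

* percentile and bias-corrected confidence intervals
estat bootstrap, percentile bc
```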



Can anyone help me?
I would be very very grateful.

Thanks in advance,
Sarah

Rescale a variable between 0 and 1 within industry

Dear Stata Users,

I have a continuous variable "spread1". I want to rescale it between 0 and 1 within industry ("sic2"). I tried the code below; however, it generates missing values. Can you please help me adjust this code?


Code:
 
set more off
gen spread_rank = .
levelsof sic2, local(tempyear)
foreach i in `tempyear' {
g decile_temp=(spread1 - r(min)) / (r(max) - r(min))  if sic2==`i'
replace spread_rank = decile_temp if missing(spread_rank)
drop decile_temp
}
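The r(min) and r(max) in the loop above are never filled in, because no -summarize- is run before they are used, which is why the generated values are missing. A sketch of the same rescaling with egen, which computes the group minima and maxima directly:

```stata
* sketch: min-max rescaling of spread1 to [0, 1] within each sic2 industry
egen sp_min = min(spread1), by(sic2)
egen sp_max = max(spread1), by(sic2)
gen spread_rank = (spread1 - sp_min) / (sp_max - sp_min)
drop sp_min sp_max
```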

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str6 gvkey double fyear byte sic2 float spread1
"001004" 1991 50          .
"001004" 1995 50  1.1299435
"001004" 1996 50  1.6129032
"001004" 1997 50   .7100592
"001004" 1998 50   3.174603
"001004" 1999 50   1.793722
"001004" 2000 50   .9992887
"001004" 2001 50   .1749738
"001004" 2002 50  1.1173227
"001004" 2009 50  .05082708
"001004" 2010 50  .03787248
"001004" 2012 50   .1994963
"001004" 2013 50  .08229856
"001004" 2014 50   .0676999
"001009" 1991 34   5.128205
"001009" 1992 34  3.5714285
"001009" 1993 34   5.940594
"001009" 1994 34  3.5714285
"001013" 2009 36  .11997875
"001013" 2010 36  .07902196
"001034" 1995 28   4.295943
"001034" 1996 28  2.5316455
"001034" 1997 28  2.2988505
"001034" 1998 28   1.591512
"001034" 1999 28  1.6260163
"001034" 2000 28  1.1396011
"001034" 2001 28  1.6247886
"001034" 2002 28   .3355702
"001034" 2003 28  .04954287
"001034" 2004 28  .05894624
"001034" 2005 28 .035236377
"001034" 2006 28  .08287837
"001034" 2007 28   .0990612
"001036" 1997 34  1.9704434
"001036" 1998 34  2.1374047
"001036" 1999 34       3.75
"001036" 2000 34   5.181347
"001054" 1992 73  2.2222223
"001054" 1993 73   3.550296
"001056" 1999 38   6.022187
"001056" 2000 38  .12586533
"001056" 2001 38   7.373274
"001056" 2002 38  .57636833
"001056" 2003 38 -.25906712
"001056" 2004 38 -.06980962
"001056" 2005 38          0
"001056" 2006 38  .08565506
"001072" 1998 36   2.761341
"001072" 1999 36  1.8166804
"001072" 2000 36          .
"001072" 2001 36   2.437277
"001072" 2002 36  .22271162
"001072" 2003 36  .18187746
"001072" 2004 36  .08160122
"001072" 2014 36  .07010325
"001078" 2015 28   .0222655
"001081" 2006 26   .3913891
"001094" 2004 51   .5129675
"001094" 2005 51   .3992044
"001098" 1994 36   2.690583
"001111" 1995 73  1.8867924
"001111" 1996 73   2.247191
"001111" 1997 73  .58309036
"001111" 1998 73  1.0050251
"001111" 2000 73  .25740024
"001111" 2001 73  .10028643
"001111" 2002 73  .13841148
"001111" 2003 73  .06322653
"001111" 2004 73  .06782115
"001111" 2005 73          0
"001111" 2006 73  .05284137
"001111" 2007 73  .10971178
"001115" 1994 38   4.511278
"001115" 1995 38  2.1052632
"001115" 1996 38  1.8808777
"001115" 1997 38   .3350084
"001115" 1998 38    .260078
"001115" 1999 38   .3169572
"001128" 1991 73  4.0816326
"001128" 1992 73   9.929078
"001128" 1993 73    6.31579
"001137" 1992 36  4.5454545
"001137" 1993 36  1.1976048
"001137" 1994 36  1.3071896
"001161" 1991 36          .
"001161" 1992 36   .6920415
"001161" 1993 36   .7017544
"001161" 1994 36  .50377834
"001161" 1995 36  1.5037594
"001161" 1996 36  1.9417475
"001161" 1997 36  2.1201413
"001161" 1998 36   5.639913
"001161" 1999 36          .
"001161" 2000 36   .4535147
"001161" 2001 36   .9419128
"001161" 2002 36   .1549222
"001161" 2003 36  .20140807
"001161" 2004 36   .0908286
"001161" 2005 36  .06542511
"001161" 2006 36  .09837904
end

Interpretation of threshold result

Dear Community,

I am a user of Stata 15. I am trying to estimate the threshold effects of inflation on the value of shares traded using time-series data.
I used the -threshold- command (with regionvars(value1 inflation), threshvar(inflation1), and nthresholds(2)) and obtained the result below.


. threshold marketcap, threshvar(L.inflation) regionvars(L.marketcap inflation interestrate) nthresholds(2)

Searching for threshold: 1
(Running 86 regressions)
.................................................. 50
....................................
Searching for threshold: 2
(Running 60 regressions)
.................................................. 50
..........

Threshold regression

Number of obs = 107
Full sample: 1991q2 - 2017q4 AIC = 1402.4793
Number of thresholds = 2 BIC = 1434.5533
Threshold variable: L.inflation HQIC = 1415.4817



Order Threshold SSR


1 9.4333 4.438e+07
2 16.9333 4.211e+07




Market cap. Coef. Std. Err. z P>|z| [ 95% Conf. Interval]


Region1
Market cap.
L1. .7015855 .0585698 11.98 0.000 .5867907 .8163802

inflation 1046.162 334.699 3.13 0.002 390.164 1702.16
interest rate -293.9598 53.70977 -5.47 0.000 -399.229 -188.6906
_cons 4293.822 2420.274 1.77 0.076 -449.827 9037.472


Region2
Market cap.
L1. 1.004932 .0161035 62.40 0.000 .9733692 1.036494

inflation -.4003764 34.83209 -0.01 0.991 -68.67001 67.86926
interest rate -1.705306 8.893831 -0.19 0.848 -19.13689 15.72628
_cons 157.0955 502.303 0.31 0.754 -827.4003 1141.591


Region3
Market cap.
L1. .9415728 .0246128 38.26 0.000 .8933327 .989813

inflation -1.075228 7.273 -0.15 0.882 -15.33005 13.17959
interest rate -4.740263 5.111379 -0.93 0.354 -14.75838 5.277856
_cons 496.4073 364.1677 1.36 0.173 -217.3482 1210.163


Can any of you please help me interpret the result above?
Thank you in advance. Please check the attachment.

create a matrix

Hi all,

This might be a trivial question, but whatever I have tried does not seem to work (e.g. looping through the values of the variable fdi using forvalues and trying to generate new variables).

I have 2000+ territorial units (variable id) and the number of foreign firms located in each territory (variable fdi). My aim is to create a matrix similar to the one below. Namely, for each id, I want to create a new variable equal to the number of foreign firms in that territorial unit. Put differently, a 2000-by-2000 matrix is what I need.

Will you please give me some hints regarding which commands I should look at/begin with?

Many thanks!

[Attachment: example matrix not preserved]
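For reference, a minimal sketch of one way to build such a set of variables. It assumes id is numeric, fdi is the firm count per unit, and that the roughly 2,000 extra variables fit within your Stata flavor's variable limit; the fdi_ prefix is an illustrative choice, not part of the original question:

```stata
* sketch, not a tested solution: assumes one observation per id (or that
* fdi is constant within id) and a numeric id variable
levelsof id, local(ids)
foreach k of local ids {
    summarize fdi if id == `k', meanonly
    generate double fdi_`k' = r(mean)   // constant column = unit k's fdi
}
```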

Double fixed effects Yes/No with estout

I need to declare two fixed effects at the bottom of a regression table, but the second fixed effect (Time FE) is being ignored.

Code:
local vars "log_loans log_amount_real late term baddebt"
foreach x of local vars {          // "of local vars" -- "of vars" is a syntax error

    eststo clear

    reg `x' wind_avg_treat, r
    estadd local county "No"       // estadd must come BEFORE eststo, otherwise
    estadd local time "No"         // the added locals are not in the stored copy
    eststo s1

    reg `x' wind_avg_treat i.fips, r
    estadd local county "Yes"
    estadd local time "No"
    eststo s2

    reg `x' wind_avg_treat i.fips i.year, r
    estadd local county "Yes"
    estadd local time "Yes"
    eststo s3

    reg `x' c.wind_avg_treat##c.wind_avg_treat i.fips i.year, r
    estadd local county "Yes"
    estadd local time "Yes"
    eststo s4

    esttab s1 s2 s3 s4 using `x'.csv, replace drop(*year *fips) ///
        stats(county time, labels("County FE" "Time FE")) label
}

Calculating residuals in a multinomial regression as part of 2SRI

I am attempting to do a 2 stage residual inclusion (2SRI) model where the first stage is estimating a multinomial regression. The problem I am having is figuring out how to calculate the Pearson residual to enter into my second stage. I have seen one other unanswered post related to this question.

Let's say my data looks like this:

ID, provider, copay, opioid
1, MD1, 50, yes
2, MD2, 10, no
3, MD1, 25, no
4, MD3, 10, no
5, MD2, 30, yes
6, MD3, 14, yes

Using MD1 as the base comparison, the first stage estimation would ideally tell me how a dollar increase in copay influences the likelihood that you see MD1 versus MD2, MD1 versus MD3, etc. In Stata speak:

mlogit provider copay
predict res, rstandard

Ideally I would be able to predict the residual as above but it seems that I need to manually calculate this residual rather than rely on a postestimation option to predict the residual. And I'm pretty sure I need the Pearson residual though I am open to the fact I may be wrong on this. If someone could help me figure out the most efficient approach to the calculation, I would greatly appreciate it.
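For what it's worth, a hedged sketch of the manual calculation: after mlogit, predicted probabilities are available via predict, and a Pearson-type residual for each non-base category k can be formed as (1{provider==k} − p_k)/sqrt(p_k(1−p_k)). This assumes provider has been encoded as a numeric variable taking values 1, 2, 3 with 1 (MD1) as the base; the names p1-p3 and res2-res3 are illustrative:

```stata
* illustrative sketch: per-category Pearson-type residuals after mlogit
mlogit provider copay, baseoutcome(1)
predict p1 p2 p3, pr                      // one predicted probability per category
forvalues k = 2/3 {
    generate res`k' = ((provider == `k') - p`k') / sqrt(p`k' * (1 - p`k'))
}
```

Whether these (rather than, say, generalized residuals) are the right objects for the second stage of a 2SRI is a substantive question worth checking in the 2SRI literature.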

And just to complete the thought: in the second stage, I want to know how provider choice influences opioid outcomes, and let's assume that provider choice is influenced by copay, hence the need for the first stage.

logit opioid provider res

Thank you!
Bianca

Randomly Assigning Treatment

I have person-year level data, and I am using a DiD design that exploits state-level shocks. Each person is assigned to a unique state based on location. I have 10 treated states. While the primary analysis is quite easy to code, I need help with a desired robustness check. I want to write some sort of loop to run placebo tests (if that's the right term here) with treatment assigned to states that should be control states. In other words, I want the placebo analysis to be repeated with the 40 control states, over several iterations of assignment among those 40 (10 per iteration if that is simpler). I have found several topics that seemed adjacent to this, but I could not make sense of using them in my setting. Please let me know if you need any more information.
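One possible shape for such a loop, offered as an untested sketch: in each replication, draw 10 pseudo-treated states at random from the controls, re-run the DiD regression, and store the placebo coefficient. All variable names here (y, state, post) and the regression specification itself are placeholders to adapt, and the sketch assumes the data have already been restricted to the 40 control states:

```stata
* sketch of a placebo loop under the assumptions stated above
set seed 2020
tempname memhold
tempfile results
postfile `memhold' rep b se using `results'
forvalues r = 1/500 {
    quietly {
        preserve
        * tag one observation per state and draw 10 states at random
        bysort state: generate byte tag = (_n == 1)
        generate double u = runiform() if tag
        bysort tag (u): generate byte pseudo = tag & (_n <= 10)
        bysort state (pseudo): replace pseudo = pseudo[_N]
        * placeholder DiD specification -- replace with your own
        regress y i.pseudo##i.post, vce(cluster state)
        post `memhold' (`r') (_b[1.pseudo#1.post]) (_se[1.pseudo#1.post])
        restore
    }
}
postclose `memhold'
use `results', clear
summarize b        // distribution of placebo effects
```

The true treatment effect can then be compared against the distribution of the stored placebo coefficients.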

Justin

Problems with local macros in finding an excel file

Hello Statalisters,
I've been working all day on a do-file and I couldn't make it work, because I'm receiving this error message:
file C:/Users/afede/OneDrive - CAF/RED 2020/Capitulo_5/Material_proyecciones/4.Entregas/Parciales/Salud/191220_11vo_envio/xls/e1_demografia/_e1_het_gedadg.xlsx not found
I think the code is not recognizing the locals, and it's driving me crazy! I can assure you that that path contains all of the `country'_e1_het_gedadg.xlsx files, and I've already checked for stray spaces in the file names.

Let me explain the task I'm after. Mainly, I'm trying to import some sheets of Excel files by looping over their names, operate on them in some way, and then save them as .dta files.
The running locals are country and sheet, and I need a different procedure for the countries COL (Colombia) and URY (Uruguay); that's why you will see an if/else statement.

The code is as follows:
Code:
// Rutine 1

cls
clear all

* Setting the paths
local output "C:\Users\afede\OneDrive - CAF\RED 2020\Capitulo_5\Material_proyecciones\4.Entregas\Parciales\Salud\191220_11vo_envio\resultados_proyecciones"
local raw "C:\Users\afede\OneDrive - CAF\RED 2020\Capitulo_5\Material_proyecciones\4.Entregas\Parciales\Salud\191220_11vo_envio\xls\e1_demografia"

* Setting local macros for the loops
local country "ARG BOL CHL MEX PAN PRY PER COL URY"
local sheet "all pub ss"

* Making the loop
foreach i of local country {

    * COL and URY use the "homo" files; the rest use the "het" files
    if inlist("`i'", "COL", "URY") local tipo "homo"
    else                           local tipo "het"

    foreach j of local sheet {
        display as error "`i'"
        import excel "`raw'/`i'_e1_`tipo'_gedadg.xlsx", sheet("e1_proy_`j'_g") clear
        sxpose, clear force
        destring _v*, replace

        generate pais = "`i'"
        generate subsistema = "`j'"

        generate año = .
        local años "2015 2020 2025 2030 2035 2040 2045 2050 2055 2060 2065"
        * NOTE: the inner counters must NOT be named i -- the original code
        * reused i here, which wiped out the country macro and produced the
        * "_e1_het_gedadg.xlsx not found" error on the next sheet
        forvalues t = 1/11 {
            local n : word `t' of `años'
            replace año = `n' if _n == `t'
        }

        local nombres "e0_14 e15_29 e30_44 e45_64 e65_80 e_mas80 e_agregado"
        forvalues v = 1/7 {
            local nomb : word `v' of `nombres'
            rename _var`v' `nomb'
        }
        save "`output'/`i'_`j'.dta", replace
    }
}
Can anyone tell me what might be going wrong? In general, I'm not sure about the correct quoting and calling of locals when dealing with file paths, but everything that I read didn't work.

Thanks, and Merry Christmas to you all!

reshape

Hi everyone,
I have a long dataset and used this code to reshape it to wide format:
Code:
ds hhidpn wave, not
reshape wide `r(varlist)', i(hhidpn) j(wave)
I was wondering how I can reshape the data back to long format. I tried the code below, but it did not work:
Code:
reshape long `r(varlist)' , i(hhidpn) j(wave)
Thanks.
Nader
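An editorial note on a likely cause, hedged since the error message is not shown: r(varlist) is a return value that later commands overwrite, and reshape long needs the stub names (without the wave suffix), which are exactly what ds returned before the wide reshape. Saving them in a local first, or simply re-issuing the bare command, makes the round trip work:

```stata
* sketch: save the stub names before reshaping wide
ds hhidpn wave, not
local stubs `r(varlist)'
reshape wide `stubs', i(hhidpn) j(wave)

* either of these returns to long in the same session:
reshape long                                   // reuses the stored reshape spec
reshape long `stubs', i(hhidpn) j(wave)        // or name the stubs explicitly
```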





Creating loop to convert SPSS to CSV

Hi everyone, new user here. I have some basic skill in Stata but am trying to become more advanced. I have several SPSS files that I am trying to convert to CSV. I created a do-file where, one by one, I import the SPSS files using import spss and export them as CSV using export delimited. I know I can do this all at once using a loop and a local holding the different file names, but I'm not sure how to put it together. Would someone more advanced be willing to help? I'd be very grateful!
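A sketch of the kind of loop described, assuming the .sav files live in the current working directory (adjust the path and extension as needed):

```stata
* sketch: convert every .sav file in the current directory to .csv
local files : dir . files "*.sav"
foreach f of local files {
    import spss using "`f'", clear
    local base = substr("`f'", 1, strlen("`f'") - 4)   // strip ".sav"
    export delimited using "`base'.csv", replace
}
```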

Total % abs(bias) reduction in pstest

Can anyone help tell me where I can get, or how to calculate, the total % |bias| reduction from pstest after psmatch2? I see only the bias reduction for each covariate. However, in some published papers the authors report a 'total %|bias| reduction', and I do not know how to calculate it.
Thank you very much!
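For reference, the definition used in many applied papers (worth verifying against the specific papers you are following) is the percentage reduction in the mean absolute standardized bias across all covariates, which is a one-line calculation once the before/after mean biases are in hand:

```stata
* sketch: the two mean absolute biases are hypothetical values read off
* (or averaged from) the before/after bias columns of the pstest table
local meanbias_before = 12.4
local meanbias_after  =  3.1
display "total %|bias| reduction = " ///
    100 * (`meanbias_before' - `meanbias_after') / `meanbias_before'
```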