Replicating Bar Graph From Article (Flors-Mas 2021)

To whomever is interested: I am trying to replicate the bar graph below, which I obtained from an article (Flors-Mas 2021). As you can see, it is a bar graph whose categories are distributed across different values. I was wondering how to replicate it using the data from the article, which I can also attach to this post if need be. I am currently trying to reproduce the graph, but there are so many options that I don't even know where to begin.

[Figure: bar graph from Flors-Mas (2021); attachment not shown]
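In case it helps in deciding where to begin: a stacked percentage bar chart of one categorical variable within the categories of another is one common reading of such a figure. A minimal sketch with the auto dataset, where rep78 and foreign stand in for the article's variables (which I have not seen):

Code:
* a minimal sketch: stacked percentage bars of one category within another;
* rep78 and foreign are stand-ins for the article's actual variables
sysuse auto, clear
graph bar (count), over(rep78) over(foreign) asyvars stack percentages ytitle("Percent")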

Merging 2 rounds of the WBES to create a short panel

Hey everyone,
I need your help merging the two waves of the World Bank's Enterprise Survey (WBES) for The Gambia, namely the 2018 and 2023 waves. The 2023 wave contains 74 firms that were also part of the 2018 dataset, so I want to construct a short panel in which firms from 2018 are matched to those in 2023. My main problem is that the ID and IDSTD variables differ across the two datasets. However, I have the ISIC rev. 3 codes (mprod), and a variable stratificationpanelcode is defined such that firms new to the wave are coded "fresh" and firms from a previous round are coded "panel", so I can identify the 74 firms even without matching, simply by inspecting this variable. I have tried a one-to-one merge by generating a merge key in both datasets from mprod and stratificationpanelcode, via

Code:
gen merge_key_2018 = mprod + Stratificationpanelcode*10000

After doing this in both datasets, I merge using:

Code:
merge 1:1 merge_key_2018 using "C:\Users\User\Desktop\dataset_2023_with_keys.dta", keepusing(merge_key_2023)

This keeps returning:

Code:
variable merge_key_2018 not found
r(111);

even though the variable is in my dataset.

Is there something I am missing that could explain what is happening?
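For a 1:1 merge, Stata looks for the key variable under the same name in the using dataset; if the 2023 file only holds merge_key_2023, then merge_key_2018 is indeed not found there, which would produce exactly this r(111). A minimal sketch that renames both keys to a common name before merging (the 2018 filename is hypothetical):

Code:
* a minimal sketch: give the key the same name in both files before merging
use "C:\Users\User\Desktop\dataset_2023_with_keys.dta", clear
rename merge_key_2023 merge_key
tempfile wave2023
save `wave2023'

use "C:\Users\User\Desktop\dataset_2018_with_keys.dta", clear   // hypothetical filename
rename merge_key_2018 merge_key
merge 1:1 merge_key using `wave2023'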

Coefplot - multiple models with separate added text for each model.

Hello Stata-listers,

I am using coefplot with multiple models and am trying to get sample sizes listed on the right y-axis (each model has a different sample size because of missing values and different time periods). I have code to pull the Ns but can't figure out how to get them displayed. I am using the ymlab option with various local macros, but for ease in this example we can use simple text. For example, how can I modify the following code to display the strings "First", "Second", and "Third" on the right y-axis of the first plot, and "Fourth", "Fifth", and "Sixth" on that of the second plot? I hope that makes sense. I have attached a manually modified image to show roughly what I'm trying to do with the code below. I know that I can run two separate plots and combine them using grc1leg; the problem with that, however, is that it repeats the left y-axis labels, and if I remove them from the second plot, the aspect ratios of the two plots differ.

Code:
sysuse auto, clear
reg trunk foreign gear_ratio if rep78 == 5   // Outcome = Trunk, sub-group = 5
estimates store trunk_s5
reg trunk foreign gear_ratio if rep78 == 4   // Outcome = Trunk, sub-group = 4
estimates store trunk_s4

reg mpg foreign gear_ratio if rep78 == 5     // Outcome = MPG, sub-group = 5
estimates store mpg_s5
reg mpg foreign gear_ratio if rep78 == 4     // Outcome = MPG, sub-group = 4
estimates store mpg_s4

reg turn foreign gear_ratio if rep78 == 5    // Outcome = Turn, sub-group = 5
estimates store turn_s5
reg turn foreign gear_ratio if rep78 == 4    // Outcome = Turn, sub-group = 4
estimates store turn_s4

label var trunk "Trunk"
label var mpg "MPG"
label var turn "Turn"  

#delimit ;
coefplot (trunk_s4 \ mpg_s4 \ turn_s4, keep(foreign gear_ratio)),
        ymlab(1 "First" 2 "Second" 3 "Third", axis(2)) ||
    (trunk_s5 \ mpg_s5 \ turn_s5, keep(foreign gear_ratio)),
        ymlab(1 "Fourth" 2 "Fifth" 3 "Sixth", axis(2)) || ,
    eqrename(*_s4 = "" *_s5 = "")
    swapnames aseq eqlabels("Origin" "Gear Ratio")
    bylabels("Subgroup 4" "Subgroup 5") nooffset ;
#delimit cr
[Figure: manually annotated coefplot showing the desired right-axis sample-size labels; attachment not shown]

Efficient way to generate binary vars while changing var names

Let's say I have 3 variables, freq_fight, freq_run, and freq_drink, each containing values 0, 1, 2, 3. I want to create binary 0/1 versions of each of those variables and also have the new binary variables replace 'freq' with 'binary' in the variable name. I could do this very clunkily, but is there a way to do it all in one loop?
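One loop with the subinstr extended macro function can handle both the renaming and the dichotomization. A minimal sketch, assuming "binary" should mean that any value above 0 is coded 1:

Code:
* a minimal sketch: dichotomize each freq_* variable and swap the name stub
foreach v of varlist freq_fight freq_run freq_drink {
    local newname : subinstr local v "freq" "binary"
    gen byte `newname' = (`v' > 0) if !missing(`v')
}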

Fractional response regression for panel data

Is anyone familiar with doing fractional response regression for panel data in Stata?

I'm trying to figure out if I'm using the correct code:

fracreg logit DV IV controls mean_of_the_IV mean_of_each_control i.Year, vce(cluster orgID)
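That looks like the correlated random effects (Mundlak) device in the spirit of Papke and Wooldridge: a pooled fractional logit with panel-level means of the time-varying covariates and cluster-robust standard errors. A minimal sketch of how the means would be constructed (IV and control1 are hypothetical names):

Code:
* a minimal sketch of the correlated random effects setup, assuming the panel
* is identified by orgID; IV and control1 are hypothetical variable names
bysort orgID: egen mean_IV = mean(IV)
bysort orgID: egen mean_control1 = mean(control1)
fracreg logit DV IV control1 mean_IV mean_control1 i.Year, vce(cluster orgID)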

No convergence with Stata 17

When I run some programs in Stata 17, they do not converge on two of my computers with Intel i7 processors (32 GB and 64 GB of RAM). The same programs and data converge in Stata 17 on machines with AMD processors (an A12-9720 with 16 GB of RAM and a Ryzen 9 3900XT). When I use Stata 16, there are no problems on my i7 machines. Has anyone experienced the same problem?

Guidance on Latent Class Analysis - a few starting questions

I have a 9-question "Vaccine Hesitancy Scale" developed by SAGE to evaluate vaccine confidence. Each question is answered on a 5-level Likert-style scale (Strongly agree, Agree, Neither agree nor disagree, Disagree, Strongly disagree). In addition to the "full" data, I have also collapsed responses into an "abbreviated" set (Agree, Neither agree nor disagree, Disagree). The questions can be broadly categorized into questions of confidence (do you believe that vaccines work and are helpful) and questions of concern over risk (do you believe that vaccines are safe and that new vaccines carry low risks). Responses are coded 1-5 (full) or 1-3 (abbreviated), with lower values corresponding to greater confidence in and acceptance of vaccines. I have 835 records in my data set.

We hypothesized we might see about 3 latent classes in our data (vaccine enthusiasts, with low values across the board; vaccine skeptics, with higher values for all responses; and vaccine ambivalents, with low "confidence" responses but higher "risk" responses: the folks in the middle). It looks like we may have four classes, and the abbreviated responses may provide the best model to work with (also easier to interpret and describe). Here are the AIC and BIC for the seven models I compared:

Model                    N    ll(null)  ll(model)  df  AIC        BIC
twoclass_full            835  .         -10533.3   31  21,128.63  21,275.18
threeclass_full          835  .          -8626.52  42  17,337.04  17,535.59
fourclass_full           835  .          -8024.92  53  16,155.84  16,406.39
fiveclass_full           835  .          -7359.83  64  14,847.67  15,150.22
threeclass_abbreviated   835  .          -5011.05  38  10,098.10  10,277.74
fourclass_abbreviated    835  .          -2140.76  48   4,377.52   4,604.43
fiveclass_abbreviated    835  .          -2147.69  58   4,411.38   4,685.57
While running these models, I came across a few things that I couldn't quite wrap my head around. Could you help with the following questions?

1. I've heard that there is a risk of the model iterations converging to a local maximum of the likelihood function and missing the global maximum, and I believe this may be a bigger risk when you specify more classes (i.e., more than five?). Is this something I should be concerned about? How can I ask Stata to re-solve the model to better approximate the global maximum? (See the sketch after these questions.)
2. The estat lcmean command took an hour to run (!) Is this normal? I'm finding these postestimation statistics are taking a long time to run; I realize they must be computationally intense, but I have a new-ish computer, and an hour to run a command seems excessive. Could I be doing something wrong, or specifying something incorrectly that I should be thinking about?
3. Would you consider anything else beyond the AIC and BIC in comparing the 7 models above? What else would you look at to select the "best" model?
4. Finally, I am concerned about how to classify the function family. I initially felt ologit made sense for the Likert scale data, but the model wouldn't work ("initial values not feasible"). Then I left out any specification, and Stata just picked Family: Gaussian, Link: Identity. I have heard that Likert can be analyzed as continuous data; does this seem reasonable?
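On question 1, gsem's startvalues() option can refit the model from many random starting values; if the best of those runs reproduces the same maximized likelihood, that is some reassurance that the solution is not a local maximum. A minimal sketch, with vhs1-vhs9 as hypothetical names for the nine scale items:

Code:
* a minimal sketch: refit the 4-class model from 20 random starting points
gsem (vhs1-vhs9 <- _cons), lclass(C 4) ///
    startvalues(randomid, draws(20) seed(12345))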

I appreciate any advice or any further videos/presentations/reading you might recommend. Thank you.

Should we control for the sub-items of a composite variable?

Dear Statalist:

I want to study the effect of X on Y.
X is a composite variable, e.g., X = A*exp(B) - C.
Should I control for A, B, and C when running the regression?
On one hand, according to "A Crash Course in Good and Bad Controls", A, B, and C are "common causes" of X and Y.
On the other hand, controlling for A, B, and C may introduce multicollinearity problems.

So, in the regression, should we also control for the effects of A, B, and C separately?

Incidence rates per 100,000 persons

Hi, I have the data below on patients who have all experienced a thrombosis event. What I am trying to do is compute the incidence of thrombosis per 100,000 persons, using the total population data, stratified by sex. Second, I want to repeat that analysis adjusting for age. I am not sure how to start. I began with a Poisson regression model, but it seems to give incidence rates based on individual observations and cannot estimate rates per 100,000 people. Thank you for your help.

---------------------- copy starting from the next line -----------------------
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long id float age str34 sex str113 race str7 survived long censustractid str8 medianhousehold str4 medianage long(totalpopulation white)
102472 55 "Male"   "White"                            "No"  31532 "28611"  "31"    4355  1725
103810 72 "Male"   "Black/African-American"           "No"  34213 "71250"  "44.7"  2602  2082
103864 94 "Female" "White"                            "No"  13010 "72813"  "42.9"  2928  2769
103867 87 "Male"   "Unknown"                          "No"  15748 "31442"  "41.7"  1381   589
103869 70 "Male"   "White"                            "No"  25732 "43883"  "31.8"  3701  3241
103882 27 "Male"   "White"                            "No"  22839 "48365"  "41.2"  4431  4035
103885 67 "Female" "Black/African-American"           "Yes" 22852 "15720"  "23.6"  1499   376
103972 22 "Male"   "Unknown"                          "No"   6661 "36280"  "41.1"  3963  1268
103987 54 "Female" "Unknown"                          "No"  29532 "48405"  "38.4"  5962  5403
103989 81 "Female" "White"                            "No"  29458 "74167"  "53.3"  4314  3927
103999 84 "Female" "White"                            "No"  25711 "70169"  "36.1"  3784  3377
104003 56 "Female" "Unknown"                          "Yes"  6470 "91023"  "39.8"  5684  5053
104050 55 "Female" "Unknown"                          "No"  12866 "11397"  "21.6"  3709   218
104052 77 "Male"   "Black/African-American"           "No"  14864 "31607"  "20.4"  2483   505
104053 44 "Female" "White"                            "No"  14850 "15172"  "21.8"  3523  1222
104054 75 "Female" "Unknown"                          "No"  14975 "50655"  "47.5"  3153  2402
104055 69 "Female" "Black/African-American"           "No"  14912 "46116"  "33.4"  9627    26
104056 54 "Male"   "White"                            "Yes" 14965 "62745"  "28.9"  4962  2975
104068 59 "Female" "Asian"                            "Yes" 15900 "60310"  "37.6"  5659  1097
104069 92 "Male"   "Asian"                            "Yes" 15941 "93750"  "34.5"  1948   650
104070 82 "Male"   "Unknown"                          "Yes" 16161 "47806"  "40.9"  4982   823
104073 73 "Female" "White"                            "Yes" 34279 "47273"  "33.5"  3673  2639
104075 82 "Female" "Native Hawaiian/Pacific Islander" "No"  22831 "35074"  "42.2"  4011  2600
104077 80 "Female" "White"                            "No"  22821 "43309"  "56"    2189  2037
104180 60 "Female" "White"                            "No"  29415 "44196"  "38.9"   946   754
104188 44 "Male"   "White"                            "No"  29300 "43542"  "31.5"  5803  4301
104192 98 "Female" "Black/African-American"           "No"  29294 "59286"  "35.1"  3645  2449
104204 73 "Male"   "Black/African-American"           "No"  29297 "66216"  "39"    4303  3039
104256 73 "Male"   "White"                            "No"  25523 "62796"  "27"    2856  1566
104259 43 "Male"   "White"                            "No"   6671 "18231"  "32.8"  3271  2405
104264 26 "Female" "Unknown"                          "Yes"  8645 "65214"  "34.3"  5123  3770
104271 55 "Female" "White"                            "No"      . ""       ""         .     .
104277 65 "Female" "Asian"                            "Yes"  6615 "97829"  "49.7"  6830  5862
104282 63 "Male"   "White"                            "No"  15598 "54345"  "39.4"  9700  3507
104283 53 "Male"   "White"                            "Yes" 33345 "56818"  "34.2"  7693  5917
104284 77 "Female" "Hispanic/Latino"                  "No"  12877 "22880"  "30.4"  3483   937
104285 69 "Male"   "White"                            "No"  14856 "56190"  "39.5"  1997   625
104287 85 "Male"   "White"                            "No"  22908 "62975"  "34.8"  5276  4954
104288 76 "Male"   "White"                            "No"  16118 "67422"  "38.5"  6605  1772
104289 61 "Female" "Unknown"                          "No"  16072 "91086"  "42.8"  5496   758
104290 37 "Male"   "Asian"                            "No"  15978 "57544"  "39.6"  4942   816
104291 81 "Male"   "Unknown"                          "No"  16129 "61558"  "37.8"  8233  4417
104292 80 "Male"   "White"                            "No"  22964 "63586"  "42.7"  7643  7166
104294 51 "Male"   "White"                            "No"  14790 "65185"  "35.7"  8920  4311
104295 54 "Female" "White"                            "No"  14777 "80386"  "43"    8241  7418
104298 68 "Male"   "White"                            "No"  11169 "57537"  "41"    7241  2256
104299 57 "Female" "White"                            "No"  11056 "31740"  "27.1"  3892  1307
104306 81 "Female" "White"                            "Yes" 11117 "50025"  "49.7"  3586  3159
104312 38 "Female" "White"                            "No"  11125 "54672"  "32.6"  5599  3135
104353 65 "Male"   "White"                            "No"  34434 "33370"  "33.7"  4419  3189
104357 35 "Male"   "Unknown"                          "No"   8635 "38229"  "43.2"  2721  2408
104364 88 "Male"   "Unknown"                          "No"   8711 "88730"  "38.4"  9872  6267
104378 66 "Male"   "Unknown"                          "No"   8591 "57500"  "30.7"  6579  4091
104499 60 "Female" "White"                            "No"   8355 "35280"  "37.3"  4384  3205
104505 86 "Male"   "Unknown"                          "No"   8595 "67227"  "30.1"  3542  2067
104511 74 "Male"   "Hispanic/Latino"                  "No"   6757 "59667"  "35.1"  8160  6307
104513 78 "Male"   "White"                            "No"   6727 "62813"  "36.1"  5677  4420
104516 80 "Female" "Unknown"                          "No"  29467 "48047"  "46.3"  2315  2077
104532 47 "Male"   "Black/African-American"           "Yes" 12864 "18602"  "32.9"  5113   757
104533 56 "Male"   "Black/African-American"           "No"  14863 "18722"  "37.8"  2532   204
104534 77 "Male"   "White"                            "Yes" 14899 "17545"  "40"    3580   269
104537 64 "Male"   "White"                            "No"  27528 "16955"  "34.1"  2647   511
104538 60 "Female" "Black/African-American"           "No"  27575 "33727"  "37.3"  4534  2899
104540 56 "Male"   "Unknown"                          "No"  16034 "35938"  "30.2"  6158   615
104553 44 "Male"   "White"                            "No"  16503 "38365"  "40.8"  4978  4733
104568 65 "Male"   "White"                            "Yes" 33508 "87301"  "35.4"  9155  5716
104569 83 "Female" "Unknown"                          "No"  15866 "87546"  "50.1"  5894   677
104570 59 "Male"   "Unknown"                          "No"  15896 "94875"  "43.3"  4111   922
104571 49 "Female" "Unknown"                          "No"  15986 "90167"  "31.6"  7364   954
104573 55 "Female" "Unknown"                          "Yes" 16065 "72940"  "41"    4377   649
104577 69 "Female" "White"                            "No"  25484 "39056"  "35.7"  2783  1950
104582 23 "Male"   "Black/African-American"           "No"  33383 "69718"  "34.3"  5285  3975
104584 73 "Female" "Black/African-American"           "No"  33489 "78092"  "41.8"  3344  2266
104586 77 "Female" "Black/African-American"           "No"  33297 "60069"  "27"    4364  3264
104593 51 "Female" "White"                            "No"  29554 "54776"  "40.5"  5037  3999
104594 70 "Female" "Unknown"                          "No"  29443 "61857"  "41.2"  5270  4917
104596 91 "Female" "Unknown"                          "Yes" 29455 "71453"  "47.2"  3318  2853
104597 89 "Male"   "Unknown"                          "No"  29455 "71453"  "47.2"  3318  2853
104598 80 "Male"   "White"                            "Yes" 29443 "61857"  "41.2"  5270  4917
104599 71 "Male"   "Unknown"                          "No"  29458 "74167"  "53.3"  4314  3927
104600 71 "Male"   "Black/African-American"           "No"  14848 "19912"  "32.4"  1933    83
104601 57 "Female" "Black/African-American"           "No"  14919 "57784"  "48.8"  5517   116
104602 76 "Male"   "Black/African-American"           "Yes" 14891 "26157"  "31.7"  4190   114
104603 62 "Male"   "Unknown"                          "No"  16125 "62778"  "43.4"  5091  1268
104605 61 "Male"   "Unknown"                          "No"  16023 "47822"  "34"    5526   737
104606 38 "Male"   "Unknown"                          "No"  15930 "105508" "32.9"  3014   524
104609 68 "Male"   "White"                            "No"  14779 "61341"  "38.3"  7755  4956
104626 89 "Female" "White"                            "No"  11117 "50025"  "49.7"  3586  3159
104628 59 "Male"   "White"                            "No"  11127 "67433"  "42.1"  1829  1463
104631 53 "Male"   "White"                            "No"  11062 "45482"  "34.4"  3695  2223
104640 63 "Male"   "White"                            "No"  25711 "70169"  "36.1"  3784  3377
104652 88 "Male"   "Hispanic/Latino"                  "No"  41995 "70972"  "32.6" 10795 10341
104654 67 "Female" "Unknown"                          "No"   8671 "70200"  "36.3"  5330  3603
104656 78 "Female" "White"                            "Yes" 22924 "76979"  "31.5"  5755  4244
104658 62 "Male"   "White"                            "No"  22838 "27250"  "31.2"  5622  4467
104661 65 "Male"   "White"                            "No"  22974 "47671"  "34.6"  8377  6050
104662 61 "Male"   "Black/African-American"           "No"  42457 "33222"  "39.6"  4049  1171
104676 92 "Male"   "Unknown"                          "No"   8632 "38072"  "29.1"  6999  4523
104687 55 "Male"   "White"                            "No"  42808 "94865"  "33"   20376 17812
104694 67 "Male"   "Unknown"                          "Yes"  6625 "34614"  "31.9"  7851  6497
end
------------------ copy up to and including the previous line ------------------

Listed 100 out of 307 observations
Use the count() option to list more
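A crude rate needs only event counts and a population denominator by sex; a hedged sketch is below. One caveat: summing totalpopulation across patients double-counts census tracts that appear more than once (for example, ids 104596 and 104597 share tract 29455), so the right denominator deserves thought.

Code:
* a minimal sketch: crude thrombosis incidence per 100,000 by sex, assuming
* each row is one event and summed tract population is a usable denominator
preserve
collapse (count) events = id (sum) pop = totalpopulation, by(sex)
gen rate_per_100k = 100000 * events / pop
list sex events pop rate_per_100k
restore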


Meta-analysis 'midas' command returning "type mismatch [r(109)]" error with Stata 18.0 and Stata 17.0

Hello,

I am trying to use the 'midas' package to compare sensitivity, specificity, and AUROC for two diagnostic tests. However, I get a "type mismatch [r(109)]" error with Stata BE 18.0 and Stata MP 17.0, regardless of how my data are organized. I would appreciate any guidance. Here are the data that I am using:

[Attachment: screenshot of the diagnostic-test data; not shown]

Thank you!

Please help: "variable name ECT is in the list of predictors" error with mg estimation in panel ARDL

I get this error every time I run the mg estimation, even though there is no variable named ECT in my data. Please help.

Code:
. xtpmg d.int_rate_loan ib_bank, lr(l.int_rate_loan ib_bank) ec(ECT) replace mg
invalid new variable name;
variable name ECT is in the list of predictors
r(110);
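The exact trigger is unclear from the post, but xtpmg creates the error-correction term as a new variable with the name given in ec(), so a leftover ECT from an earlier run can collide with the predictors it builds internally. A hedged sketch of a workaround worth trying:

Code:
* a minimal sketch: clear any leftover error-correction variable and use a
* fresh name for the mg run
capture drop ECT
xtpmg d.int_rate_loan ib_bank, lr(l.int_rate_loan ib_bank) ec(ECT_mg) replace mg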

Regression Discontinuity Plots with Fixed Effects

Hello,

I'm running a regression discontinuity in time specification with time fixed effects, such as monthly fixed effects. While there is no issue with the regression itself, the problem arises when I try to use twoway to plot the discontinuity.

Comparing the regression output with the plot, it is clear that the regression estimate is not reflected in the plot. Why is this the case? Notably, the plot using lfit does not change when I include fixed effects, so the issue must be caused by the fixed effects.

If anyone knows a remedy to make the graph reflect the regression output, that would be much appreciated. Below are some example code and a twoway plot.

Kind regards,
Angus


Code:
* Syphilis tests, month FE
reg Total time i.treatment c.time#i.treatment i.month if in_bandwidth, vce(robust)
estimates store Syphilistest

predict yhat_Syphilistest, xb
twoway (scatter Total time if in_bandwidth, mcolor(gs12) msymbol(o)) ///
    (lfit yhat_Syphilistest time if treatment == 0 & in_bandwidth, lcolor(blue) lpattern(solid)) ///
    (lfit yhat_Syphilistest time if treatment == 1 & in_bandwidth, lcolor(red) lpattern(solid)), ///
    title("Syphilis Tests Over Time") ///
    xlabel(#12) ///
    ylabel(, angle(horizontal)) ///
    xtitle("Months from August 2013") ///
    ytitle("Syphilis test count") ///
    xline(0, lcolor(black) lpattern(dash)) ///
    legend(off) ///
    name(Syphilistest, replace)

[Figure: twoway RD plot; attachment not shown]
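One likely explanation: yhat includes the estimated month effects, and lfit then fits a fresh straight line through those fitted values, smoothing the month variation away, so neither line reflects the RD coefficients alone. A minimal sketch that rebuilds the fitted values with the month effects held at their base level, assuming the coefficient names produced by the posted specification:

Code:
* a minimal sketch: fitted values from the RD terms only, month effects at base
reg Total time i.treatment c.time#i.treatment i.month if in_bandwidth, vce(robust)
gen yhat_nofe = _b[_cons] + _b[time]*time ///
    + _b[1.treatment]*(treatment == 1) ///
    + _b[1.treatment#c.time]*time*(treatment == 1) if in_bandwidth
twoway (scatter Total time if in_bandwidth, mcolor(gs12)) ///
    (line yhat_nofe time if treatment == 0 & in_bandwidth, sort lcolor(blue)) ///
    (line yhat_nofe time if treatment == 1 & in_bandwidth, sort lcolor(red)), ///
    xline(0, lcolor(black) lpattern(dash)) legend(off)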


Instrumenting an Endogenous Variable and Its Interaction Term Using ivregress

Hello everyone,

I am using Stata 15 on Windows 10 and have a question about implementing an instrumental variables regression involving an endogenous variable and its interaction term.


Context:
- Dependent Variable: Y
- Endogenous Variable: X1
- Exogenous Variable: X2
- Interaction Term: X1X2 (which is X1 * X2) <- these are centered
- Instruments:
For X1: Z1
For X1X2: Z2 (constructed as Z1 * X2) <- these are centered

The Stata commands I used are the following:

Code:
ivregress 2sls Y X2 (X1 X1X2 = Z1 Z2), r
estat endog, forcenonrobust
estat firststage, all forcenonrobust

Questions:
1. First-stage regressions:
- Will Stata produce two first-stage regression tables, one for each endogenous variable (X1 and X1X2), each including both instruments (Z1 and Z2)? (See the sketch after these questions.)
- Are my commands correct, or am I missing something?

2. Significance of Instruments:
- In the first-stage regression where X1 is the dependent variable, Z1 is significant while Z2 is not.
- In the first-stage regression where X1X2 is the dependent variable, Z2 is significant while Z1 is not.
- Is this situation acceptable, or should I be concerned about the insignificance of one of the instruments in each first-stage regression?
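On question 1: ivregress can display the first-stage tables itself, which shows directly that each first stage includes both instruments. A minimal sketch:

Code:
* a minimal sketch: -first- prints a first-stage regression table for each
* endogenous variable (X1 and X1X2), each including both Z1 and Z2
ivregress 2sls Y X2 (X1 X1X2 = Z1 Z2), vce(robust) first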

Thank you in advance

hausman mg pmg: "no coefficients in common; specify equations(matchlist) for problems with different equation names"

I ran:

Code:
xtpmg d.int_rate_loan ib_bank, lr(l.int_rate_loan ib_bank) replace pmg
xtpmg d.int_rate_loan ib_bank, lr(l.int_rate_loan ib_bank) replace mg

and obtained results. After this I ran hausman mg pmg, sigmamore and got the error:

Code:
. hausman mg pmg, sigmamore
no coefficients in common; specify equations(matchlist) for
problems with different equation names.
r(498);

Please help.
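The error means the two stored coefficient vectors carry different equation names, so hausman cannot align them; its equations() option matches equations by position. A hedged sketch (the estimates store steps are assumptions about how mg and pmg were saved):

Code:
* a minimal sketch: store each fit, then compare the first equations by position
xtpmg d.int_rate_loan ib_bank, lr(l.int_rate_loan ib_bank) replace pmg
estimates store pmg
xtpmg d.int_rate_loan ib_bank, lr(l.int_rate_loan ib_bank) replace mg
estimates store mg
hausman mg pmg, sigmamore equations(1:1)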

Help on Weibull incidence analysis

Hi,

I am trying to run a Weibull analysis of cancer onset.

My data are respondents aged 51-79, and I'm interested in disease incidence across levels of educational attainment.
I am using two waves and created a dummy variable, cancer_disease_onset, equal to 1 if the respondent develops cancer between the waves.
The picture shows what the data look like.

[Figure: data excerpt; attachment not shown]

I want to use the Weibull model with the following code, where agey_br is the age of the respondent:

Code:
stset agey_br, failure(cancer_disease_onset==1)
streg agey_br male i.educ_group, dist(weibull)

[Figure: streg output; attachment not shown]




This result suggests that younger individuals are at higher risk of developing cancer than older individuals; specifically, for each additional year of age, the risk of developing cancer decreases by about 95%??

I don't understand this.

Am I doing something wrong in the model, or am I misinterpreting the output?

thanks in advance
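One thing stands out: agey_br is already the analysis time set by stset, so entering it again as a covariate makes its hazard ratio very hard to interpret, because the Weibull baseline hazard already models how risk changes with analysis time. A minimal sketch without age in the covariate list:

Code:
* a minimal sketch: age is the analysis time, so it is left out of the
* covariates; the Weibull shape parameter captures the age dependence
stset agey_br, failure(cancer_disease_onset == 1)
streg male i.educ_group, dist(weibull)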

Using a binary variable and a continuous variable to estimate the effect of COVID on student achievement

Hi guys,

I am using the "mixed" command to regress the effect of COVID-19 on students' achievements. I am trying to use two variables to represent the COVID, one is binary (Post_COVID: 0 for Before the Epidemic, 1 for During the Epidemic), and another is continuous (SchoolClosureDays: means the days of Covid-induced School Closure). Binary Variable: Represents a dichotomous outcome. It indicates whether students were affected by school closure (at least one day) versus not at all, which simplifies the complexity of the situation. Continuous Variable: Represents a nuanced view of the relationship. It captures the impact of incremental changes (each additional day of closure) on student achievement, allowing for a detailed understanding of how longer closures progressively affect outcomes. Can I include both the continuous variable and the binary variable in the same model? I am afraid there may be an overlap in the effects, meaning they might capture similar aspects of the pandemic’s impact. Is there any good way to check if the effects of the two variables are independent? Or any idea about if it is reasonable to include both in the model? Thank you~

This is the command I use:

Code:
mixed Achievements Time SchoolClosureDays i.Post_COVID##i.Gender ///
    i.Post_COVID##i.IMMIG i.Post_COVID##i.SES c.SchoolClosureDays#i.Gender ///
    c.SchoolClosureDays#i.IMMIG c.SchoolClosureDays#i.SES i.Gender#c.Time ///
    i.IMMIG#c.Time i.SES#c.Time || CNTRYID: Time, covariance(unstructured) ///
    nolog vce(robust)

Gender (binary, 0 = girls), IMMIG (immigration background: 0 = native, 1 = second-generation, 2 = first-generation), SES (socio-economic background: 0 = low, 1 = medium, 2 = high), Time (continuous, 2003-2023), CNTRYID = country identifier.

Yin
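A quick way to gauge the overlap is to look at how the two COVID measures covary; by construction, SchoolClosureDays is 0 whenever Post_COVID is 0, so some dependence is mechanical. A minimal sketch:

Code:
* a minimal sketch: check how entangled the two COVID measures are
pwcorr Post_COVID SchoolClosureDays, sig
tabstat SchoolClosureDays, by(Post_COVID) stat(mean sd min max)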

Sample mean symbol in graphs

note("Note: X{sub:n}")
I am working in a graph. I would like writte down x bar (an X with the _ over the X). Somebody could help me with this issue, please? Many thanks in advance.
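Since Stata 14, graph text is Unicode, so one hedged option is to compose the bar with the combining-overline character (U+0305) via ustrunescape(). A minimal sketch:

Code:
* a minimal sketch: build x-bar from "x" plus the combining overline (U+0305)
sysuse auto, clear
local xbar = "x" + ustrunescape("\u0305")
scatter mpg weight, note("Note: `xbar'{sub:n}")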

create dataset/frame from previous collect result

Hi,

Is there any way to create a new frame from the previous collect result:

Code:
. clear

. sysuse auto
(1978 automobile data)

. collect clear

. collect: qui reg price mpg rep78 headroom weight length turn displacement gear_ratio

. collect layout (colname) (result[_r_b]) (), name(default)

Collection: default
      Rows: colname
   Columns: result[_r_b]
   Table 1: 9 x 1

------------------------------------
                       | Coefficient
-----------------------+------------
Mileage (mpg)          |   -117.2946
Repair record 1978     |    742.5576
Headroom (in.)         |   -594.5623
Weight (lbs.)          |    3.935387
Length (in.)           |   -74.93533
Turn circle (ft.)      |   -208.5507
Displacement (cu. in.) |    16.98242
Gear ratio             |    1645.675
Intercept              |    10076.63
------------------------------------

desired new frame:

Code:
. list

     +---------------------------------+
     |                  var   coeffi~t |
     |---------------------------------|
  1. |         Mileage(mpg)   -117.295 |
  2. |     Repairrecord1978    742.558 |
  3. |        Headroom(in.)   -594.562 |
  4. |         Weight(lbs.)    3.93539 |
  5. |          Length(in.)   -74.9353 |
     |---------------------------------|
  6. |      Turncircle(ft.)   -208.551 |
  7. | Displacement(cu.in.)    16.9824 |
  8. |            Gearratio    1645.68 |
  9. |            Intercept    10076.6 |
     +---------------------------------+

thanks,
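One hedged workaround that skips the collection entirely: rebuild the table straight from e(b) with frame post. This yields coefficient names rather than the collection's variable labels; the labels could be fetched separately with the variable label extended macro function. A minimal sketch:

Code:
* a minimal sketch: post each coefficient from e(b) into a new frame
sysuse auto, clear
qui reg price mpg rep78 headroom weight length turn displacement gear_ratio
matrix b = e(b)
local names : colnames b
frame create results str32 var double coefficient
forvalues j = 1/`=colsof(b)' {
    frame post results ("`: word `j' of `names''") (b[1, `j'])
}
frame results: list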

Probit or logit when concerned about non-linearity?

Hello everyone,

I'm working with a binary outcome variable and have encountered an interesting issue regarding model specification using both logit and probit regression models in Stata. I have a specific predictor variable that I've found significant in both models. Initially, the link test was significant for both the logit and probit models, indicating potential misspecification.

When I include a polynomial term of this predictor in the probit model, the link test becomes insignificant, suggesting that the non-linearity may be adequately captured. However, in the logit model, the link test remains significant even after adding the polynomial term. Interestingly, when I include only the polynomial term in the logit model (excluding the base term), the sign of the coefficient flips and the polynomial term becomes positive, but when both the polynomial and the base term are included, they are both negative. Additionally, the probit and logit adjusted R2 are very comparable, with the lowest R2 achieved with only the polynomial, followed by only the base term, and the highest R2 in the model with both the base term and the polynomial. The predictor also appears in an interaction with another continuous variable, the VIF shows no problem with multicollinearity, and all continuous variables are centered where relevant. Furthermore, I have tried many different specifications, including polynomials of other variables and interaction terms with and between other variables, but the source of the non-linearity is clearly the variable I'm discussing.

Given this context, how should I interpret the significance of the link test in both models? Why might the probit model capture the relationship better with the polynomial while the logit model does not? Also, is a 10% significance level too lenient for model evaluation in this scenario? Any insights or recommendations on how to proceed, which model to select, or any other relevant matters would be greatly appreciated!

Thank you!
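For concreteness, a minimal sketch of the comparison being described, with y, x, and controls as hypothetical names:

Code:
* a minimal sketch: fit each link with the squared term, then run the link test
logit y c.x##c.x controls
linktest
probit y c.x##c.x controls
linktest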

Literate programming and cross-reference of formatted tables

Dear members,

Has anyone solved the issue of cross-referencing formatted (HTML) tables while using MarkStat (.stmd) or Quarto with nbstata (.qmd)?

My workflow is mainly for papers: Stata 17 SE (Mac & Windows) and German Rodriguez's MarkStat. I code and analyse with Mattias Nordin's StataEditor package for Sublime Text, render reports to HTML with MarkStat, embedding formatted tables with the .include directive, with great results. Finally, I use the html2docx Stata command to convert the HTML to docx.

Unfortunately, cross-referencing of tables does not work in MarkStat.

As MarkStat uses Pandoc, I tried Pandoc's fenced divs ( :::{#tbl:tableid} ), but it seems that MarkStat does not recognise them.


I have recently tried Tim Huegerich's nbstata with Quarto, and followed FernandoRios's tips in another thread:
```{stata}
*| output: asis
display "```{=html}"
type my_table.html
display "```"
```
but this works only if there is no code-chunk option creating a cross-reference. Once you specify *| label: tbl-tableid, the document fails to render.

I even tried using the table as `.md` like
::: {#tbl:tableid}
{{< include my_table.md >}}
:::
and this works for simple tables, but complex tables are rendered as a chunk of text.


Has anyone found a working solution to cross-reference formatted tables with either MarkStat or nbstata? Have you been able to use Pandoc's cross-referenced fenced divs with MarkStat?

Best,

J