Channel: Statalist

Why use dfmethod(satterthwaite) in mixed model?

Hello Statalist!

Can someone explain in which instances you would use the Satterthwaite method of calculating degrees of freedom in a mixed model? I understand that Satterthwaite and Kenward-Roger are preferred for smaller sample sizes. What counts as "small" here? I have 55 participants with 25 years of data each. Should I use this instead of the default?
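
For concreteness, a minimal sketch of the specifications in question, with placeholder names (y, year, group, id are not my actual variables); if I understand the manual, both small-sample methods are requested through dfmethod() and require REML estimation:

Code:
* y, year, group, and id are placeholder names, not my actual variables
mixed y c.year i.group || id: year, reml dfmethod(satterthwaite)   // Satterthwaite DF
mixed y c.year i.group || id: year, reml dfmethod(kroger)          // Kenward-Roger DF
mixed y c.year i.group || id: year                                 // default large-sample z tests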

Thank you for your help!

Duplicate prescriptions

I am using Stata 16 to analyze a dataset with prescription information. Each person has multiple prescriptions, and their unique ID appears once for each prescription. I would like to restructure this so that there are columns for prescription 1, prescription 2, etc., instead of a new row for each prescription under the same ID. I believe this is changing the data from long form to wide form, but that terminology may be wrong.
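
For concreteness, something like this sketch is what I have in mind (id and rx are placeholder names for the person identifier and the prescription variable):

Code:
* id identifies the person, rx holds the prescription (placeholder names)
bysort id (rx): gen rxnum = _n      // number each person's prescriptions 1, 2, ...
reshape wide rx, i(id) j(rxnum)     // one row per person: rx1, rx2, rx3, ...
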
I appreciate any guidance and apologize if anything is unclear.

Livingston - Lewis (1995) Reliability method

Hello everyone,
I am working on getting the decision accuracy and decision consistency for an exam result (final scores).
The data set contains ID, final score, and passing score (cutoff). The reliability coefficient alpha has already been provided.
It seems that the Livingston-Lewis method can produce these results.
I was wondering whether there is Stata code to run in order to get them,
or whether there is a clear formula to use.
Thank you very much in advance for the help; I really appreciate it.

The result of "test" do not align with the "LR test"

Hi all,

I am running a logistic regression. Model 1 is the nested model; Model 2 is the comparison model with interaction terms.

*race has four categories
*x1 has two categories
*x2 has two categories

Model 1:
logit y i.sex c.age i.race i.ses i.x1 i.x2
estimate store model1

Model 2:
logit y i.sex c.age i.ses i.race##i.x1 i.race##i.x2
estimate store model2

testparm i.race#i.x1 i.race#i.x2
lrtest model2 model1

Wald test: chi2 (6) = 11.47, p = .0749
LR test: LR chi2 (6) = 11.57, p = .0723

Why is there a difference here? Thanks!




Company Fixed effects

Hi everyone,

I would like to estimate acquirer fixed effects, but something is wrong with my Stata command.

My dataset consists of multiple acquirers, each of which has done multiple deals over multiple years.


For every acquirer I have the CUSIP, the PERMNO, and the cumulative abnormal return (car5).



car5 should be the dependent variable in my regression, and the acquirer fixed effects should be the independent variables.




I am using the following command:
xtset cusip year
streng car5 cusip, fe


This is what I get: repeated time values within panel, r(451), so the command does not work.

Could someone help me out?
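
For reference, a minimal sketch of one possible setup, assuming cusip is a string identifier and that an acquirer can have several deals in the same year; acq_id and acq_fe are made-up names, and this is only one way to recover the fixed effects, not a confirmed solution:

Code:
* illustrative sketch, not a confirmed solution
encode cusip, gen(acq_id)        // numeric acquirer identifier
xtset acq_id                     // declare only the panel id; with several deals per
                                 // acquirer-year, adding a time variable triggers r(451)
areg car5, absorb(acq_id)        // regress car5 on acquirer dummies
predict acq_fe, d                // recover each acquirer's estimated fixed effect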



graph twoway by two variables?

I am trying to use graph twoway to plot bar graphs with an overlaid line graph, but by two grouping variables: regionname (Central/North/South) and milestone (10 to 50).

I first plot three separate figures, one for each regionname, using by(milestone).
I then combine these three graphs, but the result is not ideal.

[combined graph image from the original post omitted]


Apparently grc1leg somehow does not work well here. The approach of combining figures also uses too much space, and I only need one "milestone" heading.

Is there a way to graph twoway by two variables?
My full syntax is as below. Thanks a lot.

Code:
foreach region in "Central" "North" "South" {
graph twoway (bar upto_SolarCur year if regionname=="`region'", yaxis(1) color("255 200 68*2") barwidth(0.7) fintensity(inten100) ) ///
             (bar upto_WindCur year if regionname=="`region'", yaxis(1) color("128 188 0*2") barwidth(0.7) fintensity(inten100)) ///
             (bar upto_Solar year if regionname=="`region'", yaxis(1) color("255 200 68") barwidth(0.7) fintensity(inten100)) ///
             (bar upto_Wind year if regionname=="`region'", yaxis(1) color("128 188 0") barwidth(0.7) fintensity(inten100)) ///             
             (bar upto_Hydro year if regionname=="`region'", yaxis(1) color("84 192 232") barwidth(0.7) fintensity(inten100)) ///
             (bar upto_Other year if regionname=="`region'", yaxis(1) color("167 168 170") barwidth(0.7) fintensity(inten100)) ///
             (bar upto_Gas year if regionname=="`region'", yaxis(1) color("0 130 202") barwidth(0.7) fintensity(inten100)) ///
             (bar upto_Coal year if regionname=="`region'", yaxis(1) color("99 101 105") barwidth(0.7) fintensity(inten100)) ///
             (bar upto_Nuclear year if regionname=="`region'", yaxis(1) color("238 118 35") barwidth(0.7) fintensity(inten100)) ///
             (connected region_nativeloadmw year if regionname=="`region'", yaxis(1) color("black") msymbol(T)) ///
             (connected wgtPrice year if regionname=="`region'", yaxis(2) color("75 133 142*2")), ///
             by(milestone, rows(1) imargin(medium) note("") title("`region'", size(medsmall))) ///
             xtitle("") xlabel(none) ///
             ylabel(0(50)350, labsize(small) ang(horizontal) axis(1)) ///
             ylabel(10(5)30, labsize(small) ang(horizontal) axis(2)) ///
             legend(label(1 "Solar Cur.") label(2 "Wind Cur.") label(3 "Solar") label(4 "Wind") label(5 "Hydro") label(6 "Other") label(7 "Gas") label(8 "Coal") label(9 "Nuclear") label(10 "Native load") label(11 "Wgt. LMP") order(9 8 7 6 5 4 3 2 1 10 11) cols(6) keygap(*0.3) symxsize(*0.3) size(small)) ///
             graphregion(color(white)) plotregion(color(white)) 

gr_edit .style.editstyle boxstyle(shadestyle(color(white))) editcopy
gr_edit .l1title.style.editstyle size(small) editcopy            

graph save Graph\5558_Final_NoSensitivity_MH_Fix\gph\BAR_FuelMixCur_RIIA_`region'_overlay_load_price.gph, replace
}


** combine **
grc1leg "Graph\5558_Final_NoSensitivity_MH_Fix\gph\BAR_FuelMixCur_RIIA_Central_overlay_load_price.gph" ///
        "Graph\5558_Final_NoSensitivity_MH_Fix\gph\BAR_FuelMixCur_RIIA_North_overlay_load_price.gph" ///
          "Graph\5558_Final_NoSensitivity_MH_Fix\gph\BAR_FuelMixCur_RIIA_South_overlay_load_price.gph", ///
        cols(1) imargin(zero) title("MISO Regional FuelMix and Curtailment: by RIIA milestones" "Final with Solutions", size(medsmall)) ///
        l1title("Million MWh", size(small)) r1title("Wgt. LMP ($/MWh)", size(small)) ///
        graphregion(color(white)) legendfrom("Graph\5558_Final_NoSensitivity_MH_Fix\gph\BAR_FuelMixCur_RIIA_Central_overlay_load_price.gph")

foreach x of numlist 1/3{
gr_edit .plotregion1.graph`x'.title.xoffset = -57.5
gr_edit .plotregion1.graph`x'.title.yoffset = -2
}

Do file giving different tables

Hi Guys!
I'm trying to replicate a paper from the American Economic Association, "Racial Discrimination in Grading: Evidence from Brazil." However, the tables I get from running the do-file are very different from those in the paper; even the terms in the table do not match.
In the original do-files I replaced .out with .txt to get a result, but somehow it did not work. I have attached the paper and data link, and the table I'm getting.
https://www.openicpsr.org/openicpsr/...ersion/V1/view (data link)
I hope someone can help me out, as I am new to Stata.

I tried replacing outreg2 using "Table1.out" with outreg2 using "Table1.txt" but things still did not change.

export excel, sheet and xls document

Hi,

I have been experiencing difficulties with export excel when used with xls documents (not xlsx). The reason I prefer xls over xlsx is that it is about twice as fast to work with. To summarize my issue, I do something like:

Code:
forval i = 1/20 {
    export excel varlist using doc.xls, sheet("x`i'", modify)
}
I am updating doc.xls with new worksheets containing new data. The worksheets are not especially big, but doc.xls itself is (as it contains links and other material).

It works fine for a while, but after a few iterations I get the following error:
file doc.xls could not be loaded
r(603);

The weird thing is that I can still open the doc.xls file in Excel and it looks fine, but after getting this message I cannot add any new worksheet using Stata and export excel. I have to delete doc.xls and start all over again.

I never get this problem with a doc.xlsx file. While much larger, doc.xls files are much faster to work with.

Do you have any idea what is going on?

PS: I suspected my Dropbox first, but after pausing the sync it turned out not to be the culprit.




Main directory Mac OS

Hi,

I am currently using Stata for Mac OS X, and I have some problems setting the main directory.

I am sharing the do-files with my team, and all the others use PCs.

The global directory command is:

else if "`c(username)'"=="eddie"{
global maindir1 "/Users/Eddie/Desktop/Project/Data"
global maindir2 "/Users/Eddie/Desktop/Project/Data"


and later on I would like to use the data from this directory:

capture use "$maindir1/__.dta",clear

I am having problems using this main directory.
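
For reference, a minimal sketch of the kind of username-based block being attempted, with the closing brace and a hypothetical Windows branch for a teammate (all usernames, paths, and the file name are placeholders):

Code:
* placeholder usernames, paths, and file name
if "`c(username)'" == "teammate" {
    global maindir1 "C:/Users/teammate/Desktop/Project/Data"
}
else if "`c(username)'" == "eddie" {
    global maindir1 "/Users/eddie/Desktop/Project/Data"
}
use "$maindir1/mydata.dta", clear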

Thank you for the help!

Exporting marginal effect to excel/word document using outreg2 command

Hi everyone,
I have a long table of marginal effects and their corresponding SEs and CIs. I was wondering how I can use a command like outreg2 to export this output to Excel or Word.
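
Something along these lines may be what is needed: margins, post leaves the marginal effects in e(), after which outreg2 (from SSC) can export them like any estimation result. The model below is purely illustrative (y, treat, and age are placeholders):

Code:
* hypothetical model, for illustration only
logit y i.treat c.age
margins, dydx(*) post
outreg2 using margins_table.doc, word replace dec(3)
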
Thanks.
Nader

How can I calculate the standard deviation with weights?

Dear All,
I have survey data collected through stratified sampling. I set a weight equal to the inverse of the probability that the observation is included. Therefore, when I calculate means or run regressions, I use pweights. But pweights cannot be used to calculate the standard deviation, so what should I do? (I use collapse to calculate the mean, median, and SD.)
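
For what it is worth, the point estimates can apparently be obtained with aweights, which give the same weighted means as pweights (only the standard errors differ); a sketch with placeholder names (income, wt, stratum):

Code:
* income, wt, and stratum are placeholder names
summarize income [aw=wt]          // r(mean) and r(sd) are the weighted estimates
collapse (mean) m_inc=income (sd) sd_inc=income [aw=wt], by(stratum)
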
Thank you!

How to export my Stata results to a target file?

Hi Statalisters,

During my coding work, I want to export my results in real time, in whatever file format. So far, I only know how to use

translate @Results mylog.txt
type mylog.txt

to export the full results to a txt file.

However, I would like to export specified rows of results instead of the full output, and to export them to other file formats such as Word or PDF. Is there a better solution for this?
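
One possible direction is the official putdocx suite (available since Stata 15; putpdf works analogously for PDF), which can write selected pieces of output rather than the whole log. A minimal sketch using the auto data:

Code:
sysuse auto, clear
putdocx begin
putdocx paragraph
putdocx text ("Selected regression results")
regress price mpg weight
putdocx table results = etable        // write only the current estimation table
putdocx save myresults.docx, replace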

Thanks in advance

Generating a new variable conditional on multiple values - using 'or' command?

Hey,
I Hope someone can help me with this:

I have a dataset of households in various zip codes. I want to generate a variable that places the households into five different categories according to an area index of multiple deprivation, using the zip codes identified for each category.

The logical way to do this in Stata is to write commands that look like this:

gen aimd5 = 1 if zip=="12345" | zip=="82456" | zip=="56234" | zip==..............
replace aimd5 = 2 if zip=="75687" | zip=="45688" | zip=="95689" | zip==..............
replace aimd5 = 3 if zip=="14687" | zip=="34687" | zip=="64687" | zip==..............
replace aimd5 = 4 if zip=="54687" | zip=="54645" | zip=="54687" | zip==..............
replace aimd5 = 5 if zip=="64687" | zip=="78987" | zip=="21387" | zip==..............

There would be about 8,000 "or" conditions. The problem is, Stata doesn't allow that many | conditions in one statement; the command works fine if I only put in 10 or so zip codes.

Is there any way for me to perform this operation?
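
For reference, a sketch of an alternative that avoids the long | lists altogether by keeping the roughly 8,000 zip-category pairs in a lookup data set and merging; households.dta and zip_aimd.dta are hypothetical file names:

Code:
* zip_aimd.dta is a hypothetical lookup file with one row per zip: variables zip and aimd5
use households, clear
merge m:1 zip using zip_aimd, keep(master match) nogenerate
* aimd5 is now attached to every household whose zip appears in the lookup file
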
Kind regards

How do I export multiple diagnostic tests to one .csv table?

Hello Statalist friends!
I'm working with Stata 15 on a project validating several tests against a gold standard. For this I use the diagt command, where ref is the gold standard:

diagt ref1 test1
diagt ref2 test2
diagt ref3 test3
diagt ref4 test4
diagt ref5 test5

Each line produces a nice table:

                                              Estimate   [95% Conf. Interval]
---------------------------------------------------------------------------
Prevalence                 Pr(A)                  33%        33%      33.5%
---------------------------------------------------------------------------
Sensitivity                Pr(+|A)              44.3%      43.4%      45.2%
Specificity                Pr(-|N)              78.8%      78.3%      79.3%
ROC area                   (Sens. + Spec.)/2     .615        .61       .621
---------------------------------------------------------------------------
Likelihood ratio (+)       Pr(+|A)/Pr(+|N)       2.09       2.02       2.16
Likelihood ratio (-)       Pr(-|A)/Pr(-|N)       .707       .695       .719
Odds ratio                 LR(+)/LR(-)           2.95       2.82        3.1
Positive predictive value  Pr(A|+)              50.7%      49.8%      51.7%
Negative predictive value  Pr(N|-)              74.2%      73.6%      74.7%
---------------------------------------------------------------------------

How do I combine all the tests into one single table (each test with columns reporting the estimate and CI) and export it as .csv? I've tried esttab, but it doesn't work.
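
A sketch of one possible collection loop, under the assumption that diagt leaves its statistics in r(); r(sens) and r(spec) below are placeholder names, to be replaced with whatever return list actually shows after a diagt call:

Code:
* r(sens) and r(spec) are placeholders -- check -return list- after diagt first
tempname h
postfile `h' test str20 stat estimate using diagt_results, replace
forvalues i = 1/5 {
    quietly diagt ref`i' test`i'
    post `h' (`i') ("sensitivity") (r(sens))
    post `h' (`i') ("specificity") (r(spec))
}
postclose `h'
use diagt_results, clear
export delimited using diagt_results.csv, replace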

Thank you for your help!

Yours sincerely,
Mats

IV Regression Panel Data - xtivreg - lagged Variables

Hello,

I am working with Stata/IC 16 and the panel data set mathpnl, covering the years 1992-1998.
I want to analyze the effect of real expenditure per pupil (lrexpp) on the pass rate of a standardized math test (math4).
First I ran a fixed-effects regression.
Now I want to use the policy change (the law on school financing changed in 1995) to perform an IV regression, using the exogenous foundation grant (found / fnd_1) as an instrument for my (potentially endogenous) spending variables.

The first-stage regression gives me a large R2 and F statistic, but unfortunately I am not getting convincing results when applying the xtivreg command.
I think I have a potential problem with the instruments, and moreover I am unsure how to handle the lagged effect of lrexpp_1 in an IV regression.

I also considered using the average of spending as an explanatory variable.

Does anybody know how to do an IV regression with two potentially endogenous variables that are lagged?



1.
global control y94 y95 y96 y97 y98 lunch lunchsq lenrol lenrolsq

xtreg math4 lrexpp lrexpp_1 $control, fe

Fixed-effects (within) regression Number of obs = 3,300
Group variable: distid Number of groups = 550

R-sq: Obs per group:
within = 0.6029 min = 6
between = 0.0323 avg = 6.0
overall = 0.3132 max = 6

F(11,2739) = 378.06
corr(u_i, Xb) = -0.0610 Prob > F = 0.0000

------------------------------------------------------------------------------
math4 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lrexpp | .449403 2.564402 0.18 0.861 -4.578955 5.477761
lrexpp_1 | 7.309974 2.384014 3.07 0.002 2.635326 11.98462
y94 | 6.116057 .5627774 10.87 0.000 5.012546 7.219568
y95 | 17.91013 .7078583 25.30 0.000 16.52214 19.29812
y96 | 17.71773 .7803005 22.71 0.000 16.18769 19.24776
y97 | 14.95576 .8245826 18.14 0.000 13.3389 16.57263
y98 | 29.65786 .8596281 34.50 0.000 27.97227 31.34344
lunch | .0730933 .1196749 0.61 0.541 -.1615689 .3077556
lunchsq | -.0002439 .0016056 -0.15 0.879 -.0033922 .0029044
lenrol | 10.09453 8.467345 1.19 0.233 -6.508501 26.69756
lenrolsq | -.7154355 .6097558 -1.17 0.241 -1.911063 .4801923
_cons | -58.92273 43.51133 -1.35 0.176 -144.2411 26.39561
-------------+----------------------------------------------------------------
sigma_u | 11.587473
sigma_e | 8.9971971
rho | .62387373 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(549, 2739) = 5.75 Prob > F = 0.0000


2.

xtivreg math4 (lrexpp lrexpp_1 = lfound lfnd_1) $controliv y96 y97, fe vce(cluster distid)

Fixed-effects (within) IV regression Number of obs = 1,602
Group variable: distid Number of groups = 538

R-sq: Obs per group:
within = 0.3133 min = 1
between = 0.0053 avg = 3.0
overall = 0.0260 max = 3


Wald chi2(9) = 2191.22
corr(u_i, Xb) = -0.8524 Prob > chi2 = 0.0000

(Std. Err. adjusted for 538 clusters in distid)
------------------------------------------------------------------------------
| Robust
math4 | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lrexpp | -10.48314 226.7776 -0.05 0.963 -454.959 433.9927
lrexpp_1 | 137.3179 262.9871 0.52 0.602 -378.1274 652.7632
lunch | .2588259 .6851738 0.38 0.706 -1.08409 1.601742
lunchsq | .0009495 .0094053 0.10 0.920 -.0174844 .0193835
lenrol | -27.92626 205.4506 -0.14 0.892 -430.602 374.7495
lenrolsq | 2.565563 10.14549 0.25 0.800 -17.31922 22.45035
y96 | -6.357215 3.641985 -1.75 0.081 -13.49537 .7809439
y97 | -11.55872 4.019192 -2.88 0.004 -19.43619 -3.681246
_cons | -973.5975 823.6287 -1.18 0.237 -2587.88 640.6851
-------------+----------------------------------------------------------------
sigma_u | 27.482767
sigma_e | 9.9361013
rho | .88439953 (fraction of variance due to u_i)
------------------------------------------------------------------------------
Instrumented: lrexpp lrexpp_1
Instruments: lunch lunchsq lenrol lenrolsq y96 y97 lfound lfnd_1

Store tab command's results

Hi all!

I created a variable indicating which income decile each individual of the population falls into. I tabulated this variable:

. tab decile [aw=dwt]

decile | Freq. Percent Cum.
------------+-----------------------------------
1 | 4,824.751 10.01 10.01
2 | 4,821.9915 10.00 20.01
3 | 4,817.9334 9.99 30.01
4 | 4,819.9383 10.00 40.00
5 | 4,826.7076 10.01 50.02
6 | 4,812.9441 9.98 60.00
7 | 4,821.7949 10.00 70.00
8 | 4,828.6192 10.02 80.02
9 | 4,814.1271 9.99 90.01
10 | 4,818.1928 9.99 100.00
------------+-----------------------------------
Total | 48,207 100.00

I need to store the ten percentages in locals (e.g., local dec01 should equal 10.01, local dec02 should equal 10.00, and so on).
In a second step, I'd like to make a bar chart showing the mean income of each decile as a bar and the percentage of individuals in that decile as a dot.
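
For the first step, something like this sketch may do it: with analytic weights, a decile's percentage is simply its share of the total weight (the locals are named dec1-dec10 here rather than dec01-dec10):

Code:
quietly summarize dwt
local totw = r(sum)
forvalues d = 1/10 {
    quietly summarize dwt if decile == `d'
    local dec`d' = 100 * r(sum) / `totw'     // e.g. dec1 holds 10.01
}

For the chart, one could then collapse to one observation per decile (mean income plus the stored percentages) and overlay twoway bar with twoway scatter, though that part is not worked out here.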

Thank you in advance.

How to extract one variable from a string?

We have a dataset of different movies, but the genre variable contains more than one value. For example, under genre a movie can be "fantasy, horror, action" all together. How can I make Stata extract just one of them?
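
A sketch of one way to pull the pieces apart with the official split command (genre is the variable from the post; everything else is standard Stata):

Code:
split genre, parse(",")             // creates genre1, genre2, genre3, ...
replace genre1 = strtrim(genre1)    // genre1 now holds only the first listed genre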

Robust option in an OLS model

Hi everyone,
I was wondering whether one can use the robust option in an OLS regression model that includes both individual-level and higher-level data. In other words, does the robust option in such an OLS model provide unbiased SEs, or must a two-level (multilevel) model be used? Thanks
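
For concreteness, a minimal sketch of the alternatives being weighed, with placeholder names (y, x1, x2, groupid); the clustered line is shown only as one common OLS-based option, not as a recommendation:

Code:
* y, x1, x2, and groupid are placeholder names
regress y x1 x2, vce(robust)             // OLS with heteroskedasticity-robust SEs
regress y x1 x2, vce(cluster groupid)    // OLS with SEs clustered at the higher level
mixed   y x1 x2 || groupid:              // explicit two-level random-intercept model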

How do I create a constant sample of firms

Hi, I would need some help with the following problem, please; I'm a new Stata user:

I have a list of firms from COMPUSTAT from 1950 to 1998. I need to hold the number of firms constant between 1968 and 1998 (meaning that I need to keep the firms that existed in all of those 31 years). In addition, for 1950-1967 I have to keep only the firms that still existed in 1998. I'm posting a photo of the sample I need to create: the number of firms from 1968 to 1998 is a steady 896, but firms before that period are those that existed both in that year and in 1998.
I know how to create a constant sample between 1968 and 1998; that is, I use:
keep if fyear>=1968
keep if fyear<=1998
bys conm: gen nyear=_N
keep if nyear==31
But I don't know how to create the additional period of 1950-1967 in the same file.
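
A sketch of one reading of the requirement, assuming one observation per firm (conm) per fiscal year: flag the firms present in every year from 1968 to 1998 and then keep all of their observations, including 1950-1967:

Code:
* assumes one observation per conm-fyear pair
bysort conm: egen nyear6898 = total(inrange(fyear, 1968, 1998))
keep if nyear6898 == 31      // present in all 31 years, hence also in 1998
drop if fyear > 1998
drop nyear6898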

How to deal with error correlation model with Stata

Regression function: [shown as an image in the original post, not reproduced here]




In the following data, the variable value is in 1952 prices and pi1952 is the price index with base year 1952,
and the time range is 1953-1978. So I drop the observations for 1952 and 1990 after rebasing to 1990 prices using the 1990 price index.
Note: after generating year = t - 1978, year = 0 corresponds to 1978 and year = -1 to 1977.

data:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int t double(value pi1952)
1952 80.7 1
1953 115.3 .988
1954 140.9 .982
1955 145.5 .94
1956 219.6 .937
1957 187 .897
1958 333 .901
1959 435.7 .976
1960 473 .973
1961 227.6 .955
1962 175.1 1.025
1963 215.3 1.075
1964 290.3 1.053
1965 350.1 1.018
1966 406.8 .998
1967 323.7 1.002
1968 300.2 .967
1969 406.9 .945
1970 545.9 .945
1971 603 .955
1972 622.1 .967
1973 664.5 .968
1974 748.1 .969
1975 880.3 .981
1976 865.1 .988
1977 911.1 1.002
1978 1073.9 1.008
1990 . 1.862
end

I convert value to 1990 prices; the related code is as follows:
gen index=1/1.862
gen pi1990=pi1952*index
gen year=t-1978
drop in 1                           // drop the 1952 observation
tsset year, yearly
replace value=value/pi1990
gen lv=log(value)
drop if t==1990
reg lv year
predict e, res
twoway (scatter e e2)(lfit e e2)    // note: e2 and e3 are not generated in the code shown
twoway (scatter e e3)(lfit e e3)
estat bgodfrey                      // Prob > chi2 = 0.0036 < 0.05
wntestq e                           // Prob > chi2(11) = 0.0469
di 26^(1/4)                         // ≈ 2.26, so lag(3) is used below

newey lv year, lag(3)

When I run the regression I cannot obtain the same coefficients as the author.
I don't know what is wrong with my code; please help me.

My results: [regression output shown as an image in the original post, not reproduced here]





but the author's coefficients are as follows: [shown as images in the original post, not reproduced here]