Channel: Statalist

Difference of a variable with time and id

Hi
My data is of the following form:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(month id return)
555  1 .00057206413
555  2   -.07322104
555  3   -.06859436
555  4   -.07735524
555  5   -.03938038
555  6   .007824589
555  7   -.04908807
555  8   .003092327
555  9   -.10695495
555 10   -.04521626
556  1   -.04298264
556  2   -.05878122
556  3  -.033808254
556  4  -.011589252
556  5  -.017557062
556  6  -.004073501
556  7   .001721355
556  8  -.010957217
556  9   .015607742
556 10  -.013908166
557  1  -.010737965
557  2  -.018535594
557  3    .03754317
557  4   -.07786346
557  5  -.074161746
557  6   -.05729289
557  7   -.12252308
557  8    -.0617717
557  9  -.020674044
557 10   -.03216382
558  1    .03763035
558  2   .009898664
558  3   .024982596
558  4   .027306067
558  5    .02957879
558  6    .02146479
558  7   .032690376
558  8    .02783219
558  9    .02452107
558 10    .03302523
559  1    .02270367
559  2  .0017711997
559  3   .014527231
559  4   .017970225
559  5   .007367407
559  6   .031037053
559  7   .028186474
559  8   .023721296
559  9   .019167695
559 10   .018264148
560  1    .04948542
560  2    .03846701
560  3    .02795666
560  4    .02451953
560  5   .037013665
560  6   .034662504
560  7    .04399181
560  8    .04507814
560  9    .04191253
560 10     .0580038
561  1   .019825436
561  2   .032455444
561  3    .03048469
561  4    .01998458
561  5   .011587504
561  6    .02427252
561  7    .03101354
561  8   .015491064
561  9    .01145711
561 10   .017419985
562  1   .003733994
562  2  -.002126549
562  3 -.0025371325
562  4   .015725277
562  5  -.005115598
562  6   .006009568
562  7   .004166465
562  8   .011407542
562  9   .004518013
562 10   .002893288
563  1    .01868514
563  2   .014283576
563  3    .01902623
563  4   .012530405
563  5   .021287354
563  6   .008199372
563  7    .01771125
563  8   .019615415
563  9    .01664819
563 10  -.006281366
564  1  -.004248609
564  2  .0010157756
564  3  -.007604746
564  4    .01184316
564  5 .00020535664
564  6  -.003318424
564  7  -.007787342
564  8   .005721211
564  9  -.004952917
564 10   .006883996
end
format %tm month
For every month (I think the months were converted to Stata numeric date values when I pasted the data), I want to take the difference in returns between the stocks with id 10 and id 1. I would appreciate it if someone could help me compute this difference. Thank you.
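A minimal sketch of one way to do this, assuming (as in the example) that each month has at most one observation for id 1 and one for id 10:
Code:
* spread the two returns of interest across each month, then difference them
gen double r1  = return if id == 1
gen double r10 = return if id == 10
bysort month: egen double ret_id1  = max(r1)
bysort month: egen double ret_id10 = max(r10)
gen double diff_10_1 = ret_id10 - ret_id1
drop r1 r10
Every observation within a month then carries the same diff_10_1 value; keep only one row per month (e.g. keep if id == 1) if a monthly series is wanted.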

How to put a mata ml evaluator in a SSC package

I have two files: File1 (an ado-file) for an estimation command, and File2 (a Mata file) for the Mata-based ml evaluator. File2 ends with mata mlib lines that create a library, and once I have compiled File2 on my computer, File1 works well. File1 only works if the user has run File2 at least once.

I want to make this program available to other users on SSC. I know how to upload ado-files to SSC, but I am not sure how to upload an ado-file that relies on the compilation of a Mata file. How is this generally done? Do you put the two files in a folder together with a readme telling the user to compile the Mata file first? Or, when a package is installed from SSC, is the Mata file compiled automatically so that the ado-file works without a manual compilation step? Or do you do something in File1 so that File1 compiles File2 when File1 is run by the user?

Below is a simplified example:
Code:
// File1 (ado)
...
program Estimate, eclass
...
ml model lf myprob() (...)
ml max
...
end


// File2 (mata)
mata:
void myprob(transmorphic scalar ML, real rowvector b, real colvector lnfj) {
    depvar = moptimize_util_depvar(ML, 1)
    xb = moptimize_util_xb(ML,b,1)
    lnfj = ...
}
end
mata: mata mlib create lmyprob, dir(PLUS) replace
mata: mata mlib add lmyprob myprob(), dir(PLUS)
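For reference, a minimal sketch of the last option mentioned above: Mata code placed in the ado-file after the program definition is compiled when the ado-file is loaded, so the user never has to compile File2 by hand. The probit likelihood below is a hypothetical stand-in, not the actual evaluator, and shipping a precompiled .mlib as part of the package is another possibility:
Code:
// myest.ado -- toy example with the ado code and the Mata evaluator in one file
program myest, eclass
    version 15
    syntax varlist(min=2 numeric) [if] [in]
    gettoken depvar indepvars : varlist
    ml model lf myprob_lf() (`depvar' = `indepvars') `if' `in'
    ml maximize
end

// compiled automatically the first time the ado-file is loaded
mata:
void myprob_lf(transmorphic scalar ML, real rowvector b, real colvector lnfj)
{
    real colvector y, xb
    y  = moptimize_util_depvar(ML, 1)
    xb = moptimize_util_xb(ML, b, 1)
    // hypothetical probit log-likelihood contributions
    lnfj = y :* ln(normal(xb)) + (1 :- y) :* ln(normal(-xb))
}
end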

Smooth out Graph

why is there big difference between zip and zinb using same variables?

Hi everyone. I am running zero-inflated Poisson models because of many zeros (about a quarter of the outcome values are zero). I have run zip and zinb before, and two models with the same predictors produced similar results. In the present study, however, the zip and zinb models produce quite different results in the count equation (and similar results in the inflation/logistic part). This seems odd. The results are as follows:

ZIP model
Code:
--------------------------------------------------------------------------------
        outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+---------------------------------------------------------------
count equation  |
            1sc |    .103509    .0356089     2.91   0.004     .0337169    .1733011
            Rpr |  -.0314016    .0062603    -5.02   0.000    -.0436716   -.0191316
            Rex |   .0233748    .0058477     4.00   0.000     .0119134    .0348361
            Rsu |  -.0073679    .0051857    -1.42   0.155    -.0175317    .0027959
            Ppr |   .0176424    .0052335     3.37   0.001      .007385    .0278999
            Pex |   .0106725    .0048924     2.18   0.029     .0010836    .0202615
            Psu |   .0108973    .0035661     3.06   0.002      .003908    .0178866
          _cons |   1.210286    .1851025     6.54   0.000     .8474913     1.57308
--------------------------------------------------------------------------------

ZINB model
Code:
--------------------------------------------------------------------------------
        outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+---------------------------------------------------------------
count equation  |
            1sc |    .116671     .067785     1.72   0.085    -.0161852    .2495271
            Rpr |  -.0364287    .0113316    -3.21   0.001    -.0586382   -.0142193
            Rex |   .0219059    .0112583     1.95   0.052    -.0001599    .0439717
            Rsu |  -.0055789    .0094421    -0.59   0.555    -.0240852    .0129273
            Ppr |   .0168384    .0097909     1.72   0.085    -.0023514    .0360282
            Pex |   .0142822    .0094063     1.52   0.129    -.0041538    .0327183
            Psu |   .0104673    .0067804     1.54   0.123     -.002822    .0237566
          _cons |   1.131097    .3383747     3.34   0.001     .4678948    1.794299
--------------------------------------------------------------------------------

I would appreciate it if anyone could help.

Is there a command to test sphericity of a given polychoric correlation matrix?


I am doing confirmatory factor analysis with ordinal variables. I know that the factortest command runs the Bartlett sphericity test, but under the assumption of normality. Does anyone know of a command that allows me to run this type of test using a given matrix of polychoric correlations as input? This would be analogous to testing the null hypothesis that the determinant of that matrix differs from zero. Thank you.
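For what it is worth, a hedged sketch (not a packaged command): the Bartlett sphericity statistic computed directly from a stored correlation matrix. The matrix name, the sample size, and how the polychoric matrix was obtained are placeholders, and whether the normal-theory chi-squared reference distribution is defensible for a polychoric matrix is exactly the open question in the post:
Code:
* assumption: the polychoric correlation matrix is already stored as Stata matrix R
* (e.g. from the user-written -polychoric- command), and n is the sample size
local n = 500                           // placeholder sample size
local p = rowsof(R)
mata: st_numscalar("lndetR", ln(det(st_matrix("R"))))
scalar b_chi2 = -(`n' - 1 - (2*`p' + 5)/6) * lndetR
scalar b_df   = `p' * (`p' - 1) / 2
scalar b_p    = chi2tail(b_df, b_chi2)
display "Bartlett chi2(" b_df ") = " b_chi2 ",  p = " b_p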

encode a string variable according to substr

Hi, I have a string variable, vq9_breakupdivorce, which has only 2 string values.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str23 vq9_breakupdivorce
"NO TO: Break up/divorce"
"NO TO: Break up/divorce"
"Break up/divorce"       
"Break up/divorce"       
"NO TO: Break up/divorce"
end
I want to encode it to 0 and 1, where 0 denotes the "NO TO:" responses and 1 denotes otherwise. How would I use substr for this? I have multiple such variables, and all of them follow this pattern ("NO TO:" = 0).
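A minimal sketch, assuming every negative response starts with the literal prefix "NO TO:" and that the similar variables share a common stub (the varlist pattern below is a placeholder):
Code:
* single variable: 1 unless the value starts with "NO TO:"
generate byte breakupdivorce01 = substr(vq9_breakupdivorce, 1, 6) != "NO TO:" if !missing(vq9_breakupdivorce)

* the same rule applied to several variables of this form
foreach v of varlist vq9_* {
    generate byte `v'_01 = substr(`v', 1, 6) != "NO TO:" if !missing(`v')
}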

Can esttab produce percent change results for mixed models with log-transformed dependent variable?

Hi Statalist,

Is it possible to create a coefficients table that shows percent changes instead of coefficients on the log scale? In other words, I want to exponentiate each beta, subtract 1 from that value, and then multiply by 100. I know that eform does one part of this, but is it possible to use esttab to show the percent change rather than only the exponentiated beta?

Also, this article does provide a mixed-model example, but I am still not sure how to incorporate (exp(beta) - 1)*100.

One idea I had was to follow this example of how to use margins, but I don't think it will work with my mixed model, which has random effects and a Toeplitz residual structure.
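As an aside, a hedged sketch of one route worth checking: estout/esttab has a transform() option that applies a transformation (and its derivative, used for the delta-method standard errors) to the stored estimates, so (exp(b) - 1)*100 can be requested directly. The model and variable names below are placeholders, and whether this display is sensible alongside the random effects and the Toeplitz residual structure would still need judging:
Code:
* placeholder mixed model with a log-transformed outcome
mixed lny x1 x2 || id:
eststo m1

* show (exp(b) - 1)*100; the second expression is the derivative, for the SEs
esttab m1, transform(100*(exp(@) - 1) 100*exp(@)) se label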

Thank you for any insights you can provide!

multilevel ordinal probit model_meoprobit_icc

I am intending to fit a multilevel ordinal probit model. The dependent variable is injury severity and the model has three levels: injury level, vehicle level and crash level. The estimation command is below:

Code:
meoprobit injuryextent i.age i.positioninveh i.thrownout i.seatbelt i.vehyear || reportid: || vehid:

However, when I use estat icc to calculate the intraclass correlations, Stata reports "requested action not valid after most recent estimation command". Is this an error in my data construction, or something else? Do you know how to calculate the intraclass correlations in this situation?
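For reference, a hedged sketch of one manual workaround, on the assumption that the error simply means estat icc is not supported after meoprobit: on the latent scale of a probit model the residual variance is fixed at 1, so the intraclass correlations of a random-intercept model can be computed by hand from the reported variance components. The variance values below are hypothetical placeholders to be read off the meoprobit output:
Code:
* var(_cons) at the crash (reportid) and vehicle (vehid) levels -- placeholders
scalar v_crash = .45
scalar v_veh   = .30

* latent-scale ICCs for a random-intercept ordinal probit
display "ICC(crash)            = " v_crash / (v_crash + v_veh + 1)
display "ICC(vehicle & crash)  = " (v_crash + v_veh) / (v_crash + v_veh + 1)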

outreg2 after tab1

Using the code below, I generate several one-way frequency tables. I would like to send all the tables to Excel or Word using a single outreg2 command. I was also wondering whether all of the tables can be combined into a single table.

Code:
by NH_RCF, sort : tab1 BQ4 BQ5 agecat_ agecat Race BQ6 BQ8 BQ7 BQ10 size_cat Ownership urban, m
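As an aside, a hedged sketch of one alternative route, since outreg2 is not designed to capture tab1 output: loop over the variables with estpost tabulate (from the same user-written estout package as esttab) and write each frequency table, split by NH_RCF, to a single document. The output file name is a placeholder, and the cells()/layout choices would need adjusting to taste:
Code:
* ssc install estout   // if not already installed
local first = 1
foreach v of varlist BQ4 BQ5 agecat_ agecat Race BQ6 BQ8 BQ7 BQ10 size_cat Ownership urban {
    estpost tabulate `v' NH_RCF, missing
    if `first' {
        esttab using freq_tables.rtf, cells("b colpct") unstack noobs replace title(`v')
    }
    else {
        esttab using freq_tables.rtf, cells("b colpct") unstack noobs append title(`v')
    }
    local first = 0
}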

asdoc : addition of bysort prefix with tab, tab1, tab2 commands

I have just added support for the bysort prefix with the tabulation commands (tab, tab1, tab2) in asdoc. Details and examples can be found here: https://fintechprofessor.com/2020/03...n-asdoc-stata/
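A brief usage sketch (the exact syntax is assumed from the linked post; the auto dataset is used purely for illustration):
Code:
sysuse auto, clear
bysort foreign : asdoc tab rep78, replace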

The new version of asdoc can be installed from my site. Copy and paste the following line in Stata and press enter.
Code:
net install asdoc, from(http://fintechprofessor.com) replace
Please note that the above line has to be copied in full. After installing the new version, restart Stata.

Please do remember to cite asdoc. To cite:
In-text citation
Tables were created using asdoc, a Stata program written by Shah (2018).

Bibliography
Shah, A. (2018). ASDOC: Stata module to create high-quality tables in MS Word from Stata output. Statistical Software Components S458466, Boston College Department of Economics.

Poisson command on panel data

When I run the following command:

Code:
xtpqml lexp_food year mig_stats_16 mig_stats_yr sex sector edu_hh employ marital_st HSIZE, fe i(HHID)

Stata displays the note "you are responsible for interpretation of non-count dep. variable". Why does this happen?

Within estimator estimate

I have the following equation:
[The equation was attached as an image in the original post and is not reproduced here.]

In my data I have year and id.

I want to obtain the within (fixed-effects) parameter estimates for the above regression. Which one of the following would be correct, and why?

Code:
xtreg A L1.A L1.B L1.C yr*, fe
xtreg A L1.A L1.B L1.C, fe
xtreg A B C yr*, fe
xtreg A B C, fe

Can anyone help me? Should I also include vce(robust)? If not, why not?

Querying multiple variables at a time

Hi everyone,
I am working with a database where each patient has fifteen procedure variables: proc1 proc2 proc3 ... proc14 proc15.
I would like to generate a new variable, "int", equal to 1 if any of proc1-proc15 == "intubation".
Is there a more efficient way to achieve this than:

Code:
generate int = 0
replace int = 1 if proc1=="intubation"|proc2=="intubation"|proc3=="intubation"|proc4=="intubation"|proc5=="intubation"| etc etc etc
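A minimal sketch of one more compact approach (as an aside, int is a reserved word in Stata, so a different variable name is used below):
Code:
* flag the patient if any of the fifteen procedure variables equals "intubation"
gen byte intub = 0
foreach v of varlist proc1-proc15 {
    replace intub = 1 if `v' == "intubation"
}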

Thank you in advance!

Repeated time values within panel when declaring Panel Data

Hi all,

I have panel data on the bilateral trade volume of 55 country pairs over a 15-year period (an unbalanced panel).
So far I have finished reshaping and am trying to declare the panel data before running the regression.

When I ran the following commands:

Code:
egen group = group(countryi countryj)
su group, meanonly
xtset group year

the result was:

Code:
xtset group year
repeated time values within panel
r(451);

In an attempt to fix it, I did:

Code:
drop if year==year[_n-1]
xtset group year, yearly

and the panel was declared as weakly balanced. Then I ran the regression and got r(2000), no observations.

I could not figure out how best to fix this. Any help or guidance would be much appreciated.
I am happy to provide more details if needed.
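As an aside, a minimal diagnostic sketch: before dropping anything, it may help to see which pair-year combinations repeat, since r(451) means that group and year do not uniquely identify the observations:
Code:
duplicates report group year
duplicates tag group year, gen(dup)
list group countryi countryj year if dup > 0, sepby(group)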

Cheers,
Thanh Hao

three way panel data regression with fixed effects and clustered error term

Hi!

I want to run a regression on panel data. The data consist of banks (subscript i) in countries (subscript c) observed in specific years (subscript t), with explanatory variables A_ict, B_ict, C_ict.
Please note that I need to include fixed effects (FE) for bank (i) and year (t), but neither the fixed effects nor the constant should be reported.
Next, standard errors should be clustered at the bank (i) level.

Can you confirm whether the following is the right code to use in Stata?


INDEP_ict = α + β1 A_ict + β2 B_ict + β3 C_ict + u_i + v_t + ε_it

Code:
egen panel_id = group(bank country)
xtset panel_id year
xtset id year
xtreg INDEP A B C i.year, fe vce(cluster bank)
outreg2 using regression_results, replace excel dec() drop(i.year)
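As an aside, a hedged sketch of how this specification is often written with the community-contributed reghdfe command (ssc install reghdfe), which absorbs the fixed effects so that neither they nor a constant are reported. This assumes each bank sits in a single country, so the bank-country group coincides with the bank; otherwise cluster on the bank identifier itself:
Code:
* ssc install reghdfe   // if not already installed
egen panel_id = group(bank country)
reghdfe INDEP A B C, absorb(panel_id year) vce(cluster panel_id)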


Thank you in advance.

Is Regression and Granger Causality Related?

Hi Everyone!


Short Version:
Is it normal that variable A Granger-causes variable B, but A is not a good independent variable in a regression for B? That is, when you regress B (dependent) on A (independent), A has a really high p-value / low significance.


Long Version:
I am currently writing my dissertation, investigating the relationship between Google search volume (search volume index = SVI) and stock prices, and I have two main objectives:

1. To investigate whether adding SVI to the Fama-French three-factor model actually improves the model.
(For those who are not familiar with the model: the dependent variable is the stock's return premium, which can loosely be thought of as the stock price; the independent variables are some factors that are not important for this question, plus the added Google SVI.)
2. To investigate whether Google SVI Granger-causes stock price movements (or the other way around).

To cut to the chase, I analysed the data using panel regression and found that Google SVI is a poor addition to the Fama-French model (high SVI p-value and a falling adjusted R-squared when SVI is added). That is, from this analysis, Google SVI is not an important variable for predicting stock prices.

On the other hand, I found that Google SVI Granger-causes stock price movements (very low p-values, from 0.17-0.47, between lags of 2 and 7 periods).

I find these results counter-intuitive: if SVI Granger-causes stock price movements, it should also be a good addition to the Fama-French model, but it isn't. Is this normal, i.e. can a variable perform poorly in a regression but do well in a Granger test?

Thanks in advance!

loop or other methods?

Dear All, I have this dataset.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte country float year double I float K2007
1 2007 47639371944.1504 1.027757e+12
1 2008 47079741922.4759 1.027757e+12
1 2009 23310776752.8605 1.027757e+12
1 2010 23169892635.7175 1.027757e+12
1 2011 25140632088.3109 1.027757e+12
1 2012 26392333282.9276 1.027757e+12
1 2013 24184560713.7457 1.027757e+12
1 2014 18385377287.1616 1.027757e+12
1 2015 16695019910.2777 1.027757e+12
1 2016  20107616311.306 1.027757e+12
1 2017 23337869852.3111 1.027757e+12
1 2018  26664902464.842 1.027757e+12
2 2007 6067174174.07463 276483178496
2 2008 7784184466.76024 276483178496
2 2009 9714662213.96143 276483178496
2 2010 10608411137.8261 276483178496
2 2011 11446475617.9555 276483178496
2 2012 12774266788.7859 276483178496
2 2013 14026144934.6544 276483178496
2 2014 15551083505.0151 276483178496
2 2015 17012885354.2527 276483178496
2 2016 17710413653.7783 276483178496
2 2017 21146233903.1325 276483178496
2 2018 24973702239.7895 276483178496
end
label values country country
label def country 1 "乌克兰", modify
label def country 2 "乌兹别克斯坦", modify
For each country, I'd like to generate a variable K defined by K_t = (1 - 0.06)*K_{t-1} + I_t, where the value of K in year 2007 is K2007. Any suggestions? Thanks.
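A minimal sketch of the recursion, assuming (as in the example) that each country's series starts in 2007 and the years run consecutively:
Code:
* seed K with K2007 in each country's first year, then apply K_t = 0.94*K_{t-1} + I_t
bysort country (year): gen double K = K2007 if _n == 1
bysort country (year): replace K = (1 - 0.06)*K[_n-1] + I if _n > 1
If the 2007 value should instead already include 2007 investment (i.e. K_2007 = 0.94*K2007 + I_2007), adjust the first line accordingly.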

How to change the font in GRAPHIC

Hi everyone! A journal has asked me to set the font in a graph to Arial, and I have not been able to find a way to do this. Can you help me? Thanks in advance.
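A minimal sketch of one approach: graph set can change the default font used for graphs, both on screen and for exported formats:
Code:
graph set window fontface "Arial"     // font used when drawing graphs
graph set eps fontface "Arial"        // font recorded in EPS exports
graph set svg fontface "Arial"        // font recorded in SVG exports (newer Stata versions)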

How to convert monthly treasury bill rate into daily rate ?

Respected All,
Can anyone help me with how to convert a monthly treasury bill rate into a daily rate?
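A hedged sketch of one common convention, assuming the monthly figure is a per-month (not annualized) rate that should be converted to an equivalent per-day rate by compounding; the variable name and the 30-day count are placeholders (use 21 for trading days, or a simple division if the rates are simple rather than compounded):
Code:
gen double tbill_daily = (1 + tbill_monthly)^(1/30) - 1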

Regards,
Saba Kausar

How do I keep "missing values" missing?

Hello everyone,

I am currently working with panel data (firm, year) for my seminar paper and I am preparing the data for analysis. My problem is as follows:

I generated the variable CETR (Cash Effective Tax Rate) with the command
Code:
gen CETR = CF_TAXATION / PRETAX_INCOME

The results included some negative values, some values larger than 1, and also missing values.

Now, in an effort to control for outliers, I wanted to winsorize CETR to 0 and 1, i.e. if CETR > 1 it should be set to 1, and if CETR < 0 it should be set to 0:
Code:
replace CETR=0 if CETR<0
replace CETR=1 if CETR>1

Looking at the results, I saw that Stata assigned the value 1 to observations where CETR was originally missing, because Stata treats missing values as larger than any number. Since I have a substantial amount of missing data, this biases my results considerably. My question is therefore: how do I alter the previous commands to prevent this, i.e. how do I tell Stata to keep missing values missing in this setting?
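A minimal sketch of the usual guard: since missing counts as larger than any number, the upper-bound replacement needs an explicit missing-value condition (the lower-bound one is already safe, because missing is never less than 0):
Code:
replace CETR = 0 if CETR < 0
replace CETR = 1 if CETR > 1 & !missing(CETR)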

Thanks in advance and kind regards,

Lucas

