
countfit macro problems

Hello, I am trying to download the countfit command for use with count models (http://www.ats.ucla.edu/stat/stata/faq/countfit.htm). For some reason, it does not come up when I use findit, and ssc install cannot find it. Can anyone advise, or should I just contact the author of the command?

Stata appears to be rounding at random

Never mind. It is late and I am being stupid. Feel free to delete this post.

Comparisons of means at timepoints (beginner question)

Hi all,

I have fit the model below and want to make sure of the correct way to do a certain set of pairwise comparisons. I want to see whether the average of the outcome variable (adhdsev) at each of the four later timepoints (TimeWeight 1-4) was significantly higher than at the starting point (TimeWeight = 0). I'd like to do this separately by i.adhdsubtype (3 possible subtypes, for a total of 12 pairwise comparisons), and also for the overall means (not broken down by subtype, 4 pairwise comparisons). I was able to run margins with confidence intervals, but I thought something like pwcompare might be necessary to adjust for multiple comparisons; however, I'm unsure how to specify the levels of pwcompare for a continuous variable (see the sketch after the output below).

I also wanted to ask whether there are any caveats in interpreting the comparisons (or margins) given the additional interaction i.adhdsubtype#c.TimeWeight#c.TimeWeight in the model. E.g., are the marginal means set at a certain value of the deceleration of the slope that might make interpretation difficult?

Code:
. mixed adhdsev c.TimeWeight c.TimeWeight#i.adhdsubtype c.TimeWeight#c.TimeWeight i.adhdsubtype#c.TimeWeight#c.TimeWeight, ///
>            || id: TimeWeight, variance mle covariance(unstructured) ///
>            residuals(independent,t(TimeWeight)), 
Note: t() not required for this residual structure; ignored

Performing EM optimization: 

Performing gradient-based optimization: 

Iteration 0:   log likelihood = -434.58713  
Iteration 1:   log likelihood = -432.68323  
Iteration 2:   log likelihood = -432.66667  
Iteration 3:   log likelihood = -432.66667  

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =        244
Group variable: id                              Number of groups  =         93

                                                Obs per group:
                                                              min =          1
                                                              avg =        2.6
                                                              max =          4

                                                Wald chi2(6)      =      62.99
Log likelihood = -432.66667                     Prob > chi2       =     0.0000

-----------------------------------------------------------------------------------------------------------------
                                        adhdsev |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------------------------------------+----------------------------------------------------------------
                                     TimeWeight |  -.1340633    .360817    -0.37   0.710    -.8412517     .573125
                                                |
                       adhdsubtype#c.TimeWeight |
         ADHD, Predominantly Innattentive Type  |  -1.540448   .4016991    -3.83   0.000    -2.327764   -.7531322
ADHD, Predominantly Hyperactive-Impulsive Type  |  -2.915497   .9290948    -3.14   0.002    -4.736489   -1.094505
                                                |
                      c.TimeWeight#c.TimeWeight |  -.0286886   .1192213    -0.24   0.810    -.2623581    .2049809
                                                |
          adhdsubtype#c.TimeWeight#c.TimeWeight |
         ADHD, Predominantly Innattentive Type  |   .4004778   .1331424     3.01   0.003     .1395235    .6614322
ADHD, Predominantly Hyperactive-Impulsive Type  |   .7815377   .3276643     2.39   0.017     .1393275    1.423748
                                                |
                                          _cons |   4.752859   .1239529    38.34   0.000     4.509916    4.995802
-----------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Unstructured             |
               var(TimeWe~t) |   .2278224   .0975934      .0983927     .527509
                  var(_cons) |   .0480474   .0567667      .0047424    .4867911
         cov(TimeWe~t,_cons) |   .1046244   .0546602     -.0025076    .2117563
-----------------------------+------------------------------------------------
               var(Residual) |   1.463658   .1712223      1.163761    1.840836
------------------------------------------------------------------------------
LR test vs. linear model: chi2(3) = 21.75                 Prob > chi2 = 0.0001

Note: LR test is conservative and provided only for reference.

. 
.                    estimates store quadpredict ,

.                   lrtest final quadpredict

Likelihood-ratio test                                 LR chi2(2)  =     10.69
(Assumption: final nested in quadpredict)             Prob > chi2 =    0.0048

. margins i.adhdsubtype, at(TimeWeight=(0(1)4)) vsquish

Adjusted predictions                            Number of obs     =        244

Expression   : Linear prediction, fixed portion, predict()
1._at        : TimeWeight      =           0
2._at        : TimeWeight      =           1
3._at        : TimeWeight      =           2
4._at        : TimeWeight      =           3
5._at        : TimeWeight      =           4

-------------------------------------------------------------------------------------------------------------------
                                                  |            Delta-method
                                                  |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------------------------------------+----------------------------------------------------------------
                                  _at#adhdsubtype |
                           1#ADHD, Combined Type  |   4.752859   .1239529    38.34   0.000     4.509916    4.995802
         1#ADHD, Predominantly Innattentive Type  |   4.752859   .1239529    38.34   0.000     4.509916    4.995802
1#ADHD, Predominantly Hyperactive-Impulsive Type  |   4.752859   .1239529    38.34   0.000     4.509916    4.995802
                           2#ADHD, Combined Type  |   4.590107   .2435981    18.84   0.000     4.112664    5.067551
         2#ADHD, Predominantly Innattentive Type  |   3.450137   .1640803    21.03   0.000     3.128545    3.771728
2#ADHD, Predominantly Hyperactive-Impulsive Type  |   2.456148   .5955657     4.12   0.000      1.28886    3.623435
                           3#ADHD, Combined Type  |   4.369978   .3426055    12.76   0.000     3.698483    5.041473
         3#ADHD, Predominantly Innattentive Type  |   2.890993   .2343395    12.34   0.000     2.431696     3.35029
3#ADHD, Predominantly Hyperactive-Impulsive Type  |   1.665134   .7704188     2.16   0.031     .1551414    3.175128
                           4#ADHD, Combined Type  |   4.092471   .4908227     8.34   0.000     3.130477    5.054466
         4#ADHD, Predominantly Innattentive Type  |   3.075428   .3050299    10.08   0.000      2.47758    3.673275
4#ADHD, Predominantly Hyperactive-Impulsive Type  |   2.379819   1.083324     2.20   0.028     .2565441    4.503094
                           5#ADHD, Combined Type  |   3.757588   .9069076     4.14   0.000     1.980081    5.535094
         5#ADHD, Predominantly Innattentive Type  |   4.003441    .508255     7.88   0.000     3.007279    4.999602
5#ADHD, Predominantly Hyperactive-Impulsive Type  |   4.600202    2.19747     2.09   0.036     .2932402    8.907164
-------------------------------------------------------------------------------------------------------------------

. marginsplot, x(TimeWeight)

  Variables that uniquely identify margins: TimeWeight adhdsubtype
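A hedged sketch of one way to get the baseline comparisons with a multiplicity adjustment, using the standard contrast machinery of margins (the Bonferroni adjustment is just one choice; r._at requests contrasts of each at() level against the first, i.e. TimeWeight = 0):

Code:
* each timepoint vs. TimeWeight = 0, within each subtype (3 x 4 = 12 contrasts)
margins adhdsubtype, at(TimeWeight=(0(1)4)) contrast(atcontrast(r._at)) mcompare(bonferroni)

* each timepoint vs. TimeWeight = 0, not broken down by subtype (4 contrasts)
margins, at(TimeWeight=(0(1)4)) contrast(atcontrast(r._at)) mcompare(bonferroni)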



Unable to Run Accrual Model

Hello, I am trying to run the accruals model found at http://www.statalist.org/forums/foru...=1472789396393, but I am having trouble. With the first set of code, I receive the error message "no observations" after obtaining a few dozen values. With the second set, which adds capture, I get more observations (about 110) but then receive the error message "no variables defined". The final set produces even more observations, but when I check back a few hours later the code appears to have stalled: Stata is no longer making any changes to the data, yet the status wheel is still spinning, and I cannot break the run without killing Stata through Task Manager. I am not sure how to get this code to run correctly after these three attempts at achieving accrual glory.

(1)
Code:
gen uhat = .
forvalues j = 1/`=_N' {
    regress totalaccruals assetsinverselag dev ppe ///
        if sic_2 == sic_2[`j'] & year == year[`j'], nocons
    if e(N) > 9 {
        replace uhat = totalaccruals - (_b[assetsinverselag]*assetsinverselag ///
            + _b[dev]*dev + _b[ppe]*ppe) in `j'
    }
}
(2)
Code:
gen uhat = .
forvalues j = 1/`=_N' {
    capture regress totalaccruals assetsinverselag dev ppe ///
        if sic_2 == sic_2[`j'] & year == year[`j'], nocons
    if e(N) > 9 {   // note: e(N) is stale here if the captured regress failed
        replace uhat = totalaccruals - (_b[assetsinverselag]*assetsinverselag ///
            + _b[dev]*dev + _b[ppe]*ppe) in `j'
    }
}
(3)
Code:
gen uhat = .
forvalues j = 1/`=_N' {
    capture regress totalaccruals assetsinverselag dev ppe ///
        if sic_2 == sic_2[`j'] & year == year[`j'], nocons
    if !_rc & e(N) > 9 {   // only use estimates when regress succeeded
        replace uhat = totalaccruals - (_b[assetsinverselag]*assetsinverselag ///
            + _b[dev]*dev + _b[ppe]*ppe) in `j'
    }
}

ARDL panel model in Stata

Hi all, I have a question about ARDL models in Stata.
As I understand it, the xtpmg command can estimate an ARDL(1,1). Here is my command:
xtpmg d.lreerq d.lgdppcq d.nfagdpq d.ltotq d.lopenq d.lgovgdpq d.flo, lr(lreerq lgdppcq nfagdpq ltotq lopenq lgovgdp flo) ec(ECT) replace omg
After running this command I do not get significant coefficients, so I want to run an ARDL(1,2,1,1,1,1).
I generated a new variable, the first lag of lgdppcq, named lgdppcq_lag, using the following command:
gen lgdppcq_lag = lgdppcq[_n-1]
Then I substituted lgdppcq_lag for lgdppcq in the xtpmg command, as follows:
xtpmg d.lreerq d.lgdppcq_lag d.nfagdpq d.ltotq d.lopenq d.lgovgdpq d.flo, lr(lreerq lgdppcq_lag nfagdpq ltotq lopenq lgovgdp flo) ec(ECT) replace omg
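One caveat worth flagging: in a panel, subscripting with [_n-1] takes the value from the previous row of the dataset regardless of panel boundaries, so each country's first observation can pick up another country's value. A panel-safe sketch (assuming the data are xtset; the identifier names are hypothetical):

Code:
xtset country quarter          // hypothetical panel id and time variable
gen lgdppcq_lag = L.lgdppcq    // correctly missing at each panel's first period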
My question is: am I running an ARDL(1,2,1,1,1,1) model with this command?
Thank you very much in advance.


Regards

Reshape long to wide: need to drop non-constant observations

Hello Statalist members!

I have a long dataset that I need to reshape to wide. For some variables, the values within i are not constant, and I need to tag these observations so I can drop them. Any ideas on how I can achieve this? (See the sketch below.)
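A minimal sketch of one way to tag them (assuming the group identifier is named id and x is a variable that should be constant within id; both names are hypothetical):

Code:
* after sorting x within id, a group is constant iff its first and last values agree
bysort id (x): gen byte bad = x[1] != x[_N]
drop if bad   // or: list if bad, to inspect before dropping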

Thanks!

Best regards

Mikael Lundholm
PhD student
sociology of law
Lund University
Sweden

Creating a time index

My data currently have a period variable: a number whose first 4 digits are the year and last two the month — e.g., Jan 2012 is 201201. These run from Jan 2012 to Dec 2015. Is there a way to create a new variable numbering these periods from 1 to 48? (See the sketch below.)
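A hedged sketch of one way to do this (assuming the variable is named period and is numeric, like 201201; ym() and the %tm format are standard Stata):

Code:
gen mdate = ym(floor(period/100), mod(period, 100))  // proper monthly date
format mdate %tm
gen t = mdate - ym(2012, 1) + 1                      // 1 = Jan 2012, ..., 48 = Dec 2015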

Calculating returns from the CRSP dataset: 5 quintile portfolios

I am working on a thesis project. Part of it is calculating returns of 5 portfolios formed on 36-month volatility over the 1960-2015 period, using the CRSP dataset.

Sample of dataset:

permno date comnam prc ret shrout
10001 30sep1986 GREAT FALLS GAS CO 6.375 -.003076924 991
10001 31oct1986 GREAT FALLS GAS CO 6.625 .039215688 991
10001 28nov1986 GREAT FALLS GAS CO 7 .056603774 991
10001 31dec1986 GREAT FALLS GAS CO 7 .015 991
10001 30jan1987 GREAT FALLS GAS CO 6.75 -.035714287 991
10001 27feb1987 GREAT FALLS GAS CO 6.25 -.074074075 991
10001 31mar1987 GREAT FALLS GAS CO 6.375 .036800001 991
10001 30jun1987 GREAT FALLS GAS CO 5.875 .051428571 991
10001 31jul1987 GREAT FALLS GAS CO 6 .021276595 991
10001 31aug1987 GREAT FALLS GAS CO 6.5 .083333336 991
10001 30sep1987 GREAT FALLS GAS CO 6.25 -.022307692 992
10001 30oct1987 GREAT FALLS GAS CO 6.375 .02 992
10001 30nov1987 GREAT FALLS GAS CO 6.1875 -.029411765 992
10001 31dec1987 GREAT FALLS GAS CO 5.875 -.033535354 992
10001 29jan1988 GREAT FALLS GAS CO 6.25 .063829787 992
10001 29feb1988 GREAT FALLS GAS CO 6.75 .079999998 992
10001 31mar1988 GREAT FALLS GAS CO 6.125 -.0762963 992
10001 30jun1988 GREAT FALLS GAS CO 6.25 -.012038835 992

prc = Price
ret = monthly return including dividends
shrout = Shares outstanding

This is my code so far.

Generating 36 month volatility:
Code:
gen monthlydate = mofd(date)
format monthlydate %tm
rangestat (sd) sd_ret= ret, interval(monthlydate, -36, -1) by(permno)
by permno (monthlydate), sort: replace sd_ret = . if _n < 37
Generating quintiles:
Code:
egen quintiles = xtile(sd_ret), by(monthlydate) nq(5)
Generating market capitalisation and portfolio weights. The weight of an individual stock is its market capitalisation divided by the total market capitalisation of all companies in that month's quintile portfolio.
Code:
gen mktcap = shrout*prc
tostring quintiles, generate(qstr)
tostring monthlydate, generate(mtdstr)
gen qmth = qstr + "x" + mtdstr
by qmth, sort: egen mktcap_q_month = sum(mktcap)
gen value_weight = mktcap / mktcap_q_month
Generating Portfolio returns:
Code:
gen wght_ret = value_weight * ret
by qmth, sort: egen qmthreturn = sum(wght_ret)
by quintiles, sort: egen avg_qmthreturn = mean(qmthreturn)
gen avg_q_yr_return = avg_qmthreturn * 12
This gives me yearly portfolio returns that do not make sense to me and are not in line with the academic literature:

Q1: 13.48%
Q2: 17.43%
Q3: 21.49%
Q4: 27.78%
Q5: 39.97%


Here's a dropbox with my dataset in which I've executed the code above: Link

I'd like to know what went wrong. Can you give advice? Thanks in advance!
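One pattern worth checking, offered only as a possibility: avg_qmthreturn is averaged over stock-level rows, so months with more stocks in a quintile receive proportionally more weight. A minimal sketch that averages each quintile-month exactly once (variable names as in the code above):

Code:
preserve
collapse (first) qmthreturn quintiles, by(qmth)             // one row per quintile-month
collapse (mean) avg_qmthreturn = qmthreturn, by(quintiles)  // equal-weighted over months
gen avg_q_yr_return = avg_qmthreturn * 12
list
restore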

How can I drop observations equal to certain numbers from two separate variables?

How can I drop observations equal to certain numbers (in my case 0 and -9) in two separate variables? The requirement: drop an observation only if its values in both variables are 0 or -9; if either V1 or V2 holds any other number, the observation should be kept.

PS. I am new to Stata.
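A minimal sketch (assuming the variables are named V1 and V2, as in the post):

Code:
* drop only when both variables are 0 or -9
drop if inlist(V1, 0, -9) & inlist(V2, 0, -9)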

Code setting for regression by city and year

In the thread entitled "regression by city and year", which I created in this forum in March, I got help building a syntax that applies the command hoi for each state (uf) and year (ano) to a database of 27 states and 9 years; the logistic regression that hoi fits consists of one outcome and nine predictors. The code I am using is as follows:

capture postutil clear
postfile handle ufs ano hoi_1 d_1 p_1 using hoi_agua, replace

local predictors presmae metrop area logrenpcdef nmorad refsexo refraca medescresp difescresp

local outcomes agua

levelsof uf, local(ufs)
foreach c of local ufs {
    levelsof ano if uf == `c', local(anos)
    foreach y of local anos {
        display "uf = `c', ano = `y'"
        foreach o of varlist `outcomes' {
            capture noisily hoi `o' `predictors' [fw = pesopes] ///
                if uf == `c' & ano == `y', format(%9.3f) estimates decomp1
            local rc = c(rc)     // save the return code before later commands reset it
            if `rc' == 2000 {    // hoi failed due to no observations
                display "No observations, or outcome `o' is not dichotomous: analysis skipped"
            }
            else if `rc' != 0 {  // some other error arose attempting hoi
                display in red "Error running hoi with outcome `o', uf = `c', ano = `y'"
                exit `rc'        // show error code and stop
            }
            if `rc' == 0 post handle (`c') (`y') (`r(hoi_1)') (`r(d_1)') (`r(p_1)')
        }
    }
}

postclose handle
I repeated this code seven times because I needed to do the same thing for seven different outcomes.
At the moment I need to adjust the code to apply the command hoi and capture the results for five specific years and, if possible, for all outcomes at once, so that I do not have to repeat the new code seven times.
I would like your help again. (See the sketch below.)
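A hedged sketch of one way to do both at once: record the outcome name in the postfile and loop over a fixed year list (the year values and the full outcome list below are assumptions to be replaced):

Code:
capture postutil clear
postfile handle str32 outcome ufs ano hoi_1 d_1 p_1 using hoi_all, replace
local predictors presmae metrop area logrenpcdef nmorad refsexo refraca medescresp difescresp
local outcomes agua                     // list all seven outcomes here
local years 2001 2003 2005 2007 2009    // hypothetical five years of interest
levelsof uf, local(ufs)
foreach c of local ufs {
    foreach y of local years {
        foreach o of local outcomes {
            capture noisily hoi `o' `predictors' [fw = pesopes] ///
                if uf == `c' & ano == `y', format(%9.3f) estimates decomp1
            local rc = c(rc)
            if `rc' == 0 post handle ("`o'") (`c') (`y') (`r(hoi_1)') (`r(d_1)') (`r(p_1)')
            else if `rc' != 2000 display in red "error `rc': outcome `o', uf = `c', ano = `y'"
        }
    }
}
postclose handle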

Thanks in advance.

Premultiplication vs. single variables for instrumenting

I think this is a more theoretical question, but I cannot find a solution in the literature.

I am running a 2SLS regression in which one variable raises an endogeneity (reverse causality) problem, so I have instrumented it. The problem is that the instrumented variable is panel data (a value for each country and year), while the instruments are either country-specific (constant over time) or time-specific (constant across countries). Does it make a difference whether I premultiply the instruments to obtain a panel "instrument variable", or enter them all linearly and run an OLS regression?

Kind regards, Noah

R-Squared oaxaca (mi estimate)

Dear Statalist members,
Does anyone know whether it is possible to obtain r-squared (and adj. r-squared) values for oaxaca analyses using mi estimate?
Thank you,

Industry weights for post event years from a pre event year without producing only missing values

Hello,

I have panel data with 5,651 observations from a mergers and acquisitions sample. I want to compute the relative industry weights for the target and acquirer industries (for weighting returns, operating profits, etc.) in the three years before and after the year the merger took place.
In the three premerger years this is no problem, since the weight can be computed directly for each year. In the three postmerger years, however, the relative industry weights should be the weights of premerger year -1. So in the example below, for deal id 1, the variable post_trgt_weight (target industry weight in the postmerger years) should be 0.05319372 whenever dif >= 1 & dif <= 3.

Id is the deal id, fyear is the fiscal year of the data, eff_year the year the merger took place, and dif the difference between the two. Pre is a dummy variable taking the value 1 in the premerger years (dif >= -3 & dif <= -1), whereas post is a dummy taking the value 1 in the postmerger years (dif >= 1 & dif <= 3).

pre_trgt_weight is the relative weight of the target industry premerger: assets of the target company divided by total assets (target plus acquirer). For this I typed:
gen pre_trgt_weight = trgt_pre_totalassets / mergedcomp_pre_totalassets if pre==1 (the total-assets variables are not shown in the example below).

pryr_trgt_weight is the relative weight in premerger year -1 (dif == -1).

However, when I try to generate the postmerger industry weights with

gen post_trgt_weight = pryr_trgt_weight if post == 1

it only creates missing values for all 5,651 observations.

Is there a solution for getting the weights into the postmerger years? I need them for the next steps in my thesis, and I have been struggling with this for a while without finding a solution. (See the sketch after the example data below.)

Thanks!


Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input id fyear eff_year dif pre post pre_trgt_weight pre_acqr_weight pryr_trgt_weight pryr_acqr_weight post_trgt_weight post_acqr_weight
    1  1997       2000 -3  1   .      .08717766      .9128223            .             .                 .             .
    1  1998       2000 -2  1   .      .07802196      .9219781            .             .                 .             .
    1  1999       2000 -1  1   .      .05319372      .9468063          .05319372      .9468063             .             .
    1  2000       2000  0  .   .              .        .                  .             .                 .             .
    1  2001       2000  1  .   1              .        .                  .             .                 .             .
    1  2002       2000  2  .   1              .        .                  .             .                 .             .
    1  2003       2000  3  .   1              .        .                  .             .                 .             .
    1  2004       2000  4  .   .              .        .                  .             .                 .             .
    1  2005       2000  5  .   .              .        .                  .             .                 .             .
    2  1997       2001 -4  .   .              .        .                  .             .                 .             .
    2  1998       2001 -3  1   .       .5979492      .4020509            .             .                 .             .
    2  1999       2001 -2  1   .        .598115       .401885             .             .                 .             .
    2  2000       2001 -1  1   .       .4489449      .5510551           .4489449      .5510551             .             .
    2  2001       2001  0  .   .              .        .                  .             .                 .             .
    2  2002       2001  1  .   1              .        .                  .             .                 .             .
    2  2003       2001  2  .   1              .        .                  .             .                 .             .
    2  2004       2001  3  .   1              .        .                  .             .                 .             .
    2  2005       2001  4  .   .              .        .                  .             .                 .             .
    2  2006       2001  5  .   .              .        .                  .             .                 .             .
              
end
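A minimal sketch of one way to do this, using the variable names from the example above. The gen attempt fails because pryr_trgt_weight is evaluated row by row and is missing on every post == 1 row; a fix is to first spread each deal's dif == -1 value across all of that deal's rows:

Code:
bysort id: egen double pryr_w = max(cond(dif == -1, pre_trgt_weight, .))
replace post_trgt_weight = pryr_w if post == 1
drop pryr_w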


Graph with two different scales

Hello,

this time I have a graph with two y-axis scales, but the second scale comes out very small, as shown below. I can't use rescale. Do you have any suggestions?

Thank you very much in advance!



[attached graph omitted]


twoway line flowUE unempr year, sort clpattern(1 shortdash) ///
    clcolor(black black) ///
    legend(row(2) col(1) pos(11) order(1 "Job Finding Rate (UE)" 2 "Unemployment Rate") ///
        ring(0) size(medsmall) region(lc(white))) graphregion(color(white)) ///
    yaxis(1 2) ytitle(Annual Transition Rate) ///
    ylabel(0(0.05)0.5, nogrid angle(horizontal)) xscale(titlegap(3)) ///
    xtitle(Years) ylabel(, axis(2) angle(horizontal)) ///
    ylabel(0 "0" 0.02 "0.02" 0.04 "0.04" 0.06 "0.06" 0.08 "0.08", axis(2)) ///
    ytitle(Fraction of Labor Force, axis(2)) ///
    yscale(titlegap(3) axis(2)) yscale(titlegap(3) axis(1)) ///
    title(Figure 5)
* scale(0(0.02)0.15, axis(2))   // not valid syntax: scale() is not an axis option
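A hedged sketch of one way to enlarge the second scale: put each series in its own plot, assign the second to axis 2, and pin that axis's range explicitly (the 0-0.08 range is an assumption; adjust to the data):

Code:
twoway (line flowUE year, sort)                ///
       (line unempr year, sort yaxis(2)),      ///
    ylabel(0(0.05)0.5, axis(1))                ///
    ylabel(0(0.02)0.08, axis(2))               ///
    yscale(range(0 0.08) axis(2))              ///
    ytitle("Annual Transition Rate", axis(1))  ///
    ytitle("Fraction of Labor Force", axis(2))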

Float vs. double - data precision

Dear all,

I have two variables in my dataset; call them xvar and yvar:

Code:
xvar  float   %9.0g                
yvar        double  %12.0g
I use the command

Code:
list xvar yvar if xvar < yvar & yvar != .
and it lists observations like this in the Results window:
Code:
xvar  yvar
5.1 5.1
To the naked eye, they are the same value. I need to delete observations that meet the condition

Code:
if xvar < yvar & yvar != .
My questions are:
  1. Should I convert the variables to the same type?
  2. Should I use double or float? (See the sketch below.)
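For context: 5.1 has no exact binary representation, so its float and double roundings differ slightly, which is why the comparison fires even though the printed values look identical. A minimal sketch that compares both values at float precision using the float() function:

Code:
list xvar yvar if xvar < float(yvar) & !missing(yvar)
drop if xvar < float(yvar) & !missing(yvar)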
Thanks,

Rochelle



Wilcoxon signed-rank test z-score

Hi all, I'm stuck on a Wilcoxon signed-rank test z-score problem. Can someone tell me exactly how the command signrank calculates the p-value and z? I found that it gives a different result from R when there are many zeros in the test data.

Here are the data. You can try R (wilcox.test(Export, Export_U, alternative = "two.sided", paired = TRUE)) and Stata (signrank Export = Export_U); the results differ, even the p-value.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(Export Export_U)
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
  .02333072           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
  .03802535           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0  .012133468
          0           0
          0           0
  .01906245           0
   .2665769           0
          0           0
          0           0
          0           0
.0013075314           0
          0           0
          0           0
          0           0
          0           0
  .19016194  .026820315
          0           0
          0           0
          0           0
          0           0
          0           0
          0    .4625474
          0           0
          0           0
          0           0
          0           0
          0           0
          0    .5120914
          0           0
 .006225195           0
          0           0
          0           0
  .24785425   .03528966
          0           0
          0           0
  .03483107           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0   .11168677
          0           0
          0   .05568282
          0   .01983493
          0           0
          0    .3025498
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          0           0
          1           0
          0  .012808466
          0           0
          0     .406177
          0           0
          0           0
          0 .0046266057
          0           0
          0           0
          0           0
          0 .0042638998
          0           0
          0           0
end

Moving Averages

Hi,

I am stuck on something I am working on and really need your help. I have a dataset of closing prices of a stock (nbmps) on which I want to calculate moving averages and upper and lower bands, as in the Bollinger band indicator. I found a few hints about what I am trying to do and wrote my code as follows:

Code:
tsset Data   // time-series operators below require tsset; Data is the time variable
gen moveave1 = (F1.nbmps + nbmps + L1.nbmps)/3   // centred 3-period moving average
gen upperband = moveave1*1.04
gen lowerband = moveave1*0.96


Here's the problem: when I plot

line nbmps moveave1 upperband lowerband Data

the graph shows 4 lines. One is the price series, which is fine, but the other 3 are more or less equal: they do not differ in position and are overlaid.


I read that some people generate the upper band as SMA + 0.04 instead. But I am a new Stata user, so maybe I am writing something wrong. (See the sketch below.)
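For reference, classic Bollinger bands use a rolling standard deviation rather than a fixed percentage around the moving average. A hedged sketch (the 20-period window and 2-SD multiplier are the conventional defaults, assumed here; rangestat is user-written, from SSC):

Code:
tsset Data
tssmooth ma sma20 = nbmps, window(19 1 0)          // 20-period trailing mean
rangestat (sd) sd20 = nbmps, interval(Data -19 0)  // rolling SD (ssc install rangestat)
gen upper = sma20 + 2*sd20
gen lower = sma20 - 2*sd20
line nbmps sma20 upper lower Data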

All answers are welcome. Thanks for your collaboration!

Chris

Rolling market beta calculation (rolling rejected results from regress while using the entire dataset)

Hi everyone!
I am using Stata 13. I have a strongly balanced panel of 473 firms' stocks from July 2000 to June 2016. I am trying to calculate rolling market betas for each stock using a maximum of 60 and a minimum of 24 monthly excess stock returns from the months prior to month t. If fewer than 24 returns are available, the market beta is set to missing.
I was able to try some Stata codes which I found in Stata forum (http://www.statalist.org/forums/foru...tas-for-stocks)

Code:
*Regression of stockexcessret on market value-weighted excess return (mktrf)
levelsof permno, local(permno)
foreach s of local permno {
    rolling _b, window(60) saving(betas_`s', replace) reject(e(N) < 24):  ///
        regress stockexcessret mktrf if permno == `s'
}
foreach p of local permno {
    merge 1:1 permno dm using betas_`p', nogenerate
}
}
I also tried the rangestat code too from the same link as above:
Code:
xtset permno month
save "test_data.dta", replace

* ------------ regressions over a window of 60 periods using -rangestat- --------
* define a linear regression in Mata using quadcross() - help mata cross(), 
mata:
mata clear
mata set matastrict on
real rowvector myreg(real matrix Xall)
{
    real colvector y, b, Xy
    real matrix X, XX

    y = Xall[.,1]                // dependent var is first column of Xall
    X = Xall[.,2::cols(Xall)]    // the remaining cols are the independent variables
    X = X,J(rows(X),1,1)         // add a constant
    
    XX = quadcross(X, X)        // linear regression, see help mata cross(), example 2
    Xy = quadcross(X, y)
    b  = invsym(XX) * Xy
    
    return(rows(X), b')
}
end

* regressions with a constant over a rolling window of 60 periods by permno
rangestat (myreg) stockexcessret mktrf, by(permno) interval(time -59 0) casewise

* the Mata function returns first the number of observations and then as many
* variables as there are independent variables (plus the constant) for the betas
rename (myreg1 myreg2 myreg3) (nobs rs_mktrf rs_cons)

* reject results if the window is less than 60 or if the number of obs < 24
isid permno month, sort
by permno: replace rs_mktrf = . if _n < 60 | nobs < 24
by permno: replace rs_cons  = . if _n < 60 | nobs < 24
save "rangestat_results.dta", replace

* ----------------- replicate using -rolling- ----------------------------------
use "test_data.dta", clear
levelsof permno, local(permno)
foreach s of local permno {
    rolling _b, window(60) saving(betas_`s', replace) reject(e(N) < 24):  ///
       regress stockexcessret mktrf if permno == `s'                
}

clear
save "betas.dta", replace emptyok
foreach s of local permno {
    append using "betas_`s'.dta"
}
rename end month
merge 1:1 permno month using "rangestat_results.dta"
isid permno month, sort

gen diff_mktrf =  abs(_b_mktrf - float(rs_mktrf))
gen diff_cons =  abs(_b_cons - float(rs_cons))
summ diff*


My problem is that for the first 44 stocks the regressions run properly; however, the loop stops at the 45th stock with the following message:

rolling rejected results from regress while using the entire dataset
r(9);

Both of the above approaches showed me the same result:
-> permno = 42

Rolling replications (133)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
.................................................. 100
.................................
file betas_42.dta saved
(running regress on estimation sample)

-> permno = 45

Rolling replications (133)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 50
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx................ 100
.................................
file betas_44.dta saved
(running regress on estimation sample)

-> permno = 45

Rolling replications (133)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 50
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 100
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
file betas_45.dta saved
(running regress on estimation sample)
rolling rejected results from regress while using the entire dataset
r(9);

end of do-file

r(9);


As suggested in the Statalist link above, I tried to regress stock 45 alone (and some other stocks as well) and got no observations:


regress stockexcessret mktrf if stock == 45
no observations
r(2000);
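A hedged sketch of one way to keep the loop from dying on stocks whose returns are entirely missing: count usable observations and skip the stock when there are too few (the 24 threshold mirrors the reject() rule):

Code:
levelsof permno, local(permno)
foreach s of local permno {
    quietly count if permno == `s' & !missing(stockexcessret, mktrf)
    if r(N) < 24 continue   // too little data for any window; beta stays missing
    rolling _b, window(60) saving(betas_`s', replace) reject(e(N) < 24): ///
        regress stockexcessret mktrf if permno == `s'
}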




Below is a sample of my panel data
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(month permno stockexcessret mktrf) int time
486 1 .          .   1
487 1 .          .   2
488 1 .          .   3
489 1 .          .   4
490 1 .          .   5
491 1 .          .   6
492 1 .          .   7
493 1 .          .   8
494 1 .          .   9
495 1 .          .  10
496 1 .          .  11
497 1 .          .  12
498 1 .  -2.666597  13
499 1 .  3.5859656  14
500 1 .  -4.768708  15
501 1 . -4.7542515  16
502 1 .   .8977337  17
503 1 . -20.062456  18
504 1 .  13.092803  19
505 1 .  -9.423961  20
506 1 . -.01542697  21
507 1 .   9.851523  22
508 1 .   18.13945  23
509 1 .  -2.840696  24
510 1 .   2.641178  25
511 1 .  -8.554174  26
512 1 .    5.11478  27
513 1 .   5.957329  28
514 1 .  11.100168  29
515 1 .    5.30089  30
516 1 .   7.313002  31
517 1 .   2.284107  32
518 1 .   -3.55335  33
519 1 .   6.486986  34
520 1 .   .9358135  35
521 1 .   5.765306  36
522 1 .   4.773961  37
523 1 .   5.940833  38
524 1 .   10.34814  39
525 1 .   5.599309  40
526 1 .   9.423548  41
527 1 .  -.4387302  42
528 1 .  2.1820347  43
529 1 .  11.583542  44
530 1 .   3.107684  45
531 1 .  11.022295  46
532 1 .  -3.569324  47
533 1 .   3.057642  48
534 1 .   8.880916  49
535 1 .  -1.249484  50
536 1 .   9.944527  51
537 1 .   9.023089  52
538 1 .   25.71964  53
539 1 .  10.824612  54
540 1 .   7.584715  55
541 1 .  14.726665  56
542 1 .   5.745068  57
543 1 .  -5.350675  58
544 1 .  -1.273437  59
545 1 .  1.9274507  60
546 1 .   9.804824  61
547 1 .    8.53234  62
548 1 .   9.051862  63
549 1 .   8.378527  64
550 1 .  10.610908  65
551 1 .    7.01289  66
552 1 .  21.654003  67
553 1 . -1.6742957  68
554 1 .  -2.859784  69
555 1 .  10.776278  70
556 1 .  -2.765255  71
557 1 .  -9.189032  72
558 1 .  1.2453502  73
559 1 .   8.592383  74
560 1 .   6.241508  75
561 1 .   7.328311  76
562 1 .  18.458912  77
563 1 .   9.132421  78
564 1 .   6.122295  79
565 1 .   6.564981  80
566 1 .   7.204051  81
567 1 .   7.411033  82
568 1 .  -2.748398  83
569 1 .  -.3336661  84
570 1 .  10.429055  85
571 1 .   4.844315  86
572 1 .   54.17649  87
573 1 .  2.2536652  88
574 1 .   5.300388  89
575 1 . -3.4062684  90
576 1 .   5.909176  91
577 1 .   6.878344  92
578 1 .   6.149116  93
579 1 .   9.310979  94
580 1 .   3.699365  95
581 1 .  1.4750223  96
582 1 .  -9.142094  97
583 1 .  -5.088081  98
584 1 .  -3.735739  99
585 1 .  -26.71731 100
end
format %tm month




Please, I need your help on how to generate betas for the entire dataset. Thank you.

xtivreg and first stage

I would like to ask a question about xtivreg, fe. Let me pose it without getting into the details of the model, because I am not sure the details are essential to the nature of the question. When I carry out a fixed-effects instrumental-variables regression using xtivreg, fe, I get economically sensible results with no errors in the estimation output. My question is about the first-stage regression. As I understand it, xtivreg, fe fits a fixed-effects OLS regression in the first stage. Instead, I have just tried xtlogit, fe for the first stage, and it gives the note "multiple positive outcomes within groups encountered. 10,199 groups (30,589 obs) dropped because of all positive or all negative outcomes."

Having no within-group variation in the first-stage outcome for some groups does not mean that I should also drop those groups from the second-stage analysis, because the second-stage outcome is, of course, conceptually a different variable. Still, the note xtlogit produces is discomforting: xtlogit, fe suggests dropping panel units with no over-time variation in the (first-stage) outcome, but by using xtivreg I implicitly ignore this suggestion and keep those groups. I could expect xtivreg to produce a similar note, but it does not. Should it? Should I be concerned about my xtivreg regression, given the note produced by xtlogit for the first stage?


Range plot with bars problem

Dear all

I am trying to create a range plot with bars similar to the one found in this post: http://www.stata.com/statalist/archi.../msg00294.html

However, the twoway rbar command appears to sort the x variable before the graph is plotted. This prevents me from sorting my data by a different variable (in the linked example, the variable sens).

Is there a way to prevent twoway rbar from sorting the data, or is this a bug? (See the sketch below.)
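A minimal sketch of one common workaround (lo and hi are hypothetical range variables; sens is the sort variable from the linked example): encode the desired order into an integer plotting position, so rbar's internal sorting becomes harmless:

Code:
sort sens                      // the variable you actually want to order by
gen xpos = _n
twoway rbar lo hi xpos         // bars now appear in sens order
* add xlabel() with value labels if named categories are needed on the axis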

Thank you.
