Channel: Statalist

Exporting tables produced with -tabcount-

Dear all,

I have been using -tabcount- to produce frequency tables, mainly because I need the zero frequencies to be included. Unfortunately, I cannot see any way to export these tabcount tables to Excel. Would anyone know how I can do this?
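For what it's worth, a sketch of one workaround: -contract- has a zero option that keeps zero-frequency cells, and the result can then be written out with -export excel-. The variable names row and col and the file name are assumptions:

```stata
* rebuild the table in long form, keeping zero-frequency cells
preserve
contract row col, zero freq(freq)
export excel using "freqtable.xlsx", firstrow(variables) replace
restore
```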

Many thanks in advance,

Aurelie.

conducting a group mean

Dear statalisters,

I am aware that the solution to my question might be really easy, but I have been searching the web for over a day now and I still can't find the answer, so I really hope that you can help me out. I have always used SPSS and recently switched to Stata.

I ran my factor analyses on a "safe sex behavior" questionnaire; the next step will be to run an ANOVA to compare several conditions. My scale is built from several items. How do I create a new variable with the mean of all the items, so that I can use that variable in my ANOVA?
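A sketch of the usual approach, with hypothetical item names: -egen, rowmean()- averages across the listed variables within each observation, and ignores missing items rather than returning missing.

```stata
* mean of the scale items for each respondent (item names assumed)
egen scale_mean = rowmean(item1 item2 item3 item4 item5)
```

The resulting scale_mean can then be used directly as the outcome in -anova-.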

I really hope that somebody can provide me with the answer! Thanks very much!

Kind regards Corine

count of experiences using a moving 5yr window

Hello!
I am working with data on M&A.
I am trying to calculate the number of times a certain type of acquisition experience (coded as a dummy) occurs but only when it occurs within 5 years prior to the focal deal.
For example:
firm_id   laggains   announcedyear   count_sg
      1          0            1989          0
      1          1            1994          1
      1          1            1998          2
      1          1            2000          2
      1          0            2001          2
      1          0            2003          2
      1          1            2004          2
      2          1            1999          1
      2          0            2000          1
      2          0            2001          1
      2          0            2002          1
      2          1            2003          2
      2          1            2005          2
where laggains is the type of experience (dummy variable), announcedyear is the year in which the experience occurred, and count_sg is the variable I would like to create (count_sg = the total number of times laggains occurred within the 5 years prior to announcedyear).
(FYI, laggains is a lagged variable, so it is actually missing for the very first observation of each firm_id.)

Based on Stata Tip 51: http://www.stata-journal.com/sjpdf.h...iclenum=pr0033

I tried the following code in Stata/SE 12.1:

Code:
gen count_sg = .
quietly forval i = 1/`=_N' {
    count if inrange(lagsmallg5d_h, 1, .) ///
        & inrange(announcedyear[`i'] - announcedyear, 1, 5)
    replace count_sg = r(N) in `i'
}

but I get invalid syntax, r(198).

Any suggestions or ideas as to what is wrong, or how I can generate the count_sg variable?
Thanks in advance!
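For what it's worth, the same count can be had in one line with the community-contributed -rangestat- (ssc install rangestat), which also keeps the window within firm — something the loop above does not do. A sketch using the variable names from the example:

```stata
* sum of laggains in a moving window around each deal year, within firm.
* interval(-5 0) includes the focal year, which is what reproduces the
* worked example above; use interval(announcedyear -5 -1) for strictly
* prior years.
rangestat (sum) count_sg = laggains, interval(announcedyear -5 0) by(firm_id)
```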

Detection and removal of same year announcements

Dear statalisters,

I have searched the internet and forums extensively, but can't find the answer I'm looking for. Hopefully you can help me out.

My objective:
Exclude firms that announce (share repurchases) more than once in a year, in order to avoid an overlapping problem.

Example of my dataset:
Date        CUSIP_8    Company
02oct2009   02888410   American Physicians Capital
23jun2009   02888410   American Physicians Capital
11dec2008   02888410   American Physicians Capital
04dec2008   02888410   American Physicians Capital
16aug2007   02888410   American Physicians Capital
22may2007   02888410   American Physicians Capital
11sep2003   02888410   American Physicians Capital

I would like to detect and remove announcements that occur within 250 days (roughly the 252 trading days in a year) of another announcement by the same firm. In this case, only 11sep2003 would remain.

How would I do this with Stata code?
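A minimal sketch of one approach, assuming the date variable is a Stata daily date named date and the firm identifier is cusip_8 (both names assumed): drop any announcement whose nearest neighbour within the same firm, in either direction, is less than 250 days away. On the example data this leaves only 11sep2003.

```stata
* gap to the previous and next announcement by the same firm
bysort cusip_8 (date): gen gap_prev = date - date[_n-1]
bysort cusip_8 (date): gen gap_next = date[_n+1] - date

* drop announcements with another announcement within 250 days on either side
drop if (gap_prev < 250 & !missing(gap_prev)) | ///
        (gap_next < 250 & !missing(gap_next))
drop gap_prev gap_next
```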

Thank you in advance for your help!

Michiel

stmixed does not work

Dear All,
I'm trying to fit a flexible parametric model using -stmixed- (ssc install stmixed, replace) on Windows 10 with Stata 14. The code is the following:

Code:
stmixed BP DM Gender || Country:, df(3) d(fpm)

I get the following error:

Code:
BLUP calculation failed in adaptive quadrature algorithm
Try increasing gh()
r(1986);

I have tried different values of gh(#), but the error is always the same. The data are, of course, stset.
Can someone help me?
Thank you in advance

Commands for rank testing

I am drawing random observations from various ordered populations and for each draw I am recording the rank of each observation (1 through n) with respect to the total population size for that draw (n). That gives me three columns:

draw   rank    n
   1      4   16
   2      2    7
   3      8    9
   .      .    .


I am trying to find a procedure that tests whether these observations were really drawn at random, or whether they were systematically higher or lower in the rank order. Any suggestions?
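One sketch, hedged because ranks are discrete so this is only approximate: under the null of random draws, u = (rank − 0.5)/n is approximately uniform on (0,1), and a one-sample Kolmogorov–Smirnov test against the uniform CDF (which is just u itself) checks for systematic departures.

```stata
* under random sampling, the normalised rank is approximately uniform(0,1)
gen double u = (rank - 0.5)/n
ksmirnov u = u    // one-sample KS test against the uniform(0,1) CDF
```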

Large database: op. sys. refuses to provide memory

Hello, my database is 19GB. I'm using Stata 12 SE on a Windows 7 64-bit machine with 3GB of RAM. I'm trying to load this large database in Stata and the program shows this:

op. sys. refuses to provide memory
Stata's data-storage memory manager has already allocated 4g bytes and it just attempted to allocate another 32m bytes. The operating system said no.
Perhaps you are running another memory-consuming task and the command will work later when the task completes. Perhaps you are on a multiuser system that is
especially busy and the command will work later when activity quiets down. Perhaps a system administrator has put a limit on what you can allocate; see help
memory. Or perhaps that's all the memory your computer can allocate to Stata.
r(909);

and this is the output of -query memory-:


-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Memory settings
set maxvar 2048 (not settable in this version of Stata)
set matsize 400 10-800; max. # vars in models
set niceness 5 0-10
set min_memory 0 0-1600g
set max_memory . 32m-1600g or .
set segmentsize 32m 1m-32g

What can I do? I need the entire database.

Thank you.
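A 19GB .dta file cannot be held in 3GB of RAM, so the usual workaround is to read only what is needed: -use- accepts a variable list and if/in qualifiers that are applied before loading. The variable names below are assumptions.

```stata
* load only the variables you actually need
use id year income using mydata.dta, clear

* or process the file in pieces, e.g. one year at a time
use if year == 2010 using mydata.dta, clear
```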

Interpreting time ratios in AFT models (streg)

Hello,
I read a post from Roberto Gutierrez on the interpretation of time ratios in accelerated failure time (AFT) models which really confused me.
http://www.stata.com/statalist/archi.../msg00698.html

Here, Roberto argues that, in the case of a dummy variable, a time ratio of 0.88 means that the treated group dies at a 12% slower rate.
From my understanding, time ratios (the tr option in streg) are exponentiated coefficients; thus the coefficient is -0.13 from ln(0.88). According to other examples, this means the treated group dies at a 14% faster rate due to exp(-0.88)=0.14, as explained here for example:
http://data.princeton.edu/pop509/recid1.html

Or am I totally wrong?

Thanks,
Sven

econometrically correct MODEL???

Hello statalisters,

I would like to ask whether what I have done with this model makes sense. It's an OLS model, where une_rt_a = unemployment rate and lagune_rt_a = lagged unemployment rate.

Code:
reg $ylist lagune_rt_a ubdur replacementrate uegen unionden emp_protn lmp_exp tax_wedge i.year i.id, robust

Linear regression                                      Number of obs =     106
                                                       F( 28,    77) =   46.43
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.9408
                                                       Root MSE      =  .87863

---------------------------------------------------------------------------------
                |               Robust
       une_rt_a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
    lagune_rt_a |   .3605935   .1465451     2.46   0.016      .068785     .652402
          ubdur |   .0123056   .0086331     1.43   0.158    -.0048851    .0294963
replacementrate |   1.598295   7.443358     0.21   0.831    -13.22332    16.41991
          uegen |  -1.452665   .5516586    -2.63   0.010    -2.551158   -.3541728
       unionden |  -.1067704   .1543825    -0.69   0.491    -.4141853    .2006445
      emp_protn |  -2.450153   1.034245    -2.37   0.020    -4.509598   -.3907086
        lmp_exp |   3.629828   .5530729     6.56   0.000     2.528519    4.731137
      tax_wedge |  -.1181645   .1406742    -0.84   0.404    -.3982826    .1619536
                |
           year |
          2001  |   .7782077   .5621745     1.38   0.170    -.3412248     1.89764
          2002  |   1.297452   .6146678     2.11   0.038     .0734924    2.521412
          2003  |   1.283098   .6103078     2.10   0.039     .0678197    2.498376
          2004  |   1.300198   .6100104     2.13   0.036     .0855122    2.514884
          2005  |   1.432018    .599333     2.39   0.019     .2385939    2.625443
          2006  |   1.706212   .5766161     2.96   0.004     .5580227    2.854401
          2007  |   1.899565    .630555     3.01   0.004     .6439694     3.15516
          2008  |   1.786158   .6472408     2.76   0.007     .4973368    3.074979
          2009  |   1.952421   .7482888     2.61   0.011     .4623878    3.442454
          2010  |   1.872816   .6351732     2.95   0.004     .6080245    3.137607
          2011  |   2.095627   .7059671     2.97   0.004     .6898666    3.501387
                |
             id |
             2  |   2.694235   5.858096     0.46   0.647    -8.970724    14.35919
             3  |    5.66941   9.800374     0.58   0.565    -13.84563    25.18445
             4  |   6.679606   8.865948     0.75   0.454    -10.97475    24.33396
             5  |   9.465913   7.320384     1.29   0.200    -5.110833    24.04266
             6  |   4.296414   7.863431     0.55   0.586    -11.36168    19.95451
             7  |   4.413199   6.180286     0.71   0.477    -7.893323    16.71972
             8  |   11.84305   7.051535     1.68   0.097    -2.198347    25.88445
             9  |   7.300208   7.688401     0.95   0.345    -8.009354    22.60977
            10  |   5.844592   8.993982     0.65   0.518    -12.06471     23.7539
                |
          _cons |    15.7638   7.909678     1.99   0.050     .0136175    31.51398
---------------------------------------------------------------------------------

. testparm i.year

 ( 1)  2001.year = 0
 ( 2)  2002.year = 0
 ( 3)  2003.year = 0
 ( 4)  2004.year = 0
 ( 5)  2005.year = 0
 ( 6)  2006.year = 0
 ( 7)  2007.year = 0
 ( 8)  2008.year = 0
 ( 9)  2009.year = 0
 (10)  2010.year = 0
 (11)  2011.year = 0

       F( 11,    77) =    1.21
            Prob > F =    0.2926

. testparm i.id

 ( 1)  2.id = 0
 ( 2)  3.id = 0
 ( 3)  4.id = 0
 ( 4)  5.id = 0
 ( 5)  6.id = 0
 ( 6)  7.id = 0
 ( 7)  8.id = 0
 ( 8)  9.id = 0
 ( 9)  10.id = 0

       F(  9,    77) =    5.37
            Prob > F =    0.0000
Past studies I have looked at include Wald tests for both country and time effects in their tables, so would this be the best way to do it? I understand that the R-squared is really high; is there a way I can get the adjusted R-squared instead?
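On the last point, a one-line sketch: -regress- stores the adjusted R-squared in e(r2_a), even when vce(robust) is used, so it can be displayed without refitting the model.

```stata
* after the regression above, the adjusted R-squared is already stored
display e(r2_a)
```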

thank you for your help,

Lamie

Help with error: interactions not allowed

I am using the xtlsdvc command (bias corrected least squared dummy variable).

When I run the code, I get an error that interactions are not allowed. I would have manually generated new variables for the interaction terms, but I need to plot -marginsplot- graphs, which require Stata's factor-variable notation.

Any advice on how I can avoid this problem? Does xtlsdvc not allow interactions?

Code:
xi: xtlsdvc wage i.male##income, initial(ab) bias(2) vcov(100)
interactions not allowed
r(101);
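One hedged workaround: -xtlsdvc- predates factor-variable support (which is also why the xi: prefix cannot rescue the ## notation), so the interaction can be created by hand. Variable names are taken from the post; note that -margins- and -marginsplot- will not recognise a hand-made interaction, and may not support -xtlsdvc- in any case.

```stata
* hand-made interaction in place of i.male##c.income
gen male_income = male * income
xtlsdvc wage male income male_income, initial(ab) bias(2) vcov(100)
```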
Thanks,
Josh

Simple Stata question...

I have looked at archive upon archive regarding this, tried about 40 different things, and I still can't get it. I'm very new to Stata.

All I want to do is generate a new variable that contains the max from the 52 other variables.

I thought it might be this:
egen var1=rowmax(varlist)

but that's not working...

Can someone please help? Also, while I'm here: is there a way to avoid listing out all the variables and instead apply the command to all of them?
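A sketch of two options (the variable names v1-v52 are assumptions; run one or the other). The reason the original attempt failed is that rowmax() needs an actual variable list, not the literal word "varlist". A hyphenated range avoids typing every name, and -ds- can build the list programmatically.

```stata
* option 1: a variable range, if the 52 variables are stored consecutively
egen maxval = rowmax(v1-v52)

* option 2: collect all numeric variables automatically via ds
ds, has(type numeric)
egen maxall = rowmax(`r(varlist)')
```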

Quantile regressions - coefficients near zero at median

Hi all,

I'm running quantile regressions for the first time and getting some funny-looking results - I'm just hoping I haven't missed anything obvious like an option I should be specifying or something like that.

My commands are of the form:

Code:
sqreg ch_consumption dummy#c.(continuous variables) i.(dummy#(factor variables)), reps(100) q(.01 .05 .1 .15 .2 .25 .3 .35 .4 .45 .5 .55 .6 .65 .7 .75 .8 .85 .9 .95 .99)
I'm using household-level data: my dependent variable is the annual change in consumption, my continuous variables include things like income and house value (also in changes), the dummy splits households into 'types', and the factor variables are things like employment status.

The puzzle is that when I graph the estimated coefficients, I always get a U-shaped distribution with coefficients close to (but not exactly) zero at the median of the dependent variable. I've never seen anything like that in the papers I've read on quantile regressions - if anything, the coefficients usually seem to look like they either increase or decrease over the distribution (see for example http://www.econ.uiuc.edu/~roger/research/intro/rq.pdf, pg 8). I've had a bit of a look around to see whether it's the sort of thing others have asked about in the past but have had no joy. I guess it might well be right, it's just a bit odd.

I'm using Stata 13.0 SE in Windows.

Any thoughts would be much appreciated. Thanks!

Comparing coefficients over pre, during and after sub-samples

Hi, can I check what the most appropriate way is to compare coefficient estimates (for interpretation) across sub-samples of pre-, during-, and post-GFC periods? The same model, e.g. "reg y x", is run on the 3 sub-samples. I think I cannot just interpret the coefficient estimates by looking at how they change from pre- to post-GFC, because they come from different samples.

For example, the coefficient on the X variable is 2.42, 5.78, and 9.28 respectively. I cannot conclude that the effect of X increases from pre- to post-GFC. Do I have to do a difference-in-coefficients test? Any advice is appreciated.
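One common sketch (assuming a numeric variable period coded 0 = pre, 1 = during, 2 = post — an assumption): pool the sub-samples, interact the regressor with the period, and Wald-test the interaction terms. That is a formal difference-in-coefficients test.

```stata
* pooled model: the interaction terms measure how the effect of x
* differs between the pre period and the other two periods
reg y c.x##i.period

* Wald test that the effect of x is equal across the three periods
testparm i.period#c.x
```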

Three stage heckprob model or triple hurdle model

Hey guys,

I have a dataset which is biased due to two selection criteria (hurdles).
I would like to know whether it is possible to fit a heckprob model with two selection criteria, or a triple hurdle model with three hurdles.
Furthermore, I would like to know whether it makes sense to have a binary variable as the dependent variable in a triple hurdle model.
I've only found the following articles dealing with this topic:
http://fsg.afre.msu.edu/zambia/IAPRI...h2013.pptx.pdf
http://www.stata.com/statalist/archi.../msg00747.html

Any help would be highly appreciated

xtmelogit and cluster

Hello,
I work on health data.
My model is a logit model with a hospital random effect and clustering on patients' area of residence.
I then use the following command:

Code:
xtmelogit y x1 x2 ... || hospital:, covariance(unstructured) cluster(residence)

Stata replies that "option cluster() not allowed".
I do not know how to specify that.
Thanks for your feedback.

calculating hit rate after logistic regression

hi,

I'd like to know how to calculate the hit rate after a logistic regression. My data are svyset.

Also, in terms of checking the predictive power of a logistic model, is there any difference between k-fold cross-validation and the hit rate?
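A sketch, with assumed variable names y, x1, x2 and an assumed 0.5 cutoff: after a plain -logit-, -estat classification- reports the percent correctly classified, which is the hit rate. It may not be available after svy estimation, in which case the predicted probabilities can be classified by hand.

```stata
logit y x1 x2
estat classification          // "Correctly classified" is the hit rate

* a manual version of the same calculation
predict phat, pr
gen byte hit = (phat >= .5) == y
summarize hit                 // the mean of hit is the hit rate
```

Unlike the in-sample hit rate, k-fold cross-validation classifies observations the model was not fitted on, so it gives a less optimistic estimate of predictive power.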

Thanks

Omitted residual variable

Please help me. I ran a long-run levels model and obtained the residuals using the "predict r, resid" command. Then I ran an ECM including the lagged residual. When I run the ECM, the residual variable gets omitted. How can I overcome this problem, or what am I doing wrong?

Timeline graph for presentation

Hello,
I just wasted a day trying to get Excel to make me a timeline, and I have failed.
I have four registries, and for every registry there are 3-4 dates of interest.
So I have the following:

Reg_var start_date date1 date2 end_of_study

Is it possible to create timelines using Stata that indicate the start date, the dates of interest, and end_of_study?

I want them to be horizontal lines, with a point indicating each date of interest. It would be lovely if I could get a different colour on the connecting lines from one point to the next.

Does this make sense?

I hope I can crowdsource myself a Stata do-file here.
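A minimal sketch with -twoway rspike- and -scatter-, using the variable names from the post (regnum is an assumed numeric encoding of Reg_var, and all dates are assumed to be Stata daily dates). Colouring each segment between consecutive dates differently would take one rspike layer per segment instead of the single spanning spike here.

```stata
* one horizontal line per registry, with markers at the dates of interest
encode Reg_var, gen(regnum)
twoway (rspike start_date end_of_study regnum, horizontal) ///
       (scatter regnum start_date)                         ///
       (scatter regnum date1)                              ///
       (scatter regnum date2)                              ///
       (scatter regnum end_of_study),                      ///
       ylabel(1/4, valuelabel angle(0)) ytitle("") xtitle("Date")
```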

Lars

Difference-in-differences with Compustat data

This is a pretty basic question: I want to establish whether a marketing campaign undertaken by a company was successful. I have the cash flows for that company (company A) before and after the marketing campaign, and I also have cash flows for the company's major rivals (company B and company C) over the same period. I therefore want to estimate a difference-in-differences model, using the rivals as the control since they did not carry out the campaign. How do I code that in Stata? I am really struggling.
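A minimal sketch, assuming the data are stacked with one row per company-period and contain cashflow, a numeric company id companyid, treat (1 for company A, 0 for B and C), and post (1 for periods after the campaign) — all names are assumptions:

```stata
* the coefficient on 1.treat#1.post is the difference-in-differences estimate
reg cashflow i.treat##i.post, vce(cluster companyid)
```

One caveat worth flagging: with only three firms, clustered standard errors are fragile, so the point estimate is more trustworthy than its inference here.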

Create a Calendar for each ID

I'm trying to create a calendar for each ID, for which I only have dates of interest. For instance, my data look something like this:
ID start stop var1
1 1-Jan-2014 1-Sep-2014 L1
1 1-Jan-2015 31-Aug-2015 L5
2 20-Jun-2014 31-Aug-2015 L1
and I'm trying to expand it into:
ID calday var1
1 1-Jan-2014 L1
1 2-Jan-2014 L1
1 3-Jan-2014 L1
1 1-Sep-2014 L2
2 20-Jun-2014 L1
2 21-Jun-2014 L1
2 31-Aug-2015 L1


I tried the user-written -tsmktim- and the built-in command -expand-, followed by various merges. Though it illustrates my line of thought, none of this works:

Code:
*Generic calendar 01jan2012 to 31dec2015
clear all
set obs 1461
tsmktim dte_atrisk, start(01jan2012)
save My_Cal.dta, replace

*Expand each ID to cover span
use mydata, clear
expand td(31aug2015) - td(01jan2012) + 1
save ID_Expanded.dta, replace

*Merge 1


*Merge 2
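For what it's worth, a sketch that avoids the merges entirely: -expand- accepts a per-observation expression, so each interval can be blown up to one row per day directly. It assumes start and stop are already Stata daily dates.

```stata
use mydata, clear
expand stop - start + 1                        // one copy per day in the span
bysort ID start: gen calday = start + _n - 1   // running calendar day
format calday %td
keep ID calday var1
sort ID calday
```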