Channel: Statalist

Conditional variable generation

Hello,

I'm struggling to generate a variable conditional on the value of another variable.
I created a dummy variable that takes the value 1 when the day of the month is 1 to 15 and 0 otherwise. I would like a new variable to take the lagged value of umich (already created) if the day of the month is between 1 and 15 (dummy == 1), and the current value of umich when the day of the month is greater than 15 (dummy == 0).
I experimented with egen and xi, but can't figure it out by myself.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str8 ticker double(date umich) float(lag_umich day first_half)
"AAPL" 13165 105.8 102.4 17 0
"AAPL" 13257 107.8 105.4 18 0
"AAPL" 13348 107.5 105.4 18 0
"AAPL" 13439 106.6   102 17 0
"AAPL" 13530 106.8 104.9 16 0
"AAPL" 13621 115.2 109.8 17 0
"AAPL" 13712   114 113.2 17 0
"AAPL" 13803 109.8 114.1 16 0
"AAPL" 13894 113.5 111.4 15 1
"AAPL" 13985 115.5 113.7 16 0
"AAPL" 14076 113.3 115.4 16 0
"AAPL" 14166 112.8 111.7 14 1
end
format %td date
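If I understand the setup, cond() handles this in one line; a minimal sketch using the variable names from the data example above (first_half is the dummy described, wanted is a name I'm introducing):

```stata
* take the lagged value in the first half of the month, the current value otherwise
gen wanted = cond(first_half == 1, lag_umich, umich)
```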
Thank you for your time!

Frank

dummy variable trap

Hello
Everyone,

I seek your guidance on how to write code for a panel data model that includes 4 dummy variables.

I wrote code for a panel data model exploring the determinants of inward foreign direct investment in China, considering investing/home-country factors from 2003 to 2016. The number of countries is 141.

In my baseline model, I only have the financial crisis as a dummy variable.

For my robust tests, I plan to add economic associations as control variables, namely APEC, G_20, ASEAN. Thus, in total, I will have 4 dummy variables in model 2.

Could you please give me some suggestions on my code? My professor said that if we have 3 or more dummies in a model, we cannot include a constant, or we will face the dummy variable trap. I am now facing this issue, as Stata's output shows. But I am just a beginner with Stata, and I doubt whether I wrote the code correctly.

***Difference and System GMM - using xtabond2
*_______________________________________________________________________________

xtabond2 lifdic l.lifdic lgdp lgdppc ltradepc exchrate inflation ldistance tertiary internet crisis APEC G_20 ASEAN, gmm(l.lifdic, lag(2 2)) iv(lgdp lgdppc ltradepc exchrate inflation ldistance tertiary internet) robust nolevel twostep
eststo dif1

xtabond2 lifdic l.lifdic lgdp lgdppc ltradepc exchrate inflation ldistance tertiary internet crisis APEC G_20 ASEAN, gmm(l.lifdic, lag(2 2)) iv(lgdp lgdppc ltradepc exchrate inflation ldistance tertiary internet) robust twostep
eststo sys1
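As a quick diagnostic before estimation, it may help to check whether the four dummies together are collinear with the constant; a hedged sketch, using the variable names from the commands above (sumdummies is a name I'm introducing):

```stata
* if crisis + APEC + G_20 + ASEAN is constant across observations,
* the dummies span the intercept and one of them must be dropped
gen sumdummies = crisis + APEC + G_20 + ASEAN
summarize sumdummies
* _rmcoll reports which regressors Stata would omit as collinear
_rmcoll crisis APEC G_20 ASEAN
```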

Thanks very much for your time and help. May you always be blessed.



Export to Excel: Complicated

How would I export to Excel the results of the following code?
forvalues i = 1/100 {
    table HP`i'_quintile, c(mean KG`i')
}
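One hedged way to get these means into Excel is to collapse each table to a small dataset and write it to its own sheet; the file and sheet names below are my own choices:

```stata
forvalues i = 1/100 {
    preserve
    * mean of KG`i' within each quintile of HP`i'
    collapse (mean) KG`i', by(HP`i'_quintile)
    export excel using "quintile_means.xlsx", ///
        sheet("HP`i'") sheetreplace firstrow(variables)
    restore
}
```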

Thanks

Coefficient interpretation difference log - difference log in fixed effects model

Dear Statalist,

I have a question with regards to the coefficient interpretation of my model.

I am doing a fixed effects model on the effect of house price changes on consumption.

My house price index variable, hpilagthree, looks as follows:

Code:
gen loghpi = ln(housepriceindex201206100)
by facilityid, sort: gen hpi=loghpi[_n]-loghpi[_n-1]
by facilityid: gen hpilagthree = hpi[_n-3]
My consumption variable, growthcons1, looks as follows:
Code:
gen logcons1 = ln(cons1)
by facilityid, sort: generate growthcons1=logcons1[_n]-logcons1[_n-1]
To analyze the effect, I am running the following regression:
Code:
xtreg growthcons1 hpilagthree lti savingsrate incomerate age unemp lendingrate, fe vce(cluster facilityid)
I receive the following output:
[regression output shown as an attachment in the original post]

My question is how to interpret the 7.54. I know that with a log-log model the coefficient is an elasticity: a one percent increase in x results in a beta percent increase in y. However, my model isn't log-log but delta log - delta log.

My guess is that I should divide the coefficient by 100, as the 'growth rates' (delta log variables) are not in percentages (so not 5%) but in units (so 0.05). Then the interpretation would be a one percent increase in the house price index three months previously increases current consumption by 0.075 percent. I am not sure whether this is correct.
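For what it's worth, the unit arithmetic in a change-in-log on change-in-log regression can be checked directly; this is a general fact about such regressions, not specific to this output. Both sides are already in decimal growth units, so the coefficient itself needs no rescaling, only the change plugged into it:

```stata
* a 1 percentage point (0.01) increase in lagged house price growth
* is associated with a 7.54 * 0.01 = .0754 change in consumption growth,
* i.e. about 7.54 percentage points of growth
display 7.54 * 0.01
```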

Could anyone advise me whether this is the best way to specify the model and whether my interpretation is correct? Would it perhaps be better if I simply estimate a log log model, so
Code:
xtreg logcons1 L3.loghpi lti savingsrate incomerate age unemp lendingrate, fe vce(cluster facilityid)
In that way the coefficient for loghpi would simply be the elasticity, but I am not sure whether I would still be capturing house price growth in that case, because my research centers on the effect of house price growth.


Thanks a lot and kind regards,

Lisanne Spiegelaar

Mean of Means by Mutliple Groups

Hi all,

I'm trying to calculate the mean for subgroups that additionally fulfill a certain condition. Specifically, I'd like to calculate the mean market cap by month across all firms that are IPO firms (i.e., IPO dummy == 1). What I tried is this:

Code:
bysort Month IPOdummy1Y: egen MeanMC1Y = mean(realmarketcap)
This somehow changes the values stored in IPOdummy1Y. Can someone kindly explain why this happens and how I can avoid it?
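For the record, egen's mean() should not alter IPOdummy1Y itself; but to restrict the mean to IPO firms while still producing one value per month, a common idiom is to feed cond() into egen. A sketch using the names from the command above:

```stata
* monthly mean market cap over IPO firms only; non-IPO observations
* contribute missing values, which mean() ignores
bysort Month: egen MeanMC1Y = mean(cond(IPOdummy1Y == 1, realmarketcap, .))
```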

Many thanks for your help.

Best,
Peter

Why I am getting "no observations"

I am using Scott Long's user-written command mtable, from his SPost13 package.

I have generated a variable (HispanicUnhealthyLifestyle) and am trying to calculate margins for it against my outcome variable (clinicaldepression), but Stata says there are no observations.

gen HispanicUnhealthyLifestyle = race == 2 & numcigs_r == 30 & exercise == 0

sum clinicaldepression if HispanicUnhealthyLifestyle

Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
clinicalde~n | 0



qui mtable if HispanicUnhealthyLifestyle, rowname(HispanicUnhealthyLifestyle)
> ///
> atmeans ci below
no observations
r(2000);


I am providing a data example below.



dataex HispanicUnhealthyLifestyle

----------------------- copy starting from the next line -----------------------
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float HispanicUnhealthyLifestyle
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
end
------------------ copy up to and including the previous line ------------------

Listed 100 out of 420 observations
Use the count() option to list more

My question is the following: how do I locate the values at which there would be observations for my variable HispanicUnhealthyLifestyle?
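One hedged way to locate where the intersection fails is to count each condition separately and then tabulate the remaining one within the others (variable names taken from the post):

```stata
* how many observations satisfy each condition on its own?
count if race == 2
count if numcigs_r == 30
count if exercise == 0
* which smoking values actually occur among the race==2 & exercise==0 group?
tabulate numcigs_r if race == 2 & exercise == 0, missing
```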

"stteffects":Survival analysis when treatment information is available for some periods after entry

Hello,
I have data for a cohort of firms that started operations in 2004 and were surveyed until 2011. However, information about a treatment was gathered only from 2007 to 2011, so there is no information about the treatment for the years before 2007.

I want to evaluate the effect of the treatment on the survival time of these firms by implementing the code
Code:
stteffects
I have the following question:

Should I restrict the survival analysis to the period 2007 to 2011, recoding the age of firms to 1 in 2007, 2 in 2008, ..., and 5 for firms that survived until 2011?

Are there any other ways to evaluate the effect of the treatment on survival time without restricting the sample in
Code:
stteffects
?

I appreciate your help.

Best regards,
Hossein

No output for margins dydx

Hi,

I am trying to estimate the marginal effects for the interaction of 1) a categorical variable recording the frequency of mobile phone use, with 5 levels (0-4), and 2) a continuous variable, the proportion of respondents in a primary sampling unit who voted.

My model is coded as: melogit yvar i.phoneusefreq##c.votedproportion other controls
margins, dydx(*) atmeans post

When I hand-code the interaction term, I am able to obtain the marginal effects. I know this is not the correct approach and that one should use Stata's ## notation for interactions, but with that notation my output has no estimates for the marginal effects. Is there a way to overcome this?
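With factor-variable notation, one hedged alternative is to ask margins for the effect of the continuous variable at each level of the categorical one, rather than dydx(*) of everything at once; othercontrols below stands in for the control list from the post:

```stata
melogit yvar i.phoneusefreq##c.votedproportion othercontrols
* marginal effect of the voted proportion at each phone-use level
margins phoneusefreq, dydx(votedproportion) atmeans
```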

Thanks.

For Loop Syntax Error - Subsetting Data

Hello,

I have two datasets. I would like to split the main dataset (MainData) into around 40 datasets based on criteria stored in a separate dataset. Both datasets contain longitudes and latitudes, and the criteria dataset specifies the coordinate range for each group, so it has around 40 observations. This is the main dataset.

Code:
Name     Census2001_Lat   Census2001_Lon
"ABC"    12.747113        79.847343
"DEF"    12.874169        79.653198
"GHI"    12.87979         79.675159
"JKL"    12.867902        79.66732
"MNO"    12.721048        79.753166
This is the criteria dataset.

Code:
Groups     maxlat              minlon              minlat              maxlon
"GroupA"   12.95               79.18333333333334   12.633333333333333  79.75
"GroupB"   12.816666666666666  79.11666666666666   12.483333333333333  79.36666666666666
"GroupC"   13.783333333333333  78.96666666666667   13.4                79.58333333333333
"GroupD"   13.516666666666668  78.8                13                  79.31666666666666
A variable id = _n has been defined in the criteria dataset. I am trying to run a loop that iterates over each of the 40 observations (i.e., criteria) in the criteria dataset and creates separate datasets from MainData based on the longitudes and latitudes given there. This is the code I used:

Code:
use Criteria
forval i = 1/`id' {
    preserve
    use if (inrange(Census2001_Lon,`minlon',`maxlon') & inrange(Census2001_Lat,`minlat',`maxlat')) using MainData
    save data_`id', replace
    restore
}
However, I get a syntax error with the loop: "invalid syntax". Could someone please explain what's wrong? Thanks!
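For what it's worth, a corrected sketch: `id' is a variable in the dataset, not a local macro, so the loop bound is undefined, and the bounds have to be copied into locals while the criteria dataset is still in memory (variable and file names assumed from the post):

```stata
use Criteria, clear
local N = _N
forvalues i = 1/`N' {
    * pull the bounds for criteria row `i' into locals before switching datasets
    local minlon = minlon[`i']
    local maxlon = maxlon[`i']
    local minlat = minlat[`i']
    local maxlat = maxlat[`i']
    use if inrange(Census2001_Lon, `minlon', `maxlon') & ///
        inrange(Census2001_Lat, `minlat', `maxlat') using MainData, clear
    save data_`i', replace
    use Criteria, clear
}
```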

Fixed Effects on a weakly balanced data-set of cross-border M&A transactions

Dear Statalist users,

I am a complete beginner using Stata, so please bear with me if my question is too obvious.

I am researching the impact of corruption and governance on cross-border M&A performance. My dataset includes 600+ cross-border M&A announcements between 2008 and 2017. This is a case of weakly balanced panel data, as a country and firm can appear once or several times during the 10-year period. In some cases, the same country and firm show up more than once in the same year. The data look something like this:
Year  ActionID  Country  Firm  CAR    X1  X2  X3
2008  1         France   a     -0.06
2008  2         France   b      0.02
2008  3         Croatia  c     -0.20
2009  4         England  d     -0.01
2009  5         France   a      0.04
2010  6         Belgium  f      0.01
2011  7         Belgium  g     -0.03
2011  8         France   a      0.02
2011  9         Belgium  f      0.04
2012  10        Croatia  c      0.08
2012  11        Croatia  j     -0.09
I am interested in the model: CAR = a + x1 +x2 +x3 + u

My question is: can I use a fixed or random effects model with this type of data? If so, which variable should I use as the cross-section?

I have tried the following:
1) xtset country year, yearly
(which gives error 451, indicating repeated time values within panel)

2) xtset ActionID year, yearly

this one gives me the following result:
panel variable: actionid (weakly balanced)
time variable: year, 2008 to 2017
delta: 1 year

However, when I run my regression, xtreg car x1 x2 x3 i.year, fe, all variables are omitted due to collinearity.
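With ActionID as the panel variable, each panel contains a single observation, so the within transformation sweeps out all the regressors, which would explain the omitted variables. One hedged alternative is country fixed effects via plain regression (countryid is a name I'm introducing):

```stata
* country fixed effects without xtset; each country may appear many times,
* including several times in the same year
encode country, generate(countryid)
regress car x1 x2 x3 i.countryid i.year, vce(cluster countryid)
```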

I'd really appreciate any help I can get.

Many thanks,
Henry

Estimating treatment effects on multiple outcomes

Hello all,

I hope someone can point me to resources (within or beyond the Statasphere) relevant to the following. I am interested in the effect of a treatment T on two outcomes, Y and Z. Thus,

Code:
Y =   aX + bT + cXT + u
Z =   dX + fT + gXT + v

T = 1(hX + kZ + e > 0), estimated by logit or probit.
I am not willing to assume that corr(e, u) = corr(e, v) = 0. The issue of whether corr(u, v) = 0 or not is a headache for another day at this point.

To make this concrete, let Y be income, Z be training opportunities taken and T be union membership. Let everything be continuous and normally distributed for simplicity.

I'm stuck on how to estimate the treatment effect for union membership here. Beyond the question of which command to use, my conceptual hangup is that a person only gets to choose one union status, but that choice is potentially correlated with both income and training opportunities given (non)-membership in a union. Because of that, modeling the two outcomes separately seems problematic.

I'm more comfortable (perhaps wrongly) with how I'd handle this if I could assume conditional independence. I think that I could apply any of the teffects commands to Y and Z separately.

Any suggestions would be very much appreciated. Thank you.

Glenn

Logit model

Hello everybody
I'm trying to estimate the loss given default of a borrower.
Since this variable takes values in [0,1], is it appropriate to use logistic regression?
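Plain logistic regression expects a binary 0/1 outcome; for a dependent variable that is a fraction in [0,1], a fractional logit is one common alternative. A sketch under that assumption (lgd and the regressors are placeholder names):

```stata
* fractional response model (fracreg, Stata 14+); accommodates values
* strictly inside [0,1] as well as the boundary values 0 and 1
fracreg logit lgd x1 x2 x3, vce(robust)
```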
Thank you in advance.

Option osample not working in teffects dialog box for nnmatch

This thread is a follow-up to the problem described in this post.

I am on Stata/IC 14.2 for Windows (64-bit x86-64) with Revision 29 Jan 2018.
The option "osample" in the tab "Advanced" is not processed by the dialog box for the command teffects when selecting/using "Nearest neighbour matching" as the estimator.
According to the help file of teffects, the option "osample" is supported by this estimator. In fact, calling teffects nnmatch something, osample(varname) does not give an error on the command line.
The lines below are taken from the official teffects.dlg and show where the call to set the option is forgotten. The problem is in the program adv_output which is in lines 1155 to 1611. I omitted the comments in the code and try to hilite the problem.

If you look at the code, you see that all other estimators have a line optionarg adv.ed_osample, which is missing for nnmatch.

Code:
PROGRAM adv_output
BEGIN
    if model.cb_est.iseq("ra") | model.cb_est.iseq("ipw") | ///
       model.cb_est.iseq("ipwra") | model.cb_est.iseq("aipw") {
           optionarg /hidedefault adv.ed_tol
           optionarg adv.ed_ctrl
           optionarg adv.ed_lvl
           optionarg adv.ed_osample
    }
    if model.cb_est.iseq("nnmatch") {

        optionarg adv.ed_cal
        optionarg /hidedefault adv.ed_dtol
        optionarg /hidedefault adv.ed_pstol
        optionarg adv.ed_stub
        optionarg adv.ed_ctrl_ps
        optionarg adv.ed_lvl_ps
        // MISSING HERE: optionarg adv.ed_osample
        if adv.ck_metric {
            put "metric("
            if adv.rb_mah {
                put "mahalanobis"
            }
            if adv.rb_iva {
                put "ivariance"
            }
            if adv.rb_euc {
                put "euclidean"
            }
            if adv.rb_user {
                put "matrix "
                require adv.cb_user
                put adv.cb_user
            }
            put ")"
        }
    }

    if model.cb_est.iseq("psmatch") {
        optionarg adv.ed_cal
        optionarg /hidedefault adv.ed_pstol
        optionarg adv.ed_stub
        optionarg adv.ed_ctrl_ps
        optionarg adv.ed_lvl_ps
        optionarg adv.ed_osample
    }
END
To reproduce just use any dataset and try to use the "osample" option for teffects nnmatch in the dialog box. The command will be sent but without the said option.
If needed, I could provide a walk-through of the dialog box code from where the edit field of the option is defined until the actual command gets issued.

PS: Forgive the formatting; I did not see any guidelines on how to highlight things in a post.

Using IV for count data with diff-in-diff specification

I want to estimate a count data model (the dependent variable is a patent count) with a continuous diff-in-diff specification ("natural experiment"). I have a small sample (N=16, T=30). The main regression is:

Code:
gen treatment_post=treatment*post
xtpoisson patents treatment_post i.year, fe vce(robust)
In one of the robustness checks, I would like to instrument my treatment variable. My IV (like my treatment) is time-invariant. Currently, I'm using the following command:

Code:
gen iv_post=iv*post
ivpoisson gmm patents (treatment_post=iv_post) i.year i.group, vce(robust)
However, I saw in previous posts that this estimator might not be consistent (due to the incidental parameters problem). I would be happy to get suggestions about the right direction. (In an old post, Jeff Wooldridge suggested adding the fixed-effects residuals obtained in the first stage to the FE Poisson estimation in the second stage. I am not sure whether that works when the instrument is time-invariant, or how exactly to implement the method, e.g., how to calculate the standard errors.)
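As a hedged sketch of the control-function idea quoted above (a sketch, not a definitive implementation; the standard errors would need bootstrapping over both stages):

```stata
xtset group year
* first stage: FE regression of the endogenous interaction on the instrument
xtreg treatment_post iv_post i.year, fe
* e_it residuals from the first stage
predict double vhat, e
* second stage: FE Poisson with the first-stage residual added as a control
xtpoisson patents treatment_post vhat i.year, fe vce(robust)
```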


Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte group int year float post double(patents treatment) float iv
2 1971 1  3 .19008264462809918 14.162823
2 1964 0  1 .19008264462809918 14.162823
2 1962 0  1 .19008264462809918 14.162823
2 1951 0  0 .19008264462809918 14.162823
2 1955 0  0 .19008264462809918 14.162823
2 1970 1  3 .19008264462809918 14.162823
2 1978 1  0 .19008264462809918 14.162823
2 1954 0  1 .19008264462809918 14.162823
2 1968 1  3 .19008264462809918 14.162823
2 1948 0  2 .19008264462809918 14.162823
2 1977 1  0 .19008264462809918 14.162823
2 1973 1  2 .19008264462809918 14.162823
2 1950 0  1 .19008264462809918 14.162823
2 1960 0  0 .19008264462809918 14.162823
2 1979 1  1 .19008264462809918 14.162823
2 1972 1  1 .19008264462809918 14.162823
2 1963 0  0 .19008264462809918 14.162823
2 1984 1  1 .19008264462809918 14.162823
2 1981 1  1 .19008264462809918 14.162823
2 1983 1  1 .19008264462809918 14.162823
2 1974 1  2 .19008264462809918 14.162823
2 1949 0  0 .19008264462809918 14.162823
2 1965 1  3 .19008264462809918 14.162823
2 1980 1  1 .19008264462809918 14.162823
2 1957 0  0 .19008264462809918 14.162823
2 1958 0  0 .19008264462809918 14.162823
2 1953 0  2 .19008264462809918 14.162823
2 1966 1  6 .19008264462809918 14.162823
2 1956 0  0 .19008264462809918 14.162823
2 1967 1  2 .19008264462809918 14.162823
2 1961 0  2 .19008264462809918 14.162823
2 1975 1  3 .19008264462809918 14.162823
2 1982 1  2 .19008264462809918 14.162823
2 1985 1  0 .19008264462809918 14.162823
2 1952 0  1 .19008264462809918 14.162823
2 1976 1  1 .19008264462809918 14.162823
2 1969 1  3 .19008264462809918 14.162823
2 1959 0  1 .19008264462809918 14.162823
4 1973 1  0  .3243243243243243 13.913777
4 1969 1  1  .3243243243243243 13.913777
4 1976 1  0  .3243243243243243 13.913777
4 1953 0  0  .3243243243243243 13.913777
4 1979 1  4  .3243243243243243 13.913777
4 1956 0  0  .3243243243243243 13.913777
4 1958 0  0  .3243243243243243 13.913777
4 1963 0  0  .3243243243243243 13.913777
4 1978 1  0  .3243243243243243 13.913777
4 1977 1  1  .3243243243243243 13.913777
4 1974 1  0  .3243243243243243 13.913777
4 1961 0  1  .3243243243243243 13.913777
4 1966 1  1  .3243243243243243 13.913777
4 1980 1  0  .3243243243243243 13.913777
4 1955 0  1  .3243243243243243 13.913777
4 1965 1  1  .3243243243243243 13.913777
4 1982 1  0  .3243243243243243 13.913777
4 1975 1  0  .3243243243243243 13.913777
4 1967 1  3  .3243243243243243 13.913777
4 1957 0  0  .3243243243243243 13.913777
4 1949 0  2  .3243243243243243 13.913777
4 1951 0  0  .3243243243243243 13.913777
4 1968 1  1  .3243243243243243 13.913777
4 1972 1  0  .3243243243243243 13.913777
4 1954 0  0  .3243243243243243 13.913777
4 1981 1  0  .3243243243243243 13.913777
4 1950 0  1  .3243243243243243 13.913777
4 1960 0  0  .3243243243243243 13.913777
4 1971 1  0  .3243243243243243 13.913777
4 1985 1  0  .3243243243243243 13.913777
4 1962 0  0  .3243243243243243 13.913777
4 1983 1  0  .3243243243243243 13.913777
4 1952 0  0  .3243243243243243 13.913777
4 1970 1  0  .3243243243243243 13.913777
4 1984 1  0  .3243243243243243 13.913777
4 1959 0  2  .3243243243243243 13.913777
4 1964 0  0  .3243243243243243 13.913777
4 1948 0  1  .3243243243243243 13.913777
5 1979 1  3   .216072545340838 12.584287
5 1951 0  0   .216072545340838 12.584287
5 1968 1  9   .216072545340838 12.584287
5 1963 0  0   .216072545340838 12.584287
5 1948 0  3   .216072545340838 12.584287
5 1960 0  6   .216072545340838 12.584287
5 1952 0  0   .216072545340838 12.584287
5 1955 0  1   .216072545340838 12.584287
5 1985 1  2   .216072545340838 12.584287
5 1949 0  0   .216072545340838 12.584287
5 1974 1 14   .216072545340838 12.584287
5 1969 1 15   .216072545340838 12.584287
5 1978 1  0   .216072545340838 12.584287
5 1967 1 18   .216072545340838 12.584287
5 1982 1  2   .216072545340838 12.584287
5 1966 1 24   .216072545340838 12.584287
5 1984 1  1   .216072545340838 12.584287
5 1964 0  3   .216072545340838 12.584287
5 1975 1  4   .216072545340838 12.584287
5 1956 0  1   .216072545340838 12.584287
5 1983 1  2   .216072545340838 12.584287
5 1971 1  2   .216072545340838 12.584287
5 1981 1  0   .216072545340838 12.584287
5 1959 0  1   .216072545340838 12.584287
end

C statistic after competing risk analysis

Hello

I am trying to estimate Harrell's C after a competing-risks regression analysis. Is there any way to do this in Stata?

Many thanks

Luis

Suggestions for the visualization of a triple interaction effect

$
0
0
I am looking to create figures that visualize a triple interaction effect I am observing in the data.
I reproduce the basic results below.

Code:
xtreg fwd log_dom t_s_d c.log_dom#c.t_s_d p_pria_timew p_pria_usew c.log_dom#c.p_pria_usew c.p_pria_usew#c.t_s_degree_cent c.log_dom#c.p_pria_usew#c.t_s_d log_other t_k_s team_t_1st col_difpairs team_m_patcount_5y p_classes dif_cpc p_cpc_1 p_pria p_claims p_inv f_inv_prod_5y f_search f_acap f_dar f_emp i.p_appy i.p_gry if compustat == 1 , fe

Fixed-effects (within) regression               Number of obs     =     40,282
Group variable: fid                             Number of groups  =        109

R-sq:                                           Obs per group:
     within  = 0.0831                                         min =          1
     between = 0.0728                                         avg =      369.6
     overall = 0.0679                                         max =      8,495

                                                F(36,40137)       =     101.00
corr(u_i, Xb)  = -0.2782                        Prob > F          =     0.0000

-------------------------------------------------------------------------------------------------------------
                                        fwd |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------------------------------------+----------------------------------------------------------------
                                  log_dom_t |   .5544467   .0850644     6.52   0.000     .3877186    .7211748
                            t_s_degree_cent |    .052213     .02431     2.15   0.032     .0045649    .0998612
                                            |
              c.log_dom_t#c.t_s_degree_cent |  -.0276355   .0060148    -4.59   0.000    -.0394246   -.0158464
                                            |
                               p_pria_timew |  -1.091934   .0910784   -11.99   0.000     -1.27045   -.9134187
                                p_pria_usew |   .3448452   .1871209     1.84   0.065    -.0219161    .7116065
                                            |
                  c.log_dom_t#c.p_pria_usew |   .1115617    .063443     1.76   0.079     -.012788    .2359115
                                            |
            c.p_pria_usew#c.t_s_degree_cent |   .1329657   .0234215     5.68   0.000     .0870591    .1788722
                                            |
c.log_dom_t#c.p_pria_usew#c.t_s_degree_cent |  -.0285318   .0052981    -5.39   0.000    -.0389161   -.0181474
Normally, for a simple interaction effect I would use something like

Code:
marginscontplot log_dom_t t_s_d, at1(0(.25)6) at2(0 5.7 14.3 22.9)
So I am wondering:
1) Is it possible to run marginscontplot for specific values of my variable p_pria_usew (e.g., using by or over)? This should create multiple graphs that exhibit the effects.
2) Is there a way to create a 3D plot (with fwd on the y-axis, log_dom_t on the x-axis, and t_s_degree_cent on the z-axis) for three values of p_pria_usew (e.g., mean +/- 1 standard deviation)?
3) If either 1 or 2 is possible, would the solution differ if I ran a negative binomial regression instead of simple OLS?
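One hedged route to (1) is margins plus marginsplot, evaluating all three interacted variables at chosen values and splitting panels by p_pria_usew; the grid values below are assumptions borrowed from the marginscontplot call above:

```stata
summarize p_pria_usew
local m = r(mean)
local s = r(sd)
* predicted fwd over a grid of the three interacted variables
margins, at(log_dom_t = (0(1)6) t_s_degree_cent = (0 5.7 14.3 22.9) ///
    p_pria_usew = (`=`m'-`s'' `m' `=`m'+`s''))
* one panel per value of p_pria_usew, log_dom_t on the x-axis
marginsplot, by(p_pria_usew) xdimension(log_dom_t)
```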

Thanks heaps in advance!

Simon

relogit or firthlogit?

Hi,
I am working on a dataset of over 50,000 observations; however, my dependent variable, a dummy, has only about 100 cases with a value of 1. All other cases are 0. From my understanding, relogit and firthlogit can both deal with such rare events. Could anyone suggest which one is better, or explain the differences between these two models?

Thank you very much.

Ruowen

sum appears in the last row

Dear All, I found this question here (http://bbs.pinggu.org/thread-6521515-1-1.html). The data is
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str8 area float(x1 x2 x3)
"A" 1000 1200 2000
"B"  800 1800  400
"C" 2300 1230 2310
end
and I want to add another row, say "D", whose values equal the column totals of x1-x3. Any suggestion is highly appreciated.
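One hedged approach is to collapse a copy of the data to a single row of column sums and append it back:

```stata
preserve
* reduce the copy to one observation holding the column totals
collapse (sum) x1 x2 x3
gen area = "D"
tempfile totals
save `totals'
restore
* attach the totals row below the original data
append using `totals'
list, noobs
```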

Date formatting

Hi,
I have about six months of daily data (1 January 2016 until 30 June 2016) for about 1000 companies. The date data are in numeric format, as shown under "Original date". Here is the command I am using to declare the data as time series: "tsset ISIN1 date, format(%tdNN/DD/CCYY)". As shown below, the formatted date differs from the original date I started with: the original date of 1st January 2016 (Original date) changed into 1st April 2016 (Formatted date). I am not sure which command I should use in order to get the correct date.
I would be grateful if someone guides me in this regard.

Original date Formatted date
ISIN date ISIN Date
GB0001771426 1/1/2016 AT0000785407 04/01/2016
GB0001771426 1/4/2016 AT0000785407 04/04/2016
GB0001771426 1/5/2016 AT0000785407 04/05/2016
GB0001771426 1/6/2016 AT0000785407 04/06/2016
GB0001771426 1/7/2016 AT0000785407 04/07/2016
GB0001771426 1/8/2016 AT0000785407 04/08/2016
GB0001771426 1/11/2016 AT0000785407 04/11/2016
GB0001771426 1/12/2016 AT0000785407 04/12/2016
GB0001771426 1/13/2016 AT0000785407 04/13/2016
GB0001771426 1/14/2016 AT0000785407 04/14/2016
GB0001771426 1/15/2016 AT0000785407 04/15/2016
GB0001771426 1/18/2016 AT0000785407 04/18/2016
GB0001771426 1/19/2016 AT0000785407 04/19/2016
GB0001771426 1/20/2016 AT0000785407 04/20/2016
GB0001771426 1/21/2016 AT0000785407 04/21/2016
GB0001771426 1/22/2016 AT0000785407 04/22/2016
GB0001771426 1/25/2016 AT0000785407 04/25/2016
GB0001771426 1/26/2016 AT0000785407 04/26/2016
GB0001771426 1/27/2016 AT0000785407 04/27/2016
GB0001771426 1/28/2016 AT0000785407 04/28/2016
GB0001771426 1/29/2016 AT0000785407 04/29/2016
GB0001771426 2/1/2016 AT0000785407 05/02/2016
GB0001771426 2/2/2016 AT0000785407 05/03/2016
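If the raw dates arrive as strings in month/day/year order, one hedged sketch is to build the numeric date explicitly before tsset (datestring and firmid are names I'm introducing for illustration):

```stata
* convert an "M/D/Y" string such as "1/4/2016" to a Stata daily date
generate numdate = daily(datestring, "MDY")
format numdate %td
* tsset needs a numeric panel identifier, not the string ISIN
egen long firmid = group(ISIN)
tsset firmid numdate
```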

Thanks in Advance,
Mahmoud

How to retrieve a part of a string, before a specific symbol?

Does anyone know how to retrieve the part of a string before a specific symbol?

For example, in the data below,
Code:
 
Name
thisisatesSGA123
madSGA1
IKNOWHssswSGA
I would like to keep the part before "SGA" and remove the rest:
Code:
 
Name
thisisates
mad
IKNOWHsssw
Can anyone help me?
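A hedged sketch using strpos() and substr() (Name2 is a name I'm introducing; the second line guards values where "SGA" never occurs):

```stata
* keep everything before the first occurrence of "SGA"
generate Name2 = substr(Name, 1, strpos(Name, "SGA") - 1)
* if "SGA" is absent, strpos() returns 0 and substr() gives "", so restore the original
replace Name2 = Name if strpos(Name, "SGA") == 0
```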