Randomly selecting control observations by group and criteria

April 27, 2020, 1:10 pm

Dear all,

I have a dataset which contains actual observations and a varying number of control observations per group.

The goal is to assign 3 control observations (marked as control*==1) to each actual observation (marked as actual==1) in order to have a fixed ratio between actuals and controls. Therefore, I would like to randomly select 3 observations from all potential controls per group.

The group variable (group_id) is the combination of "worker_id & firm_id". So, for each actual observation per group, the potential pool of controls consists of observations which share the same "worker_id & firm_id" and are marked as control1==1 or control2==1. For example, for the actual observation in line 1 below, the possible control observations according to control1 are those in lines 2-7, because they have the same worker_id & firm_id and are marked as control1==1. According to control2, the eligible controls are lines 6 and 7.

The variables control1 and control2 simply apply a different set of criteria under which an observation can qualify as a control to the actual. The goal is to conduct 2 separate random selections of 3 controls per group, first according to control1 and second according to control2. As a result, ideally 2 new variables would be generated which tag the randomly selected control observations per group according to control1 and control2.

Do you have any suggestions on how to approach this? I would really appreciate any help with the code.

There are in particular two aspects that I struggle to incorporate:
First, note that there can be more than one "actual" per group (max. 4) (compare for example line 1 and 8). In this case, the potential pool of controls is the same, however, the random set of 3 control observations should be drawn independently for each "actual".

Second, for some groups there may be fewer than 3 potential controls available. In this case, I would like to flag this "actual" in order to exclude it from the analysis later.

Please find below a data excerpt. Many thanks again for your help.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte worker_id int(firm_id group_id coworker_id) byte(actual control1 control2)
1 999 1999 100 1 0 0
1 999 1999 589 0 1 0
1 999 1999 877 0 1 0
1 999 1999 234 0 1 0
1 999 1999 205 0 1 0
1 999 1999 743 0 1 1
1 999 1999 284 0 1 1
1 999 1999 104 1 0 0
2 876 2876 874 1 0 0
2 876 2876 432 1 0 0
2 876 2876 434 0 1 1
2 876 2876 546 0 1 1
2 876 2876 342 0 1 1
2 876 2876 689 0 1 0
2 876 2876  65 0 1 1
2 876 2876 439 0 1 0
2 876 2876 234 0 1 0
2 876 2876 543 0 1 0
end
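
A minimal sketch of one way to approach this, not a definitive solution. It pairs every actual with every eligible control in its group via joinby, attaches an independent random number to each pair, and keeps the 3 pairs with the smallest numbers per actual. The seed value, the local `crit`, and the names actual_id, control_id, u, selected, n_pool and too_few are assumptions introduced for illustration; only the first group from the excerpt is reproduced so the block runs on its own.

```stata
* First group from the excerpt above, repeated so the sketch is self-contained
clear
input byte worker_id int(firm_id group_id coworker_id) byte(actual control1 control2)
1 999 1999 100 1 0 0
1 999 1999 589 0 1 0
1 999 1999 877 0 1 0
1 999 1999 234 0 1 0
1 999 1999 205 0 1 0
1 999 1999 743 0 1 1
1 999 1999 284 0 1 1
1 999 1999 104 1 0 0
end

set seed 20200427            // assumption: any fixed seed, for reproducibility
local crit control1          // rerun with control2 for the second selection

* Save the pool of eligible controls for this criterion
tempfile pool
preserve
keep if `crit' == 1
keep group_id coworker_id
rename coworker_id control_id
save `pool'
restore

* Keep the actuals and pair each one with every control in its group
keep if actual == 1
keep group_id coworker_id
rename coworker_id actual_id
joinby group_id using `pool', unmatched(master)

* Each actual-control pair gets its own random number, so draws are
* independent across actuals that share the same pool
gen double u = runiform()
bysort group_id actual_id (u): gen byte selected = _n <= 3 & !missing(control_id)

* Flag actuals whose pool holds fewer than 3 controls
by group_id actual_id: egen byte n_pool = total(!missing(control_id))
gen byte too_few = n_pool < 3

keep if selected | too_few
list group_id actual_id control_id selected too_few, sepby(actual_id)
```

The result is a long dataset of actual-control pairs rather than a tag variable in the original data. That choice is deliberate: the same control can be drawn for two different actuals in a group, and a single 0/1 tag could not record which actual it belongs to. With `crit` set to control2, this group has only 2 eligible controls, so both actuals would be flagged by too_few.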

↧

Replacing missing values with future observations (large gaps)

April 27, 2020, 1:35 pm

Hello Statalist community,
I am working with a panel dataset for mutual funds (fund identifier: wficn) with monthly observations. I have a variable called exp_ratio, which is the expense ratio for a given fund. These ratios are reported infrequently, and I might face a period of, e.g., 16 months without an observation before the ratio is reported. In such a case, I want to replace all the missing ratios with the next observed ratio, effectively replacing all missing values for the 16 months with the observation from month 17. However, the gaps between reported values of exp_ratio vary over time (there might be only 2 consecutive missing values at one point and 10 consecutive missing values at another).
The dataset is relatively large, with 2,800,000 observations, so this work cannot be done manually.

I would really appreciate if somebody could help me out here. Thank you in advance!

The code I have written so far does not fulfil my requirement as it only replaces the missing value previous to an observation:

Code:

bysort wficn: replace exp_ratio= exp_ratio[_n+1] if exp_ratio==.
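
The one-liner above fills only the last missing value before each report because replace works from the top down: when it reaches observation _n, observation _n+1 has not been filled yet. A minimal sketch of one common workaround is to reverse the time order within each fund, so that the next report becomes the previous row and the fill cascades. The toy data are made up, and mdate is a hypothetical name for the monthly date variable, which the post does not name.

```stata
* Toy data, made up for illustration; mdate is a hypothetical monthly date
clear
input long wficn int mdate double exp_ratio
1 1 .
1 2 .
1 3 0.012
1 4 .
1 5 0.015
2 1 .
2 2 0.020
2 3 .
end

* Reverse time within fund so the "next" report sits in the previous row
gen long negdate = -mdate
bysort wficn (negdate): replace exp_ratio = exp_ratio[_n-1] if missing(exp_ratio)

* Restore chronological order
drop negdate
sort wficn mdate
list, sepby(wficn)
```

Missing values after a fund's last report (fund 2, month 3 in the toy data) stay missing, since there is no later value to carry back.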

↧

Why does Stata still report log likelihood results for a Linear Probability Model when using esttab?

April 27, 2020, 4:40 pm

Dear Statalists, when I report both LPM and logistic regression results using esttab, why does Stata still report a log likelihood for the LPM even though the LPM does not use it?

↧

Graphing: Scatter Plot

April 27, 2020, 6:12 pm

Dear Stata,
I have panel data. I want to draw a scatter plot with the wage in log form on the y axis and the fraction of employed workers who are members of a labor union on the x axis. I want my axis to be in years. May I ask for assistance: what is the command for this? Thank you.

↧

How to create interaction variable in stata

April 27, 2020, 8:43 pm

I have 2 independent variables and I need to create an interaction between them.
The first variable is the distance to a financial institution, which I recoded as categorical (1: 0-1km; 2: 1-10km; 3: >10km).
The second variable is a health shock dummy: 1 if a health shock occurs, 0 if it does not.
My data are a panel for the years 2007 and 2014.
Can I create an interaction between these two variables? If yes, how can I do that?
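
A minimal sketch using Stata's factor-variable notation, where `##` between two `i.` terms expands to both main effects plus all interaction terms. The data are simulated so the block runs on its own, and every variable name (id, year, dist_cat, shock, y) is hypothetical, as is the choice of a fixed-effects model.

```stata
* Simulated toy data; all variable names are hypothetical
clear
set seed 2020
set obs 200
gen int id = ceil(_n/2)
bysort id: gen int year = cond(_n == 1, 2007, 2014)
gen byte dist_cat = ceil(3*runiform())    // 1: 0-1km, 2: 1-10km, 3: >10km
gen byte shock = runiform() < 0.3         // 1 = health shock occurred
gen double y = rnormal()

xtset id year

* i.dist_cat##i.shock = main effects of both variables plus their interaction
xtreg y i.dist_cat##i.shock, fe
```

With this notation no interaction variables need to be generated by hand, and Stata picks the base categories itself.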

↧

How to do Reality check and SPA test by Stata?

April 27, 2020, 8:46 pm

Dear Statalist,
I want to know whether there are any user-written Stata commands for performing the Reality Check (White 2000) and the Superior Predictive Ability test (Hansen 2001). So far I have not found anything.

Here is the link to White's paper: White, H. (2000). A reality check for data snooping. Econometrica, 68(5), 1097-1126. http://www.ssc.wisc.edu/~bhansen/718/White2000.pdf

Here is Hansen's paper: Hansen, P. R. (2001). An unbiased and powerful test for superior predictive ability (No. 2001-06). http://www-siepr.stanford.edu/workp/swp05003.pdf

Thanks a lot in advance for your help! It is really appreciated!

Sincerely
Ning.

↧

No constant in fixed-effect regression

April 27, 2020, 9:16 pm

Dear all:

Hope you are doing well and healthy. I am writing to ask a quick question about the fixed effect regressions in Stata.

I notice that none of the fixed-effect commands in Stata (areg, xtreg, reghdfe) could suppress the constant term. However, the constant term is quite annoying if my key interest is the fixed effect estimates because it could mess up the fixed effect estimates.

I am wondering if anyone knows any smart way to get rid of the constant term or recover the true fixed-effects estimates?

My hunch is:

Code:

xtreg y x, fe
predict fe_wrong, u
gen fe_right = fe_wrong + _b[_cons]

But I am not 100% sure.

If it is the case, how could we recover the true two-way fixed effects from constant?

Thank you, and I look forward to hearing from you!

Best,
Long

↧

Topic impact access to microfinancial institution for ensuring consumption because of health shock : omitted because of collinearity

April 27, 2020, 9:24 pm

I am currently writing my undergraduate thesis. I have a large panel dataset for the years 2007 and 2014. My dependent variable is the change in consumption.
My independent variables are:
1. interaction between change in health condition in dummy form with dummy variable for distance 0-1km
2. interaction between change in health condition in dummy form with dummy variable for distance 1-10km
3. interaction between change in health condition in dummy form with dummy variable for distance >10km

I declare the panel using
xtset pidlink year
and run the regression using RE
xtreg consumption distance1 distance2 distance3 demographic, re
The result is that all the variables are omitted due to collinearity. Is there any way to solve this problem? Is it OK to construct my independent variables for distance like that?

Thank you in advance

↧

Looping regression to determine one set of control variables that make models significant

April 27, 2020, 11:33 pm

Hi everyone,

I have one dependent variable Y, four independent variables X1 X2 X3 X4, and 12 control variables C1 C2 ... C12 (please note that the independent and control variables are not in any particular order).

Now, I want to loop over regressions to select a set of control variables from my 12 that makes the coefficients of my 4 independent variables (X1 X2 X3 X4) significant, which in my case means the p-values of the four coefficients should be less than 10% (the smaller the better).

Please note that the set of control variables I want to pick can contain anywhere from 1 control variable to all 12, which means it can be just C1 or just C2 or just C7, and it can also be a combination like (C1, C7, C10, C12). If my calculation is correct, there should be 2^12-1 = 4095 combinations and correspondingly 4095 regressions in the loop.

And when a set of control variables that satisfies the significance requirement is detected, I'd love it to be output so that I can see which sets are available.
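
A minimal sketch of the enumeration described above, on simulated data so that it runs on its own. Each integer m from 1 to 2^K-1 is read as a bit pattern, and control Cj enters the model when bit j of m is set. The data-generating step, the local names K, nsets, ctrl, p and pmax, and the use of only 3 controls are assumptions for illustration; setting K to 12 gives the 4095 regressions from the post.

```stata
* Simulated toy data so the sketch runs on its own; 3 controls instead of 12
clear
set seed 1
set obs 500
forvalues j = 1/3 {
    gen double C`j' = rnormal()
}
forvalues k = 1/4 {
    gen double X`k' = rnormal()
}
gen double Y = X1 + X2 + X3 + X4 + C1 + rnormal()

local K = 3                       // set to 12 for the case in the post
local nsets = 2^`K' - 1
forvalues m = 1/`nsets' {
    * Build the control set whose members are the 1-bits of m
    local ctrl
    forvalues j = 1/`K' {
        if mod(floor(`m'/2^(`j'-1)), 2) == 1 local ctrl `ctrl' C`j'
    }
    quietly regress Y X1 X2 X3 X4 `ctrl'
    * Largest two-sided p-value among the four coefficients of interest
    local pmax = 0
    foreach x in X1 X2 X3 X4 {
        local p = 2*ttail(e(df_r), abs(_b[`x']/_se[`x']))
        if `p' > `pmax' local pmax = `p'
    }
    * Report the set only if all four p-values are below 10%
    if `pmax' < 0.10 display "`ctrl'" "  max p = " %6.4f `pmax'
}
```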

Thank you for your help

↧

Demeaning and standardizing variables in panel regression

April 28, 2020, 12:26 am

Hello everyone,

I am analyzing a panel data set with 55 countries. My dependent variable is firm equity issuance (aggregated at the country level) and my independent variable is aggregate stock market liquidity. I initially ran a panel regression with fixed effects as below,

xtreg equity liquiditity $controls i.year, fe vce(robust) // (1st regression)

However, since the scales of the two variables are different, the coefficients are not naturally interpretable. I recently saw in a paper that demeaning and standardizing variables allows meaningful interpretation of the coefficients. To try this, I ran the following command which creates a new set of standardized variables with the prefix "c_".

by country: center equity liquiditity , standardize

reg c_equity c_liquidity $controls i.year, vce(robust) // (2nd regression)

My questions are;
01) Is it Ok to standardize only the dependent and independent variable? Do I need to standardize the control variables in my model when running the 2nd regression?
02) When running the regression with demeaned and standardized variables, is the code stated above correct; is it correct to use "reg" instead of "xtreg"?
03) Even though I did not standardize the control variables, the coefficients and the p values of those control variables in the 2nd regression are different from the 1st regression. Is that to be expected?

Any help is much appreciated. Thank you.

↧

Pooled OLS, fixed & random effects: Panel Data

April 28, 2020, 1:13 am

Hey everyone, (the data description is posted at the bottom) I'm currently writing my bachelor's assignment at Copenhagen Business School, and I ran into an issue with Stata that I haven't been able to solve on my own.

Since it is a university assignment, the normal approach (as I have been taught, and as recommended in https://www.iuj.ac.jp/faculty/kucc62...blq5Qmk7KvdJLg) would be to start off with a simple model like a Pooled OLS, and then, if that isn't sufficient or the assumptions of the model don't seem to hold up, move on to fixed or random effects models. Gladly correct me if this approach isn't optimal.

My first issue when doing the Pooled OLS is figuring out whether it is actually done correctly (as I have seen different approaches from different sources). From what I can tell, you do this by using clustered standard errors.

Code:

reg Covid19_cases x1 x2 x3 Country, vce(cluster Country)

Question 1. Is this approach to Pooled OLS correct, and how should I include my time variable in -reg-?

Question 2. How do I test the assumptions of heteroskedasticity and autocorrelation when using clustered standard errors, as this seems to make it impossible to run a Breusch-Pagan test?

Code:

. hettest
hettest not appropriate after robust cluster()
r(498);

Furthermore, I know that -xtreg- usually outperforms -reg- (with clustered standard errors) when it comes to panel data regression.

So my Question 3 (see the output from Pooled OLS and Random effects below) is: how do I determine, based on the Stata output, whether I should use a Pooled OLS, fixed or random effects model? (As almost all my variables are static, I know that I'll probably end up with a random effects model. I simply haven't been able to argue for this statistically, as I can't even test for things like heteroskedasticity and autocorrelation.)

output from Pooled OLS:

Code:

Linear regression                               Number of obs     =      4,592
                                                F(19, 41)         =     303.69
                                                Prob > F          =     0.0000
                                                R-squared         =     0.7294
                                                Root MSE          =       1242

                                                (Std. Err. adjusted for 42 clusters in Country)
-----------------------------------------------------------------------------------------------
                              |               Robust
                Covid19_cases |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------------+----------------------------------------------------------------
                  Ages0_14Pct |   8123.495   11156.99     0.73   0.471     -14408.5    30655.49
                 Ages65_99Pct |   3038.334   11320.49     0.27   0.790    -19823.86    25900.53
                 Ages15_64Pct |   7851.408   11239.37     0.70   0.489    -14846.96    30549.78
               Covid19_deaths |   9.630924   1.715408     5.61   0.000     6.166587    13.09526
                   CrimeIndex |  -8.226521   5.282897    -1.56   0.127    -18.89555    2.442507
                  DAI_B_index |   1999.881   1276.736     1.57   0.125    -578.5393    4578.301
                  DAI_G_index |   290.8351   454.5301     0.64   0.526     -627.107    1208.777
                  DAI_P_index |   3.885746    690.839     0.01   0.996    -1391.292    1399.064
                      Gdp2018 |   .1692471   .0260085     6.51   0.000     .1167219    .2217724
           GdpAgriculturalPct |    1772.46   2558.133     0.69   0.492    -3393.794    6938.715
             GdpIndustrialPct |  -35.28193   2158.263    -0.02   0.987    -4393.983    4323.419
                GdpServicePct |   150.5583   2216.818     0.07   0.946    -4326.396    4627.512
         InternetUsage2014Pct |  -283.9534   701.2951    -0.40   0.688    -1700.248    1132.341
                  popData2018 |  -9.46e-07   4.60e-07    -2.06   0.046    -1.87e-06   -1.64e-08
pop_AnnualGrowthPct_2010_2018 |  -15230.27   11371.27    -1.34   0.188    -38195.02    7734.475
                pop_density18 |  -.3554776   .4322918    -0.82   0.416    -1.228509    .5175533
          SocialMobilityIndex |  -9.681271    21.7794    -0.44   0.659    -53.66567    34.30313
              StringencyIndex |   3.398476   1.255605     2.71   0.010     .8627302    5.934222
                      Country |   .3061796     1.2523     0.24   0.808    -2.222892    2.835251
                        _cons |  -7908.159   12120.52    -0.65   0.518    -32386.04    16569.72
-----------------------------------------------------------------------------------------------

output from Random effects:

Code:

xtset Country Date

Code:

Random-effects GLS regression                   Number of obs     =      4,592
Group variable: Country                         Number of groups  =         42

R-sq:                                           Obs per group:
     within  = 0.6600                                         min =         51
     between = 0.9450                                         avg =      109.3
     overall = 0.7293                                         max =        113

                                                Wald chi2(18)     =    9283.67
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

-----------------------------------------------------------------------------------------------
                Covid19_cases |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------------------+----------------------------------------------------------------
                  Ages0_14Pct |   7719.581   17073.57     0.45   0.651       -25744    41183.17
                 Ages65_99Pct |   2600.686   16826.48     0.15   0.877    -30378.61    35579.98
                 Ages15_64Pct |   7477.483   17549.24     0.43   0.670    -26918.39    41873.36
               Covid19_deaths |   9.728414   .1100511    88.40   0.000     9.512718     9.94411
                   CrimeIndex |   -8.30544   6.737789    -1.23   0.218    -21.51126    4.900383
                  DAI_B_index |   1956.444   1257.037     1.56   0.120    -507.3033    4420.191
                  DAI_G_index |   308.9836   463.2871     0.67   0.505    -599.0424     1217.01
                  DAI_P_index |   64.28808   1023.292     0.06   0.950    -1941.327    2069.903
                      Gdp2018 |   .1685496   .0209757     8.04   0.000      .127438    .2096611
           GdpAgriculturalPct |   1816.435   4797.581     0.38   0.705    -7586.651    11219.52
             GdpIndustrialPct |  -124.5615   4119.166    -0.03   0.976    -8197.979    7948.856
                GdpServicePct |   135.0255   4126.775     0.03   0.974    -7953.304    8223.356
         InternetUsage2014Pct |  -392.6615   1069.255    -0.37   0.713    -2488.362    1703.039
                  popData2018 |  -9.59e-07   3.03e-07    -3.16   0.002    -1.55e-06   -3.64e-07
pop_AnnualGrowthPct_2010_2018 |  -15717.27   16056.61    -0.98   0.328    -47187.65    15753.11
                pop_density18 |  -.3674106   .5401536    -0.68   0.496    -1.426092     .691271
          SocialMobilityIndex |  -8.118353   21.31227    -0.38   0.703    -49.88963    33.65293
              StringencyIndex |   3.613653   .5180906     6.97   0.000     2.598214    4.629092
                        _cons |  -7511.384    18162.4    -0.41   0.679    -43109.03    28086.26
------------------------------+----------------------------------------------------------------
                      sigma_u |   319.2514
                      sigma_e |  1214.6213
                          rho |  .06462069   (fraction of variance due to u_i)
-----------------------------------------------------------------------------------------------

Data description:
21 variables, and 4592 observations. (unbalanced dataset)

Variable	Description
Date	Time indicator (In days)
StringencyIndex	Index measuring the government response to Covid19. 100 being the most severe response, and 0 being the loosest response.
Covid19_cases	Dependent variable. Measuring the number of recorded covid19 cases
Covid19_deaths	Measuring the number of recorded deaths caused by covid19
popData2018	2018 country population data
DAI_index	Digital adoption index. Measuring a country's digital adoption across three dimensions of the economy: people, government, and business
DAI_B_index	Measuring a country's digital adoption across business
DAI_P_index	Measuring a country's digital adoption across people
DAI_G_index	Measuring a country's digital adoption across government
pop_AnnualGrowthPct_2010_2018	Measuring a country's annual growth in population from 2010 to 2018 in pct.
Ages0_14Pct	Measuring the pct. of a country's population who are between 0 and 14 years of age.
Ages15_64Pct	Measuring the pct. of a country's population who are between 15 and 64 years of age.
Ages65_99Pct	Measuring the pct. of a country's population who are between 65 and 99 years of age.
Ages0_99Pct	Measuring the pct. of a country's population who are between 0 and 99 years of age.
CrimeIndex	Index measuring crime rates by country. 100 being the highest crime rates and 0 being the lowest
SocialMobilityIndex	Index measuring social mobility by country. 100 being the highest social mobility and 0 being the lowest
Gdp2018	Country GDP by 2018 numbers
GdpAgriculturalPct	Pct. of a country's GDP that comes from the agriculture sector
GdpIndustrialPct	Pct. of a country's GDP that comes from the industrial sector
GdpServicePct	Pct. of a country's GDP that comes from the service sector
InternetUsage2014Pct	% of a country's population that uses the internet, by 2014 numbers
Country	Entity indicator
Continent	Continent
pop_density18	Population density by country by 2018 numbers

I hope I have been as precise and informative as possible.

Best regards, Walther Larsen

↧

How to get the start and end round of each user?

April 28, 2020, 1:22 am

Hi all,

I need your help. I have a dataset with two variables:
- Username
- Round: Ordinal number of rounds.

I would like to get the spell of each user: at which round he starts and at which round he ends.

Thank you for your help.

Here is the dataset:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str18 username int round
"user E"  1
"user C"  1
"user D"  1
"user E"  2
"user C"  2
"user D"  2
"user E"  3
"user C"  3
"user D"  3
"user E"  4
"user C"  4
"user D"  4
"user E"  5
"user C"  5
"user D"  5
"user E"  6
"user C"  6
"user D"  6
"user F"  6
"user E"  7
"user D"  7
"user F"  7
"user E"  8
"user D"  8
"user F"  8
"user E"  9
"user D"  9
"user F"  9
"user E" 10
"user D" 10
"user F" 10
"user A" 10
end
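
A minimal sketch of one way to get the first and last round per user, assuming each user plays one uninterrupted spell (a user who leaves and returns would need spell-by-spell handling). The block uses a small subset of rows consistent with the data above so that it runs on its own; first_round, last_round and tag are names introduced for illustration.

```stata
* Subset of rows consistent with the example data above
clear
input str18 username int round
"user E"  1
"user C"  1
"user C"  5
"user E" 10
"user F"  6
"user F" 10
"user A" 10
end

* Sort rounds within user, then read off the first and last row of each group
bysort username (round): gen int first_round = round[1]
by username: gen int last_round = round[_N]

* Show one line per user
egen byte tag = tag(username)
list username first_round last_round if tag, noobs
```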

↧

fillin (a question on Twitter)

April 28, 2020, 1:28 am

https://twitter.com/johannesmboehm/s...48146396504064 raised a question about fillin. As tweets may disappear for various reasons, here it is, edited slightly.

fillin doesn't allow by: (but never understood why). I just wish there was a more efficient way to do it rather than reshaping twice.

The reason is, as I understand it, that the variables you want to specify in by: can just be specified directly.

Here is a silly example, a panel dataset that you want to make balanced (futile though that may be):

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(AreaID Year Var1 Var2)
1 2000 20 30
1 2001 21 32
2 2000 50 55
end

fillin AreaID Year

list

     +---------------------------------------+
     | AreaID   Year   Var1   Var2   _fillin |
     |---------------------------------------|
  1. |      1   2000     20     30         0 |
  2. |      1   2001     21     32         0 |
  3. |      2   2000     50     55         0 |
  4. |      2   2001      .      .         1 |
     +---------------------------------------+

↧

Delete observations under conditions

April 28, 2020, 1:39 am

Hi, I have a question that I hope someone can answer. I'm still pretty new to Stata, so excuse me if I'm completely lost...

I work with a dataset that I collected myself through a survey.

I have 538 observations in my dataset.
I am researching Muslims in Denmark and have therefore measured the degree of their identification with the social group "muslim minorities" on three variables.
On each of these, respondents answer from "strongly agree" to "strongly disagree" on a scale of 1-5.

I would now like to keep only the observations that responded either "strongly agree" or "partially agree" on a minimum of two of the three variables.

I don't know if that's possible in Stata at all? I've searched and read a lot, but can't find any commands that solve my problem.
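
A minimal sketch of one way to do this: count, per respondent, how many of the three items carry an agreeing answer, then keep those with a count of at least two. The variable names id1, id2, id3 and n_agree are hypothetical, the toy data are made up, and the coding 1 = "strongly agree", 2 = "partially agree" is an assumption (if the scale runs the other way, use 4 and 5 instead).

```stata
* Toy data; id1-id3 are hypothetical item names, and the coding
* 1 = "strongly agree", 2 = "partially agree" is an assumption
clear
input byte(id1 id2 id3)
1 2 5
1 4 5
2 2 2
3 3 3
5 1 1
end

* inlist() returns 1 when the answer is 1 or 2, so the sum counts agreeing items
gen byte n_agree = inlist(id1, 1, 2) + inlist(id2, 1, 2) + inlist(id3, 1, 2)

* Keep respondents who agree on at least two of the three items
keep if n_agree >= 2
list
```

Missing answers count as not agreeing here, because inlist() returns 0 for a missing value.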

↧

Marginal effects-interpretation

April 28, 2020, 2:15 am

Dear all,

I am using a logit model to estimate how the X variables affect the probability of the dependent variable Y.
The table below presents the results of my logit model. The column I_m shows the marginal effects of the model.
Now, I want to interpret how the variable X3 affects the dependent variable, but I am a little bit confused. Which value should I use for the interpretation? I think that the value -4.160 is the one to use, right? But how should I interpret this negative association?
I have read in the forum that when there is a negative association between two variables in a logit model, an increase in the X variable means that it is less likely to affect the dependent variable (an increase in X3 means there is a 416% chance that it does not affect the dependent variable). Is this correct?

Thank you
Kind regards
K. Marenas

--------------------------------------------
                      I            I_m
--------------------------------------------
main
X1                1.240          0.200
                 (0.66)         (0.66)
X2                0.543***       0.088***
                 (3.06)         (3.09)
X1*x2             6.744*         1.090*
                 (1.70)         (1.70)
X3              -34.421***      -4.160***
                (-3.77)        (-3.90)
X4                0.333***       0.054***
                 (3.61)         (3.68)
--------------------------------------------

↧

Showing 95%-CI intervals in bar graph

April 28, 2020, 2:50 am

Hello everyone

In my data I have a dichotomous variable (=value) for three different conditions (which are coded as 0,1,2). Currently, I get a nice picture with:

graph bar, over(value) by(condition, cols(3)) blabel(bar, format(%4.2g)) ytitle(Distribution in percentage) yscale(range(0 100)) ylabel(#5) scheme(s1mono) xsize(5)

However, I also want to show the 95% CI on the bars. Some research suggests that I would need to work with a twoway graph and use || rcap, but I can't get it to work. I think the syntax with by(condition) is not supported in twoway graphs?

Can somebody help me?

Thanks
Jonas

↧

hetprobit and robust

April 28, 2020, 2:52 am

Hello,

I have been reading on this forum and, if I understand correctly, hetprobit estimates a probit model with heteroskedasticity. However, when I look at the Stata help page for hetprobit, I see that the option vce(robust) still exists. This seems strange to me, given that I assume the hetprobit command already takes this heteroskedasticity into account. Could anyone clarify this?

Thank you very much!
Timea

↧

Paneldata Fixed Effects with and without robust leads to the same result

April 28, 2020, 2:53 am

Hello,
I hope someone can help me.
I have an unbalanced panel. The Hausman test leads me to use fixed effects for further analysis. Testparm and the modified Wald test lead me to use entity and time fixed effects and also "robust" with vce (robust) because of heteroskedasticity. When I use xi: xtreg y x1 i.year, fe and xi: xtreg y x1 i.year, fe vce (robust) I get the same results... Did I make a mistake anywhere? Thanks in advance.

. xtset Unternehmen Jahr
panel variable: Unternehmen (unbalanced)
time variable: Jahr, 2014 to 2018, but with gaps
delta: 1 year

. xtreg Y x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11, fe

Fixed-effects (within) regression Number of obs = 342
Group variable: Unternehmen Number of groups = 79

R-sq: Obs per group:
within = 0.6964 min = 1
between = 0.3551 avg = 4.3
overall = 0.4950 max = 5

F(11,252) = 52.54
corr(u_i, Xb) = -0.2154 Prob > F = 0.0000

------------------------------------------------------------------------------
Y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x1 | .0121592 .015234 0.80 0.426 -.017843 .0421614
x2 | 1.885169 5.310276 0.36 0.723 -8.573006 12.34334
x3 | -.4471534 .2738079 -1.63 0.104 -.9863968 .09209
x4 | -12.6464 3.358336 -3.77 0.000 -19.26038 -6.032417
x5 | -3.154039 1.94352 -1.62 0.106 -6.981651 .6735729
x6 | 9.843865 .5781626 17.03 0.000 8.705219 10.98251
x7 | -.0105897 .0053595 -1.98 0.049 -.0211448 -.0000345
x8 | -.0909935 .1932579 -0.47 0.638 -.4716 .289613
x9 | .0042337 .179777 0.02 0.981 -.3498233 .3582906
x10 | .2618862 .1633328 1.60 0.110 -.0597851 .5835574
x11 | .286262 .1016634 2.82 0.005 .0860437 .4864802
_cons | 10.14478 64.69261 0.16 0.876 -117.2623 137.5519
-------------+----------------------------------------------------------------
sigma_u | 22.903426
sigma_e | 13.34097
rho | .74666305 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(78, 252) = 5.96 Prob > F = 0.0000

. estimates store fe

. xtreg Y x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11, re

Random-effects GLS regression Number of obs = 342
Group variable: Unternehmen Number of groups = 79

R-sq: Obs per group:
within = 0.6785 min = 1
between = 0.5220 avg = 4.3
overall = 0.6113 max = 5

Wald chi2(11) = 611.68
corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000

------------------------------------------------------------------------------
Y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x1 | .0017081 .012697 0.13 0.893 -.0231775 .0265937
x2 | 1.529129 1.906254 0.80 0.422 -2.20706 5.265318
x3 | -.3607013 .2628963 -1.37 0.170 -.8759687 .1545661
x4 | -6.831479 2.186577 -3.12 0.002 -11.11709 -2.545868
x5 | -4.372107 1.993001 -2.19 0.028 -8.278317 -.4658961
x6 | 10.26525 .5341722 19.22 0.000 9.218294 11.31221
x7 | -.0058791 .0054074 -1.09 0.277 -.0164774 .0047193
x8 | .2901557 .1817283 1.60 0.110 -.0660251 .6463366
x9 | -.2081798 .1323955 -1.57 0.116 -.4676702 .0513107
x10 | .1843618 .1283733 1.44 0.151 -.0672452 .4359688
x11 | .0867639 .079641 1.09 0.276 -.0693296 .2428575
_cons | 16.62166 31.65446 0.53 0.600 -45.41994 78.66325
-------------+----------------------------------------------------------------
sigma_u | 15.001195
sigma_e | 13.34097
rho | .55837761 (fraction of variance due to u_i)
------------------------------------------------------------------------------

. estimates store re

. hausman fe re

---- Coefficients ----
| (b) (B) (b-B) sqrt(diag(V_b-V_B))
| fe re Difference S.E.
-------------+----------------------------------------------------------------
x1 | .0121592 .0017081 .0104511 .0084179
x2 | 1.885169 1.529129 .3560401 4.956332
x3 | -.4471534 -.3607013 -.0864521 .0765263
x4 | -12.6464 -6.831479 -5.814919 2.548981
x5 | -3.154039 -4.372107 1.218068 .
x6 | 9.843865 10.26525 -.4213873 .221206
x7 | -.0105897 -.0058791 -.0047106 .
x8 | -.0909935 .2901557 -.3811492 .065753
x9 | .0042337 -.2081798 .2124134 .1216191
x10 | .2618862 .1843618 .0775244 .1009847
x11 | .286262 .0867639 .199498 .0631883
------------------------------------------------------------------------------
b = consistent under Ho and Ha; obtained from xtreg
B = inconsistent under Ha, efficient under Ho; obtained from xtreg

Test: Ho: difference in coefficients not systematic

chi2(11) = (b-B)'[(V_b-V_B)^(-1)](b-B)
= 76.16
Prob>chi2 = 0.0000
(V_b-V_B is not positive definite)

. xtset Unternehmen Jahr
panel variable: Unternehmen (unbalanced)
time variable: Jahr, 2014 to 2018, but with gaps
delta: 1 year

. xi: xtreg Y x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 i.Jahr, fe
i.Jahr _IJahr_2014-2018 (naturally coded; _IJahr_2014 omitted)

Fixed-effects (within) regression Number of obs = 342
Group variable: Unternehmen Number of groups = 79

R-sq: Obs per group:
within = 0.7509 min = 1
between = 0.4127 avg = 4.3
overall = 0.5609 max = 5

F(15,248) = 49.85
corr(u_i, Xb) = -0.1701 Prob > F = 0.0000

------------------------------------------------------------------------------
Y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x1 | .0091464 .0139301 0.66 0.512 -.01829 .0365829
x2 | -4.252209 4.960329 -0.86 0.392 -14.02195 5.517534
x3 | -.473511 .2530563 -1.87 0.062 -.9719246 .0249026
x4 | -12.73621 3.099359 -4.11 0.000 -18.84064 -6.63179
x5 | 11.2182 3.789075 2.96 0.003 3.755325 18.68107
x6 | 10.09346 .5504423 18.34 0.000 9.009325 11.1776
x7 | .0046431 .0086834 0.53 0.593 -.0124595 .0217458
x8 | .0455602 .2440602 0.19 0.852 -.4351348 .5262552
x9 | -.0980529 .1657255 -0.59 0.555 -.4244618 .228356
x10 | .2070221 .1506021 1.37 0.170 -.0896001 .5036444
x11 | .102173 .0971356 1.05 0.294 -.0891428 .2934889
_IJahr_2015 | 15.09593 3.162919 4.77 0.000 8.866322 21.32554
_IJahr_2016 | 21.15731 5.111523 4.14 0.000 11.08978 31.22485
_IJahr_2017 | 17.33039 4.970709 3.49 0.001 7.540202 27.12058
_IJahr_2018 | 23.44871 3.456689 6.78 0.000 16.6405 30.25692
_cons | 17.31959 66.04854 0.26 0.793 -112.768 147.4072
-------------+----------------------------------------------------------------
sigma_u | 21.567956
sigma_e | 12.179949
rho | .75819981 (fraction of variance due to u_i)
------------------------------------------------------------------------------

F test that all u_i=0: F(78, 248) = 6.59 Prob > F = 0.0000

. xi: regress Y x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 i.Unternehmen i.Jahr
i.Unternehmen _IUnternehm_1-79 (naturally coded; _IUnternehm_1 omitted)
i.Jahr _IJahr_2014-2018 (naturally coded; _IJahr_2014 omitted)

Source | SS df MS Number of obs = 342
-------------+---------------------------------- F(93, 248) = 23.07
Model | 318227.767 93 3421.80395 Prob > F = 0.0000
Residual | 36791.0876 248 148.35116 R-squared = 0.8964
-------------+---------------------------------- Adj R-squared = 0.8575
Total | 355018.854 341 1041.11101 Root MSE = 12.18

------------------------------------------------------------------------------
Y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x1 | .0091464 .0139301 0.66 0.512 -.01829 .0365829
x2 | -4.252209 4.960329 -0.86 0.392 -14.02195 5.517534
x3 | -.473511 .2530563 -1.87 0.062 -.9719246 .0249026
x4 | -12.73621 3.099359 -4.11 0.000 -18.84064 -6.63179
x5 | 11.2182 3.789075 2.96 0.003 3.755325 18.68107
x6 | 10.09346 .5504423 18.34 0.000 9.009325 11.1776
x7 | .0046431 .0086834 0.53 0.593 -.0124595 .0217458
x8 | .0455602 .2440602 0.19 0.852 -.4351348 .5262552
x9 | -.0980529 .1657255 -0.59 0.555 -.4244618 .228356
x10 | .2070221 .1506021 1.37 0.170 -.0896001 .5036444
x11 | .102173 .0971356 1.05 0.294 -.0891428 .2934889
_IUnternehm_2 | 1.575516 10.30654 0.15 0.879 -18.724 21.87503
_IUnternehm_3 | -39.06488 10.25417 -3.81 0.000 -59.26125 -18.86852
_IUnternehm_4 | -24.40019 12.7029 -1.92 0.056 -49.41951 .6191261
_IUnternehm_5 | -14.42787 11.04235 -1.31 0.193 -36.17662 7.320877
_IUnternehm_6 | -8.348222 9.779419 -0.85 0.394 -27.60953 10.91308
_IUnternehm_7 | 20.5798 15.03405 1.37 0.172 -9.030897 50.1905
_IUnternehm_8 | 2.41823 14.65319 0.17 0.869 -26.44234 31.2788
_IUnternehm_9 | 5.66785 16.76055 0.34 0.736 -27.34332 38.67902
_IUnternehm_10 | -16.14584 10.02271 -1.61 0.108 -35.88633 3.594639
_IUnternehm_11 | 17.81726 18.40685 0.97 0.334 -18.43642 54.07094
_IUnternehm_12 | 19.34285 14.61768 1.32 0.187 -9.447774 48.13348
_IUnternehm_13 | 20.54812 12.09701 1.70 0.091 -3.277862 44.3741
_IUnternehm_14 | 32.32243 12.44368 2.60 0.010 7.813661 56.83119
_IUnternehm_15 | -37.85379 12.83629 -2.95 0.003 -63.13584 -12.57174
_IUnternehm_16 | 46.11187 13.25584 3.48 0.001 20.00349 72.22025
_IUnternehm_17 | -6.344185 8.112828 -0.78 0.435 -22.32301 9.634643
_IUnternehm_18 | 2.288403 11.73014 0.20 0.845 -20.81499 25.3918
_IUnternehm_19 | 2.392787 11.70008 0.20 0.838 -20.65141 25.43699
_IUnternehm_20 | 5.753934 18.12912 0.32 0.751 -29.95275 41.46061
_IUnternehm_21 | -4.725286 11.67111 -0.40 0.686 -27.71241 18.26184
_IUnternehm_22 | -16.47211 15.21205 -1.08 0.280 -46.4334 13.48917
_IUnternehm_23 | -4.486927 12.41375 -0.36 0.718 -28.93675 19.9629
_IUnternehm_24 | 13.11413 18.38299 0.71 0.476 -23.09257 49.32082
_IUnternehm_25 | 9.684397 15.62584 0.62 0.536 -21.09188 40.46067
_IUnternehm_26 | 14.69772 15.2387 0.96 0.336 -15.31606 44.71149
_IUnternehm_27 | -6.59571 14.57482 -0.45 0.651 -35.30192 22.1105
_IUnternehm_28 | 5.690628 11.49149 0.50 0.621 -16.94274 28.32399
_IUnternehm_29 | 7.263189 10.90053 0.67 0.506 -14.20623 28.73261
_IUnternehm_30 | 54.27688 15.65619 3.47 0.001 23.44083 85.11292
_IUnternehm_31 | 1.799696 14.1221 0.13 0.899 -26.01485 29.61424
_IUnternehm_32 | 41.18111 11.57496 3.56 0.000 18.38334 63.97887
_IUnternehm_33 | 9.193567 10.87896 0.85 0.399 -12.23338 30.62051
_IUnternehm_34 | -18.78013 10.63879 -1.77 0.079 -39.73402 2.173762
_IUnternehm_35 | -3.696596 9.635246 -0.38 0.702 -22.67394 15.28075
_IUnternehm_36 | 7.51152 8.79678 0.85 0.394 -9.814405 24.83744
_IUnternehm_37 | 7.066402 14.56265 0.49 0.628 -21.61584 35.74864
_IUnternehm_38 | 34.83821 10.2329 3.40 0.001 14.68374 54.99269
_IUnternehm_39 | -10.7081 12.98113 -0.82 0.410 -36.27542 14.85923
_IUnternehm_40 | 13.68806 11.58929 1.18 0.239 -9.137929 36.51404
_IUnternehm_41 | 57.59317 17.35045 3.32 0.001 23.42015 91.76619
_IUnternehm_42 | -3.437701 13.36429 -0.26 0.797 -29.75967 22.88427
_IUnternehm_43 | -2.297358 8.450392 -0.27 0.786 -18.94105 14.34633
_IUnternehm_44 | 2.880239 10.46194 0.28 0.783 -17.72534 23.48582
_IUnternehm_45 | -18.96786 10.9294 -1.74 0.084 -40.49414 2.55842
_IUnternehm_46 | -14.54469 9.648038 -1.51 0.133 -33.54723 4.457851
_IUnternehm_47 | -24.71603 10.59782 -2.33 0.020 -45.58923 -3.842824
_IUnternehm_48 | 23.25981 11.90508 1.95 0.052 -.1881501 46.70778
_IUnternehm_49 | -6.297338 9.950548 -0.63 0.527 -25.89569 13.30102
_IUnternehm_50 | 18.34087 20.53314 0.89 0.373 -22.10071 58.78245
_IUnternehm_51 | -18.92622 15.42757 -1.23 0.221 -49.31199 11.45954
_IUnternehm_52 | -14.16451 9.746474 -1.45 0.147 -33.36092 5.031913
_IUnternehm_53 | -.7125818 9.647064 -0.07 0.941 -19.7132 18.28804
_IUnternehm_54 | 8.41392 10.01824 0.84 0.402 -11.31775 28.14559
_IUnternehm_55 | 26.42424 14.99984 1.76 0.079 -3.119073 55.96755
_IUnternehm_56 | -23.45135 9.192148 -2.55 0.011 -41.55599 -5.346725
_IUnternehm_57 | -15.05711 9.931668 -1.52 0.131 -34.61828 4.504062
_IUnternehm_58 | 57.27402 13.73893 4.17 0.000 30.21415 84.33388
_IUnternehm_59 | -.4294764 14.91609 -0.03 0.977 -29.80784 28.94889
_IUnternehm_60 | -15.16002 8.958061 -1.69 0.092 -32.8036 2.483558
_IUnternehm_61 | 17.21946 8.898878 1.94 0.054 -.3075541 34.74647
_IUnternehm_62 | -14.53266 10.48203 -1.39 0.167 -35.17782 6.112498
_IUnternehm_63 | -6.940407 10.67026 -0.65 0.516 -27.95629 14.07548
_IUnternehm_64 | -5.196539 15.06009 -0.35 0.730 -34.85853 24.46546
_IUnternehm_65 | 26.02929 19.90437 1.31 0.192 -13.17388 65.23246
_IUnternehm_66 | 14.60718 11.24314 1.30 0.195 -7.537036 36.75139
_IUnternehm_67 | -22.08857 8.733508 -2.53 0.012 -39.28987 -4.887264
_IUnternehm_68 | -7.113967 8.458266 -0.84 0.401 -23.77316 9.545228
_IUnternehm_69 | -17.87005 11.78165 -1.52 0.131 -41.07491 5.334804
_IUnternehm_70 | 52.5356 12.00522 4.38 0.000 28.89041 76.18079
_IUnternehm_71 | -34.25489 15.97117 -2.14 0.033 -65.71133 -2.798458
_IUnternehm_72 | -8.828511 12.80215 -0.69 0.491 -34.04332 16.3863
_IUnternehm_73 | 21.10142 14.70096 1.44 0.152 -7.853237 50.05607
_IUnternehm_74 | -20.18248 11.94858 -1.69 0.092 -43.71611 3.351144
_IUnternehm_75 | 12.24323 18.45948 0.66 0.508 -24.11412 48.60058
_IUnternehm_76 | -12.00339 15.06139 -0.80 0.426 -41.66793 17.66115
_IUnternehm_77 | 5.011427 12.52339 0.40 0.689 -19.65435 29.6772
_IUnternehm_78 | -29.8218 15.41626 -1.93 0.054 -60.18529 .5416937
_IUnternehm_79 | -24.57285 16.59229 -1.48 0.140 -57.25261 8.106912
_IJahr_2015 | 15.09593 3.162919 4.77 0.000 8.866322 21.32554
_IJahr_2016 | 21.15731 5.111523 4.14 0.000 11.08978 31.22485
_IJahr_2017 | 17.33039 4.970709 3.49 0.001 7.540202 27.12058
_IJahr_2018 | 23.44871 3.456689 6.78 0.000 16.6405 30.25692
_cons | 15.75827 61.05587 0.26 0.797 -104.4959 136.0124
------------------------------------------------------------------------------

. xi: xtreg Y x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 i.Jahr, fe vce(robust)
i.Jahr _IJahr_2014-2018 (naturally coded; _IJahr_2014 omitted)

Fixed-effects (within) regression Number of obs = 342
Group variable: Unternehmen Number of groups = 79

R-sq: Obs per group:
within = 0.7509 min = 1
between = 0.4127 avg = 4.3
overall = 0.5609 max = 5

F(15,78) = 37.55
corr(u_i, Xb) = -0.1701 Prob > F = 0.0000

(Std. Err. adjusted for 79 clusters in Unternehmen)

------------------------------------------------------------------------------
| Robust
Y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x1 | .0091464 .0133006 0.69 0.494 -.017333 .0356259
x2 | -4.252209 5.415393 -0.79 0.435 -15.03343 6.52901
x3 | -.473511 .3681172 -1.29 0.202 -1.206376 .2593541
x4 | -12.73621 3.07371 -4.14 0.000 -18.8555 -6.616927
x5 | 11.2182 4.267745 2.63 0.010 2.721767 19.71462
x6 | 10.09346 .9198481 10.97 0.000 8.262186 11.92474
x7 | .0046431 .008961 0.52 0.606 -.0131968 .0224831
x8 | .0455602 .2953716 0.15 0.878 -.5424795 .6335998
x9 | -.0980529 .1633973 -0.60 0.550 -.4233519 .2272461
x10 | .2070221 .1437574 1.44 0.154 -.0791769 .4932211
x11 | .102173 .0929156 1.10 0.275 -.0828077 .2871538
_IJahr_2015 | 15.09593 3.107805 4.86 0.000 8.908765 21.2831
_IJahr_2016 | 21.15731 5.616581 3.77 0.000 9.975561 32.33907
_IJahr_2017 | 17.33039 5.899157 2.94 0.004 5.586071 29.07471
_IJahr_2018 | 23.44871 3.764808 6.23 0.000 15.95355 30.94387
_cons | 17.31959 65.53958 0.26 0.792 -113.1597 147.7989
-------------+----------------------------------------------------------------
sigma_u | 21.567956
sigma_e | 12.179949
rho | .75819981 (fraction of variance due to u_i)
------------------------------------------------------------------------------

↧

brant test STATA 16

April 28, 2020, 3:03 am

Hi everyone

I can't run "brant, detail" in Stata 16. I have downloaded oparallel and have been trying to run brant after omodel logit, but this message appears: "the command brant is unrecognized r(199);"
Any ideas on how to solve this, please?
Thank you in advance
Ellie
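
A possible route, sketched under assumptions rather than confirmed for this setup: r(199) generally means Stata cannot find the command, and brant is distributed with the SPost13 package (spost13_ado) by Long and Freese, not with oparallel. It is also a postestimation command for official ologit, so it is not expected to run after omodel. The variable names y, x1, x2 and x3 below are placeholders and do not come from the post.

```stata
* brant is part of SPost13; search lists the package so it can be installed
search spost13_ado

* fit the model with official ologit (placeholder variable names)
ologit y x1 x2 x3

* Brant test of the parallel regression assumption, with per-equation detail
brant, detail

* alternative: oparallel (from SSC) also runs after ologit and reports
* a Brant test alongside several other tests of the same assumption
oparallel
```

If brant is still unrecognized after installation, "which brant" shows whether Stata can locate the ado-file on its search path.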

↧

Randomization test and descriptive statistics

April 28, 2020, 3:38 am

Hi everyone!

I am trying to replicate a balance test and descriptive statistics table similar to the one in the attached photo (but with three treatment groups in my case). I have been googling for a command that does this but haven't found one, so I was wondering if any of you have a tip. I have Stata 16 and I need to convert the table into LaTeX code.

Thanks a lot!

↧