Suest after xtivreg2

November 6, 2015, 8:11 am

Dear All,

I need to compare coefficients from 4 different regressions post xtivreg2. I am running:

quietly eststo: xtivreg2 cog x (vol=iv1) if (popfin==1), i(id) fe first
quietly eststo: xtivreg2 cog x (vol=iv2) if (popfin==1), i(id) fe first
quietly eststo: xtivreg2 cog x (vol=iv3) if (popfin==1), i(id) fe first
quietly eststo: xtivreg2 cog x (vol=iv4) if (popfin==1), i(id) fe first

I wish to compare the coefficients on vol from the four regressions above. I tried the following:

suest est1 est2 est3 est4, vce (cluster id)

but got the following error:
unable to generate scores for model est1
suest requires that predict allow the score option
r(322);

From researching further, it seems (I am not 100% sure on this) that suest requires predicted scores and xtivreg2 doesn't allow scores. So is there an alternative way to compare the coefficient on vol across the 4 estimated regressions above? Many thanks.

Sincerely,
Sumedha Gupta.

↧

topic test

November 6, 2015, 8:31 am

Ignore this test.

↧

Simulating multivariate normal distribution with mean of e(B) and variance/covariance e(V)

November 6, 2015, 8:41 am

Hi everyone,

I am working on implementing a parametric bootstrap simulation where I have e(b), a vector with several parameter estimates from a regression model, and e(V), a variance/covariance matrix. The book I'm using says that "each entry in the simulated sample is a random draw from a multivariate normal distribution with mean [e(b)] and variance/covariance [e(V)]." I've subbed in Stata matrix names where appropriate.

How would I implement this type of analysis in Stata? The bootstrap command seems inappropriate in this scenario since it is non-parametric.
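
One possible approach, offered as a minimal sketch rather than a definitive answer, is drawnorm, which draws variables from a multivariate normal distribution given a mean vector and a covariance matrix. The regression on the shipped auto dataset, the 1,000 draws, the seed, and the names of the new variables are all assumptions made for illustration.

```stata
* Illustrative model only: any estimation command that leaves e(b) and e(V) would do
sysuse auto, clear
regress price mpg weight

* Copy the estimates out of e() before the data in memory are replaced
matrix b = e(b)
matrix V = e(V)

* Replace the data with 1,000 draws from N(b, V).
* One new variable per coefficient, in the column order of e(b): mpg, weight, _cons
clear
set seed 12345
drawnorm sim_mpg sim_weight sim_cons, n(1000) means(b) cov(V)

* Check: simulated means and covariances should be close to b and V
summarize
correlate sim_mpg sim_weight sim_cons, covariance
```

Each row of the resulting dataset is one simulated coefficient vector, which can then be fed into whatever quantity the parametric bootstrap is meant to evaluate.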

↧

Trouble determining which variables have significant interactions for an imputation model

November 6, 2015, 8:43 am

Dear All,

I would be very grateful for your help. I am developing a chained imputation model using 14 categorical and continuous variables. I am imputing 6 variables with 2-20% missing data. I have read the Stata 13 manual and understand the Stata script for specifying the model. I am having trouble working out which variables have significant interactions between them. In the Stata imputation manual and recommended papers on multiple imputation there are many sections on how to put interactions into models but I can't find any advice on how to decide which variables have significant interactions in the first place and I am getting rather frustrated with it!

Many thanks for your time

Andrew Rosser

↧

insignificant interaction term - should we still look at the marginal effects?

November 6, 2015, 8:45 am

Dear Statalist,

I have a question that is maybe not so much about Stata as it is about statistics in general.

I run a multivariate regression model (prais) and include an interaction between two continuous variables. Let's call them X and Z (c.X#c.Z).
The output tells me the interaction term c.X#c.Z is not significant: P>|t| = 0.470.

Can I/should I proclaim that there is no interaction, or should I still look at the marginal effects?

When I look at the margins, I get the following:
margins, dydx(X) at( Z=(0(1)10)) vsquish

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
X            |
         _at |
          1  |  -.0272084   .0156325    -1.74   0.082    -.0578475    .0034307
          2  |  -.0252041   .0131056    -1.92   0.054    -.0508906    .0004824
          3  |  -.0231998   .0106948    -2.17   0.030    -.0441613   -.0022383
          4  |  -.0211955   .0084996    -2.49   0.013    -.0378544   -.0045366
          5  |  -.0191912   .0067341    -2.85   0.004    -.0323898   -.0059926
          6  |  -.0171869   .0058046    -2.96   0.003    -.0285638     -.00581
          7  |  -.0151826   .0061058    -2.49   0.013    -.0271497   -.0032156
          8  |  -.0131783   .0074905    -1.76   0.079    -.0278595    .0015028
          9  |  -.0111741   .0094961    -1.18   0.239    -.0297861     .007438
         10  |  -.0091698   .0118104    -0.78   0.438    -.0323177    .0139782
         11  |  -.0071655   .0142841    -0.50   0.616    -.0351618    .0208308
------------------------------------------------------------------------------

Or should I say, based on this, that the two variables actually interact: that at some values of Z (rows 3-8 of the table, i.e. Z = 2 to 7), the effect of X on Y is actually moderated/influenced by Z?
I thought that after seeing the (insignificant) interaction term in the output, there was no need to investigate further.

Would the conclusion/interpretation change if _all 11_ rows (not just 3-8 but 1-11) were significant, while the interaction term in the regression remained insignificant?

Thank you,

Alex

↧

Stepwise ZIP and ZINB models

November 6, 2015, 12:45 pm

Greetings everyone,

I wanted to ask if there is a way to perform stepwise (or, for that matter, any other variable selection method) zero-inflated Poisson or negative binomial regressions, as the stepwise command does not support zip and zinb.

Thank you in advance for any information and advice,
Thanos

↧

Combining 5 dataset into one

November 6, 2015, 1:02 pm

Hi there!
I am having trouble trying to xtset my dta file in Stata. I have five home countries, and each country has data for 70 host countries for the years 2001-2012. After I stacked all the datasets (5 home countries) and ran xtset, the error returned was "repeated time values within panel", r(451). An example of the data:

year	home country	host country	gdp	export	import
2001	1	1	44	55	45
2002	1	2	45	56	34
2003	1	3	45	66	43
2004	1	4	54	56	66
2001	2	1	67	98	76
2002	2	2	55	76	53
2003	2	3	67	35	13
2004	2	4	89	47	66
2001	3	1	45	56	77
2002	3	2	34	56	87
2003	3	3	7	44	67
2004	3	4	78	67	88

How can I arrange the data so that I can xtset country year?

Thank you so much
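
A minimal sketch of one common way around r(451), assuming the panel unit is the home-host pair rather than a single country. The variable names home, host and year are assumptions and would need to match the actual data.

```stata
* Assumed variable names: home, host, year
* Treat each home-host pair as one panel unit
egen pair_id = group(home host)

* Confirm that pair_id and year uniquely identify observations;
* if this fails, inspect genuine duplicates with: duplicates report pair_id year
isid pair_id year

xtset pair_id year
```

With five home countries stacked, neither country variable alone identifies a panel, which is why the same year repeats within a panel; the pair identifier removes that repetition.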

↧

Matching Pairs and Indicating Common Characters

November 6, 2015, 1:47 pm

Dear All,

I am encountering a problem that I have found really hard to solve, and I am writing here to seek some advice.
There are two datasets in my analysis. The first one contains interstate conflict data. The data look like this (for example):

Conflict ID	Involved Countries	Side A or B
1	USA	A
1	Canada	B
1	Australia	A
1	China	B
1	Russia	B
2	India	A
2	Thailand	B
2	Bangladesh	B

(A: Conflict starter. B: The other side of the conflict)
My aim is to pair all the A and B sides within each Conflict ID. Is there a quick way to pair them?

The other dataset includes certain information about each country.

Country	Characteristics
USA	HAHAHA
USA	LALALA
USA	WAWAWA
Canada	LALALA
Canada	NONONO
China	HAHAHA
China	YAYAYA
Australia	HAHAHA
Russia	YAYAYA

My aim for this dataset is to identify whether two countries share the same characteristics.

Finally, I would like to combine these two datasets to create a new dataset of country pairs with
a dummy indicating whether they share a characteristic. The ideal data structure looks like:

Conflict ID	Pair ID	Country	Side	Character Dummy
1	1	USA	A	1
1	1	Canada	B	1
1	2	USA	A	1
1	2	China	B	1
1	3	USA	A	0
1	3	Russia	B	0

I understand this is quite a long and somewhat confusing question! I really appreciate any help!
If anything is unclear, I will be more than happy to explain!

Thank you in advance for your kind help! I look forward to hearing from you!

Best regards
Long
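
The pairing step could be sketched with joinby, which forms all pairwise combinations within groups. This covers only the A-B pairing within each conflict, not the shared-characteristic dummy, and the variable names (conflict_id, country, side) are assumptions based on the example table.

```stata
clear
input byte conflict_id str12 country str1 side
1 "USA"        "A"
1 "Canada"     "B"
1 "Australia"  "A"
1 "China"      "B"
1 "Russia"     "B"
2 "India"      "A"
2 "Thailand"   "B"
2 "Bangladesh" "B"
end

* Save the B side under its own variable name
preserve
keep if side == "B"
rename country country_b
drop side
tempfile bside
save `bside'
restore

* Keep the A side and form every A-B combination within each conflict
keep if side == "A"
rename country country_a
drop side
joinby conflict_id using `bside'

* Number the pairs within each conflict
bysort conflict_id (country_a country_b): gen pair_id = _n
list, sepby(conflict_id)
```

This produces one row per pair (wide layout) rather than two rows per pair as in the ideal structure above; the wide layout is usually easier to merge with country-level characteristics afterwards.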

↧

Counting number of times an observation appears within a group

November 6, 2015, 2:42 pm

Hi all, I am a new Stata user working with a dataset of ~9000 observations and two main variables of interest: ir_no and ir_no_1. I am trying to count the number of times a given value of ir_no_1 appears for each ir_no. Here is a mock table of what the data look like, with the variable I am trying to make ("new_var").

ir_no	ir_no_1	new_var
111	abc22	2
111	abc22	2
111	abc11	1
222	abc33	2
222	abc22	1
222	abc33	2

I tried the following code, but something must be incorrect since it gives me the total number of observations per group:

gen calc = .
foreach i in ir_no_1 {
    by ir_no: egen calc2 = count(ir_no_1) if ir_no_1 == `i'
    replace calc = calc2 if ir_no_1 == `i'
    drop calc2
    }

Is there an easy way to do this? I looked around the forums for a quick solution, but have been unsuccessful. Thanks in advance for any advice.
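
A minimal sketch using by-groups and _N, with the mock data from the table above entered by hand (ir_no_1 is assumed to be a string variable). The loop above appears to run only once, over the literal word ir_no_1, so its if condition compares the variable with itself and is always true, which would explain the per-group totals.

```stata
clear
input int ir_no str5 ir_no_1
111 "abc22"
111 "abc22"
111 "abc11"
222 "abc33"
222 "abc22"
222 "abc33"
end

* Within each ir_no / ir_no_1 combination, _N is the number of observations
bysort ir_no ir_no_1: gen new_var = _N

list, sepby(ir_no)
```

Note that bysort changes the sort order of the data; if the original order matters, an observation counter can be generated beforehand and used to re-sort.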

↧

help fixing my code

November 6, 2015, 11:42 pm

sort visitlink daystoevent
by visitlink: egen transfer_admit=1 if pt_int=1 & daystoevent>min(daystoevent)

visitlink is the patient ID
daystoevent is the number of days, which helps identify the chronological order of admissions
los is the length of hospital stay

I have noticed that some patients were admitted to one hospital and transferred to another hospital (the two have different IDs), and I have identified the patients of interest (pt_int=1) based on the presence of some variables which were positive in both admissions. I am trying to figure out code that would help identify the observation with the smallest daystoevent.

daystoevent los visitlink pt_int
15146 4 1888 1 <--This is the index admission
15150 17 1888 1 <--This is the transfer admission for the same patient; note that daystoevent + los in the previous observation equals daystoevent here
13275 7 6334 1
18034 8 6676 1
13187 1 6687 0 <-- (previous admission to the index admission)
13244 4 6687 1
17238 30 9647 1
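
A minimal sketch of one way to flag the index and transfer admissions, using the example rows above. The two lines of code at the top of the post would not run as written: egen needs a function on the right-hand side, and equality tests use == rather than =. The flag names index_admit and transfer_admit, and the rule that the earliest pt_int == 1 admission per patient is the index admission, are assumptions.

```stata
clear
input long daystoevent int los long visitlink byte pt_int
15146  4 1888 1
15150 17 1888 1
13275  7 6334 1
18034  8 6676 1
13187  1 6687 0
13244  4 6687 1
17238 30 9647 1
end

* Sort admissions chronologically within patient and pt_int group;
* the first pt_int == 1 admission is flagged as the index admission
bysort visitlink pt_int (daystoevent): gen byte index_admit = (pt_int == 1 & _n == 1)

* Any later pt_int == 1 admission for the same patient is flagged as a transfer
by visitlink pt_int: gen byte transfer_admit = (pt_int == 1 & _n > 1)

list, sepby(visitlink)
```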

↧

non-parametric propensity score matching

November 7, 2015, 12:28 am

Can propensity scores generated through non-parametric methods be used for matching methods such as radius matching or nearest-neighbour matching? I have seen non-parametric propensity scores used for reweighting but not for matching.

↧

Delete each 10th observation:panel data

November 7, 2015, 2:09 am

Dear Statalist member,

I have already asked this question but have not received a response. Probably it was lost...

Let me ask again, hoping for some help, as I have been struggling to implement this for several weeks.

I have panel data for 100 stocks. I have generated variable Trade_1. It takes value 1 for buy signal and -1 for sell.

My next step is to generate Trade_2. It is almost the same as Trade_1. The only difference is that I need to replace any signal with missing (.) for the following 10 days.
In other words, after a first buy/sell signal one should wait for the next 10 days and ignore any signals within this period.

I tried to do the following but it is wrong and does not work....

bysort _j (day_1): replace trade_2=1 if [f1.trade_1 +f2.trade_1 +f3.trade_1 +f4.trade_1 +f5.trade_1 +f6.trade_1 +f7.trade_1 +f8.trade_1 +f9.trade_1 +f10.trade_1 ]>0
bysort _j (day_1): replace trade_2=-1 if [f1.trade_1 +f2.trade_1 +f3.trade_1 +f4.trade_1 +f5.trade_1 +f6.trade_1 +f7.trade_1 +f8.trade_1 +f9.trade_1 +f10.trade_1 ]<0

Could you please suggest how to delete each subsequent observation after the first signal?

Thank you.
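
A minimal sketch of one way to build such a variable. It assumes that _j identifies the stock, that day_1 is a numeric day counter, and that trade_1 is 1 for buy, -1 for sell, and missing when there is no signal; last_sig is a hypothetical helper variable. It relies on replace working through the observations in order, so each row can see the value already assigned to the row before it.

```stata
* Day of the most recent accepted signal, filled in sequentially below
bysort _j (day_1): gen last_sig = .

* Accept a signal if there is no earlier accepted signal for this stock,
* or if the last accepted one is more than 10 days old;
* otherwise carry the previous accepted day forward
by _j: replace last_sig = cond(!missing(trade_1) & ///
    (missing(last_sig[_n-1]) | day_1 - last_sig[_n-1] > 10), ///
    day_1, last_sig[_n-1])

* Keep a signal only on the day it was accepted; all others become missing
gen trade_2 = trade_1 if last_sig == day_1
```

If the waiting period should be counted in trading days rather than calendar days, day_1 could be replaced by a within-stock observation counter.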

↧

Country variables and fixed effects

November 7, 2015, 4:06 am

Hi, members! I am using cross-sectional data from a survey to study the effects of regime types on work ethic. My variables of interest are the regime type and individual controls (such as sex, age, etc.). When I create "country" dummies to use as fixed effects, Stata omits some countries because of collinearity (which I believe is with the regime type). Can anyone explain what I should do in order to control for unobserved heterogeneity?

↧

Placebo tests

November 7, 2015, 5:53 am

Hi everyone,

I have a question on how to run a placebo test in Stata.
I have read the following entry from the Stata Forum, but I am unsure whether the same framework applies to my case.

I am running a model to explain the role of experience on adoption of a strategy.
I have several control variables on the RHS, FE, and I included past decisions on the strategy to proxy the importance of DIRECT experience.
I also include past decisions on the strategy of neighbors to proxy the importance of INDIRECT experience
So, the basic model I am estimating is as follows :

Y_it = Y_it-1 + Y_jt-1 + Controls + Fixed Effects with i different from j

I would like to run a PLACEBO test to show that INDIRECT experience is really influencing adoption, but it is unclear if I am proceeding in a correct way.
I ran the following regression:

Y_it = Y_it-1 + Y_jt-1 + Controls + Fixed Effects + Y_it+1 + Y_jt+1 with i different from j

If INDIRECT experience at time t-1 matters, I should find that Y_jt+1 is not significant. Correct?

However, I believe the same cannot be done for the DIRECT experience (Y_it+1).
Being an endogenous choice, I found that Y_it+1 is significant.
But this seems reasonable, because if I adopt the strategy at time t (because I did so in t-1, so that experience matters), I am also likely to adopt in t+1.

Am I correct in this?

Suggestions/comments would be extremely appreciated.

Many thanks

Fabio

↧

How to use spmat command on panel data?

November 7, 2015, 7:17 am

Hello Statalist users,

I am trying to create a spatial weighting-matrix based on the inverse distance with spmat command:

"spmat idistance W x y, id(id) dfunction(dhaversine)normalize(spectral)"

As a result I get the message "Two or more observations have the same coordinates", r(498). The reason is probably that my dataset contains data on the same country for various years.
Is there any way around this problem? Or is there another command that works on panel data?

Thank you for your help!
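
One workaround that is sometimes used, sketched here under the assumption that each id has the same x and y coordinates in every year, is to build the matrix from a single row per unit and then restore the full panel. Whether the resulting N x N matrix can be used directly depends on the panel estimator applied afterwards, which is not covered here.

```stata
* Assumes coordinates x and y are constant over time within id
preserve
bysort id: keep if _n == 1     // one row per country
spmat idistance W x y, id(id) dfunction(dhaversine) normalize(spectral)
restore
```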

↧

generate histograms with descriptive statistics

November 7, 2015, 8:22 am

Hi,

I have a question about the histogram command in Stata which I can't solve by googling or reading the Stata help file: is it possible to create a histogram with a box inside the plot showing statistics for the variable of interest, such as the mean, median, standard deviation, skewness, and kurtosis? If there is such an option, what would the code look like?

Thank you very much!

Cheers,

Kurt
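
A minimal sketch of one way this might be done: collect the statistics with summarize, detail and place them with the text() graph option. The auto dataset, the variable price, the number formats, and the coordinates of the box (y = 20, x = 10000) are all assumptions and would need adjusting to the data at hand.

```stata
sysuse auto, clear
quietly summarize price, detail

* Format the stored results as strings for display
local mean = string(r(mean), "%9.1f")
local med  = string(r(p50), "%9.1f")
local sd   = string(r(sd), "%9.1f")
local skew = string(r(skewness), "%9.2f")
local kurt = string(r(kurtosis), "%9.2f")

* text(y x "line 1" "line 2" ...) places a multi-line label at (x, y);
* box draws a border around it
histogram price, frequency ///
    text(20 10000 "Mean = `mean'" "Median = `med'" "SD = `sd'" ///
         "Skewness = `skew'" "Kurtosis = `kurt'", ///
         placement(e) justification(left) box margin(small))
```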

↧

Panel data cointegration with xtwest

November 7, 2015, 9:11 am

Hi all.

When I run the xtwest command on my panel data, only some of the tests show that there is cointegration.
When I use the "bootstrap" option, the significance of the cointegration tests is stronger. Should I use this option?

Thank you!

↧

Comparing regression coefficients from separate logistic regression models

November 7, 2015, 1:13 pm

I am conducting an analysis using annual cross-sectional data of doctor office visits to assess trends in prescriptions over time. I have modeled the trends in prescriptions by using logistic regression where the outcome is whether or not the medication was prescribed at the visit and the predictor is time. I am essentially examining trends in continuing and new prescriptions for the medication, and have an estimate of trends for each over time (e.g., I have two odds ratios for time each from separate models). I would like to compare the trends for the new and continuing prescriptions but I can't figure out a way to compare these trends because the odds ratios for time are each in separate models. Is there a way I can compare the odds ratio for time in the two models to see if there are differences in the trends?
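
One approach that might fit is suest, which combines estimates from separate models so that coefficients can be tested across them. The sketch below uses the auto dataset with two stand-in binary outcomes and mpg standing in for time; these names are assumptions, and survey-weighted data would need the corresponding svy setup, which is not shown.

```stata
sysuse auto, clear
gen byte highprice = price > 6000   // second stand-in binary outcome

logit foreign mpg
estimates store m1

logit highprice mpg
estimates store m2

* Combine the two sets of estimates with a joint covariance matrix
suest m1 m2

* Test equality of the mpg coefficients (log odds ratios) across models;
* equation names follow the pattern storedname_depvar shown in the suest output
test [m1_foreign]mpg = [m2_highprice]mpg
```

Because the test is on the coefficient scale, equality of the coefficients is equivalent to equality of the odds ratios.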

↧

Interpretation of Multinomial Logit Model

November 7, 2015, 1:50 pm

Dear Folks,

I am running a multinomial logit model for my research. I am creating categorical variables (dummies) for industries and for advisors.

First of all, how do we calculate the probabilities? Most textbooks calculate them by hand, but do newer versions of Stata give us the probabilities straight away for us to interpret? If so, how do we interpret them? I used this command for the marginal effects:
margins, dydx(*) atmeans predict(pr outcome(2))

My dependent variable is the choice of performance measure to be used: (Outcome 1) ROA exclusively, (Outcome 2) ROE exclusively, (Outcome 3) ROE & ROA jointly, and (Outcome 4) neither ROE nor ROA.

The X variables consist of the control variables market capitalization, volatility in ROA, volatility in ROE, industry dummies, and some advisor dummies.

margins, dydx(*) atmeans predict(pr outcome(2))

Variable	dy/dx	Std. Err.	z	P>|z|	[95% Conf.	Interval]
return on asset volatility (ROAV)	-0.04423	0.026128	-1.69	0.09	-0.09544	0.006978
Return on equity volatility (ROEV)	-0.32427	0.19455	-1.67	0.096	-0.70558	0.057037
board committee lb	0.097636	0.071001	1.38	0.169	-0.04152	0.236794
nominatee committee % lnc	-0.00397	0.001804	-2.2	0.028	-0.00751	-0.00043
Leverage lev	-0.00052	0.000323	-1.6	0.109	-0.00115	0.000116
Price to book ptb	0.00661	0.003876	1.71	0.088	-0.00099	0.014206
Market Capitalization lmc	-0.04825	0.011329	-4.26	0	-0.07046	-0.02605
Bain (Dummy Advisor)	0.095134	0.070328	1.35	0.176	-0.04271	0.232973
Mckinsey (Dummy Advisor)	0.250295	0.058871	4.25	0	0.13491	0.36568
BG (Dummy Advisor)	0.121386	0.074217	1.64	0.102	-0.02408	0.266848
Towers (Dummy Advisor)	0.118271	0.059905	1.97	0.048	0.000861	0.235682
Mercer (Dummy Advisor)	0.591537	0.135875	4.35	0	0.325226	0.857848
Pwc (Dummy Advisor)	0.119648	0.064528	1.85	0.064	-0.00683	0.246121
Food Service Industry (Dummy)	0.046763	0.03126	1.5	0.135	-0.01451	0.108032
Customer Service Industry (Dummy)	0.047415	0.017949	2.64	0.008	0.012236	0.082594
Car's Industry (Dummy)	-0.20241	0.046227	-4.38	0	-0.29301	-0.11181
General Retailers (Dummy)	0.117296	0.02383	4.92	0	0.070591	0.164001
Aerospace Industry (Dummy)	0.084182	0.018928	4.45	0	0.047084	0.12128
Mining Industry (Dummy)	-0.16032	0.027304	-5.87	0	-0.21383	-0.1068
Agriculture Industry (Dummy)	-0.14285	0.022954	-6.22	0	-0.18784	-0.09786
Food court Industry (Dummy)	-0.13902	0.021214	-6.55	0	-0.1806	-0.09744

global ylist
global xlist roev roav lb lnc lev pth lmc Bain (dummy advisor) Mckinsey (dummy advisor ) BG (dummy advisor) Industries dummy)..... etc
* Multinomial logit model with base outcome the most frequent alternative
mlogit $ylist $xlist

margins, dydx(*) atmeans predict(pr outcome(1))
margins, dydx(*) atmeans predict(pr outcome(2))
margins, dydx(*) atmeans predict(pr outcome(3))
margins, dydx(*) atmeans predict(pr outcome(4))

How does the interpretation of the Bain advisor dummy work? Is it relative to all other advisors? Do we have to find the probabilities ourselves, or does Stata calculate them for us?
I also think we can include industry and time effects together?

Here goes the code

Thanks,

↧

Paired Regression (Possibly with IV)

November 7, 2015, 2:27 pm

Dear All,

I am writing to ask an econometrics question. I am running a regression on paired observations
with some paired covariates (characteristics shared by the pair)
and individual-specific covariates (characteristics not shared).

The data looks like this:
(very similar to my previous posting on matching in case you happened to see it)

Pair ID	Country	GINI	Same Character Dummy	GDP	Pop	Dep. Var
1	USA	0.2	1	100	100	5
1	China	0.5	1	80	500	6
2	USA	0.2	0	100	100	3
2	Russia	0.3	0	60	200	2
3	China	0.5	0	80	500	5
3	India	0.5	0	50	300	2
...

(GINI as my key independent variable)

With the unit of observation being a country pair, the econometrics model in my mind is:
DEPVAR_p = a + GINI_p + DUMMY_p + GINI_p*DUMMY_p + GDP_p + POP_p + error_p
(p stands for pairs)

But I do not know:
1) Is the model econometrically right?
2) How can I implement it in Stata?
Can it just be a normal OLS regression with robust standard errors clustered at the pair level?
e.g. reg DEPVAR GINI DUMMY GINI_DUMMY GDP POP, cluster(PAIR_ID)
Or is there some special command for this particular econometric model?

Intuitively, the paired regression should be quite different from the simple OLS at country level,
but I really do not know what the proper model should be.

The follow-up question is related to the instrumental variables approach.
If I have a valid instrumental variable for GINI, how can I implement it in the paired regression?
Is it the same as the usual procedure,
i.e. predict in the first stage and use the predicted value in the 2nd stage?

If there is anything that is unclear, I am very happy to clarify!
Thank you very much in advance for your kind help! I look forward to hearing from you!

Best regards
Long
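
On the instrumental variables part, a minimal sketch under stated assumptions: running the first stage by hand and plugging the predicted values into a second regression gives the 2SLS point estimates but standard errors that are not valid, so a command that does both stages jointly, such as ivregress, is usually preferable. The variable names below, including the instrument z, are hypothetical, and treating the interaction with GINI as endogenous (instrumented by z times the dummy) is an assumption about the model.

```stata
* Hypothetical names: depvar, gini, dummy, gdp, pop, pair_id, instrument z
gen gini_dummy = gini * dummy    // interaction of the endogenous regressor
gen z_dummy    = z * dummy       // matching interaction of the instrument

* Both gini and its interaction are treated as endogenous;
* standard errors are clustered at the pair level
ivregress 2sls depvar dummy gdp pop (gini gini_dummy = z z_dummy), ///
    vce(cluster pair_id)

* First-stage diagnostics for instrument strength
estat firststage
```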

↧