Channel: Statalist
Viewing all 73253 articles

Editing Marginsplot Labels

Hi, I am pretty new to Stata and I am running a logistic regression.

I have followed some code I found in several sources and I was able to make a plot of the marginal effects of my independent variables on the predicted probability of my dependent variable being 1.

However, the plot labels for my independent variables are all messed up. Could someone please help me set this up? Below, I detail the case and attach the plot.

I have 25 independent variables (predictors). All of them are statistically significant. I wrote the following code (variable names and plot titles are in Portuguese):

logit NAO_DESISTENTE i.IN_AMOSTRA_ENEM i.PUBLICA i.IN_UNIVERSIDADE i.IN_LICENCIATURA i.IN_NOTURNO FEMININO ///
c.IDADE_INGRESSO i.PROGRAMA i.ENG_CIV i.ENG_ELE i.ENG_MEC i.ENG_QUI i.ENG_PRO i.ENG_AMB i.ENG_FLO i.ENG_OUT i.ARQUIT i.BIOLOG i.QUIMIC ///
i.MATEMA i.COMPUT i.TEC_INF i.TEC_IND, or

margins, dydx(*) asobserved
marginsplot, horizontal allxlabels recast(scatter) xline(0) title("Não desistência do curso, CTEM") xtitle("Efeito marginal sobre a probabilidade predita" "de não desistir do curso")


I obtained a good chart with points for all 25 predictors. But the labels are too big, are not aligned with the corresponding dots, and I only got labels for 4 predictors. How can I make the labels for all 25 predictors fit in the chart?

[Attached plot not preserved]


Thank you very much in advance,

Luiz

Multiple Imputation and Ordinal Logistic Regression

Hello everyone,

This is my first time posting to the Statalist after years of viewing.

I'm trying to conduct an ordinal logistic regression analysis on a multiply imputed dataset. More specifically, I want to test the proportional odds assumption. I have used the "omodel" command in the past to do this. However, in my imputed dataset, I can only use "omodel" with the "mi xeq" prefix, which returns results separately for each imputation (30 in total for me). Is there a way to get a combined test across all imputations?

Any help would be greatly appreciated.

Sincerely,

-Kevin Kovach

Obtaining Marginal Effects after gllamm command...

Hello,
I am estimating a generalized linear mixed model (gllamm command) for a sectoral choice model using the probit link. I would like to estimate the corresponding marginal effects from the model after this command:
xi:gllamm emp_dum i.marr_dum i.educ2 age age2 i.children hhsize i.urban i.zone i.mov_dum i.skill_occup if gender==1, i(indiv) family(binomial) link(probit) nip(5) adapt

I already tried the mfx command after gllamm, but the dydx values obtained from mfx are exactly the same as the coefficients from the original gllamm model.

Is there a way to extract the marginal effects after gllamm? Could the gllamm estimates themselves be the marginal effects, given that mfx returned exactly the same values?

Thanks in advance!

Ikechukwu

lagged value and sub-samples

Hello,
I'm using panel data fixed effects in my dissertation, with lagged values for the independent variables. I would like to know whether I also have to use the lagged value of a dummy variable that I will use to create sub-samples (and also as an interaction term).

Thanks

wrong median calculation because of decimal numbers in STATA

Hi,

I have a problem with the median value calculated by Stata. I give an example below:

An observation in my Excel sheet is 7.12037022820472, and this is also the median value of the sample. When I import the Excel sheet into Stata, Stata stores this value as 7.120370228205 and calculates the median value of the sample as 7.1203704. So when I write an expression to flag the observations that equal the median value of the sample, Stata gives a wrong result because 7.120370228205 is not equal to 7.1203704. How can I fix this problem?
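
A likely cause is storage precision rather than the decimals as such: `egen` (like `generate`) creates a `float` variable by default, and a float holds only about 7 significant digits, so a median stored as a float will not compare equal to the original `double` value. A minimal sketch, assuming the imported variable is called `x` (a hypothetical name) and that the median was created with `egen`:

```stata
* x is a placeholder name for the variable imported from Excel (stored as double)
egen double med_x = median(x)        // keep the median in double precision
gen byte at_median = (x == med_x)    // equality is now tested at full precision

* alternative, if the median already exists as a float variable named med_f:
* gen byte at_median = (float(x) == med_f)
```

The second form rounds `x` to float precision before comparing, which is useful when the float variable cannot easily be recreated.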

Replacing numbers of ascending and descending order


Hi all,

I have a simple question and data in the following form:

clear
input str9 team float action float time
"R" . 1
"R" . 2
"R" . 3
"R" . 4
"R" . 7
"R" 0 8
"R" . 9
"R" . 13
"R" . 25
"Z" . 2
"Z" . 3
"Z" . 4
"Z" . 6
"Z" . 7
"Z" 0 9
"Z" . 12
"Z" . 14
"Z" . 15
end

What I want to do is replace the missing values with ascending and descending values counted from the zero within each team, so that the data then look like this:

clear
input str9 team float action float time
"R" -5 1
"R" -4 2
"R" -3 3
"R" -2 4
"R" -1 7
"R" 0 8
"R" 1 9
"R" 2 13
"R" 3 25
"Z" -5 2
"Z" -4 3
"Z" -3 4
"Z" -2 6
"Z" -1 7
"Z" 0 9
"Z" 1 12
"Z" 2 14
"Z" 3 15
end

Many thanks for your help!

Philip
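
One possible approach, sketched under the assumption that each team has exactly one observation with `action == 0` and that the order is defined by `time`: number the observations within team, find the position of the zero, and take the difference. The helper variables `pos` and `zero_pos` are names introduced here for illustration.

```stata
* position of each observation within team, ordered by time
bysort team (time): gen long pos = _n

* position of the row where action == 0, spread to all rows of the team
* (cond() returns missing elsewhere, and max() ignores missing values)
egen long zero_pos = max(cond(action == 0, pos, .)), by(team)

* signed distance from the zero row: negative before it, positive after it
replace action = pos - zero_pos
drop pos zero_pos
```

With the example data, team "R" has its zero in the 6th row, so the result runs from -5 to 3, matching the desired output.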

Applying random intercepts and random slopes to logit multi level

Hi,

I am currently working on the analysis of some conjoint data. I want to fit a multilevel logit model with random intercepts and random slopes (each individual (level 2) delivered 26 observations (level 1)). The command that I am using is the following:

xtmelogit dependent_Variable x1 x2 ///
x3 x4 x5 x6 x7 ///
x8 x9 x10 x11 ///
x12 x13 x14 x15 x16 ///
|| respondent_ID: x1 x2 ///
x3 x4 x5 x6 x7 ///
x8 x9 x10 x11 ///
x12 x13 x14 x15 x16, cov(uns)

Unfortunately, this takes ages and I cannot finish the analysis (it ran for around a day without any visible progress).

Am I doing something wrong? Or is this actually the problem with random intercepts and random slopes? Do you have any recommendations?

The dependent_Variable is a decision (coded 0 or 1), the respondent_ID is the top-level variable (where the observations are "clustered"), and the x(n) variables are attributes of the presented option, where the respondent had to make his decision.

Happy for any help. If you have problems understanding my questions, please let me know!

HM

Combining continuous variables

Hi,

I want to combine two age variables, PartAg_NT2BLQ1 and PartAg_NT3BLQ1, into a new variable (PartAg). As shown in the dataset example below, I have age data from two cycles of a population-based survey. Some participated only in the first one (PartAg_NT2BLQ1), some participated only in the last one (PartAg_NT3BLQ1), and some participated in both. I want my new age variable (PartAg) to show the participant age when they first participated in the survey.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double(PartAg_NT2BLQ1    PartAg_NT3BLQ1)
74.7 84.7
70.8 82.2
69.7 80.8
27.2    .
37.1    .
68.5    .
48.2 59.2
50.8 61.7
47.9 58.9
45.3 56.9
90.6    .
51.9    .
66.7 78.3
55.7 67.4
49.1    .
56.6 67.6
24.2    .
30.3    .
47.9 58.9
53.1 63.5
41.1 52.2
79    .
68.6 79.4
. 44.4
. 53.9
43.8 55.5
56.2 67.6
66.3 78.3
. 54.1
56.6 67.2
. 44.5
. 44.1
end

I tried to use the following code provided by an earlier thread in Statalist https://www.stata.com/statalist/arch.../msg00251.html
Code:
gen PartAg = max(PartAg_NT2BLQ1, PartAg_NT3BLQ1) if missing(PartAg_NT2BLQ1, PartAg_NT3BLQ1)
It generated a nice-looking variable, but I'm not sure what this code does to those participants who attended both the first and the second cycle of the study. Does it only count the first registered age and omit the second one? Or does it omit those participating in both cycles completely?

Best regards,
Sigrid Vikjord
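
For reference, `missing(a, b)` is true when at least one of its arguments is missing, and `max()` ignores missing arguments. The command above therefore fills PartAg only for people with an age in just one cycle and leaves it missing for everyone who took part in both. A sketch of one alternative, assuming the first cycle always precedes the second so that the earlier age is the smaller one:

```stata
* remove the variable created by the earlier attempt, if it exists
capture drop PartAg

* min() also ignores missing arguments: it returns the only available age
* for single-cycle participants and the earlier (smaller) age otherwise
gen double PartAg = min(PartAg_NT2BLQ1, PartAg_NT3BLQ1)
```

Using `double` keeps the same storage type as the two source variables.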

combining two rows

Dear users,

I have a dataset that looks like this:

     | nuts3   nacecat    proptot |
     |----------------------------|
  1. | ITC11   kis       1.163951 |
  2. | ITC11   lkis      .7991949 |
  3. | ITC11   ht        1.748063 |
  4. | ITC11   lt        1.190657 |
  5. | ITC11   agr       .0787983 |
     |----------------------------|
  6. | ITC11   conpu     .9105123 |
  7. | ITC12   kis       1.045959 |
  8. | ITC12   lkis      1.061093 |
  9. | ITC12   ht        1.226766 |
 10. | ITC12   agr       3.870968 |
     |----------------------------|
 11. | ITC12   conpu      .496988 |

I need to collapse, by nuts3, the subcategories "ht" and "lt" of the variable nacecat. I want to sum them to generate a single subcategory named "manif", such that manif = ht + lt.

Thank you for any suggestion!
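
A minimal sketch of one way to do this, assuming nacecat is a string variable (as the listing suggests) and that proptot is the only variable that needs to be kept:

```stata
* recode both manufacturing subcategories to the new label
replace nacecat = "manif" if inlist(nacecat, "ht", "lt")

* sum proptot within each nuts3-by-category cell; ht and lt now share a cell
collapse (sum) proptot, by(nuts3 nacecat)
```

Note that `collapse` drops every variable not named in the command, so any other variables to be retained would have to be added to it. Where a region has only one of the two subcategories (ITC12 in the listing has "ht" but no "lt"), "manif" simply equals that one value.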

Synth: conformability error after dropping observations

Hi,
I am using the synthetic control method to analyse the effects of economic sanctions on income inequality.

After running the following code, Stata reported: "control units: for at least one unit predictor gininet(1995) is missing for ALL periods specified"

Code:
ssc install synth
Code:
tsset country_n year
Code:
synth gininet log_GDPcap log_exports log_investment log_secondarygross coupdum gininet(1992) gininet(1997) gininet(2003), trunit(7) trperiod(2004) xperiod(1993(1)2003) fig
I therefore applied the following code to address the 'missing' problem:

Code:
 list country_n if missing(gininet) & inlist(year, 1992, 1997, 2003)
and then dropped the observations with missing values.

Now, Stata gives me 'conformability error'. How do I fix this so that my regression works?
Thank you

Inclusion probability under PPS sampling without replacement

Hi guys,

How do we estimate inclusion probability (and, in turn, sampling weights) in a setting of probability proportional to size sampling without replacement? Seems like SPSS produces a 'joint probabilities file' containing these estimates (link); is there any analogue in Stata?

Thanks in advance!

ssc install esttab

Dear All,
I am trying to install the Stata package that provides the esttab command. However, when I try to install it in my Stata, it gives me the following error:
cannot write in directory c:\ado\plus\_
My hunch is that, since my university has very strict guidelines about downloading anything onto the laptop, I would first need to save the package in a folder specifically designated for this purpose and then run it from that safe folder. However, since the installation is automatic, I unfortunately cannot change the directory. Can you please advise me how to change the directory when I install the package?

Thanks a bunch.

Best
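
A sketch of one common workaround: point Stata's PLUS directory at a folder you are allowed to write to, then install. The path below is a placeholder, not a real location on your machine. Note also that esttab is distributed as part of the estout package on SSC.

```stata
* the folder below is a placeholder: use an existing directory you can write to
sysdir set PLUS "C:\Users\yourname\ado\plus"

ssc install estout      // esttab is one of the commands in the estout package
sysdir                  // list the system directories to confirm the change
```

`sysdir set` only lasts for the current session, so the line would need to go into a profile.do file (or be rerun) for Stata to find the package in later sessions.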

Reliability Analysis - Doubt of Alpha de cronbach in item sign



Good morning, I'm doing a reliability analysis and I have positive and negative items in my survey. My question is how to interpret the signs reported for the items, that is, how to know when I have to reverse items based on these resulting signs.

Ex.
Positive items: item1, item2, item3, item4, item5, item6, item9, item11
Negative items: item7, item8, item10, item12

I attach the results obtained for reference.

In this case I have not yet reversed the negative items. I did not collect the data or build the database myself, so, lacking that information, I would like to know how to determine the direction of each item.
[Attached output not preserved]

Margins after nocons

Hello everyone,

How can I get average marginal effects after a logistic regression with a suppressed regression constant?

(I'm estimating a discrete-time event history analysis. Therefore I'm applying a binomial logistic regression to a person-period data set. I want to model the baseline hazard rate with a set of dummy variables for the periods. I would like to estimate AMEs to compare my models.)

I tried the following code but it doesn't work with the "nocons" option.
Code:
logit firstkidt2 i.wave centered_kiwust_f centered_kiwust_m kiwusti, nocons
eststo margin: margins, dydx(*) post
I'm using Stata 14.0.

Thank you for your suggestions!

What is panel level effects?

Can anyone please explain what a panel-level effect is?

one category of the variable does not show up

Hi.
I am running several regression models. The first one looks at the effect of the treatment (a categorical variable: 1 = control, 2 = treatment group 1, 3 = treatment group 2) on the dependent variable. So I ran the following command:

reg dv i.treatment

I get the following results:

   treatment |
          2  |   .1951886   .1955817     1.00   0.320    -.1919224    .5822996
          3  |   .2856146   .2146775     1.33   0.186    -.1392923    .7105216

For the second model, I add potential mediators, and the command is as follows

reg dv i.treatment m1 m2 m3

However, when I get the results, the coefficient for the second level of the treatment does not show up (see below). I have no idea why. I really need to figure it out, since I am super confused.

3.treatment | .2274107 .1476469 1.54 0.127 -.0661015 .520923

I would appreciate if you could help me out with it.

Best,

how to get same result with xtdpdsys and xtabond2?

Hello, I am currently working on my thesis and I am stuck on the overidentification test. I used the xtdpdsys command and got the expected coefficient and significance for my variable of interest. However, when I run the Sargan test, I get a p-value of 0.0000, which makes me uncomfortable. I believe the Sargan test has flaws and I would rather use the Hansen J test, but there seems to be no command for it. The only way to get it is to use the xtabond2 command. I am not very good with that command; its endogenous and exogenous options confuse me. I would be glad if someone could help me write the xtabond2 command that gives exactly the same results as xtdpdsys. I have posted the results here:

xtdpdsys lngreenfield lngdppc angdppc inf open cit lnpop cr lngreenfieldL1, nocons lags(1) artests(2) vce(robust)
note: L.lngreenfield dropped because of collinearity

System dynamic panel-data estimation         Number of obs      =        165
Group variable: id                           Number of groups   =         19
Time variable: year
                                             Obs per group: min =          8
                                                            avg =   8.684211
                                                            max =          9

Number of instruments =     52               Wald chi2(8)       =    4224.77
                                             Prob > chi2        =     0.0000
One-step results
--------------------------------------------------------------------------------
               |               Robust
  lngreenfield |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
       lngdppc |   .5713249   .4883243     1.17   0.242    -.3857731    1.528423
       angdppc |   .0017639   .0391657     0.05   0.964    -.0749995    .0785273
           inf |  -.0350448    .021586    -1.62   0.104    -.0773525     .007263
          open |   .0013623   .0107646     0.13   0.899    -.0197359    .0224605
           cit |  -.1055809   .0387232    -2.73   0.006     -.181477   -.0296848
         lnpop |   1.435553   .2608094     5.50   0.000     .9243757     1.94673
            cr |   .8368848    .534084     1.57   0.117    -.2099007     1.88367
lngreenfieldL1 |  -.1874725   .1027013    -1.83   0.068    -.3887632    .0138183
--------------------------------------------------------------------------------
Instruments for differenced equation
        GMM-type: L(2/.).lngreenfield
        Standard: D.lngdppc D.angdppc D.inf D.open D.cit D.lnpop D.cr
                  D.lngreenfieldL1
Instruments for level equation
        GMM-type: LD.lngreenfield

TSB and r(tsb_sam)

I'm trying to conduct TSB but I'm not able to find the saved/stored results. Stata says the results are found in r(tsb_sam). But where is that? I'm able to run some of these analyses and also get these bstat results, but where are they stored/saved?

Solution merging data M:M

Dear Stata community,

I have a problem with merging data. I have two datasets. One contains M&A deals of different firms in different years. Each firm can make multiple acquisitions per year. The second dataset contains directors per firm per year. I created a variable called cusipyear in order to merge. However, only an m:m merge seems possible. I was wondering if anybody has a solution for this, because when I use m:m merge it successfully merges firms that made only one acquisition in a year, but not firms that made multiple acquisitions in a year. In that case only one director is merged instead of all the directors of that company in that specific year.
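
One option that may fit here is `joinby`, which forms every pairwise combination of observations within each value of the key, so each deal would be paired with each director of the same firm-year. A minimal sketch, where `deals` and `directors` are placeholder file names for the two datasets:

```stata
* deals.dta and directors.dta are placeholder names for the two datasets
use deals, clear

* pair every deal with every director sharing the same cusipyear
joinby cusipyear using directors
```

By default `joinby` keeps only firm-years found in both files; the `unmatched(master)` option would keep deals that have no director records.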

Problem with Optimal Bandwidth Selection for Fuzzy Regression Discontinuity Design

Dear All,

I am facing a problem in using the -rdbwselect- and getting an unexpected error message. I am using Stata 13.0.

The setting of the research project is as follows: our aim is to estimate the impact of participating in a training program on outcome variable Y. For each person, we have two years of information: baseline and endline. Z is the running variable, which ranges from -10 to 10, and the cutoff value is 0 (a person who scores 0 or higher is assigned to the treatment). T is the treatment indicator: whether or not the person received training. I need to find the optimal bandwidth for the point estimates. I typed:

rdbwselect deltaY Z , fuzzy (T) all
where deltaY represents the difference between the endline and baseline values of the outcome variable.

But then, instead of results, I get these messages:

Invertibility problem in the computation of preliminary bandwidth below the threshold
Invertibility problem in the computation of preliminary bandwidth above the threshold
Invertibility problem in the computation of bias bandwidth (b) below the threshold
Invertibility problem in the computation of bias bandwidth (b) above the threshold
Invertibility problem in the computation of loc. poly. bandwidth (h) below the threshold
Invertibility problem in the computation of loc. poly. bandwidth (h) above the threshold

I am not sure why I am getting these messages or how to solve the problem. Any help or advice would be much appreciated.

With kind regards,

Nusrat Jimi