Dear all,
I am struggling to obtain the marginal effect of father’s education on the probability of college graduation after running a logistic regression for a number of countries that participated in the PIAAC survey carried out by the OECD.
The model is quite simple: the dependent variable is binary, my main independent variable is father’s education (three categories) and I have two controls (age and gender). In principle, it should not be difficult. But I have to account for the complex sample design of PIAAC, which means using a number of weights provided in PIAAC data.
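To make the target quantity concrete, here is a minimal sketch of the same model without any survey adjustment. It assumes the variables are named as in the data sample at the end of this post (univ, edufather, female, age) and it deliberately ignores the PIAAC weights, so its standard errors are not valid for the complex design. It only illustrates which contrasts I am after.

```stata
* Unweighted sketch: ignores the PIAAC final and replicate weights,
* so the standard errors are NOT valid for the survey design.
logit univ i.edufather female age

* Average differences in the predicted probability of graduating,
* relative to the first category of father's education (2 vs 1, 3 vs 1)
margins r.edufather

* The same contrasts requested as average marginal effects
margins, dydx(edufather)
```

For a factor variable, both margins calls should report the same point estimates: the discrete change in the predicted probability with respect to the base category, averaged over the sample.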
A Stata module (repest) was created specifically for this purpose: “repest estimates statistics using replicate weights (…) thus accounting for complex survey designs in the estimation of sample variances”. It is designed especially for databases like IELS, PIAAC, PISA, TALIS…
‘Repest’ basically works as follows:
Code:
repest svyname [if] [in] , estimate(cmd [,cmd_options]) [options]
Here is one of the examples that the authors provide in the repest help file:
Code:
repest PIAAC, estimate(stata: reg lnwage pvlit@ yrsqual) by(cnt)
Since I want to run the same model for a number of countries in PIAAC, I intend to create a loop that includes repest. But I also want to generate the marginal effect of father’s education after the logistic regression for each country, storing these marginal effects and then saving them in a different Stata file (dta).
At the end of the repest help file, the authors provide a user-defined program precisely for logit postestimation:
Code:
* User-defined estimation command: 2. logit postestimation
cap program drop mylogitmargins
program define mylogitmargins, eclass
    syntax [if] [in] [pweight], logit(string) [margins(string) loptions(string) moptions(string)]
    tempname b m
    // compute logit regressions, store results in vectors
    logit `logit' [`weight' `exp'] `if' `in', `loptions'
    matrix `b'= e(b)
    // compute logit postestimation, store results in vectors
    if "`margins'" != "" | "`moptions'" != ""{
        margins `margins', post `moptions'
        matrix `m' = e(b)
        matrix colnames `m' = margins:
        matrix `b'= [`b', `m']
    }
    // post results
    ereturn post `b'
end
repest PISA, estimate(stata: mylogitmargins, logit(repeat pv@math escs ib1.st04q01) margins(st04q01) moptions(atmeans))
Yet I do not know how to replicate this with my data and, in particular, how to make sure that the marginal effects of father’s education for each country are stored after each logit.
I have succeeded in making ‘repest’ work with my logit model. Below I show, in order: a program so that the name of the country appears in the output; a copy of the logit postestimation program offered by the authors of repest; and, finally, the loop in which I call repest to estimate the logit model for each country:
Code:
egen cntryid3_group=group(cntryid3), label
capture program drop pe
program define pe
    // echo the command line, run it, then print a blank line
    if `"`0'"' != "" {
        display as text `"`0'"'
        `0'
        display ""
    }
end
cap program drop mylogitmargins
program define mylogitmargins, eclass
    syntax [if] [in] [pweight], logit(string) [margins(string) loptions(string) moptions(string)]
    tempname b m
    // compute logit regressions, store results in vectors
    logit `logit' [`weight' `exp'] `if' `in', `loptions'
    matrix `b'= e(b)
    // compute logit postestimation, store results in vectors
    if "`margins'" != "" | "`moptions'" != ""{
        margins `margins', post `moptions'
        matrix `m' = e(b)
        matrix colnames `m' = margins:
        matrix `b'= [`b', `m']
    }
    // post results
    ereturn post `b'
end
foreach i of numlist 1/24 {
    display "`: label (cntryid3_group) `i''"
    pe capture noisily repest PIAAC, estimate(stata: mylogitmargins, logit(univ i.edufath female age if cntryid3_group==`i' & egresados==1) margins(r.edufath))
}
But I have not succeeded in generating the marginal effects of father’s education and storing them after the logistic regression for each country.
Next, I show the results (output) for the second country of the list. The last two lines in the output are precisely the contrast of marginal effects for the three categories of father's education (second versus first, third versus first). It's what I want; yet, I do not know how to store them for each country, and how to retrieve them afterwards.
Code:
capture noisily repest PIAAC, estimate(stata: mylogitmargins, logit(univ i.edufath
> female age if cntryid3_group==2 & egresados==1) margins(r.edufath))
(note: file C:\Users\LOrti\AppData\Local\Temp\ST_00000005.tmp not found)
file C:\Users\LOrti\AppData\Local\Temp\ST_00000005.tmp saved
_pooled.
: _pooled
----------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-----------------+----------------------------------------------------------------
univ_1b_edufat~r | 0 (omitted)
univ_2_edufather | 1.384978 .1785063 7.76 0.000 1.035112 1.734844
univ_3_edufather | 2.719534 .2327599 11.68 0.000 2.263333 3.175735
univ_female | .0569396 .1659537 0.34 0.732 -.2683237 .382203
univ_age | -.0004378 .0151277 -0.03 0.977 -.0300876 .0292121
univ__cons | -2.912376 .5037946 -5.78 0.000 -3.899795 -1.924956
margins_r2vs1_~r | .1281268 .0182265 7.03 0.000 .0924035 .1638501
margins_r3vs1_~r | .4029726 .0477158 8.45 0.000 .3094514 .4964938
----------------------------------------------------------------------------------
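One possible way to keep the two margins rows for every country is sketched below with Stata's postfile. It rests on an assumption I have not verified: that repest leaves its combined estimates in e(b) and e(V), as the coefficient table above suggests. The output file name (margins_by_country.dta) and the names of the posted variables are hypothetical, and mylogitmargins must already be defined.

```stata
* Sketch only. Assumes repest leaves its combined estimates in e(b) and e(V).
* The file name and the posted variable names are hypothetical.
tempname results
postfile `results' int group str60 country str60 param double(b se) ///
    using margins_by_country.dta, replace

foreach i of numlist 1/24 {
    local country : label (cntryid3_group) `i'
    capture noisily repest PIAAC, estimate(stata: mylogitmargins, ///
        logit(univ i.edufath female age if cntryid3_group==`i' & egresados==1) ///
        margins(r.edufath))
    if _rc {
        continue    // skip countries where the estimation failed
    }
    matrix B = e(b)
    matrix V = e(V)
    local names : colfullnames B
    local j = 0
    foreach nm of local names {
        local ++j
        // keep only the columns whose name starts with "margins"
        if strpos("`nm'", "margins") == 1 {
            post `results' (`i') ("`country'") ("`nm'") (B[1,`j']) (sqrt(V[`j',`j']))
        }
    }
}
postclose `results'

* Retrieve the stored contrasts later (this replaces the data in memory)
use margins_by_country.dta, clear
list, sepby(group)
```

Each country would then contribute one row per contrast (2 vs 1, 3 vs 1), with its point estimate and standard error. If repest names or stores its results differently, the name filter in the inner loop would need adjusting.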
Could you help me with this?
Thanks for your attention
Luis Ortiz
PS: In case it is of any use, I include a sample of my data, extracted from my dataset using dataex:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double edufather float(age female univ cntryid3_group)
1 99 1 1 1
2 99 0 1 1
3 99 1 1 1
1 99 1 0 1
1 99 1 0 1
2 99 0 0 1
3 99 0 1 1
3 99 0 0 1
3 99 1 0 1
1 99 1 0 1
3 99 0 0 1
1 99 1 0 1
2 99 1 0 1
3 99 1 1 1
3 99 1 1 1
2 99 0 0 1
1 99 0 0 1
3 99 1 1 1
2 99 1 0 1
2 99 0 1 1
3 99 0 0 1
1 99 0 0 1
3 99 1 1 1
3 99 1 0 1
1 99 1 0 1
2 99 0 0 1
2 99 1 1 1
1 99 0 0 1
. 99 0 0 1
1 99 0 0 1
1 99 1 1 1
1 99 1 0 1
1 99 0 0 1
2 99 1 0 1
1 99 0 0 1
2 99 0 1 1
3 99 1 0 1
3 99 0 0 1
2 99 0 0 1
1 99 1 0 1
. 99 0 0 1
2 99 0 0 1
3 99 0 0 1
1 99 1 0 1
2 99 0 1 1
3 99 0 1 1
3 99 0 1 1
3 99 1 1 1
3 99 0 0 1
2 99 0 0 1
3 99 1 0 1
1 99 0 0 1
2 99 0 0 1
2 99 1 0 1
1 99 0 0 1
. 99 0 0 1
. 99 0 0 1
2 99 0 0 1
1 99 1 0 1
1 99 1 1 1
1 99 0 0 1
1 99 1 0 1
2 99 0 0 1
2 99 1 0 1
1 99 0 0 1
1 99 1 0 1
3 99 0 1 1
1 99 0 0 1
2 99 1 0 1
1 99 0 1 1
2 99 0 1 1
1 99 0 0 1
1 99 0 0 1
2 99 1 0 1
1 99 1 0 1
2 99 1 0 1
1 99 0 0 1
2 99 0 1 1
3 99 0 0 1
3 99 1 0 1
3 99 1 0 1
1 99 0 0 1
3 99 1 1 1
3 99 0 1 1
3 99 0 0 1
2 99 0 0 1
3 99 0 0 1
2 99 0 0 1
2 99 1 0 1
1 99 0 0 1
3 99 1 0 1
1 99 0 0 1
2 99 1 1 1
3 99 1 0 1
2 99 0 0 1
. 99 0 0 1
1 99 0 0 1
1 99 0 0 1
1 99 0 0 1
2 99 0 0 1
end
label values edufather edu_fat
label def edu_fat 1 "ISCED 1/2/3sh", modify
label def edu_fat 2 "ISCED 3/4", modify
label def edu_fat 3 "ISCED 5/6", modify
label values female gndr
label def gndr 0 "Male", modify
label def gndr 1 "Female", modify
label values univ univ_lab
label def univ_lab 0 "No uni", modify
label def univ_lab 1 "Univ", modify
label values cntryid3_group cntryid3_group
label def cntryid3_group 1 "124. Canada", modify