Dear all,
I have a question about interpreting an interaction effect between a binary and a categorical variable in a Cox regression. I am studying whether having a diagnosis affects the risk of dying differently at different educational levels.
I have read several posts (and the links included in these) related to this topic:
https://www.statalist.org/forums/for...interpretation
https://www.statalist.org/forums/for...ent-categories
https://www.stata.com/statalist/arch.../msg01122.html
https://www.statalist.org/forums/for...vival-analysis
https://www.statalist.org/forums/for...ferent-samples
https://www.statalist.org/forums/for...ns-after-stcox
I have also studied the examples of Maarten L. Buis:
http://www.maartenbuis.nl/publications/interactions.html.
I have interpreted my results mainly by following the "Example of a categorical by continuous interaction in a Cox regression model for survival data" (https://www.stata.com/statalist/arch.../msg01332.html). However, my case is slightly different, since I have a binary and a categorical variable.
My questions are:
1. Have I misinterpreted the results on Cox regression's interaction effect between diagnosis and educational level? If so, how?
2. Or have I misinterpreted the results of margins and marginsplot instead?
I'm using Stata/MP 15.1. Education and diagnosis were measured in 2010. Individuals are followed from 2011 to 2015.
Results:
Code:
stset time, failure(died) id(id)
stcox i.diag##i.edu
margins, at(diag=(0 1) edu=(1 2 3))
marginsplot, scheme(s1mono)
Cox regression -- Breslow method for ties
No. of subjects = 99,760 Number of obs = 99,760
No. of failures = 10,287
Time at risk = 401420.2051
LR chi2(9) = 2827.01
Log likelihood = -112414.25 Prob > chi2 = 0.0000
------------------------------------------------------------------------------------
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------------+----------------------------------------------------------------
1.diag | 3.676446 .4114305 11.63 0.000 2.952367 4.578108
|
edu |
secondary | 1.646723 .0612913 13.40 0.000 1.530871 1.771342
basic | 2.898637 .0890877 34.63 0.000 2.729183 3.078612
|
diag#edu |
1#secondary | .8658759 .118324 -1.05 0.292 .6624253 1.131812
1#basic | .631256 .0739209 -3.93 0.000 .5017977 .794113
------------------------------------------------------------------------------------
. margins, at(diag=(0 1) edu=(1 2 3))
Predictive margins Number of obs = 99,760
Model VCE : OIM
Expression : Predicted hazard ratio, predict()
1._at : diag = 0
edu = 1
2._at : diag = 0
edu = 2
3._at : diag = 0
edu = 3
4._at : diag = 1
edu = 1
5._at : diag = 1
edu = 2
6._at : diag = 1
edu = 3
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_at |
1 | 1.008644 .0292157 34.52 0.000 .9513822 1.065906
2 | 1.660957 .0789297 21.04 0.000 1.506258 1.815657
3 | 2.923693 .1268029 23.06 0.000 2.675164 3.172222
4 | 3.708225 .4283425 8.66 0.000 2.868689 4.547761
5 | 5.287402 .4463702 11.85 0.000 4.412533 6.162272
6 | 6.785243 .3498909 19.39 0.000 6.09947 7.471017
------------------------------------------------------------------------------
. marginsplot
Variables that uniquely identify margins: diag edu
diag = have diagnosis (1=yes, 0=no), edu = education level (1=tertiary, 2=secondary, 3=basic), event = died between 2010-2017 (yes/no).
Interpretation:
Having a diagnosis increases the hazard 3.68-fold among the tertiary educated.
The secondary educated have a 1.65 times higher mortality hazard than the tertiary educated. Those with only basic education have a 2.90 times higher hazard than the tertiary educated.
Those with secondary education and a diagnosis have a 13% (1-0.87) smaller risk of dying compared to the tertiary educated with a diagnosis; however, the difference is not statistically significant. Those with basic education and a diagnosis have a 37% smaller risk of dying than the tertiary educated with a diagnosis (statistically significant). In other words, wouldn't these results suggest that having a diagnosis is more "harmful" for the tertiary educated than for those with only secondary or basic education?
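One way to check this reading is to compute the hazard ratio of diagnosis within each education level directly. This is only a minimal sketch, assuming the stcox model with i.diag##i.edu shown above is still the active estimation result and that edu is coded 1 = tertiary, 2 = secondary, 3 = basic as described:

```stata
* Hazard ratio of diagnosis (1 vs 0) within each education level.
* Assumes the stcox i.diag##i.edu results are the active estimates.

* Tertiary (reference level of edu)
lincom 1.diag, hr

* Secondary: main effect of diag times the 1#secondary interaction
lincom 1.diag + 1.diag#2.edu, hr

* Basic: main effect of diag times the 1#basic interaction
lincom 1.diag + 1.diag#3.edu, hr
```

From the coefficient table above, these should come out at about 3.68, 3.18 (3.676446 x .8658759) and 2.32 (3.676446 x .631256). Read this way, the interaction terms are ratios of the diagnosis hazard ratio between education levels, rather than hazard ratios between education groups among the diagnosed.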
However, the results from margins and marginsplot do not seem to suggest that the diagnosis has a different effect at different educational levels. These results are more similar to what is produced by running a Cox regression without the main effects:
Code:
stcox i.diag#i.edu
------------------------------------------------------------------------------------
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------------+----------------------------------------------------------------
diag#edu |
0#secondary | 1.646723 .0612913 13.40 0.000 1.530871 1.771342
0#basic | 2.898637 .0890877 34.63 0.000 2.729183 3.078612
1#tertiary | 3.676446 .4114305 11.63 0.000 2.952367 4.578108
1#secondary | 5.24209 .4157252 20.89 0.000 4.487451 6.123634
1#basic | 6.727095 .2851373 44.97 0.000 6.19082 7.309824
------------------------------------------------------------------------------------
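The two coefficient tables appear to describe the same fitted model in two parameterisations, and the margins follow the same pattern. A quick arithmetic check with display, using only the numbers printed above (nothing is re-estimated here):

```stata
* Cells of the diag#edu-only table rebuilt from the i.diag##i.edu table
* diag=1, secondary: about 5.24
display 3.676446 * 1.646723 * .8658759
* diag=1, basic: about 6.73
display 3.676446 * 2.898637 * .631256

* Ratios of the predictive margins (diag=1 over diag=0) by education
* tertiary: about 3.68
display 3.708225 / 1.008644
* secondary: about 3.18
display 5.287402 / 1.660957
* basic: about 2.32
display 6.785243 / 2.923693
```

So the margins do seem to carry the interaction: the ratios shrink from tertiary to basic, while the absolute gaps on the relative-hazard scale (about 2.70, 3.63 and 3.86) grow. That may be why the plot does not look like a weaker effect of diagnosis among the less educated.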
How could I produce a marginsplot that also includes the main effects?
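As far as I can tell, the margins above already include the main effects; they are reported on the relative-hazard scale, where a multiplicative interaction is hard to see. One possible alternative, sketched here under the assumption that the stcox model with i.diag##i.edu is the active estimation result, is to predict the linear predictor (log relative hazard) instead, so that non-parallel lines in the plot correspond to the interaction terms:

```stata
* Assumes the stcox i.diag##i.edu results are the active estimates.

* Predictive margins of the linear predictor (log relative hazard)
margins diag#edu, predict(xb)

* Education on the x axis, one line per diagnosis status
marginsplot, xdimension(edu) scheme(s1mono)

* Log hazard ratio of diagnosis within each education level
margins edu, dydx(diag) predict(xb)
```

On this scale the vertical distance between the two lines at each education level is the log hazard ratio of diagnosis, so the distance should narrow from tertiary to basic if the interaction is as estimated.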
Please let me know if I have missed something, or if my problem could be solved by revisiting some of the links listed above.
Best,
Inge