biostats.mantel_haenszel_test
- biostats.mantel_haenszel_test(data, variable_1, variable_2, stratum)
Test whether there is an association between two categorical variables in stratified data.
- Parameters:
- data
pandas.DataFrame
The input data. Must contain at least three categorical columns.
- variable_1
str
The first categorical variable. Maximum 20 groups.
- variable_2
str
The second categorical variable. Maximum 20 groups. Switching the two variables will not change the result.
- stratum
str
The categorical variable that specifies which stratum the samples belong to. Maximum 30 strata.
- Returns:
- summary
pandas.DataFrame
The contingency table of the two categorical variables in each stratum.
- result
pandas.DataFrame
The degrees of freedom, chi-square statistic, and p-value of the test.
See also
chi_square_test
Test the association between two categorical variables.
Examples
>>> import biostats as bs
>>> data = bs.dataset("mantel_haenszel_test.csv")
>>> data
    Treatment Revascularization  Study
0      Niacin               Yes   FATS
1      Niacin               Yes   FATS
2      Niacin                No   FATS
3      Niacin                No   FATS
4      Niacin                No   FATS
..        ...               ...    ...
669   Placebo                No  CLAS1
670   Placebo                No  CLAS1
671   Placebo                No  CLAS1
672   Placebo                No  CLAS1
673   Placebo                No  CLAS1
We want to test whether there is an association between Treatment and Revascularization, with the five studies in the Study column treated as strata.
>>> summary, result = bs.mantel_haenszel_test(data=data, variable_1="Treatment", variable_2="Revascularization", stratum="Study")
>>> summary
                     No   Yes
FATS      Niacin     46     2
          Placebo    41    11
                   <NA>  <NA>
AFREGS    Niacin     67     4
          Placebo    60    12
                   <NA>  <NA>
ARBITER2  Niacin     86     1
          Placebo    76     4
                   <NA>  <NA>
HATS      Niacin     37     1
          Placebo    32     6
                   <NA>  <NA>
CLAS1     Niacin     92     2
          Placebo    93     1
                   <NA>  <NA>
The contingency tables of Treatment and Revascularization in each of the five studies.
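Per-stratum tables like these can also be built directly with pandas, which is a handy cross-check on the input data. The sketch below is an illustration only, not how biostats builds its summary. It rebuilds a long-format DataFrame from the counts shown above (the `counts` dictionary is hand-copied, and the names `counts`, `rows`, and `tables` are hypothetical) and then cross-tabulates within each study.

```python
import pandas as pd

# Counts hand-copied from the summary above: (study, treatment) -> (No, Yes).
counts = {
    ("FATS", "Niacin"): (46, 2),
    ("FATS", "Placebo"): (41, 11),
    ("AFREGS", "Niacin"): (67, 4),
    ("AFREGS", "Placebo"): (60, 12),
    ("ARBITER2", "Niacin"): (86, 1),
    ("ARBITER2", "Placebo"): (76, 4),
    ("HATS", "Niacin"): (37, 1),
    ("HATS", "Placebo"): (32, 6),
    ("CLAS1", "Niacin"): (92, 2),
    ("CLAS1", "Placebo"): (93, 1),
}

# Expand the counts into one row per subject (long format), the shape
# that the data argument is documented to expect.
rows = [
    {"Treatment": treatment, "Revascularization": outcome, "Study": study}
    for (study, treatment), (n_no, n_yes) in counts.items()
    for outcome, n in (("No", n_no), ("Yes", n_yes))
    for _ in range(n)
]
data = pd.DataFrame(rows)

# One Treatment x Revascularization contingency table per stratum.
tables = {
    study: pd.crosstab(group["Treatment"], group["Revascularization"])
    for study, group in data.groupby("Study", sort=False)
}

print(tables["FATS"])
```

Each entry of `tables` is an ordinary DataFrame, so individual cells can be read with `.loc`, for example `tables["FATS"].loc["Placebo", "Yes"]`.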
>>> result
        D.F.  Chi Square   p-value
Normal     1   12.745723  0.000357  ***
The p-value is less than 0.001, so there is a significant association between Treatment and Revascularization in the stratified data.
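To see where a statistic of this kind comes from, here is a minimal sketch of the textbook Cochran-Mantel-Haenszel chi-square for a set of 2×2 tables, with a continuity correction of 0.5. It is an independent illustration, not the biostats implementation. The function name `cmh_chi_square` and the hand-copied `tables` dictionary are assumptions, and the p-value shortcut via `math.erfc` is only valid for one degree of freedom.

```python
import math

# 2x2 counts per study, hand-copied from the summary above:
# ((Niacin No, Niacin Yes), (Placebo No, Placebo Yes))
tables = {
    "FATS": ((46, 2), (41, 11)),
    "AFREGS": ((67, 4), (60, 12)),
    "ARBITER2": ((86, 1), (76, 4)),
    "HATS": ((37, 1), (32, 6)),
    "CLAS1": ((92, 2), (93, 1)),
}


def cmh_chi_square(strata, correction=True):
    """Cochran-Mantel-Haenszel chi-square for a collection of 2x2 tables."""
    diff_sum = 0.0  # sum over strata of (observed - expected) for the top-left cell
    var_sum = 0.0   # sum over strata of the hypergeometric variance of that cell
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        diff_sum += a - row1 * col1 / n
        var_sum += row1 * row2 * col1 * col2 / (n ** 2 * (n - 1))
    numerator = abs(diff_sum) - (0.5 if correction else 0.0)
    chi2 = numerator ** 2 / var_sum
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = cmh_chi_square(tables.values())
print(round(chi2, 4), round(p_value, 6))
```

With the continuity correction enabled, this sketch should give values close to the chi-square statistic and p-value reported above. Passing `correction=False` gives a somewhat larger statistic, which suggests, but does not prove, that the reported result includes the correction.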