biostats.mantel_haenszel_test

biostats.mantel_haenszel_test(data, variable_1, variable_2, stratum)

Test whether there is an association between two categorical variables in stratified data.

Parameters:
data : pandas.DataFrame

The input data. Must contain at least three categorical columns.

variable_1 : str

The first categorical variable. Maximum 20 groups.

variable_2 : str

The second categorical variable. Maximum 20 groups. Switching the two variables will not change the result.

stratum : str

The categorical variable that specifies which stratum the samples belong to. Maximum 30 strata.

Returns:
summary : pandas.DataFrame

The contingency table of the two categorical variables in each stratum.

result : pandas.DataFrame

The degrees of freedom, chi-square statistic, and p-value of the test.

See also

chi_square_test

Test the association between two categorical variables.

Examples

>>> import biostats as bs
>>> data = bs.dataset("mantel_haenszel_test.csv")
>>> data
    Treatment Revascularization  Study
0      Niacin               Yes   FATS
1      Niacin               Yes   FATS
2      Niacin                No   FATS
3      Niacin                No   FATS
4      Niacin                No   FATS
..        ...               ...    ...
669   Placebo                No  CLAS1
670   Placebo                No  CLAS1
671   Placebo                No  CLAS1
672   Placebo                No  CLAS1
673   Placebo                No  CLAS1

We want to test whether there is an association between Treatment and Revascularization, treating each of the five studies in the Study column as a separate stratum.

>>> summary, result = bs.mantel_haenszel_test(data=data, variable_1="Treatment", variable_2="Revascularization", stratum="Study")
>>> summary
                     No   Yes
FATS       Niacin    46     2
          Placebo    41    11
                   <NA>  <NA>
AFREGS     Niacin    67     4
          Placebo    60    12
                   <NA>  <NA>
ARBITER2   Niacin    86     1
          Placebo    76     4
                   <NA>  <NA>
HATS       Niacin    37     1
          Placebo    32     6
                   <NA>  <NA>
CLAS1      Niacin    92     2
          Placebo    93     1
                   <NA>  <NA>

The contingency tables of Treatment and Revascularization in each of the five studies.

>>> result
        D.F.  Chi Square   p-value     
Normal     1   12.745723  0.000357  ***

The p-value is less than 0.001, so there is a significant association between Treatment and Revascularization after stratifying by Study.
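As a rough cross-check, the statistic can be recomputed by hand from the per-stratum counts in summary. The sketch below applies the textbook Cochran-Mantel-Haenszel formula for 2x2 tables with a 0.5 continuity correction. It is not taken from the biostats source: the helper name `cmh_2x2` and its `correction` flag are hypothetical, and the continuity correction is an assumption that happens to match the reported value. It only covers the case of two groups per variable; variables with more groups require the generalized form of the test.

```python
import math

# Per-stratum 2x2 counts copied from the summary table above:
# ((Niacin No, Niacin Yes), (Placebo No, Placebo Yes))
tables = {
    "FATS":     ((46, 2), (41, 11)),
    "AFREGS":   ((67, 4), (60, 12)),
    "ARBITER2": ((86, 1), (76, 4)),
    "HATS":     ((37, 1), (32, 6)),
    "CLAS1":    ((92, 2), (93, 1)),
}


def cmh_2x2(strata, correction=True):
    """Hypothetical helper: CMH chi-square and p-value for a set of 2x2 tables."""
    observed = expected = variance = 0.0
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        observed += a                       # observed count in the top-left cell
        expected += row1 * col1 / n         # its expectation under no association
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    diff = abs(observed - expected)
    if correction:
        # Assumed 0.5 continuity correction, floored at zero
        diff = max(diff - 0.5, 0.0)
    chi2 = diff ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p_value = cmh_2x2(tables.values())
print(round(chi2, 4), round(p_value, 6))
```

With the correction enabled, the printed values should agree with the Chi Square and p-value columns of result to the displayed precision. Transposing every table, which corresponds to switching variable_1 and variable_2, leaves the statistic unchanged.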