biostats.mcnemar_exact_test

biostats.mcnemar_exact_test(data, variable_1, variable_2, pair)

Test whether the proportions of a categorical variable differ between two paired groups.

Parameters:
data : pandas.DataFrame

The input data. Must contain at least two categorical columns, and a column specifying the pairs.

variable_1 : str

The categorical variable that specifies which group each sample belongs to. Maximum 10 groups. The two most frequent groups are chosen automatically.

variable_2 : str

The categorical variable whose proportions are compared. Maximum 10 groups. The two most frequent groups are chosen automatically.

pair : str

The variable that specifies the pair ID. Samples in the same pair should have the same ID. Maximum 1000 pairs.

Returns:
summary : pandas.DataFrame

The contingency table of the two categorical variables with matched pairs as the unit.

result : pandas.DataFrame

The p-value of the test.

See also

mcnemar_test

The normal approximation version of McNemar’s test.

fisher_exact_test

Test the association between two categorical variables.

Examples

>>> import biostats as bs
>>> data = bs.dataset("mcnemar_exact_test.csv")
>>> data
   Treatment Result  ID
0    control   fail   1
1    control   fail   2
2    control   fail   3
3    control   fail   4
4    control   fail   5
..       ...    ...  ..
83      test   pass  40
84      test   pass  41
85      test   pass  42
86      test   pass  43
87      test   pass  44

We want to test whether the proportions of Result differ between the two Treatment groups, where each control sample is paired with a test sample.

>>> summary, result = bs.mcnemar_exact_test(data=data, variable_1="Treatment", variable_2="Result", pair="ID")
>>> summary
                test : fail  test : pass
control : fail           21            9
control : pass            2           12

The contingency table of Treatment and Result where the counting unit is the matched pair.
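To illustrate what "the counting unit is the matched pair" means, here is a minimal plain-Python sketch of the counting logic. The records are hypothetical toy data, not the bundled dataset, and this is not necessarily how biostats builds its summary internally. Each pair contributes exactly one count, indexed by the control outcome and the test outcome.

```python
from collections import Counter

# Hypothetical long-format records: (Treatment, Result, ID), two rows per pair.
rows = [
    ("control", "fail", 1), ("test", "pass", 1),
    ("control", "fail", 2), ("test", "fail", 2),
    ("control", "pass", 3), ("test", "pass", 3),
    ("control", "pass", 4), ("test", "fail", 4),
    ("control", "fail", 5), ("test", "pass", 5),
]

# Collect both outcomes of each pair under its ID.
pairs = {}
for treatment, result, pair_id in rows:
    pairs.setdefault(pair_id, {})[treatment] = result

# Count pairs (not individual samples) by their (control, test) outcomes.
table = Counter((p["control"], p["test"]) for p in pairs.values())

# Print one row per control outcome, with columns test : fail, test : pass.
for control in ("fail", "pass"):
    print(f"control : {control}", [table[(control, test)] for test in ("fail", "pass")])
```

Ten samples yield five counts in total, one per pair. The off-diagonal cells (pairs whose two members disagree) are the ones that drive McNemar's test.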

>>> result
       p-value      
Model  0.06543  <NA>

The p-value is greater than 0.05, so there is no significant difference in the proportions of Result between the two Treatment groups.
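As a cross-check, the reported p-value can be reproduced by hand from the two discordant cells of the summary table (9 and 2). The sketch below assumes the textbook form of the exact McNemar test, a two-sided binomial test on the discordant pairs with success probability 0.5. The helper name `exact_mcnemar_p` is hypothetical, and the source does not confirm that biostats computes the value this way.

```python
from math import comb

def exact_mcnemar_p(b: int, c: int) -> float:
    """Two-sided exact binomial p-value for discordant pair counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Discordant cells from the summary above:
# control fail / test pass = 9, control pass / test fail = 2.
p = exact_mcnemar_p(9, 2)
print(round(p, 5))  # 0.06543
```

The concordant cells (21 and 12) do not enter the calculation: only pairs whose two members disagree carry information about a difference between the groups.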