biostats.mcnemar_test

biostats.mcnemar_test(data, variable_1, variable_2, pair)

Test whether the proportions of a categorical variable are different in two paired groups.

Parameters:
data : pandas.DataFrame

The input data. Must contain at least two categorical columns, and a column specifying the pairs.

variable_1 : str

The categorical variable that specifies which group each sample belongs to. At most 20 groups are allowed; the two most frequent groups are selected automatically.

variable_2 : str

The categorical variable whose proportions are compared. At most 20 groups are allowed; the two most frequent groups are selected automatically.

pair : str

The variable that specifies the pair ID. Samples in the same pair should have the same ID. Maximum 2000 pairs.

Returns:
summary : pandas.DataFrame

The contingency table of the two categorical variables with matched pairs as the unit.

result : pandas.DataFrame

The degrees of freedom, chi-square statistic, and p-value of the test, both without ("Normal") and with ("Corrected") the continuity correction.

See also

mcnemar_exact_test

The exact version of McNemar’s test.

chi_square_test

Test the association between two categorical variables.

Examples

>>> import biostats as bs
>>> data = bs.dataset("mcnemar_test.csv")
>>> data
    Treatment       Result   ID
0      before      support    1
1      before      support    2
2      before      support    3
3      before      support    4
4      before      support    5
..        ...          ...  ...
195     after  not_support   96
196     after  not_support   97
197     after  not_support   98
198     after  not_support   99
199     after  not_support  100

We want to test whether the proportions of Result differ between the two Treatment groups, where each before sample is paired with an after sample.

>>> summary, result = bs.mcnemar_test(data=data, variable_1="Treatment", variable_2="Result", pair="ID")
>>> summary
                      after : support  after : not_support
before : support                   30                   12
before : not_support               40                   18

This is the contingency table of Treatment and Result, where the counting unit is the matched pair.
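To illustrate what "the matched pair as the counting unit" means, here is a minimal sketch that builds a table of this kind by hand with pandas. The toy data frame below is hypothetical (four made-up pairs, not the bundled dataset), and the pivot-then-crosstab approach is only one plausible way to do it; it is not necessarily how biostats constructs the summary internally.

```python
import pandas as pd

# Hypothetical long-format data: 4 pairs, each with one "before" and one "after" row
toy = pd.DataFrame({
    "Treatment": ["before"] * 4 + ["after"] * 4,
    "Result": ["support", "support", "not_support", "not_support",
               "support", "not_support", "support", "support"],
    "ID": [1, 2, 3, 4, 1, 2, 3, 4],
})

# Reshape so that each pair is one row, with one column per treatment
wide = toy.pivot(index="ID", columns="Treatment", values="Result")

# Count pairs by their (before, after) combination of responses
table = pd.crosstab(wide["before"], wide["after"])
print(table)
```

Each cell counts pairs rather than individual rows, so the cells sum to the number of pairs (4 here, 100 in the example dataset).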

>>> result
           D.F.  Chi Square   p-value     
Normal        1   15.076923  0.000103  ***
Corrected     1   14.019231  0.000181  ***

The p-value is less than 0.001, so the proportions of Result differ significantly between the two Treatment groups.
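The statistics in the result table can be checked by hand from the two discordant cells of the summary table (12 and 40). The sketch below uses the textbook McNemar formulas, with and without the continuity correction, and the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x / 2)). The variable names are illustrative, and this is not claimed to be the library's own implementation.

```python
import math

# Discordant pair counts taken from the summary table
b = 12  # before: support, after: not_support
c = 40  # before: not_support, after: support

# McNemar chi-square statistic without and with continuity correction
chi_normal = (b - c) ** 2 / (b + c)              # 784 / 52, about 15.0769
chi_corrected = (abs(b - c) - 1) ** 2 / (b + c)  # 729 / 52, about 14.0192

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

p_normal = chi2_sf_1df(chi_normal)
p_corrected = chi2_sf_1df(chi_corrected)

print(f"Normal:    chi2 = {chi_normal:.6f}, p = {p_normal:.6f}")
print(f"Corrected: chi2 = {chi_corrected:.6f}, p = {p_corrected:.6f}")
```

Only the discordant pairs enter the statistic; the 30 and 18 pairs whose response did not change carry no information about a shift in proportions.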