biostats.mcnemar_test

biostats.mcnemar_test(data, variable_1, variable_2, pair)

Test whether the proportions of a categorical variable are different in two paired groups.

Parameters:

data (pandas.DataFrame): The input data. Must contain at least two categorical columns and a column identifying the pairs.
variable_1 (str): The categorical variable that specifies which group each sample belongs to. At most 20 groups are allowed; the two most frequent groups are selected automatically.
variable_2 (str): The categorical variable whose proportions are compared. At most 20 groups are allowed; the two most frequent groups are selected automatically.
pair (str): The variable that specifies the pair ID. Samples in the same pair must share the same ID. At most 2000 pairs are allowed.

Returns:

summary (pandas.DataFrame): The contingency table of the two categorical variables, with the matched pair as the counting unit.
result (pandas.DataFrame): The degrees of freedom, chi-square statistic, and p-value of the test, both uncorrected (Normal) and continuity-corrected (Corrected).

See also

mcnemar_exact_test: The exact version of McNemar’s test.
chi_square_test: Test the association between two categorical variables.

Examples

>>> import biostats as bs
>>> data = bs.dataset("mcnemar_test.csv")
>>> data
    Treatment       Result   ID
0      before      support    1
1      before      support    2
2      before      support    3
3      before      support    4
4      before      support    5
..        ...          ...  ...
195     after  not_support   96
196     after  not_support   97
197     after  not_support   98
198     after  not_support   99
199     after  not_support  100

We want to test whether the proportions of Result differ between the two Treatment groups, where each before sample is paired with an after sample through ID.

>>> summary, result = bs.mcnemar_test(data=data, variable_1="Treatment", variable_2="Result", pair="ID")
>>> summary
                      after : support  after : not_support
before : support                   30                   12
before : not_support               40                   18

The summary is the contingency table of Treatment and Result, where the counting unit is the matched pair. For example, 12 pairs answered support before and not_support after.
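Counting by matched pair can be sketched in plain Python. This is only an illustration of the idea, not the biostats implementation: the toy rows and the helper name paired_table are hypothetical.

```python
from collections import Counter

# Hypothetical long-format rows: (pair ID, Treatment, Result).
rows = [
    (1, "before", "support"),     (1, "after", "support"),
    (2, "before", "support"),     (2, "after", "not_support"),
    (3, "before", "not_support"), (3, "after", "support"),
    (4, "before", "not_support"), (4, "after", "support"),
    (5, "before", "not_support"), (5, "after", "not_support"),
]

def paired_table(rows):
    """Count matched pairs by their (before, after) outcomes."""
    by_pair = {}
    for pair_id, treatment, result in rows:
        by_pair.setdefault(pair_id, {})[treatment] = result
    # Each complete pair contributes exactly one count to the table.
    return Counter(
        (outcomes["before"], outcomes["after"])
        for outcomes in by_pair.values()
        if "before" in outcomes and "after" in outcomes
    )

table = paired_table(rows)
for (before, after), count in sorted(table.items()):
    print(f"before : {before:<12} after : {after:<12} {count}")
```

Ten rows collapse into five counted pairs here, which is why the 200-row example dataset yields a table that sums to 100.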

>>> result
           D.F.  Chi Square   p-value     
Normal        1   15.076923  0.000103  ***
Corrected     1   14.019231  0.000181  ***

Both p-values are below 0.001, so the proportions of Result differ significantly between the two Treatment groups.
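The reported statistics can be checked by hand from the two discordant cells of the summary table (b = 12 and c = 40). The sketch below assumes the standard McNemar statistic (b - c)^2 / (b + c) and, for the Corrected row, the common continuity correction (|b - c| - 1)^2 / (b + c); both agree with the values printed above. The helper mcnemar_chi_square is hypothetical and not part of biostats.

```python
import math

def mcnemar_chi_square(b, c, corrected=False):
    """Chi-square statistic and p-value (1 d.f.) from the discordant counts."""
    diff = abs(b - c)
    if corrected:
        diff -= 1  # assumed continuity correction
    stat = diff ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Discordant cells from the summary table:
# before support / after not_support = 12, before not_support / after support = 40.
stat, p = mcnemar_chi_square(12, 40)
stat_c, p_c = mcnemar_chi_square(12, 40, corrected=True)

print(round(stat, 6), round(p, 6))      # compare with the Normal row
print(round(stat_c, 6), round(p_c, 6))  # compare with the Corrected row
```

Only the discordant pairs enter the statistic; the 30 and 18 pairs whose answer did not change carry no information about a shift in proportions.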