biostats.mcnemar_test
- biostats.mcnemar_test(data, variable_1, variable_2, pair)
Test whether the proportions of a categorical variable are different in two paired groups.
- Parameters:
- data
pandas.DataFrame
The input data. Must contain at least two categorical columns, and a column specifying the pairs.
- variable_1
str
The categorical variable that specifies which group each sample belongs to. Maximum 20 groups. The two most frequent groups are chosen automatically.
- variable_2
str
The categorical variable whose proportions are compared. Maximum 20 groups. The two most frequent groups are chosen automatically.
- pair
str
The variable that specifies the pair ID. Samples in the same pair should have the same ID. Maximum 2000 pairs.
- Returns:
- summary
pandas.DataFrame
The contingency table of the two categorical variables with matched pairs as the unit.
- result
pandas.DataFrame
The degrees of freedom, chi-square statistic, and p-value of the test (both normal and corrected).
See also
mcnemar_exact_test
The exact version of McNemar’s test.
chi_square_test
Test the association between two categorical variables.
Examples
>>> import biostats as bs
>>> data = bs.dataset("mcnemar_test.csv")
>>> data
    Treatment       Result   ID
0      before      support    1
1      before      support    2
2      before      support    3
3      before      support    4
4      before      support    5
..        ...          ...  ...
195     after  not_support   96
196     after  not_support   97
197     after  not_support   98
198     after  not_support   99
199     after  not_support  100
We want to test whether the proportions of Result differ between the two Treatment groups, where each before observation is paired with an after observation.
>>> summary, result = bs.mcnemar_test(data=data, variable_1="Treatment", variable_2="Result", pair="ID")
>>> summary
                      after : support  after : not_support
before : support                   30                   12
before : not_support               40                   18
The contingency table of Treatment and Result where the counting unit is the matched pair.
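To make "the counting unit is the matched pair" concrete, here is a minimal pure-Python sketch of how such a table can be tallied from long-format rows. The helper name `paired_table` and the four-pair data set are made up for illustration; this is not the biostats implementation, and how biostats treats incomplete pairs is not documented here (the sketch simply skips them).

```python
from collections import Counter

def paired_table(records, group_key, outcome_key, pair_key, first, second):
    """Count (outcome in `first` group, outcome in `second` group) once per pair."""
    by_pair = {}
    for row in records:
        # Collect the outcome observed for each group within the same pair ID.
        by_pair.setdefault(row[pair_key], {})[row[group_key]] = row[outcome_key]

    counts = Counter()
    for outcomes in by_pair.values():
        # Assumption of this sketch: pairs missing one of the two groups are skipped.
        if first in outcomes and second in outcomes:
            counts[(outcomes[first], outcomes[second])] += 1
    return counts

# Tiny illustrative data set (NOT the bundled mcnemar_test.csv).
records = [
    {"Treatment": "before", "Result": "support",     "ID": 1},
    {"Treatment": "after",  "Result": "support",     "ID": 1},
    {"Treatment": "before", "Result": "not_support", "ID": 2},
    {"Treatment": "after",  "Result": "support",     "ID": 2},
    {"Treatment": "before", "Result": "support",     "ID": 3},
    {"Treatment": "after",  "Result": "not_support", "ID": 3},
    {"Treatment": "before", "Result": "not_support", "ID": 4},
    {"Treatment": "after",  "Result": "support",     "ID": 4},
]

table = paired_table(records, "Treatment", "Result", "ID", "before", "after")
for (before, after), n in sorted(table.items()):
    print(f"before: {before:12s} after: {after:12s} pairs: {n}")
```

Each key is a (before, after) combination, so the four possible keys correspond to the four cells of the summary table, and the counts sum to the number of complete pairs rather than the number of rows.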
>>> result
           D.F.  Chi Square   p-value
Normal        1   15.076923  0.000103  ***
Corrected     1   14.019231  0.000181  ***
The p-value is less than 0.001, so the proportions of Result differ significantly between the two Treatment groups.
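As a cross-check, the figures in the result table can be reproduced by hand from the two discordant cells of summary (12 and 40). The sketch below uses the textbook McNemar formulas, with a continuity correction of 1 for the corrected row and a chi-square distribution with one degree of freedom. The function name `mcnemar_chi_square` is hypothetical, and this is not necessarily how biostats computes the values internally.

```python
import math

def mcnemar_chi_square(b, c, corrected=False):
    """Textbook McNemar chi-square statistic and p-value (1 degree of freedom).

    b, c: counts of the two discordant cells of the paired contingency table.
    """
    diff = abs(b - c)
    if corrected:
        diff -= 1  # continuity correction
    statistic = diff ** 2 / (b + c)
    # For 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Discordant cells from the summary table above:
# before support / after not_support = 12, before not_support / after support = 40.
normal_stat, normal_p = mcnemar_chi_square(12, 40)
corrected_stat, corrected_p = mcnemar_chi_square(12, 40, corrected=True)

print(f"Normal:    chi2 = {normal_stat:.6f}, p = {normal_p:.6f}")
print(f"Corrected: chi2 = {corrected_stat:.6f}, p = {corrected_p:.6f}")
```

The concordant cells (30 and 18) do not enter the statistic: only pairs whose Result changed between before and after carry information about a difference in proportions.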