biostats.binomial_test
- biostats.binomial_test(data, variable, expect)
Test whether the observed proportions of a categorical variable differ from the expected proportions.
- Parameters:
- data
pandas.DataFrame
The input data. Must contain at least one categorical column. Maximum 500 rows.
- variable
str
The name of the categorical column whose group proportions are tested. Maximum 10 groups.
- expect
dict
The expected proportion of each group, keyed by group name. The values are automatically normalized to sum to 1, so ratios such as 9:3:3:1 can be passed directly.
- Returns:
- summary
pandas.DataFrame
The observed counts and proportions of each group, and the expected counts and proportions of each group.
- result
pandas.DataFrame
The p-value of the test.
See also
chi_square_test_fit
The normal-approximation version of the binomial test.
fisher_exact_test
Test the association between two categorical variables.
Notes
Warning
The binomial test calculates the exact p-value by iterating through all possible distributions, so it may take a long time when the data set is large. For larger data, chi_square_test_fit() is recommended.
Examples
>>> import biostats as bs
>>> data = bs.dataset("binomial_test.csv")
>>> data
     Flower
0    Purple
1    Purple
2       Red
3      Blue
4     White
..      ...
143  Purple
144  Purple
145    Blue
146     Red
147    Blue
We want to test whether the proportions in Flower differ from the expected proportions.
>>> summary, result = bs.binomial_test(data=data, variable="Flower", expect={"Purple":9, "Red":3, "Blue":3, "White":1})
>>> summary
        Observe  Prop.(Obs.)  Expect  Prop.(Exp.)
Purple       72     0.486486   83.25       0.5625
Red          38     0.256757   27.75       0.1875
Blue         20     0.135135   27.75       0.1875
White        18     0.121622    9.25       0.0625
The observed and expected counts and proportions of each group are given.
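The Expect and Prop.(Exp.) columns follow from normalizing the expect weights and scaling by the total number of observations. The snippet below is a minimal sketch of that arithmetic using the counts from the summary table; the variable names are illustrative and this is not the library's own code.

```python
# Weights passed as `expect` and observed counts taken from the summary table.
weights = {"Purple": 9, "Red": 3, "Blue": 3, "White": 1}
observed = {"Purple": 72, "Red": 38, "Blue": 20, "White": 18}

total_weight = sum(weights.values())   # 16
n = sum(observed.values())             # 148 observations

# Normalize the weights so the expected proportions sum to 1.
expected_prop = {group: w / total_weight for group, w in weights.items()}

# Expected count = total observations * expected proportion.
expected_count = {group: n * p for group, p in expected_prop.items()}

for group in weights:
    print(group, observed[group], expected_count[group], expected_prop[group])
```

For Purple, for example, the expected proportion is 9/16 = 0.5625 and the expected count is 148 × 0.5625 = 83.25, matching the table.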
>>> result
        p-value
Model  0.002255  **
The p-value is less than 0.01, so the observed proportions differ significantly from the expected proportions.
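The warning in the Notes says the exact p-value is obtained by iterating through all possible distributions. The following is a minimal sketch of that idea using only the standard library. It is not the library's implementation: the helper names (multinomial_pmf, compositions, exact_multinomial_p) are hypothetical, and the rule of summing the probabilities of every outcome that is no more likely than the observed one is an assumed, commonly used definition of the exact p-value.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of one particular table of counts under the given proportions."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # stays an integer at every step
    return coef * prod(p ** c for p, c in zip(probs, counts))


def compositions(n, k):
    """Yield every way to split n observations into k ordered groups."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def exact_multinomial_p(observed, weights):
    """Sum the probabilities of all outcomes no more likely than the observed one."""
    total = sum(weights)
    probs = [w / total for w in weights]  # normalize weights to proportions
    n = sum(observed)
    p_obs = multinomial_pmf(observed, probs)
    # A small relative tolerance keeps outcomes that tie with the observed one.
    threshold = p_obs * (1 + 1e-9)
    return sum(
        p
        for p in (multinomial_pmf(c, probs) for c in compositions(n, len(probs)))
        if p <= threshold
    )


# Small illustration: 10 observations, two groups expected in a 1:1 ratio.
print(exact_multinomial_p((2, 8), (1, 1)))
```

The number of tables to enumerate is the binomial coefficient C(n + k - 1, k - 1) for n observations and k groups. For the flower data (n = 148, k = 4) that is already 562,475 tables, which illustrates why the running time grows quickly with the size of the data and the number of groups.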