biostats.binomial_test

biostats.binomial_test(data, variable, expect)

Test whether the observed proportions of the groups in a categorical variable differ from the expected proportions.

Parameters:

data (pandas.DataFrame): The input data. Must contain at least one categorical column. Maximum 500 rows.
variable (str): The categorical variable whose group proportions are tested. Maximum 10 groups.
expect (dict): The expected proportion of each group. The values are automatically normalized to sum to 1, so relative weights such as 9:3:3:1 can be given.

Returns:

summary (pandas.DataFrame): The observed and expected counts and proportions of each group.
result (pandas.DataFrame): The p-value of the test.

See also

chi_square_test_fit: The normal approximation version of the binomial test.
fisher_exact_test: Test the association between two categorical variables.

Notes

Warning

The binomial test calculates the exact p-value by iterating through all the possible distributions of counts across the groups, so it can take a long time on large data sets. For larger data, chi_square_test_fit() is recommended.
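To illustrate why the cost grows with the data size, the following is a minimal sketch of one common way to compute an exact p-value for several groups: enumerate every possible split of the n observations, and sum the probabilities of the splits that are no more likely than the observed one. This is an assumption about the general technique, not the biostats implementation; the helper names `compositions`, `multinomial_pmf`, and `exact_multinomial_p` are hypothetical.

```python
from math import factorial, prod


def compositions(n, k):
    """Yield every way to split n items into k ordered, non-negative counts."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def multinomial_pmf(counts, probs):
    """Probability of observing exactly `counts` under group probabilities `probs`."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # stays an integer at every step
    return coef * prod(p ** c for p, c in zip(probs, counts))


def exact_multinomial_p(observed, weights):
    """Sum the probabilities of all outcomes no more likely than `observed`.

    Hypothetical helper: `weights` are relative and are normalized to sum to 1.
    """
    total = sum(weights)
    probs = [w / total for w in weights]
    p_obs = multinomial_pmf(observed, probs)
    p_value = 0.0
    for counts in compositions(sum(observed), len(observed)):
        p = multinomial_pmf(counts, probs)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for floating-point ties
            p_value += p
    return p_value


# Two groups, 7 vs. 3 observations, equal expected proportions.
print(exact_multinomial_p((7, 3), (1, 1)))
```

The number of splits to enumerate is C(n + k - 1, k - 1) for n observations and k groups. With 148 observations and 4 groups that is already 562,475 outcomes, which is why the chi-square approximation is preferable for larger data.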

Examples

>>> import biostats as bs
>>> data = bs.dataset("binomial_test.csv")
>>> data
     Flower
0    Purple
1    Purple
2       Red
3      Blue
4     White
..      ...
143  Purple
144  Purple
145    Blue
146     Red
147    Blue

We want to test whether the proportions of the groups in Flower differ from the expected proportions.

>>> summary, result = bs.binomial_test(data=data, variable="Flower", expect={"Purple":9, "Red":3, "Blue":3, "White":1})
>>> summary
        Observe  Prop.(Obs.)  Expect  Prop.(Exp.)
Purple       72     0.486486   83.25       0.5625
Red          38     0.256757   27.75       0.1875
Blue         20     0.135135   27.75       0.1875
White        18     0.121622    9.25       0.0625

The observed and expected counts and proportions of each group are given.
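The expected columns can be reproduced by hand, which shows how the relative weights passed in expect are normalized. The sketch below uses plain Python rather than biostats, with the observed counts copied from the summary table above; the variable names are illustrative only.

```python
# Relative weights as passed in `expect`, and observed counts from the summary.
weights = {"Purple": 9, "Red": 3, "Blue": 3, "White": 1}
observed = {"Purple": 72, "Red": 38, "Blue": 20, "White": 18}

n = sum(observed.values())          # total number of observations
total_weight = sum(weights.values())

# Normalize the weights so that they sum to 1.
expected_prop = {g: w / total_weight for g, w in weights.items()}

# Expected count = total observations x expected proportion.
expected_count = {g: n * p for g, p in expected_prop.items()}

for group in weights:
    print(group, observed[group], round(observed[group] / n, 6),
          expected_count[group], expected_prop[group])
```

Because the weights are normalized, passing {"Purple": 0.5625, "Red": 0.1875, "Blue": 0.1875, "White": 0.0625} would describe the same expected proportions.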

>>> result
        p-value    
Model  0.002255  **

The p-value is less than 0.01, so the observed proportions are significantly different from the expected proportions.