biostats.categorical

biostats.categorical(data, variable)

Compute descriptive statistics of a categorical variable.

Parameters:
data : pandas.DataFrame

The input data. Must contain at least one categorical column.

variable : str

The name of the categorical column to analyze. The column may contain at most 20 groups.

Returns:
result : pandas.DataFrame

The count, proportion, and 95% confidence interval (lower and upper limits) of each group.

See also

contingency

Compute the contingency table of two categorical variables.

chi_square_test_fit

Test whether the proportions of the groups in a categorical variable differ from the expected proportions.

Examples

>>> import biostats as bs
>>> data = bs.dataset("categorical.csv")
>>> data
      Color
0    Yellow
1      Blue
2       Red
3      Blue
4       Red
..      ...
195  Yellow
196    Blue
197     Red
198    Blue
199    Blue

We want to compute descriptive statistics of Color.

>>> result = bs.categorical(data=data, variable="Color")
>>> result
        Count  Proportion  95% CI: Lower  95% CI: Upper
Yellow     74       0.370       0.306126       0.438774
Blue       69       0.345       0.282598       0.413244
Red        35       0.175       0.128605       0.233644
Green      22       0.110       0.073772       0.160927

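Each row of the result gives a color's count, its proportion of the 200 observations, and the lower and upper limits of the 95% confidence interval.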
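This page does not say how the confidence limits are computed. The values in the example output are, however, numerically consistent with the Wilson score interval for a binomial proportion. The sketch below recomputes the table from the counts alone under that assumption, using only the Python standard library. The helper name `wilson_interval` is hypothetical and is not part of biostats.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(count, total, confidence=0.95):
    """Wilson score interval for a binomial proportion.

    Hypothetical helper for illustration; assumes this matches the
    method behind the table above (the page does not state it).
    """
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = count / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Counts copied from the example output above.
counts = {"Yellow": 74, "Blue": 69, "Red": 35, "Green": 22}
total = sum(counts.values())

summary = {}
for group, count in counts.items():
    lower, upper = wilson_interval(count, total)
    summary[group] = (count, count / total, lower, upper)
    print(f"{group:<8}{count:>5}{count / total:>10.3f}{lower:>12.6f}{upper:>12.6f}")
```

Unlike the simple normal-approximation (Wald) interval, the Wilson interval is not centered on the observed proportion. This is why the limits in the table are asymmetric around the Proportion column, most visibly for the smaller groups.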