biostats.numeric_grouped

biostats.numeric_grouped(data, variable, group)

Compute descriptive statistics of a numeric variable in different groups.

Parameters:
data : pandas.DataFrame

The input data. Must contain at least one numeric column and one categorical column.

variable : str

The name of the numeric column to analyze.

group : str

The name of the categorical column that specifies which group each sample belongs to. At most 20 groups are allowed.
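The 20-group limit can be checked before calling the function. The following is a minimal sketch using only pandas; the helper name `check_group_count` and the choice to raise `ValueError` are assumptions made for illustration, and this page does not say how biostats itself reacts when the limit is exceeded.

```python
import pandas as pd

MAX_GROUPS = 20  # limit stated in the description of the group parameter


def check_group_count(data: pd.DataFrame, group: str, limit: int = MAX_GROUPS) -> int:
    """Hypothetical pre-check: return the number of groups, or raise if over the limit."""
    # nunique() counts distinct non-missing values in the grouping column.
    n_groups = int(data[group].nunique())
    if n_groups > limit:
        raise ValueError(
            f"{group!r} has {n_groups} groups; at most {limit} are supported"
        )
    return n_groups


# Small illustrative frame with two groups.
data = pd.DataFrame(
    {
        "Count": [76, 102, 28, 85],
        "Animal": ["Fish", "Fish", "Insect", "Insect"],
    }
)
n_groups = check_group_count(data, "Animal")
```

Missing values in the grouping column are not counted as a group here, because `nunique()` skips them by default.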

Returns:
result : pandas.DataFrame

The count, arithmetic mean, median, geometric mean, harmonic mean, mode, sample variance, sample standard deviation, coefficient of variation, population variance, population standard deviation, minimum, 25th percentile, 50th percentile, 75th percentile, maximum, range, interquartile range, standard error, two-sided 95% confidence interval for the mean (lower and upper limits), and one-sided 95% confidence interval for the mean (lower and upper limits) of the variable in each group.

See also

numeric

Compute descriptive statistics of numeric variables.

one_way_anova

Test whether the mean values of a variable are different between several groups.

Examples

>>> import biostats as bs
>>> data = bs.dataset("numeric_grouped.csv")
>>> data
    Count  Animal
0      76    Fish
1     102    Fish
2      12    Fish
3      39    Fish
4      55    Fish
5      93    Fish
6      98    Fish
7      53    Fish
8     102    Fish
9      28  Insect
10     85  Insect
11     17  Insect
12     20  Insect
13     33  Insect
14     75  Insect
15     78  Insect
16     25  Insect
17     87  Insect

We want to compute descriptive statistics of Count for each of the two Animal groups.

>>> result = bs.numeric_grouped(data=data, variable="Count", group="Animal")
>>> result
                                 Fish      Insect
Count                        9.000000    9.000000
Mean                        70.000000   49.777778
Median                      76.000000   33.000000
Geometric Mean              59.835149   41.169549
Harmonic Mean               45.057085   34.058186
Mode                       102.000000   17.000000
                                  NaN         NaN
Variance                  1029.500000  923.694444
Std. Deviation              32.085822   30.392342
Coef. Variation              0.458369    0.610560
(Population) Variance      915.111111  821.061728
(Population) Std.Dev        30.250803   28.654175
                                  NaN         NaN
Minimum                     12.000000   17.000000
25% Percentile              53.000000   25.000000
50% Percentile              76.000000   33.000000
75% Percentile              98.000000   78.000000
Maximum                    102.000000   87.000000
Range                       90.000000   70.000000
Interquartile Range         45.000000   53.000000
                                  NaN         NaN
Std. Error                  10.695274   10.130781
95% CI: Lower               45.336654   26.416156
95% CI: Upper               94.663346   73.139400
(One-Tail) 95% CI: Lower    50.111624   30.939105
(One-Tail) 95% CI: Upper    89.888376   68.616451

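The result has one column per Animal group (Fish and Insect) and one row per descriptive statistic of Count. The unlabeled rows of NaN only separate blocks of related statistics.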
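Several rows of this table can be cross-checked with pandas and SciPy. The sketch below is not the biostats implementation; the helper `summarize` is a hypothetical name, and it assumes the confidence limits are t-based intervals for the mean with n - 1 degrees of freedom, which is consistent with the values printed above.

```python
import pandas as pd
from scipy import stats

# Example data copied from the table at the start of this section.
data = pd.DataFrame(
    {
        "Count": [76, 102, 12, 39, 55, 93, 98, 53, 102,
                  28, 85, 17, 20, 33, 75, 78, 25, 87],
        "Animal": ["Fish"] * 9 + ["Insect"] * 9,
    }
)


def summarize(x: pd.Series) -> pd.Series:
    """Hypothetical helper: a subset of the statistics for one group."""
    n = int(x.count())
    mean = x.mean()
    sd = x.std(ddof=1)                        # sample standard deviation
    se = sd / n ** 0.5                        # standard error of the mean
    t_two = stats.t.ppf(0.975, df=n - 1)      # two-sided 95% critical value
    t_one = stats.t.ppf(0.95, df=n - 1)       # one-sided 95% critical value
    return pd.Series(
        {
            "Count": n,
            "Mean": mean,
            "Variance": x.var(ddof=1),
            "Std. Deviation": sd,
            "Coef. Variation": sd / mean,
            "(Population) Variance": x.var(ddof=0),
            "Std. Error": se,
            "95% CI: Lower": mean - t_two * se,
            "95% CI: Upper": mean + t_two * se,
            "(One-Tail) 95% CI: Lower": mean - t_one * se,
            "(One-Tail) 95% CI: Upper": mean + t_one * se,
        }
    )


# One column per group, one row per statistic, mirroring the layout of result.
sketch = pd.DataFrame(
    {name: summarize(values) for name, values in data.groupby("Animal")["Count"]}
)
print(sketch)
```

Row labels are chosen to match those in `result`, so the two tables can be compared row by row.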