biostats.numeric

biostats.numeric(data, variable)

Compute descriptive statistics of numeric variables.

Parameters:
data : pandas.DataFrame

The input data. Must contain at least one numeric column.

variable : list

The names of the numeric columns in data to be analyzed.

Returns:
result : pandas.DataFrame

For each variable: the count, arithmetic mean, median, geometric mean, harmonic mean, and mode; the sample variance, sample standard deviation, coefficient of variation, population variance, and population standard deviation; the minimum, 25% percentile, 50% percentile, 75% percentile, maximum, range, and interquartile range; the standard error, two-sided 95% confidence interval (lower and upper limit), and one-sided 95% confidence interval (lower and upper limit).
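To make the relationship between several of these statistics concrete, the following sketch recomputes a few of them for the Fish column of the example dataset using only the Python standard library. It is an independent cross-check, not the implementation used by biostats; the variable names are illustrative, and the choice of the "inclusive" quantile method is an assumption made here because it reproduces the percentiles in the example output.

```python
import math
import statistics

# Fish column of the example dataset from the Examples section.
fish = [76, 102, 12, 39, 55, 93, 98, 53, 102]

n = len(fish)
mean = statistics.mean(fish)
median = statistics.median(fish)

# Sample variance divides by n - 1; population variance divides by n.
sample_var = statistics.variance(fish)
population_var = statistics.pvariance(fish)
sample_sd = math.sqrt(sample_var)

# Coefficient of variation and standard error both derive from the sample SD.
coef_variation = sample_sd / mean
std_error = sample_sd / math.sqrt(n)

# Quartiles with linear interpolation over the closed range [min, max]
# (assumed here; it matches the percentiles in the example output).
q1, q2, q3 = statistics.quantiles(fish, n=4, method="inclusive")
value_range = max(fish) - min(fish)
iqr = q3 - q1

print(mean, median, sample_var, round(std_error, 6), value_range, iqr)
```

For this column the values agree with the Fish column of the example output, for instance a sample variance of 1029.5 and a standard error of about 10.695274.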

See also

numeric_grouped

Compute descriptive statistics of a numeric variable in different groups.

one_sample_t_test

Test whether the mean value of a variable is different from the expected value.

Examples

>>> import biostats as bs
>>> data = bs.dataset("numeric.csv")
>>> data
   Fish  Crab  Temperature
0    76   123         25.7
1   102   265         20.1
2    12    35          4.2
3    39    86         11.4
4    55   140         31.2
5    93   315         21.0
6    98   279         17.8
7    53   120         13.3
8   102   312         18.5

We want to compute descriptive statistics of the three variables.

>>> result = bs.numeric(data=data, variable=["Fish", "Crab", "Temperature"])
>>> result
                                  Fish          Crab  Temperature
Count                         9.000000      9.000000     9.000000
Mean                         70.000000    186.111111    18.133333
Median                       76.000000    140.000000    18.500000
Geometric Mean               59.835149    152.721285    16.057497
Harmonic Mean                45.057085    116.064406    13.243700
Mode                        102.000000     35.000000     4.200000
                                   NaN           NaN          NaN
Variance                   1029.500000  11331.111111    62.895000
Std. Deviation               32.085822    106.447692     7.930637
Coef. Variation               0.458369      0.571958     0.437351
(Population) Variance       915.111111  10072.098765    55.906667
(Population) Std.Dev         30.250803    100.359846     7.477076
                                   NaN           NaN          NaN
Minimum                      12.000000     35.000000     4.200000
25% Percentile               53.000000    120.000000    13.300000
50% Percentile               76.000000    140.000000    18.500000
75% Percentile               98.000000    279.000000    21.000000
Maximum                     102.000000    315.000000    31.200000
Range                        90.000000    280.000000    27.000000
Interquartile Range          45.000000    159.000000     7.700000
                                   NaN           NaN          NaN
Std. Error                   10.695274     35.482564     2.643546
95% CI: Lower                45.336654    104.288172    12.037306
95% CI: Upper                94.663346    267.934050    24.229360
(One-Sided) 95% CI: Lower    50.111624    120.129579    13.217533
(One-Sided) 95% CI: Upper    89.888376    252.092643    23.049133

Descriptive statistics of the three variables are computed.
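The confidence limits in the output are consistent with intervals based on Student's t distribution with n - 1 degrees of freedom. The sketch below reproduces the Fish limits under that assumption; the source does not state which method biostats uses, and the critical values are hardcoded constants for 8 degrees of freedom rather than values obtained from the library.

```python
import math
import statistics

# Fish column of the example dataset.
fish = [76, 102, 12, 39, 55, 93, 98, 53, 102]

n = len(fish)
mean = statistics.mean(fish)
std_error = statistics.stdev(fish) / math.sqrt(n)

# Critical values of Student's t with n - 1 = 8 degrees of freedom.
# Hardcoded here (an assumption of this sketch, not taken from biostats).
T_TWO_SIDED = 2.306004  # 97.5th percentile, for a two-sided 95% interval
T_ONE_SIDED = 1.859548  # 95th percentile, for a one-sided 95% bound

two_sided = (mean - T_TWO_SIDED * std_error, mean + T_TWO_SIDED * std_error)
one_sided = (mean - T_ONE_SIDED * std_error, mean + T_ONE_SIDED * std_error)

print([round(x, 4) for x in two_sided])
print([round(x, 4) for x in one_sided])
```

The one-sided limits lie closer to the mean than the two-sided ones because all 5% of the error probability is placed in a single tail.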