biostats.one_way_ancova

biostats.one_way_ancova(data, variable, between, covariable)

Test whether the mean values of a variable differ among several groups after controlling for another variable (the covariable).

Parameters:
data : pandas.DataFrame

The input data. Must contain at least two numeric columns and one categorical column.

variable : str

The numeric variable whose group means are compared.

between : str

The categorical variable that specifies which group each sample belongs to. At most 20 groups.

covariable : str

Another numeric variable to control for.

Returns:
summary : pandas.DataFrame

The sample count in each group, together with the mean and standard deviation of both the variable and the covariable.

result : pandas.DataFrame

The sums of squares, degrees of freedom, F statistics, and p-values of the test.

See also

one_way_anova

Test whether the mean values are different between groups.

two_way_ancova

Test whether the mean values are different between groups classified in two ways, when another variable is controlled.

Examples

>>> import biostats as bs
>>> data = bs.dataset("one_way_ancova.csv")
>>> data
    Pulse Species  Temp
0    67.9      ex  20.8
1    65.1      ex  20.8
2    77.3      ex  24.0
3    78.7      ex  24.0
4    79.4      ex  24.0
5    80.4      ex  24.0
6    85.8      ex  26.2
7    86.6      ex  26.2
8    87.5      ex  26.2
9    89.1      ex  26.2
10   98.6      ex  28.4
11  100.8      ex  29.0
12   99.3      ex  30.4
13  101.7      ex  30.4
14   44.3     niv  17.2
15   47.2     niv  18.3
16   47.6     niv  18.3
17   49.6     niv  18.3
18   50.3     niv  18.9
19   51.8     niv  18.9
20   60.0     niv  20.4
21   58.5     niv  21.0
22   58.9     niv  21.0
23   60.7     niv  22.1
24   69.8     niv  23.5
25   70.9     niv  24.2
26   76.2     niv  25.9
27   76.1     niv  26.5
28   77.0     niv  26.5
29   77.7     niv  26.5
30   84.7     niv  28.6
31   74.3    fake  17.2
32   77.2    fake  18.3
33   77.6    fake  18.3
34   79.6    fake  18.3
35   80.3    fake  18.9
36   81.8    fake  18.9
37   90.0    fake  20.4
38   88.5    fake  21.0
39   88.9    fake  21.0
40   90.7    fake  22.1
41   99.8    fake  23.5
42  100.9    fake  24.2
43  106.2    fake  25.9
44  106.1    fake  26.5
45  107.0    fake  26.5
46  107.7    fake  26.5
47  114.7    fake  28.6

We want to test whether the mean values of Pulse differ among the three Species while controlling for Temp.

>>> summary, result = bs.one_way_ancova(data=data, variable="Pulse", between="Species", covariable="Temp")
>>> summary
  Species  Count  Mean (Pulse)  Std. (Pulse)  Mean (Temp)  Std. (Temp)
1      ex     14     85.585714      11.69930    25.757143     3.074639
2     niv     17     62.429412      12.95684    22.123529     3.659325
3    fake     17     92.429412      12.95684    22.123529     3.659325

The summary gives the sample count in each group, along with the mean and standard deviation of Pulse and Temp.
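The `ex` row of the summary can be checked by hand. The sketch below uses only Python's standard `statistics` module and the 14 `ex` rows copied from the dataset above. It assumes the reported standard deviation is the sample standard deviation (n - 1 denominator); the documentation does not state this, but the printed values are consistent with it.

```python
import statistics

# Pulse and Temp for the 14 "ex" rows, copied from the dataset above.
ex_pulse = [67.9, 65.1, 77.3, 78.7, 79.4, 80.4, 85.8,
            86.6, 87.5, 89.1, 98.6, 100.8, 99.3, 101.7]
ex_temp = [20.8, 20.8, 24.0, 24.0, 24.0, 24.0, 26.2,
           26.2, 26.2, 26.2, 28.4, 29.0, 30.4, 30.4]

count = len(ex_pulse)
mean_pulse = statistics.mean(ex_pulse)
std_pulse = statistics.stdev(ex_pulse)  # sample std (n - 1): an assumption
mean_temp = statistics.mean(ex_temp)
std_temp = statistics.stdev(ex_temp)    # sample std (n - 1): an assumption

print(count, mean_pulse, std_pulse, mean_temp, std_temp)
```

The same calculation applied to the `niv` and `fake` rows gives the other two rows of the summary.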

>>> result
           Sum Square  D.F.  F Statistic       p-value     
Species   7835.737962     2  1372.995165  2.252680e-40  ***
Temp      7025.952857     1  2462.205692  2.877499e-40  ***
Residual   125.554874    44          NaN           NaN  NaN

The p-value for Species is less than 0.05, so the mean values of Pulse differ among the three Species even after controlling for Temp.
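To see where the numbers in the result table come from, the sums of squares and F statistics can be recomputed in plain Python. This is a minimal sketch, not the library's implementation. It assumes a common-slope model (one intercept per Species, one shared slope for Temp) in which each term is adjusted for the other. The helper `centered_sums` and all variable names are illustrative and not part of biostats.

```python
# (Pulse, Temp) values per Species, copied from the dataset above.
niv_temp = [17.2, 18.3, 18.3, 18.3, 18.9, 18.9, 20.4, 21.0, 21.0,
            22.1, 23.5, 24.2, 25.9, 26.5, 26.5, 26.5, 28.6]
pulse = {
    "ex": [67.9, 65.1, 77.3, 78.7, 79.4, 80.4, 85.8,
           86.6, 87.5, 89.1, 98.6, 100.8, 99.3, 101.7],
    "niv": [44.3, 47.2, 47.6, 49.6, 50.3, 51.8, 60.0, 58.5, 58.9,
            60.7, 69.8, 70.9, 76.2, 76.1, 77.0, 77.7, 84.7],
    "fake": [74.3, 77.2, 77.6, 79.6, 80.3, 81.8, 90.0, 88.5, 88.9,
             90.7, 99.8, 100.9, 106.2, 106.1, 107.0, 107.7, 114.7],
}
temp = {
    "ex": [20.8, 20.8, 24.0, 24.0, 24.0, 24.0, 26.2,
           26.2, 26.2, 26.2, 28.4, 29.0, 30.4, 30.4],
    "niv": niv_temp,
    "fake": list(niv_temp),  # the "fake" rows list the same Temp values as "niv"
}


def centered_sums(x, y):
    """Return (Sxx, Sxy, Syy), sums of squares and products about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    return sxx, sxy, syy


# Pooled within-group sums: each group is centered on its own means.
within = [centered_sums(temp[g], pulse[g]) for g in pulse]
sxx_w = sum(w[0] for w in within)
sxy_w = sum(w[1] for w in within)
syy_w = sum(w[2] for w in within)

# Total sums: all samples centered on the grand means.
all_temp = [t for g in pulse for t in temp[g]]
all_pulse = [p for g in pulse for p in pulse[g]]
sxx_t, sxy_t, syy_t = centered_sums(all_temp, all_pulse)

n, k = len(all_pulse), len(pulse)

ss_temp = sxy_w ** 2 / sxx_w              # Temp, adjusted for Species
ss_residual = syy_w - ss_temp             # unexplained by Species + Temp
# Species, adjusted for Temp: extra residual when Species is dropped.
ss_species = (syy_t - sxy_t ** 2 / sxx_t) - ss_residual

df_species, df_temp, df_residual = k - 1, 1, n - k - 1
ms_residual = ss_residual / df_residual
f_species = (ss_species / df_species) / ms_residual
f_temp = (ss_temp / df_temp) / ms_residual

print(ss_species, ss_temp, ss_residual)
print(df_species, df_temp, df_residual)
print(f_species, f_temp)
```

Under these assumptions the printed values should match the Sum Square, D.F., and F Statistic columns of the result table up to rounding. The p-values are the upper-tail probabilities of the F distribution with the corresponding degrees of freedom, which could be obtained with, for example, `scipy.stats.f.sf` if SciPy is available.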