biostats.one_way_ancova
- biostats.one_way_ancova(data, variable, between, covariable)
Test whether the mean values of a variable differ between several groups after controlling for another variable (the covariable).
- Parameters:
- data
pandas.DataFrame
The input data. Must contain at least two numeric columns and one categorical column.
- variable
str
The numeric variable whose mean values are compared between groups.
- between
str
The categorical variable that specifies which group the samples belong to. Maximum 20 groups.
- covariable
str
Another numeric variable that we want to control for.
- Returns:
- summary
pandas.DataFrame
The number of samples in each group, and the mean values and standard deviations of both the variable and the covariable in each group.
- result
pandas.DataFrame
The sums of squares, degrees of freedom, F statistics, and p-values of the test.
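The quantities in `result` can be illustrated with a minimal pure-Python sketch for the case of one grouping factor and one covariate. This is not the biostats implementation: the function name `one_way_ancova_ss` and the toy data are hypothetical, and the source does not state which type of sums of squares biostats reports. The sketch adjusts each effect for the other by comparing residual sums of squares of nested models, and it omits the p-values.

```python
def one_way_ancova_ss(groups):
    """Hypothetical helper. `groups` maps a label to a list of (covariate, response) pairs."""

    def centered_sums(pairs):
        # Sums of squares and cross-products around the means of `pairs`.
        n = len(pairs)
        x_mean = sum(x for x, _ in pairs) / n
        y_mean = sum(y for _, y in pairs) / n
        sxx = sum((x - x_mean) ** 2 for x, _ in pairs)
        sxy = sum((x - x_mean) * (y - y_mean) for x, y in pairs)
        syy = sum((y - y_mean) ** 2 for _, y in pairs)
        return sxx, sxy, syy

    # Pooled within-group sums: deviations from each group's own means.
    within = [centered_sums(pairs) for pairs in groups.values()]
    sxx_w = sum(s[0] for s in within)
    sxy_w = sum(s[1] for s in within)
    syy_w = sum(s[2] for s in within)

    # Total sums: deviations from the grand means, ignoring the groups.
    all_pairs = [pair for pairs in groups.values() for pair in pairs]
    sxx_t, sxy_t, syy_t = centered_sums(all_pairs)

    ss_residual = syy_w - sxy_w ** 2 / sxx_w   # full model: group + covariate
    ss_no_group = syy_t - sxy_t ** 2 / sxx_t   # reduced model: covariate only
    ss_group = ss_no_group - ss_residual       # group effect, adjusted for covariate
    ss_covariate = sxy_w ** 2 / sxx_w          # covariate effect, adjusted for group

    df_group = len(groups) - 1
    df_residual = len(all_pairs) - len(groups) - 1
    ms_residual = ss_residual / df_residual
    return {
        "ss_group": ss_group,
        "ss_covariate": ss_covariate,
        "ss_residual": ss_residual,
        "df_group": df_group,
        "df_residual": df_residual,
        "f_group": (ss_group / df_group) / ms_residual,
        "f_covariate": ss_covariate / ms_residual,  # the covariate has 1 degree of freedom
    }


# Made-up toy data: two groups of three (covariate, response) pairs each.
toy = {"A": [(0, 1), (1, 2), (2, 4)], "B": [(0, 3), (1, 5), (2, 6)]}
toy_result = one_way_ancova_ss(toy)
print(toy_result)
```

The residual degrees of freedom are the number of samples minus the number of groups minus one for the covariate slope.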
See also
one_way_anova
Test whether the mean values are different between groups.
two_way_ancova
Test whether the mean values are different between groups classified in two ways, when another variable is controlled.
Examples
>>> import biostats as bs
>>> data = bs.dataset("one_way_ancova.csv")
>>> data
    Pulse Species  Temp
0    67.9      ex  20.8
1    65.1      ex  20.8
2    77.3      ex  24.0
3    78.7      ex  24.0
4    79.4      ex  24.0
5    80.4      ex  24.0
6    85.8      ex  26.2
7    86.6      ex  26.2
8    87.5      ex  26.2
9    89.1      ex  26.2
10   98.6      ex  28.4
11  100.8      ex  29.0
12   99.3      ex  30.4
13  101.7      ex  30.4
14   44.3     niv  17.2
15   47.2     niv  18.3
16   47.6     niv  18.3
17   49.6     niv  18.3
18   50.3     niv  18.9
19   51.8     niv  18.9
20   60.0     niv  20.4
21   58.5     niv  21.0
22   58.9     niv  21.0
23   60.7     niv  22.1
24   69.8     niv  23.5
25   70.9     niv  24.2
26   76.2     niv  25.9
27   76.1     niv  26.5
28   77.0     niv  26.5
29   77.7     niv  26.5
30   84.7     niv  28.6
31   74.3    fake  17.2
32   77.2    fake  18.3
33   77.6    fake  18.3
34   79.6    fake  18.3
35   80.3    fake  18.9
36   81.8    fake  18.9
37   90.0    fake  20.4
38   88.5    fake  21.0
39   88.9    fake  21.0
40   90.7    fake  22.1
41   99.8    fake  23.5
42  100.9    fake  24.2
43  106.2    fake  25.9
44  106.1    fake  26.5
45  107.0    fake  26.5
46  107.7    fake  26.5
47  114.7    fake  28.6
We want to test whether the mean values of Pulse are different between the three Species, while controlling for Temp.
>>> summary, result = bs.one_way_ancova(data=data, variable="Pulse", between="Species", covariable="Temp")
>>> summary
   Species  Count  Mean (Pulse)  Std. (Pulse)  Mean (Temp)  Std. (Temp)
1       ex     14     85.585714      11.69930    25.757143     3.074639
2      niv     17     62.429412      12.95684    22.123529     3.659325
3     fake     17     92.429412      12.95684    22.123529     3.659325
The mean values of Pulse and Temp in each group are given.
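As a cross-check, the "ex" row of the summary can be reproduced with the standard library alone. This is a sketch for illustration, not how biostats computes the table; the variable names are hypothetical and the values are copied from the dataset above. The Std. column for this row is consistent with the sample standard deviation (n - 1 denominator), which is what `statistics.stdev` computes.

```python
from statistics import mean, stdev

# Pulse and Temp values of the "ex" group, copied from the dataset above.
pulse_ex = [67.9, 65.1, 77.3, 78.7, 79.4, 80.4, 85.8,
            86.6, 87.5, 89.1, 98.6, 100.8, 99.3, 101.7]
temp_ex = [20.8, 20.8, 24.0, 24.0, 24.0, 24.0, 26.2,
           26.2, 26.2, 26.2, 28.4, 29.0, 30.4, 30.4]

count_ex = len(pulse_ex)
mean_pulse_ex = mean(pulse_ex)
std_pulse_ex = stdev(pulse_ex)  # sample standard deviation (n - 1 denominator)
mean_temp_ex = mean(temp_ex)

print(count_ex, round(mean_pulse_ex, 6), round(std_pulse_ex, 4), round(mean_temp_ex, 6))
```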
>>> result
           Sum Square  D.F.  F Statistic       p-value
Species   7835.737962     2  1372.995165  2.252680e-40  ***
Temp      7025.952857     1  2462.205692  2.877499e-40  ***
Residual   125.554874    44          NaN           NaN  NaN
The p-value for Species is less than 0.05, so the mean values of Pulse differ significantly between the three Species, even after controlling for Temp.
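The F statistics in the table can be checked by hand: each one is the effect's mean square (sum of squares divided by its degrees of freedom) divided by the residual mean square. The short sketch below redoes that arithmetic with the values copied from the table; the variable names are hypothetical and the p-values are not recomputed.

```python
# Values copied from the result table above.
ss_species, df_species = 7835.737962, 2
ss_temp, df_temp = 7025.952857, 1
ss_residual, df_residual = 125.554874, 44

# Residual degrees of freedom: 48 samples - 3 groups - 1 covariate slope.
n_samples, n_groups, n_covariates = 48, 3, 1
assert df_residual == n_samples - n_groups - n_covariates

ms_residual = ss_residual / df_residual
f_species = (ss_species / df_species) / ms_residual
f_temp = (ss_temp / df_temp) / ms_residual

print(round(f_species, 2), round(f_temp, 2))
```

Both ratios agree with the F Statistic column up to rounding of the printed sums of squares.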