biostats.two_way_ancova
- biostats.two_way_ancova(data, variable, between_1, between_2, covariable)
Test whether the mean values of a variable differ among groups classified in two ways, while controlling for another variable.
- Parameters:
- data
pandas.DataFrame
The input data. Must contain at least two numeric columns and two categorical columns.
- variable
str
The numeric variable whose mean values are compared.
- between_1
str
The first categorical variable that specifies the groups of the samples. Maximum 20 groups.
- between_2
str
The second categorical variable that specifies the groups of the samples. Maximum 20 groups.
- covariable
str
A second numeric variable (the covariate) to control for.
- Returns:
- summary
pandas.DataFrame
The count, together with the mean and standard deviation of the variable and of the covariable, in each combination of groups.
- result
pandas.DataFrame
The sums of squares, degrees of freedom, F statistics, and p-values of the test.
See also
two_way_anova
Test whether the mean values are different between several groups classified in two ways.
one_way_ancova
Test whether the mean values are different between groups, when another variable is controlled.
Examples
>>> import biostats as bs
>>> data = bs.dataset("two_way_ancova.csv")
>>> data
    Activity     Sex  Genotype  Age
0      1.884    male        ff   69
1      2.283    male        ff   51
2      2.396    male        fs   75
3      2.838  female        ff   68
4      2.956    male        fs   29
5      4.216  female        ff   28
6      3.620  female        ss   56
7      2.889  female        ff   38
8      3.550  female        fs   32
9      3.105    male        fs   61
10     4.556  female        fs   20
11     3.087  female        fs   57
12     4.939    male        ff   71
13     3.486    male        ff   21
14     3.079  female        ss   43
15     2.649    male        fs   62
16     1.943  female        fs   54
17     4.198  female        ff   45
18     2.473  female        ff   27
19     2.033  female        ff   66
20     2.200  female        fs   74
21     2.157  female        fs   19
22     2.801    male        ss   20
23     3.421    male        ss   75
24     1.811  female        ff   68
25     4.281  female        fs   25
26     4.772  female        fs   38
27     3.586  female        ss   18
28     3.944  female        ff   49
29     2.669  female        ss   18
30     3.050  female        ss   34
31     4.275    male        ss   49
32     2.963  female        ss   42
33     3.236  female        ss   25
34     3.673  female        ss   55
35     3.110    male        ss   73
We want to test whether the mean values of Activity differ between male and female, and among ff, fs and ss, while controlling for Age.
>>> summary, result = bs.two_way_ancova(data=data, variable="Activity", between_1="Sex", between_2="Genotype", covariable="Age")
>>> summary
      Sex  Genotype  Count  Mean (Activity)  Std. (Activity)  Mean (Age)  Std. (Age)
1    male        ff      4          3.14800         1.374512      53.000   23.151674
2    male        fs      4          2.77650         0.316843      56.750   19.568257
3    male        ss      4          3.40175         0.634811      54.250   25.708300
4  female        ff      8          3.05025         0.959903      48.625   17.204132
5  female        fs      8          3.31825         1.144539      39.875   19.910066
6  female        ss      8          3.23450         0.361775      36.375   15.202796
The counts, mean values and standard deviations of Activity and Age in each combination of groups are given.
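As a sanity check, the first row of the summary (male, ff) can be recomputed by hand from the four matching rows of the example data. The sketch below uses only Python's standard library and assumes, as the printed values suggest, that the reported standard deviation is the sample standard deviation (n - 1 denominator). The variable names are illustrative and not part of biostats.

```python
from statistics import mean, stdev

# Rows 0, 1, 12 and 13 of the example data: Sex == "male", Genotype == "ff".
activity = [1.884, 2.283, 4.939, 3.486]
age = [69, 51, 71, 21]

count = len(activity)
mean_activity = mean(activity)
std_activity = stdev(activity)  # sample standard deviation (n - 1 denominator)
mean_age = mean(age)
std_age = stdev(age)

print(count, round(mean_activity, 5), round(std_activity, 6))
print(round(mean_age, 3), round(std_age, 6))
```

The printed values should agree with the Count, Mean and Std. columns of the male / ff row up to rounding.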
>>> result
                Sum Square  D.F.  F Statistic   p-value
Sex               0.018057     1     0.023349  0.879612  <NA>
Genotype          0.113591     2     0.073441  0.929363  <NA>
Sex : Genotype    0.727884     2     0.470606  0.629311  <NA>
Age               1.286714     1     1.663822  0.207280  <NA>
Residual         22.427109    29          NaN       NaN  <NA>
After controlling for Age, the p-value of Sex is greater than 0.05, so there is no significant difference in Activity between the two sexes. The p-value of Genotype is greater than 0.05, so there is no significant difference in Activity among the three genotypes. The p-value of the interaction is greater than 0.05, so there is no significant interaction between Sex and Genotype.
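The F statistics in the result table are consistent with the usual ANOVA relationship, F = (sum of squares / degrees of freedom) of the effect divided by the same ratio for the residual. The sketch below recomputes them from the printed sums of squares and degrees of freedom in plain Python. It is an illustrative check rather than the library's implementation, and the dictionary layout is an assumption made for this example.

```python
# Sums of squares and degrees of freedom copied from the result table.
effects = {
    "Sex": (0.018057, 1),
    "Genotype": (0.113591, 2),
    "Sex : Genotype": (0.727884, 2),
    "Age": (1.286714, 1),
}
ss_residual, df_residual = 22.427109, 29

# Residual mean square: the denominator shared by every F statistic.
ms_residual = ss_residual / df_residual

# F = (effect mean square) / (residual mean square)
f_statistics = {
    name: (ss / df) / ms_residual
    for name, (ss, df) in effects.items()
}

for name, f in f_statistics.items():
    print(f"{name:<15} F = {f:.6f}")
```

Because the inputs are the rounded values shown in the table, the recomputed F statistics should match the F Statistic column only up to rounding.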