biostats.two_way_anova

biostats.two_way_anova(data, variable, between_1, between_2)

Test whether the mean value of a variable differs between groups when the groups are classified by two factors, and whether the two factors interact.

Parameters:
data : pandas.DataFrame

The input data. Must contain at least one numeric column and two categorical columns.

variable : str

The name of the numeric column whose mean values are compared.

between_1 : str

The name of the first categorical column that defines the groups of the samples. Maximum 20 groups.

between_2 : str

The name of the second categorical column that defines the groups of the samples. Maximum 20 groups.

Returns:
summary : pandas.DataFrame

The count, mean, standard deviation, and confidence interval for each combination of groups.

result : pandas.DataFrame

The degrees of freedom, sums of squares, mean squares, F statistics, and p-values of the test.

See also

two_way_ancova

Two-way ANOVA with another variable being controlled.

one_way_anova

Test whether the mean values are different between groups.

Examples

>>> import biostats as bs
>>> data = bs.dataset("two_way_anova.csv")
>>> data
    Activity     Sex Genotype
0      1.884    male       ff
1      2.283    male       ff
2      2.396    male       fs
3      2.838  female       ff
4      2.956    male       fs
5      4.216  female       ff
6      3.620  female       ss
7      2.889  female       ff
8      3.550  female       fs
9      3.105    male       fs
10     4.556  female       fs
11     3.087  female       fs
12     4.939    male       ff
13     3.486    male       ff
14     3.079  female       ss
15     2.649    male       fs
16     1.943  female       fs
17     4.198  female       ff
18     2.473  female       ff
19     2.033  female       ff
20     2.200  female       fs
21     2.157  female       fs
22     2.801    male       ss
23     3.421    male       ss
24     1.811  female       ff
25     4.281  female       fs
26     4.772  female       fs
27     3.586  female       ss
28     3.944  female       ff
29     2.669  female       ss
30     3.050  female       ss
31     4.275    male       ss
32     2.963  female       ss
33     3.236  female       ss
34     3.673  female       ss
35     3.110    male       ss

We want to test whether the mean value of Activity differs between male and female, and between the ff, fs and ss genotypes. We also want to test whether there is an interaction between Sex and Genotype.

>>> summary, result = bs.two_way_anova(data=data, variable="Activity", between_1="Sex", between_2="Genotype")
>>> summary
      Sex Genotype  Count     Mean  Std. Deviation  95% CI: Lower  95% CI: Upper
1    male       ff      4  3.14800        1.374512       0.960845       5.335155
2    male       fs      4  2.77650        0.316843       2.272332       3.280668
3    male       ss      4  3.40175        0.634811       2.391624       4.411876
4  female       ff      8  3.05025        0.959903       2.247751       3.852749
5  female       fs      8  3.31825        1.144539       2.361392       4.275108
6  female       ss      8  3.23450        0.361775       2.932048       3.536952

The mean values and 95% confidence intervals of each combination of Sex and Genotype are given.
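As a cross-check, the summary table can be rebuilt with pandas and SciPy. This is a minimal sketch, not the biostats implementation: it assumes the confidence intervals are t-based intervals for the mean with Count - 1 degrees of freedom, which is consistent with the values printed above. The data are retyped from the listing above, and the names `csv_text`, `check` and `half_width` are illustrative only.

```python
import io

import numpy as np
import pandas as pd
from scipy import stats

# The 36 observations shown above, retyped as CSV so the sketch is self-contained.
csv_text = """Activity,Sex,Genotype
1.884,male,ff
2.283,male,ff
2.396,male,fs
2.838,female,ff
2.956,male,fs
4.216,female,ff
3.620,female,ss
2.889,female,ff
3.550,female,fs
3.105,male,fs
4.556,female,fs
3.087,female,fs
4.939,male,ff
3.486,male,ff
3.079,female,ss
2.649,male,fs
1.943,female,fs
4.198,female,ff
2.473,female,ff
2.033,female,ff
2.200,female,fs
2.157,female,fs
2.801,male,ss
3.421,male,ss
1.811,female,ff
4.281,female,fs
4.772,female,fs
3.586,female,ss
3.944,female,ff
2.669,female,ss
3.050,female,ss
4.275,male,ss
2.963,female,ss
3.236,female,ss
3.673,female,ss
3.110,male,ss
"""
data = pd.read_csv(io.StringIO(csv_text))

# Count, mean and sample standard deviation (ddof=1) per Sex x Genotype cell.
check = (
    data.groupby(["Sex", "Genotype"])["Activity"]
    .agg(Count="count", Mean="mean", SD="std")
    .reset_index()
)

# Assumed interval: mean +/- t(0.975, Count - 1) * SD / sqrt(Count).
t_crit = stats.t.ppf(0.975, check["Count"] - 1)
half_width = t_crit * check["SD"] / np.sqrt(check["Count"])
check["Lower"] = check["Mean"] - half_width
check["Upper"] = check["Mean"] + half_width

print(check)
```

Note that `groupby` sorts the group labels alphabetically, so the rows come out in a different order from the summary shown above.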

>>> result
                D.F.  Sum Square  Mean Square  F Statistic   p-value      
Sex                1    0.068080     0.068080     0.086128  0.771180  <NA>
Genotype           2    0.277240     0.138620     0.175366  0.840004  <NA>
Sex : Genotype     2    0.814641     0.407321     0.515295  0.602515  <NA>
Residual          30   23.713823     0.790461          NaN       NaN  <NA>

The p-value for Sex is greater than 0.05, so there is no significant difference in mean Activity between the two sexes. The p-value for Genotype is greater than 0.05, so there is no significant difference in mean Activity between the three genotypes. The p-value for the interaction is greater than 0.05, so there is no significant interaction between Sex and Genotype.
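The columns of the result table are related in the usual ANOVA way: each mean square is the sum of squares divided by its degrees of freedom, each F statistic is the term's mean square divided by the residual mean square, and the p-value is the upper tail of the F distribution. The sketch below recomputes those columns from the degrees of freedom and sums of squares printed above. It does not reproduce how biostats derives the sums of squares themselves, and the names `terms` and `rows` are illustrative.

```python
from scipy import stats

# Degrees of freedom and sums of squares copied from the result table above.
terms = {
    "Sex": (1, 0.068080),
    "Genotype": (2, 0.277240),
    "Sex : Genotype": (2, 0.814641),
}
df_resid, ss_resid = 30, 23.713823

# Residual mean square: the denominator of every F statistic.
ms_resid = ss_resid / df_resid

rows = {}
for term, (df, ss) in terms.items():
    ms = ss / df                                # mean square
    f_stat = ms / ms_resid                      # F statistic
    p_value = stats.f.sf(f_stat, df, df_resid)  # upper-tail probability
    rows[term] = (ms, f_stat, p_value)
    print(f"{term:<15} MS={ms:.6f}  F={f_stat:.6f}  p={p_value:.6f}")
```

The residual degrees of freedom follow from the design: 36 observations minus 6 Sex by Genotype cells gives 30.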