biostats.one_way_anova
- biostats.one_way_anova(data, variable, between)
Test whether the mean values of a variable are different between several groups.
- Parameters:
- data
pandas.DataFrame
The input data. Must contain at least one numeric column and one categorical column.
- variable
str
The numeric variable whose mean values are compared between groups.
- between
str
The categorical variable that specifies which group the samples belong to. Maximum 20 groups.
- Returns:
- summary
pandas.DataFrame
The counts, mean values, standard deviations, and confidence intervals of each group.
- result
pandas.DataFrame
The degrees of freedom, sum of squares, mean square, F statistic, and p-value of the test.
See also
one_way_ancova
Test whether the mean values are different between groups, when another variable is controlled.
two_way_anova
Test whether the mean values are different between groups, when classified in two ways.
kruskal_wallis_test
The non-parametric version of one-way ANOVA.
Examples
>>> import biostats as bs
>>> data = bs.dataset("one_way_anova.csv")
>>> data
    Length    Location
0   0.0571   Tillamook
1   0.0813   Tillamook
2   0.0831   Tillamook
3   0.0976   Tillamook
4   0.0817   Tillamook
5   0.0859   Tillamook
6   0.0735   Tillamook
7   0.0659   Tillamook
8   0.0923   Tillamook
9   0.0836   Tillamook
10  0.0873     Newport
11  0.0662     Newport
12  0.0672     Newport
13  0.0819     Newport
14  0.0749     Newport
15  0.0649     Newport
16  0.0835     Newport
17  0.0725     Newport
18  0.0974  Petersburg
19  0.1352  Petersburg
20  0.0817  Petersburg
21  0.1016  Petersburg
22  0.0968  Petersburg
23  0.1064  Petersburg
24  0.1050  Petersburg
25  0.1033     Magadan
26  0.0915     Magadan
27  0.0781     Magadan
28  0.0685     Magadan
29  0.0677     Magadan
30  0.0697     Magadan
31  0.0764     Magadan
32  0.0689     Magadan
33  0.0703   Tvarminne
34  0.1026   Tvarminne
35  0.0956   Tvarminne
36  0.0973   Tvarminne
37  0.1039   Tvarminne
38  0.1045   Tvarminne
We want to test whether the mean values of Length in each Location are different.
>>> summary, result = bs.one_way_anova(data=data, variable="Length", between="Location")
>>> summary
     Location  Count      Mean  Std. Deviation  95% CI: Lower  95% CI: Upper
1   Tillamook     10  0.080200        0.011963       0.071642       0.088758
2     Newport      8  0.074800        0.008597       0.067613       0.081987
3  Petersburg      7  0.103443        0.016209       0.088452       0.118434
4     Magadan      8  0.078012        0.012945       0.067190       0.088835
5   Tvarminne      6  0.095700        0.012962       0.082098       0.109302
The mean values of Length and their 95% confidence intervals in each group are given.
>>> result
          D.F.  Sum Square  Mean Square  F Statistic   p-value
Location     4    0.004520     0.001130     7.121019  0.000281  ***
Residual    34    0.005395     0.000159          NaN       NaN  NaN
The p-value is less than 0.001, so we reject the null hypothesis that all groups have the same mean Length: at least one Location has a mean Length that differs significantly from the others.
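To make the columns of the result table less opaque, the sketch below recomputes the degrees of freedom, sums of squares, mean squares, and F statistic by hand from the example data, using only the Python standard library. It illustrates the textbook one-way ANOVA decomposition and is not taken from the biostats source; the helper name `anova_table` and the dictionary keys are hypothetical. The p-value is left out because it requires the F distribution, which the standard library does not provide.

```python
from statistics import fmean

# Length measurements copied from the example dataset above, keyed by Location.
groups = {
    "Tillamook": [0.0571, 0.0813, 0.0831, 0.0976, 0.0817,
                  0.0859, 0.0735, 0.0659, 0.0923, 0.0836],
    "Newport": [0.0873, 0.0662, 0.0672, 0.0819,
                0.0749, 0.0649, 0.0835, 0.0725],
    "Petersburg": [0.0974, 0.1352, 0.0817, 0.1016,
                   0.0968, 0.1064, 0.1050],
    "Magadan": [0.1033, 0.0915, 0.0781, 0.0685,
                0.0677, 0.0697, 0.0764, 0.0689],
    "Tvarminne": [0.0703, 0.1026, 0.0956, 0.0973, 0.1039, 0.1045],
}


def anova_table(groups):
    """Hypothetical helper: textbook one-way ANOVA decomposition."""
    values = [x for g in groups.values() for x in g]
    grand_mean = fmean(values)

    # Between-group sum of squares: spread of group means around the grand mean,
    # weighted by group size.
    ss_between = sum(
        len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Within-group (residual) sum of squares: spread of samples around
    # their own group mean.
    ss_within = 0.0
    for g in groups.values():
        group_mean = fmean(g)
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_between = len(groups) - 1           # number of groups minus one
    df_within = len(values) - len(groups)  # samples minus number of groups
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "df_between": df_between,
        "df_within": df_within,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


table = anova_table(groups)
for name, value in table.items():
    print(f"{name:>11}: {value:.6f}")
```

The printed values should agree, up to rounding, with the D.F., Sum Square, Mean Square, and F Statistic columns of `result`: the Location row corresponds to the between-group quantities and the Residual row to the within-group ones.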