biostats.multivariate_anova
- biostats.multivariate_anova(data, variable, between)
Test whether the mean values of several numeric variables differ among groups.
- Parameters:
- data
pandas.DataFrame
The input data. Must contain at least two numeric columns and one categorical column.
- variable
list
The list of numeric variables whose mean values are to be compared.
- between
str
The categorical variable that specifies which group the samples belong to. Maximum 20 groups.
- Returns:
- summary
pandas.DataFrame
The mean values and standard deviations of each numeric variable in each group.
- result
pandas.DataFrame
The degrees of freedom, Pillai's trace, F statistic, and p-value of the test.
See also
one_way_anova
Test whether the mean values of a variable are different between groups.
Examples
>>> import biostats as bs
>>> data = bs.dataset("multivariate_anova.csv")
>>> data
     sepal_length  sepal_width    species
0             5.1          3.5     setosa
1             4.9          3.0     setosa
2             4.7          3.2     setosa
3             4.6          3.1     setosa
4             5.0          3.6     setosa
..            ...          ...        ...
145           6.7          3.0  virginica
146           6.3          2.5  virginica
147           6.5          3.0  virginica
148           6.2          3.4  virginica
149           5.9          3.0  virginica
We want to test whether the mean values of sepal_length and sepal_width differ among species.
>>> summary, result = bs.multivariate_anova(data=data, variable=["sepal_length", "sepal_width"], between="species")
>>> summary
      species  Mean (sepal_length)  Std. (sepal_length)  Mean (sepal_width)  Std. (sepal_width)
1      setosa                5.006             0.352490               3.428            0.379064
2  versicolor                5.936             0.516171               2.770            0.313798
3   virginica                6.588             0.635880               2.974            0.322497
The summary gives the mean and standard deviation of sepal_length and sepal_width for each species.
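A comparable per-group table can be built with plain pandas. The following is a minimal sketch, not the library's implementation: the helper name `group_summary` is hypothetical, the toy data are invented for illustration, and the column layout (a two-level column index) differs from the summary shown above. It assumes the sample standard deviation (pandas' default, `ddof=1`).

```python
import pandas as pd


def group_summary(data, variable, between):
    """Mean and sample standard deviation of each variable per group.

    Hypothetical helper for illustration only; not part of biostats.
    """
    # Selecting a list of columns keeps a DataFrame-style groupby, so the
    # result has one (variable, statistic) column per combination.
    return data.groupby(between)[variable].agg(["mean", "std"])


# Toy data, invented for this sketch.
toy = pd.DataFrame(
    {
        "group": ["a", "a", "b", "b"],
        "x": [1.0, 3.0, 2.0, 6.0],
        "y": [0.5, 1.5, 4.0, 8.0],
    }
)
toy_summary = group_summary(toy, ["x", "y"], "group")
print(toy_summary)
```

Individual cells can then be read with a tuple key, for example `toy_summary.loc["a", ("x", "mean")]`.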
>>> result
         D.F.  Pillai's Trace  F Statistic       p-value
species     2        0.945314     65.87798  9.902977e-40  ***
The p-value is less than 0.001, so the mean values of sepal_length and sepal_width differ significantly among the species.
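To make the reported statistics more concrete, here is a minimal NumPy sketch of how Pillai's trace and its standard F approximation are usually defined. It is an illustration under stated assumptions, not necessarily how biostats computes them: the function name `pillai_trace` is hypothetical and the toy data are invented. With H the between-group and E the within-group sum-of-squares-and-cross-products matrix, Pillai's trace is V = trace(H (H + E)^-1).

```python
import numpy as np


def pillai_trace(X, labels):
    """Return (V, F, df1, df2) for a one-way MANOVA.

    Hypothetical helper for illustration only; not part of biostats.
    X is an (observations x variables) array, labels gives each row's group.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n_obs, p = X.shape
    groups = np.unique(labels)
    k = len(groups)

    grand_mean = X.mean(axis=0)
    H = np.zeros((p, p))  # between-group SSCP matrix
    E = np.zeros((p, p))  # within-group SSCP matrix
    for g in groups:
        Xg = X[labels == g]
        group_mean = Xg.mean(axis=0)
        d = (group_mean - grand_mean)[:, None]
        H += len(Xg) * (d @ d.T)
        R = Xg - group_mean
        E += R.T @ R

    V = np.trace(H @ np.linalg.inv(H + E))

    # Standard F approximation for Pillai's trace.
    q = k - 1                        # hypothesis degrees of freedom
    s = min(p, q)
    m = (abs(p - q) - 1) / 2
    n = (n_obs - k - p - 1) / 2
    F = (2 * n + s + 1) / (2 * m + s + 1) * V / (s - V)
    df1 = s * (2 * m + s + 1)
    df2 = s * (2 * n + s + 1)
    return V, F, df1, df2


# Toy data, invented for this sketch: two variables, three groups.
X_toy = [[1.0, 2.0], [2.0, 1.5], [1.5, 2.5],
         [4.0, 3.0], [5.0, 3.5], [4.5, 2.5],
         [7.0, 1.0], [8.0, 1.5], [7.5, 0.5]]
labels_toy = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
print(pillai_trace(X_toy, labels_toy))
```

As a consistency check against the table above: with 2 variables, 3 groups, and 150 observations, s = 2, m = -0.5, and n = 72, so F = (147 / 2) × 0.945314 / (2 − 0.945314) ≈ 65.88, which agrees with the reported F statistic.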