biostats.multivariate_anova

biostats.multivariate_anova(data, variable, between)

Test whether the mean values of several variables differ between groups.

Parameters:
data : pandas.DataFrame

The input data. Must contain at least two numeric columns and one categorical column.

variable : list

The names of the numeric columns whose mean values are compared.

between : str

The categorical variable that specifies which group the samples belong to. Maximum 20 groups.

Returns:
summary : pandas.DataFrame

The mean values and standard deviations of each numeric variable in each group.

result : pandas.DataFrame

The degrees of freedom, Pillai’s Trace, F statistic, and p-value of the test.

See also

one_way_anova

Test whether the mean values of a variable are different between groups.

Examples

>>> import biostats as bs
>>> data = bs.dataset("multivariate_anova.csv")
>>> data
     sepal_length  sepal_width    species
0             5.1          3.5     setosa
1             4.9          3.0     setosa
2             4.7          3.2     setosa
3             4.6          3.1     setosa
4             5.0          3.6     setosa
..            ...          ...        ...
145           6.7          3.0  virginica
146           6.3          2.5  virginica
147           6.5          3.0  virginica
148           6.2          3.4  virginica
149           5.9          3.0  virginica

We want to test whether the mean values of sepal_length and sepal_width differ between species.

>>> summary, result = bs.multivariate_anova(data=data, variable=["sepal_length", "sepal_width"], between="species")
>>> summary
      species  Mean (sepal_length)  Std. (sepal_length)  Mean (sepal_width)  Std. (sepal_width)
1      setosa                5.006             0.352490               3.428            0.379064
2  versicolor                5.936             0.516171               2.770            0.313798
3   virginica                6.588             0.635880               2.974            0.322497

The mean values and standard deviations of sepal_length and sepal_width in each species are given.
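
A table of this shape can be approximated with a plain pandas group-by, which may help when checking the numbers by hand. The sketch below is not the biostats implementation: group_summary is a hypothetical helper, the data frame is made up for illustration, and it assumes that Std. is the sample standard deviation (ddof=1, the pandas default).

```python
import pandas as pd


def group_summary(data, variable, between):
    """Hypothetical helper: per-group mean and sample std of each variable."""
    stats = data.groupby(between)[variable].agg(["mean", "std"])
    # Flatten the (variable, statistic) column MultiIndex into single labels.
    labels = {"mean": "Mean", "std": "Std."}
    stats.columns = [f"{labels[stat]} ({var})" for var, stat in stats.columns]
    return stats.reset_index()


# Made-up toy data, for illustration only (not the iris measurements).
toy = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "x": [1.0, 2.0, 3.0, 4.0, 6.0, 8.0],
    "y": [2.0, 2.0, 5.0, 1.0, 1.0, 4.0],
})
toy_summary = group_summary(toy, variable=["x", "y"], between="group")
print(toy_summary)
```

The row index here starts at 0, whereas the summary returned by multivariate_anova starts at 1.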

>>> result
         D.F.  Pillai's Trace  F Statistic       p-value     
species     2        0.945314     65.87798  9.902977e-40  ***

The p-value is below 0.001, so the mean values of sepal_length and sepal_width differ significantly between species.
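
To see how the F statistic relates to Pillai's Trace, the following minimal sketch applies the standard F approximation for Pillai's trace in a one-way design. It is an assumption that biostats uses this approximation internally, although it does reproduce the F statistic reported above. pillai_f_approx is a hypothetical helper, not part of the library. The inputs are read off the example: 2 variables, 3 species, and 150 rows.

```python
def pillai_f_approx(pillai, n_vars, n_groups, n_obs):
    """Hypothetical helper: standard F approximation for Pillai's trace."""
    q = n_groups - 1                    # hypothesis degrees of freedom
    s = min(n_vars, q)
    m = (abs(n_vars - q) - 1) / 2
    n = (n_obs - n_groups - n_vars - 1) / 2
    df1 = s * (2 * m + s + 1)           # numerator degrees of freedom
    df2 = s * (2 * n + s + 1)           # denominator degrees of freedom
    f_stat = (df2 / df1) * pillai / (s - pillai)
    return f_stat, df1, df2


# Values taken from the example output above.
f_stat, df1, df2 = pillai_f_approx(0.945314, n_vars=2, n_groups=3, n_obs=150)
print(round(f_stat, 3), df1, df2)  # F is about 65.878 on (4, 294) degrees of freedom
```

The D.F. column in the result (2) corresponds to q, the number of groups minus one; the p-value would then follow from the upper tail of the F distribution with df1 and df2 degrees of freedom.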