biostats.repeated_measures_anova

biostats.repeated_measures_anova(data, variable, between, subject)

Test whether the mean values of a variable differ between several groups when the same subjects are measured repeatedly, once in each group.

Parameters:
data : pandas.DataFrame

The input data. Must contain at least one numeric column and one categorical column, as well as a column specifying the subjects.

variable : str

The numeric variable whose mean values are compared between groups.

between : str

The categorical variable that specifies which group the samples belong to. Maximum 20 groups.

subject : str

The variable that specifies the subject ID. Samples measured on the same subject should have the same ID. Maximum 2000 subjects.

Returns:
summary : pandas.DataFrame

The counts, mean values, standard deviations, and confidence intervals of each group.

result : pandas.DataFrame

The degrees of freedom, sum of squares, mean square, F statistic, and p-value of the test.

See also

one_way_anova

Test whether the mean values of a variable are different between groups.

friedman_test

The non-parametric version of repeated measures ANOVA.

Examples

>>> import biostats as bs
>>> data = bs.dataset("repeated_measures_anova.csv")
>>> data
    response drug  patient
0         30    A        1
1         28    B        1
2         16    C        1
3         34    D        1
4         14    A        2
5         18    B        2
6         10    C        2
7         22    D        2
8         24    A        3
9         20    B        3
10        18    C        3
11        30    D        3
12        38    A        4
13        34    B        4
14        20    C        4
15        44    D        4
16        26    A        5
17        28    B        5
18        14    C        5
19        30    D        5

We want to test whether the mean values of response differ between the four drugs, when each drug is measured on the same five patients.

>>> summary, result = bs.repeated_measures_anova(data=data, variable="response", between="drug", subject="patient")
>>> summary
  drug  Count  Mean  Std. Deviation  95% CI: Lower  95% CI: Upper
1    A      5  26.4        8.763561      15.518602      37.281398
2    B      5  25.6        6.542171      17.476822      33.723178
3    C      5  15.6        3.847077      10.823223      20.376777
4    D      5  32.0        8.000000      22.066688      41.933312

The mean values of response and their 95% confidence intervals in each group are given.
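As a rough cross-check, the interval for a single group can be recomputed by hand. The sketch below assumes the table reports a two-sided t-based interval, mean ± t(0.975, n-1) · s/√n; this assumption reproduces the printed values for drug A but is not stated by the source. The critical value for 4 degrees of freedom is hard-coded, and the variable names are illustrative only.

```python
import math
import statistics

# Responses for drug A, one per patient (taken from the example data above).
drug_a = [30, 14, 24, 38, 26]

n = len(drug_a)
mean = statistics.mean(drug_a)
sd = statistics.stdev(drug_a)  # sample standard deviation (n - 1 denominator)

# Assumption: a two-sided 95% t interval. 2.776445 is the 97.5th percentile
# of Student's t distribution with n - 1 = 4 degrees of freedom.
t_crit = 2.776445
margin = t_crit * sd / math.sqrt(n)

lower, upper = mean - margin, mean + margin
print(round(mean, 1), round(sd, 6), round(lower, 4), round(upper, 4))
```

Note that these intervals describe each group separately and ignore the pairing by patient, so overlapping intervals here do not contradict a significant repeated measures test.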

>>> result
          D.F.  Sum Square  Mean Square  F Statistic  p-value     
drug         3       698.2   232.733333    24.758865  0.00002  ***
Residual    12       112.8     9.400000          NaN      NaN  NaN

The p-value is less than 0.001, so the mean values of response differ significantly between the drug groups.
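The degrees of freedom, sums of squares, and F statistic in the result table can be reproduced from the raw data with the usual one-way repeated measures decomposition, in which the variation between patients is removed from the error term. This is a minimal sketch in plain Python, not the library's implementation; names such as `rows`, `ss_drug`, and `ss_patient` are illustrative. The p-value is omitted because it needs an F distribution function that the standard library does not provide.

```python
from collections import defaultdict

# Responses from the example data set, in row order (drugs A-D per patient).
responses = [30, 28, 16, 34, 14, 18, 10, 22, 24, 20,
             18, 30, 38, 34, 20, 44, 26, 28, 14, 30]
drugs = "ABCD"
# Rebuild (response, drug, patient) rows: four consecutive rows per patient.
rows = [(y, drugs[i % 4], i // 4 + 1) for i, y in enumerate(responses)]

by_drug, by_patient = defaultdict(list), defaultdict(list)
for y, drug, patient in rows:
    by_drug[drug].append(y)
    by_patient[patient].append(y)


def mean(values):
    return sum(values) / len(values)


grand_mean = mean(responses)

# Sums of squares between drugs, between patients, and in total.
ss_drug = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in by_drug.values())
ss_patient = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in by_patient.values())
ss_total = sum((y - grand_mean) ** 2 for y in responses)

# Residual: what remains after removing the drug and patient effects.
ss_resid = ss_total - ss_drug - ss_patient

df_drug = len(by_drug) - 1
df_resid = df_drug * (len(by_patient) - 1)

f_stat = (ss_drug / df_drug) / (ss_resid / df_resid)
print(df_drug, df_resid, round(ss_drug, 1), round(ss_resid, 1), round(f_stat, 4))
```

In an ordinary one-way ANOVA the patient sum of squares would stay in the error term, which is why the repeated measures test can detect differences that the unpaired test would miss on the same data.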