biostats.repeated_measures_anova

biostats.repeated_measures_anova(data, variable, between, subject)

Test whether the mean values of a variable differ between several groups in repeated-measures data.

Parameters:

data (pandas.DataFrame): The input data. Must contain at least one numeric column and one categorical column, as well as a column identifying the subjects.
variable (str): The numeric variable whose mean values are compared.
between (str): The categorical variable that specifies which group each sample belongs to. Maximum 20 groups.
subject (str): The variable that specifies the subject ID. Samples measured on the same subject should have the same ID. Maximum 2000 subjects.

Returns:

summary (pandas.DataFrame): The counts, mean values, standard deviations, and confidence intervals of each group.
result (pandas.DataFrame): The degrees of freedom, sums of squares, mean squares, F statistic, and p-value of the test.

See also

one_way_anova: Test whether the mean values of a variable are different between groups.
friedman_test: The non-parametric version of repeated measures ANOVA.

Examples

>>> import biostats as bs
>>> data = bs.dataset("repeated_measures_anova.csv")
>>> data
    response drug  patient
0         30    A        1
1         28    B        1
2         16    C        1
3         34    D        1
4         14    A        2
5         18    B        2
6         10    C        2
7         22    D        2
8         24    A        3
9         20    B        3
10        18    C        3
11        30    D        3
12        38    A        4
13        34    B        4
14        20    C        4
15        44    D        4
16        26    A        5
17        28    B        5
18        14    C        5
19        30    D        5

We want to test whether the mean values of response differ between the four drug groups, when each drug is measured repeatedly on the same five patients.

>>> summary, result = bs.repeated_measures_anova(data=data, variable="response", between="drug", subject="patient")
>>> summary
  drug  Count  Mean  Std. Deviation  95% CI: Lower  95% CI: Upper
1    A      5  26.4        8.763561      15.518602      37.281398
2    B      5  25.6        6.542171      17.476822      33.723178
3    C      5  15.6        3.847077      10.823223      20.376777
4    D      5  32.0        8.000000      22.066688      41.933312

The mean values of response and their 95% confidence intervals in each group are given.
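
As a cross-check, the summary row for drug A can be reproduced by hand. The sketch below assumes the interval is a per-group Student's t interval (mean plus or minus t times the standard error), which agrees with the printed values but is not confirmed as the library's internal method. The constant T_CRIT_DF4 is a hard-coded critical value introduced here for illustration only.

```python
import math
import statistics

# Responses for drug A from the example data (patients 1 to 5).
drug_a = [30, 14, 24, 38, 26]

n = len(drug_a)
mean = statistics.mean(drug_a)
sd = statistics.stdev(drug_a)  # sample standard deviation (n - 1 denominator)

# Assumption: two-sided 95% critical value of Student's t with n - 1 = 4
# degrees of freedom, hard-coded to keep the sketch dependency-free.
T_CRIT_DF4 = 2.776445

# Half-width of the interval: critical value times the standard error.
margin = T_CRIT_DF4 * sd / math.sqrt(n)
ci_lower, ci_upper = mean - margin, mean + margin

print(round(mean, 1), round(sd, 6), round(ci_lower, 3), round(ci_upper, 3))
```

The printed values should agree with the drug A row of the summary table to about three decimal places.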

>>> result
          D.F.  Sum Square  Mean Square  F Statistic  p-value     
drug         3       698.2   232.733333    24.758865  0.00002  ***
Residual    12       112.8     9.400000          NaN      NaN  NaN

The p-value is less than 0.001, so the mean values of response differ significantly between the drug groups (at least one drug has a different mean).
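
The entries of the result table can be traced back to the data with the textbook one-way repeated measures decomposition, in which the total sum of squares is split into a between-group part, a between-subject part, and a residual. The following is a minimal sketch under that assumption, not necessarily how biostats computes the table internally. The dictionary layout and variable names are chosen here for illustration.

```python
# Responses from the example data, keyed by patient, in drug order A, B, C, D.
responses = {
    1: [30, 28, 16, 34],
    2: [14, 18, 10, 22],
    3: [24, 20, 18, 30],
    4: [38, 34, 20, 44],
    5: [26, 28, 14, 30],
}

rows = list(responses.values())
n_subjects = len(rows)
n_groups = len(rows[0])
grand_mean = sum(sum(r) for r in rows) / (n_subjects * n_groups)

# Mean per drug (columns) and per patient (rows).
group_means = [sum(col) / n_subjects for col in zip(*rows)]
subject_means = [sum(r) / n_groups for r in rows]

# Partition the total variation; the subject term is removed from the error.
ss_total = sum((x - grand_mean) ** 2 for r in rows for x in r)
ss_group = n_subjects * sum((m - grand_mean) ** 2 for m in group_means)
ss_subject = n_groups * sum((m - grand_mean) ** 2 for m in subject_means)
ss_residual = ss_total - ss_group - ss_subject

df_group = n_groups - 1
df_residual = (n_groups - 1) * (n_subjects - 1)

# F statistic: mean square for drug over mean square for the residual.
f_stat = (ss_group / df_group) / (ss_residual / df_residual)

print(df_group, df_residual, round(ss_group, 1), round(ss_residual, 1), round(f_stat, 4))
```

Under these assumptions the degrees of freedom, sums of squares, and F statistic should match the drug and Residual rows above. If SciPy is available, the p-value can be obtained from the upper tail of the F distribution, for example with scipy.stats.f.sf(f_stat, df_group, df_residual).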