biostats.friedman_test

biostats.friedman_test(data, variable, between, subject)

Test whether the mean values of a variable differ between several groups in repeated-measures data, using a nonparametric method.

Parameters:
data : pandas.DataFrame

The input data. Must contain at least one numeric column and one categorical column, as well as a column specifying the subjects.

variable : str

The numeric variable that we want to calculate mean values of.

between : str

The categorical variable that specifies which group the samples belong to. Maximum 20 groups.

subject : str

The variable that specifies the subject ID. Samples measured on the same subject should have the same ID. Maximum 2000 subjects.

Returns:
summary : pandas.DataFrame

The counts, mean values, standard deviations, minimums, first quartiles, medians, third quartiles, and maximums of the variable in each group.

result : pandas.DataFrame

The degrees of freedom, chi-square statistic, and p-value of the test.
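As a rough illustration of where these numbers come from, the following sketch computes the textbook Friedman chi-square statistic by hand for the data used in the Examples section. It assumes the standard formula without a tie correction; whether `friedman_test` applies exactly this formula internally is an assumption, and the variable names are illustrative only.

```python
# Responses from the Examples section: one row per patient, columns are drugs A-D.
responses = [
    [30, 28, 16, 34],
    [14, 18, 10, 22],
    [24, 20, 18, 30],
    [38, 34, 20, 44],
    [26, 28, 14, 30],
]

n = len(responses)      # number of subjects
k = len(responses[0])   # number of groups

# Rank the k values within each subject (1 = smallest) and sum the ranks per group.
# Assumption: no ties within a subject, which holds for this data.
rank_sums = [0] * k
for row in responses:
    assert len(set(row)) == k, "this sketch does not handle ties"
    order = sorted(range(k), key=lambda j: row[j])
    for rank, j in enumerate(order, start=1):
        rank_sums[j] += rank

# Standard Friedman statistic, compared against a chi-square with k - 1 degrees of freedom.
chi_square = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
df = k - 1

print(rank_sums, df, round(chi_square, 2))
```

For this data the rank sums are 13, 12, 5, and 20, giving 3 degrees of freedom and a statistic of 13.56.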

See also

kruskal_wallis_test

Test whether the mean values of a variable are different between groups with nonparametric methods.

repeated_measures_anova

The parametric counterpart of the Friedman test.

Examples

>>> import biostats as bs
>>> data = bs.dataset("friedman_test.csv")
>>> data
    response drug  patient
0         30    A        1
1         28    B        1
2         16    C        1
3         34    D        1
4         14    A        2
5         18    B        2
6         10    C        2
7         22    D        2
8         24    A        3
9         20    B        3
10        18    C        3
11        30    D        3
12        38    A        4
13        34    B        4
14        20    C        4
15        44    D        4
16        26    A        5
17        28    B        5
18        14    C        5
19        30    D        5
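The Friedman test needs every subject to be measured once under every group. A minimal sketch of checking this with plain pandas is shown here; the DataFrame is rebuilt by hand from the table above so the snippet is self-contained, and it is not known whether `friedman_test` performs a similar check itself.

```python
import pandas as pd

# Same values as the table above, in long format.
data = pd.DataFrame({
    "response": [30, 28, 16, 34, 14, 18, 10, 22, 24, 20,
                 18, 30, 38, 34, 20, 44, 26, 28, 14, 30],
    "drug": list("ABCD") * 5,
    "patient": [p for p in range(1, 6) for _ in range(4)],
})

# Reshape to one row per patient and one column per drug.
# pivot raises an error if a patient/drug pair appears more than once.
wide = data.pivot(index="patient", columns="drug", values="response")

# A missing value would mean some patient lacks a measurement for some drug.
is_complete = not wide.isna().any().any()

print(wide.shape, is_complete)
```

Here the wide table has five rows (patients) and four columns (drugs) with no missing cells.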

We want to test, with a nonparametric method, whether the mean values of response differ between the four drugs when the samples are repeatedly measured on the five patients.

>>> summary, result = bs.friedman_test(data=data, variable="response", between="drug", subject="patient")
>>> summary
  drug  Count  Mean  Std. Deviation  Minimum  1st Quartile  Median  3rd Quartile  Maximum
1    A      5  26.4        8.763561       14            24      26            30       38
2    B      5  25.6        6.542171       18            20      28            28       34
3    C      5  15.6        3.847077       10            14      16            18       20
4    D      5  32.0        8.000000       22            30      30            34       44

The mean values and some descriptive statistics of each group are given.
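For comparison, a similar table can be sketched with pandas alone. This is only a cross-check under the assumption that the summary uses the sample standard deviation and linearly interpolated quartiles, which is what `describe` does by default; the layout differs (drug becomes the index, and the columns are named `count`, `mean`, `std`, `min`, `25%`, `50%`, `75%`, `max`).

```python
import pandas as pd

# Same values as the example dataset, rebuilt by hand.
data = pd.DataFrame({
    "response": [30, 28, 16, 34, 14, 18, 10, 22, 24, 20,
                 18, 30, 38, 34, 20, 44, 26, 28, 14, 30],
    "drug": list("ABCD") * 5,
    "patient": [p for p in range(1, 6) for _ in range(4)],
})

# Per-drug descriptive statistics of the response.
summary_check = data.groupby("drug")["response"].describe()
print(summary_check)
```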

>>> result
       D.F.  Chi Square  p-value    
Model     3       13.56  0.00357  **

The p-value is less than 0.01, so the mean values of response in each group are significantly different.
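As an independent cross-check, the same data can be passed to SciPy's `scipy.stats.friedmanchisquare`, which takes one sequence per group with the subjects in the same order. This is a sketch using a different library, not a description of how `biostats` computes its result.

```python
from scipy.stats import friedmanchisquare

# Responses per drug, ordered by patient 1 to 5 (taken from the example dataset).
drug_a = [30, 14, 24, 38, 26]
drug_b = [28, 18, 20, 34, 28]
drug_c = [16, 10, 18, 20, 14]
drug_d = [34, 22, 30, 44, 30]

statistic, p_value = friedmanchisquare(drug_a, drug_b, drug_c, drug_d)
print(round(statistic, 2), round(p_value, 5))
```

The statistic and p-value should agree with the `result` table above (13.56 and about 0.00357).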