biostats.wilcoxon_signed_rank_test

biostats.wilcoxon_signed_rank_test(data, variable, between, group, pair)

Test whether the values of a variable differ between two paired groups, using a nonparametric method.

Parameters:
data : pandas.DataFrame

The input data. Must contain at least one numeric column and one categorical column, as well as a column specifying the pairs.

variable : str

The numeric variable to compare between the two groups.

between : str

The categorical variable that specifies which group the samples belong to. Maximum 20 groups.

group : list

List of the two groups to be compared.

pair : str

The variable that specifies the pair ID. Samples in the same pair should have the same ID. Maximum 2000 pairs.

Returns:
summary : pandas.DataFrame

The counts, mean values, standard deviations, minimums, first quartiles, medians, third quartiles, and maximums of the variable in the two groups.

result : pandas.DataFrame

The rank sums, z statistic, and p-values of the normal and exact tests.

See also

sign_test

Similar to the Wilcoxon signed-rank test, but does not take the magnitudes of the differences into account.

wilcoxon_rank_sum_test

Compare the mean values between two independent groups with nonparametric methods.

paired_t_test

The parametric counterpart of the Wilcoxon signed-rank test.

Examples

>>> import biostats as bs
>>> data = bs.dataset("wilcoxon_signed_rank_test.csv")
>>> data
   Concentration     Month           Clone
0            8.1    August    Balsam_Spire
1           10.0    August         Beaupre
2           16.5    August       Hazendans
3           13.6    August       Hoogvorst
4            9.5    August        Raspalje
5            8.3    August            Unal
6           18.3    August  Columbia_River
7           13.3    August   Fritzi_Pauley
8            7.9    August       Trichobel
9            8.1    August           Gaver
10           8.9    August          Gibecq
11          12.6    August           Primo
12          13.4    August       Wolterson
13          11.2  November    Balsam_Spire
14          16.3  November         Beaupre
15          15.3  November       Hazendans
16          15.6  November       Hoogvorst
17          10.5  November        Raspalje
18          15.5  November            Unal
19          12.7  November  Columbia_River
20          11.1  November   Fritzi_Pauley
21          19.9  November       Trichobel
22          20.4  November           Gaver
23          14.2  November          Gibecq
24          12.7  November           Primo
25          36.8  November       Wolterson

We want to test whether Concentration differs between August and November, pairing the samples by Clone, with a nonparametric method.

>>> summary, result = bs.wilcoxon_signed_rank_test(data=data, variable="Concentration", between="Month", group=["August", "November"], pair="Clone")
>>> summary
          Count       Mean  Std. Deviation  Minimum  1st Quartile  Median  3rd Quartile  Maximum
August       13  11.423077        3.451607      7.9           8.3    10.0          13.4     18.3
November     13  16.323077        6.886963     10.5          12.7    15.3          16.3     36.8

The mean values and some descriptive statistics of the two groups are given.
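As a cross-check, the August row of the summary table can be reproduced with the standard library alone. This is a minimal sketch, not the library's implementation: the `describe` helper is hypothetical, and the choices of sample standard deviation (n - 1 denominator) and linearly interpolated ("inclusive") quartiles are assumptions that happen to match the printed values.

```python
import statistics

# August concentrations copied from the example dataset above.
august = [8.1, 10.0, 16.5, 13.6, 9.5, 8.3, 18.3, 13.3, 7.9, 8.1, 8.9, 12.6, 13.4]


def describe(values):
    """Descriptive statistics for one group (hypothetical helper, not part of biostats)."""
    # Assumption: quartiles use linear interpolation over the sorted sample.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "Count": len(values),
        "Mean": statistics.mean(values),
        # Assumption: sample standard deviation (divides by n - 1).
        "Std. Deviation": statistics.stdev(values),
        "Minimum": min(values),
        "1st Quartile": q1,
        "Median": median,
        "3rd Quartile": q3,
        "Maximum": max(values),
    }


august_summary = describe(august)
for name, value in august_summary.items():
    print(f"{name}: {value}")
```

Passing the November values to the same helper should give the second row of the table under the same assumptions.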

>>> result
        Rank Sum  z Statistic   p-value   
Normal        16     2.026684  0.042695  *
Exact         16          NaN  0.039795  *

Both the normal and the exact p-values are below 0.05, so there is a significant difference in Concentration between August and November.
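To make the numbers in the result table easier to follow, the sketch below recomputes them from the paired differences using only the standard library. It is an illustration under stated assumptions, not the library's code: it assumes the reported rank sum is the smaller of the two signed rank sums, that the z statistic uses a continuity correction of 0.5, and that the data contain no zero differences and no tied absolute differences (true for this dataset).

```python
import math

# Concentrations in the same Clone order as the example dataset above.
august = [8.1, 10.0, 16.5, 13.6, 9.5, 8.3, 18.3, 13.3, 7.9, 8.1, 8.9, 12.6, 13.4]
november = [11.2, 16.3, 15.3, 15.6, 10.5, 15.5, 12.7, 11.1, 19.9, 20.4, 14.2, 12.7, 36.8]

# Paired differences (August minus November), one per clone.
diffs = [a - b for a, b in zip(august, november)]
n = len(diffs)

# Rank the absolute differences from smallest (rank 1) to largest (rank n).
# Assumption: no zeros and no ties, so no dropping or mid-ranks are needed.
order = sorted(range(n), key=lambda i: abs(diffs[i]))
ranks = {index: rank for rank, index in enumerate(order, start=1)}

# Sum of ranks for positive and negative differences.
w_plus = sum(ranks[i] for i in range(n) if diffs[i] > 0)
w_minus = n * (n + 1) // 2 - w_plus
rank_sum = min(w_plus, w_minus)

# Normal approximation with a continuity correction of 0.5 (assumed).
mean_w = n * (n + 1) / 4
sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (abs(rank_sum - mean_w) - 0.5) / sd_w
p_normal = math.erfc(z / math.sqrt(2))  # two-sided tail probability

# Exact test: under the null hypothesis each rank is positive or negative
# with probability 1/2, so count the subsets of {1..n} whose sum is at most
# the observed rank sum (dynamic programming over subset sums).
max_sum = n * (n + 1) // 2
counts = [0] * (max_sum + 1)
counts[0] = 1
for r in range(1, n + 1):
    for s in range(max_sum, r - 1, -1):
        counts[s] += counts[s - r]
p_exact = min(1.0, 2 * sum(counts[: rank_sum + 1]) / 2 ** n)

print(f"Rank Sum: {rank_sum}")
print(f"z Statistic: {z:.6f}")
print(f"p-value (normal): {p_normal:.6f}")
print(f"p-value (exact): {p_exact:.6f}")
```

Only three clones (Hazendans, Columbia_River, Fritzi_Pauley) have a higher Concentration in August, and their ranks 3, 8, and 5 add up to the rank sum of 16 shown in the table.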