biostats.wilcoxon_rank_sum_test
- biostats.wilcoxon_rank_sum_test(data, variable, between, group)
Test whether the values of a variable differ between two groups using a nonparametric, rank-based method.
- Parameters:
- data
pandas.DataFrame
The input data. Must contain at least one numeric column and one categorical column.
- variable
str
The numeric variable to compare between the two groups.
- between
str
The categorical variable that specifies which group the samples belong to. Maximum 20 groups.
- group
list
List of the two groups to be compared.
- Returns:
- summary
pandas.DataFrame
The counts, mean values, standard deviations, minimums, first quartiles, medians, third quartiles, and maximums of the variable in the two groups.
- result
pandas.DataFrame
The rank sums, z statistic, and p-values of the normal and exact tests.
See also
wilcoxon_signed_rank_test
Compare the mean values between two paired groups with nonparametric methods.
kruskal_wallis_test
Compare the mean values between more than two groups with nonparametric methods.
two_sample_t_test
The parametric counterpart of the Wilcoxon rank-sum test.
Examples
>>> import biostats as bs
>>> data = bs.dataset("wilcoxon_rank_sum_test.csv")
>>> data
    Value Time
0      69  2pm
1      70  2pm
2      66  2pm
3      63  2pm
4      68  2pm
5      70  2pm
6      69  2pm
7      67  2pm
8      62  2pm
9      63  2pm
10     76  2pm
11     59  2pm
12     62  2pm
13     62  2pm
14     75  2pm
15     62  2pm
16     72  2pm
17     63  2pm
18     68  5pm
19     62  5pm
20     67  5pm
21     68  5pm
22     69  5pm
23     67  5pm
24     61  5pm
25     59  5pm
26     62  5pm
27     61  5pm
28     69  5pm
29     66  5pm
30     62  5pm
31     62  5pm
32     61  5pm
33     70  5pm
We want to test whether Value differs between the 2pm and 5pm groups using a nonparametric method.
>>> summary, result = bs.wilcoxon_rank_sum_test(data=data, variable="Value", between="Time", group=["2pm", "5pm"])
>>> summary
     Count       Mean  Std. Deviation  Minimum  1st Quartile  Median  3rd Quartile  Maximum
2pm     18  66.555556        4.889632       59         62.25    66.5         69.75       76
5pm     16  64.625000        3.667424       59         61.75    64.0         68.00       70
The mean values and some descriptive statistics of the two groups are given.
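The descriptive statistics in the summary can be cross-checked with plain pandas. The sketch below is not how biostats builds the table internally (that is not shown here); it only assumes the example values listed above and uses `DataFrame.groupby(...).describe()`, whose sample standard deviation and linearly interpolated quartiles happen to agree with the figures printed in the summary. The name `summary_check` is introduced here for illustration.

```python
import pandas as pd

# Values copied from the example dataset: 18 samples at 2pm, 16 at 5pm.
data = pd.DataFrame({
    "Value": [69, 70, 66, 63, 68, 70, 69, 67, 62, 63, 76, 59, 62, 62, 75, 62, 72, 63,
              68, 62, 67, 68, 69, 67, 61, 59, 62, 61, 69, 66, 62, 62, 61, 70],
    "Time": ["2pm"] * 18 + ["5pm"] * 16,
})

# describe() reports count, mean, std (ddof=1), min, quartiles, and max per group.
summary_check = data.groupby("Time")["Value"].describe()
print(summary_check)
```

The column names differ from those in the biostats summary (`25%` instead of `1st Quartile`, for example), but the numbers can be compared row by row.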
>>> result
        Rank Sum  z Statistic   p-value
Normal       357     1.444746  0.148529  <NA>
Exact        357          NaN       NaN  <NA>
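The p-value of the normal test (0.149) is greater than 0.05, so the difference between the two groups is not statistically significant. The exact test returns NaN in this example, so only the normal test is interpreted.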
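To see where the numbers in the Normal row might come from, here is a minimal sketch of the normal approximation written with NumPy and SciPy. It assumes average ranks for ties, a tie-corrected variance, and a continuity correction of 0.5. These are assumptions about the computation rather than a description of the biostats source, although with the example data they give the rank sum, z statistic, and p-value shown in the Normal row. All variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Values copied from the example dataset.
values_2pm = np.array([69, 70, 66, 63, 68, 70, 69, 67, 62, 63, 76, 59, 62, 62, 75, 62, 72, 63])
values_5pm = np.array([68, 62, 67, 68, 69, 67, 61, 59, 62, 61, 69, 66, 62, 62, 61, 70])

n1, n2 = len(values_2pm), len(values_5pm)
n = n1 + n2

# Rank the pooled sample; tied values share the average of their ranks.
pooled = np.concatenate([values_2pm, values_5pm])
ranks = stats.rankdata(pooled)
rank_sum = ranks[:n1].sum()  # rank sum of the first group (2pm)

# Expected rank sum and tie-corrected variance under the null hypothesis.
expected = n1 * (n + 1) / 2
_, tie_counts = np.unique(pooled, return_counts=True)
tie_term = (tie_counts**3 - tie_counts).sum() / (n * (n - 1))
variance = n1 * n2 / 12 * ((n + 1) - tie_term)

# Assumed continuity correction: shrink the deviation by 0.5 before scaling.
diff = rank_sum - expected
z = np.sign(diff) * (abs(diff) - 0.5) / np.sqrt(variance)
p_value = 2 * stats.norm.sf(abs(z))  # two-sided p-value
print(rank_sum, z, p_value)

# Cross-check against SciPy's Mann-Whitney U test, which is equivalent
# to the rank-sum test when the asymptotic method is used.
u_stat, p_scipy = stats.mannwhitneyu(
    values_2pm, values_5pm,
    alternative="two-sided", method="asymptotic", use_continuity=True,
)
print(u_stat, p_scipy)
```

The `method` argument of `scipy.stats.mannwhitneyu` requires SciPy 1.7 or later. The two p-values should agree, since the Mann-Whitney U statistic is the rank sum shifted by the constant n1 * (n1 + 1) / 2.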