biostats.sign_test
- biostats.sign_test(data, variable, between, group, pair)
Test whether the values of a variable differ between two paired groups, using a nonparametric method based on the signs of the paired differences.
- Parameters:
- data
pandas.DataFrame
The input data. Must contain at least one numeric column and one categorical column, as well as a column specifying the pairs.
- variable
str
The numeric variable to be compared between the two groups.
- between
str
The categorical variable that specifies which group the samples belong to. Maximum 20 groups.
- group
list
List of the two groups to be compared.
- pair
str
The variable that specifies the pair ID. Samples in the same pair should have the same ID. Maximum 2000 pairs.
- Returns:
- summary
pandas.DataFrame
The counts, mean values, standard deviations, minimums, first quartiles, medians, third quartiles, and maximums of the variable in the two groups.
- result
pandas.DataFrame
The sum of signs, the z statistic, and the p-values of the normal and exact tests.
See also
wilcoxon_signed_rank_test
Similar to the sign test, but takes the magnitudes of the differences into account.
wilcoxon_rank_sum_test
Compare the mean values between two independent groups with nonparametric methods.
paired_t_test
The parametric counterpart of the sign test.
Examples
>>> import biostats as bs
>>> data = bs.dataset("sign_test.csv")
>>> data
    Concentration     Month           Clone
0             8.1    August    Balsam_Spire
1            10.0    August         Beaupre
2            16.5    August       Hazendans
3            13.6    August       Hoogvorst
4             9.5    August        Raspalje
5             8.3    August            Unal
6            18.3    August  Columbia_River
7            13.3    August   Fritzi_Pauley
8             7.9    August       Trichobel
9             8.1    August           Gaver
10            8.9    August          Gibecq
11           12.6    August           Primo
12           13.4    August       Wolterson
13           11.2  November    Balsam_Spire
14           16.3  November         Beaupre
15           15.3  November       Hazendans
16           15.6  November       Hoogvorst
17           10.5  November        Raspalje
18           15.5  November            Unal
19           12.7  November  Columbia_River
20           11.1  November   Fritzi_Pauley
21           19.9  November       Trichobel
22           20.4  November           Gaver
23           14.2  November          Gibecq
24           12.7  November           Primo
25           36.8  November       Wolterson
We want to test whether Concentration differs between August and November, with samples paired by Clone, using a nonparametric method.
>>> summary, result = bs.sign_test(data=data, variable="Concentration", between="Month", group=["August", "November"], pair="Clone")
>>> summary
          Count       Mean  Std. Deviation  Minimum  1st Quartile  Median  3rd Quartile  Maximum
August       13  11.423077        3.451607      7.9           8.3    10.0          13.4     18.3
November     13  16.323077        6.886963     10.5          12.7    15.3          16.3     36.8
The mean values and some descriptive statistics of the two groups are given.
>>> result
        Sum  z Statistic   p-value
Normal    3    -1.664101  0.096092  <NA>
Exact     3          NaN  0.092285  <NA>
Both p-values are greater than 0.05, so there is no significant difference in Concentration between August and November.
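The figures in the result table can be reproduced by hand with the textbook two-sided sign test. The sketch below uses only the standard library and is not taken from the biostats source. It assumes that Sum is the count of the less frequent sign, that ties are dropped, and that the normal test applies a continuity correction of 0.5. These assumptions are consistent with the example output, but the library's internals may differ.

```python
import math

# Paired values from the example, in the same Clone order for both months.
august = [8.1, 10.0, 16.5, 13.6, 9.5, 8.3, 18.3, 13.3, 7.9, 8.1, 8.9, 12.6, 13.4]
november = [11.2, 16.3, 15.3, 15.6, 10.5, 15.5, 12.7, 11.1, 19.9, 20.4, 14.2, 12.7, 36.8]

# Sign of each paired difference; ties (difference of zero) are dropped.
diffs = [nov - aug for aug, nov in zip(august, november)]
n_pos = sum(d > 0 for d in diffs)
n_neg = sum(d < 0 for d in diffs)
n = n_pos + n_neg

# Assumed meaning of "Sum": the count of the less frequent sign.
k = min(n_pos, n_neg)

# Exact test: two-sided binomial tail probability under p = 0.5.
p_exact = min(1.0, 2 * sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n)

# Normal test: z statistic with a continuity correction of 0.5 (assumed).
z = (k + 0.5 - n / 2) / math.sqrt(n / 4)
p_normal = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value

print(k, round(z, 6), round(p_normal, 6), round(p_exact, 6))
```

With 10 positive and 3 negative differences out of 13 pairs, this yields the same Sum, z statistic, and p-values as the table above.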