biostats.simple_linear_regression
- biostats.simple_linear_regression(data, x, y)
Fit an equation that predicts a numeric variable from another numeric variable.
- Parameters:
- data
pandas.DataFrame
The input data. Must contain at least two numeric columns.
- x
str
The predictor variable. Must be numeric.
- y
str
The response variable. Must be numeric.
- Returns:
- summary
pandas.DataFrame
The coefficients of the fitted equation, along with the confidence intervals, standard errors, t statistics, and p-values.
- result
pandas.DataFrame
The R-squared, adjusted R-squared, F statistic, and p-value of the fitted model.
See also
multiple_linear_regression
Fit an equation that predicts a numeric variable from other variables.
simple_logistic_regression
Fit an equation that predicts a dichotomous categorical variable from a numeric variable.
correlation
Test the correlation between two numeric variables.
Examples
>>> import biostats as bs
>>> data = bs.dataset("simple_linear_regression.csv")
>>> data
    Weight  Eggs
0     5.38    29
1     7.36    23
2     6.13    22
3     4.75    20
4     8.10    25
5     8.62    25
6     6.30    17
7     7.44    24
8     7.26    20
9     7.17    27
10    7.78    24
11    6.23    21
12    5.42    22
13    7.87    22
14    5.25    23
15    7.37    35
16    8.01    27
17    4.92    23
18    7.03    25
19    6.45    24
20    5.06    19
21    6.72    21
22    7.00    20
23    9.39    33
24    6.49    17
25    6.34    21
26    6.16    25
27    5.74    22
We want to fit an equation that predicts Eggs from Weight.
>>> summary, result = bs.simple_linear_regression(data=data, x="Weight", y="Eggs")
>>> summary
           Coefficient  95% CI: Lower  95% CI: Upper  Std. Error  t Statistic   p-value
Intercept    12.689022       4.054035      21.324009    4.200858     3.020579  0.005598  **
Weight        1.601722       0.332202       2.871243    0.617612     2.593411  0.015401   *
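The summary table gives the coefficients of the fitted equation, along with their confidence intervals, standard errors, t statistics, and p-values. Here the fitted equation is Eggs = 12.689 + 1.602 × Weight.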
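As a minimal sketch of how the summary values can be used, the snippet below plugs the reported coefficients into the fitted equation and recomputes each t statistic as the coefficient divided by its standard error. The helper `predict_eggs` and the constant names are hypothetical and not part of biostats. The numbers are copied from the summary output above.

```python
# Coefficients and standard errors copied from the summary output above.
# The names below are illustrative only; they are not part of biostats.
INTERCEPT, SLOPE = 12.689022, 1.601722
SE_INTERCEPT, SE_SLOPE = 4.200858, 0.617612


def predict_eggs(weight):
    """Hypothetical helper: evaluate Eggs = intercept + slope * Weight."""
    return INTERCEPT + SLOPE * weight


# Each t statistic in the table is the coefficient divided by its standard error.
t_intercept = INTERCEPT / SE_INTERCEPT
t_slope = SLOPE / SE_SLOPE

print("Predicted Eggs at Weight = 7.0:", round(predict_eggs(7.0), 2))
print("t statistics:", round(t_intercept, 3), round(t_slope, 3))
```

Predictions of this kind are only meaningful within the range of weights present in the data (roughly 4.75 to 9.39 here).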
>>> result
       R-Squared  Adj. R-Squared  F Statistic   p-value
Model   0.205519        0.174962      6.72578  0.015401  *
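The model p-value (0.015) is below 0.05, so there is a significant linear relationship between the predictor and response variables. The R-squared of 0.206 means that Weight explains about 21% of the variance in Eggs.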
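To make the reported quantities concrete, here is a rough sketch that recomputes the slope, intercept, R-squared, adjusted R-squared, and F statistic from the example data with ordinary least squares in plain Python. It uses the textbook formulas and is not necessarily how biostats computes them internally. The variable names are illustrative assumptions.

```python
# Example data copied from the table above.
weight = [5.38, 7.36, 6.13, 4.75, 8.10, 8.62, 6.30, 7.44, 7.26, 7.17,
          7.78, 6.23, 5.42, 7.87, 5.25, 7.37, 8.01, 4.92, 7.03, 6.45,
          5.06, 6.72, 7.00, 9.39, 6.49, 6.34, 6.16, 5.74]
eggs = [29, 23, 22, 20, 25, 25, 17, 24, 20, 27,
        24, 21, 22, 22, 23, 35, 27, 23, 25, 24,
        19, 21, 20, 33, 17, 21, 25, 22]

n = len(weight)
mean_x = sum(weight) / n
mean_y = sum(eggs) / n

# Centered sums of squares and cross products.
sxx = sum((x - mean_x) ** 2 for x in weight)
syy = sum((y - mean_y) ** 2 for y in eggs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weight, eggs))

# Ordinary least squares estimates for y = intercept + slope * x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Goodness of fit for a model with one predictor (n - 2 residual degrees of freedom).
r_squared = sxy ** 2 / (sxx * syy)
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)
f_statistic = r_squared / (1 - r_squared) * (n - 2)

print(f"slope={slope:.6f} intercept={intercept:.6f}")
print(f"R2={r_squared:.6f} adjR2={adj_r_squared:.6f} F={f_statistic:.5f}")
```

With a single predictor, the F statistic equals the square of the slope's t statistic, which is why the model p-value matches the p-value reported for Weight in the summary table.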