biostats.ordered_logistic_regression
- biostats.ordered_logistic_regression(data, x_numeric, x_categorical, y, order)
Fit an equation that predicts an ordered categorical variable from other variables.
- Parameters:
- data
pandas.DataFrame
The input data. Must contain at least one categorical column and several other columns, each of which can be numeric or categorical.
- x_numeric
list
The list of predictor variables that are numeric.
- x_categorical
list
The list of predictor variables that are categorical. Maximum 20 groups.
- y
str
The response variable. Must be categorical. Maximum 20 groups.
- order
dict
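The order of the groups in the response variable, given as a dict that maps each group name to its rank, for example {"unlikely":1, "somewhat likely":2, "very likely":3}.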
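As an illustration of the order argument, the sketch below checks that a mapping assigns a rank to every group found in the response column, then lists the groups from lowest to highest rank. It is plain Python and does not call biostats. The helper name groups_in_order is hypothetical, and the requirement that ranks be distinct is an assumption made for this sketch, not something the documentation states.

```python
def groups_in_order(values, order):
    """Return group names sorted by rank (hypothetical helper, not part of biostats).

    values: iterable of group labels observed in the response column.
    order:  dict mapping each group label to its rank, as passed to `order`.
    """
    observed = set(values)
    missing = observed - set(order)
    if missing:
        raise ValueError(f"groups without a rank: {sorted(missing)}")
    ranks = list(order.values())
    if len(set(ranks)) != len(ranks):
        # Assumption for this sketch: every group gets its own rank.
        raise ValueError("ranks must be distinct")
    return sorted(order, key=order.get)


order = {"unlikely": 1, "somewhat likely": 2, "very likely": 3}
apply_values = ["very likely", "somewhat likely", "unlikely", "somewhat likely"]
print(groups_in_order(apply_values, order))
# prints ['unlikely', 'somewhat likely', 'very likely']
```

Running a check like this before fitting can surface misspelled group names early.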
- Returns:
- summary
pandas.DataFrame
The coefficients of the fitted equation, along with the confidence intervals, standard errors, z statistics, and p-values.
- result
pandas.DataFrame
The pseudo R-squared and p-value of the fitted model.
See also
multiple_logistic_regression
Fit an equation that predicts a dichotomous categorical variable from other variables.
multinomial_logistic_regression
Fit an equation that predicts a multinomial categorical variable from other variables.
multiple_linear_regression
Fit an equation that predicts a numeric variable from other variables.
Examples
>>> import biostats as bs
>>> data = bs.dataset("ordered_logistic_regression.csv")
>>> data
     pared  public   gpa            apply
0        0       0  3.26      very likely
1        1       0  3.21  somewhat likely
2        1       1  3.94         unlikely
3        0       0  2.81  somewhat likely
4        0       0  2.53  somewhat likely
..     ...     ...   ...              ...
395      0       0  3.70         unlikely
396      0       0  2.63         unlikely
397      0       0  2.25  somewhat likely
398      0       0  3.26  somewhat likely
399      0       0  3.52      very likely
We want to fit an equation that predicts apply from pared, public, and gpa.
>>> summary, result = bs.ordered_logistic_regression(data=data, x_numeric=["pared", "public", "gpa"], x_categorical=[], y="apply",
...     order={"unlikely":1, "somewhat likely":2, "very likely":3})
>>> summary
                               Coefficient  95% CI: Lower  95% CI: Upper  Std. Error  z Statistic   p-value
pared                             1.047678       0.526740       1.568616    0.265789     3.941761  0.000081  ***
public                           -0.058675      -0.642471       0.525121    0.297861    -0.196987  0.843838  NaN
gpa                               0.615740       0.104912       1.126568    0.260631     2.362495  0.018152    *
unlikely / somewhat likely        2.203303       0.675441       3.731164         NaN          NaN       NaN  NaN
somewhat likely / very likely     4.298752       2.466471       6.182776         NaN          NaN       NaN  NaN
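The coefficients of the fitted equation are given, along with their confidence intervals, standard errors, z statistics, and p-values. The last two rows are the thresholds between adjacent groups of apply.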
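Coefficients of a logistic model are on the log-odds scale, so exponentiating them gives odds ratios, which are often easier to read. The sketch below does this for the three predictors using the values copied from the summary table above. It uses only the standard library and does not call biostats. The helper odds_ratio is hypothetical, and reading each ratio as "odds of falling in a higher group per one-unit increase" assumes the usual proportional-odds form of the model, which this page does not state explicitly.

```python
import math

# Coefficient, lower and upper 95% CI bounds, copied from the summary table.
coefficients = {
    "pared": (1.047678, 0.526740, 1.568616),
    "public": (-0.058675, -0.642471, 0.525121),
    "gpa": (0.615740, 0.104912, 1.126568),
}


def odds_ratio(coef, lower, upper):
    """Exponentiate a log-odds coefficient and its CI bounds (hypothetical helper)."""
    return math.exp(coef), math.exp(lower), math.exp(upper)


odds_ratios = {name: odds_ratio(*values) for name, values in coefficients.items()}
for name, (ratio, low, high) in odds_ratios.items():
    print(f"{name}: OR = {ratio:.3f} (95% CI {low:.3f} to {high:.3f})")
```

An interval that contains 1, as for public, matches a p-value above 0.05 for that predictor.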
>>> result
       Pseudo R-Squared       p-value
Model           0.67443  1.125372e-11  ***
The model p-value is less than 0.001, so there is a significant relationship between the predictor variables and the response variable.
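To show how the fitted equation could be used for prediction, the sketch below turns the estimates from the summary table above into group probabilities for a single row. It assumes the common proportional-odds parameterization, P(y <= j) = logistic(threshold_j - x·beta). This page does not state which parameterization biostats uses, so treat the form as an assumption. The function predict_probabilities and the example row are hypothetical and not part of biostats.

```python
import math

# Estimates copied from the summary table.
coef = {"pared": 1.047678, "public": -0.058675, "gpa": 0.615740}
thresholds = [2.203303, 4.298752]  # unlikely / somewhat likely, somewhat likely / very likely
groups = ["unlikely", "somewhat likely", "very likely"]


def logistic(value):
    return 1.0 / (1.0 + math.exp(-value))


def predict_probabilities(row):
    """Group probabilities under the assumed model P(y <= j) = logistic(threshold_j - x.beta)."""
    linear = sum(coef[name] * row[name] for name in coef)
    # Cumulative probabilities for each threshold; the last group closes at 1.
    cumulative = [logistic(t - linear) for t in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return dict(zip(groups, probabilities))


# Hypothetical row, chosen only for illustration.
row = {"pared": 0, "public": 0, "gpa": 3.0}
for group, probability in predict_probabilities(row).items():
    print(f"{group}: {probability:.3f}")
```

Because the gpa coefficient is positive, raising gpa in the example row shifts probability away from "unlikely" and toward the higher groups under this parameterization.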