biostats.epidemiologic_study

biostats.epidemiologic_study(data, disease, disease_target, factor, factor_target)

Compute common measures of association for an epidemiologic study: risk difference, risk ratio, odds ratio, and attributable risk.

Parameters:
data : pandas.DataFrame

The input data. Must contain at least two categorical columns.

disease : str

The variable (column of data) specifying the disease. Maximum 20 groups.

disease_target : str

The group of the disease variable that is considered “positive”.

factor : str

The variable (column of data) specifying the factor. Maximum 20 groups.

factor_target : str

The group of the factor variable that is considered “positive”.

Returns:
summary : pandas.DataFrame

The contingency table of the disease and factor.

result : pandas.DataFrame

The estimates and confidence intervals of the risk difference, risk ratio (relative risk), odds ratio, and attributable risk.

See also

screening_test

Compute some common statistics of a screening test.

contingency

Compute the contingency table of two categorical variables.

Examples

>>> import biostats as bs
>>> data = bs.dataset("epidemiologic_study.csv")
>>> data
             MI Diabetes
0     Not occur       No
1     Not occur       No
2     Not occur       No
3     Not occur       No
4     Not occur       No
...         ...      ...
2993  Not occur       No
2994  Not occur       No
2995  Not occur       No
2996  Not occur       No
2997  Not occur       No

We want to compute the risk difference, risk ratio, odds ratio, and attributable risk for a study of the relation between MI and Diabetes, treating "Occur" and "Yes" as the positive groups.

>>> summary, result = bs.epidemiologic_study(data=data, disease="MI", disease_target="Occur", factor="Diabetes", factor_target="Yes")
>>> summary
              MI (+)  MI (-)
Diabetes (+)      48     183
Diabetes (-)     210    2557

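summary is the contingency table of Diabetes (rows) against MI (columns), where (+) marks the target group of each variable and (-) marks everything else.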
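For readers who want to see how such a table arises from row-level data, the following is a minimal sketch using pandas.crosstab. It does not call biostats; it rebuilds a data frame with the same cell counts as the summary above, and the names counts, rows, and table are illustrative only. Whether epidemiologic_study builds its table this way internally is not stated in this page.

```python
import pandas as pd

# Cell counts copied from the summary table in the example above.
# The construction of the data frame is illustrative, not the shipped CSV.
counts = {
    ("Occur", "Yes"): 48,
    ("Not occur", "Yes"): 183,
    ("Occur", "No"): 210,
    ("Not occur", "No"): 2557,
}

# Expand the counts into one row per subject.
rows = [
    {"MI": mi, "Diabetes": diabetes}
    for (mi, diabetes), n in counts.items()
    for _ in range(n)
]
data = pd.DataFrame(rows)

# Cross-tabulate the factor (rows) against the disease (columns),
# then order both axes so that the "positive" group comes first.
table = pd.crosstab(data["Diabetes"], data["MI"]).reindex(
    index=["Yes", "No"], columns=["Occur", "Not occur"]
)
print(table)
```

The reindex step only fixes the display order; the counts themselves come from crosstab.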

>>> result
                   Estimation  95% CI: Lower  95% CI: Upper
Risk Difference      0.131898       0.078654       0.185141
Risk Ratio           2.737910       2.062282       3.634880
Odds Ratio           3.193755       2.256038       4.521233
Attributable Risk    0.118094       0.078925       0.173051

result lists the estimate and the 95% confidence interval of each measure, one measure per row.
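The estimates can be checked by hand from the four cell counts. The sketch below uses standard textbook formulas: a Wald interval for the risk difference and log-scale intervals for the risk ratio and odds ratio. Two things are assumptions rather than documented behavior: that "attributable risk" means the population attributable fraction (this reading agrees closely with the estimate in the table), and that the library uses these particular interval methods. The interval for attributable risk is left out because its method is not described here.

```python
import math

# Cell counts from the summary table.
a, b = 48, 183      # Diabetes (+): MI (+), MI (-)
c, d = 210, 2557    # Diabetes (-): MI (+), MI (-)
z = 1.959964        # approximate 97.5th percentile of the standard normal

risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
risk_overall = (a + c) / (a + b + c + d)

# Point estimates.
risk_difference = risk_exposed - risk_unexposed
risk_ratio = risk_exposed / risk_unexposed
odds_ratio = (a * d) / (b * c)
# Assumption: attributable risk is read as the population attributable fraction.
attributable_risk = (risk_overall - risk_unexposed) / risk_overall

# Wald interval for the risk difference.
se_rd = math.sqrt(
    risk_exposed * (1 - risk_exposed) / (a + b)
    + risk_unexposed * (1 - risk_unexposed) / (c + d)
)
rd_ci = (risk_difference - z * se_rd, risk_difference + z * se_rd)

# Intervals for the ratios, computed on the log scale and transformed back.
se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
rr_ci = (risk_ratio * math.exp(-z * se_log_rr), risk_ratio * math.exp(z * se_log_rr))

se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
or_ci = (odds_ratio * math.exp(-z * se_log_or), odds_ratio * math.exp(z * se_log_or))

print(risk_difference, rd_ci)
print(risk_ratio, rr_ci)
print(odds_ratio, or_ci)
print(attributable_risk)
```

Working on the log scale for the two ratios keeps the interval bounds positive, which a symmetric interval around the ratio itself would not guarantee.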