biostats.scatter_plot

biostats.scatter_plot(data, x, y, color=None)

Draw a scatter plot to show the relation between two numeric variables.

Parameters:
data : pandas.DataFrame

The input data. Must contain at least two numeric columns.

x : str

The numeric variable to be plotted on the x-axis.

y : str

The numeric variable to be plotted on the y-axis.

color : str, optional

The categorical variable whose groups are plotted in different colors. At most 20 groups are allowed.
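The parameter constraints above (numeric x and y columns, at most 20 color groups) can be checked before plotting. The following is a minimal sketch using pandas only; `check_scatter_inputs` and its `max_groups` argument are hypothetical helpers written for illustration and are not part of biostats, whose own validation may behave differently.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def check_scatter_inputs(data, x, y, color=None, max_groups=20):
    """Hypothetical pre-flight check (not part of biostats).

    Returns a list of human-readable problems; an empty list means the
    inputs satisfy the constraints stated in the parameter descriptions.
    """
    problems = []
    # x and y must exist and hold numeric values.
    for name in (x, y):
        if name not in data.columns:
            problems.append(f"column {name!r} not found")
        elif not is_numeric_dtype(data[name]):
            problems.append(f"column {name!r} is not numeric")
    # color is optional; when given, it may define at most max_groups groups.
    if color is not None:
        if color not in data.columns:
            problems.append(f"column {color!r} not found")
        elif data[color].nunique() > max_groups:
            problems.append(f"column {color!r} has more than {max_groups} groups")
    return problems


# A small made-up frame with the same column names as the tips example.
toy = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01],
    "tip": [1.01, 1.66, 3.50],
    "day": ["Sun", "Sun", "Sat"],
})
print(check_scatter_inputs(toy, "total_bill", "tip", color="day"))
print(check_scatter_inputs(toy, "day", "tip"))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue with a data set at once.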

Returns:
fig : matplotlib.figure.Figure

The generated plot.
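Because the return value is a standard matplotlib.figure.Figure, the usual Figure methods should apply to it. The sketch below resizes and saves a figure to disk. The figure is built by hand as a stand-in for the object returned by scatter_plot, so that the example runs without biostats installed. The file name and the sample points are assumptions made for illustration.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Stand-in for the Figure returned by bs.scatter_plot(...).
# The three points are made-up sample values, not library output.
fig, ax = plt.subplots()
ax.scatter([16.99, 10.34, 21.01], [1.01, 1.66, 3.50])
ax.set_xlabel("total_bill")
ax.set_ylabel("tip")

# Standard Figure methods: adjust the size, then write a PNG file.
out = Path("scatter_plot.png")  # hypothetical output path
fig.set_size_inches(6, 4)
fig.savefig(out, dpi=150, bbox_inches="tight")
plt.close(fig)  # release the figure's memory once it is saved
```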

See also

line_plot

Draw a line plot to show the relation between two numeric variables.

regression_plot

Draw a regression line to show the relation between two numeric variables.

Examples

>>> import biostats as bs
>>> import matplotlib.pyplot as plt
>>> data = bs.dataset("tips.csv")
>>> data
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

We want to visualize the relation between total_bill and tip, with the points colored by day.

>>> fig = bs.scatter_plot(data=data, x="total_bill", y="tip", color="day")
>>> plt.show()
[Figure: scatter plot of tip against total_bill, with points colored by day]