Contents

Seaborn

Contents

Seaborn#

Seaborn is a Python library based on Matplotlib. It provides two useful features:

different pre-defined styles for Matplotlib figures,
lots of functions for visualizing complex datasets.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

The reason for importing as sns is somewhat vague. sns are the initials of Samuel Norman Seaborn, a fictional television character. See also issue #229 in Seaborns Github repository.

sns.set_style('darkgrid')
fig, ax = plt.subplots()
x = [1, 2]
y = [1, 2]
ax.plot(x, y)
plt.show()

../../../_images/seaborn_3_0.png

Seaborn also supports different scalings for different usecases. Scaling is set with sns.set_context and one of the string arguments paper, notebook, talk, poster, where notebook is the default. Different scalings allow for almost identical code to create figures for different channels of publication.

sns.set_context('talk')
fig, ax = plt.subplots()
x = [1, 2]
y = [1, 2]
ax.plot(x, y)
plt.show()

../../../_images/seaborn_5_0.png

Plots for Exploring Data Sets#

Seaborn comes with lots of functions which take a whole data set (Pandas data frame) and create complex visualizations of the dataset. To get an overview have a look at the official Seaborn tutorials and at the Seaborn gallery.

sns.set_context('notebook')
rng = np.random.default_rng(0)

# parameters for point clouds
means = [[5, 0, 0], [-5, 2, 0], [0, -3, 5]] # mean vectors
covs = [[[1, 1, 0], [1, 1, 0], [0, 0, 1]],
        [[10, 2, 0], [2, 10, 2], [0, 2, 10]],
        [[0.1, 0, 0], [0, 3, 3], [0, 3, 7]]] # covariance matrices
names = ['cloud A', 'cloud B', 'cloud C'] # names
ns = [100, 1000, 100] # samples per cloud

# create data frame with named samples from each cloud
clouds = []
for (mean, cov, name, n) in zip(means, covs, names, ns):
    x, y, z = rng.multivariate_normal(mean, cov, n).T
    cloud_data = pd.DataFrame(np.asarray([x, y, z]).T, columns=['x', 'y', 'z'])
    cloud_data['name'] = name
    clouds.append(cloud_data)
data = pd.concat(clouds)

# show data frame structure    
display(data)

# plot pairwise relations with Seaborn
sns.pairplot(data, hue='name', hue_order=['cloud B', 'cloud A', 'cloud C'])
plt.show()

	x	y	z	name
0	4.874270	-0.125730	-0.132105	cloud A
1	4.895100	-0.104900	-0.535669	cloud A
2	3.696000	-1.304000	0.947081	cloud A
3	6.265421	1.265421	-0.623274	cloud A
4	7.325031	2.325031	-0.218792	cloud A
...	...	...	...	...
95	0.044085	-4.658415	6.014424	cloud C
96	0.133556	0.038378	9.619508	cloud C
97	0.401736	-4.324301	3.687626	cloud C
98	-0.202251	-1.378106	3.291303	cloud C
99	0.110040	-2.580220	4.072073	cloud C

1200 rows × 4 columns

../../../_images/seaborn_7_1.png