Seaborn

Seaborn is a Python library based on Matplotlib. It provides two useful features:

  • different pre-defined styles for Matplotlib figures,

  • lots of functions for visualizing complex datasets.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

The alias sns may look arbitrary. It stands for the initials of Samuel Norman Seaborn, a fictional television character. See also issue #229 in Seaborn's GitHub repository.

sns.set_style('darkgrid')
fig, ax = plt.subplots()
x = [1, 2]
y = [1, 2]
ax.plot(x, y)
plt.show()
[Figure: line plot rendered with the darkgrid style]
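
Besides darkgrid, Seaborn ships the styles whitegrid, dark, white, and ticks. As a small sketch (the loop and the two inspected keys are chosen only for illustration), sns.axes_style returns the Matplotlib parameters behind a style without changing the global state. It can also be used as a context manager to restyle a single figure:

```python
import matplotlib.pyplot as plt
import seaborn as sns

# names of Seaborn's built-in styles
style_names = ['darkgrid', 'whitegrid', 'dark', 'white', 'ticks']

# inspect two of the Matplotlib parameters each style sets
for style_name in style_names:
    params = sns.axes_style(style_name)
    print(style_name, params['axes.facecolor'], params['axes.grid'])

# apply a style to one figure only; the global style stays untouched
with sns.axes_style('whitegrid'):
    fig, ax = plt.subplots()
    ax.plot([1, 2], [1, 2])
# call plt.show() here when running outside a notebook
```

The context manager is convenient when a single figure should deviate from the style set with sns.set_style.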

Seaborn also supports different scalings for different use cases. The scaling is set with sns.set_context, which takes one of the strings paper, notebook, talk, or poster; notebook is the default. Because the context scales fonts, line widths, and similar elements, almost identical code can produce figures for different publication channels.

sns.set_context('talk')
fig, ax = plt.subplots()
x = [1, 2]
y = [1, 2]
ax.plot(x, y)
plt.show()
[Figure: the same line plot rendered with the talk context]
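
To see what a context changes, sns.plotting_context returns the scaled Matplotlib parameters for a context without applying them. The sketch below (variable names are illustrative) compares the base font size and line width across the four contexts. It also uses the font_scale argument, which scales font elements independently of the other elements:

```python
import seaborn as sns

# the four built-in contexts, from smallest to largest scaling
context_names = ['paper', 'notebook', 'talk', 'poster']

# collect the base font size each context would set
font_sizes = {}
for context_name in context_names:
    params = sns.plotting_context(context_name)
    font_sizes[context_name] = params['font.size']
    print(context_name, params['font.size'], params['lines.linewidth'])

# 'talk' scaling with fonts enlarged by a further 20 percent;
# nothing is applied globally, the parameters are only returned
larger_fonts = sns.plotting_context('talk', font_scale=1.2)
print(larger_fonts['font.size'], larger_fonts['lines.linewidth'])
```

sns.set_context accepts the same font_scale argument when the setting should be applied globally.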

Plots for Exploring Data Sets

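Seaborn comes with many functions that take a whole data set (a Pandas data frame) and create complex visualizations of it. For an overview, have a look at the official Seaborn tutorials and the Seaborn gallery. The example below draws samples from three Gaussian point clouds and plots all pairwise relations between the coordinates with sns.pairplot.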

sns.set_context('notebook')
rng = np.random.default_rng(0)

# parameters for point clouds
means = [[5, 0, 0], [-5, 2, 0], [0, -3, 5]] # mean vectors
covs = [[[1, 1, 0], [1, 1, 0], [0, 0, 1]],
        [[10, 2, 0], [2, 10, 2], [0, 2, 10]],
        [[0.1, 0, 0], [0, 3, 3], [0, 3, 7]]] # covariance matrices
names = ['cloud A', 'cloud B', 'cloud C'] # names
ns = [100, 1000, 100] # samples per cloud

# create data frame with named samples from each cloud
clouds = []
for (mean, cov, name, n) in zip(means, covs, names, ns):
    samples = rng.multivariate_normal(mean, cov, n)  # array of shape (n, 3)
    cloud_data = pd.DataFrame(samples, columns=['x', 'y', 'z'])
    cloud_data['name'] = name
    clouds.append(cloud_data)
data = pd.concat(clouds)

# show data frame structure (display is provided by Jupyter/IPython)
display(data)

# plot pairwise relations with Seaborn
sns.pairplot(data, hue='name', hue_order=['cloud B', 'cloud A', 'cloud C'])
plt.show()
             x          y          z  name
0     4.874270  -0.125730  -0.132105  cloud A
1     4.895100  -0.104900  -0.535669  cloud A
2     3.696000  -1.304000   0.947081  cloud A
3     6.265421   1.265421  -0.623274  cloud A
4     7.325031   2.325031  -0.218792  cloud A
...        ...        ...        ...  ...
95    0.044085  -4.658415   6.014424  cloud C
96    0.133556   0.038378   9.619508  cloud C
97    0.401736  -4.324301   3.687626  cloud C
98   -0.202251  -1.378106   3.291303  cloud C
99    0.110040  -2.580220   4.072073  cloud C

1200 rows × 4 columns

[Figure: pair plot of x, y, and z, colored by cloud name]