Generating Handwritten Digits#

Gaussian mixture models are generative models; that is, a trained model can be used to generate new samples that look similar to the training samples. If we train a Gaussian mixture model on QMNIST, we should be able to generate images of handwritten digits that are not identical to any of the QMNIST images.

Loading and Preprocessing Data#

Task: Load QMNIST training images, labels, and writer IDs. Center the digits' bounding boxes and crop the images to 20x20 pixels.

Solution:

# your solution
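A possible solution sketch. The QMNIST loading lines are an assumption based on torchvision's `QMNIST` dataset (with `compat=False` the extended targets should include the writer ID) and are left as comments; the cropping helper is demonstrated on a synthetic image so the snippet runs stand-alone:

```python
import numpy as np

# Loading QMNIST (assumed API, check the torchvision docs):
#
#   from torchvision.datasets import QMNIST
#   train = QMNIST("data", what="train", compat=False, download=True)
#   images = train.data.numpy()            # (N, 28, 28) uint8
#   labels = train.targets[:, 0].numpy()   # digit class
#   writers = train.targets[:, 2].numpy()  # writer ID

def crop_center_20(img):
    """Center the digit's bounding box and crop to 20x20 pixels."""
    rows = np.any(img > 0, axis=1)
    cols = np.any(img > 0, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    # center of the bounding box
    rc = (r0 + r1) // 2
    cc = (c0 + c1) // 2
    # zero-pad so a 20x20 window around the center never leaves the image
    padded = np.pad(img, 10)
    rc, cc = rc + 10, cc + 10
    return padded[rc - 10:rc + 10, cc - 10:cc + 10]

# demo on a synthetic 28x28 "digit" (a bright off-center square)
demo = np.zeros((28, 28))
demo[2:8, 20:26] = 1.0
cropped = crop_center_20(demo)
print(cropped.shape)  # (20, 20)
```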

Task: Use PCA with 50 components to reduce the dimensionality of the data; otherwise, training the model would take too long. Show an image together with its projection onto the 50-dimensional subspace constructed by PCA.

Solution:

# your solution
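A sketch of the PCA step using scikit-learn's `PCA`. The random stand-in data replaces the flattened 20x20 images from the previous step, and the plotting lines (matplotlib assumed) are left as comments:

```python
import numpy as np
from sklearn.decomposition import PCA

# stand-in for the flattened 20x20 images; with real data this would be
# images.reshape(len(images), -1)
rng = np.random.default_rng(0)
X = rng.random((1000, 400))

pca = PCA(n_components=50)
X_reduced = pca.fit_transform(X)                # (N, 50) coordinates
X_projected = pca.inverse_transform(X_reduced)  # back in the 400-dim space

# show an image next to its projection:
#   import matplotlib.pyplot as plt
#   fig, axes = plt.subplots(1, 2)
#   axes[0].imshow(X[0].reshape(20, 20), cmap="gray")
#   axes[1].imshow(X_projected[0].reshape(20, 20), cmap="gray")
#   plt.show()

print(X_reduced.shape, X_projected.shape)
```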

Training the Model#

Task: Fit a Gaussian mixture model to the data. What is a sensible number of clusters?

Solution:

# your solution
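At least ten components (one per digit class) are needed; since each digit occurs in several writing styles, a somewhat larger number, chosen for instance by comparing BIC values, is usually better. A sketch with scikit-learn's `GaussianMixture` on stand-in data:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X_reduced = rng.standard_normal((1000, 50))  # stand-in for PCA-reduced data

# one component per digit class is the natural lower bound; more
# components can capture different writing styles of the same digit
gmm = GaussianMixture(n_components=10, covariance_type="full",
                      random_state=0)
gmm.fit(X_reduced)

# a BIC comparison could pick among candidate sizes, e.g.:
#   for k in (10, 20, 30, 40, 50):
#       bic = GaussianMixture(k, random_state=0).fit(X_reduced).bic(X_reduced)
#       print(k, bic)

print(gmm.weights_.shape)
```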

Generating Images#

Task: Use GaussianMixture.sample to generate 100 new images showing handwritten digits. Show all images.

Solution:

# your solution
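A sketch assuming fitted `pca` and `gmm` objects as in the previous steps; stand-in data is used here so the snippet is self-contained, and the plotting grid (matplotlib assumed) is indicated in comments:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.random((500, 400))  # stand-in for flattened 20x20 images
pca = PCA(n_components=50).fit(X)
gmm = GaussianMixture(n_components=10, random_state=0).fit(pca.transform(X))

# sample returns (samples, component_labels); map back to pixel space
samples, _ = gmm.sample(100)
images = pca.inverse_transform(samples).reshape(-1, 20, 20)

# show all 100 images in a 10x10 grid:
#   import matplotlib.pyplot as plt
#   fig, axes = plt.subplots(10, 10, figsize=(10, 10))
#   for ax, img in zip(axes.ravel(), images):
#       ax.imshow(img, cmap="gray")
#       ax.axis("off")
#   plt.show()

print(images.shape)
```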

Gaussian Mixture Unmixed#

If we already know the clusters and only want to generate new images, we may fit a Gaussian distribution to each cluster manually. That is, for each cluster we compute the mean vector and the covariance matrix.

Task: Find the writer with the highest number of images available. Show all images for this writer.

Solution:

# your solution
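A sketch of the counting step; the writer IDs here are random stand-ins for the QMNIST writer IDs loaded earlier, and showing the selected images would follow the same `imshow` grid pattern as before, restricted to `images[mask]`:

```python
import numpy as np

# stand-in writer IDs (with real QMNIST: one writer ID per training image)
rng = np.random.default_rng(0)
writers = rng.integers(0, 50, size=2000)

# writer with the highest number of images
ids, counts = np.unique(writers, return_counts=True)
best_writer = ids[np.argmax(counts)]
mask = writers == best_writer

print(best_writer, mask.sum())
```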

Task: Get means and covariances for the ten classes of digits written by the writer.

Solution:

# your solution
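A sketch using `np.cov`; the stand-in arrays replace the selected writer's PCA-reduced images and their digit labels:

```python
import numpy as np

rng = np.random.default_rng(0)
# stand-in for the writer's PCA-reduced images and digit labels
X_writer = rng.standard_normal((200, 50))
y_writer = rng.integers(0, 10, size=200)

means = []
covs = []
for digit in range(10):
    Xd = X_writer[y_writer == digit]
    means.append(Xd.mean(axis=0))
    # rowvar=False: rows are observations, columns are variables
    covs.append(np.cov(Xd, rowvar=False))

means = np.array(means)  # (10, 50)
covs = np.array(covs)    # (10, 50, 50)
print(means.shape, covs.shape)
```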

Task: Use NumPy’s random number generator to generate 10 new images per class.

Solution:

# your solution
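A sketch using NumPy's `Generator.multivariate_normal`; the means and covariances are stand-ins for those computed in the previous step, and mapping the samples back to pixel space would again use `pca.inverse_transform` followed by `reshape(-1, 20, 20)`:

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in mean vectors and covariance matrices for the ten classes
means = rng.standard_normal((10, 50))
covs = np.array([np.eye(50) for _ in range(10)])

# 10 new samples per class from the class-wise Gaussian
new_samples = np.array([
    rng.multivariate_normal(means[d], covs[d], size=10)
    for d in range(10)
])  # (10 classes, 10 samples, 50 dims)

print(new_samples.shape)
```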