Chemnitz Trees

The aim of this project is to create an information sheet about public trees in Chemnitz. Before you start, you should have read Matplotlib Basics.

Download and Cleaning

Information on public trees in Chemnitz is available online: Chemnitz trees data set.

Task: Find the license information. Are we allowed to create an information sheet from the data set and to publish it?

Solution:

# your answers

Task: Download the data set in CSV format and read it into a data frame. Explore the data set (columns, data types, numerical ranges, row count,…) and apply standard cleaning steps as appropriate (adjust types, rename columns, drop useless columns,…).
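
A minimal sketch of the load, explore, and clean steps, assuming pandas is available. The inline CSV text and its column names (`OBJECTID`, `Art`, `Pflanzjahr`, `Hoehe`) are made-up stand-ins, not the actual columns of the Chemnitz data set; for the real file, pass the path of the downloaded CSV to `pd.read_csv` instead of the `StringIO` object.

```python
import io
import pandas as pd

# Hypothetical stand-in for the downloaded file; column names are assumptions.
csv_text = """OBJECTID,Art,Pflanzjahr,Hoehe
1,Sommerlinde,1950,18.5
2,Bergahorn,1987,12.0
3,,2015,3.2
"""

trees = pd.read_csv(io.StringIO(csv_text))

# Explore: column types and non-null counts, numerical ranges, row count.
trees.info()
print(trees.describe())
print(len(trees), 'rows')

# Typical cleaning steps: rename columns, drop useless ones, adjust types.
trees = trees.rename(columns={'Art': 'species', 'Pflanzjahr': 'year', 'Hoehe': 'height'})
trees = trees.drop(columns=['OBJECTID'])
trees['species'] = trees['species'].astype('string')
trees['year'] = trees['year'].astype('Int64')   # nullable integer type
```

Which columns count as useless and which types fit best depends on what the real data set contains, so the concrete choices above are only placeholders.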

Solution:

# your solution

Short Names

To get information about the distribution of tree types, we have to unify tree names. Instead of full detailed names we want common short names (Linde instead of Sommerlinde, Ahorn instead of Bergahorn, Flieder instead of Syringa reticulata Ivory Pink,…).

Task: Add a column with common short names for all trees. There are many different ways to automatically derive short names. A good idea is to define a dictionary assigning short species names to search strings. Then full species names can be searched for those strings and, if there is a match, corresponding short names can be assigned. Find short names for all (!) trees. Use ‘sonstige’ for trees without species name in the data set.
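
The dictionary approach described above could look roughly like the following sketch. The search strings and the helper name `short_name` are illustrative assumptions; a mapping for the full data set needs many more entries. Search strings are matched case-insensitively as substrings, and the first match in dictionary order wins. Names without any match yield `None`, so they can be listed and the dictionary extended until every tree has a short name.

```python
# Search string (lower case) -> common short name. Only a tiny illustrative
# excerpt; the entries are assumptions, not a complete mapping.
SHORT_NAMES = {
    'linde': 'Linde',
    'ahorn': 'Ahorn',
    'syringa': 'Flieder',
    'flieder': 'Flieder',
}


def short_name(full_name, mapping=SHORT_NAMES, missing='sonstige'):
    """Return the short name for the first search string found in full_name.

    Returns `missing` if there is no species name at all and None if no
    search string matches.
    """
    if not isinstance(full_name, str) or not full_name.strip():
        return missing   # no species name (None, NaN, empty string)
    lowered = full_name.lower()
    for search, short in mapping.items():
        if search in lowered:
            return short
    return None          # no match yet: the dictionary needs another entry


for name in ['Sommerlinde', 'Bergahorn', 'Syringa reticulata Ivory Pink', None]:
    print(name, '->', short_name(name))
```

With a data frame, such a function could be applied via `Series.map`, for instance `trees['short'] = trees['species'].map(short_name)`, where both column names are hypothetical.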

Solution:

# your solution

Extract Information

Task: Get the following information from the data set:

  • five oldest trees,

  • list of rare species (fewer than 5 trees),

  • list of dominant species (at least 1000 trees).
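
One possible way to obtain these three pieces of information with pandas, shown on a toy data frame. The column names `short` and `year` are assumptions, and the thresholds are scaled down (rare: fewer than 2 trees, dominant: at least 3 trees) to fit the toy data.

```python
import pandas as pd

# Toy data; column names are assumptions, not those of the real data set.
trees = pd.DataFrame({
    'short': ['Linde', 'Linde', 'Linde', 'Ahorn', 'Ahorn', 'Eiche', 'Flieder'],
    'year': [1900, 1950, 2001, 1985, 2010, 1850, 2021],
})

# The oldest trees are those with the smallest planting year.
oldest = trees.nsmallest(5, 'year')

# Number of trees per species, then filter by threshold.
counts = trees['short'].value_counts()
rare = counts[counts < 2].index.tolist()        # real task: counts < 5
dominant = counts[counts >= 3].index.tolist()   # real task: counts >= 1000

print(oldest)
print('rare:', rare)
print('dominant:', dominant)
```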

Create a pie chart showing the fraction of total population for each dominant species. Include one slice for all non-dominant trees.
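
A sketch of how the slices for such a pie chart could be prepared and drawn with Matplotlib. The species counts, the scaled-down threshold, and the label `other` for the combined slice are assumptions made for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy species counts; in the real task these come from the data set.
counts = pd.Series({'Linde': 5, 'Ahorn': 3, 'Eiche': 1, 'Flieder': 1})
threshold = 3   # real task: 1000

# Keep dominant species and sum all remaining trees into one extra slice.
is_dominant = counts >= threshold
slices = counts[is_dominant].copy()
slices['other'] = counts[~is_dominant].sum()

fig, ax = plt.subplots()
ax.pie(slices.values, labels=slices.index.tolist(), autopct='%1.0f%%')
ax.set_title('Share of total tree population')
```

In a Jupyter notebook the figure is displayed at the end of the cell; in a plain script, call `plt.show()` or `fig.savefig(...)` afterwards.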

Create a stacked bar plot showing the fractions of dominant species by age. Group ages by decade. The horizontal axis shows age in decades, starting with 0 (for the decade 2020 to 2029) at the right. The vertical axis shows fractions (‘linear pie chart’).
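
The stacked bar plot could be approached as in the following sketch. It assumes a toy data frame with hypothetical columns `short` and `year`, in which non-dominant trees have already been relabeled as `other`. The age in decades is computed as `(2029 - year) // 10`, so that the years 2020 to 2029 map to 0, the years 2010 to 2019 to 1, and so on.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy data; column names and species labels are assumptions.
trees = pd.DataFrame({
    'short': ['Linde', 'Linde', 'Ahorn', 'other', 'Linde', 'Ahorn', 'Ahorn', 'other'],
    'year': [2021, 2025, 2020, 2023, 2012, 2015, 1999, 1995],
})

# Age in decades: 0 for 2020-2029, 1 for 2010-2019, 2 for 2000-2009, ...
trees['decade'] = (2029 - trees['year']) // 10

# Rows: decades, columns: species, values: fraction within each decade.
fractions = pd.crosstab(trees['decade'], trees['short'], normalize='index')

# Descending order puts decade 0 at the right; decades without trees
# are added as empty bars so the axis has no gaps.
all_decades = range(fractions.index.max(), -1, -1)
fractions = fractions.reindex(all_decades, fill_value=0)

fig, ax = plt.subplots()
fractions.plot.bar(ax=ax, stacked=True, width=1.0)
ax.set_xlabel('age in decades')
ax.set_ylabel('fraction of trees')
ax.set_ylim(0, 1)
```

Normalizing per row makes every non-empty bar add up to 1, which is what turns each bar into a ‘linear pie chart’ for its decade.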

Solution:

# your solution

Presentation

Task: Create a PDF file in A4 format showing all information extracted above. Use whatever software you like. LibreOffice Writer is a good starting point.

Pimp your pie and bar plots. Format lists of oldest and rarest trees nicely. Add some visual elements (lines, boxes,…) to structure the document and guide the viewer’s eyes.

Feel free to add further information. For instance, try to find locations of old and rare trees.