Climate Change

In this project we download historical weather data from the DWD Open Data Portal and look at annual mean temperatures and other values at different locations in Germany.

This project relies heavily on techniques presented in Accessing Data and High-Level Data Management with Pandas, as well as on knowledge obtained in the DWD Open Data Portal project.

We use the DWD data set Historical daily station observations for Germany, see description.

Station List

The first step is to get a list of all weather stations in Germany.

Task: Download the station list from the DWD Open Data Portal, turn it into a clean data frame, and save it to a CSV file. Columns:

  • 'id' (DWD station ID, use as index, integer),

  • 'name' (string),

  • 'latitude' (float),

  • 'longitude' (float),

  • 'altitude' (integer),

  • 'first' (date of first measurement, timestamp),

  • 'last' (date of last measurement, timestamp).

Hint: pandas.read_fwf is your friend.
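To illustrate the hint, here is a minimal sketch of pandas.read_fwf on a small fixed-width text. The sample rows, their values, and the column positions in colspecs are made up for illustration; the real station list may use a different layout, header, and encoding, so the positions have to be checked against the actual file.

```python
import io
import pandas as pd

# Synthetic fixed-width sample (hypothetical layout and values, not real DWD data).
sample = """\
id    first    last     alt   lat      lon      name
----- -------- -------- ----- -------- -------- ----
00001 19370101 19860630   478  47.8413   8.8493 Station A
00003 18910101 20110331   202  50.7827   6.0941 Station B North
"""

# Half-open character ranges [from, to) for each column of the sample above.
colspecs = [(0, 5), (6, 14), (15, 23), (24, 29), (30, 38), (39, 47), (48, 90)]
names = ['id', 'first', 'last', 'altitude', 'latitude', 'longitude', 'name']

stations = pd.read_fwf(
    io.StringIO(sample),
    colspecs=colspecs,
    names=names,
    header=None,
    skiprows=2,                          # skip the two header lines of the sample
    dtype={'first': str, 'last': str},   # keep dates as text for explicit parsing
    index_col='id',
)

# Convert YYYYMMDD strings to timestamps.
for col in ['first', 'last']:
    stations[col] = pd.to_datetime(stations[col], format='%Y%m%d')

# Reorder columns as requested in the task.
stations = stations[['name', 'latitude', 'longitude', 'altitude', 'first', 'last']]
print(stations)
```

Explicit colspecs keep station names that contain spaces in one column. The resulting frame can then be written with stations.to_csv.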

Solution:

# your solution

Download Measurements

Task: Get a list of file names of all ZIP files of the data set.

Hint: An obvious idea is to construct file names from data in the station list (ID, first and last day of measurement). But it turns out that the dates in the list and in the file names do not coincide for several files. Thus, we have to scrape file names from the data set’s file listing.
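Scraping the file names can be sketched with a regular expression applied to the HTML of the file listing. The listing excerpt and the file name pattern below are assumptions made for illustration; the pattern has to be adapted to what the real listing contains.

```python
import re

# Synthetic excerpt of an HTML directory listing (hypothetical content).
listing = '''
<a href="station_list.txt">station_list.txt</a>
<a href="tageswerte_KL_00001_19370101_19860630_hist.zip">tageswerte_KL_00001_19370101_19860630_hist.zip</a>
<a href="tageswerte_KL_00003_18910101_20110331_hist.zip">tageswerte_KL_00003_18910101_20110331_hist.zip</a>
'''

# Assumed file name pattern: prefix, 5-digit station ID, two 8-digit dates, suffix.
pattern = re.compile(r'href="(tageswerte_KL_(\d{5})_(\d{8})_(\d{8})_hist\.zip)"')

matches = list(pattern.finditer(listing))
file_names = [m.group(1) for m in matches]
station_ids = [int(m.group(2)) for m in matches]

print(file_names)
print(station_ids)
```

For the real data set, the listing text would come from an HTTP request to the data set's directory (for example with requests.get(url).text) instead of the hard-coded string.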

Solution:

# your solution

Task: Process all files. Processing steps are:

  1. Download the file.

  2. Read the data file contained in the ZIP file.

  3. Drop and rename columns and adjust types (see below).

  4. Write data to a CSV file (one large CSV file for data from all stations).

Columns for CSV file:

  • date (timestamp of measurement),

  • id (station ID, integer),

  • 'wind_gust', 'wind_speed', 'precipitation', 'sunshine', 'snow', 'clouds', 'pressure', 'temperature', 'humidity', 'max_temp', 'min_temp', 'min_temp_ground' (float).
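Steps 2 to 4 can be sketched as follows, using ZIP archives built in memory so that the example runs without a download. The raw column names (FX, FM, RSK, ...), the 'produkt' file name prefix, the ';' separator, and -999 as the missing value marker are assumptions about the raw files and should be checked against the data set description. The helper names fake_zip and read_measurements are hypothetical.

```python
import io
import zipfile
import pandas as pd

# Assumed mapping from raw column names to the names required above.
COLUMNS = {
    'MESS_DATUM': 'date', 'STATIONS_ID': 'id',
    'FX': 'wind_gust', 'FM': 'wind_speed', 'RSK': 'precipitation',
    'SDK': 'sunshine', 'SHK_TAG': 'snow', 'NM': 'clouds', 'PM': 'pressure',
    'TMK': 'temperature', 'UPM': 'humidity', 'TXK': 'max_temp',
    'TNK': 'min_temp', 'TGK': 'min_temp_ground',
}

HEADER = ('STATIONS_ID;MESS_DATUM;QN_3;  FX;  FM;QN_4; RSK;RSKF; SDK;SHK_TAG;'
          '  NM; VPM;  PM; TMK; UPM; TXK; TNK; TGK;eor\n')

def fake_zip(lines):
    """Build a synthetic ZIP archive in memory (stands in for a downloaded file)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w') as zf:
        zf.writestr('Metadaten_dummy.txt', 'not needed')
        zf.writestr('produkt_klima_tag_dummy.txt', HEADER + ''.join(lines))
    return buf.getvalue()

def read_measurements(zip_bytes):
    """Read the data file from a ZIP archive, then select, rename and convert columns."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        member = next(n for n in zf.namelist() if n.startswith('produkt'))
        with zf.open(member) as f:
            raw = pd.read_csv(f, sep=';', skipinitialspace=True, na_values=[-999])
    raw.columns = raw.columns.str.strip()
    data = raw[list(COLUMNS)].rename(columns=COLUMNS)
    data['date'] = pd.to_datetime(data['date'].astype(str), format='%Y%m%d')
    float_cols = [c for c in data.columns if c not in ('date', 'id')]
    data[float_cols] = data[float_cols].astype(float)
    return data

# Made-up measurement rows for two stations.
archives = [
    fake_zip(['1;19370101;5;-999;-999;5;0.0;0;-999;0;6.3;-999;-999;-0.5;-999;2.5;-1.6;-999;eor\n']),
    fake_zip(['3;18910101;5;-999;-999;5;1.2;1;4.0;0;8.0;7.1;1001.5;3.4;88.0;5.1;1.0;-0.2;eor\n']),
]

# Append all stations to one CSV; write the header only for the first chunk.
out = io.StringIO()
for k, zip_bytes in enumerate(archives):
    read_measurements(zip_bytes).to_csv(out, index=False, header=(k == 0))

print(out.getvalue())
```

Writing each station's data directly to one open output avoids holding all measurements in memory at once. For a real file, out would be a file handle opened once in write mode.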

Solution:

# your solution

Update Station List

The dates of first and last measurements in the station list created above are incorrect. Now that we have the measurements, we should correct the list.

Task: For each station get dates for first and last measurement and write them to the station list CSV file. Drop all stations that do not have any measurements.
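One possible approach, sketched on small made-up frames: group the measurements by station ID, take the minimum and maximum date per group, and inner-join the result to the station list so that stations without measurements disappear. The frames below are synthetic; in the project they would be read from the two CSV files.

```python
import pandas as pd

# Synthetic station list; station 7 has no measurements.
stations = pd.DataFrame(
    {
        'name': ['Station A', 'Station B', 'Station C'],
        'first': pd.to_datetime(['1937-01-01', '1891-01-01', '1950-01-01']),
        'last': pd.to_datetime(['1986-06-30', '2011-03-31', '1960-12-31']),
    },
    index=pd.Index([1, 3, 7], name='id'),
)

# Synthetic measurements.
data = pd.DataFrame({
    'date': pd.to_datetime(['1937-02-01', '1940-05-17', '1891-01-02', '2010-12-31']),
    'id': [1, 1, 3, 3],
    'temperature': [1.5, 12.0, -3.0, 0.5],
})

# First and last measurement date per station.
span = (
    data.groupby('id')['date']
    .agg(['min', 'max'])
    .rename(columns={'min': 'first', 'max': 'last'})
)

# Replace the old dates; the inner join drops stations without measurements.
stations = stations.drop(columns=['first', 'last']).join(span, how='inner')
print(stations)
```

Here a station counts as having measurements if it has at least one row in the data. Whether rows consisting only of missing values should count is a separate decision.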

# your solution

Plots

Task: Use Series.plot to create different plots:

  • mean annual temperature/precipitation/… for the station with the highest number of years with measurements,

  • mean annual temperature/precipitation/… in Germany (mean over all stations),

  • minimum/maximum temperature in Germany for each year.
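The aggregation behind such plots can be sketched with groupby on the year, followed by Series.plot. The data below is synthetic and tiny. Here "mean over all stations" is computed by pooling all daily values, which is only one possible reading; averaging per-station annual means first would weight stations equally. The Agg backend is selected only so that the sketch runs without a display and is not needed in a notebook.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic measurements for two stations (made-up values).
data = pd.DataFrame({
    'date': pd.to_datetime(['2000-01-01', '2000-07-01', '2001-01-01', '2001-07-01',
                            '2002-07-01', '2000-07-01', '2001-07-01']),
    'id': [1, 1, 1, 1, 1, 3, 3],
    'temperature': [0.0, 20.0, 2.0, 22.0, 21.0, 18.0, 19.0],
    'max_temp': [3.0, 27.0, 5.0, 30.0, 28.0, 25.0, 26.0],
    'min_temp': [-4.0, 12.0, -2.0, 14.0, 13.0, 11.0, 12.0],
})
data['year'] = data['date'].dt.year

# Station with the highest number of years that contain temperature values.
years_per_station = data.dropna(subset=['temperature']).groupby('id')['year'].nunique()
best = years_per_station.idxmax()

# Annual means for that station and for all stations pooled together.
annual_best = data[data['id'] == best].groupby('year')['temperature'].mean()
annual_all = data.groupby('year')['temperature'].mean()

# Yearly extremes over all stations.
yearly_max = data.groupby('year')['max_temp'].max()
yearly_min = data.groupby('year')['min_temp'].min()

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
annual_best.plot(ax=ax1, marker='o', label=f'station {best}', legend=True)
annual_all.plot(ax=ax1, marker='o', label='all stations', legend=True,
                title='Mean annual temperature')
yearly_max.plot(ax=ax2, marker='o', label='maximum', legend=True)
yearly_min.plot(ax=ax2, marker='o', label='minimum', legend=True,
                title='Yearly temperature extremes')
```

The same pattern applies to precipitation and the other columns by replacing the column name in the groupby expressions.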

# your solution