File Access

Read Accessing Data before you start with the exercises.

Lower Case Copy

Read the content of a text file, convert it to lower case, and save the result to a new text file.

Solution:

# your solution
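One possible approach, as a minimal sketch: read the whole file into a string, call `str.lower()`, and write the result to a second file. The function name `lower_case_copy` and the UTF-8 default are assumptions made for illustration; adjust the encoding to your file.

```python
from pathlib import Path


def lower_case_copy(src, dst, encoding="utf-8"):
    """Write a lower-case copy of the text file `src` to `dst`."""
    # Read the complete file content as one string.
    text = Path(src).read_text(encoding=encoding)
    # str.lower() also handles non-ASCII letters such as 'Ä'.
    Path(dst).write_text(text.lower(), encoding=encoding)


# Hypothetical usage (file names are placeholders):
# lower_case_copy("input.txt", "input_lower.txt")
```

Reading everything at once keeps the code short; for very large files a line-by-line loop would use less memory.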

Reading CSV Files

Get a CSV file containing all public trees in Chemnitz from the open data portal of Chemnitz. Read the first 10 lines of the file and print them to the screen.

Hint: If you see strange characters at the beginning of the output, have a look at the Wikipedia article on byte order marks.

Solution:

# your solution
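One possible approach, as a minimal sketch: open the file as text and take only the first lines with `itertools.islice`. The sketch assumes the file is UTF-8 encoded, possibly with a byte order mark; Python's `utf-8-sig` codec drops a leading BOM if there is one. The function name `head` is hypothetical.

```python
from itertools import islice


def head(path, n=10, encoding="utf-8-sig"):
    """Return the first `n` lines of a text file without line endings."""
    # 'utf-8-sig' is an assumption about the file's encoding; it removes
    # a byte order mark at the start of the file if one is present.
    with open(path, encoding=encoding, newline="") as f:
        # islice stops after n lines, so the rest of the file is not read.
        return [line.rstrip("\r\n") for line in islice(f, n)]


# Hypothetical usage (the file name is a placeholder):
# for line in head("trees.csv"):
#     print(line)
```

If the output still looks garbled, the file probably uses a different encoding, which can be passed via the `encoding` argument.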

Reading ZIP Files

Get ‘The Blog Authorship Corpus’ from the web. The original source, https://u.cs.biu.ac.il/~koppel/BlogCorpus.htm, went offline in 2022, so use https://www.fh-zwickau.de/~jef19jdw/datasets/blogs.zip instead. The ZIP file contains an info.txt file and the original ZIP file.

Read the info file to learn the file name format used in the original ZIP file.

Write a Python program which extracts all 5 features from the file names and saves them in a CSV file.

Solution:

# your solution
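One possible approach, as a minimal sketch: list the member names with `zipfile.ZipFile.namelist()`, split each base name at the dots, and write the parts with `csv.writer`. The column names in `FIELDS` are assumptions read off a name like 7596.male.26.Internet.Scorpio.xml; the info file is the authoritative description. The sketch also assumes that `zip_path` points to the original (inner) ZIP file and that no feature contains a dot.

```python
import csv
import zipfile
from pathlib import PurePosixPath

# Assumed column names; check the info file for the real meaning.
FIELDS = ["id", "gender", "age", "industry", "sign"]


def parse_name(member):
    """Return the 5 features of a member name, or None if it does not match."""
    # Drop any directory prefix, then split 'a.b.c.d.e.xml' at the dots.
    parts = PurePosixPath(member).name.split(".")
    if len(parts) != 6 or parts[-1].lower() != "xml":
        return None
    return parts[:5]


def names_to_csv(zip_path, csv_path):
    """Write one CSV row per matching file name; return the row count."""
    with zipfile.ZipFile(zip_path) as archive:
        parsed = (parse_name(name) for name in archive.namelist())
        rows = [row for row in parsed if row is not None]
    # newline="" lets the csv module control line endings itself.
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(FIELDS)
        writer.writerows(rows)
    return len(rows)
```

Only the archive's table of contents is read here, so no blog file has to be decompressed.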

Reading XML Files

Open the file 7596.male.26.Internet.Scorpio.xml from ‘The Blog Authorship Corpus’ (see exercise above) without extracting it explicitly. Print the first and the last post in the file to screen.

Solution:

# your solution
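One possible approach, as a minimal sketch: `zipfile.ZipFile.open()` returns a file-like object for a member, which `xml.etree.ElementTree` can parse directly, so nothing is extracted to disk. The sketch assumes that each post is stored in a `<post>` element and that the file is well-formed XML; neither is confirmed here. The function name `first_and_last_post` is hypothetical, and `zip_path` is assumed to point to the original (inner) ZIP file.

```python
import xml.etree.ElementTree as ET
import zipfile


def first_and_last_post(zip_path, member):
    """Return (first, last) post text of an XML member inside a ZIP file."""
    with zipfile.ZipFile(zip_path) as archive:
        # Accept a bare file name even if the member sits in a subdirectory.
        matches = [n for n in archive.namelist()
                   if n.rsplit("/", 1)[-1] == member]
        if not matches:
            raise KeyError(member)
        # Parse straight from the archive; no file is written to disk.
        with archive.open(matches[0]) as f:
            root = ET.parse(f).getroot()
    # Assumption: every post is the text of a <post> element.
    posts = [(post.text or "").strip() for post in root.iter("post")]
    if not posts:
        raise ValueError("no <post> elements found")
    return posts[0], posts[-1]


# Hypothetical usage (the ZIP file name is a placeholder):
# first, last = first_and_last_post("blogs_original.zip",
#                                   "7596.male.26.Internet.Scorpio.xml")
# print(first)
# print(last)
```

If `ET.parse` raises a `ParseError`, the file is not well-formed and a more lenient way of reading it is needed.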