Pandas Basics
Before solving these basic Pandas exercises you should have read Series and Data Frames.
For these exercises we use a dataset describing used cars obtained from kaggle.com. Licences: Open Data Commons Database Contents License (DbCL) v1.0 and Open Data Commons Open Database License (ODbL).
import pandas as pd
data = pd.read_csv('cars.csv')
First Look#
Basic Information#
Print the following information about the data frame data:
first 10 rows,
number of rows,
basic statistical information,
column labels, data types, memory usage.
Solution:
# your solution
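One possible approach, sketched on a small made-up frame rather than on cars.csv itself. The name toy and all values in it are assumptions for illustration; only the column names are taken from the exercises.

```python
import pandas as pd

# Hypothetical stand-in for cars.csv; the values are made up for illustration.
toy = pd.DataFrame({
    'name': ['Maruti Swift Dzire VDI', 'Hyundai i20 Sportz', 'Maruti Swift Dzire VDI'],
    'year': [2014, 2017, 2012],
    'selling_price': [450000, 550000, 300000],
    'km_driven': [70000, 30000, 120000],
})

print(toy.head(10))      # first 10 rows (fewer if the frame is shorter)

n_rows = len(toy)        # number of rows; toy.shape[0] is equivalent
print(n_rows)

stats = toy.describe()   # count, mean, std, min, quartiles, max of numeric columns
print(stats)

toy.info()               # column labels, dtypes, non-null counts, memory usage
```

Note that info() prints its report directly and returns None, so it is called without print.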
Value Counts#
Use DataFrame.nunique to get the number of different values per column.
Solution:
# your solution
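As a minimal sketch, DataFrame.nunique can be tried on a small made-up frame first. The frame toy and its values are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows; not taken from the real dataset.
toy = pd.DataFrame({
    'name': ['Maruti Swift Dzire VDI', 'Hyundai i20 Sportz', 'Maruti Swift Dzire VDI'],
    'year': [2014, 2017, 2012],
    'transmission': ['Manual', 'Manual', 'Manual'],
})

# One entry per column: how many distinct values that column holds.
distinct = toy.nunique()
print(distinct)
```

The result is a Series indexed by column label, so a single count can be read with distinct['name'].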
Unique Car Models#
Use DataFrame.value_counts to get the number of unique 'name'-'year' combinations.
Solution:
# your solution
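A sketch of how this could look, assuming a pandas version that provides DataFrame.value_counts (1.1 or later). The frame toy is made up. value_counts returns one row per distinct combination, so the length of the result is the number of unique combinations.

```python
import pandas as pd

# Hypothetical sample rows; the first two share both name and year.
toy = pd.DataFrame({
    'name': ['Maruti Swift Dzire VDI', 'Maruti Swift Dzire VDI',
             'Hyundai i20 Sportz', 'Maruti Swift Dzire VDI'],
    'year': [2014, 2014, 2017, 2012],
})

# Count how often each (name, year) pair occurs.
combos = toy.value_counts(subset=['name', 'year'])
print(combos)

# Each distinct pair appears once in the result, so its length is the answer.
n_unique = len(combos)
print(n_unique)
```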
Restructure Columns#
New Columns#
Append a column 'manual_trans' containing True where column 'transmission' shows 'Manual', else False.
Append a column 'age' showing a car’s age (now minus 'year').
Solution:
# your solution
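Both columns can be derived with vectorized expressions. The sketch below uses a hypothetical two-row frame and reads "now" as the current calendar year, which is an assumption about what the exercise intends.

```python
import pandas as pd

# Hypothetical sample rows for illustration.
toy = pd.DataFrame({
    'name': ['Maruti Swift Dzire VDI', 'Hyundai i20 Sportz'],
    'year': [2014, 2017],
    'transmission': ['Manual', 'Automatic'],
})

# Element-wise comparison yields a boolean Series, which becomes the new column.
toy['manual_trans'] = toy['transmission'] == 'Manual'

# Assumption: "now" means the current calendar year.
current_year = pd.Timestamp.now().year
toy['age'] = current_year - toy['year']

print(toy)
```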
Remove Columns#
Remove columns 'seller_type', 'transmission', and 'owner'.
Solution:
# your solution
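One way to do this is DataFrame.drop with the columns keyword, shown here on a made-up frame. The names toy and trimmed are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows for illustration.
toy = pd.DataFrame({
    'name': ['Maruti Swift Dzire VDI', 'Hyundai i20 Sportz'],
    'year': [2014, 2017],
    'seller_type': ['Individual', 'Dealer'],
    'transmission': ['Manual', 'Automatic'],
    'owner': ['First Owner', 'Second Owner'],
})

# drop returns a new frame by default; the original stays unchanged.
trimmed = toy.drop(columns=['seller_type', 'transmission', 'owner'])
print(trimmed.columns.tolist())
```

To apply the change to the original variable, the result can be assigned back, as in toy = toy.drop(columns=[...]).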
Mean Price#
Series with String Index#
Create a Pandas series price with column 'name' as index and column 'selling_price' as data.
Solution:
# your solution
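A possible construction, sketched on made-up data: set 'name' as the index and then select the 'selling_price' column. The frame toy and its prices are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows; one model name occurs twice on purpose.
toy = pd.DataFrame({
    'name': ['Maruti Swift Dzire VDI', 'Hyundai i20 Sportz', 'Maruti Swift Dzire VDI'],
    'selling_price': [450000, 550000, 300000],
})

# Move 'name' into the index, then pick the price column as a Series.
price = toy.set_index('name')['selling_price']
print(price)

# Index labels need not be unique: a repeated label selects several entries.
print(price.loc['Maruti Swift Dzire VDI'])
```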
Kilometers per Year#
Boolean Indexing#
Use boolean row indexing to get a data frame one_model with columns 'km_driven' and 'age' containing only rows with 'name' equal to 'Maruti Swift Dzire VDI'.
Solution:
# your solution
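The selection can be sketched with a boolean mask passed to .loc together with a list of column labels. The frame toy and its values, including the 'age' column, are made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows for illustration.
toy = pd.DataFrame({
    'name': ['Maruti Swift Dzire VDI', 'Hyundai i20 Sportz', 'Maruti Swift Dzire VDI'],
    'km_driven': [70000, 30000, 120000],
    'age': [10, 7, 12],
})

# Boolean Series: True for every row whose name matches.
mask = toy['name'] == 'Maruti Swift Dzire VDI'

# Rows are chosen by the mask, columns by label.
one_model = toy.loc[mask, ['km_driven', 'age']]
print(one_model)
```

The original row labels are kept, which makes it easy to trace rows back to the full frame.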
New Column#
Add a column 'km_per_year' to the one_model data frame containing kilometers per year.
Solution:
# your solution
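Dividing one column by another works element-wise, as the following sketch shows. Here one_model is built directly from made-up numbers so that the example runs on its own.

```python
import pandas as pd

# Hypothetical stand-in for the filtered frame from the previous exercise.
one_model = pd.DataFrame({
    'km_driven': [70000, 120000],
    'age': [10, 12],
})

# Element-wise division; the result is a float column.
one_model['km_per_year'] = one_model['km_driven'] / one_model['age']
print(one_model)
```

If one_model was obtained by filtering another frame, calling .copy() on it before adding the column should avoid a SettingWithCopyWarning. Rows with an age of 0 yield inf rather than raising an error, so they may need separate handling.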
Oldest Car#
Find the oldest car in data and print its name and manufacturing year. Have a look at Pandas’ documentation for suitable functions.
Solution:
# your solution
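One candidate from the documentation is Series.idxmin, which returns the index label of the smallest value. The sketch below applies it to a made-up frame; toy and its values are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows for illustration.
toy = pd.DataFrame({
    'name': ['Hyundai i20 Sportz', 'Maruti Swift Dzire VDI', 'Tata Nano'],
    'year': [2017, 2012, 2015],
})

# Label of the row with the smallest manufacturing year.
oldest_label = toy['year'].idxmin()

# Look up name and year of that row.
oldest = toy.loc[oldest_label, ['name', 'year']]
print(oldest['name'], oldest['year'])
```

idxmin reports only the first occurrence of the minimum. If several cars share the earliest year, a boolean mask such as toy['year'] == toy['year'].min() would select all of them.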