Pandas Tutorial: Multi-level Indexing
Introduction
We learned the basic concepts of Pandas in the previous chapter of this tutorial, where we introduced the data structures
- Series and
- DataFrame
We also learned how to create and manipulate the Series and DataFrame objects in numerous Python programs.
Now it is time to learn some further aspects of these data structures in this chapter of our tutorial.
We will start with advanced indexing possibilities in Pandas.
Advanced or Multi-Level Indexing
Advanced or multi-level indexing is available for both Series and DataFrames. It is a fascinating way of working with higher-dimensional data using Pandas data structures: data of arbitrarily high dimension can be stored and manipulated efficiently in 1-dimensional (Series) and 2-dimensional tabular (DataFrame) structures. In other words, we can work with higher-dimensional data in lower dimensions. It's time to present an example in Python:
import pandas as pd
cities = ["Vienna", "Vienna", "Vienna",
          "Hamburg", "Hamburg", "Hamburg",
          "Berlin", "Berlin", "Berlin",
          "Zürich", "Zürich", "Zürich"]
index = [cities, ["country", "area", "population",
                  "country", "area", "population",
                  "country", "area", "population",
                  "country", "area", "population"]]
print(index)
data = ["Austria", 414.60, 1805681,
        "Germany", 755.00, 1760433,
        "Germany", 891.85, 3562166,
        "Switzerland", 87.88, 378884]
city_series = pd.Series(data, index=index)
print(city_series)
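The two parallel lists above repeat every label by hand. As a minimal sketch, assuming the same cities and attributes, the index can also be generated with pd.MultiIndex.from_product. The level names "city" and "attribute" are our own choice for illustration and do not appear in the example above:

```python
import pandas as pd

cities = ["Vienna", "Hamburg", "Berlin", "Zürich"]
attributes = ["country", "area", "population"]

# from_product pairs every city with every attribute, outer level first
index = pd.MultiIndex.from_product([cities, attributes],
                                   names=["city", "attribute"])

data = ["Austria", 414.60, 1805681,
        "Germany", 755.00, 1760433,
        "Germany", 891.85, 3562166,
        "Switzerland", 87.88, 378884]
city_series = pd.Series(data, index=index)

print(city_series.index.nlevels)          # number of index levels
print(city_series["Zürich", "country"])   # lookup with both keys
```

This only works so conveniently because every city has exactly the same attributes. For irregular data, the explicit lists shown above remain the more general approach.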
We can access the data of a city in the following way:
print(city_series["Vienna"])
We can also access the information about the country, area or population of a city. We can do this in two ways:
print(city_series["Vienna"]["area"])
The other way passes both keys at once:
print(city_series["Vienna", "area"])
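The following sketch uses a shortened, purely numerical version of the series with only two cities to indicate that both spellings return the same value. The first form performs two lookups in a row, while the second resolves both levels in a single step. Using .loc with a tuple is a third, more explicit variant:

```python
import pandas as pd

index = [["Hamburg", "Hamburg", "Vienna", "Vienna"],
         ["area", "population", "area", "population"]]
s = pd.Series([755.00, 1760433, 414.60, 1805681], index=index)

chained = s["Vienna"]["area"]          # two lookups in a row
direct = s["Vienna", "area"]           # one lookup with both keys
explicit = s.loc[("Vienna", "area")]   # the same lookup via .loc

print(chained, direct, explicit)
```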
We can also select all entries of a city by combining the city name with a full slice over the inner level:
print(city_series["Hamburg", :])
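To get the content of several cities at the same time, a list of city names can be passed to .loc. The following is a minimal sketch on a reduced series of our own, containing only areas and populations:

```python
import pandas as pd

cities = ["Berlin", "Berlin", "Hamburg", "Hamburg", "Vienna", "Vienna"]
keys = ["area", "population"] * 3
s = pd.Series([891.85, 3562166, 755.00, 1760433, 414.60, 1805681],
              index=[cities, keys])

# a list of outer-level labels selects all rows of these cities
selection = s.loc[["Hamburg", "Vienna"]]
print(selection)
```

The result keeps both index levels, so it can be indexed further in the same ways as the original series.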
If the index is sorted, we can also apply a slicing operation:
city_series = city_series.sort_index()
print("city_series with sorted index:")
print(city_series)
print("\n\nSlicing the city_series:")
print(city_series["Berlin":"Vienna"])
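Whether the index is sorted can be checked beforehand. The sketch below assumes a small series of our own with deliberately unordered cities and uses the is_monotonic_increasing attribute of the index to show the effect of sort_index:

```python
import pandas as pd

index = [["Vienna", "Vienna", "Berlin", "Berlin", "Hamburg", "Hamburg"],
         ["area", "population"] * 3]
s = pd.Series([414.60, 1805681, 891.85, 3562166, 755.00, 1760433],
              index=index)

# the labels are not in sorted order yet
print(s.index.is_monotonic_increasing)

s_sorted = s.sort_index()
print(s_sorted.index.is_monotonic_increasing)

# label slices include both end points: Berlin and Hamburg are returned
print(s_sorted["Berlin":"Hamburg"])
```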
In the next example, we show that it is possible to access the inner keys as well:
print(city_series[:, "area"])
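Selecting by an inner key can also be written with the method xs, which takes a cross-section at a given level. The following sketch, again on a reduced series of our own, compares both spellings:

```python
import pandas as pd

index = [["Berlin", "Berlin", "Vienna", "Vienna"],
         ["area", "population"] * 2]
s = pd.Series([891.85, 3562166, 414.60, 1805681], index=index)

by_slice = s[:, "area"]          # full slice over the cities
by_xs = s.xs("area", level=1)    # cross-section on the inner level

print(by_slice)
print(by_xs)
```

In both cases the inner level is dropped from the result, so the remaining index consists of the city names only.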
Swapping MultiIndex Levels
It is possible to swap the levels of a MultiIndex with the method swaplevel:
swaplevel(self, i=-2, j=-1, copy=True)
    Swap levels i and j in a MultiIndex

    Parameters
    ----------
    i, j : int, string (can be mixed)
        Level of index to be swapped. Can pass level name as string.
        The indexes 'i' and 'j' are optional, and default to
        the two innermost levels of the index

    Returns
    -------
    swapped : Series
city_series = city_series.swaplevel()
city_series.sort_index(inplace=True)
print(city_series)
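After the swap, the attribute is the outer level and the city the inner one, so the access patterns shown earlier apply in reversed order. A minimal sketch on a reduced series of our own:

```python
import pandas as pd

index = [["Berlin", "Berlin", "Vienna", "Vienna"],
         ["area", "population"] * 2]
s = pd.Series([891.85, 3562166, 414.60, 1805681], index=index)

# swap the two levels and sort by the new outer level
swapped = s.swaplevel().sort_index()

print(swapped["area"])             # the areas of all cities
print(swapped["area", "Vienna"])   # attribute first, then city
```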