python-course.eu

34. Multi-level Indexing in Pandas

By Bernd Klein. Last modified: 26 Apr 2023.

Introduction

In the previous chapter of our Pandas tutorial we learned the basic concepts of Pandas and introduced the data structures Series and DataFrame.

We also learned how to create and manipulate the Series and DataFrame objects in numerous Python programs.

Now it is time to learn some further aspects of these data structures in this chapter of our tutorial.

We will start with advanced indexing possibilities in Pandas.

Advanced or Multi-Level Indexing

Advanced or multi-level indexing is available for both Series and DataFrames. It is an efficient way to store and manipulate data of arbitrarily high dimension in 1-dimensional (Series) and 2-dimensional tabular (DataFrame) structures. In other words, we can work with higher-dimensional data in lower dimensions. The following example builds a Series whose index has two levels, the city and the attribute:

import pandas as pd

cities = ["Vienna", "Vienna", "Vienna",
          "Hamburg", "Hamburg", "Hamburg",
          "Berlin", "Berlin", "Berlin",
          "Zürich", "Zürich", "Zürich"]
index = [cities, ["country", "area", "population",
                  "country", "area", "population",
                  "country", "area", "population",
                  "country", "area", "population"]]
print(index)

OUTPUT:

[['Vienna', 'Vienna', 'Vienna', 'Hamburg', 'Hamburg', 'Hamburg', 'Berlin', 'Berlin', 'Berlin', 'Zürich', 'Zürich', 'Zürich'], ['country', 'area', 'population', 'country', 'area', 'population', 'country', 'area', 'population', 'country', 'area', 'population']]

data = ["Austria", 414.60,    1805681,
        "Germany", 755.00,    1760433,
        "Germany", 891.85,    3562166,
        "Switzerland", 87.88, 378884]

city_series = pd.Series(data, index=index)
print(city_series)

OUTPUT:

Vienna   country           Austria
         area                414.6
         population        1805681
Hamburg  country           Germany
         area                755.0
         population        1760433
Berlin   country           Germany
         area               891.85
         population        3562166
Zürich   country       Switzerland
         area                87.88
         population         378884
dtype: object
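Writing out the repeated labels by hand gets tedious for larger data. As a sketch of one alternative, and not the way this chapter builds its index, the same index can be created with pd.MultiIndex.from_product, which pairs every city with every attribute. The variable name attributes is introduced here for illustration only:

```python
import pandas as pd

cities = ["Vienna", "Hamburg", "Berlin", "Zürich"]
attributes = ["country", "area", "population"]  # illustrative name

# Every city is combined with every attribute; the city varies slowest.
index = pd.MultiIndex.from_product([cities, attributes])

data = ["Austria", 414.60, 1805681,
        "Germany", 755.00, 1760433,
        "Germany", 891.85, 3562166,
        "Switzerland", 87.88, 378884]

city_series = pd.Series(data, index=index)

print(city_series.index.nlevels)    # number of index levels
print(city_series.index.levels[0])  # unique labels of the outer level
```

The resulting Series has the same labels in the same order as the one built from the two plain lists above.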

We can access the data of a city in the following way:

print(city_series["Vienna"])

OUTPUT:

country       Austria
area            414.6
population    1805681
dtype: object

We can also access the information about the country, area or population of a city. We can do this in two ways:

print(city_series["Vienna"]["area"])

OUTPUT:

414.6

The other way passes both keys in a single lookup:

print(city_series["Vienna", "area"])

OUTPUT:

414.6
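The same lookups can also be written with the label-based .loc indexer. The following is a minimal sketch; the series is rebuilt here (with pd.MultiIndex.from_product, for brevity) so that the snippet runs on its own:

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["Vienna", "Hamburg", "Berlin", "Zürich"],
     ["country", "area", "population"]])
data = ["Austria", 414.60, 1805681,
        "Germany", 755.00, 1760433,
        "Germany", 891.85, 3562166,
        "Switzerland", 87.88, 378884]
city_series = pd.Series(data, index=index)

# One lookup with a tuple holding the outer and the inner key
area = city_series.loc[("Vienna", "area")]

# Two chained lookups: first the city, then the attribute
same_area = city_series.loc["Vienna"].loc["area"]

print(area, same_area)
```

The tuple form performs a single lookup, whereas the chained form first creates the intermediate Series for Vienna.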

We can also select all the entries of a city explicitly by passing a full slice as the key for the inner level:

city_series["Hamburg",:]

OUTPUT:

country       Germany
area            755.0
population    1760433
dtype: object
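To get the content of several cities at the same time, a list of city names can be passed to .loc. This is a minimal sketch; the series is rebuilt (with pd.MultiIndex.from_product, for brevity) so that the snippet is self-contained, and the name two_cities is illustrative:

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["Vienna", "Hamburg", "Berlin", "Zürich"],
     ["country", "area", "population"]])
data = ["Austria", 414.60, 1805681,
        "Germany", 755.00, 1760433,
        "Germany", 891.85, 3562166,
        "Switzerland", 87.88, 378884]
city_series = pd.Series(data, index=index)

# A list of outer-level labels selects all rows of these cities.
two_cities = city_series.loc[["Hamburg", "Berlin"]]
print(two_cities)
```

Unlike the lookup of a single city, the result keeps both index levels, because more than one outer label is present.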

If the index is sorted, we can also apply a slicing operation:

city_series = city_series.sort_index()
print("city_series with sorted index:")
print(city_series)

print("\n\nSlicing the city_series:")
city_series["Berlin":"Vienna"]

OUTPUT:

city_series with sorted index:
Berlin   area               891.85
         country           Germany
         population        3562166
Hamburg  area                755.0
         country           Germany
         population        1760433
Vienna   area                414.6
         country           Austria
         population        1805681
Zürich   area                87.88
         country       Switzerland
         population         378884
dtype: object


Slicing the city_series:
Berlin   area           891.85
         country       Germany
         population    3562166
Hamburg  area            755.0
         country       Germany
         population    1760433
Vienna   area            414.6
         country       Austria
         population    1805681
dtype: object
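Whether an index is sorted can be checked before slicing. The sketch below uses the index attribute is_monotonic_increasing for this; the series is rebuilt in its original, unsorted city order so that the snippet runs on its own, and the variable names are illustrative:

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["Vienna", "Hamburg", "Berlin", "Zürich"],
     ["country", "area", "population"]])
data = ["Austria", 414.60, 1805681,
        "Germany", 755.00, 1760433,
        "Germany", 891.85, 3562166,
        "Switzerland", 87.88, 378884]
unsorted_series = pd.Series(data, index=index)

# The cities are not in lexicographic order yet.
print(unsorted_series.index.is_monotonic_increasing)

sorted_series = unsorted_series.sort_index()
print(sorted_series.index.is_monotonic_increasing)

# Label slices include both end points, so Vienna is part of the result.
part = sorted_series.loc["Berlin":"Vienna"]
print(part)
```

Note that a label-based slice includes its end label, which is why the Vienna rows appear in the sliced output and only Zürich is left out.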

In the next example, we show that it is possible to access the inner keys as well:

print(city_series[:, "area"])

OUTPUT:

Berlin     891.85
Hamburg     755.0
Vienna      414.6
Zürich      87.88
dtype: object
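An inner key can also be selected with the method xs (cross-section), which names the level explicitly. This is a sketch of an alternative to the slice notation, with the series rebuilt so that the snippet is self-contained:

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["Vienna", "Hamburg", "Berlin", "Zürich"],
     ["country", "area", "population"]])
data = ["Austria", 414.60, 1805681,
        "Germany", 755.00, 1760433,
        "Germany", 891.85, 3562166,
        "Switzerland", 87.88, 378884]
city_series = pd.Series(data, index=index)

# Select the label "area" on the inner level (level 1);
# the selected level is dropped from the result.
areas = city_series.xs("area", level=1)
print(areas)
```

The result is indexed by city only, just like the output of city_series[:, "area"].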

Swapping MultiIndex Levels

It is possible to swap the levels of a MultiIndex with the method swaplevel:

swaplevel(self, i=-2, j=-1, copy=True)
    Swap levels i and j in a MultiIndex

Parameters
----------
i, j : int, string (can be mixed)
       Level of index to be swapped. Can pass level name as string.
       The indexes 'i' and 'j' are optional, and default to
       the two innermost levels of the index

Returns
-------
swapped : Series

city_series = city_series.swaplevel()
city_series.sort_index(inplace=True)
city_series

OUTPUT:

area        Berlin          891.85
            Hamburg          755.0
            Vienna           414.6
            Zürich           87.88
country     Berlin         Germany
            Hamburg        Germany
            Vienna         Austria
            Zürich     Switzerland
population  Berlin         3562166
            Hamburg        1760433
            Vienna         1805681
            Zürich          378884
dtype: object
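As the parameter description says, the levels can also be passed by name. The following sketch assumes that we give the levels the names "city" and "attribute"; these names are chosen for illustration and are not used elsewhere in this chapter:

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["Vienna", "Hamburg", "Berlin", "Zürich"],
     ["country", "area", "population"]],
    names=["city", "attribute"])  # illustrative level names
data = ["Austria", 414.60, 1805681,
        "Germany", 755.00, 1760433,
        "Germany", 891.85, 3562166,
        "Switzerland", 87.88, 378884]
named_series = pd.Series(data, index=index)

# swaplevel returns a new Series; the levels are addressed by name.
swapped = named_series.swaplevel("city", "attribute").sort_index()

print(swapped.index.names)
print(swapped["area"])
```

After the swap, the attribute is the outer level, so swapped["area"] returns the areas of all cities.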
