python-course.eu

28. Pandas Pivot

By Bernd Klein. Last modified: 21 Feb 2024.

In the context of data manipulation, "pivoting" refers to the transformation of data from one form to another, typically involving the rotation or reorganization of the data around certain key variables. This rotation is akin to the movement around a pivot point in mechanical systems.

pivot

In Python, the pandas library provides the pivot function, which operates similarly to pivot tables in spreadsheet software. It allows users to reshape data frames by specifying index, column, and value variables, thereby pivoting the data around those variables. In other words: The pivot function is used to reshape a given DataFrame into a different shape. Pivot has three parameters:

The pivot function generally takes three main parameters:

Additionally, there is an optional parameter:

A call to pivot returns a reshaped DataFrame. The organization of the new DataFrame is done by setting the three parameters accordingly:

The values for these parameters correspond to column names or index of the DataFrame wich is to be reshaped.

We will define now a DataFrame with for columns which we will use to demonstrate the way of working of the pivot method:

import pandas as pd

# Sample data
data = {
    'Color': ['red', 'green', 'blue', 'red', 'green', 'blue'],
    'Number': ['one', 'two', 'one', 'two', 'one', 'two'],
    'Time': [345, 325, 898, 989, 23, 143],
    'Point': [1, 2, 3, 4, 5, 6]
}

# Create DataFrame
df = pd.DataFrame(data)
df
Color Number Time Point
0 red one 345 1
1 green two 325 2
2 blue one 898 3
3 red two 989 4
4 green one 23 5
5 blue two 143 6

We can choose no one of the columns as the new index and another columns as the new column names. We use A as the index and B for the new column names in the following examples. If no argument is given for the third parameter values, pivot will automatically use all the remaining columns for the dat and the result will have hierarchically indexed columns.

df2 = df.pivot(index='Color',
               columns='Number')

df2
Time Point
Number one two one two
Color
blue 898 143 3 6
green 23 325 5 2
red 345 989 1 4

The above call corresponds to the following one, in which we provide a list of value columns:

df3 = df.pivot(index='Color',
               columns='Number',
               values=['Time', 'Point'])
df3
Time Point
Number one two one two
Color
blue 898 143 3 6
green 23 325 5 2
red 345 989 1 4

In the next call, we just use one column for the values:

df3 = df.pivot(index='Color',
               columns='Number',
               values=['Time'])
df3
Time
Number one two
Color
blue 898 143
green 23 325
red 345 989

Another Example:

import pandas as pd


data = {
    'Product': ['Fancy Chair', 'Fancy Chair', 'Luxury Sofa', 'Designer Table', 'Luxury Sofa'],
    'Color': ['Blue', 'Green', 'Blue', 'Green', 'Red'],
    'Customer Price': [2345.89, 2390.50, 1820.00, 3100.00, 2750.00],
    'Non-Customer Price': [2445.89, 2495.50, 1980.00, 3400.00, 2850.00]
}

# Create DataFrame
df = pd.DataFrame(data)
print(df)

OUTPUT:

          Product  Color  Customer Price  Non-Customer Price
0     Fancy Chair   Blue         2345.89             2445.89
1     Fancy Chair  Green         2390.50             2495.50
2     Luxury Sofa   Blue         1820.00             1980.00
3  Designer Table  Green         3100.00             3400.00
4     Luxury Sofa    Red         2750.00             2850.00
df2 = df.pivot(index='Product',
              columns='Color',
              values='Customer Price')

print(df2)

OUTPUT:

Color              Blue   Green     Red
Product                                
Designer Table      NaN  3100.0     NaN
Fancy Chair     2345.89  2390.5     NaN
Luxury Sofa     1820.00     NaN  2750.0
df3 = df.pivot(index='Product',
              columns='Color',
              values=['Customer Price', 'Non-Customer Price'])

print(df3)

OUTPUT:

               Customer Price                 Non-Customer Price          \
Color                    Blue   Green     Red               Blue   Green   
Product                                                                    
Designer Table            NaN  3100.0     NaN                NaN  3400.0   
Fancy Chair           2345.89  2390.5     NaN            2445.89  2495.5   
Luxury Sofa           1820.00     NaN  2750.0            1980.00     NaN   

                        
Color              Red  
Product                 
Designer Table     NaN  
Fancy Chair        NaN  
Luxury Sofa     2850.0  

Live Python training

instructor-led training course

Enjoying this page? We offer live Python training courses covering the content of this site.

See: Live Python courses overview

Upcoming online Courses

Enrol here