python-course.eu

28. Pandas Styling

By Bernd Klein. Last modified: 03 Feb 2025.


Introduction

Pure text is straightforward and easy to read, but it can sometimes lack emphasis and clarity, especially in complex or lengthy content. On the other hand, text with highlights, different fonts, and colors helps guide the reader’s attention, emphasize key points, and improve readability. Using visual distinctions like bold, italics, and color coding can make important information stand out, enhance comprehension, and improve user engagement. However, overuse of these elements can create clutter and reduce readability, so balance is key.

This was an example of a pure text or matter-of-fact style as we usually find it in serious textbooks. In contrast, you might encounter a more engaging style like this:

Pure text is straightforward and easy to read, but it can sometimes lack emphasis and clarity, especially in complex or lengthy content.
On the other hand, text with highlights, different fonts, and 🎨 colors helps guide the reader’s attention, emphasize key points, and improve readability.
✅ Using bold, italics, and 🟢 color coding can make important information stand out, enhance comprehension, and improve user engagement.
⚠️ However, overuse of these elements can create clutter and reduce readability, so balance is key!

Now, you might already have an idea about the focus of this chapter in our Pandas tutorial. Let's shift our attention to Python and Pandas: Just as we can enhance and structure text to improve clarity and presentation, we can apply similar principles to a DataFrame, making it more readable and visually appealing without altering its raw data.

In programming—including Python development—a fundamental best practice is to separate raw data (logic) from its presentation (styling or rendering). This principle applies across various domains, from web development to data science and backend engineering.

In Pandas, we adhere to this principle by keeping raw data independent of its presentation. This ensures that the data remains accurate, computationally usable, and easily formatted when needed, without altering its underlying structure.

Why Separate Data from Presentation in Pandas?


The .style property

Pandas provides a powerful .style property that allows you to format and style DataFrames in a visually appealing way, especially useful for Jupyter Notebooks and reports. The .style property in Pandas enables dynamic formatting and visualization without changing the raw data. It improves readability with number formatting, color gradients, and highlights while keeping computations intact.

Basic Formatting with .format

First, let's take a look at the format method of Pandas' .style. It customizes how DataFrame values are displayed without modifying the underlying data.

The following example demonstrates how to format numerical values in a DataFrame by setting column B to display only two decimal places:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4.1234, 5.5678, 6.91011]})

# Apply formatting
styled_df = df.style.format({'B': "{:,.2f}"})  # Format column 'B' to 2 decimal places
styled_df
  A B
0 1 4.12
1 2 5.57
2 3 6.91

Other formatting possibilities:

Specifier  Effect                                          Example ("{:<specifier>}".format(123.456))
.0f        Round to 0 decimal places                       123
.2f        Round to 2 decimal places                       123.46
.2%        Convert to percentage with 2 decimals           12345.60%
.2e        Scientific notation (lowercase), 2 decimals     1.23e+02
.2E        Scientific notation (uppercase), 2 decimals     1.23E+02
g          General format (removes unnecessary decimals)   123.456
G          General format (uppercase exponent if needed)   123.456
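A quick way to check what a specifier does is Python's built-in format() function, which uses the same mini-language as the strings passed to .style.format. The following sketch formats the value 123.456 with several specifiers; the names value, specs and formatted are only illustrative. Note that the scientific notation needs an explicit precision, such as .2e.

```python
value = 123.456

# Specifiers from Python's format mini-language
specs = ['.0f', '.2f', '.2%', '.2e', '.2E', 'g', 'G']

# format(value, spec) is equivalent to ("{:" + spec + "}").format(value)
formatted = {spec: format(value, spec) for spec in specs}

for spec, text in formatted.items():
    print(f"{spec:>4} -> {text}")
```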

Common Use Cases:

The data remains numerical, but is displayed in a formatted way.

Before exploring another example of format, let's first examine the wrong approach to understand why mixing logic and rendering is a bad practice. By seeing this mistake in action, you'll better appreciate the importance of keeping data processing and presentation separate:

df = pd.DataFrame({'Price': ['$1,234.57', '$9,876.43'], 'Discount': ['15%', '25%']})
df
  Price Discount
0 $1,234.57 15%
1 $9,876.43 25%

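This is bad because the formatted values are stored as strings: they can no longer be summed, averaged, or compared numerically without first being parsed back into numbers.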
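A minimal sketch of the problem, using the same values as above. The name df_text and the parsing step are only an illustration of the extra work that string-formatted data causes:

```python
import pandas as pd

# Same "wrong" DataFrame as above: the formatting is baked into the data
df_text = pd.DataFrame({'Price': ['$1,234.57', '$9,876.43'],
                        'Discount': ['15%', '25%']})

# Pandas does not treat the column as numeric
print(pd.api.types.is_numeric_dtype(df_text['Price']))  # False

# Before any calculation, the display formatting has to be stripped again
prices = df_text['Price'].str.replace('[$,]', '', regex=True).astype(float)
print(prices.sum())
```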

Now, the correct way with style.format:

df = pd.DataFrame({'Price': [1234.57, 9876.43], 'Discount': [0.15, 0.25]})

# Apply formatting only for display (keeping numbers intact)
df.style.format({'Price': '${:,.2f}', 'Discount': '{:.0%}'})
  Price Discount
0 $1,234.57 15%
1 $9,876.43 25%

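This is a lot better: the columns stay numeric, so calculations keep working, and the currency and percent formatting exists only in the display.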

Highlighting Maximum Values

The following example demonstrates how to highlight the maximum value in each column using highlight_max(), making key values stand out visually.

import pandas as pd

df = pd.DataFrame({
    'First Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
    'Weight (kg)': [68, 85, 74, 90, 62],
    'Height (cm)': [165, 180, 175, 185, 160],
    'IQ': [120, 135, 110, 145, 125]
})

# Apply styling to highlight the maximum value in each column
styled_df = df.style.highlight_max(axis=0, color='lightgreen')

styled_df
  First Name Weight (kg) Height (cm) IQ
0 Alice 68 165 120
1 Bob 85 180 135
2 Charlie 74 175 110
3 David 90 185 145
4 Emma 62 160 125

In the previous example, we highlighted the maximum value within each column. However, highlighting the maximum values in each row wouldn’t be meaningful due to the inhomogeneous nature of the data — comparing a name, weight, height, and IQ within a single row isn't logical.

To demonstrate a case where row-wise highlighting is useful, let's explore a more suitable example. With the parameter axis, we can choose whether to highlight row-wise (axis=1) or column-wise (axis=0).

import pandas as pd

df = pd.DataFrame({
    'Monday': [8.5, 7.2, 9.0, 6.8, 8.3],
    'Tuesday': [7.4, 8.1, 6.5, 9.2, 7.0],
    'Wednesday': [8.0, 9.3, 7.7, 8.1, 6.9],
    'Thursday': [6.2, 8.4, 9.1, 7.3, 8.5],
    'Friday': [7.3, 6.8, 8.2, 7.6, 9.1]
}, index=['Alice', 'Bob', 'Charlie', 'David', 'Emma'])

# Apply styling to highlight the maximum value in each row
# (each person's longest workday)
styled_df = df.style.highlight_max(axis=1, color='orange')

styled_df
  Monday Tuesday Wednesday Thursday Friday
Alice 8.500000 7.400000 8.000000 6.200000 7.300000
Bob 7.200000 8.100000 9.300000 8.400000 6.800000
Charlie 9.000000 6.500000 7.700000 9.100000 8.200000
David 6.800000 9.200000 8.100000 7.300000 7.600000
Emma 8.300000 7.000000 6.900000 8.500000 9.100000

Now, we highlight the longest workday per day with axis=0:

import pandas as pd

df = pd.DataFrame({
    'Monday': [8.5, 7.2, 9.0, 6.8, 8.3],
    'Tuesday': [7.4, 8.1, 6.5, 9.2, 7.0],
    'Wednesday': [8.0, 9.3, 7.7, 8.1, 6.9],
    'Thursday': [6.2, 8.4, 9.1, 7.3, 8.5],
    'Friday': [7.3, 6.8, 8.2, 7.6, 9.1]
}, index=['Alice', 'Bob', 'Charlie', 'David', 'Emma'])

styled_df = df.style.highlight_max(axis=0, color='orange')

styled_df
  Monday Tuesday Wednesday Thursday Friday
Alice 8.500000 7.400000 8.000000 6.200000 7.300000
Bob 7.200000 8.100000 9.300000 8.400000 6.800000
Charlie 9.000000 6.500000 7.700000 9.100000 8.200000
David 6.800000 9.200000 8.100000 7.300000 7.600000
Emma 8.300000 7.000000 6.900000 8.500000 9.100000
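Since the colors are lost in a plain-text rendering, it can help to check which cells would be highlighted. One way, sketched here as an assumption rather than as what highlight_max() does internally, is idxmax(), which follows the same axis convention. Unlike highlight_max(), it reports only the first maximum if there are ties.

```python
import pandas as pd

df = pd.DataFrame({
    'Monday': [8.5, 7.2, 9.0, 6.8, 8.3],
    'Tuesday': [7.4, 8.1, 6.5, 9.2, 7.0],
    'Wednesday': [8.0, 9.3, 7.7, 8.1, 6.9],
    'Thursday': [6.2, 8.4, 9.1, 7.3, 8.5],
    'Friday': [7.3, 6.8, 8.2, 7.6, 9.1]
}, index=['Alice', 'Bob', 'Charlie', 'David', 'Emma'])

# axis=1: one result per row (each person's longest workday)
longest_day_per_person = df.idxmax(axis=1)

# axis=0: one result per column (who worked longest on each day)
top_person_per_day = df.idxmax(axis=0)

print(longest_day_per_person)
print(top_person_per_day)
```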

Applying a Gradient

The .background_gradient() method applies a color gradient to each cell based on its value using a specified colormap (cmap). The logic behind it involves normalizing the data and then mapping it to a color scale.

df.style.background_gradient(cmap='Blues')
  Monday Tuesday Wednesday Thursday Friday
Alice 8.500000 7.400000 8.000000 6.200000 7.300000
Bob 7.200000 8.100000 9.300000 8.400000 6.800000
Charlie 9.000000 6.500000 7.700000 9.100000 8.200000
David 6.800000 9.200000 8.100000 7.300000 7.600000
Emma 8.300000 7.000000 6.900000 8.500000 9.100000

Step-by-Step Logic Behind the Gradient Coloring:

  1. Normalize Values
    • Pandas scales the values in each column between 0 and 1.
    • The minimum value becomes 0 (lightest color) and the maximum value becomes 1 (darkest color).
$$ \text{normalized value} = \frac{\text{cell value} - \text{min(column)}}{\text{max(column)} - \text{min(column)}} $$
  2. Apply the Colormap (cmap='Blues')

    • A matplotlib colormap (like "Blues") is used to assign colors based on the normalized values.
    • Smaller values get lighter shades of blue.
    • Larger values get darker shades of blue.
  3. Render the Colors

    • The background color for each cell is set using the corresponding color from the colormap.
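The normalization step can be reproduced by hand with plain Pandas arithmetic. This is only a sketch of the formula above; the actual implementation delegates to matplotlib and supports further parameters such as low, high, vmin and vmax. The names col_min, col_max and normalized are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Monday': [8.5, 7.2, 9.0, 6.8, 8.3],
    'Tuesday': [7.4, 8.1, 6.5, 9.2, 7.0],
    'Wednesday': [8.0, 9.3, 7.7, 8.1, 6.9],
    'Thursday': [6.2, 8.4, 9.1, 7.3, 8.5],
    'Friday': [7.3, 6.8, 8.2, 7.6, 9.1]
}, index=['Alice', 'Bob', 'Charlie', 'David', 'Emma'])

# Column-wise minimum and maximum (axis=0)
col_min = df.min(axis=0)
col_max = df.max(axis=0)

# (cell value - min(column)) / (max(column) - min(column))
normalized = (df - col_min) / (col_max - col_min)

print(normalized.round(2))
```

In every column, the smallest value maps to 0 (lightest shade) and the largest to 1 (darkest shade).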

By default, background_gradient() normalizes values column-wise (axis=0). To scale row-wise, you need to set axis to 1:

df.style.background_gradient(cmap='Blues', axis=1)
  Monday Tuesday Wednesday Thursday Friday
Alice 8.500000 7.400000 8.000000 6.200000 7.300000
Bob 7.200000 8.100000 9.300000 8.400000 6.800000
Charlie 9.000000 6.500000 7.700000 9.100000 8.200000
David 6.800000 9.200000 8.100000 7.300000 7.600000
Emma 8.300000 7.000000 6.900000 8.500000 9.100000
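With axis=1, the scaling uses each row's own minimum and maximum. As a hand-written sketch of that idea (again not the internal implementation), the row statistics have to be aligned with the index, which is what sub() and div() with axis=0 do:

```python
import pandas as pd

df = pd.DataFrame({
    'Monday': [8.5, 7.2, 9.0, 6.8, 8.3],
    'Tuesday': [7.4, 8.1, 6.5, 9.2, 7.0],
    'Wednesday': [8.0, 9.3, 7.7, 8.1, 6.9],
    'Thursday': [6.2, 8.4, 9.1, 7.3, 8.5],
    'Friday': [7.3, 6.8, 8.2, 7.6, 9.1]
}, index=['Alice', 'Bob', 'Charlie', 'David', 'Emma'])

# Row-wise minimum and maximum (one value per person)
row_min = df.min(axis=1)
row_max = df.max(axis=1)

# axis=0 in sub()/div() aligns the row statistics with the index labels
normalized_rows = df.sub(row_min, axis=0).div(row_max - row_min, axis=0)

print(normalized_rows.round(2))
```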

You can experiment with other cmap values:

Diverging Colormaps are great for highlighting deviations from a central value (e.g., positive vs. negative changes).

Multi-Color & Fancy Colormaps provide unique and vibrant effects.

Here is an example using cmap='cividis':

import pandas as pd
import numpy as np

data = {
    'January': [5000, -2000, 3000, -1000, 2600],
    'February': [-1500, 4000, -2500, 500, -1100],
    'March': [3000, -1800, 6200, -700, 1000],
    'April': [-2200, 5000, -1200, 1000, 3500]
}

index = ['Nimbus Corp', 'Quantum Dynamics', 'Aurora Ventures', 
         'Vertex Solutions', 'Orion Enterprises']

df = pd.DataFrame(data, index=index)

# Apply styling with 'cividis' colormap
styled_df = df.style.background_gradient(cmap='cividis')

styled_df
  January February March April
Nimbus Corp 5000 -1500 3000 -2200
Quantum Dynamics -2000 4000 -1800 5000
Aurora Ventures 3000 -2500 6200 -1200
Vertex Solutions -1000 500 -700 1000
Orion Enterprises 2600 -1100 1000 3500

Applying Bar Charts Inside Cells

The .style.bar() method adds bar charts within the cells to visually represent the values. It helps quickly compare work hours per person (row-wise) or per day (column-wise).

import pandas as pd

df = pd.DataFrame({
    'Monday': [8.5, 5.2, 9.8, 6.1, 7.3],
    'Tuesday': [6.4, 8.7, 7.1, 9.0, 5.6],
    'Wednesday': [7.8, 9.5, 5.9, 8.3, 6.7],
    'Thursday': [5.6, 7.2, 9.3, 6.8, 8.9],
    'Friday': [9.1, 6.5, 8.0, 7.6, 9.4]
}, index=['Alice', 'Bob', 'Charlie', 'David', 'Emma'])

# Apply bar charts column-wise (each column has its own scale)
styled_columnwise_bar = df.style.bar(color='lightblue', axis=0)
# Apply bar charts row-wise (each row has its own scale)
styled_rowwise_bar = df.style.bar(color='lightgreen', 
                                  width=90, # Max width of bars in percentage
                                  axis=1)
styled_rowwise_bar
  Monday Tuesday Wednesday Thursday Friday
Alice 8.500000 6.400000 7.800000 5.600000 9.100000
Bob 5.200000 8.700000 9.500000 7.200000 6.500000
Charlie 9.800000 7.100000 5.900000 9.300000 8.000000
David 6.100000 9.000000 8.300000 6.800000 7.600000
Emma 7.300000 5.600000 6.700000 8.900000 9.400000

Another example, demonstrating the use of other parameters:

data = {
    'January': [5000, -2000, 3000, -1000, 2600],
    'February': [-1500, 4000, -2500, 500, -1100],
    'March': [3000, -1800, 6200, -700, 1000],
    'April': [-2200, 5000, -1200, 1000, 3500]
}

index = ['Nimbus Corp', 'Quantum Dynamics', 'Aurora Ventures', 
         'Vertex Solutions', 'Orion Enterprises']

df = pd.DataFrame(data, index=index)

styled_df = df.style.bar(
    color=('red', 'green'),  # Negative values in red, positive in green
    align='zero',            # Center bars around zero, other values: 'left', 'mid' (default)
    width=80,                # Set max width of bars to 80% 
    axis=0                   # Column-wise bars
)
styled_df
  January February March April
Nimbus Corp 5000 -1500 3000 -2200
Quantum Dynamics -2000 4000 -1800 5000
Aurora Ventures 3000 -2500 6200 -1200
Vertex Solutions -1000 500 -700 1000
Orion Enterprises 2600 -1100 1000 3500

Apply functions column/row-wise

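In the context of styling, the .apply() method of .style passes each column (axis=0), each row (axis=1), or the whole DataFrame (axis=None) to a function, which must return CSS strings in the same shape as its input. Its element-wise counterpart is .map(), which calls the function once per cell (before Pandas 2.1 it was called .applymap()). Both change only how the DataFrame is displayed, not the data itself.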

Background Colours for Index, Column and Value Cells

import pandas as pd

df = pd.DataFrame({
    'Product': ['Shirt', 'Pants', 'Jacket', 'Shoes', 'Hat'],
    'Small (€)': [25.90, 40.10, 60.00, 80.00, 15.00],
    'Medium (€)': [30.50, 45.00, 65.00, 85.00, 18.00],
    'Large (€)': [35.80, 50.00, 70.00, 90.00, 20.00]
})

def highlight_values(val):
    """ Function to color value cells in light yellow """
    return 'background-color: lightyellow;'

# Apply styles to the table
styled_df = df.style.map(highlight_values).set_table_styles([
        {'selector': 'th', 'props': [('background-color', 'orange'), ('color', 'black'), ('font-weight', 'bold')]},  # All header cells
        {'selector': 'th.row_heading', 'props': [('background-color', 'orange'), ('color', 'black'), ('font-weight', 'bold')]}  # Index (row headers) carry the CSS class 'row_heading'
       ])

# Display the styled DataFrame
styled_df
  Product Small (€) Medium (€) Large (€)
0 Shirt 25.900000 30.500000 35.800000
1 Pants 40.100000 45.000000 50.000000
2 Jacket 60.000000 65.000000 70.000000
3 Shoes 80.000000 85.000000 90.000000
4 Hat 15.000000 18.000000 20.000000

The next example uses .apply() with axis=0 to give each price column its own background colour:

import pandas as pd

# Create a DataFrame with product prices in Euros (€) (All values as float)
df = pd.DataFrame({
    'Product': ['Shirt', 'Pants', 'Jacket', 'Shoes', 'Hat'],
    'Small (€)': [25.90, 40.10, 60.00, 80.00, 15.00],
    'Medium (€)': [30.50, 45.00, 65.00, 85.00, 18.00],
    'Large (€)': [35.80, 50.00, 70.00, 90.00, 20.00]
})

# Define colors for each column
column_colors = {
    'Small (€)': 'lightblue',
    'Medium (€)': 'lightgreen',
    'Large (€)': 'lightcoral'
}

# Function to apply background colors to each column
def highlight_columns(col):
    return [f'background-color: {column_colors.get(col.name, "white")}' for _ in col]

# Apply the color styling column-wise
styled_df = df.style.apply(highlight_columns, axis=0)

# Display the styled DataFrame
styled_df
  Product Small (€) Medium (€) Large (€)
0 Shirt 25.900000 30.500000 35.800000
1 Pants 40.100000 45.000000 50.000000
2 Jacket 60.000000 65.000000 70.000000
3 Shoes 80.000000 85.000000 90.000000
4 Hat 15.000000 18.000000 20.000000
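
The last example combines .map(), set_table_styles() and .format() to show monthly profits in green and losses in red:

import pandas as pd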
import numpy as np

# Define months
months = ['January', 'February', 'March', 'April', 'May', 'June',
          'July', 'August', 'September', 'October', 'November', 'December']

# Define boutique locations
locations = ['Zürich', 'Frankfurt', 'Hamburg', 'Munich']

# Generate random income values (some losses included)
np.random.seed(42)  # For reproducibility
data = np.random.randint(-5000, 20000, size=(12, 4))  # Random income values

# Create DataFrame
df = pd.DataFrame(data, index=months, columns=locations)

# Function to highlight values: Green for profits, Red for losses
def highlight_values(val):
    if val < 0:
        return 'background-color: red; color: white; font-weight: bold'
    else:
        return 'background-color: lightgreen; color: black; font-weight: bold'

# Apply styles
styled_df = df.style.map(highlight_values) \
    .set_table_styles([
        {'selector': 'th', 
         'props': [('background-color', 'yellow'), ('color', 'black'), ('font-weight', 'bold')]},  # All header cells
        {'selector': 'th.row_heading', 
         'props': [('background-color', 'yellow'), ('color', 'black'), ('font-weight', 'bold')]}  # Row headers carry the CSS class 'row_heading'
    ]) \
    .format("{:,.0f}")  # Add thousands separator for better readability

# Display the styled DataFrame
styled_df
  Zürich Frankfurt Hamburg Munich
January 18,654 10,795 -4,140 390
February 16,575 6,964 6,284 17,118
March 1,265 11,850 -574 16,962
April 9,423 6,363 11,023 3,322
May -3,315 -4,231 18,333 -2,567
June 311 51 1,420 12,568
July 15,939 14,769 1,396 3,666
August 13,942 19,233 13,431 -2,253
September -4,811 14,118 -1,995 16,042
October -3,101 19,118 -3,733 12,912
November 6,394 -1,444 -1,110 3,838
December 9,502 16,777 5,627 3,792
