28. Pandas Styling
By Bernd Klein. Last modified: 03 Feb 2025.
Introduction
Pure text is straightforward and easy to read, but it can sometimes lack emphasis and clarity, especially in complex or lengthy content. On the other hand, text with highlights, different fonts, and colors helps guide the reader’s attention, emphasize key points, and improve readability. Using visual distinctions like bold, italics, and color coding can make important information stand out, enhance comprehension, and improve user engagement. However, overuse of these elements can create clutter and reduce readability, so balance is key.
This was an example of a pure text or matter-of-fact style as we usually find it in serious textbooks. In contrast, you might encounter a more engaging style like this:
Pure text is straightforward and easy to read, but it can sometimes lack emphasis and clarity, especially in complex or lengthy content.
On the other hand, text with highlights, different fonts, and 🎨 colors helps guide the reader’s attention, emphasize key points, and improve readability.
✅ Using bold, italics, and 🟢 color coding can make important information stand out, enhance comprehension, and improve user engagement.
⚠️ However, overuse of these elements can create clutter and reduce readability, so balance is key!
Now, you might already have an idea about the focus of this chapter in our Pandas tutorial. Let's shift our attention to Python and Pandas: Just as we can enhance and structure text to improve clarity and presentation, we can apply similar principles to a DataFrame, making it more readable and visually appealing without altering its raw data.
In programming—including Python development—a fundamental best practice is to separate raw data (logic) from its presentation (styling or rendering). This principle applies across various domains, from web development to data science and backend engineering.
In Pandas, we adhere to this principle by keeping raw data independent of its presentation. This ensures that the data remains accurate, computationally usable, and easily formatted when needed, without altering its underlying structure.
Why Separate Data from Presentation in Pandas?
- Preserves Data Integrity
  The underlying numerical values remain unchanged, preventing rounding errors or data corruption. Example: Keeping a value as 1234.5678 instead of storing it as a formatted string "1,234.57 USD".
- Ensures Computation Compatibility
  Operations like summing, averaging, or statistical analysis require pure numerical data. Example: If we store "$100.50" as a string, Pandas can't calculate total sales correctly.
- Allows Flexible Presentation
  Formatting can be applied dynamically for different use cases (reports, dashboards, user-specific views). Example: Showing "50%" for display but keeping it as 0.5 for calculations.
- Enhances Maintainability
  If display requirements change (e.g., from USD to EUR), only the styling logic needs to be updated, not the actual data.
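The second point can be checked directly. The following is a minimal sketch with made-up sales figures (the Series names and values are illustrative assumptions): summing formatted strings merely joins them, whereas summing floats yields a usable total that can still be formatted for display.

```python
import pandas as pd

# Illustrative sales figures: once as formatted strings, once as floats
as_text = pd.Series(["$100.50", "$200.25"])
as_numbers = pd.Series([100.50, 200.25])

# Adding strings joins them together; no arithmetic takes place
text_total = as_text.sum()

# Adding floats gives the real total
numeric_total = as_numbers.sum()

# The currency format is applied only when the value is displayed
display_total = f"${numeric_total:,.2f}"

print(text_total)
print(numeric_total)
print(display_total)
```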
The .style property
Pandas provides a powerful .style property that allows you to format and style DataFrames in a visually appealing way, especially useful for Jupyter Notebooks and reports. The .style property in Pandas enables dynamic formatting and visualization without changing the raw data. It improves readability with number formatting, color gradients, and highlights while keeping computations intact.
Basic Formatting with .format
First, let's take a look at the format method of the Pandas .style property. It customizes the display of DataFrame values without modifying the underlying data.
The following example demonstrates how to format numerical values in a DataFrame by setting column B to display only two decimal places:
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4.1234, 5.5678, 6.91011]})
# Apply formatting
styled_df = df.style.format({'B': "{:,.2f}"}) # Format column 'B' to 2 decimal places
styled_df
 | A | B |
---|---|---|
0 | 1 | 4.12 |
1 | 2 | 5.57 |
2 | 3 | 6.91 |
Other formatting possibilities, each applied to the value 123.456 (for example, "{:.2f}".format(123.456)):
Specifier | Effect | Result |
---|---|---|
.0f | Round to 0 decimal places | 123 |
.2f | Round to 2 decimal places | 123.46 |
.2% | Multiply by 100 and show as a percentage with 2 decimals | 12345.60% |
.2e | Scientific notation (lowercase) | 1.23e+02 |
.2E | Scientific notation (uppercase) | 1.23E+02 |
g | General format (removes unnecessary decimals) | 123.456 |
G | General format (uppercase exponent if needed) | 123.456 |
Common Use Cases:
- "{:,.2f}" → Thousands separator + 2 decimals (1,234.56)
- "{:.1%}" → Percentage (12.3%)
- "{:.3e}" → Scientific notation (1.235e+02)
The data remains numerical, but is displayed in a formatted way.
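These format strings are ordinary Python format specifications, so they can be tried out without any DataFrame. A short sketch using the built-in str.format (the sample values are chosen for illustration; note that the e and E specifiers are given an explicit precision here):

```python
value = 123.456

# Apply several specifiers to the same value
formatted = {
    spec: f"{{:{spec}}}".format(value)
    for spec in [".0f", ".2f", ".2%", ".2e", ".2E", "g", "G"]
}
for spec, text in formatted.items():
    print(f"{spec:>4} -> {text}")

# Common use cases, each with a value that suits the format
thousands = "{:,.2f}".format(1234.5678)   # thousands separator + 2 decimals
percent = "{:.1%}".format(0.123)          # fraction shown as a percentage
scientific = "{:.3e}".format(123.456)     # scientific notation, 3 decimals
print(thousands, percent, scientific)
```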
Before exploring another example of format, let's first examine the wrong approach to understand why mixing logic and rendering is a bad practice. By seeing this mistake in action, you'll better appreciate the importance of keeping data processing and presentation separate:
df = pd.DataFrame({'Price': ['$1,234.57', '$9,876.43'], 'Discount': ['15%', '25%']})
df
 | Price | Discount |
---|---|---|
0 | $1,234.57 | 15% |
1 | $9,876.43 | 25% |
This is bad because:
- The values are stored as strings instead of numeric types.
- You cannot perform meaningful calculations such as summing or averaging, and sorting orders the values as text rather than by amount.
- Formatting (currency symbols, percentage signs) is mixed with data, violating the separation of data and presentation principle.
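If data already arrives in this form, the numbers have to be recovered before any calculation is possible. One possible way to do this, sketched under the assumption that every price uses "$" and "," and every discount ends in "%" (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Price': ['$1,234.57', '$9,876.43'], 'Discount': ['15%', '25%']})

# Strip the currency symbol and thousands separators, then convert to float
price = df['Price'].str.replace(r'[$,]', '', regex=True).astype(float)

# Strip the percent sign and rescale to a fraction
discount = df['Discount'].str.rstrip('%').astype(float) / 100

clean = pd.DataFrame({'Price': price, 'Discount': discount})
print(clean.dtypes)
print(clean['Price'].sum())
```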
Now, the correct way with style.format:
df = pd.DataFrame({'Price': [1234.57, 9876.43], 'Discount': [0.15, 0.25]})
# Apply formatting only for display (keeping numbers intact)
df.style.format({'Price': '${:,.2f}', 'Discount': '{:.0%}'})
 | Price | Discount |
---|---|---|
0 | $1,234.57 | 15% |
1 | $9,876.43 | 25% |
This is a lot better:
- The Price column remains a float (e.g., 1234.57), but displays as "$1,234.57".
- The Discount column remains a float (e.g., 0.15), but displays as "15%".
Highlighting Maximum Values
The following example demonstrates how to highlight the maximum value in each column using highlight_max(), making key values stand out visually.
import pandas as pd
df = pd.DataFrame({
    'First Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
    'Weight (kg)': [68, 85, 74, 90, 62],
    'Height (cm)': [165, 180, 175, 185, 160],
    'IQ': [120, 135, 110, 145, 125]
})
# Apply styling to highlight the maximum value in each column
styled_df = df.style.highlight_max(axis=0, color='lightgreen')
styled_df
 | First Name | Weight (kg) | Height (cm) | IQ |
---|---|---|---|---|
0 | Alice | 68 | 165 | 120 |
1 | Bob | 85 | 180 | 135 |
2 | Charlie | 74 | 175 | 110 |
3 | David | 90 | 185 | 145 |
4 | Emma | 62 | 160 | 125 |
In the previous example, we highlighted the maximum value within each column. However, highlighting the maximum values in each row wouldn’t be meaningful due to the inhomogeneous nature of the data — comparing a name, weight, height, and IQ within a single row isn't logical.
To demonstrate a case where row-wise highlighting is useful, let's explore a more suitable example. The axis parameter determines whether the maximum is highlighted within each row (axis=1) or within each column (axis=0).
import pandas as pd
df = pd.DataFrame({
    'Monday': [8.5, 7.2, 9.0, 6.8, 8.3],
    'Tuesday': [7.4, 8.1, 6.5, 9.2, 7.0],
    'Wednesday': [8.0, 9.3, 7.7, 8.1, 6.9],
    'Thursday': [6.2, 8.4, 9.1, 7.3, 8.5],
    'Friday': [7.3, 6.8, 8.2, 7.6, 9.1]
}, index=['Alice', 'Bob', 'Charlie', 'David', 'Emma'])
# Apply styling to highlight the maximum value in each row
# (each person's longest workday)
styled_df = df.style.highlight_max(axis=1, color='orange')
styled_df
 | Monday | Tuesday | Wednesday | Thursday | Friday |
---|---|---|---|---|---|
Alice | 8.500000 | 7.400000 | 8.000000 | 6.200000 | 7.300000 |
Bob | 7.200000 | 8.100000 | 9.300000 | 8.400000 | 6.800000 |
Charlie | 9.000000 | 6.500000 | 7.700000 | 9.100000 | 8.200000 |
David | 6.800000 | 9.200000 | 8.100000 | 7.300000 | 7.600000 |
Emma | 8.300000 | 7.000000 | 6.900000 | 8.500000 | 9.100000 |
Now, with axis=0, we highlight the longest working time on each weekday, i.e. the maximum of each column:
import pandas as pd
df = pd.DataFrame({
    'Monday': [8.5, 7.2, 9.0, 6.8, 8.3],
    'Tuesday': [7.4, 8.1, 6.5, 9.2, 7.0],
    'Wednesday': [8.0, 9.3, 7.7, 8.1, 6.9],
    'Thursday': [6.2, 8.4, 9.1, 7.3, 8.5],
    'Friday': [7.3, 6.8, 8.2, 7.6, 9.1]
}, index=['Alice', 'Bob', 'Charlie', 'David', 'Emma'])
styled_df = df.style.highlight_max(axis=0, color='orange')
styled_df
 | Monday | Tuesday | Wednesday | Thursday | Friday |
---|---|---|---|---|---|
Alice | 8.500000 | 7.400000 | 8.000000 | 6.200000 | 7.300000 |
Bob | 7.200000 | 8.100000 | 9.300000 | 8.400000 | 6.800000 |
Charlie | 9.000000 | 6.500000 | 7.700000 | 9.100000 | 8.200000 |
David | 6.800000 | 9.200000 | 8.100000 | 7.300000 | 7.600000 |
Emma | 8.300000 | 7.000000 | 6.900000 | 8.500000 | 9.100000 |
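Which cells are highlighted for each axis setting can be reproduced without any styling. A small sketch using the DataFrame method idxmax() on the same working-hours data (idxmax() is not part of the Styler and reports only the first maximum in case of ties, whereas highlight_max() marks every tied cell; the result names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Monday': [8.5, 7.2, 9.0, 6.8, 8.3],
    'Tuesday': [7.4, 8.1, 6.5, 9.2, 7.0],
    'Wednesday': [8.0, 9.3, 7.7, 8.1, 6.9],
    'Thursday': [6.2, 8.4, 9.1, 7.3, 8.5],
    'Friday': [7.3, 6.8, 8.2, 7.6, 9.1]
}, index=['Alice', 'Bob', 'Charlie', 'David', 'Emma'])

# axis=1: one maximum per row, i.e. each person's longest day
longest_day_per_person = df.idxmax(axis=1)
print(longest_day_per_person)

# axis=0: one maximum per column, i.e. who worked longest on each day
longest_person_per_day = df.idxmax(axis=0)
print(longest_person_per_day)
```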
Applying a Gradient
The .background_gradient() method applies a color gradient to each cell based on its value using a specified colormap (cmap). The logic behind it involves normalizing the data and then mapping it to a color scale.
df.style.background_gradient(cmap='Blues')
 | Monday | Tuesday | Wednesday | Thursday | Friday |
---|---|---|---|---|---|
Alice | 8.500000 | 7.400000 | 8.000000 | 6.200000 | 7.300000 |
Bob | 7.200000 | 8.100000 | 9.300000 | 8.400000 | 6.800000 |
Charlie | 9.000000 | 6.500000 | 7.700000 | 9.100000 | 8.200000 |
David | 6.800000 | 9.200000 | 8.100000 | 7.300000 | 7.600000 |
Emma | 8.300000 | 7.000000 | 6.900000 | 8.500000 | 9.100000 |
Step-by-Step Logic Behind the Gradient Coloring:
- Normalize Values
  - Pandas scales the values in each column between 0 and 1.
  - The minimum value becomes 0 (lightest color) and the maximum value becomes 1 (darkest color).
- Apply the Colormap (cmap='Blues')
  - A matplotlib colormap (like "Blues") is used to assign colors based on the normalized values.
  - Smaller values get lighter shades of blue.
  - Larger values get darker shades of blue.
- Render the Colors
  - The background color for each cell is set using the corresponding color from the colormap.
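The normalization step can be sketched as plain min-max scaling. This is a simplified illustration of the idea, not the actual Pandas implementation, and it ignores options such as vmin and vmax; the variable names are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Monday': [8.5, 7.2, 9.0, 6.8, 8.3],
    'Tuesday': [7.4, 8.1, 6.5, 9.2, 7.0],
    'Wednesday': [8.0, 9.3, 7.7, 8.1, 6.9],
    'Thursday': [6.2, 8.4, 9.1, 7.3, 8.5],
    'Friday': [7.3, 6.8, 8.2, 7.6, 9.1]
}, index=['Alice', 'Bob', 'Charlie', 'David', 'Emma'])

# Column-wise scaling (the idea behind axis=0): each column runs from 0 to 1
col_min, col_max = df.min(), df.max()
normalized_cols = (df - col_min) / (col_max - col_min)

# Row-wise scaling (the idea behind axis=1): each row runs from 0 to 1
row_min, row_max = df.min(axis=1), df.max(axis=1)
normalized_rows = df.sub(row_min, axis=0).div(row_max - row_min, axis=0)

print(normalized_cols.round(2))
print(normalized_rows.round(2))
```

A value of 0 would then be mapped to the lightest color of the colormap and a value of 1 to the darkest.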
By default, background_gradient() normalizes values column-wise (axis=0). To scale row-wise, you need to set axis to 1:
df.style.background_gradient(cmap='Blues', axis=1)
 | Monday | Tuesday | Wednesday | Thursday | Friday |
---|---|---|---|---|---|
Alice | 8.500000 | 7.400000 | 8.000000 | 6.200000 | 7.300000 |
Bob | 7.200000 | 8.100000 | 9.300000 | 8.400000 | 6.800000 |
Charlie | 9.000000 | 6.500000 | 7.700000 | 9.100000 | 8.200000 |
David | 6.800000 | 9.200000 | 8.100000 | 7.300000 | 7.600000 |
Emma | 8.300000 | 7.000000 | 6.900000 | 8.500000 | 9.100000 |
You can experiment with other cmap values:
- 'Blues' → Shades of blue (good for clarity)
- 'Greens' → Shades of green (eco-friendly, positive impact)
- 'Oranges' → Shades of orange (warm tones)
- 'Purples' → Shades of purple (stylish, artistic)
- 'Reds' → Shades of red (intensity, urgency)
- 'Greys' → Shades of grey (neutral, good for subtle visual effects)
Diverging Colormaps are great for highlighting deviations from a central value (e.g., positive vs. negative changes).
- 'coolwarm' → Blue to red gradient (good for highlighting differences)
- 'RdBu' → Red to blue (opposite extremes)
- 'PiYG' → Pink to green (aesthetic, alternative to red/blue)
- 'BrBG' → Brown to green (earthy tones, good for contrasts)
Multi-Color & Fancy Colormaps provide unique and vibrant effects.
- 'viridis' → Purple-blue-green-yellow (colorblind-friendly, widely used in data visualization)
- 'plasma' → Purple-red-yellow (high contrast, stands out)
- 'magma' → Dark purple-orange-yellow (dramatic, good for high-impact visuals)
- 'cividis' → Blue-yellow (colorblind-friendly alternative to viridis)
Here is an example using cmap='cividis':
import pandas as pd
import numpy as np
data = {
    'January': [5000, -2000, 3000, -1000, 2600],
    'February': [-1500, 4000, -2500, 500, -1100],
    'March': [3000, -1800, 6200, -700, 1000],
    'April': [-2200, 5000, -1200, 1000, 3500]
}
index = ['Nimbus Corp', 'Quantum Dynamics', 'Aurora Ventures',
         'Vertex Solutions', 'Orion Enterprises']
df = pd.DataFrame(data, index=index)
# Apply styling with 'cividis' colormap
styled_df = df.style.background_gradient(cmap='cividis')
styled_df
 | January | February | March | April |
---|---|---|---|---|
Nimbus Corp | 5000 | -1500 | 3000 | -2200 |
Quantum Dynamics | -2000 | 4000 | -1800 | 5000 |
Aurora Ventures | 3000 | -2500 | 6200 | -1200 |
Vertex Solutions | -1000 | 500 | -700 | 1000 |
Orion Enterprises | 2600 | -1100 | 1000 | 3500 |
Applying Bar Charts Inside Cells
The .style.bar() method adds bar charts within the cells to visually represent the values. It helps quickly compare work hours per person (row-wise) or per day (column-wise).
import pandas as pd
df = pd.DataFrame({
    'Monday': [8.5, 5.2, 9.8, 6.1, 7.3],
    'Tuesday': [6.4, 8.7, 7.1, 9.0, 5.6],
    'Wednesday': [7.8, 9.5, 5.9, 8.3, 6.7],
    'Thursday': [5.6, 7.2, 9.3, 6.8, 8.9],
    'Friday': [9.1, 6.5, 8.0, 7.6, 9.4]
}, index=['Alice', 'Bob', 'Charlie', 'David', 'Emma'])
# Apply bar charts column-wise (each column has its own scale)
styled_columnwise_bar = df.style.bar(color='lightblue', axis=0)
# Apply bar charts row-wise (each row has its own scale)
styled_rowwise_bar = df.style.bar(color='lightgreen',
                                  width=90,  # Max width of bars in percentage
                                  axis=1)
styled_rowwise_bar
 | Monday | Tuesday | Wednesday | Thursday | Friday |
---|---|---|---|---|---|
Alice | 8.500000 | 6.400000 | 7.800000 | 5.600000 | 9.100000 |
Bob | 5.200000 | 8.700000 | 9.500000 | 7.200000 | 6.500000 |
Charlie | 9.800000 | 7.100000 | 5.900000 | 9.300000 | 8.000000 |
David | 6.100000 | 9.000000 | 8.300000 | 6.800000 | 7.600000 |
Emma | 7.300000 | 5.600000 | 6.700000 | 8.900000 | 9.400000 |
Another example, demonstrating the use of other parameters:
data = {
    'January': [5000, -2000, 3000, -1000, 2600],
    'February': [-1500, 4000, -2500, 500, -1100],
    'March': [3000, -1800, 6200, -700, 1000],
    'April': [-2200, 5000, -1200, 1000, 3500]
}
index = ['Nimbus Corp', 'Quantum Dynamics', 'Aurora Ventures',
         'Vertex Solutions', 'Orion Enterprises']
df = pd.DataFrame(data, index=index)
styled_df = df.style.bar(
    color=('red', 'green'),  # Negative values in red, positive in green
    align='zero',            # Center bars around zero, other values: 'left', 'mid' (default)
    width=80,                # Set max width of bars to 80%
    axis=0                   # Column-wise bars
)
styled_df
 | January | February | March | April |
---|---|---|---|---|
Nimbus Corp | 5000 | -1500 | 3000 | -2200 |
Quantum Dynamics | -2000 | 4000 | -1800 | 5000 |
Aurora Ventures | 3000 | -2500 | 6200 | -1200 |
Vertex Solutions | -1000 | 500 | -700 | 1000 |
Orion Enterprises | 2600 | -1100 | 1000 | 3500 |
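Apply functions column/row-wise
The .apply() method of .style applies a styling function column-wise (axis=0), row-wise (axis=1), or to the whole table (axis=None). The function receives a Series (or the whole DataFrame) and must return CSS strings in the same shape. For element-wise styling, use .style.map(), which was called .style.applymap() before Pandas 2.1.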
Background Colours for Index, Column and Value Cells
import pandas as pd
df = pd.DataFrame({
    'Product': ['Shirt', 'Pants', 'Jacket', 'Shoes', 'Hat'],
    'Small (€)': [25.90, 40.10, 60.00, 80.00, 15.00],
    'Medium (€)': [30.50, 45.00, 65.00, 85.00, 18.00],
    'Large (€)': [35.80, 50.00, 70.00, 90.00, 20.00]
})
def highlight_values(val):
    """ Function to color value cells in light yellow """
    return 'background-color: lightyellow;'
# Apply styles to the table
styled_df = df.style.map(highlight_values).set_table_styles([
    {'selector': 'th', 'props': [('background-color', 'orange'), ('color', 'black'), ('font-weight', 'bold')]},  # Column headers
    {'selector': 'th.index', 'props': [('background-color', 'orange'), ('color', 'black'), ('font-weight', 'bold')]}  # Index (row headers)
])
# Display the styled DataFrame
styled_df
 | Product | Small (€) | Medium (€) | Large (€) |
---|---|---|---|---|
0 | Shirt | 25.900000 | 30.500000 | 35.800000 |
1 | Pants | 40.100000 | 45.000000 | 50.000000 |
2 | Jacket | 60.000000 | 65.000000 | 70.000000 |
3 | Shoes | 80.000000 | 85.000000 | 90.000000 |
4 | Hat | 15.000000 | 18.000000 | 20.000000 |
import pandas as pd
# Create a DataFrame with product prices in Euros (€) (All values as float)
df = pd.DataFrame({
    'Product': ['Shirt', 'Pants', 'Jacket', 'Shoes', 'Hat'],
    'Small (€)': [25.90, 40.10, 60.00, 80.00, 15.00],
    'Medium (€)': [30.50, 45.00, 65.00, 85.00, 18.00],
    'Large (€)': [35.80, 50.00, 70.00, 90.00, 20.00]
})
# Define colors for each column
column_colors = {
    'Small (€)': 'lightblue',
    'Medium (€)': 'lightgreen',
    'Large (€)': 'lightcoral'
}
# Function to apply background colors to each column
def highlight_columns(col):
    return [f'background-color: {column_colors.get(col.name, "white")}' for _ in col]
# Apply the color styling column-wise
styled_df = df.style.apply(highlight_columns, axis=0)
# Display the styled DataFrame
styled_df
 | Product | Small (€) | Medium (€) | Large (€) |
---|---|---|---|---|
0 | Shirt | 25.900000 | 30.500000 | 35.800000 |
1 | Pants | 40.100000 | 45.000000 | 50.000000 |
2 | Jacket | 60.000000 | 65.000000 | 70.000000 |
3 | Shoes | 80.000000 | 85.000000 | 90.000000 |
4 | Hat | 15.000000 | 18.000000 | 20.000000 |
import pandas as pd
import numpy as np
# Define months
months = ['January', 'February', 'March', 'April', 'May', 'June',
          'July', 'August', 'September', 'October', 'November', 'December']
# Define boutique locations
locations = ['Zürich', 'Frankfurt', 'Hamburg', 'Munich']
# Generate random income values (some losses included)
np.random.seed(42)  # For reproducibility
data = np.random.randint(-5000, 20000, size=(12, 4))  # Random income values
# Create DataFrame
df = pd.DataFrame(data, index=months, columns=locations)
# Function to highlight values: Green for profits, Red for losses
def highlight_values(val):
    if val < 0:
        return 'background-color: red; color: white; font-weight: bold'
    else:
        return 'background-color: lightgreen; color: black; font-weight: bold'
# Apply styles
styled_df = df.style.map(highlight_values) \
    .set_table_styles([
        {'selector': 'th',
         'props': [('background-color', 'yellow'), ('color', 'black'), ('font-weight', 'bold')]},  # Column headers
        {'selector': 'th.index',
         'props': [('background-color', 'yellow'), ('color', 'black'), ('font-weight', 'bold')]}  # Row headers
    ]) \
    .format("{:,.0f}")  # Add thousands separator for better readability
# Display the styled DataFrame
styled_df
 | Zürich | Frankfurt | Hamburg | Munich |
---|---|---|---|---|
January | 18,654 | 10,795 | -4,140 | 390 |
February | 16,575 | 6,964 | 6,284 | 17,118 |
March | 1,265 | 11,850 | -574 | 16,962 |
April | 9,423 | 6,363 | 11,023 | 3,322 |
May | -3,315 | -4,231 | 18,333 | -2,567 |
June | 311 | 51 | 1,420 | 12,568 |
July | 15,939 | 14,769 | 1,396 | 3,666 |
August | 13,942 | 19,233 | 13,431 | -2,253 |
September | -4,811 | 14,118 | -1,995 | 16,042 |
October | -3,101 | 19,118 | -3,733 | 12,912 |
November | 6,394 | -1,444 | -1,110 | 3,838 |
December | 9,502 | 16,777 | 5,627 | 3,792 |