Seaborn Heatmap: Complete Guide to Creating Heatmaps in Python

You have a dataset with dozens of variables. You need to understand which features correlate, where patterns hide, or why your machine learning model keeps misbehaving. Staring at rows and columns of numbers tells you almost nothing. This is the exact problem a seaborn heatmap solves -- it converts a dense matrix of values into a color-coded grid that your brain can parse in seconds.

Heatmaps are one of the most widely used visualization types in data science, and Python's seaborn library makes creating them remarkably straightforward. Whether you are building a correlation matrix, analyzing a confusion matrix, or visualizing time-series patterns, sns.heatmap() gives you a publication-ready chart with just a few lines of code.

This guide covers basic syntax, customization options, advanced techniques such as clustered heatmaps, and a parameter reference table. Every code example is copy-paste ready.

Basic Seaborn Heatmap Syntax

The core function is sns.heatmap(). It accepts a 2D dataset -- typically a pandas DataFrame or a NumPy array -- and renders it as a colored grid.

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
 
# Create sample data
data = np.random.rand(5, 7)
ax = sns.heatmap(data)
plt.title("Basic Seaborn Heatmap")
plt.show()

That is the simplest possible heatmap. Each cell's color represents its numeric value, and seaborn automatically adds a color bar on the right side. But real-world usage almost always involves more configuration, which we will cover next.
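When the input is a pandas DataFrame rather than a bare array, seaborn takes the tick labels from the index and columns. A minimal sketch, assuming a small made-up table (the city and month names are illustrative only):

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: random values for 3 cities over 4 months
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((3, 4)),
    index=["Oslo", "Lima", "Cairo"],
    columns=["Jan", "Feb", "Mar", "Apr"],
)

fig, ax = plt.subplots(figsize=(6, 3))
sns.heatmap(df, ax=ax)

# Tick labels are read from the DataFrame, not set by hand
x_labels = [t.get_text() for t in ax.get_xticklabels()]
y_labels = [t.get_text() for t in ax.get_yticklabels()]
```

Call plt.show() afterwards to display the figure.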

Creating a Correlation Matrix Heatmap

The most common use case for a seaborn heatmap is visualizing a correlation matrix. This tells you how strongly each pair of variables in your dataset is related.

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
 
# Load a built-in dataset
df = sns.load_dataset("mpg").select_dtypes(include="number")
 
# Compute the correlation matrix
corr = df.corr()
 
# Plot the heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(
    corr,
    annot=True,
    fmt=".2f",
    cmap="coolwarm",
    center=0,
    square=True,
    linewidths=0.5
)
plt.title("Correlation Matrix - MPG Dataset")
plt.tight_layout()
plt.show()

Key things happening here:

  • annot=True prints the correlation coefficient inside each cell.
  • fmt=".2f" formats those numbers to two decimal places.
  • cmap="coolwarm" uses a diverging color palette where negative correlations are blue and positive correlations are red.
  • center=0 ensures that zero correlation maps to the neutral midpoint color.
  • square=True forces each cell to be a perfect square for cleaner visuals.
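sns.load_dataset() fetches its data from an online repository, so the example above needs network access the first time it runs. The same pattern can be sketched offline with synthetic data. The column names and the relationships between them below are made up for illustration and do not reproduce the real MPG dataset:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for a real dataset (all columns are hypothetical)
rng = np.random.default_rng(42)
weight = rng.normal(3000, 500, size=200)
df = pd.DataFrame({
    "weight": weight,
    "horsepower": 0.04 * weight + rng.normal(0, 10, size=200),  # rises with weight
    "mpg": 50 - 0.009 * weight + rng.normal(0, 2, size=200),    # falls with weight
})

corr = df.corr()  # Pearson correlation by default

fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm",
            center=0, vmin=-1, vmax=1, square=True, ax=ax)
ax.set_title("Correlation Matrix (synthetic data)")
```

The weight/mpg cell should come out blue (negative) and the weight/horsepower cell red (positive), which is a quick sanity check that the diverging palette is centered as intended.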

Customization Options

Color Palettes (cmap Parameter)

The cmap parameter controls the color scheme. Choosing the right palette depends on your data type.

| Palette Type | Example Names | Best For |
| --- | --- | --- |
| Sequential | "YlOrRd", "Blues", "viridis" | Data that ranges from low to high (counts, magnitudes) |
| Diverging | "coolwarm", "RdBu_r", "seismic" | Data with a meaningful center point (correlations, residuals) |
| Qualitative | "Set2", "Paired" | Categorical data (not typical for heatmaps) |
| Perceptually uniform | "viridis", "magma", "inferno" | Ensuring accessibility and accurate perception |

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
 
data = np.random.rand(6, 6)
fig, axes = plt.subplots(1, 3, figsize=(18, 5))
 
cmaps = ["viridis", "coolwarm", "YlOrRd"]
for ax, cmap in zip(axes, cmaps):
    sns.heatmap(data, cmap=cmap, ax=ax, annot=True, fmt=".2f")
    ax.set_title(f'cmap="{cmap}"')
 
plt.tight_layout()
plt.show()
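cmap also accepts a colormap object, not just a name. As a sketch, seaborn's diverging_palette() can build a custom diverging colormap. The hue values 230 and 20 here are arbitrary choices for illustration, not recommendations from this guide:

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from matplotlib.colors import Colormap

rng = np.random.default_rng(1)
data = rng.uniform(-1, 1, size=(6, 6))

# Diverging colormap between two hues; as_cmap=True returns a matplotlib colormap
custom_cmap = sns.diverging_palette(230, 20, as_cmap=True)

fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(data, cmap=custom_cmap, center=0, ax=ax)
ax.set_title("Custom diverging colormap")
```

A custom object like this is mainly useful when none of the named diverging palettes matches your house style; for most cases a named palette is simpler.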

Annotations (annot and fmt Parameters)

Annotations display the numeric value inside each cell. You can control their format:

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
 
data = np.random.randint(0, 1000, size=(4, 5))
 
plt.figure(figsize=(8, 5))
sns.heatmap(
    data,
    annot=True,
    fmt="d",            # integer format
    cmap="Blues",
    annot_kws={"size": 14, "weight": "bold"}  # customize font
)
plt.title("Heatmap with Integer Annotations")
plt.show()

Common fmt values: ".2f" for two decimals, "d" for integers, ".1%" for percentages, ".1e" for scientific notation.
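These fmt strings follow Python's format-specification mini-language, so you can preview them with the built-in format() before plotting. A small sketch (the sample values are arbitrary):

```python
# Preview heatmap annotation formats with plain Python
value = 0.1234

two_decimals = format(value, ".2f")   # fixed-point, two decimals
percentage = format(value, ".1%")     # multiplied by 100, with a % sign
scientific = format(1234.5, ".1e")    # scientific notation
integer = format(42, "d")             # valid for integers only

# "d" is not valid for floats, so cast float data first (e.g. data.astype(int))
try:
    format(42.0, "d")
    float_with_d_ok = True
except ValueError:
    float_with_d_ok = False

print(two_decimals, percentage, scientific, integer, float_with_d_ok)
# prints: 0.12 12.3% 1.2e+03 42 False
```

The last case is the usual cause of a ValueError when fmt="d" is combined with a float matrix such as a normalized confusion matrix.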

Figure Size and Aspect Ratio

Seaborn heatmaps inherit their size from the matplotlib figure. Set it before calling sns.heatmap():

plt.figure(figsize=(12, 8))  # width=12, height=8 inches
sns.heatmap(data, cmap="viridis")
plt.show()

For square cells, pass square=True to sns.heatmap(). This sets the Axes aspect ratio to "equal", so every cell is drawn as a square and the heatmap may no longer fill the whole figure.
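An alternative that does not rely on the "current figure" is to create the figure and Axes explicitly and pass the Axes through the ax parameter. A minimal sketch with random data:

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

rng = np.random.default_rng(2)
data = rng.random((4, 10))  # wide matrix: 4 rows, 10 columns

# Create the figure and Axes together, then hand the Axes to seaborn
fig, ax = plt.subplots(figsize=(12, 8))
sns.heatmap(data, cmap="viridis", square=True, ax=ax)

width, height = fig.get_size_inches()
print(width, height)  # figure size in inches
```

Because the matrix is much wider than it is tall, square=True leaves empty space above and below the grid; shrinking the figure height is usually the cleaner fix.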

Masking the Upper or Lower Triangle

Correlation matrices are symmetric. Showing both halves is redundant. Use NumPy's triu or tril to mask one half:

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
 
df = sns.load_dataset("mpg").select_dtypes(include="number")
corr = df.corr()
 
# Create a mask for the upper triangle
mask = np.triu(np.ones_like(corr, dtype=bool))
 
plt.figure(figsize=(10, 8))
sns.heatmap(
    corr,
    mask=mask,
    annot=True,
    fmt=".2f",
    cmap="coolwarm",
    center=0,
    square=True,
    linewidths=0.5
)
plt.title("Lower Triangle Correlation Heatmap")
plt.tight_layout()
plt.show()

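The mask parameter accepts a boolean array of the same shape as the data. Cells where the mask is True are hidden. Because np.triu includes the diagonal by default, the self-correlations of 1.0 are hidden as well.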
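To see exactly which cells a triangular mask hides, it helps to print one for a small matrix. The sketch below also shows the k=1 offset, which keeps the diagonal visible:

```python
import numpy as np

n = 3
# k=0 (default): upper triangle including the diagonal
mask_with_diag = np.triu(np.ones((n, n), dtype=bool))
# k=1: strictly above the diagonal, so the diagonal stays visible
mask_keep_diag = np.triu(np.ones((n, n), dtype=bool), k=1)
# Lower-triangle counterpart
mask_lower = np.tril(np.ones((n, n), dtype=bool))

print(mask_with_diag.astype(int))
# [[1 1 1]
#  [0 1 1]
#  [0 0 1]]
print(mask_keep_diag.astype(int))
# [[0 1 1]
#  [0 0 1]
#  [0 0 0]]
```

A 1 marks a hidden cell. Pass any of these arrays (built with the shape of your own matrix) as mask= to sns.heatmap().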

Heatmap Parameter Reference Table

| Parameter | Description | Default |
| --- | --- | --- |
| data | 2D dataset (DataFrame, ndarray) | Required |
| vmin / vmax | Minimum / maximum value for colormap scaling | None (inferred from data) |
| cmap | Colormap name or object | None (seaborn default) |
| center | Value at which to center the colormap | None |
| annot | Show numeric values in cells | None (no annotations) |
| fmt | Format string for annotations | ".2g" |
| annot_kws | Dict of keyword arguments for annotation text | None |
| linewidths | Width of lines separating cells | 0 |
| linecolor | Color of cell border lines | "white" |
| cbar | Show the color bar | True |
| cbar_kws | Dict of keyword arguments for the color bar | None |
| square | Force square-shaped cells | False |
| mask | Boolean array; True cells are not shown | None |
| xticklabels | Labels for x-axis ticks | "auto" |
| yticklabels | Labels for y-axis ticks | "auto" |
| ax | Matplotlib Axes object to draw on | None (current Axes) |

Advanced Examples

Clustered Heatmap with sns.clustermap

When you want to group similar rows and columns together, sns.clustermap() applies hierarchical clustering and reorders the axes automatically:

import seaborn as sns
import matplotlib.pyplot as plt
 
df = sns.load_dataset("mpg").select_dtypes(include="number").dropna()
corr = df.corr()
 
g = sns.clustermap(
    corr,
    annot=True,
    fmt=".2f",
    cmap="vlag",
    center=0,
    linewidths=0.5,
    figsize=(8, 8),
    dendrogram_ratio=0.15
)
g.ax_heatmap.set_title("Clustered Correlation Heatmap", pad=60)
plt.show()

The dendrograms on the left and top show the clustering hierarchy. Variables with similar correlation profiles are placed next to each other, making patterns much easier to spot.
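If you need the clustered order itself, for example to reuse it in another plot, the object returned by sns.clustermap() exposes it. The sketch below uses synthetic data with two obvious groups of columns (all names are hypothetical) and, like clustermap in general, needs SciPy installed:

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data: two groups of columns built from two hidden signals
rng = np.random.default_rng(3)
a = rng.normal(size=300)
b = rng.normal(size=300)
df = pd.DataFrame({
    "a1": a + rng.normal(scale=0.3, size=300),
    "b1": b + rng.normal(scale=0.3, size=300),
    "a2": a + rng.normal(scale=0.3, size=300),
    "b2": b + rng.normal(scale=0.3, size=300),
})
corr = df.corr()

g = sns.clustermap(corr, cmap="vlag", center=0, figsize=(5, 5))

# Positions of the original rows in their new, clustered order
order = g.dendrogram_row.reordered_ind
clustered_labels = [corr.index[i] for i in order]
print(clustered_labels)
```

The columns were deliberately interleaved (a1, b1, a2, b2), so seeing a1 next to a2 and b1 next to b2 in the printed order confirms that the reordering comes from the clustering rather than from the input order.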

Custom Color Ranges (vmin, vmax)

By default, seaborn scales colors to the min and max of your data. You can override this to compare multiple heatmaps on the same scale or to highlight a specific range:

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
 
np.random.seed(42)
data = np.random.uniform(-1, 1, size=(8, 8))
 
plt.figure(figsize=(8, 6))
sns.heatmap(
    data,
    vmin=-1,
    vmax=1,
    center=0,
    cmap="RdBu_r",
    annot=True,
    fmt=".2f"
)
plt.title("Heatmap with Fixed Color Range (-1 to 1)")
plt.show()

Setting vmin=-1 and vmax=1 is particularly useful when plotting correlation matrices or normalized data where the theoretical range is known.
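The "same scale across multiple heatmaps" case can be sketched as follows. The two matrices and the 0 to 100 range are made up for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

rng = np.random.default_rng(4)
week1 = rng.uniform(0, 40, size=(5, 5))    # low values
week2 = rng.uniform(30, 100, size=(5, 5))  # higher values

fig, axes = plt.subplots(1, 2, figsize=(11, 4))
for ax, data, title in zip(axes, [week1, week2], ["Week 1", "Week 2"]):
    # Same vmin/vmax on both panels, so equal colors mean equal values
    sns.heatmap(data, vmin=0, vmax=100, cmap="YlOrRd", ax=ax)
    ax.set_title(title)

# Each heatmap is drawn as a QuadMesh; its color limits confirm the shared scale
limits = [tuple(ax.collections[0].get_clim()) for ax in axes]
print(limits)
```

Without the fixed range, each panel would stretch its own minimum and maximum across the full palette, and the two weeks would look deceptively similar.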

Confusion Matrix Heatmap

Another practical application is visualizing a confusion matrix from a classification model:

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
 
# Train a quick model
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
clf = RandomForestClassifier(random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
 
# Build and plot the confusion matrix
cm = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(7, 5))
sns.heatmap(
    cm,
    annot=True,
    fmt="d",
    cmap="Blues",
    xticklabels=iris.target_names,
    yticklabels=iris.target_names
)
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix - Iris Classification")
plt.tight_layout()
plt.show()

The diagonal shows correct predictions. Off-diagonal cells reveal where the model confuses one class for another.
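When classes have different sizes, raw counts can be hard to compare, and a row-normalized matrix is often easier to read. A minimal sketch, using hypothetical counts rather than the output of the model above:

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Hypothetical counts: rows are actual classes, columns are predicted classes
cm = np.array([
    [50,  2,  0],
    [ 3, 40,  7],
    [ 0,  5, 18],
])
labels = ["setosa", "versicolor", "virginica"]

# Divide each row by its total so every row sums to 1
cm_normalized = cm / cm.sum(axis=1, keepdims=True)

fig, ax = plt.subplots(figsize=(6, 4.5))
sns.heatmap(cm_normalized, annot=True, fmt=".1%", cmap="Blues",
            vmin=0, vmax=1, xticklabels=labels, yticklabels=labels, ax=ax)
ax.set_xlabel("Predicted")
ax.set_ylabel("Actual")
```

Each diagonal cell now reads as the share of that class predicted correctly. Note that fmt changes from "d" to ".1%" because the normalized values are floats.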

Time-Series Heatmap

Heatmaps also work well for spotting patterns across time dimensions. Here is an example showing activity by day-of-week and hour:

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
 
np.random.seed(0)
hours = list(range(24))
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
data = pd.DataFrame(
    np.random.poisson(lam=20, size=(7, 24)),
    index=days,
    columns=hours
)
 
plt.figure(figsize=(14, 5))
sns.heatmap(data, cmap="YlOrRd", linewidths=0.3, annot=False)
plt.xlabel("Hour of Day")
plt.ylabel("Day of Week")
plt.title("Activity Heatmap by Day and Hour")
plt.tight_layout()
plt.show()
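Real activity data usually arrives as a long log with one row per timestamp, not as a ready-made day-by-hour grid. One way to get from one to the other is a pandas pivot table. The event log below is synthetic, and its column names are assumptions for the sake of the example:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical event log: one row per hour for two weeks, starting on a Monday
rng = np.random.default_rng(5)
timestamps = pd.Timestamp("2024-01-01") + pd.to_timedelta(np.arange(24 * 14), unit="h")
log = pd.DataFrame({
    "timestamp": timestamps,
    "events": rng.poisson(lam=20, size=len(timestamps)),
})

# Derive the two grid dimensions from the timestamp
log["day"] = log["timestamp"].dt.day_name().str[:3]
log["hour"] = log["timestamp"].dt.hour

# Long to wide: average events for each (day, hour) cell
day_order = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
grid = log.pivot_table(index="day", columns="hour", values="events", aggfunc="mean")
grid = grid.reindex(day_order)  # pivot_table sorts the index alphabetically

fig, ax = plt.subplots(figsize=(14, 5))
sns.heatmap(grid, cmap="YlOrRd", linewidths=0.3, ax=ax)
ax.set_xlabel("Hour of Day")
ax.set_ylabel("Day of Week")
```

The reindex step matters: without it the rows appear as Fri, Mon, Sat, and so on, which hides any weekday-versus-weekend pattern.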

Seaborn Heatmap vs Matplotlib imshow

You can create heatmap-like visualizations with matplotlib.pyplot.imshow() as well. Here is how the two compare:

| Feature | sns.heatmap() | plt.imshow() |
| --- | --- | --- |
| Built-in color bar | Yes, automatic | Manual (plt.colorbar()) |
| Cell annotations | annot=True | Manual text placement |
| Accepts DataFrames | Yes, with automatic labels | No, requires arrays |
| Tick label handling | Automatic from DataFrame index/columns | Manual setup |
| Masking support | Built-in mask parameter | Manual with np.ma |
| Clustering | Via sns.clustermap() | Not built-in |
| Cell spacing | linewidths parameter | Not directly supported |
| Learning curve | Lower for common use cases | Lower-level, more manual |
| Customization ceiling | High (inherits matplotlib) | Very high (full control) |

Bottom line: Use sns.heatmap() when you want a clean, well-labeled heatmap with minimal code. Fall back to imshow() when you need pixel-level control or are working with image data rather than tabular data.
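To make the "manual" entries in the comparison concrete, here is a rough imshow() sketch of an annotated, labeled heatmap. The data and labels are placeholders:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(6)
data = rng.random((3, 4))
row_labels = ["r0", "r1", "r2"]        # hypothetical labels
col_labels = ["c0", "c1", "c2", "c3"]

fig, ax = plt.subplots(figsize=(6, 4))
im = ax.imshow(data, cmap="viridis")
fig.colorbar(im, ax=ax)                # color bar: added by hand

# Tick labels: set by hand
ax.set_xticks(np.arange(len(col_labels)))
ax.set_xticklabels(col_labels)
ax.set_yticks(np.arange(len(row_labels)))
ax.set_yticklabels(row_labels)

# Annotations: one ax.text call per cell (x = column, y = row)
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        ax.text(j, i, f"{data[i, j]:.2f}", ha="center", va="center", color="white")
```

With seaborn, the equivalent is a single sns.heatmap() call with annot=True and fmt=".2f" on a labeled DataFrame.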

Interactive Alternative: PyGWalker

Static heatmaps are powerful for reports and papers, but during exploratory data analysis you often want to interact with your data -- filter, pivot, drill down, and switch between chart types without rewriting code.

PyGWalker (Python binding of Graphic Walker) turns any pandas DataFrame into a Tableau-like interactive UI directly inside Jupyter Notebook. You can drag and drop fields to build heatmaps, scatter plots, bar charts, and more without writing visualization code at all.

pip install pygwalker

import pandas as pd
import pygwalker as pyg
 
df = pd.read_csv("your_data.csv")
walker = pyg.walk(df)

Once the interactive interface launches, you can:

  • Drag a categorical variable to rows, another to columns, and a measure to color to create a heatmap.
  • Switch to other chart types (bar, line, scatter) instantly.
  • Filter and aggregate without writing additional code.

This is especially useful when you are still exploring which variables to include in your final seaborn heatmap. Use PyGWalker for the exploration phase, then lock in your final static visualization with sns.heatmap() for sharing.

Frequently Asked Questions

How do I change the size of a seaborn heatmap?

Set the figure size before calling sns.heatmap() using plt.figure(figsize=(width, height)). For example, plt.figure(figsize=(12, 8)) creates a 12-by-8 inch figure. You can also pass an ax parameter if you are working with subplots.

How do I annotate a seaborn heatmap with values?

Pass annot=True to sns.heatmap(). Control the number format with the fmt parameter (e.g., fmt=".2f" for two decimal places). Customize font properties using annot_kws, for example: annot_kws={"size": 12, "weight": "bold"}.

What is the difference between sns.heatmap and sns.clustermap?

sns.heatmap() displays data in the original row and column order. sns.clustermap() applies hierarchical clustering to reorder rows and columns so that similar values are grouped together, and adds dendrograms to show the clustering structure.

How do I mask half of a correlation heatmap?

Use NumPy to create a boolean mask. For the upper triangle: mask = np.triu(np.ones_like(corr, dtype=bool)). Then pass mask=mask to sns.heatmap(). For the lower triangle, use np.tril() instead.

Can I save a seaborn heatmap as an image file?

Yes. After creating the heatmap, call plt.savefig("heatmap.png", dpi=300, bbox_inches="tight") before plt.show(). Seaborn heatmaps support all matplotlib output formats including PNG, SVG, PDF, and EPS.

Conclusion

The seaborn heatmap is one of the most versatile tools in a data scientist's visualization toolkit. From correlation analysis to confusion matrices to time-series pattern detection, sns.heatmap() handles all of these with clean syntax and publication-quality output.

Start with the basics -- pass your data and pick a colormap. Then layer on annotations, masking, custom ranges, and clustering as your analysis demands. For the exploration phase before you lock in a final visualization, tools like PyGWalker can speed up your workflow with interactive drag-and-drop charting.

The code examples in this guide are all copy-paste ready. Pick the one closest to your use case, swap in your data, and you will have a clear, informative heatmap in under a minute.
