Pandas Pivot vs Melt: Reshape Data the Right Way

Name: Rajiv Chandra

Updated on 11/19/2025

Reshaping data is often the bottleneck before analysis or visualization. Picking the wrong method can duplicate rows, drop data, or produce unusable wide tables.

PAS framing:

Problem: Messy wide/long data slows down plotting and aggregation.
Agitate: Using pivot when you need aggregation raises errors; using melt incorrectly multiplies rows or loses identifiers.
Solution: Follow a tidy-data workflow—pivot_table when aggregation is required, pivot when keys are unique, melt to go long, and stack/unstack for multi-index control.

Quick Reshape Guide

Method	Direction	When to use	Handles duplicates?
`pivot`	long → wide	Keys are unique per index/column pair	🚫 (raises if duplicates)
`pivot_table`	long → wide	Need aggregation or duplicates exist	✅ via `aggfunc`
`melt`	wide → long	Normalize columns into rows	N/A
`stack` / `unstack`	wide ↔ long on index levels	MultiIndex reshaping	N/A

Sample Data

import pandas as pd
 
sales = pd.DataFrame({
    "region": ["NA", "NA", "EU", "EU"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "product": ["A", "A", "A", "A"],
    "revenue": [120, 140, 110, 150],
    "units": [10, 11, 9, 12],
})

Wide Output with `pivot`

wide = sales.pivot(index="region", columns="quarter", values="revenue")

Requires unique combinations of region + quarter.
Use .reset_index() to flatten after pivoting if needed.

Aggregated Wide Output with `pivot_table`

table = sales.pivot_table(
    index="region",
    columns="quarter",
    values="revenue",
    aggfunc="sum",
    margins=True,       # totals row/column
    margins_name="Total"
)

Handles duplicate keys by aggregating.
fill_value=0 can replace missing cells for clean exports.

Multiple Values and Aggregations

multi = sales.pivot_table(
    index="region",
    columns="quarter",
    values=["revenue", "units"],
    aggfunc={"revenue": "sum", "units": "mean"},
)

Tidy column names afterward:

multi.columns = [f"{metric}_{col}" for metric, col in multi.columns]
multi = multi.reset_index()

Long Format with `melt`

long = sales.melt(
    id_vars=["region", "quarter"],
    value_vars=["revenue", "units"],
    var_name="metric",
    value_name="value",
)

Ideal for plotting libraries that expect tidy data (one column per variable, one row per observation).
Use value_vars to control which columns to unpivot.

Stack and Unstack for MultiIndex

stacked = table.stack()           # columns → rows
unstacked = stacked.unstack()     # rows → columns

Great for rearranging levels without redefining keys.
Control level with stack(level="quarter") or unstack(level=-1).

Common Pitfalls

Pitfall	Fix
`ValueError: Index contains duplicate entries` with `pivot`	Switch to `pivot_table` with an aggregation.
Column order unexpected	Sort columns: `table = table.reindex(sorted(table.columns), axis=1)`.
Multi-level columns hard to read	Flatten after `pivot_table` with a list comprehension.
Missing values after reshape	Use `fill_value` or `stack(dropna=False)` to preserve blanks.

Workflow Tips

Normalize first: melt → groupby/agg → pivot_table for presentation.
For charts, keep data long; wide is best for reporting/exporting.
After pivot_table, export to Excel with formatting: table.to_excel("report.xlsx").

Related Guides

Key Takeaways

Use pivot for unique keys, pivot_table for aggregation, melt to go long.
Flatten MultiIndex columns after complex pivots to keep exports tidy.
Combine reshape steps with groupby to create analysis-ready datasets fast.