Pandas Sort Values: Complete Guide to Sorting DataFrames in Python
Updated on
Unsorted data is hard to analyze. You scan through thousands of rows trying to find the highest sales, the most recent dates, or the lowest error rates -- and miss patterns that would be obvious if the data were ordered. Before sorting, you often need to filter rows to a relevant subset. Sorting a Python list works for simple cases, but DataFrames have multiple columns, mixed types, and missing values that lists cannot handle.
Pandas sort_values() sorts any DataFrame by one or more columns, with full control over ascending/descending order, null placement, and stability. This guide covers every sorting pattern you'll encounter in real data work.
Basic Sorting with sort_values()
Sort by Single Column
import pandas as pd
df = pd.DataFrame({
'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
'score': [85, 92, 78, 95, 88],
'age': [25, 30, 22, 28, 35]
})
# Sort by score (ascending by default)
sorted_df = df.sort_values('score')
print(sorted_df)
# name score age
# 2 Charlie 78 22
# 0 Alice 85 25
# 4 Eve 88 35
# 1 Bob 92 30
# 3 Diana 95 28Sort Descending
import pandas as pd
df = pd.DataFrame({
'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
'score': [85, 92, 78, 95, 88],
'age': [25, 30, 22, 28, 35]
})
# Highest scores first
sorted_df = df.sort_values('score', ascending=False)
print(sorted_df)
# name score age
# 3 Diana 95 28
# 1 Bob 92 30
# 4 Eve 88 35
# 0 Alice 85 25
# 2 Charlie 78 22Sort In Place
import pandas as pd
df = pd.DataFrame({
'name': ['Alice', 'Bob', 'Charlie'],
'score': [85, 92, 78]
})
# Modify df directly (no copy created)
df.sort_values('score', inplace=True)
print(df)
# name score
# 2 Charlie 78
# 0 Alice 85
# 1 Bob 92Multi-Column Sorting
Sort by Multiple Columns
import pandas as pd
df = pd.DataFrame({
'department': ['Sales', 'Engineering', 'Sales', 'Engineering', 'Sales'],
'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
'salary': [70000, 85000, 65000, 90000, 70000]
})
# Sort by department first, then by salary within each department
sorted_df = df.sort_values(['department', 'salary'])
print(sorted_df)
# department name salary
# 1 Engineering Bob 85000
# 3 Engineering Diana 90000
# 2 Sales Charlie 65000
# 0 Sales Alice 70000
# 4 Sales Eve 70000Mixed Ascending/Descending
import pandas as pd
df = pd.DataFrame({
'department': ['Sales', 'Engineering', 'Sales', 'Engineering', 'Sales'],
'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
'salary': [70000, 85000, 65000, 90000, 70000]
})
# Department ascending, salary descending
sorted_df = df.sort_values(
['department', 'salary'],
ascending=[True, False]
)
print(sorted_df)
# department name salary
# 3 Engineering Diana 90000
# 1 Engineering Bob 85000
# 0 Sales Alice 70000
# 4 Sales Eve 70000
# 2 Sales Charlie 65000Handling Missing Values (NaN)
By default, NaN values go to the end regardless of sort order. Use na_position to control this:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'name': ['Alice', 'Bob', 'Charlie', 'Diana'],
'score': [85, np.nan, 78, np.nan]
})
# NaN at end (default)
print(df.sort_values('score'))
# name score
# 2 Charlie 78.0
# 0 Alice 85.0
# 1 Bob NaN
# 3 Diana NaN
# NaN at beginning
print(df.sort_values('score', na_position='first'))
# name score
# 1 Bob NaN
# 3 Diana NaN
# 2 Charlie 78.0
# 0 Alice 85.0Sort by Index with sort_index()
import pandas as pd
df = pd.DataFrame({
'name': ['Alice', 'Bob', 'Charlie'],
'score': [85, 92, 78]
}, index=[2, 0, 1])
# Sort by index
print(df.sort_index())
# name score
# 0 Bob 92
# 1 Charlie 78
# 2 Alice 85
# Sort by column names (axis=1)
df2 = pd.DataFrame({
'c': [1, 2], 'a': [3, 4], 'b': [5, 6]
})
print(df2.sort_index(axis=1))
# a b c
# 0 3 5 1
# 1 4 6 2Reset Index After Sorting
After sorting, the original index is preserved. To get a clean sequential index, use .reset_index():
import pandas as pd
df = pd.DataFrame({
'name': ['Alice', 'Bob', 'Charlie'],
'score': [85, 92, 78]
})
sorted_df = df.sort_values('score').reset_index(drop=True)
print(sorted_df)
# name score
# 0 Charlie 78
# 1 Alice 85
# 2 Bob 92Use drop=True to discard the old index. Without it, the old index becomes a column.
Custom Sort Order
Using Categorical Types
import pandas as pd
df = pd.DataFrame({
'priority': ['Medium', 'High', 'Low', 'High', 'Medium'],
'task': ['Task A', 'Task B', 'Task C', 'Task D', 'Task E']
})
# Define custom order
priority_order = pd.CategoricalDtype(['Low', 'Medium', 'High'], ordered=True)
df['priority'] = df['priority'].astype(priority_order)
sorted_df = df.sort_values('priority')
print(sorted_df)
# priority task
# 2 Low Task C
# 0 Medium Task A
# 4 Medium Task E
# 1 High Task B
# 3 High Task DUsing key Parameter
import pandas as pd
df = pd.DataFrame({
'name': ['alice', 'Bob', 'CHARLIE', 'diana'],
'score': [85, 92, 78, 95]
})
# Case-insensitive sort
sorted_df = df.sort_values('name', key=lambda x: x.str.lower())
print(sorted_df)
# name score
# 0 alice 85
# 1 Bob 92
# 2 CHARLIE 78
# 3 diana 95Method Comparison
| Method | Purpose | Modifies Original? | Returns |
|---|---|---|---|
sort_values(col) | Sort by column values | No (unless inplace=True) | Sorted DataFrame |
sort_values([col1, col2]) | Sort by multiple columns | No (unless inplace=True) | Sorted DataFrame |
sort_index() | Sort by row index | No (unless inplace=True) | Sorted DataFrame |
nsmallest(n, col) | Get n smallest values | No | Subset DataFrame |
nlargest(n, col) | Get n largest values | No | Subset DataFrame |
rank() | Assign ranks to values | No | Series of ranks |
Performance Tips
nlargest() and nsmallest() for Top-N
When you only need the top or bottom N rows, nlargest() and nsmallest() are faster than sorting the entire DataFrame:
import pandas as pd
import numpy as np
# Large DataFrame
df = pd.DataFrame({
'id': range(1_000_000),
'value': np.random.randn(1_000_000)
})
# Faster: only find top 10
top_10 = df.nlargest(10, 'value')
# Slower: sorts everything, then slices
top_10_slow = df.sort_values('value', ascending=False).head(10)Stable vs Unstable Sorting
import pandas as pd
df = pd.DataFrame({
'group': ['A', 'B', 'A', 'B'],
'value': [1, 1, 2, 2],
'order': [1, 2, 3, 4]
})
# Stable sort (default) preserves original order for ties
stable = df.sort_values('value', kind='mergesort') # Default
# Unstable sort (faster for large datasets)
unstable = df.sort_values('value', kind='quicksort')Practical Examples
Sort Dates
import pandas as pd
df = pd.DataFrame({
'event': ['Launch', 'Meeting', 'Deadline', 'Review'],
'date': pd.to_datetime(['2026-03-15', '2026-01-10', '2026-02-28', '2026-01-05'])
})
# Chronological order
print(df.sort_values('date'))
# event date
# 3 Review 2026-01-05
# 1 Meeting 2026-01-10
# 2 Deadline 2026-02-28
# 0 Launch 2026-03-15Sort with GroupBy
Sorting often pairs with .groupby() to find top entries within each group:
import pandas as pd
df = pd.DataFrame({
'store': ['A', 'A', 'B', 'B', 'A', 'B'],
'product': ['X', 'Y', 'X', 'Y', 'Z', 'Z'],
'revenue': [100, 250, 150, 300, 200, 175]
})
# Top product by revenue in each store
top_per_store = (df.sort_values('revenue', ascending=False)
.groupby('store')
.head(1))
print(top_per_store)
# store product revenue
# 3 B Y 300
# 1 A Y 250Exploring Sorted Data Visually
After sorting your DataFrame to find patterns, PyGWalker (opens in a new tab) lets you explore the same data through interactive drag-and-drop charts in Jupyter -- no extra code needed:
import pygwalker as pyg
walker = pyg.walk(sorted_df)FAQ
How do I sort a DataFrame by column in pandas?
Use df.sort_values('column_name') for ascending order or df.sort_values('column_name', ascending=False) for descending. For multiple columns, pass a list: df.sort_values(['col1', 'col2']).
How do I sort by multiple columns with different orders?
Pass a list of booleans to ascending: df.sort_values(['col1', 'col2'], ascending=[True, False]). This sorts col1 ascending and col2 descending.
Where do NaN values go when sorting?
By default, NaN values are placed at the end regardless of sort direction. Use na_position='first' to put them at the beginning: df.sort_values('col', na_position='first').
What is the difference between sort_values and sort_index?
sort_values() sorts by column values. sort_index() sorts by the row index (or column index with axis=1). Use sort_values for data sorting and sort_index when you need rows ordered by their index labels.
How do I sort a DataFrame in place without creating a copy?
Pass inplace=True: df.sort_values('col', inplace=True). This modifies the original DataFrame and returns None. However, modern pandas style prefers reassignment: df = df.sort_values('col').
Conclusion
sort_values() is the primary tool for sorting DataFrames in pandas. Use it with a single column name for simple sorts, a list for multi-column sorts, and the ascending parameter to control direction. Handle missing values with na_position, use nlargest()/nsmallest() for top-N queries, and CategoricalDtype for custom orderings. Always remember that sorting returns a new DataFrame by default -- use reset_index(drop=True) if you want a clean sequential index afterward. Sorting also pairs well with .drop_duplicates() when you want to keep the first or last occurrence of duplicate rows, and with .value_counts() for frequency analysis.
Related Guides
- Pandas GroupBy: Split-Apply-Combine -- group data before sorting within groups
- Reset Index After Sorting -- get a clean sequential index after sort operations
- Filter Rows by Condition -- select rows before or after sorting
- Remove Duplicate Rows -- sort then deduplicate to keep specific occurrences
- Rename Columns -- standardize column names before sorting