Skip to content

Pandas Sort Values: Complete Guide to Sorting DataFrames in Python

Updated on

Unsorted data is hard to analyze. You scan through thousands of rows trying to find the highest sales, the most recent dates, or the lowest error rates -- and miss patterns that would be obvious if the data were ordered. Before sorting, you often need to filter rows to a relevant subset. Sorting a Python list works for simple cases, but DataFrames have multiple columns, mixed types, and missing values that lists cannot handle.

Pandas sort_values() sorts any DataFrame by one or more columns, with full control over ascending/descending order, null placement, and stability. This guide covers every sorting pattern you'll encounter in real data work.

📚

Basic Sorting with sort_values()

Sort by Single Column

import pandas as pd
 
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'score': [85, 92, 78, 95, 88],
    'age': [25, 30, 22, 28, 35]
})
 
# Sort by score (ascending by default)
sorted_df = df.sort_values('score')
print(sorted_df)
#       name  score  age
# 2  Charlie     78   22
# 0    Alice     85   25
# 4      Eve     88   35
# 1      Bob     92   30
# 3    Diana     95   28

Sort Descending

import pandas as pd
 
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'score': [85, 92, 78, 95, 88],
    'age': [25, 30, 22, 28, 35]
})
 
# Highest scores first
sorted_df = df.sort_values('score', ascending=False)
print(sorted_df)
#       name  score  age
# 3    Diana     95   28
# 1      Bob     92   30
# 4      Eve     88   35
# 0    Alice     85   25
# 2  Charlie     78   22

Sort In Place

import pandas as pd
 
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'score': [85, 92, 78]
})
 
# Modify df directly (no copy created)
df.sort_values('score', inplace=True)
print(df)
#       name  score
# 2  Charlie     78
# 0    Alice     85
# 1      Bob     92

Multi-Column Sorting

Sort by Multiple Columns

import pandas as pd
 
df = pd.DataFrame({
    'department': ['Sales', 'Engineering', 'Sales', 'Engineering', 'Sales'],
    'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'salary': [70000, 85000, 65000, 90000, 70000]
})
 
# Sort by department first, then by salary within each department
sorted_df = df.sort_values(['department', 'salary'])
print(sorted_df)
#     department     name  salary
# 1  Engineering      Bob   85000
# 3  Engineering    Diana   90000
# 2        Sales  Charlie   65000
# 0        Sales    Alice   70000
# 4        Sales      Eve   70000

Mixed Ascending/Descending

import pandas as pd
 
df = pd.DataFrame({
    'department': ['Sales', 'Engineering', 'Sales', 'Engineering', 'Sales'],
    'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'salary': [70000, 85000, 65000, 90000, 70000]
})
 
# Department ascending, salary descending
sorted_df = df.sort_values(
    ['department', 'salary'],
    ascending=[True, False]
)
print(sorted_df)
#     department     name  salary
# 3  Engineering    Diana   90000
# 1  Engineering      Bob   85000
# 0        Sales    Alice   70000
# 4        Sales      Eve   70000
# 2        Sales  Charlie   65000

Handling Missing Values (NaN)

By default, NaN values go to the end regardless of sort order. Use na_position to control this:

import pandas as pd
import numpy as np
 
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'score': [85, np.nan, 78, np.nan]
})
 
# NaN at end (default)
print(df.sort_values('score'))
#       name  score
# 2  Charlie   78.0
# 0    Alice   85.0
# 1      Bob    NaN
# 3    Diana    NaN
 
# NaN at beginning
print(df.sort_values('score', na_position='first'))
#       name  score
# 1      Bob    NaN
# 3    Diana    NaN
# 2  Charlie   78.0
# 0    Alice   85.0

Sort by Index with sort_index()

import pandas as pd
 
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'score': [85, 92, 78]
}, index=[2, 0, 1])
 
# Sort by index
print(df.sort_index())
#      name  score
# 0     Bob     92
# 1  Charlie    78
# 2   Alice     85
 
# Sort by column names (axis=1)
df2 = pd.DataFrame({
    'c': [1, 2], 'a': [3, 4], 'b': [5, 6]
})
print(df2.sort_index(axis=1))
#    a  b  c
# 0  3  5  1
# 1  4  6  2

Reset Index After Sorting

After sorting, the original index is preserved. To get a clean sequential index, use .reset_index():

import pandas as pd
 
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'score': [85, 92, 78]
})
 
sorted_df = df.sort_values('score').reset_index(drop=True)
print(sorted_df)
#       name  score
# 0  Charlie     78
# 1    Alice     85
# 2      Bob     92

Use drop=True to discard the old index. Without it, the old index becomes a column.

Custom Sort Order

Using Categorical Types

import pandas as pd
 
df = pd.DataFrame({
    'priority': ['Medium', 'High', 'Low', 'High', 'Medium'],
    'task': ['Task A', 'Task B', 'Task C', 'Task D', 'Task E']
})
 
# Define custom order
priority_order = pd.CategoricalDtype(['Low', 'Medium', 'High'], ordered=True)
df['priority'] = df['priority'].astype(priority_order)
 
sorted_df = df.sort_values('priority')
print(sorted_df)
#   priority    task
# 2      Low  Task C
# 0   Medium  Task A
# 4   Medium  Task E
# 1     High  Task B
# 3     High  Task D

Using key Parameter

import pandas as pd
 
df = pd.DataFrame({
    'name': ['alice', 'Bob', 'CHARLIE', 'diana'],
    'score': [85, 92, 78, 95]
})
 
# Case-insensitive sort
sorted_df = df.sort_values('name', key=lambda x: x.str.lower())
print(sorted_df)
#       name  score
# 0    alice     85
# 1      Bob     92
# 2  CHARLIE     78
# 3    diana     95

Method Comparison

MethodPurposeModifies Original?Returns
sort_values(col)Sort by column valuesNo (unless inplace=True)Sorted DataFrame
sort_values([col1, col2])Sort by multiple columnsNo (unless inplace=True)Sorted DataFrame
sort_index()Sort by row indexNo (unless inplace=True)Sorted DataFrame
nsmallest(n, col)Get n smallest valuesNoSubset DataFrame
nlargest(n, col)Get n largest valuesNoSubset DataFrame
rank()Assign ranks to valuesNoSeries of ranks

Performance Tips

nlargest() and nsmallest() for Top-N

When you only need the top or bottom N rows, nlargest() and nsmallest() are faster than sorting the entire DataFrame:

import pandas as pd
import numpy as np
 
# Large DataFrame
df = pd.DataFrame({
    'id': range(1_000_000),
    'value': np.random.randn(1_000_000)
})
 
# Faster: only find top 10
top_10 = df.nlargest(10, 'value')
 
# Slower: sorts everything, then slices
top_10_slow = df.sort_values('value', ascending=False).head(10)

Stable vs Unstable Sorting

import pandas as pd
 
df = pd.DataFrame({
    'group': ['A', 'B', 'A', 'B'],
    'value': [1, 1, 2, 2],
    'order': [1, 2, 3, 4]
})
 
# Stable sort (default) preserves original order for ties
stable = df.sort_values('value', kind='mergesort')  # Default
 
# Unstable sort (faster for large datasets)
unstable = df.sort_values('value', kind='quicksort')

Practical Examples

Sort Dates

import pandas as pd
 
df = pd.DataFrame({
    'event': ['Launch', 'Meeting', 'Deadline', 'Review'],
    'date': pd.to_datetime(['2026-03-15', '2026-01-10', '2026-02-28', '2026-01-05'])
})
 
# Chronological order
print(df.sort_values('date'))
#      event       date
# 3   Review 2026-01-05
# 1  Meeting 2026-01-10
# 2 Deadline 2026-02-28
# 0   Launch 2026-03-15

Sort with GroupBy

Sorting often pairs with .groupby() to find top entries within each group:

import pandas as pd
 
df = pd.DataFrame({
    'store': ['A', 'A', 'B', 'B', 'A', 'B'],
    'product': ['X', 'Y', 'X', 'Y', 'Z', 'Z'],
    'revenue': [100, 250, 150, 300, 200, 175]
})
 
# Top product by revenue in each store
top_per_store = (df.sort_values('revenue', ascending=False)
                   .groupby('store')
                   .head(1))
print(top_per_store)
#   store product  revenue
# 3     B       Y      300
# 1     A       Y      250

Exploring Sorted Data Visually

After sorting your DataFrame to find patterns, PyGWalker (opens in a new tab) lets you explore the same data through interactive drag-and-drop charts in Jupyter -- no extra code needed:

import pygwalker as pyg
 
walker = pyg.walk(sorted_df)

FAQ

How do I sort a DataFrame by column in pandas?

Use df.sort_values('column_name') for ascending order or df.sort_values('column_name', ascending=False) for descending. For multiple columns, pass a list: df.sort_values(['col1', 'col2']).

How do I sort by multiple columns with different orders?

Pass a list of booleans to ascending: df.sort_values(['col1', 'col2'], ascending=[True, False]). This sorts col1 ascending and col2 descending.

Where do NaN values go when sorting?

By default, NaN values are placed at the end regardless of sort direction. Use na_position='first' to put them at the beginning: df.sort_values('col', na_position='first').

What is the difference between sort_values and sort_index?

sort_values() sorts by column values. sort_index() sorts by the row index (or column index with axis=1). Use sort_values for data sorting and sort_index when you need rows ordered by their index labels.

How do I sort a DataFrame in place without creating a copy?

Pass inplace=True: df.sort_values('col', inplace=True). This modifies the original DataFrame and returns None. However, modern pandas style prefers reassignment: df = df.sort_values('col').

Conclusion

sort_values() is the primary tool for sorting DataFrames in pandas. Use it with a single column name for simple sorts, a list for multi-column sorts, and the ascending parameter to control direction. Handle missing values with na_position, use nlargest()/nsmallest() for top-N queries, and CategoricalDtype for custom orderings. Always remember that sorting returns a new DataFrame by default -- use reset_index(drop=True) if you want a clean sequential index afterward. Sorting also pairs well with .drop_duplicates() when you want to keep the first or last occurrence of duplicate rows, and with .value_counts() for frequency analysis.

Related Guides

📚