Pandas Sort Values: Complete Guide to Sorting DataFrames in Python

Q: How do I sort a DataFrame by column in pandas?

Use df.sort_values('column_name') for ascending order or df.sort_values('column_name', ascending=False) for descending. For multiple columns, pass a list: df.sort_values(['col1', 'col2']).

Q: How do I sort by multiple columns with different orders?

Pass a list of booleans to ascending: df.sort_values(['col1', 'col2'], ascending=[True, False]). This sorts col1 ascending and col2 descending.

Q: Where do NaN values go when sorting?

By default NaN values are placed at the end. Use na_position='first' to put them at the beginning.

Q: What is the difference between sort_values and sort_index?

sort_values() sorts by column values. sort_index() sorts by the row index. Use sort_values for data sorting and sort_index for index ordering.

Q: How do I sort a DataFrame in place without creating a copy?

Pass inplace=True: df.sort_values('col', inplace=True). Modern pandas style prefers reassignment: df = df.sort_values('col').

Name: Soren Atelier

Updated on 2/10/2026

Unsorted data is hard to analyze. You scan through thousands of rows trying to find the highest sales, the most recent dates, or the lowest error rates -- and miss patterns that would be obvious if the data were ordered. Before sorting, you often need to filter rows to a relevant subset. Sorting a Python list works for simple cases, but DataFrames have multiple columns, mixed types, and missing values that lists cannot handle.

Pandas sort_values() sorts any DataFrame by one or more columns, with full control over ascending/descending order, null placement, and stability. This guide covers every sorting pattern you'll encounter in real data work.

Basic Sorting with sort_values()

Sort by Single Column

import pandas as pd
 
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'score': [85, 92, 78, 95, 88],
    'age': [25, 30, 22, 28, 35]
})
 
# Sort by score (ascending by default)
sorted_df = df.sort_values('score')
print(sorted_df)
#       name  score  age
# 2  Charlie     78   22
# 0    Alice     85   25
# 4      Eve     88   35
# 1      Bob     92   30
# 3    Diana     95   28

Sort Descending

import pandas as pd
 
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'score': [85, 92, 78, 95, 88],
    'age': [25, 30, 22, 28, 35]
})
 
# Highest scores first
sorted_df = df.sort_values('score', ascending=False)
print(sorted_df)
#       name  score  age
# 3    Diana     95   28
# 1      Bob     92   30
# 4      Eve     88   35
# 0    Alice     85   25
# 2  Charlie     78   22

Sort In Place

import pandas as pd
 
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'score': [85, 92, 78]
})
 
# Modify df directly (no copy created)
df.sort_values('score', inplace=True)
print(df)
#       name  score
# 2  Charlie     78
# 0    Alice     85
# 1      Bob     92

Multi-Column Sorting

Sort by Multiple Columns

import pandas as pd
 
df = pd.DataFrame({
    'department': ['Sales', 'Engineering', 'Sales', 'Engineering', 'Sales'],
    'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'salary': [70000, 85000, 65000, 90000, 70000]
})
 
# Sort by department first, then by salary within each department
sorted_df = df.sort_values(['department', 'salary'])
print(sorted_df)
#     department     name  salary
# 1  Engineering      Bob   85000
# 3  Engineering    Diana   90000
# 2        Sales  Charlie   65000
# 0        Sales    Alice   70000
# 4        Sales      Eve   70000

Mixed Ascending/Descending

import pandas as pd
 
df = pd.DataFrame({
    'department': ['Sales', 'Engineering', 'Sales', 'Engineering', 'Sales'],
    'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'salary': [70000, 85000, 65000, 90000, 70000]
})
 
# Department ascending, salary descending
sorted_df = df.sort_values(
    ['department', 'salary'],
    ascending=[True, False]
)
print(sorted_df)
#     department     name  salary
# 3  Engineering    Diana   90000
# 1  Engineering      Bob   85000
# 0        Sales    Alice   70000
# 4        Sales      Eve   70000
# 2        Sales  Charlie   65000

Handling Missing Values (NaN)

By default, NaN values go to the end regardless of sort order. Use na_position to control this:

import pandas as pd
import numpy as np
 
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'score': [85, np.nan, 78, np.nan]
})
 
# NaN at end (default)
print(df.sort_values('score'))
#       name  score
# 2  Charlie   78.0
# 0    Alice   85.0
# 1      Bob    NaN
# 3    Diana    NaN
 
# NaN at beginning
print(df.sort_values('score', na_position='first'))
#       name  score
# 1      Bob    NaN
# 3    Diana    NaN
# 2  Charlie   78.0
# 0    Alice   85.0

Sort by Index with sort_index()

import pandas as pd
 
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'score': [85, 92, 78]
}, index=[2, 0, 1])
 
# Sort by index
print(df.sort_index())
#      name  score
# 0     Bob     92
# 1  Charlie    78
# 2   Alice     85
 
# Sort by column names (axis=1)
df2 = pd.DataFrame({
    'c': [1, 2], 'a': [3, 4], 'b': [5, 6]
})
print(df2.sort_index(axis=1))
#    a  b  c
# 0  3  5  1
# 1  4  6  2

Reset Index After Sorting

After sorting, the original index is preserved. To get a clean sequential index, use .reset_index():

import pandas as pd
 
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'score': [85, 92, 78]
})
 
sorted_df = df.sort_values('score').reset_index(drop=True)
print(sorted_df)
#       name  score
# 0  Charlie     78
# 1    Alice     85
# 2      Bob     92

Use drop=True to discard the old index. Without it, the old index becomes a column.

Custom Sort Order

Using Categorical Types

import pandas as pd
 
df = pd.DataFrame({
    'priority': ['Medium', 'High', 'Low', 'High', 'Medium'],
    'task': ['Task A', 'Task B', 'Task C', 'Task D', 'Task E']
})
 
# Define custom order
priority_order = pd.CategoricalDtype(['Low', 'Medium', 'High'], ordered=True)
df['priority'] = df['priority'].astype(priority_order)
 
sorted_df = df.sort_values('priority')
print(sorted_df)
#   priority    task
# 2      Low  Task C
# 0   Medium  Task A
# 4   Medium  Task E
# 1     High  Task B
# 3     High  Task D

Using key Parameter

import pandas as pd
 
df = pd.DataFrame({
    'name': ['alice', 'Bob', 'CHARLIE', 'diana'],
    'score': [85, 92, 78, 95]
})
 
# Case-insensitive sort
sorted_df = df.sort_values('name', key=lambda x: x.str.lower())
print(sorted_df)
#       name  score
# 0    alice     85
# 1      Bob     92
# 2  CHARLIE     78
# 3    diana     95

Method Comparison

Method	Purpose	Modifies Original?	Returns
`sort_values(col)`	Sort by column values	No (unless inplace=True)	Sorted DataFrame
`sort_values([col1, col2])`	Sort by multiple columns	No (unless inplace=True)	Sorted DataFrame
`sort_index()`	Sort by row index	No (unless inplace=True)	Sorted DataFrame
`nsmallest(n, col)`	Get n smallest values	No	Subset DataFrame
`nlargest(n, col)`	Get n largest values	No	Subset DataFrame
`rank()`	Assign ranks to values	No	Series of ranks

Performance Tips

nlargest() and nsmallest() for Top-N

When you only need the top or bottom N rows, nlargest() and nsmallest() are faster than sorting the entire DataFrame:

import pandas as pd
import numpy as np
 
# Large DataFrame
df = pd.DataFrame({
    'id': range(1_000_000),
    'value': np.random.randn(1_000_000)
})
 
# Faster: only find top 10
top_10 = df.nlargest(10, 'value')
 
# Slower: sorts everything, then slices
top_10_slow = df.sort_values('value', ascending=False).head(10)

Stable vs Unstable Sorting

import pandas as pd
 
df = pd.DataFrame({
    'group': ['A', 'B', 'A', 'B'],
    'value': [1, 1, 2, 2],
    'order': [1, 2, 3, 4]
})
 
# Stable sort (default) preserves original order for ties
stable = df.sort_values('value', kind='mergesort')  # Default
 
# Unstable sort (faster for large datasets)
unstable = df.sort_values('value', kind='quicksort')

Practical Examples

Sort Dates

import pandas as pd
 
df = pd.DataFrame({
    'event': ['Launch', 'Meeting', 'Deadline', 'Review'],
    'date': pd.to_datetime(['2026-03-15', '2026-01-10', '2026-02-28', '2026-01-05'])
})
 
# Chronological order
print(df.sort_values('date'))
#      event       date
# 3   Review 2026-01-05
# 1  Meeting 2026-01-10
# 2 Deadline 2026-02-28
# 0   Launch 2026-03-15

Sort with GroupBy

Sorting often pairs with .groupby() to find top entries within each group:

import pandas as pd
 
df = pd.DataFrame({
    'store': ['A', 'A', 'B', 'B', 'A', 'B'],
    'product': ['X', 'Y', 'X', 'Y', 'Z', 'Z'],
    'revenue': [100, 250, 150, 300, 200, 175]
})
 
# Top product by revenue in each store
top_per_store = (df.sort_values('revenue', ascending=False)
                   .groupby('store')
                   .head(1))
print(top_per_store)
#   store product  revenue
# 3     B       Y      300
# 1     A       Y      250

Exploring Sorted Data Visually

After sorting your DataFrame to find patterns, PyGWalker (opens in a new tab) lets you explore the same data through interactive drag-and-drop charts in Jupyter -- no extra code needed:

import pygwalker as pyg
 
walker = pyg.walk(sorted_df)

FAQ

How do I sort a DataFrame by column in pandas?

Use df.sort_values('column_name') for ascending order or df.sort_values('column_name', ascending=False) for descending. For multiple columns, pass a list: df.sort_values(['col1', 'col2']).

How do I sort by multiple columns with different orders?

Pass a list of booleans to ascending: df.sort_values(['col1', 'col2'], ascending=[True, False]). This sorts col1 ascending and col2 descending.

Where do NaN values go when sorting?

By default, NaN values are placed at the end regardless of sort direction. Use na_position='first' to put them at the beginning: df.sort_values('col', na_position='first').

What is the difference between sort_values and sort_index?

sort_values() sorts by column values. sort_index() sorts by the row index (or column index with axis=1). Use sort_values for data sorting and sort_index when you need rows ordered by their index labels.

How do I sort a DataFrame in place without creating a copy?

Pass inplace=True: df.sort_values('col', inplace=True). This modifies the original DataFrame and returns None. However, modern pandas style prefers reassignment: df = df.sort_values('col').

Conclusion

sort_values() is the primary tool for sorting DataFrames in pandas. Use it with a single column name for simple sorts, a list for multi-column sorts, and the ascending parameter to control direction. Handle missing values with na_position, use nlargest()/nsmallest() for top-N queries, and CategoricalDtype for custom orderings. Always remember that sorting returns a new DataFrame by default -- use reset_index(drop=True) if you want a clean sequential index afterward. Sorting also pairs well with .drop_duplicates() when you want to keep the first or last occurrence of duplicate rows, and with .value_counts() for frequency analysis.

Related Guides

Pandas GroupBy: Split-Apply-Combine -- group data before sorting within groups
Reset Index After Sorting -- get a clean sequential index after sort operations
Filter Rows by Condition -- select rows before or after sorting
Remove Duplicate Rows -- sort then deduplicate to keep specific occurrences
Rename Columns -- standardize column names before sorting

📚