Python Sort: Complete Guide to sorted(), list.sort(), and Custom Sorting
Updated on
Sorting data is one of the most fundamental operations in programming, yet choosing between Python's sorted() function and list.sort() method often confuses developers. Whether you're ranking student scores, organizing product prices, or processing file listings, understanding Python's sorting mechanisms saves time and prevents bugs. This guide explores every aspect of Python sorting, from basic number lists to complex custom comparisons, with practical examples you can implement immediately.
sorted() vs list.sort(): The Critical Difference
Python provides two primary sorting mechanisms that appear similar but behave fundamentally differently. The sorted() function creates and returns a new sorted list while leaving the original unchanged. The list.sort() method modifies the list in place and returns None.
# sorted() returns a new list
numbers = [3, 1, 4, 1, 5]
sorted_nums = sorted(numbers)
print(sorted_nums) # [1, 1, 3, 4, 5]
print(numbers) # [3, 1, 4, 1, 5] - original unchanged
# list.sort() modifies in place
numbers = [3, 1, 4, 1, 5]
result = numbers.sort()
print(numbers) # [1, 1, 3, 4, 5] - original modified
print(result) # NoneThis distinction matters for memory efficiency and data preservation. Use sorted() when you need to keep the original list or when sorting immutable sequences like tuples. Use list.sort() when working with large lists where creating a copy would consume excessive memory.
# sorted() works with any iterable
tuple_data = (42, 17, 93, 8)
sorted_list = sorted(tuple_data) # Returns [8, 17, 42, 93]
# list.sort() only works on lists
# tuple_data.sort() # AttributeError: 'tuple' object has no attribute 'sort'Basic Sorting: Numbers, Strings, and Mixed Types
Python's default sorting behavior uses natural ordering. Numbers sort numerically, strings sort alphabetically, and mixed-type lists require explicit handling.
# Numeric sorting
integers = [42, -7, 0, 93, 15]
print(sorted(integers)) # [-7, 0, 15, 42, 93]
floats = [3.14, 2.71, 1.41, 9.81]
print(sorted(floats)) # [1.41, 2.71, 3.14, 9.81]
# String sorting (lexicographic order)
words = ['zebra', 'apple', 'mango', 'banana']
print(sorted(words)) # ['apple', 'banana', 'mango', 'zebra']
# Mixed types cause errors in Python 3
mixed = [3, 'apple', 1.5]
# sorted(mixed) # TypeError: '<' not supported between instancesFor numeric strings that should sort numerically rather than alphabetically, convert them during sorting:
# String numbers sort alphabetically
string_nums = ['10', '2', '100', '21']
print(sorted(string_nums)) # ['10', '100', '2', '21']
# Convert to int for numeric sorting
print(sorted(string_nums, key=int)) # ['2', '10', '21', '100']Reverse Sorting with reverse=True
Both sorted() and list.sort() accept a reverse parameter that inverts the sort order. This parameter defaults to False for ascending order.
scores = [88, 92, 75, 95, 82]
# Descending order using reverse=True
print(sorted(scores, reverse=True)) # [95, 92, 88, 82, 75]
# In-place descending sort
scores.sort(reverse=True)
print(scores) # [95, 92, 88, 82, 75]
# Reverse alphabetical order
names = ['Alice', 'David', 'Bob', 'Charlie']
print(sorted(names, reverse=True)) # ['David', 'Charlie', 'Bob', 'Alice']The key Parameter: Custom Sorting Logic
The key parameter accepts a function that transforms each element before comparison. This enables sorting by specific attributes, computed values, or complex criteria without modifying the original data.
# Sort by string length
words = ['python', 'is', 'awesome', 'for', 'data']
print(sorted(words, key=len)) # ['is', 'for', 'data', 'python', 'awesome']
# Sort by absolute value
numbers = [-5, -2, 3, -8, 1]
print(sorted(numbers, key=abs)) # [1, -2, 3, -5, -8]
# Sort case-insensitive
names = ['alice', 'Bob', 'CHARLIE', 'david']
print(sorted(names, key=str.lower)) # ['alice', 'Bob', 'CHARLIE', 'david']Lambda Functions for Inline Sorting
Lambda functions provide concise syntax for simple key functions. They're ideal for single-use transformations where defining a separate function would be verbose.
# Sort tuples by second element
pairs = [(1, 'b'), (3, 'a'), (2, 'd'), (4, 'c')]
print(sorted(pairs, key=lambda x: x[1]))
# [(3, 'a'), (1, 'b'), (4, 'c'), (2, 'd')]
# Sort strings by last character
words = ['python', 'java', 'ruby', 'go']
print(sorted(words, key=lambda s: s[-1]))
# ['java', 'go', 'ruby', 'python']
# Sort by distance from target value
target = 50
values = [45, 62, 38, 55, 48]
print(sorted(values, key=lambda x: abs(x - target)))
# [48, 45, 55, 38, 62]Sorting by Multiple Criteria
When sorting requires multiple priorities, return a tuple from the key function. Python compares tuples element by element, providing natural multi-level sorting.
# Sort students by grade (descending), then name (ascending)
students = [
('Alice', 85),
('Bob', 92),
('Charlie', 85),
('David', 92)
]
sorted_students = sorted(students, key=lambda x: (-x[1], x[0]))
print(sorted_students)
# [('Bob', 92), ('David', 92), ('Alice', 85), ('Charlie', 85)]
# Sort dates by year (descending), then month (ascending)
dates = ['2024-03', '2023-12', '2024-01', '2023-08']
print(sorted(dates, key=lambda d: (-int(d[:4]), int(d[5:]))))
# ['2024-01', '2024-03', '2023-08', '2023-12']operator.itemgetter and operator.attrgetter
The operator module provides optimized alternatives to lambda functions for common sorting patterns. These functions execute faster than lambdas and improve code readability.
from operator import itemgetter, attrgetter
# Sort by specific dictionary keys
data = [
{'name': 'Alice', 'age': 30, 'score': 85},
{'name': 'Bob', 'age': 25, 'score': 92},
{'name': 'Charlie', 'age': 30, 'score': 78}
]
# Sort by age, then score
sorted_data = sorted(data, key=itemgetter('age', 'score'))
print([d['name'] for d in sorted_data]) # ['Bob', 'Charlie', 'Alice']
# Sort by tuple elements
records = [(1, 'b', 300), (2, 'a', 100), (1, 'a', 200)]
print(sorted(records, key=itemgetter(0, 1)))
# [(1, 'a', 200), (1, 'b', 300), (2, 'a', 100)]For object attributes, use attrgetter:
from operator import attrgetter
class Employee:
def __init__(self, name, salary, department):
self.name = name
self.salary = salary
self.department = department
def __repr__(self):
return f'{self.name}({self.department}, ${self.salary})'
employees = [
Employee('Alice', 75000, 'Engineering'),
Employee('Bob', 65000, 'Marketing'),
Employee('Charlie', 75000, 'Engineering')
]
# Sort by department, then salary (descending)
sorted_emps = sorted(employees, key=attrgetter('department'))
print(sorted_emps)Sorting Dictionaries
Dictionaries themselves are unordered (though Python 3.7+ preserves insertion order), but you can sort their keys, values, or items.
# Sort dictionary by keys
data = {'zebra': 3, 'apple': 1, 'mango': 2}
sorted_keys = sorted(data.keys())
print(sorted_keys) # ['apple', 'mango', 'zebra']
# Create new dict with sorted keys
sorted_dict = {k: data[k] for k in sorted_keys}
print(sorted_dict) # {'apple': 1, 'mango': 2, 'zebra': 3}
# Sort by values
sorted_by_value = sorted(data.items(), key=lambda x: x[1])
print(sorted_by_value) # [('apple', 1), ('mango', 2), ('zebra', 3)]
# Convert back to dictionary
sorted_dict = dict(sorted_by_value)For complex value types:
# Sort dictionary by nested values
products = {
'laptop': {'price': 999, 'stock': 5},
'phone': {'price': 599, 'stock': 12},
'tablet': {'price': 399, 'stock': 8}
}
# Sort by price
by_price = sorted(products.items(), key=lambda x: x[1]['price'])
print([(name, info['price']) for name, info in by_price])
# [('tablet', 399), ('phone', 599), ('laptop', 999)]
# Sort by stock (descending)
by_stock = sorted(products.items(), key=lambda x: -x[1]['stock'])
print([(name, info['stock']) for name, info in by_stock])
# [('phone', 12), ('tablet', 8), ('laptop', 5)]Sorting Lists of Tuples
Tuples sort naturally by comparing elements left to right. Customize this behavior with key functions.
# Default tuple sorting (compares element by element)
tuples = [(1, 'z'), (2, 'a'), (1, 'a'), (2, 'z')]
print(sorted(tuples))
# [(1, 'a'), (1, 'z'), (2, 'a'), (2, 'z')]
# Sort by second element only
print(sorted(tuples, key=lambda x: x[1]))
# [(2, 'a'), (1, 'a'), (1, 'z'), (2, 'z')]
# Sort by sum of numeric elements
coord_distances = [(1, 2), (3, 1), (2, 2), (1, 3)]
print(sorted(coord_distances, key=sum))
# [(1, 2), (2, 2), (1, 3), (3, 1)]Sorting Lists of Dictionaries
Sorting dictionaries within lists requires specifying which key to use for comparison.
people = [
{'name': 'Alice', 'age': 30, 'city': 'New York'},
{'name': 'Bob', 'age': 25, 'city': 'San Francisco'},
{'name': 'Charlie', 'age': 30, 'city': 'Austin'}
]
# Sort by age
by_age = sorted(people, key=lambda x: x['age'])
print([p['name'] for p in by_age]) # ['Bob', 'Alice', 'Charlie']
# Sort by age (descending), then name (ascending)
by_age_name = sorted(people, key=lambda x: (-x['age'], x['name']))
print([f"{p['name']} ({p['age']})" for p in by_age_name])
# ['Alice (30)', 'Charlie (30)', 'Bob (25)']
# Using itemgetter (more efficient)
from operator import itemgetter
by_city = sorted(people, key=itemgetter('city'))
print([p['city'] for p in by_city])
# ['Austin', 'New York', 'San Francisco']Sorting Lists of Objects
Custom classes require key functions that extract comparable attributes.
class Product:
def __init__(self, name, price, rating):
self.name = name
self.price = price
self.rating = rating
def __repr__(self):
return f'{self.name}(${self.price}, {self.rating}★)'
products = [
Product('Laptop', 999, 4.5),
Product('Phone', 599, 4.8),
Product('Tablet', 399, 4.2),
Product('Monitor', 299, 4.7)
]
# Sort by price
by_price = sorted(products, key=lambda p: p.price)
print(by_price)
# Sort by rating (descending)
by_rating = sorted(products, key=lambda p: -p.rating)
print(by_rating)
# Sort by value score (rating / price * 100)
by_value = sorted(products, key=lambda p: (p.rating / p.price) * 100, reverse=True)
print(by_value)Alternatively, implement __lt__ (less than) method for default sorting behavior:
class Student:
def __init__(self, name, gpa):
self.name = name
self.gpa = gpa
def __lt__(self, other):
return self.gpa > other.gpa # Higher GPA comes first
def __repr__(self):
return f'{self.name}({self.gpa})'
students = [Student('Alice', 3.8), Student('Bob', 3.9), Student('Charlie', 3.7)]
print(sorted(students)) # [Bob(3.9), Alice(3.8), Charlie(3.7)]Case-Insensitive String Sorting
String comparisons are case-sensitive by default, causing uppercase letters to sort before lowercase. Use str.lower() or str.casefold() for case-insensitive sorting.
# Case-sensitive sorting (default)
words = ['apple', 'Banana', 'cherry', 'Date']
print(sorted(words)) # ['Banana', 'Date', 'apple', 'cherry']
# Case-insensitive sorting
print(sorted(words, key=str.lower)) # ['apple', 'Banana', 'cherry', 'Date']
# Using casefold for Unicode handling
international = ['café', 'CAFÉ', 'Apple', 'apple']
print(sorted(international, key=str.casefold))Stable Sort Guarantee and Timsort Algorithm
Python uses the Timsort algorithm, which guarantees stable sorting. Stability means elements with equal keys retain their original relative order.
# Stable sort preserves original order for equal elements
data = [
('Alice', 85),
('Bob', 92),
('Charlie', 85), # Same score as Alice
('David', 92) # Same score as Bob
]
# Sort by score only - original order preserved for ties
by_score = sorted(data, key=lambda x: x[1])
print(by_score)
# [('Alice', 85), ('Charlie', 85), ('Bob', 92), ('David', 92)]
# Alice appears before Charlie (both 85) because she came firstThis stability enables multi-pass sorting:
# Multi-pass sorting using stability
records = [
{'name': 'Alice', 'dept': 'HR', 'salary': 60000},
{'name': 'Bob', 'dept': 'IT', 'salary': 75000},
{'name': 'Charlie', 'dept': 'HR', 'salary': 65000},
{'name': 'David', 'dept': 'IT', 'salary': 70000}
]
# First sort by salary
records.sort(key=lambda x: x['salary'], reverse=True)
# Then sort by department (stable, so salary order preserved within dept)
records.sort(key=lambda x: x['dept'])
for r in records:
print(f"{r['dept']}: {r['name']} (${r['salary']})")
# HR: Charlie ($65000)
# HR: Alice ($60000)
# IT: Bob ($75000)
# IT: David ($70000)Performance Comparison: Sorting Methods
Python's sorting has O(n log n) time complexity in the worst case, making it efficient for most datasets. Different approaches suit different scenarios:
| Method | Time Complexity | Space | Use Case |
|---|---|---|---|
list.sort() | O(n log n) | O(1) | In-place sorting, large lists |
sorted() | O(n log n) | O(n) | Preserve original, sort any iterable |
heapq.nsmallest(k, items) | O(n log k) | O(k) | Top k elements from large dataset |
heapq.nlargest(k, items) | O(n log k) | O(k) | Bottom k elements from large dataset |
bisect.insort(list, item) | O(n) | O(1) | Maintain sorted list with insertions |
import heapq
# When you only need top 3 scores from 1 million records
scores = list(range(1000000))
top_3 = heapq.nlargest(3, scores) # Much faster than sorted()[-3:]
print(top_3) # [999999, 999998, 999997]
# When you need bottom 5 values
bottom_5 = heapq.nsmallest(5, scores)
print(bottom_5) # [0, 1, 2, 3, 4]For maintaining a sorted list with frequent insertions:
import bisect
# Maintain sorted list efficiently
sorted_list = []
for value in [5, 2, 8, 1, 9, 3]:
bisect.insort(sorted_list, value)
print(sorted_list) # [1, 2, 3, 5, 8, 9]functools.cmp_to_key for Legacy Comparison Functions
Python 2 used comparison functions returning -1, 0, or 1. Python 3 requires key functions, but functools.cmp_to_key converts old comparison functions.
from functools import cmp_to_key
# Custom comparison: sort by remainder when divided by 3
def compare_mod3(x, y):
return (x % 3) - (y % 3)
numbers = [10, 11, 12, 13, 14, 15]
sorted_nums = sorted(numbers, key=cmp_to_key(compare_mod3))
print(sorted_nums) # [12, 15, 10, 13, 11, 14]
# Groups: [12,15] (mod 0), [10,13] (mod 1), [11,14] (mod 2)
# Complex comparison: version strings
def compare_versions(v1, v2):
parts1 = list(map(int, v1.split('.')))
parts2 = list(map(int, v2.split('.')))
for p1, p2 in zip(parts1, parts2):
if p1 != p2:
return p1 - p2
return len(parts1) - len(parts2)
versions = ['1.10.2', '1.9.0', '1.10.15', '2.0.0', '1.10']
sorted_versions = sorted(versions, key=cmp_to_key(compare_versions))
print(sorted_versions) # ['1.9.0', '1.10', '1.10.2', '1.10.15', '2.0.0']Sorting with None Values
None values cause comparison errors when mixed with other types. Handle them explicitly by placing them at the beginning or end.
# None values cause errors
data = [3, None, 1, 5, None, 2]
# sorted(data) # TypeError: '<' not supported between 'NoneType' and 'int'
# Place None values at the end
sorted_data = sorted(data, key=lambda x: (x is None, x))
print(sorted_data) # [1, 2, 3, 5, None, None]
# Place None values at the beginning
sorted_data = sorted(data, key=lambda x: (x is not None, x))
print(sorted_data) # [None, None, 1, 2, 3, 5]
# Replace None with specific value for sorting
sorted_data = sorted(data, key=lambda x: x if x is not None else float('-inf'))
print(sorted_data) # [None, None, 1, 2, 3, 5]For dictionaries with potentially missing keys:
records = [
{'name': 'Alice', 'score': 85},
{'name': 'Bob'}, # Missing score
{'name': 'Charlie', 'score': 92},
{'name': 'David'} # Missing score
]
# Sort with default value for missing keys
sorted_records = sorted(records, key=lambda x: x.get('score', 0))
print([r['name'] for r in sorted_records])
# ['Bob', 'David', 'Alice', 'Charlie']Real-World Example: Ranking Student Performance
This example demonstrates multiple sorting techniques combined to create a comprehensive ranking system:
class StudentRecord:
def __init__(self, name, test_scores, attendance, extra_credit=0):
self.name = name
self.test_scores = test_scores
self.attendance = attendance
self.extra_credit = extra_credit
@property
def average_score(self):
return sum(self.test_scores) / len(self.test_scores) if self.test_scores else 0
@property
def final_grade(self):
base = self.average_score * 0.8 + self.attendance * 0.2
return min(100, base + self.extra_credit)
def __repr__(self):
return f'{self.name}: {self.final_grade:.1f}'
students = [
StudentRecord('Alice', [85, 90, 88], 95, 5),
StudentRecord('Bob', [92, 88, 95], 80, 0),
StudentRecord('Charlie', [78, 82, 80], 100, 10),
StudentRecord('David', [90, 92, 88], 90, 2),
StudentRecord('Eve', [85, 85, 85], 95, 0)
]
# Rank by final grade (highest first), then name
rankings = sorted(students, key=lambda s: (-s.final_grade, s.name))
print("Student Rankings:")
for rank, student in enumerate(rankings, 1):
print(f"{rank}. {student}")Real-World Example: Processing File Listings
Sorting file metadata requires handling various data types and custom ordering logic:
import os
from datetime import datetime
class FileInfo:
def __init__(self, path):
self.name = os.path.basename(path)
self.path = path
self.size = os.path.getsize(path)
self.modified = os.path.getmtime(path)
self.extension = os.path.splitext(path)[1].lower()
def __repr__(self):
size_kb = self.size / 1024
mod_time = datetime.fromtimestamp(self.modified).strftime('%Y-%m-%d %H:%M')
return f'{self.name} ({size_kb:.1f} KB, {mod_time})'
# Simulate file information
files = [
FileInfo('/data/report.pdf'),
FileInfo('/data/summary.txt'),
FileInfo('/data/analysis.xlsx'),
FileInfo('/data/backup.pdf'),
]
# Sort by extension, then size (largest first)
by_type_size = sorted(files, key=lambda f: (f.extension, -f.size))
# Sort by modification time (most recent first)
by_recent = sorted(files, key=lambda f: -f.modified)
# Sort by size, handling different units
def format_size(bytes_size):
if bytes_size < 1024:
return f"{bytes_size} B"
elif bytes_size < 1024 * 1024:
return f"{bytes_size/1024:.1f} KB"
else:
return f"{bytes_size/(1024*1024):.1f} MB"
by_size = sorted(files, key=lambda f: f.size, reverse=True)
for f in by_size:
print(f"{f.name}: {format_size(f.size)}")Interactive Data Sorting with PyGWalker
When working with large datasets that require visual exploration and interactive sorting, PyGWalker transforms pandas DataFrames into Tableau-style interfaces. Instead of writing complex sorting logic, you can drag and drop columns to sort, filter, and analyze data visually:
import pandas as pd
import pygwalker as pyg
# Create sample sales data
sales_data = pd.DataFrame({
'product': ['Laptop', 'Phone', 'Tablet', 'Monitor', 'Keyboard'] * 100,
'region': ['North', 'South', 'East', 'West', 'Central'] * 100,
'revenue': [999, 599, 399, 299, 49] * 100,
'units_sold': [50, 120, 80, 95, 200] * 100,
'date': pd.date_range('2024-01-01', periods=500)
})
# Launch interactive visualization
pyg.walk(sales_data)PyGWalker enables:
- Multi-column sorting with visual feedback
- Dynamic filtering combined with sorting
- Sorting by calculated fields without writing code
- Exporting sorted views as charts or tables
This approach works particularly well when exploring unfamiliar datasets or presenting findings to non-technical stakeholders who need to understand data patterns without running Python code.
FAQ
What's the difference between sorted() and list.sort() in Python?
The sorted() function returns a new sorted list and works with any iterable, leaving the original unchanged. The list.sort() method modifies the list in place, returns None, and only works with list objects. Use sorted() to preserve the original data or sort non-list sequences like tuples; use list.sort() for memory efficiency when sorting large lists.
How do I sort a list in descending order in Python?
Add the reverse=True parameter to either sorted() or list.sort(). For example: sorted([3, 1, 4], reverse=True) returns [4, 3, 1]. This works with any data type that has natural ordering, including numbers, strings, and dates.
Can I sort a dictionary by value in Python?
Dictionaries themselves don't have order in the traditional sense, but you can sort their items by value using sorted() with a key function: sorted(dict.items(), key=lambda x: x[1]). This returns a list of tuples sorted by value. Convert back to a dictionary with dict(sorted(...)) if needed, though this only preserves order in Python 3.7+.
How do I sort a list of dictionaries by a specific key?
Use sorted() with a lambda function that extracts the key: sorted(list_of_dicts, key=lambda x: x['keyname']). For better performance, use operator.itemgetter: sorted(list_of_dicts, key=itemgetter('keyname')). Both approaches work for sorting by single or multiple keys.
What is the time complexity of Python's sort function?
Python's sorted() and list.sort() use the Timsort algorithm, which has O(n log n) worst-case time complexity and O(n) best-case complexity for partially sorted data. The space complexity is O(n) for sorted() (creates new list) and O(1) for list.sort() (in-place). For finding top k elements from large datasets, heapq.nlargest() offers O(n log k) complexity.
How do I handle None values when sorting?
None values cause comparison errors when mixed with other types. Use a key function that places them at the beginning or end: sorted(data, key=lambda x: (x is None, x)) puts None at the end, while sorted(data, key=lambda x: (x is not None, x)) puts them first. For dictionaries with missing keys, use x.get('key', default_value) to provide a fallback.
Is Python's sort stable?
Yes, Python's sorting is guaranteed stable, meaning elements with equal keys maintain their original relative order. This enables multi-pass sorting where you sort by one criterion, then another, with the first sort preserved for ties. Stability is guaranteed by the Timsort algorithm used in Python.
Conclusion
Python's sorting capabilities provide powerful tools for organizing data efficiently. The distinction between sorted() and list.sort() guides memory-conscious decisions, while the key parameter and lambda functions enable sophisticated custom sorting logic. Understanding stable sorting, multi-criteria ordering, and performance characteristics empowers you to choose the right approach for each scenario.
Whether you're ranking student performance, processing file metadata, or analyzing business metrics, these sorting techniques form the foundation for data manipulation workflows. For visual data exploration and interactive sorting, PyGWalker extends these capabilities with drag-and-drop interfaces that complement programmatic sorting.
Master these sorting patterns to write cleaner code, improve performance, and handle complex data organization challenges with confidence.