Skip to content

Python heapq: Priority Queues and Heap Operations Made Simple

Updated on

You need to process items by priority -- tasks in a job scheduler, events in a simulation, or the shortest path in a graph. A sorted list seems like the obvious choice, but inserting into a sorted list is O(n), which becomes a bottleneck when you're processing millions of items. Sorting the entire list every time you add an element is even worse at O(n log n).

The cost compounds quickly. A job scheduler that re-sorts its queue on every insertion wastes most of its time on bookkeeping. A pathfinding algorithm that scans a full list for the minimum distance on every step turns a linear-time operation into a quadratic one.

Python's heapq module provides a heap-based priority queue that offers O(log n) insertion and O(log n) extraction of the smallest element. It also includes optimized functions for finding the N largest or smallest elements from any iterable without sorting the entire collection. This guide covers every function in heapq with practical examples.

📚

What is a Heap?

A heap is a binary tree where each parent node is smaller than or equal to its children (min-heap). Python's heapq implements a min-heap using a regular list, where the smallest element is always at index 0.

The key operations and their time complexities:

OperationFunctionTime Complexity
Push an itemheapq.heappush(heap, item)O(log n)
Pop smallestheapq.heappop(heap)O(log n)
Push then popheapq.heappushpop(heap, item)O(log n)
Pop then pushheapq.heapreplace(heap, item)O(log n)
Build heap from listheapq.heapify(list)O(n)
N smallest itemsheapq.nsmallest(n, iterable)O(n log k)
N largest itemsheapq.nlargest(n, iterable)O(n log k)

Basic Heap Operations

Creating and Using a Heap

import heapq
 
# Start with an empty heap (just a regular list)
heap = []
 
# Push items
heapq.heappush(heap, 5)
heapq.heappush(heap, 1)
heapq.heappush(heap, 3)
heapq.heappush(heap, 2)
 
print(heap)        # [1, 2, 3, 5] (heap order, not sorted)
print(heap[0])     # 1 (smallest element, always at index 0)
 
# Pop smallest item
smallest = heapq.heappop(heap)
print(smallest)    # 1
print(heap)        # [2, 5, 3]

Converting a List to a Heap

import heapq
 
data = [5, 3, 8, 1, 2, 7, 4, 6]
heapq.heapify(data)  # In-place, O(n)
print(data)  # [1, 2, 4, 3, 5, 7, 8, 6] (heap order)
 
# Pop elements in sorted order
sorted_data = []
while data:
    sorted_data.append(heapq.heappop(data))
print(sorted_data)  # [1, 2, 3, 4, 5, 6, 7, 8]

heappushpop and heapreplace

These combined operations are more efficient than separate push/pop calls:

import heapq
 
heap = [1, 3, 5, 7]
heapq.heapify(heap)
 
# heappushpop: push first, then pop smallest
result = heapq.heappushpop(heap, 2)
print(result)  # 1 (pushed 2, popped 1)
print(heap)    # [2, 3, 5, 7]
 
# heapreplace: pop smallest first, then push
result = heapq.heapreplace(heap, 10)
print(result)  # 2 (popped 2, pushed 10)
print(heap)    # [3, 7, 5, 10]

nlargest and nsmallest

These functions find the N largest or smallest elements without sorting the entire collection.

import heapq
 
scores = [85, 92, 78, 95, 88, 72, 91, 83, 97, 69]
 
# Top 3 scores
top_3 = heapq.nlargest(3, scores)
print(top_3)  # [97, 95, 92]
 
# Bottom 3 scores
bottom_3 = heapq.nsmallest(3, scores)
print(bottom_3)  # [69, 72, 78]

With a Key Function

Both accept a key parameter for custom comparison:

import heapq
 
students = [
    {'name': 'Alice', 'gpa': 3.9},
    {'name': 'Bob', 'gpa': 3.5},
    {'name': 'Charlie', 'gpa': 3.8},
    {'name': 'Diana', 'gpa': 3.95},
    {'name': 'Eve', 'gpa': 3.2},
]
 
top_students = heapq.nlargest(3, students, key=lambda s: s['gpa'])
for s in top_students:
    print(f"{s['name']}: {s['gpa']}")
# Diana: 3.95
# Alice: 3.9
# Charlie: 3.8

When to Use nlargest/nsmallest vs sorted()

ScenarioBest ApproachWhy
N is 1min() / max()O(n), simplest
N is small relative to totalnlargest / nsmallestO(n log k), less memory
N is close to total lengthsorted(iterable)[:n]O(n log n), but fast in practice
Need all items sortedsorted()Full sort is appropriate

Implementing a Max-Heap

Python's heapq only provides a min-heap. To create a max-heap, negate the values:

import heapq
 
# Max-heap using negation
max_heap = []
for val in [5, 1, 3, 8, 2]:
    heapq.heappush(max_heap, -val)
 
# Pop largest items
while max_heap:
    print(-heapq.heappop(max_heap), end=' ')
# 8 5 3 2 1

Priority Queue with Custom Objects

For objects that aren't directly comparable, use tuples (priority, counter, item):

import heapq
from dataclasses import dataclass
 
@dataclass
class Task:
    name: str
    description: str
 
# Use (priority, tie_breaker, task) tuples
task_queue = []
counter = 0
 
def add_task(priority, task):
    global counter
    heapq.heappush(task_queue, (priority, counter, task))
    counter += 1
 
def get_next_task():
    if task_queue:
        priority, _, task = heapq.heappop(task_queue)
        return task
    return None
 
add_task(3, Task("Low", "Clean up logs"))
add_task(1, Task("Critical", "Fix production bug"))
add_task(2, Task("Medium", "Update docs"))
add_task(1, Task("Critical", "Patch security hole"))
 
while task_queue:
    task = get_next_task()
    print(f"{task.name}: {task.description}")
 
# Critical: Fix production bug
# Critical: Patch security hole
# Medium: Update docs
# Low: Clean up logs

The counter serves as a tie-breaker so items with equal priority are returned in insertion order (FIFO).

Practical Use Cases

Merge Sorted Iterables

import heapq
 
list1 = [1, 4, 7, 10]
list2 = [2, 5, 8, 11]
list3 = [3, 6, 9, 12]
 
merged = list(heapq.merge(list1, list2, list3))
print(merged)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

heapq.merge() is memory-efficient -- it doesn't load all lists into memory at once.

Running Median

import heapq
 
class RunningMedian:
    """Track the median of a stream of numbers."""
    def __init__(self):
        self.low = []   # max-heap (negated values)
        self.high = []  # min-heap
 
    def add(self, num):
        heapq.heappush(self.low, -num)
        heapq.heappush(self.high, -heapq.heappop(self.low))
        if len(self.high) > len(self.low):
            heapq.heappush(self.low, -heapq.heappop(self.high))
 
    def median(self):
        if len(self.low) > len(self.high):
            return -self.low[0]
        return (-self.low[0] + self.high[0]) / 2
 
rm = RunningMedian()
for num in [2, 1, 5, 7, 2, 0, 5]:
    rm.add(num)
    print(f"Added {num}, median = {rm.median()}")

Dijkstra's Shortest Path

import heapq
 
def dijkstra(graph, start):
    """Find shortest paths from start to all nodes."""
    distances = {node: float('inf') for node in graph}
    distances[start] = 0
    pq = [(0, start)]
 
    while pq:
        dist, node = heapq.heappop(pq)
        if dist > distances[node]:
            continue
        for neighbor, weight in graph[node]:
            new_dist = dist + weight
            if new_dist < distances[neighbor]:
                distances[neighbor] = new_dist
                heapq.heappush(pq, (new_dist, neighbor))
 
    return distances
 
graph = {
    'A': [('B', 1), ('C', 4)],
    'B': [('C', 2), ('D', 5)],
    'C': [('D', 1)],
    'D': [],
}
 
print(dijkstra(graph, 'A'))
# {'A': 0, 'B': 1, 'C': 3, 'D': 4}

Analyzing Heap Performance in Jupyter

When comparing heap-based algorithms against alternatives or profiling priority queue performance, interactive experimentation speeds up the process. RunCell (opens in a new tab) is an AI agent for Jupyter that can help you benchmark heap operations, visualize performance characteristics, and optimize your data structure choices interactively.

FAQ

What is heapq in Python?

heapq is a standard library module that implements a min-heap using a regular Python list. It provides functions for pushing and popping items in O(log n) time while maintaining the heap property (smallest element always at index 0). It is the standard way to implement priority queues in Python.

How do I create a max-heap in Python?

Python's heapq only supports min-heaps. To simulate a max-heap, negate the values when pushing and negate again when popping: heapq.heappush(heap, -value) to push, -heapq.heappop(heap) to get the largest. For complex objects, negate the priority in a tuple: (-priority, counter, item).

When should I use heapq vs sorted()?

Use heapq.nlargest(n, data) or heapq.nsmallest(n, data) when you need only the top N items and N is small relative to the total data size. Use sorted() when you need all items in order or when N is close to the total size. Use min()/max() when N is 1.

Is heapq thread-safe?

No. Individual heappush and heappop calls are not atomic. If multiple threads access the same heap, you need external synchronization. For thread-safe priority queues, use queue.PriorityQueue, which wraps heapq with proper locking.

What is the time complexity of heapq operations?

heappush and heappop are both O(log n). heapify converts a list to a heap in O(n). nlargest(k, data) and nsmallest(k, data) run in O(n log k) where n is the data size and k is the number of items requested.

Conclusion

Python's heapq module is the standard tool for priority queues and top-N selection. Its heap-based approach provides O(log n) push and pop operations, making it dramatically faster than maintaining a sorted list. Use heappush/heappop for priority queue patterns, nlargest/nsmallest for top-N selection, merge for combining sorted sequences, and heapify when you have an existing list to convert.

📚