Python zip() Function: Combine Iterables with Examples
Working with multiple related lists is one of the most common tasks in Python. You have a list of names and a list of scores. A list of dates and a list of prices. A list of keys and a list of values. The natural approach is to use index-based loops with range(len()), but that creates cluttered code full of bracket indexing, makes the intent hard to read, and introduces opportunities for off-by-one errors. When your lists have different lengths, the problem gets worse -- you need boundary checks and defensive logic that obscures the actual work you are trying to do.
Python's built-in zip() function solves this cleanly. It takes any number of iterables (usually two or more) and pairs their elements together into tuples, producing an iterator you can loop over directly. No indexing. No counters. No length checks. This guide covers every aspect of zip() -- from basic pairing to advanced patterns like matrix transposition, dictionary creation, and the strict parameter introduced in Python 3.10.
What Does zip() Do?
The zip() function takes any number of iterables as arguments and returns an iterator of tuples. Each tuple contains the corresponding elements from all the input iterables, matched by position.
names = ['Alice', 'Bob', 'Charlie']
ages = [30, 25, 35]
zipped = zip(names, ages)
print(list(zipped))
# [('Alice', 30), ('Bob', 25), ('Charlie', 35)]
The function signature is straightforward:
zip(*iterables) # Python 3.0+
zip(*iterables, strict=False) # Python 3.10+
zip() works with any iterable type -- lists, tuples, strings, ranges, generators, dictionaries, and file objects. It returns a lazy iterator, meaning it generates tuples on demand rather than building the entire result in memory.
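The laziness described above has two practical consequences, sketched below: zip() can be paired with an infinite iterator, and the resulting iterator can be consumed only once. The variable names are illustrative only.

```python
from itertools import count

# zip() is lazy: pairing with an infinite counter is fine because
# tuples are only produced as they are requested.
labels = ['a', 'b', 'c']
numbered = zip(count(1), labels)

first_pass = list(numbered)   # consumes the iterator
second_pass = list(numbered)  # nothing left to yield

print(first_pass)   # [(1, 'a'), (2, 'b'), (3, 'c')]
print(second_pass)  # []
```

If the pairs are needed more than once, store them in a list or call zip() again on the original sequences.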
Basic Usage: Combining Two Lists
The most common use case is pairing elements from two lists in a for loop:
cities = ['Tokyo', 'Paris', 'New York']
populations = [13960000, 2161000, 8336000]
for city, pop in zip(cities, populations):
    print(f"{city}: {pop:,} people")
# Output:
# Tokyo: 13,960,000 people
# Paris: 2,161,000 people
# New York: 8,336,000 people
Compare this to the index-based approach:
# Without zip (less readable)
for i in range(len(cities)):
    print(f"{cities[i]}: {populations[i]:,} people")
# With zip (Pythonic)
for city, pop in zip(cities, populations):
    print(f"{city}: {pop:,} people")
The zip() version eliminates the index variable entirely. You work directly with the values, which makes the code easier to read and less prone to errors.
How zip() Handles Unequal Lengths
When the input iterables have different lengths, zip() stops at the shortest one. Extra elements in longer iterables are silently ignored.
letters = ['a', 'b', 'c', 'd', 'e']
numbers = [1, 2, 3]
result = list(zip(letters, numbers))
print(result)
# [('a', 1), ('b', 2), ('c', 3)]
# 'd' and 'e' are dropped silently
This truncation behavior is useful when you know the shorter list defines the valid range. But it can also hide bugs -- if you expected both lists to have equal length, a silent truncation means you lose data without any warning.
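Truncation has a subtler side effect when the inputs are one-shot iterators rather than lists. The sketch below, with made-up data, shows that zip() reads its arguments left to right, so it can pull an item from a longer iterator before discovering that a later one is exhausted. That item is then lost.

```python
long_iter = iter([1, 2, 3, 4])
short_iter = iter(['x', 'y'])

pairs = list(zip(long_iter, short_iter))
print(pairs)     # [(1, 'x'), (2, 'y')]

# zip() pulled 3 from long_iter before finding short_iter empty,
# so 3 has been consumed and only 4 remains.
leftover = list(long_iter)
print(leftover)  # [4]
```

Passing the shorter iterator first avoids the extra read, because zip() stops as soon as the first exhausted argument is reached.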
The strict Parameter (Python 3.10+)
Python 3.10 introduced the strict parameter to catch length mismatches:
keys = ['name', 'age', 'email']
values = ['Alice', 30]
# Default behavior: silently truncates
print(list(zip(keys, values)))
# [('name', 'Alice'), ('age', 30)]
# With strict=True: raises ValueError
try:
    list(zip(keys, values, strict=True))
except ValueError as e:
    print(e)
# zip() argument 2 is shorter than argument 1
Use strict=True when you expect all iterables to have the same length. This is especially valuable in data processing pipelines where a length mismatch indicates corrupted or incomplete data.
| Behavior | Default (strict=False) | strict=True (3.10+) |
|---|---|---|
| Equal lengths | Returns all pairs | Returns all pairs |
| Unequal lengths | Truncates to shortest | Raises ValueError |
| Use case | When truncation is intentional | When lengths must match |
| Safety | Silent data loss possible | Explicit error on mismatch |
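On Python versions older than 3.10 the strict parameter is not available. One possible fallback, sketched here, pads with a private sentinel via itertools.zip_longest() and raises if the sentinel ever appears. The helper name zip_strict and its error message are hypothetical and not part of the standard library.

```python
from itertools import zip_longest

_MISSING = object()  # unique sentinel that cannot appear in user data

def zip_strict(*iterables):
    """Hypothetical helper: like zip(), but raise ValueError on a length mismatch."""
    for combo in zip_longest(*iterables, fillvalue=_MISSING):
        if any(item is _MISSING for item in combo):
            raise ValueError("zip_strict() arguments have different lengths")
        yield combo

print(list(zip_strict([1, 2], ['a', 'b'])))  # [(1, 'a'), (2, 'b')]

try:
    list(zip_strict([1, 2, 3], ['a', 'b']))
except ValueError as exc:
    print(exc)  # zip_strict() arguments have different lengths
```

Like the built-in strict mode, this check is lazy: the error surfaces only when iteration reaches the point where one input runs out, so earlier tuples may already have been processed.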
zip_longest from itertools
When you want to keep all elements -- even from longer iterables -- use itertools.zip_longest(). It pads shorter iterables with a fill value (default is None).
from itertools import zip_longest
names = ['Alice', 'Bob', 'Charlie', 'Diana']
scores = [85, 92]
# Default fill value is None
result = list(zip_longest(names, scores))
print(result)
# [('Alice', 85), ('Bob', 92), ('Charlie', None), ('Diana', None)]
# Custom fill value
result = list(zip_longest(names, scores, fillvalue=0))
print(result)
# [('Alice', 85), ('Bob', 92), ('Charlie', 0), ('Diana', 0)]
This is useful when processing data that may have missing values, or when you are aligning sequences that should eventually have the same length.
from itertools import zip_longest
# Aligning columns of text with different row counts
col1 = ['Name', 'Alice', 'Bob']
col2 = ['Score', '85', '92', '78', '95']
for left, right in zip_longest(col1, col2, fillvalue=''):
    print(f"{left:<10} {right}")
# Output:
# Name       Score
# Alice      85
# Bob        92
#            78
#            95
Unzipping with zip(*zipped)
The zip() function can reverse itself using the unpacking operator *. This pattern is commonly called "unzipping":
pairs = [('Alice', 85), ('Bob', 92), ('Charlie', 78)]
# Unzip into separate tuples
names, scores = zip(*pairs)
print(names)
# ('Alice', 'Bob', 'Charlie')
print(scores)
# (85, 92, 78)
The * operator unpacks the list of tuples into separate arguments for zip(). So zip(*pairs) is equivalent to zip(('Alice', 85), ('Bob', 92), ('Charlie', 78)), which groups first elements together and second elements together.
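The equivalence described above can be checked directly. This short sketch reuses the same pairs data and compares the unpacked call with one that passes each tuple explicitly.

```python
pairs = [('Alice', 85), ('Bob', 92), ('Charlie', 78)]

# Unpacking with * passes each tuple as its own positional argument.
unpacked = list(zip(*pairs))
explicit = list(zip(pairs[0], pairs[1], pairs[2]))

print(unpacked)              # [('Alice', 'Bob', 'Charlie'), (85, 92, 78)]
print(unpacked == explicit)  # True
```

One edge case to keep in mind: if pairs is empty, zip(*pairs) yields nothing, so an assignment such as names, scores = zip(*pairs) raises ValueError because there are no values to unpack.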
Note that the result is tuples, not lists. If you need lists, convert them:
names_list = list(names)
scores_list = list(scores)
Round-trip: Zip and Unzip
A useful property is that, for non-empty inputs of equal length, zipping and unzipping are inverse operations (the restored sequences come back as tuples):
a = [1, 2, 3]
b = ['x', 'y', 'z']
# Zip then unzip
zipped = list(zip(a, b))
a_restored, b_restored = zip(*zipped)
print(list(a_restored)) # [1, 2, 3]
print(list(b_restored)) # ['x', 'y', 'z']
Creating Dictionaries with zip()
One of the most practical uses of zip() is building dictionaries from two parallel lists -- one for keys and one for values:
countries = ['Japan', 'France', 'Brazil', 'Canada']
capitals = ['Tokyo', 'Paris', 'Brasilia', 'Ottawa']
country_capitals = dict(zip(countries, capitals))
print(country_capitals)
# {'Japan': 'Tokyo', 'France': 'Paris', 'Brazil': 'Brasilia', 'Canada': 'Ottawa'}
This pattern is concise, readable, and fast. It works because dict() accepts an iterable of key-value pairs, and zip() produces exactly that.
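Because dict() simply consumes the pairs in order, repeated keys do not raise an error: the last value for a key wins. The sketch below uses made-up keys to show this.

```python
keys = ['a', 'b', 'a']
values = [1, 2, 3]

mapping = dict(zip(keys, values))
print(mapping)  # {'a': 3, 'b': 2}
```

If duplicate keys would indicate a data problem, comparing len(mapping) with len(keys) is a cheap way to detect them.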
Dictionary with Computed Values
Combine zip() with list comprehensions or transformations:
products = ['Widget', 'Gadget', 'Doohickey']
base_prices = [10.0, 25.0, 5.0]
# Apply 20% markup
catalog = dict(zip(products, [p * 1.2 for p in base_prices]))
print(catalog)
# {'Widget': 12.0, 'Gadget': 30.0, 'Doohickey': 6.0}
# Or use a dictionary comprehension with zip
catalog = {name: price * 1.2 for name, price in zip(products, base_prices)}
print(catalog)
# {'Widget': 12.0, 'Gadget': 30.0, 'Doohickey': 6.0}
Merging Two Dictionaries by Aligned Keys
keys = ['a', 'b', 'c']
english = {'a': 'apple', 'b': 'banana', 'c': 'cherry'}
spanish = {'a': 'manzana', 'b': 'platano', 'c': 'cereza'}
bilingual = {
    k: (english[k], spanish[k])
    for k in keys
}
print(bilingual)
# {'a': ('apple', 'manzana'), 'b': ('banana', 'platano'), 'c': ('cherry', 'cereza')}
Combining zip() with enumerate()
When you need both the index and paired values, combine enumerate() with zip():
students = ['Alice', 'Bob', 'Charlie']
grades = ['A', 'B+', 'A-']
for rank, (student, grade) in enumerate(zip(students, grades), start=1):
    print(f"#{rank}: {student} - Grade {grade}")
# Output:
# #1: Alice - Grade A
# #2: Bob - Grade B+
# #3: Charlie - Grade A-
This pattern is common in reporting and display logic where you need a numbered list of paired data.
# Building a numbered lookup table
headers = ['Name', 'Age', 'City']
row = ['Alice', 30, 'Tokyo']
for i, (header, value) in enumerate(zip(headers, row)):
    print(f" Column {i}: {header} = {value}")
# Output:
# Column 0: Name = Alice
# Column 1: Age = 30
# Column 2: City = Tokyo
Transposing a Matrix with zip()
One of the most elegant uses of zip() is transposing a matrix -- converting rows into columns and columns into rows:
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
transposed = list(zip(*matrix))
print(transposed)
# [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
# Convert inner tuples to lists if needed
transposed_lists = [list(row) for row in zip(*matrix)]
print(transposed_lists)
# [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
The *matrix unpacks the rows as separate arguments to zip(), and zip() groups the elements by column position. This is a one-liner that replaces a nested loop:
# The equivalent nested loop approach
rows = len(matrix)
cols = len(matrix[0])
transposed_manual = []
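for c in range(cols):
    new_row = []
    for r in range(rows):
        new_row.append(matrix[r][c])
    transposed_manual.append(new_row)
The zip(*matrix) approach is shorter, typically faster, and more Pythonic. Keep in mind that it yields tuples rather than lists, and that it assumes every row has the same length.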
zip() with More Than Two Iterables
zip() accepts any number of iterables, not just two:
names = ['Alice', 'Bob', 'Charlie']
ages = [30, 25, 35]
cities = ['Tokyo', 'Paris', 'New York']
roles = ['Engineer', 'Designer', 'Manager']
for name, age, city, role in zip(names, ages, cities, roles):
    print(f"{name}, {age}, {city} - {role}")
# Output:
# Alice, 30, Tokyo - Engineer
# Bob, 25, Paris - Designer
# Charlie, 35, New York - Manager
This scales cleanly to any number of parallel sequences. The truncation rule still applies -- the shortest iterable determines the output length.
Building Records from Parallel Lists
fields = ['name', 'age', 'city', 'role']
values_list = [
    ['Alice', 30, 'Tokyo', 'Engineer'],
    ['Bob', 25, 'Paris', 'Designer'],
    ['Charlie', 35, 'New York', 'Manager'],
]
records = [dict(zip(fields, values)) for values in values_list]
for record in records:
    print(record)
# Output:
# {'name': 'Alice', 'age': 30, 'city': 'Tokyo', 'role': 'Engineer'}
# {'name': 'Bob', 'age': 25, 'city': 'Paris', 'role': 'Designer'}
# {'name': 'Charlie', 'age': 35, 'city': 'New York', 'role': 'Manager'}
This pattern is frequently used when parsing CSV data or API responses where headers and rows come as separate lists.
Performance Characteristics
zip() is implemented in C in CPython, making it highly efficient. It returns a lazy iterator, so it consumes minimal memory regardless of input size.
| Aspect | zip() | Manual Index Loop | List Comprehension with Indexing |
|---|---|---|---|
| Memory | O(1) -- lazy iterator | O(1) | O(n) if creating list |
| Speed | Fast (C implementation) | Slower (Python-level indexing) | Medium |
| Readability | High | Low | Medium |
| Handles generators | Yes | No (needs len()) | No (needs len()) |
| Multiple iterables | Any number | Error-prone with multiple | Verbose |
import timeit
data_a = list(range(100_000))
data_b = list(range(100_000))
# Time zip-based iteration
time_zip = timeit.timeit(
    'for a, b in zip(data_a, data_b): pass',
    globals={'data_a': data_a, 'data_b': data_b},
    number=100
)
# Time index-based iteration
time_index = timeit.timeit(
    'for i in range(len(data_a)): a, b = data_a[i], data_b[i]',
    globals={'data_a': data_a, 'data_b': data_b},
    number=100
)
print(f"zip: {time_zip:.4f}s")
print(f"index: {time_index:.4f}s")
# zip is usually faster here; the exact margin depends on the Python version and hardware
The speed advantage of zip() comes from doing the pairing in C. The index-based loop executes extra bytecode on every iteration: it produces an integer index and performs two subscript operations (data_a[i] and data_b[i]), whereas zip() pulls each value directly from the underlying list iterators.
Common Patterns and Recipes
Sliding Window (Pairs of Consecutive Elements)
data = [10, 20, 30, 40, 50]
# Pair each element with its successor
for current, next_val in zip(data, data[1:]):
    print(f"{current} -> {next_val}")
# Output:
# 10 -> 20
# 20 -> 30
# 30 -> 40
# 40 -> 50
Calculating Differences Between Consecutive Elements
prices = [100, 105, 102, 110, 108]
changes = [b - a for a, b in zip(prices, prices[1:])]
print(changes)
# [5, -3, 8, -2]
Grouping Elements in Chunks
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
# Group into triples using zip with iterators
it = iter(data)
chunks = list(zip(it, it, it))
print(chunks)
# [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
This works because zip() pulls one element at a time from the same iterator, advancing it three times per tuple. Note that elements that do not fit into a complete chunk are dropped.
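To make the dropped-remainder behavior concrete, here is a small sketch with eight elements (made-up data). It contrasts the zip() version with one possible alternative that keeps the trailing partial chunk by padding it with itertools.zip_longest().

```python
from itertools import zip_longest

data = [1, 2, 3, 4, 5, 6, 7, 8]

# zip() stops at the first incomplete triple, so 7 and 8 are lost.
it = iter(data)
dropped = list(zip(it, it, it))
print(dropped)  # [(1, 2, 3), (4, 5, 6)]

# zip_longest() keeps the partial chunk and pads it with the fill value.
it = iter(data)
padded = list(zip_longest(it, it, it, fillvalue=None))
print(padded)   # [(1, 2, 3), (4, 5, 6), (7, 8, None)]
```

Choose a fill value that cannot be confused with real data, or strip it from the last chunk afterwards.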
Dot Product of Two Vectors
vector_a = [1, 2, 3]
vector_b = [4, 5, 6]
dot_product = sum(a * b for a, b in zip(vector_a, vector_b))
print(dot_product)
# 32 (1*4 + 2*5 + 3*6)
Comparing Two Lists Element-wise
expected = [10, 20, 30, 40]
actual = [10, 21, 30, 39]
mismatches = [
    (i, exp, act)
    for i, (exp, act) in enumerate(zip(expected, actual))
    if exp != act
]
print(mismatches)
# [(1, 20, 21), (3, 40, 39)]
Interleaving Two Lists
a = [1, 3, 5]
b = [2, 4, 6]
interleaved = [val for pair in zip(a, b) for val in pair]
print(interleaved)
# [1, 2, 3, 4, 5, 6]
Using zip() with Pandas DataFrames
When working with DataFrames, zip() is useful for creating new columns from multiple existing columns without using apply():
import pandas as pd
df = pd.DataFrame({
    'first_name': ['Alice', 'Bob', 'Charlie'],
    'last_name': ['Smith', 'Jones', 'Brown'],
    'score': [85, 92, 78]
})
# Create full name column using zip (usually faster than a row-wise apply)
df['full_name'] = [
    f"{first} {last}"
    for first, last in zip(df['first_name'], df['last_name'])
]
print(df)
This approach avoids the per-row overhead of df.apply() with a lambda and is usually noticeably faster on large DataFrames. For simple cases like this one, a vectorized expression such as df['first_name'] + ' ' + df['last_name'] is typically faster still.
If you work in Jupyter notebooks and want to experiment with these zip() patterns interactively, RunCell provides an AI-powered environment that understands your notebook context. It can suggest vectorized alternatives when you are using zip() over DataFrame columns, and help you debug iteration patterns by inspecting variable states at each step.
FAQ
What does the zip() function do in Python?
The zip() function takes any number of iterables (lists, tuples, strings, etc.) and returns an iterator of tuples. Each tuple groups the elements from the input iterables that share the same position index. For example, list(zip([1, 2], ['a', 'b'])) produces [(1, 'a'), (2, 'b')]. It is the standard tool for iterating over multiple sequences in parallel without manual indexing.
What happens when zip() receives lists of different lengths?
By default, zip() stops when the shortest iterable is exhausted. Extra elements in longer iterables are silently discarded. In Python 3.10 and later, you can pass strict=True to raise a ValueError if the lengths differ. Alternatively, use itertools.zip_longest() to pad shorter iterables with a fill value instead of truncating.
How do you unzip a list of tuples in Python?
Use the unpacking operator * with zip(). If you have pairs = [(1, 'a'), (2, 'b'), (3, 'c')], then a, b = zip(*pairs) produces a = (1, 2, 3) and b = ('a', 'b', 'c'). The * unpacks each tuple as a separate argument to zip(), effectively transposing the data from row-oriented to column-oriented.
Is zip() memory efficient for large datasets?
Yes. The zip() function returns a lazy iterator that generates one tuple at a time on demand. It does not store the entire result in memory. This means you can zip() two million-element lists without creating a million tuples upfront. Memory usage stays constant regardless of input size, as long as you iterate rather than converting to a list.
Can you use zip() to create a dictionary?
Yes, and it is one of the most common patterns. Pass a zip() of keys and values to the dict() constructor: dict(zip(keys, values)). This creates a dictionary where each key from the first list is mapped to the corresponding value from the second list. It is concise, readable, and the standard Pythonic way to build dictionaries from parallel lists.
Conclusion
Python's zip() function is a fundamental building block for working with parallel sequences. It replaces verbose index-based loops with clean, readable code that directly expresses the intent of pairing elements together. Whether you are creating dictionaries from key-value lists, transposing matrices, computing dot products, or iterating over multiple columns of data, zip() provides an elegant and efficient solution.
The key points to remember:
- zip() pairs elements by position and returns a lazy iterator of tuples.
- It truncates to the shortest iterable by default. Use strict=True (Python 3.10+) to catch mismatches, or zip_longest() to pad shorter sequences.
- Unzipping with zip(*data) reverses the operation.
- dict(zip(keys, values)) is the standard way to build dictionaries from parallel lists.
- zip() is implemented in C and is faster than manual index-based iteration.
- It works with any number of iterables and any iterable type, including generators and file objects.
Master these patterns and zip() becomes one of the most versatile tools in your Python toolkit -- simplifying code, reducing bugs, and making your intent clear to every reader.