
Python zip() Function: Combine Iterables with Examples


Working with multiple related lists is one of the most common tasks in Python. You have a list of names and a list of scores. A list of dates and a list of prices. A list of keys and a list of values. The natural approach is to use index-based loops with range(len()), but that creates cluttered code full of bracket indexing, makes the intent hard to read, and introduces opportunities for off-by-one errors. When your lists have different lengths, the problem gets worse -- you need boundary checks and defensive logic that obscures the actual work you are trying to do.

Python's built-in zip() function solves this cleanly. It takes two or more iterables and pairs their elements together into tuples, producing an iterator you can loop over directly. No indexing. No counters. No length checks. This guide covers every aspect of zip() -- from basic pairing to advanced patterns like matrix transposition, dictionary creation, and the strict parameter introduced in Python 3.10.


What Does zip() Do?

The zip() function takes any number of iterables as arguments and returns an iterator of tuples. Each tuple contains the corresponding elements from all the input iterables, matched by position.

names = ['Alice', 'Bob', 'Charlie']
ages = [30, 25, 35]
 
zipped = zip(names, ages)
print(list(zipped))
# [('Alice', 30), ('Bob', 25), ('Charlie', 35)]

The function signature is straightforward:

zip(*iterables)           # Python 3.0+
zip(*iterables, strict=False)  # Python 3.10+

zip() works with any iterable type -- lists, tuples, strings, ranges, generators, dictionaries, and file objects. It returns a lazy iterator, meaning it generates tuples on demand rather than building the entire result in memory.
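Laziness has two practical consequences that a short sketch can make concrete: zip() can safely be paired with an unbounded iterable, and the resulting iterator can be consumed only once. The variable names below are illustrative.

```python
from itertools import count

# zip() is lazy, so pairing with an infinite counter is safe:
# iteration stops as soon as the finite string runs out.
pairs = zip(count(start=1), "abc")

first_pass = list(pairs)
second_pass = list(pairs)  # the iterator is already exhausted

print(first_pass)   # [(1, 'a'), (2, 'b'), (3, 'c')]
print(second_pass)  # []
```

If you need to loop over the pairs more than once, store them with list(zip(...)) or call zip() again.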

Basic Usage: Combining Two Lists

The most common use case is pairing elements from two lists in a for loop:

cities = ['Tokyo', 'Paris', 'New York']
populations = [13960000, 2161000, 8336000]
 
for city, pop in zip(cities, populations):
    print(f"{city}: {pop:,} people")
 
# Output:
# Tokyo: 13,960,000 people
# Paris: 2,161,000 people
# New York: 8,336,000 people

Compare this to the index-based approach:

# Without zip (less readable)
for i in range(len(cities)):
    print(f"{cities[i]}: {populations[i]:,} people")
 
# With zip (Pythonic)
for city, pop in zip(cities, populations):
    print(f"{city}: {pop:,} people")

The zip() version eliminates the index variable entirely. You work directly with the values, which makes the code easier to read and less prone to errors.

How zip() Handles Unequal Lengths

When the input iterables have different lengths, zip() stops at the shortest one. Extra elements in longer iterables are silently ignored.

letters = ['a', 'b', 'c', 'd', 'e']
numbers = [1, 2, 3]
 
result = list(zip(letters, numbers))
print(result)
# [('a', 1), ('b', 2), ('c', 3)]
# 'd' and 'e' are dropped silently

This truncation behavior is useful when you know the shorter list defines the valid range. But it can also hide bugs -- if you expected both lists to have equal length, a silent truncation means you lose data without any warning.
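A related subtlety appears when the arguments are iterators rather than lists. The sketch below (illustrative names) shows that zip() can fetch one extra item from an earlier argument before it discovers that a later argument is exhausted, and that item is then lost.

```python
# When zip() is given iterators (not lists), truncation can also
# consume one extra item from an iterator listed before the shortest one.
letters = iter(['a', 'b', 'c', 'd', 'e'])
numbers = iter([1, 2, 3])

paired = list(zip(letters, numbers))
leftover = list(letters)

print(paired)    # [('a', 1), ('b', 2), ('c', 3)]
print(leftover)  # ['e']  ('d' was fetched by zip() and then discarded)
```

Putting the iterator you expect to be shortest first avoids this, since zip() stops before touching the later arguments.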

The strict Parameter (Python 3.10+)

Python 3.10 introduced the strict parameter to catch length mismatches:

keys = ['name', 'age', 'email']
values = ['Alice', 30]
 
# Default behavior: silently truncates
print(list(zip(keys, values)))
# [('name', 'Alice'), ('age', 30)]
 
# With strict=True: raises ValueError
try:
    list(zip(keys, values, strict=True))
except ValueError as e:
    print(e)
# zip() argument 2 is shorter than argument 1

Use strict=True when you expect all iterables to have the same length. This is especially valuable in data processing pipelines where a length mismatch indicates corrupted or incomplete data.

| Behavior | Default (strict=False) | strict=True (3.10+) |
| --- | --- | --- |
| Equal lengths | Returns all pairs | Returns all pairs |
| Unequal lengths | Truncates to shortest | Raises ValueError |
| Use case | When truncation is intentional | When lengths must match |
| Safety | Silent data loss possible | Explicit error on mismatch |
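On interpreters older than 3.10, a similar check can be approximated with zip_longest() and a sentinel. This is only a sketch: zip_equal is a hypothetical helper name, not part of the standard library, and its error message is made up for the example.

```python
from itertools import zip_longest

_MISSING = object()  # unique sentinel that cannot appear in real data


def zip_equal(*iterables):
    """Yield tuples like zip(), but raise ValueError on a length mismatch.

    Hypothetical helper for illustration; on Python 3.10+ prefer
    zip(..., strict=True).
    """
    for combo in zip_longest(*iterables, fillvalue=_MISSING):
        if any(item is _MISSING for item in combo):
            raise ValueError("iterables have different lengths")
        yield combo


print(list(zip_equal([1, 2], ['a', 'b'])))
# [(1, 'a'), (2, 'b')]

try:
    list(zip_equal(['name', 'age', 'email'], ['Alice', 30]))
except ValueError as exc:
    print(f"Rejected: {exc}")
# Rejected: iterables have different lengths
```

Because it is a generator, the error surfaces only when iteration reaches the mismatch, which matches how strict=True behaves.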

zip_longest from itertools

When you want to keep all elements -- even from longer iterables -- use itertools.zip_longest(). It pads shorter iterables with a fill value (default is None).

from itertools import zip_longest
 
names = ['Alice', 'Bob', 'Charlie', 'Diana']
scores = [85, 92]
 
# Default fill value is None
result = list(zip_longest(names, scores))
print(result)
# [('Alice', 85), ('Bob', 92), ('Charlie', None), ('Diana', None)]
 
# Custom fill value
result = list(zip_longest(names, scores, fillvalue=0))
print(result)
# [('Alice', 85), ('Bob', 92), ('Charlie', 0), ('Diana', 0)]

This is useful when processing data that may have missing values, or when you are aligning sequences that should eventually have the same length.

from itertools import zip_longest
 
# Aligning columns of text with different row counts
col1 = ['Name', 'Alice', 'Bob']
col2 = ['Score', '85', '92', '78', '95']
 
for left, right in zip_longest(col1, col2, fillvalue=''):
    print(f"{left:<10} {right}")
 
# Output:
# Name       Score
# Alice      85
# Bob        92
#            78
#            95

Unzipping with zip(*zipped)

The zip() function can reverse itself using the unpacking operator *. This pattern is commonly called "unzipping":

pairs = [('Alice', 85), ('Bob', 92), ('Charlie', 78)]
 
# Unzip into separate tuples
names, scores = zip(*pairs)
 
print(names)
# ('Alice', 'Bob', 'Charlie')
 
print(scores)
# (85, 92, 78)

The * operator unpacks the list of tuples into separate arguments for zip(). So zip(*pairs) is equivalent to zip(('Alice', 85), ('Bob', 92), ('Charlie', 78)), which groups first elements together and second elements together.

Note that the result is tuples, not lists. If you need lists, convert them:

names_list = list(names)
scores_list = list(scores)

Round-trip: Zip and Unzip

For equal-length inputs, zipping and unzipping are inverse operations, except that the restored sequences come back as tuples:

a = [1, 2, 3]
b = ['x', 'y', 'z']
 
# Zip then unzip
zipped = list(zip(a, b))
a_restored, b_restored = zip(*zipped)
 
print(list(a_restored))  # [1, 2, 3]
print(list(b_restored))  # ['x', 'y', 'z']
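The round trip assumes there is at least one pair to unzip. With an empty list, zip(*pairs) receives no arguments and yields nothing, so tuple unpacking fails. The guard below is one possible way to handle that case, not the only one.

```python
pairs = []

# zip(*[]) is zip() with no arguments, which yields nothing,
# so unpacking into two names fails.
try:
    names, scores = zip(*pairs)
except ValueError as exc:
    print(f"Cannot unpack: {exc}")

# One possible guard: fall back to empty tuples when there is no data.
names, scores = zip(*pairs) if pairs else ((), ())
print(names, scores)
# () ()
```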

Creating Dictionaries with zip()

One of the most practical uses of zip() is building dictionaries from two parallel lists -- one for keys and one for values:

countries = ['Japan', 'France', 'Brazil', 'Canada']
capitals = ['Tokyo', 'Paris', 'Brasilia', 'Ottawa']
 
country_capitals = dict(zip(countries, capitals))
print(country_capitals)
# {'Japan': 'Tokyo', 'France': 'Paris', 'Brazil': 'Brasilia', 'Canada': 'Ottawa'}

This pattern is concise, readable, and fast. It works because dict() accepts an iterable of key-value pairs, and zip() produces exactly that.
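One behavior worth keeping in mind is how dict() treats repeated keys: later pairs overwrite earlier ones. The data below is invented for the example, and the length comparison is just one simple way to detect the overwrite.

```python
keys = ['id', 'name', 'id']
values = [1, 'Alice', 2]

# Later pairs overwrite earlier ones that share a key
record = dict(zip(keys, values))
print(record)
# {'id': 2, 'name': 'Alice'}

# A simple check that a key was overwritten
has_duplicates = len(record) != len(keys)
print(has_duplicates)
# True
```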

Dictionary with Computed Values

Combine zip() with list comprehensions or transformations:

products = ['Widget', 'Gadget', 'Doohickey']
base_prices = [10.0, 25.0, 5.0]
 
# Apply 20% markup
catalog = dict(zip(products, [p * 1.2 for p in base_prices]))
print(catalog)
# {'Widget': 12.0, 'Gadget': 30.0, 'Doohickey': 6.0}
 
# Or use a dictionary comprehension with zip
catalog = {name: price * 1.2 for name, price in zip(products, base_prices)}
print(catalog)
# {'Widget': 12.0, 'Gadget': 30.0, 'Doohickey': 6.0}

Merging Two Dictionaries by Aligned Keys

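keys = ['a', 'b', 'c']
english = {'a': 'apple', 'b': 'banana', 'c': 'cherry'}
spanish = {'a': 'manzana', 'b': 'platano', 'c': 'cereza'}
 
# Look up each key in both dictionaries, then zip the values together
english_words = [english[k] for k in keys]
spanish_words = [spanish[k] for k in keys]
bilingual = dict(zip(keys, zip(english_words, spanish_words)))
print(bilingual)
# {'a': ('apple', 'manzana'), 'b': ('banana', 'platano'), 'c': ('cherry', 'cereza')}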

Combining zip() with enumerate()

When you need both the index and paired values, combine enumerate() with zip():

students = ['Alice', 'Bob', 'Charlie']
grades = ['A', 'B+', 'A-']
 
for rank, (student, grade) in enumerate(zip(students, grades), start=1):
    print(f"#{rank}: {student} - Grade {grade}")
 
# Output:
# #1: Alice - Grade A
# #2: Bob - Grade B+
# #3: Charlie - Grade A-

This pattern is common in reporting and display logic where you need a numbered list of paired data.

# Building a numbered lookup table
headers = ['Name', 'Age', 'City']
row = ['Alice', 30, 'Tokyo']
 
for i, (header, value) in enumerate(zip(headers, row)):
    print(f"  Column {i}: {header} = {value}")
 
# Output:
#   Column 0: Name = Alice
#   Column 1: Age = 30
#   Column 2: City = Tokyo

Transposing a Matrix with zip()

One of the most elegant uses of zip() is transposing a matrix -- converting rows into columns and columns into rows:

matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
 
transposed = list(zip(*matrix))
print(transposed)
# [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
 
# Convert inner tuples to lists if needed
transposed_lists = [list(row) for row in zip(*matrix)]
print(transposed_lists)
# [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

The *matrix unpacks the rows as separate arguments to zip(), and zip() groups the elements by column position. This is a one-liner that replaces a nested loop:

# The equivalent nested loop approach
rows = len(matrix)
cols = len(matrix[0])
transposed_manual = []
for c in range(cols):
    new_row = []
    for r in range(rows):
        new_row.append(matrix[r][c])
    transposed_manual.append(new_row)

The zip(*matrix) approach is shorter, usually faster in CPython, and more Pythonic.
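One caveat: zip(*matrix) follows the usual truncation rule, so a matrix with rows of different lengths loses data. The sketch below uses a made-up ragged matrix to show the effect and one way to keep every column with zip_longest().

```python
from itertools import zip_longest

ragged = [
    [1, 2, 3],
    [4, 5],
    [6],
]

# zip() stops at the shortest row, so only the first column survives
print(list(zip(*ragged)))
# [(1, 4, 6)]

# zip_longest() keeps every column and pads the gaps
print(list(zip_longest(*ragged, fillvalue=None)))
# [(1, 4, 6), (2, 5, None), (3, None, None)]
```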

zip() with More Than Two Iterables

zip() accepts any number of iterables, not just two:

names = ['Alice', 'Bob', 'Charlie']
ages = [30, 25, 35]
cities = ['Tokyo', 'Paris', 'New York']
roles = ['Engineer', 'Designer', 'Manager']
 
for name, age, city, role in zip(names, ages, cities, roles):
    print(f"{name}, {age}, {city} - {role}")
 
# Output:
# Alice, 30, Tokyo - Engineer
# Bob, 25, Paris - Designer
# Charlie, 35, New York - Manager

This scales cleanly to any number of parallel sequences. The truncation rule still applies -- the shortest iterable determines the output length.

Building Records from Parallel Lists

fields = ['name', 'age', 'city', 'role']
values_list = [
    ['Alice', 30, 'Tokyo', 'Engineer'],
    ['Bob', 25, 'Paris', 'Designer'],
    ['Charlie', 35, 'New York', 'Manager'],
]
 
records = [dict(zip(fields, values)) for values in values_list]
for record in records:
    print(record)
 
# Output:
# {'name': 'Alice', 'age': 30, 'city': 'Tokyo', 'role': 'Engineer'}
# {'name': 'Bob', 'age': 25, 'city': 'Paris', 'role': 'Designer'}
# {'name': 'Charlie', 'age': 35, 'city': 'New York', 'role': 'Manager'}

This pattern is frequently used when parsing CSV data or API responses where headers and rows come as separate lists.
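As a rough illustration of the CSV case, the sketch below reads a small in-memory file with the standard csv module and zips the header row against each data row. The sample data is invented for the example.

```python
import csv
import io

# In-memory stand-in for a CSV file; the data is made up for illustration
raw = io.StringIO("name,age,city\nAlice,30,Tokyo\nBob,25,Paris\n")

reader = csv.reader(raw)
header = next(reader)  # first row holds the field names
records = [dict(zip(header, row)) for row in reader]

print(records)
# [{'name': 'Alice', 'age': '30', 'city': 'Tokyo'}, {'name': 'Bob', 'age': '25', 'city': 'Paris'}]
```

Note that every value arrives as a string. For real files, csv.DictReader performs the same header-to-row pairing for you.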

Performance Characteristics

zip() is implemented in C in CPython, making it highly efficient. It returns a lazy iterator, so it consumes minimal memory regardless of input size.

| Aspect | zip() | Manual Index Loop | List Comprehension with Indexing |
| --- | --- | --- | --- |
| Memory | O(1) -- lazy iterator | O(1) | O(n) if creating list |
| Speed | Fast (C implementation) | Slower (Python-level indexing) | Medium |
| Readability | High | Low | Medium |
| Handles generators | Yes | No (needs len()) | No (needs len()) |
| Multiple iterables | Any number | Error-prone with multiple | Verbose |

import timeit
 
data_a = list(range(100_000))
data_b = list(range(100_000))
 
# Time zip-based iteration
time_zip = timeit.timeit(
    'for a, b in zip(data_a, data_b): pass',
    globals={'data_a': data_a, 'data_b': data_b},
    number=100
)
 
# Time index-based iteration
time_index = timeit.timeit(
    'for i in range(len(data_a)): a, b = data_a[i], data_b[i]',
    globals={'data_a': data_a, 'data_b': data_b},
    number=100
)
 
print(f"zip:   {time_zip:.4f}s")
print(f"index: {time_index:.4f}s")
# zip is usually faster here; the exact margin depends on the Python version and hardware

The speed advantage of zip() comes from doing less work in the interpreter loop. The index-based version executes extra bytecode on every pass: it takes an integer from range(), then performs two separate subscript operations (data_a[i] and data_b[i]). zip() instead advances both underlying iterators and builds each tuple at the C level.

Common Patterns and Recipes

Sliding Window (Pairs of Consecutive Elements)

data = [10, 20, 30, 40, 50]
 
# Pair each element with its successor
for current, next_val in zip(data, data[1:]):
    print(f"{current} -> {next_val}")
 
# Output:
# 10 -> 20
# 20 -> 30
# 30 -> 40
# 40 -> 50
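The slice data[1:] copies the list and only works on sequences. For generators or other one-shot iterables, a similar pairing can be sketched with itertools.tee(). The helper name consecutive_pairs is hypothetical; on Python 3.10+ itertools.pairwise() provides the same behavior.

```python
from itertools import tee


def consecutive_pairs(iterable):
    """Return an iterator of (current, next) pairs from any iterable.

    Hypothetical helper; itertools.pairwise() offers the same behavior
    on Python 3.10+.
    """
    first, second = tee(iterable)
    next(second, None)  # advance the second copy by one position
    return zip(first, second)


readings = (n * 10 for n in range(1, 6))  # generator: cannot be sliced
print(list(consecutive_pairs(readings)))
# [(10, 20), (20, 30), (30, 40), (40, 50)]
```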

Calculating Differences Between Consecutive Elements

prices = [100, 105, 102, 110, 108]
 
changes = [b - a for a, b in zip(prices, prices[1:])]
print(changes)
# [5, -3, 8, -2]

Grouping Elements in Chunks

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
 
# Group into triples using zip with iterators
it = iter(data)
chunks = list(zip(it, it, it))
print(chunks)
# [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

This works because zip() pulls one element at a time from the same iterator, advancing it three times per tuple. Note that elements that do not fit into a complete chunk are dropped.
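If the leftover elements matter, the same shared-iterator trick can be combined with zip_longest() so the final chunk is padded instead of dropped. The helper below is a sketch with a hypothetical name (chunked) and an arbitrary default fill value.

```python
from itertools import zip_longest


def chunked(iterable, size, fillvalue=None):
    """Group items into tuples of `size`, padding the last one.

    Hypothetical helper built on the same shared-iterator trick.
    """
    it = iter(iterable)
    # [it] * size repeats the SAME iterator object `size` times
    return zip_longest(*[it] * size, fillvalue=fillvalue)


print(list(chunked([1, 2, 3, 4, 5, 6, 7], 3)))
# [(1, 2, 3), (4, 5, 6), (7, None, None)]
```

On Python 3.12 and later, itertools.batched() covers a similar need and returns a shorter final chunk rather than a padded one.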

Dot Product of Two Vectors

vector_a = [1, 2, 3]
vector_b = [4, 5, 6]
 
dot_product = sum(a * b for a, b in zip(vector_a, vector_b))
print(dot_product)
# 32  (1*4 + 2*5 + 3*6)

Comparing Two Lists Element-wise

expected = [10, 20, 30, 40]
actual = [10, 21, 30, 39]
 
mismatches = [
    (i, exp, act)
    for i, (exp, act) in enumerate(zip(expected, actual))
    if exp != act
]
print(mismatches)
# [(1, 20, 21), (3, 40, 39)]

Interleaving Two Lists

a = [1, 3, 5]
b = [2, 4, 6]
 
interleaved = [val for pair in zip(a, b) for val in pair]
print(interleaved)
# [1, 2, 3, 4, 5, 6]
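The comprehension above inherits zip()'s truncation, so it drops the tail of the longer list. The sketch below shows an equivalent flattening with itertools.chain.from_iterable() and one possible way to keep the tail by padding with a sentinel. The _SKIP name is illustrative.

```python
from itertools import chain, zip_longest

a = [1, 3, 5, 7]
b = [2, 4]

# chain.from_iterable flattens the pairs lazily
print(list(chain.from_iterable(zip(a, b))))
# [1, 2, 3, 4]   (5 and 7 are dropped by zip)

# To keep the tail of the longer list, pad with a sentinel and filter it out
_SKIP = object()
interleaved = [
    val
    for pair in zip_longest(a, b, fillvalue=_SKIP)
    for val in pair
    if val is not _SKIP
]
print(interleaved)
# [1, 2, 3, 4, 5, 7]
```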

Using zip() with Pandas DataFrames

When working with DataFrames, zip() is useful for creating new columns from multiple existing columns without using apply():

import pandas as pd
 
df = pd.DataFrame({
    'first_name': ['Alice', 'Bob', 'Charlie'],
    'last_name': ['Smith', 'Jones', 'Brown'],
    'score': [85, 92, 78]
})
 
# Create full name column using zip (typically faster than a row-wise apply)
df['full_name'] = [
    f"{first} {last}"
    for first, last in zip(df['first_name'], df['last_name'])
]
print(df)

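This approach avoids the per-row overhead of df.apply(..., axis=1) with a lambda and is typically faster on large DataFrames. For simple cases like this one, a vectorized expression such as df['first_name'] + ' ' + df['last_name'] is usually faster still.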

If you work in Jupyter notebooks and want to experiment with these zip() patterns interactively, RunCell provides an AI-powered environment that understands your notebook context. It can suggest vectorized alternatives when you are using zip() over DataFrame columns, and help you debug iteration patterns by inspecting variable states at each step.

FAQ

What does the zip() function do in Python?

The zip() function takes any number of iterables (lists, tuples, strings, etc.) and returns an iterator of tuples. Each tuple groups the elements from the input iterables that share the same position index. For example, list(zip([1, 2], ['a', 'b'])) produces [(1, 'a'), (2, 'b')]. It is the standard tool for iterating over multiple sequences in parallel without manual indexing.

What happens when zip() receives lists of different lengths?

By default, zip() stops when the shortest iterable is exhausted. Extra elements in longer iterables are silently discarded. In Python 3.10 and later, you can pass strict=True to raise a ValueError if the lengths differ. Alternatively, use itertools.zip_longest() to pad shorter iterables with a fill value instead of truncating.

How do you unzip a list of tuples in Python?

Use the unpacking operator * with zip(). If you have pairs = [(1, 'a'), (2, 'b'), (3, 'c')], then a, b = zip(*pairs) produces a = (1, 2, 3) and b = ('a', 'b', 'c'). The * unpacks each tuple as a separate argument to zip(), effectively transposing the data from row-oriented to column-oriented.

Is zip() memory efficient for large datasets?

Yes. The zip() function returns a lazy iterator that generates one tuple at a time on demand. It does not store the entire result in memory. This means you can zip() two million-element lists without creating a million tuples upfront. Memory usage stays constant regardless of input size, as long as you iterate rather than converting to a list.

Can you use zip() to create a dictionary?

Yes, and it is one of the most common patterns. Pass a zip() of keys and values to the dict() constructor: dict(zip(keys, values)). This creates a dictionary where each key from the first list is mapped to the corresponding value from the second list. It is concise, readable, and the standard Pythonic way to build dictionaries from parallel lists.

Conclusion

Python's zip() function is a fundamental building block for working with parallel sequences. It replaces verbose index-based loops with clean, readable code that directly expresses the intent of pairing elements together. Whether you are creating dictionaries from key-value lists, transposing matrices, computing dot products, or iterating over multiple columns of data, zip() provides an elegant and efficient solution.

The key points to remember:

  • zip() pairs elements by position and returns a lazy iterator of tuples.
  • It truncates to the shortest iterable by default. Use strict=True (Python 3.10+) to catch mismatches, or zip_longest() to pad shorter sequences.
  • Unzipping with zip(*data) reverses the operation.
  • dict(zip(keys, values)) is the standard way to build dictionaries from parallel lists.
  • zip() is implemented in C in CPython and is typically faster than manual index-based iteration.
  • It works with any number of iterables and any iterable type, including generators and file objects.

Master these patterns and zip() becomes one of the most versatile tools in your Python toolkit -- simplifying code, reducing bugs, and making your intent clear to every reader.
