NumPy Concatenate: Join Arrays with np.concatenate, vstack, and hstack

Combining arrays is one of the most common operations in numerical computing. You split data for processing, then need to reassemble it. You load features from multiple sources into a single matrix. You stack predictions from multiple models for ensembling -- for example, combining outputs from a Random Forest ensemble. Joining data by looping over Python lists is slow, and the + operator does not join NumPy arrays at all: it adds them element by element.

NumPy provides specialized functions for array joining that operate at C speed and give you precise control over which axis to concatenate along. This guide covers np.concatenate(), np.vstack(), np.hstack(), np.stack(), and their use cases.
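As a quick illustration of why the choice of tool matters, the sketch below contrasts + on Python lists, + on NumPy arrays, and np.concatenate(). The variable names are illustrative only.

```python
import numpy as np

list_a, list_b = [1, 2, 3], [4, 5, 6]
arr_a, arr_b = np.array(list_a), np.array(list_b)

joined_list = list_a + list_b                 # lists: + concatenates
summed = arr_a + arr_b                        # arrays: + adds element-wise, it does NOT join
joined_arr = np.concatenate([arr_a, arr_b])   # arrays: explicit join

print(joined_list)  # [1, 2, 3, 4, 5, 6]
print(summed)       # [5 7 9]
print(joined_arr)   # [1 2 3 4 5 6]
```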

np.concatenate() -- The Core Function

np.concatenate() joins a sequence of arrays along an existing axis.

1D Arrays

import numpy as np
 
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.array([7, 8, 9])
 
result = np.concatenate([a, b, c])
print(result)  # [1 2 3 4 5 6 7 8 9]
print(result.shape)  # (9,)

2D Arrays Along Rows (axis=0)

import numpy as np
 
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
 
# Stack vertically (add rows)
result = np.concatenate([a, b], axis=0)
print(result)
# [[1 2]
#  [3 4]
#  [5 6]
#  [7 8]]
print(result.shape)  # (4, 2)

2D Arrays Along Columns (axis=1)

import numpy as np
 
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
 
# Stack horizontally (add columns)
result = np.concatenate([a, b], axis=1)
print(result)
# [[1 2 5 6]
#  [3 4 7 8]]
print(result.shape)  # (2, 4)

Shape Requirements

All arrays must have the same shape except along the concatenation axis.

import numpy as np
 
a = np.ones((3, 4))    # 3 rows, 4 cols
b = np.zeros((2, 4))   # 2 rows, 4 cols
 
# OK: different row count, same column count, axis=0
result = np.concatenate([a, b], axis=0)
print(result.shape)  # (5, 4)
 
# ERROR: different column counts
c = np.ones((3, 5))
# np.concatenate([a, c], axis=0)  # ValueError: dimensions don't match
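To see the failure rather than leave it commented out, the following sketch catches the exception and then joins the same two arrays along the axis where their shapes do agree. The exact wording of the error message may vary between NumPy versions, so it is printed rather than assumed.

```python
import numpy as np

a = np.ones((3, 4))
c = np.ones((3, 5))

# Joining along axis 0 requires equal column counts (4 vs 5 here), so this fails
try:
    np.concatenate([a, c], axis=0)
except ValueError as exc:
    error_message = str(exc)
    print("concatenate failed:", error_message)

# The same pair does join along axis=1, because the row counts (3) match
wide = np.concatenate([a, c], axis=1)
print(wide.shape)  # (3, 9)
```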

np.vstack() -- Vertical Stacking

np.vstack() stacks arrays vertically (along axis 0). It is equivalent to np.concatenate(arrays, axis=0), except that 1D arrays of shape (N,) are first promoted to single rows of shape (1, N).

import numpy as np
 
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
 
# vstack treats 1D arrays as rows
result = np.vstack([a, b])
print(result)
# [[1 2 3]
#  [4 5 6]]
print(result.shape)  # (2, 3)
 
# With 2D arrays
c = np.array([[1, 2], [3, 4]])
d = np.array([[5, 6]])
result = np.vstack([c, d])
print(result)
# [[1 2]
#  [3 4]
#  [5 6]]
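The relationship between vstack and concatenate for 1D inputs can be checked directly. This is a small sketch that uses np.atleast_2d to perform the row promotion by hand; it is meant to illustrate the behavior, not to mirror NumPy's internal implementation.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

via_vstack = np.vstack([a, b])

# Promote each 1D array to shape (1, 3), then join along axis 0
via_concat = np.concatenate([np.atleast_2d(a), np.atleast_2d(b)], axis=0)

print(np.array_equal(via_vstack, via_concat))  # True

# Plain concatenate on the 1D inputs gives a flat array instead
flat = np.concatenate([a, b])
print(flat.shape)  # (6,)
```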

np.hstack() -- Horizontal Stacking

np.hstack() stacks arrays horizontally (along axis 1 for 2D, axis 0 for 1D).

import numpy as np
 
# 1D arrays: concatenates like np.concatenate
a = np.array([1, 2, 3])
b = np.array([4, 5])
result = np.hstack([a, b])
print(result)  # [1 2 3 4 5]
 
# 2D arrays: adds columns
c = np.array([[1], [2], [3]])
d = np.array([[4], [5], [6]])
result = np.hstack([c, d])
print(result)
# [[1 4]
#  [2 5]
#  [3 6]]
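One consequence of the 1D rule is easy to miss: hstack on 1D arrays never produces columns. If the goal is to place 1D arrays side by side as columns, np.column_stack is the function to reach for. A short sketch of the difference:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

flat = np.hstack([a, b])        # 1D inputs are joined end to end
cols = np.column_stack([a, b])  # 1D inputs become columns

print(flat.shape)  # (6,)
print(cols.shape)  # (3, 2)

# For 2D inputs, hstack matches concatenate along axis=1
c = a.reshape(-1, 1)
d = b.reshape(-1, 1)
same = np.array_equal(np.hstack([c, d]), np.concatenate([c, d], axis=1))
print(same)  # True
```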

np.stack() -- Creates a New Axis

Unlike concatenate, stack joins arrays along a new axis, increasing the dimension count by 1.

import numpy as np
 
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
 
# Stack along new axis 0 (default)
result = np.stack([a, b])
print(result)
# [[1 2 3]
#  [4 5 6]]
print(result.shape)  # (2, 3)
 
# Stack along new axis 1
result = np.stack([a, b], axis=1)
print(result)
# [[1 4]
#  [2 5]
#  [3 6]]
print(result.shape)  # (3, 2)

All input arrays must have the same shape for np.stack().
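One way to think about np.stack() is as "add a length-1 axis to every input, then concatenate along it". The sketch below checks that equivalence with np.expand_dims and also shows what happens when the input shapes differ; the error text is printed rather than assumed, since it may change between versions.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

stacked = np.stack([a, b], axis=0)

# Same result built by hand: (3,) -> (1, 3), then join along the new axis
manual = np.concatenate([np.expand_dims(a, 0), np.expand_dims(b, 0)], axis=0)
print(np.array_equal(stacked, manual))  # True

# Inputs with different shapes are rejected
try:
    np.stack([a, np.array([1, 2])])
except ValueError as exc:
    stack_error = str(exc)
    print("stack failed:", stack_error)
```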

Function Comparison

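| Function | Axis | Creates New Axis? | 1D Input Handling |
| --- | --- | --- | --- |
| np.concatenate | Existing (default 0) | No | Concatenates as-is |
| np.vstack | 0 (vertical) | No | Treats 1D as row |
| np.hstack | 1 (horizontal); 0 for 1D input | No | Concatenates 1D arrays end to end |
| np.stack | New axis | Yes (+1 dimension) | Creates new dimension |
| np.column_stack | 1 | No | Treats 1D as column |
| np.row_stack | 0 | No | Same as vstack (alias deprecated in NumPy 2.0; prefer np.vstack) |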
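The differences in the comparison are easiest to see by feeding the same two 1D arrays to each function and comparing the output shapes. The sketch below does that; np.row_stack is left out because it behaves like np.vstack.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

shapes = {
    "concatenate": np.concatenate([a, b]).shape,    # (6,)
    "vstack": np.vstack([a, b]).shape,              # (2, 3)
    "hstack": np.hstack([a, b]).shape,              # (6,)
    "stack": np.stack([a, b]).shape,                # (2, 3)
    "column_stack": np.column_stack([a, b]).shape,  # (3, 2)
}

for name, shape in shapes.items():
    print(f"np.{name:<13} -> {shape}")
```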

Practical Examples

Combine Features for Machine Learning

import numpy as np
 
# Feature arrays from different sources
ages = np.array([25, 30, 35, 40])
incomes = np.array([50000, 60000, 75000, 90000])
scores = np.array([720, 680, 750, 800])
 
# Stack as columns to create feature matrix
X = np.column_stack([ages, incomes, scores])
print(X)
# [[   25 50000   720]
#  [   30 60000   680]
#  [   35 75000   750]
#  [   40 90000   800]]
print(X.shape)  # (4, 3)

Batch Processing Results

import numpy as np
 
# Process data in batches, combine results
results = []
for batch_start in range(0, 100, 25):
    batch = np.random.randn(25, 10)  # 25 samples, 10 features
    processed = batch * 2 + 1  # Some processing
    results.append(processed)
 
# Combine all batches vertically
all_results = np.vstack(results)
print(all_results.shape)  # (100, 10)
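Collecting batches in a list and joining once at the end, as above, avoids the repeated copying you would get from calling np.concatenate inside the loop. When the total size is known in advance, another option is to preallocate the output and fill it slice by slice. The following is a sketch under that assumption; the batch sizes and the seeded random generator are illustrative.

```python
import numpy as np

n_batches, batch_size, n_features = 4, 25, 10
rng = np.random.default_rng(0)  # seeded generator, for reproducibility only

# Allocate the full output once, then write each batch into its slice
all_results = np.empty((n_batches * batch_size, n_features))
for i in range(n_batches):
    batch = rng.standard_normal((batch_size, n_features))
    start = i * batch_size
    all_results[start:start + batch_size] = batch * 2 + 1  # same processing as above

print(all_results.shape)  # (100, 10)
```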

Image Processing -- Combine Channels

import numpy as np
 
height, width = 100, 100
 
# Separate color channels
red = np.random.randint(0, 256, (height, width))
green = np.random.randint(0, 256, (height, width))
blue = np.random.randint(0, 256, (height, width))
 
# Combine into RGB image
rgb_image = np.stack([red, green, blue], axis=2)
print(rgb_image.shape)  # (100, 100, 3)
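A quick sanity check on a stacked image is to index the new last axis and confirm each channel comes back unchanged. The sketch below uses tiny 2x2 channels with constant values so the result is easy to verify, and also shows that np.dstack gives the same result for 2D inputs.

```python
import numpy as np

# Tiny constant channels (illustrative values)
red = np.full((2, 2), 255)
green = np.zeros((2, 2), dtype=int)
blue = np.full((2, 2), 128)

rgb = np.stack([red, green, blue], axis=-1)  # axis=-1 is the new last axis
print(rgb.shape)  # (2, 2, 3)

# Each channel can be recovered by indexing the last axis
print(np.array_equal(rgb[..., 0], red))   # True
print(np.array_equal(rgb[..., 2], blue))  # True

# For 2D inputs, np.dstack produces the same array
same_as_dstack = np.array_equal(rgb, np.dstack([red, green, blue]))
print(same_as_dstack)  # True
```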

Visualizing Combined Data

After concatenating arrays from different sources, PyGWalker lets you explore the combined dataset interactively in Jupyter:

import pandas as pd
import pygwalker as pyg
 
# X is the (4, 3) feature matrix built in "Combine Features for Machine Learning"
# Convert the concatenated array to a DataFrame
df = pd.DataFrame(X, columns=['age', 'income', 'score'])
walker = pyg.walk(df)

FAQ

What is the difference between np.concatenate and np.stack?

np.concatenate joins arrays along an existing axis without changing the number of dimensions. np.stack joins arrays along a new axis, adding one dimension. For example, stacking two (3,) arrays with concatenate gives (6,), while stack gives (2, 3).

What does axis mean in np.concatenate?

The axis parameter specifies which dimension to join along. axis=0 concatenates along rows (adding more rows), axis=1 along columns (adding more columns). For 2D arrays, axis=0 is vertical stacking and axis=1 is horizontal stacking.
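The same rule extends beyond two dimensions: only the chosen axis may differ in length. A brief sketch with 3D arrays, which also shows negative axis values and axis=None (which flattens the inputs before joining):

```python
import numpy as np

x = np.zeros((2, 3, 4))
y = np.ones((2, 3, 5))

# Only the last axis differs (4 vs 5), so that is the axis to join along
joined_last = np.concatenate([x, y], axis=2)
joined_neg = np.concatenate([x, y], axis=-1)  # -1 also means "last axis"
print(joined_last.shape)  # (2, 3, 9)
print(joined_neg.shape)   # (2, 3, 9)

# axis=None flattens every input first: 24 + 30 elements
flat = np.concatenate([x, y], axis=None)
print(flat.shape)  # (54,)
```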

When should I use vstack vs hstack vs concatenate?

Use vstack to add rows (stack vertically), hstack to add columns (stack horizontally), and concatenate when you need to specify an arbitrary axis. vstack and hstack are convenience functions that handle 1D arrays more intuitively than concatenate.

Why do I get a ValueError when concatenating arrays?

All arrays must have the same shape except along the concatenation axis. If you're concatenating along axis=0, all arrays must have the same number of columns. Check shapes with array.shape before concatenating.
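When many arrays are involved, it can help to find the offending one before calling concatenate. The helper below, find_shape_mismatches, is hypothetical (it is not part of NumPy) and assumes every input has at least one dimension; it compares each array against the first one on every axis except the join axis.

```python
import numpy as np

def find_shape_mismatches(arrays, axis=0):
    """Hypothetical helper: list (index, shape) of arrays that cannot be
    joined with the first array along `axis`."""
    reference = arrays[0].shape
    axis = axis % len(reference)  # normalise negative axes

    def other_dims(shape):
        # Shape with the join axis removed
        return shape[:axis] + shape[axis + 1:]

    return [
        (i, arr.shape)
        for i, arr in enumerate(arrays)
        if arr.ndim != len(reference) or other_dims(arr.shape) != other_dims(reference)
    ]

arrays = [np.ones((3, 4)), np.zeros((2, 4)), np.ones((3, 5))]
mismatches = find_shape_mismatches(arrays, axis=0)
print(mismatches)  # [(2, (3, 5))]
```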

How do I concatenate arrays of different dimensions?

Reshape the arrays to have compatible dimensions first. Use np.expand_dims(array, axis) to add a dimension, or array.reshape(new_shape). For example, to append a 1D array as a new row of a 2D array with np.concatenate, reshape it first with array.reshape(1, -1). np.vstack performs this promotion for you.
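A minimal sketch of both cases, adding a 1D array as a new row and as a new column of a 2D array; the array values are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])   # shape (2, 3)
row = np.array([7, 8, 9])        # shape (3,)
col = np.array([10, 20])         # shape (2,)

# As a new row: (3,) -> (1, 3), then join along axis 0
with_row = np.concatenate([matrix, row.reshape(1, -1)], axis=0)
print(with_row.shape)  # (3, 3)

# As a new column: (2,) -> (2, 1), then join along axis 1
with_col = np.concatenate([matrix, np.expand_dims(col, axis=1)], axis=1)
print(with_col)
# [[ 1  2  3 10]
#  [ 4  5  6 20]]
```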

Conclusion

NumPy provides a complete toolkit for joining arrays: np.concatenate() for general-purpose joining along any axis, np.vstack()/np.hstack() for intuitive vertical/horizontal stacking, and np.stack() when you need to create a new dimension. Remember that concatenate joins along existing axes while stack creates new ones. Use np.reshape when you need to fix dimension mismatches before concatenating. Generate the arrays themselves with np.arange or np.linspace. Match dimensions carefully, and use vstack/hstack for the most readable code when working with common row/column operations.
