Lohith

How to Master Joining and Splitting Numpy Arrays: A Comprehensive Guide

In this blog post we look at the NumPy functions concatenate, hstack, vstack, split, hsplit, and vsplit, and show how each one joins or splits arrays along a chosen axis.


numpy.concatenate is a function in the NumPy library used for joining arrays along a specified axis. It takes a sequence of arrays as input and concatenates them together. The axis parameter determines the axis along which the arrays will be joined.

Here's a breakdown of numpy.concatenate:

  • Syntax: numpy.concatenate((a1, a2, ...), axis=0, out=None)

    • a1, a2, ...: Sequence of arrays to be concatenated. They must have the same shape, except along the axis being joined.
    • axis (optional): Specifies the axis along which the arrays will be joined. Default is 0. If axis is None, the arrays are flattened before being joined.
    • out (optional): If provided, the result will be placed into this array. It must have the shape the result would otherwise have; values are cast to its dtype if necessary.
  • Returns: The concatenated array.
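
The out parameter described above can be sketched as follows. This is a minimal illustration, not part of the original example; the names a, b, and buf are hypothetical. The integer inputs are written into a preallocated float array, so the values are cast on the way in:

```python
import numpy as np

# Hypothetical inputs: two integer arrays with the same number of columns
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])

# Preallocate a float buffer with the shape the result will have: (3, 2)
buf = np.empty((3, 2), dtype=np.float64)

# The integer values are cast to float64 as they are written into buf
np.concatenate((a, b), axis=0, out=buf)

print(buf)
print(buf.dtype)  # float64
```

Reusing a buffer like this can avoid allocating a new array on every call, which may matter in tight loops.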

Let's illustrate this with an example:

import numpy as np

# Creating two arrays
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

array2 = np.array([[7, 8, 9],
                   [10, 11, 12]])

# Concatenating along axis 0 (vertical stacking)
result_vertical = np.concatenate((array1, array2), axis=0)

print("Result of vertical concatenation:")
print(result_vertical)
print()

# Concatenating along axis 1 (horizontal stacking)
result_horizontal = np.concatenate((array1, array2), axis=1)

print("Result of horizontal concatenation:")
print(result_horizontal)
  • We create two NumPy arrays, array1 and array2, each containing two rows and three columns.
  • To concatenate these arrays, we use the np.concatenate function. We pass the arrays to be concatenated as a tuple (array1, array2), and specify the axis along which the concatenation will be performed.
  • When axis=0, the arrays are stacked vertically, meaning one is placed below the other. This is often referred to as vertical concatenation or stacking along rows.
  • When axis=1, the arrays are stacked horizontally, meaning one is placed beside the other. This is often referred to as horizontal concatenation or stacking along columns.

Output:

Result of vertical concatenation:
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]

Result of horizontal concatenation:
[[ 1  2  3  7  8  9]
 [ 4  5  6 10 11 12]]

This demonstrates how numpy.concatenate works to join arrays together. You can specify the axis parameter to change how the arrays are concatenated: 0 for vertical concatenation, 1 for horizontal concatenation, and so on for higher dimensions.
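
To make the "higher dimensions" case concrete, here is a small sketch with 3-D arrays (the names x and y are chosen for illustration). The inputs must agree on every axis except the one being joined; otherwise NumPy raises a ValueError:

```python
import numpy as np

x = np.zeros((2, 3, 4))
y = np.ones((2, 3, 5))

# Shapes match on axes 0 and 1, so joining along axis 2 is allowed
joined = np.concatenate((x, y), axis=2)
print(joined.shape)  # (2, 3, 9)

# Joining along axis 0 fails because the last dimension differs (4 vs 5)
try:
    np.concatenate((x, y), axis=0)
except ValueError as exc:
    print("ValueError:", exc)

# axis=None flattens every input before joining
flat = np.concatenate((x, y), axis=None)
print(flat.shape)  # (54,)
```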


numpy.hstack is a function in the NumPy library used to stack arrays horizontally (i.e., column-wise) to create a single array. For arrays with two or more dimensions it joins along the second axis, so the inputs must have the same number of rows; for 1-D arrays it joins them end to end along the first axis.

Here's a simple example to illustrate its usage:

import numpy as np

# Creating two arrays
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

array2 = np.array([[7, 8],
                   [9, 10]])

# Stacking arrays horizontally
stacked_array = np.hstack((array1, array2))

print("Array 1:")
print(array1)
print("\nArray 2:")
print(array2)
print("\nStacked Array:")
print(stacked_array)

Output:

Array 1:
[[1 2 3]
 [4 5 6]]

Array 2:
[[ 7  8]
 [ 9 10]]

Stacked Array:
[[ 1  2  3  7  8]
 [ 4  5  6  9 10]]

As you can see, np.hstack() stacks the arrays array1 and array2 horizontally, resulting in a new array where the elements of array2 are appended as new columns to array1.
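
As a rough sketch of how hstack relates to concatenate (the variable names here are illustrative only): with 1-D inputs it joins the arrays end to end, and with 2-D inputs it gives the same result as np.concatenate with axis=1.

```python
import numpy as np

# 1-D inputs: hstack joins them end to end along the only axis
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5])
print(np.hstack((v1, v2)))  # [1 2 3 4 5]

# 2-D inputs: hstack gives the same result as concatenate with axis=1
m1 = np.array([[1, 2, 3], [4, 5, 6]])
m2 = np.array([[7, 8], [9, 10]])
same = np.array_equal(np.hstack((m1, m2)),
                      np.concatenate((m1, m2), axis=1))
print(same)  # True
```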


numpy.vstack is a function in the NumPy library in Python that is used to vertically stack arrays. It takes a sequence of arrays as input and stacks them vertically to form a single array. This function is particularly useful when you want to concatenate arrays along the vertical axis.

Here's the syntax:

numpy.vstack(tup)
  • tup: A sequence of arrays to be stacked vertically. They must have the same shape along every axis except the first (for 2-D arrays, the same number of columns). 1-D arrays must all have the same length; each one is treated as a single row.

Here's a simple example demonstrating how numpy.vstack works:

import numpy as np

# Create two arrays
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

array2 = np.array([[7, 8, 9],
                   [10, 11, 12]])

# Stack the arrays vertically
result = np.vstack((array1, array2))

print("Stacked array:")
print(result)

Output:

Stacked array:
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]

In this example, numpy.vstack vertically stacks array1 and array2, resulting in a new array where the rows of array2 are placed below the rows of array1.
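
One case worth a quick sketch is 1-D input, where vstack and concatenate behave differently. The names r1 and r2 below are made up for illustration:

```python
import numpy as np

r1 = np.array([1, 2, 3])
r2 = np.array([4, 5, 6])

# Each 1-D array of length 3 is treated as a row of shape (1, 3)
stacked = np.vstack((r1, r2))
print(stacked)
print(stacked.shape)  # (2, 3)

# concatenate on the same 1-D inputs joins them end to end instead
joined = np.concatenate((r1, r2))
print(joined.shape)  # (6,)
```

So vstack is a convenient way to build a 2-D table from a list of equal-length 1-D rows.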


numpy.split is a function in the NumPy library in Python used to split an array into multiple sub-arrays along a specified axis. It takes three parameters:

  1. ary: The array to be split.
  2. indices_or_sections: If it's an integer N, the array is divided into N equal sub-arrays along the axis; if an equal division is not possible, a ValueError is raised. If it's a 1-D array of sorted integers, it indicates the indices at which the array is split.
  3. axis: The axis along which the array is split. Default is 0 (along the rows).

Here's the syntax:

numpy.split(ary, indices_or_sections, axis=0)

And here's an example to illustrate how numpy.split works:

import numpy as np

# Create an array
arr = np.arange(9)

# Split the array into three equal sub-arrays
sub_arrays = np.split(arr, 3)

print("Sub-arrays:")
for sub_arr in sub_arrays:
    print(sub_arr)

Output:

Sub-arrays:
[0 1 2]
[3 4 5]
[6 7 8]

In this example, the numpy.split function splits the nine-element array arr into three sub-arrays of equal size along the first axis, since axis=0 by default. The length must be divisible by the number of sections: np.split(np.arange(10), 3) raises a ValueError, and numpy.array_split is the variant that allows unequal pieces. If you want to split along a different axis, you can specify the axis parameter.
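
The other forms of numpy.split can be sketched as below. This is an illustrative example rather than part of the original post, and the names parts, uneven, and grid are hypothetical. It shows splitting at explicit indices, what happens when an integer count does not divide the length, and splitting a 2-D array along axis=1:

```python
import numpy as np

arr = np.arange(10)

# Split at indices 3 and 7: the pieces are arr[:3], arr[3:7], arr[7:]
parts = np.split(arr, [3, 7])
print([p.tolist() for p in parts])   # [[0, 1, 2], [3, 4, 5, 6], [7, 8, 9]]

# An integer section count must divide the length evenly
try:
    np.split(arr, 3)
except ValueError as exc:
    print("ValueError:", exc)

# np.array_split accepts the same arguments but allows uneven pieces
uneven = np.array_split(arr, 3)
print([p.tolist() for p in uneven])  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Splitting a 2-D array along axis=1 cuts it into column blocks
grid = np.arange(12).reshape(3, 4)
left, right = np.split(grid, 2, axis=1)
print(left.shape, right.shape)       # (3, 2) (3, 2)
```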


numpy.hsplit and numpy.vsplit are functions provided by the NumPy library in Python for splitting arrays into multiple sub-arrays along horizontal and vertical axes, respectively.

  1. numpy.hsplit(ary, indices_or_sections): This function splits an array horizontally (column-wise) into multiple sub-arrays. For arrays with two or more dimensions it is equivalent to numpy.split with axis=1. It takes two parameters:

    • ary: The array to be split.
    • indices_or_sections: It can be an integer specifying the number of equally shaped sub-arrays to create, or it can be a sequence of integers indicating the column indices where the splits occur.
  2. numpy.vsplit(ary, indices_or_sections): This function splits an array vertically (row-wise) into multiple sub-arrays. It is equivalent to numpy.split with axis=0 and requires an array with at least two dimensions. It also takes two parameters:

    • ary: The array to be split.
    • indices_or_sections: Similar to hsplit, it can be an integer specifying the number of equally shaped sub-arrays to create, or it can be a sequence of integers indicating the row indices where the splits occur.

Here's an example demonstrating both functions:

import numpy as np

# Create a 2D array
arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12]])

# Using hsplit to split the array horizontally
hsplit_result = np.hsplit(arr, 2)  # Split into 2 equal parts horizontally
print("Horizontal split:")
for part in hsplit_result:
    print(part)

# Using vsplit to split the array vertically
vsplit_result = np.vsplit(arr, 3)  # Split into 3 equal parts vertically
print("\nVertical split:")
for part in vsplit_result:
    print(part)

Output:

Horizontal split:
[[ 1  2]
 [ 5  6]
 [ 9 10]]
[[ 3  4]
 [ 7  8]
 [11 12]]

Vertical split:
[[1 2 3 4]]
[[5 6 7 8]]
[[ 9 10 11 12]]

In this example, np.hsplit(arr, 2) splits the array arr horizontally into 2 equal parts, resulting in two sub-arrays. np.vsplit(arr, 3) splits the array arr vertically into 3 equal parts, resulting in three sub-arrays.
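
Both functions also accept a list of indices instead of a section count. The sketch below (variable names are illustrative) splits the same kind of 3x4 array at explicit positions and checks that hsplit matches np.split with axis=1:

```python
import numpy as np

arr = np.arange(1, 13).reshape(3, 4)

# Split columns at indices 1 and 3: blocks of width 1, 2, and 1
cols = np.hsplit(arr, [1, 3])
print([c.shape for c in cols])  # [(3, 1), (3, 2), (3, 1)]

# Split rows at index 1: the first row, then the remaining two rows
rows = np.vsplit(arr, [1])
print([r.shape for r in rows])  # [(1, 4), (2, 4)]

# For a 2-D array, hsplit gives the same pieces as split with axis=1
matches = all(
    np.array_equal(h, s)
    for h, s in zip(np.hsplit(arr, 2), np.split(arr, 2, axis=1))
)
print(matches)  # True
```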

In conclusion, NumPy's concatenate, hstack, vstack, split, hsplit, and vsplit functions offer a compact toolkit for organizing data. Whether you are combining arrays or splitting them into smaller segments along horizontal or vertical axes, each operation can be expressed in a single call.
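
As a closing sketch, the joining and splitting functions are inverses of each other when used along the same axis. The array name data and the split points below are arbitrary choices for illustration:

```python
import numpy as np

data = np.arange(12).reshape(3, 4)

# Split by rows, then stack the pieces back vertically
top, bottom = np.vsplit(data, [1])
rows_ok = np.array_equal(np.vstack((top, bottom)), data)
print(rows_ok)  # True

# Split by columns, then stack the pieces back horizontally
left, right = np.hsplit(data, 2)
cols_ok = np.array_equal(np.hstack((left, right)), data)
print(cols_ok)  # True
```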