Find the union of two NumPy arrays
Finding the union of two NumPy arrays means combining both arrays into one while removing any duplicate values. For example, if you have two arrays [10, 20, 30] and [20, 30, 40], the union would be [10, 20, 30, 40]. Let's explore different ways to do this efficiently.
Using np.union1d()
If you want to combine two arrays without duplicates, np.union1d() is the most direct way. It returns a sorted 1-D array of the unique elements from both arrays, handling the merge and deduplication in a single call.
import numpy as np
a = np.array([10, 20, 30, 40])
b = np.array([20, 40, 60, 80])
res = np.union1d(a, b)
print(res)
Output
[10 20 30 40 60 80]
Explanation: np.union1d(a, b) merges two arrays, removes duplicates automatically and returns a sorted array of unique elements from both.
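One detail worth seeing in code: np.union1d() always returns a 1-D result, so inputs that are not already 1-D are flattened first. The short sketch below uses made-up sample values (not from the example above) to illustrate this.

```python
import numpy as np

# Illustrative inputs: a is 2-D, b is 1-D
a = np.array([[1, 2], [3, 4]])
b = np.array([3, 5, 1])

# union1d flattens a before computing the union
res = np.union1d(a, b)
print(res)       # [1 2 3 4 5]
print(res.ndim)  # 1
```

If you need to keep the original shape, the union is probably not the operation you want, since a set of unique values has no natural 2-D layout.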
Using np.concatenate()
First, combine the arrays using np.concatenate(), which joins them as-is and keeps any duplicates. Then apply np.unique(), which removes the duplicates and returns the remaining values in sorted order. This produces the same result as np.union1d() but spells out each step, which makes it useful for understanding how the pieces fit together.
import numpy as np
a = np.array([10, 20, 30, 40])
b = np.array([20, 40, 60, 80])
c = np.concatenate((a, b))
res = np.unique(c)
print(res)
Output
[10 20 30 40 60 80]
Explanation: np.concatenate() joins arrays a and b, then np.unique() removes the duplicates and returns a sorted array of the unique elements.
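Splitting the work into two steps also lets you adjust the second one. As a minimal sketch (with made-up sample values), the return_index option of np.unique() can be used to keep the union in order of first appearance instead of sorted order. This is a variation on the method above, not part of it.

```python
import numpy as np

# Illustrative inputs in a deliberately unsorted order
a = np.array([30, 10, 20])
b = np.array([20, 50, 10, 40])

c = np.concatenate((a, b))  # [30 10 20 20 50 10 40]

# return_index=True also returns the index of each value's first occurrence in c
_, first_idx = np.unique(c, return_index=True)

# Sorting those indices restores the order in which values first appeared
ordered = c[np.sort(first_idx)]
print(ordered)  # [30 10 20 50 40]
```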
Using functools.reduce()
np.union1d() accepts only two arrays at a time. To combine more than two, use functools.reduce() with np.union1d(): it applies the union pairwise across the whole list and leaves you with a single sorted array of unique values.
import numpy as np
from functools import reduce
a = [
    np.array([1, 2, 3]),
    np.array([3, 4, 5]),
    np.array([5, 6, 7]),
    np.array([0, 2, 4])
]
res = reduce(np.union1d, a)
print(res)
Output
[0 1 2 3 4 5 6 7]
Explanation: reduce() with np.union1d combines multiple NumPy arrays by merging them pairwise and removing duplicates at each step.
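For comparison, the same result can be obtained without reduce() by concatenating all the arrays once and calling np.unique() a single time. This is an alternative sketch rather than the method described above; the variable name arrays is only illustrative.

```python
import numpy as np

# Same four arrays as in the reduce() example
arrays = [
    np.array([1, 2, 3]),
    np.array([3, 4, 5]),
    np.array([5, 6, 7]),
    np.array([0, 2, 4]),
]

# np.concatenate accepts a whole sequence of arrays, so one call joins them all
res = np.unique(np.concatenate(arrays))
print(res)  # [0 1 2 3 4 5 6 7]
```

This version deduplicates once at the end instead of after every pairwise merge, which may be preferable when the list of arrays is long.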
Using set()
Another option is to combine the arrays with np.concatenate() and pass the result to Python's built-in set(), which discards duplicates. Because a set has no guaranteed order, the result is converted back to a NumPy array and then sorted so the output is predictable.
import numpy as np
a = np.array([10, 20, 30])
b = np.array([20, 30, 40])
res = np.array(list(set(np.concatenate((a, b)))))
res.sort()
print(res)
Output
[10 20 30 40]
Explanation: This code joins arrays a and b with np.concatenate(), removes duplicates using set(), converts back to a NumPy array, and sorts the result.
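The same idea can be written with Python's set union operator, which some readers may find easier to follow. The sketch below is a variation on the code above: it converts each array to a set separately and sorts with sorted() before building the final array. The name union_set is only illustrative.

```python
import numpy as np

a = np.array([10, 20, 30])
b = np.array([20, 30, 40])

# tolist() yields plain Python ints; | is the set union operator
union_set = set(a.tolist()) | set(b.tolist())

# Sets are unordered, so sort before converting back to an array
res = np.array(sorted(union_set))
print(res)  # [10 20 30 40]
```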