numpy elementwise outer product
I want to do the element-wise outer product of two 2d arrays in numpy.
A.shape = (100, 3) # A numpy ndarray
B.shape = (100, 5) # A numpy ndarray
C = element_wise_outer_product(A, B) # A function that does the trick
C.shape = (100, 3, 5) # This should be the result
C[i] = np.outer(A[i], B[i]) # This should be the result
A naive implementation can the following.
tmp = []
for i in range(len(A):
outer_product = np.outer(A[i], B[i])
tmp.append(outer_product)
C = np.array(tmp)
A better solution inspired from stack overflow.
big_outer = np.multiply.outer(A, B)
tmp = np.swapaxes(tmp, 1, 2)
C_tmp = [tmp[i][i] for i in range(len(A)]
C = np.array(C_tmp)
I'm looking for a vectorized implementation that gets rid the for loop. Does anyone have an idea? Thank you!
Solution 1:
Extend A
and B
to 3D
keeping their first axis aligned and introducing new axes along the third and second ones respectively with None/np.newaxis
and then multiply with each other. This would allow broadcasting
to come into play for a vectorized solution.
Thus, an implementation would be -
A[:,:,None]*B[:,None,:]
We could shorten it a bit by using ellipsis
for A's : :,:
and skip listing the leftover last axis with B
, like so -
A[...,None]*B[:,None]
As another vectorized approach we could also use np.einsum
, which might be more intuitive once we get past the string notation syntax and consider those notations being representatives of the iterators involved in a naive loopy implementation, like so -
np.einsum('ij,ik->ijk',A,B)
Solution 2:
Another solution using np.lib.stride_tricks.as_strided()
..
Here the strategy is to, in essence, build a (100, 3, 5)
array As
and a (100, 3, 5)
array Bs
such that the normal element-wise product of these arrays will produce the desired result. Of course, we don't actually build big memory consuming arrays, thanks to as_strided()
. (as_strided()
is like a blueprint that tells NumPy how you'd map data from the original arrays to construct As
and Bs
.)
def outer_prod_stride(A, B):
"""stride trick"""
a = A.shape[-1]
b = B.shape[-1]
d = A.strides[-1]
new_shape = A.shape + (b,)
As = np.lib.stride_tricks.as_strided(A, shape=new_shape, strides=(a*d, d, 0))
Bs = np.lib.stride_tricks.as_strided(B, shape=new_shape, strides=(b*d, 0, d))
return As * Bs
Timings
def outer_prod_broadcasting(A, B):
"""Broadcasting trick"""
return A[...,None]*B[:,None]
def outer_prod_einsum(A, B):
"""einsum() trick"""
return np.einsum('ij,ik->ijk',A,B)
def outer_prod_stride(A, B):
"""stride trick"""
a = A.shape[-1]
b = B.shape[-1]
d = A.strides[-1]
new_shape = A.shape + (b,)
As = np.lib.stride_tricks.as_strided(A, shape=new_shape, strides=(a*d, d, 0))
Bs = np.lib.stride_tricks.as_strided(B, shape=new_shape, strides=(b*d, 0, d))
return As * Bs
%timeit op1 = outer_prod_broadcasting(A, B)
2.54 µs ± 436 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit op2 = outer_prod_einsum(A, B)
3.03 µs ± 637 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit op3 = outer_prod_stride(A, B)
16.6 µs ± 5.39 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Seems my stride trick solution is slower than both @Divkar's solutions. ..still an interesting method worth knowing though.