Correlation of 2 time dependent multidimensional signals (signal vectors)

Solution 1:

Based on this solution to finding correlation matrix between two 2D arrays, we can have a similar one for finding correlation vector that computes correlation between corresponding rows in the two arrays. The implementation would look something like this -

def corr2_coeff_rowwise(A,B):
    # Rowwise mean of input arrays & subtract from input arrays themeselves
    A_mA = A - A.mean(1)[:,None]
    B_mB = B - B.mean(1)[:,None]

    # Sum of squares across rows
    ssA = (A_mA**2).sum(1);
    ssB = (B_mB**2).sum(1);

    # Finally get corr coeff
    return np.einsum('ij,ij->i',A_mA,B_mB)/np.sqrt(ssA*ssB)

We can further optimize the part to get ssA and ssB by introducing einsum magic there too!

def corr2_coeff_rowwise2(A,B):
    A_mA = A - A.mean(1)[:,None]
    B_mB = B - B.mean(1)[:,None]
    ssA = np.einsum('ij,ij->i',A_mA,A_mA)
    ssB = np.einsum('ij,ij->i',B_mB,B_mB)
    return np.einsum('ij,ij->i',A_mA,B_mB)/np.sqrt(ssA*ssB)

Sample run -

In [164]: M1 = np.array ([
     ...:     [1, 2, 3, 4],
     ...:     [2, 3, 1, 4.5]
     ...: ])
     ...: 
     ...: M2 = np.array ([
     ...:     [10, 20, 33, 40],
     ...:     [20, 35, 15, 40]
     ...: ])
     ...: 

In [165]: corr2_coeff_rowwise(M1, M2)
Out[165]: array([ 0.99411402,  0.96131896])

In [166]: corr2_coeff_rowwise2(M1, M2)
Out[166]: array([ 0.99411402,  0.96131896])

Runtime test -

In [97]: M1 = np.random.rand(256,200)
    ...: M2 = np.random.rand(256,200)
    ...: 

In [98]: out1 = np.diagonal (np.corrcoef (M1, M2), M1.shape [0])
    ...: out2 = corr2_coeff_rowwise(M1, M2)
    ...: out3 = corr2_coeff_rowwise2(M1, M2)
    ...: 

In [99]: np.allclose(out1, out2)
Out[99]: True

In [100]: np.allclose(out1, out3)
Out[100]: True

In [101]: %timeit np.diagonal (np.corrcoef (M1, M2), M1.shape [0])
     ...: %timeit corr2_coeff_rowwise(M1, M2)
     ...: %timeit corr2_coeff_rowwise2(M1, M2)
     ...: 
100 loops, best of 3: 9.5 ms per loop
1000 loops, best of 3: 554 µs per loop
1000 loops, best of 3: 430 µs per loop

20x+ speedup there with einsum over the built-in np.corrcoef!