Concatenate Numpy arrays without copying

The memory belonging to a Numpy array must be contiguous. If you allocated the arrays separately, they are randomly scattered in memory, and there is no way to represent them as a view Numpy array.

If you know beforehand how many arrays you need, you can instead start with one big array that you allocate beforehand, and have each of the small arrays be a view to the big array (e.g. obtained by slicing).


Just initialize the array before you fill it with data. If you want you can allocate more space than needed and it will not take up more RAM because of the way numpy works.

A = np.zeros(R,C)
A[row] = [data]

The memory is used only once data is put into the array. Creating a new array from concatenating two will never finish on a dataset of any size, i.e. dataset > 1GB or so.


Not really elegant at all but you can get close to what you want using a tuple to store pointers to the arrays. Now I have no idea how I would use it in the case but I have done things like this before.

>>> X = np.array([[1,2,3]])
>>> Y = np.array([[-1,-2,-3],[4,5,6]])
>>> z = (X, Y)
>>> z[0][:] = 0
>>> z
(array([[0, 0, 0]]), array([[-1, -2, -3],
       [ 4,  5,  6]]))
>>> X
array([[0, 0, 0]])

I had the same problem and ended up doing it reversed, after concatenating normally (with copy) I reassigned the original arrays to become views on the concatenated one:

import numpy as np

def concat_no_copy(arrays):
    """ Concats the arrays and returns the concatenated array 
    in addition to the original arrays as views of the concatenated one.

    Parameters:
    -----------
    arrays: list
        the list of arrays to concatenate
    """
    con = np.concatenate(arrays)

    viewarrays = []
    for i, arr in enumerate(arrays):
        arrnew = con[sum(len(a) for a in arrays[:i]):
                     sum(len(a) for a in arrays[:i + 1])]
        viewarrays.append(arrnew)
        assert all(arr == arrnew)

    # return the view arrays, replace the old ones with these
    return con, viewarrays

You can test it as follows:

def test_concat_no_copy():
    arr1 = np.array([0, 1, 2, 3, 4])
    arr2 = np.array([5, 6, 7, 8, 9])
    arr3 = np.array([10, 11, 12, 13, 14])

    arraylist = [arr1, arr2, arr3]

    con, newarraylist = concat_no_copy(arraylist)

    assert all(con == np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 
                                11, 12, 13, 14]))

    for old, new in zip(arraylist, newarraylist):
        assert all(old == new)