Long-form Pandas Dataframe from wide-form numpy arrays

Suppose I have the following two numpy.ndarrays:

  1. pixels_array.shape = (1000, 28, 28)
  2. labels_array.shape = (1000,)

pixels_array is a 1000-item array of 28 x 28 pixel values, and labels_array is simply a 1000-item list of labels for those pixel values. I am trying to merge these arrays into a long-form dataframe that looks like this (array contents omitted for space):

ID  Label  Pixels
1   9      28x28 array
2   B      28x28 array
3   Q      28x28 array
4   8      28x28 array
5   Z      28x28 array

What is the best way to do this? I have been working on this for about an hour and just cannot get melt to work the way I expect. Sometimes I get a row for each item in each array; other times I get a total of 2 rows. Any help would be appreciated.


Solution 1:

You should be able to do that with the following:

import pandas as pd

# One row per image: each 'Pixels' cell holds a single 28 x 28 array.
df = pd.DataFrame({'Pixels': list(pixels_array),
                   'Label': labels_array.flatten()})
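
For anyone who wants to try this end to end, here is a minimal sketch with synthetic stand-ins for the two arrays. The random data, the label alphabet, and the extra ID column (numbered from 1, as in the table in the question) are assumptions for illustration, not part of the original answer. It also assumes there is exactly one label per image.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the arrays in the question (assumed contents).
rng = np.random.default_rng(0)
pixels_array = rng.integers(0, 256, size=(1000, 28, 28))
labels_array = rng.choice(list("0123456789BQZ"), size=(1000, 1))

df = pd.DataFrame({
    "ID": np.arange(1, len(pixels_array) + 1),  # hypothetical 1-based ID column
    "Label": labels_array.ravel(),              # ravel() also handles shape (1000,)
    "Pixels": list(pixels_array),               # one (28, 28) array per row
})

print(df.shape)                    # rows x columns
print(df.loc[0, "Pixels"].shape)   # shape of a single cell
```

Passing a list of arrays (rather than the 3-D array itself) is what makes pandas store one array object per cell instead of trying to spread the values across columns.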

Solution 2:

What you are asking for is rarely recommended, but one way is to convert the array to a list first, e.g.:

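import numpy as np
import pandas as pd

arr1 = np.random.randint(1, 10, size=(1000, 28, 28))  # stand-in for pixels_array
arr2 = np.random.randn(1000)                          # stand-in for labels_array
df = pd.Series(arr2, name="Label").to_frame()
df['pixels'] = arr1.tolist()  # each cell becomes a nested 28 x 28 list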

Then, if you want, you can convert each cell back to an array:

df['pixels'] = df['pixels'].apply(np.array)

Output:

     Label                                             pixels
0 -0.187183  [[7, 9, 6, 5, 5, 7, 6, 9, 1, 7, 7, 7, 2, 8, 8,...
1  0.360777  [[1, 4, 6, 7, 7, 4, 9, 1, 1, 8, 8, 6, 9, 3, 6,...
2  0.206012  [[7, 4, 8, 3, 4, 3, 8, 9, 1, 9, 6, 8, 7, 5, 3,...
3  0.726619  [[1, 8, 8, 4, 5, 1, 2, 2, 3, 4, 8, 3, 6, 4, 1,...
4  0.801372  [[3, 5, 7, 3, 5, 7, 4, 1, 5, 1, 6, 3, 8, 5, 9,...
(...)
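
If the stacked 3-D array is needed again later, the per-row arrays can be recombined. The following is only a sketch under the same assumptions as above (random stand-in data, a column named pixels); the name restored is made up for illustration.

```python
import numpy as np
import pandas as pd

# Random stand-ins, as in the answer above.
arr1 = np.random.randint(1, 10, size=(1000, 28, 28))
arr2 = np.random.randn(1000)

df = pd.DataFrame({"Label": arr2})
df["pixels"] = arr1.tolist()                 # nested lists, one per row
df["pixels"] = df["pixels"].apply(np.array)  # each cell: (28, 28) ndarray

# Stack the per-row arrays back into a single (1000, 28, 28) array.
restored = np.stack(df["pixels"].tolist())
print(restored.shape)
```

Being able to round-trip like this is one reason to keep the original array around rather than relying on the dataframe as the primary storage for the pixel data.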