Arrays of strings into numpy.amax
Solution 1:
Instead of storing your strings as variable-length data in the NumPy array, you could try storing them as Python objects instead. NumPy will treat these as references to the original Python string objects, and you can then treat them like you might expect:
import numpy as np
t = np.array([['one', 'two', 'three'], ['four', 'five', 'six']], dtype=object)
np.min(t)
# gives 'five'
np.max(t)
# gives 'two'
Keep in mind that here, the np.min and np.max calls are ordering the strings lexicographically, so "two" does indeed come after "five". To compare by string length instead, you could create a new NumPy array of the same shape that holds each string's length instead of its reference. You could then call numpy.argmin on that array (which returns the index of the minimum) and look up the string at that index in the original array.
Example code:
# np.vectorize takes a Python function and converts it into a NumPy
# function that is applied element by element to an array
np_len = np.vectorize(len)
lengths = np_len(t)
# gives array([[3, 3, 5], [4, 4, 3]])
flat_idx = lengths.argmin()  # position of the first minimum in the flattened array
# here 0
idx = np.unravel_index(flat_idx, lengths.shape)  # convert it to a (row, column) index
# here row 0, column 0
result = t[idx]
print(result)
# prints one, the first string with the smallest length ("two" and "six" are equally short)
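Since the question is about the maximum as well, here is a rough sketch that wraps the same idea in a helper so it covers both the shortest and the longest string. The function name extreme_by_length and its longest flag are made up for illustration, they are not part of NumPy. It assumes an object array of strings like t above.

```python
import numpy as np

def extreme_by_length(arr, longest=False):
    """Return the first shortest (or longest) string in an object array.

    Hypothetical helper for illustration; not a NumPy function.
    """
    # Element-wise len(); otypes fixes the output dtype to integers
    lengths = np.vectorize(len, otypes=[int])(arr)
    # argmin/argmax without an axis work on the flattened array
    flat_idx = lengths.argmax() if longest else lengths.argmin()
    # Convert the flat position back into a multi-dimensional index
    return arr[np.unravel_index(flat_idx, arr.shape)]

t = np.array([['one', 'two', 'three'], ['four', 'five', 'six']], dtype=object)
shortest = extreme_by_length(t)
longest = extreme_by_length(t, longest=True)
print(shortest, longest)  # one three
```

On ties, argmin and argmax return the first match in row-major order, which is why 'one' wins over 'two' and 'six'. If you do not need the index, plain Python min(t.ravel(), key=len) should give the same string.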