TypeError: only length-1 arrays can be converted to Python scalars when plotting
I have the following Python code:
import numpy as np
import matplotlib.pyplot as plt
def f(x):
    return np.int(x)
x = np.arange(1, 15.1, 0.1)
plt.plot(x, f(x))
plt.show()
And I get this error:
TypeError: only length-1 arrays can be converted to Python scalars
How can I fix it?
The error "only length-1 arrays can be converted to Python scalars" is raised when the function expects a single value but you pass an array instead.
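A minimal sketch of that situation, without any plotting. The variable names here are made up for illustration; the point is only that int() works on one value and fails on an array with several elements.

```python
import numpy as np

scalar = np.float64(3.7)
arr = np.array([1.2, 2.7, 3.5])

# A single value converts fine; the fractional part is dropped.
converted = int(scalar)

# An array with several elements cannot become one Python int.
try:
    int(arr)
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```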
If you look at the call signature of np.int (an alias for the built-in int, removed in NumPy 1.24), you'll see that it accepts a single value, not an array. In general, if you want to apply a function that accepts a single element to every element in an array, you can use np.vectorize:
import numpy as np
import matplotlib.pyplot as plt
def f(x):
    # np.int was only an alias for the built-in int and was removed in NumPy 1.24
    return int(x)
f2 = np.vectorize(f)
x = np.arange(1, 15.1, 0.1)
plt.plot(x, f2(x))
plt.show()
You can skip the definition of f(x) and just pass the built-in int (which np.int aliased) to the vectorize function: f2 = np.vectorize(int).
Note that np.vectorize is just a convenience function and basically a for loop. That will be inefficient over large arrays. Whenever you have the possibility, use truly vectorized functions or methods (like astype(int), as @FFT suggests).
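As a rough check that the two approaches agree, here is a small sketch that runs both on the array from the question and compares the results. It only shows that the outputs match; it says nothing about speed.

```python
import numpy as np

x = np.arange(1, 15.1, 0.1)

# np.vectorize calls int() once per element in a Python-level loop.
looped = np.vectorize(int)(x)

# astype does the whole conversion inside NumPy in one call.
native = x.astype(int)

same = np.array_equal(looped, native)
print(same)
```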
Use:
x.astype(int)
Here is the reference.
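One thing worth knowing before relying on astype(int): like int(), it drops the fractional part rather than rounding down, which only differs for negative numbers. A short sketch with made-up sample values:

```python
import numpy as np

values = np.array([-1.7, -0.2, 0.9, 2.5])

# astype(int) truncates toward zero, the same way int() does.
truncated = values.astype(int)

# np.floor rounds down first, so negative values end up one lower.
floored = np.floor(values).astype(int)

print(truncated.tolist())
print(floored.tolist())
```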
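dataframe['column'].squeeze()
should solve this when the column holds a single value. It reduces a one-element Series to a scalar, which int() or float() can then convert; a column with more than one element comes back unchanged and still needs an element-wise conversion such as astype(int).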
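A small sketch of how squeeze() behaves, assuming pandas is installed. The frame and the column name are hypothetical, chosen only to mirror the snippet above.

```python
import pandas as pd

# Hypothetical one-row frame; "column" is a placeholder name.
dataframe = pd.DataFrame({"column": [3.7]})

# One element: squeeze() returns a scalar, so int() works on it.
single = dataframe["column"].squeeze()
value = int(single)

# Several elements: squeeze() returns the Series unchanged.
longer = pd.Series([1.5, 2.5, 3.5])
still_series = longer.squeeze()

# Element-wise conversion is what a multi-element column needs.
as_ints = longer.astype(int)
```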
Take note of what is printed for x. You are trying to convert an array (basically just a list) into an int. A length-1 array would be an array holding a single number, which I assume numpy just treats as a float. You could build the list yourself in a plain Python loop, but it's not a purely-numpy solution.
EDIT: I was involved in a post a couple of weeks back where numpy was slower at an operation than I had expected, and I realised I had fallen into a default mindset that numpy was always the way to go for speed. Since my answer was not as clean as ayhan's, I thought I'd use this space to show another such instance: here, vectorize is around 10% slower than building a list in Python. I don't know enough about numpy to explain why this is the case, but perhaps someone else does?
import numpy as np
import matplotlib.pyplot as plt
import datetime

time_start = datetime.datetime.now()

# My original answer: build a plain Python list element by element
def f(x):
    rebuilt_to_plot = []
    for num in x:
        # built-in int; np.int was removed in NumPy 1.24
        rebuilt_to_plot.append(int(num))
    return rebuilt_to_plot

for t in range(10000):
    x = np.arange(1, 15.1, 0.1)
    plt.plot(x, f(x))

time_end = datetime.datetime.now()

# Answer by ayhan: wrap the scalar function with np.vectorize
def f_1(x):
    return int(x)

for t in range(10000):
    f2 = np.vectorize(f_1)
    x = np.arange(1, 15.1, 0.1)
    plt.plot(x, f2(x))

time_end_2 = datetime.datetime.now()

# Elapsed time for the list version, then for the vectorize version
print(time_end - time_start)
print(time_end_2 - time_end)
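The timings above also include the cost of plt.plot, which may hide the difference between the conversions themselves. Below is a rough sketch that times only the conversion step with timeit and adds astype(int) for comparison. The helper names (with_list, with_vectorize, with_astype) and the repeat count are arbitrary choices for illustration, and the numbers will vary by machine and NumPy version.

```python
import timeit
import numpy as np

x = np.arange(1, 15.1, 0.1)
to_int = np.vectorize(int)  # built once, outside the timed code

def with_list(arr):
    # plain Python loop building a list
    return [int(num) for num in arr]

def with_vectorize(arr):
    # np.vectorize wrapper around int()
    return to_int(arr)

def with_astype(arr):
    # conversion done inside NumPy
    return arr.astype(int)

# Time each conversion on its own, with no plotting involved.
timings = {
    fn.__name__: timeit.timeit(lambda: fn(x), number=2000)
    for fn in (with_list, with_vectorize, with_astype)
}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```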