Convert pandas series into numpy array [duplicate]
I am new to pandas and Python. My input data looks like this:
category text
1 hello iam fine. how are you
1 iam good. how are you doing.
import pandas as pd
inputData = pd.read_csv('Input', sep='\t', names=['category', 'text'])
X = inputData["text"]
Y = inputData["category"]
Here Y is a pandas Series object, which I want to convert into a NumPy array, so I tried .as_matrix:
YArray= Y.as_matrix(columns=None)
print(YArray)
But I got the output [1,1], which is not what I want: I have a single column (category) and two rows, so I want the result as a 2x1 matrix.
To get a NumPy array, use:
Y.values
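As a minimal sketch of the above, assuming the two sample rows from the question are built inline instead of being read from a file (the variable names here are illustrative):

```python
import numpy as np
import pandas as pd

# Sample data built inline; this stands in for the question's input file (assumption).
input_data = pd.DataFrame({
    "category": [1, 1],
    "text": ["hello iam fine. how are you", "iam good. how are you doing."],
})

Y = input_data["category"]
y_array = Y.values          # one-dimensional NumPy array
y_array_alt = Y.to_numpy()  # equivalent spelling, available in pandas 0.24 and later

print(type(y_array), y_array.shape)
```

Note that the result is one-dimensional, with shape (2,), so it still needs a reshape if a 2x1 result is required.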
Try this after applying .as_matrix to your Series object:
YArray.reshape((2,1))
This is needed because .as_matrix() returns a one-dimensional NumPy array, not a NumPy matrix.
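The reshape step might look like the sketch below. It uses .values rather than .as_matrix, on the assumption that a recent pandas version is installed where .as_matrix may no longer be available, and it passes -1 so the row count does not have to be hard-coded.

```python
import numpy as np
import pandas as pd

# Stand-in for the question's category column (assumption).
Y = pd.Series([1, 1], name="category")

# -1 lets NumPy infer the number of rows, so this works for any length.
y_column = Y.values.reshape(-1, 1)

print(y_column.shape)  # (2, 1)
```

reshape returns a new array and does not modify the original, so the result has to be assigned to a name.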
If df is your DataFrame, then a column of the DataFrame is a Series. To convert it into an array:
df = pd.DataFrame()
x = df.values
print(type(x))
This prints
<class 'numpy.ndarray'>
which confirms that the result is a NumPy array.
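A related sketch, in case it helps with the 2x1 requirement: selecting the column with double brackets keeps it as a one-column DataFrame, whose .values is already two-dimensional. The DataFrame contents here are an assumption based on the question's sample.

```python
import pandas as pd

# Stand-in DataFrame with the question's category column (assumption).
df = pd.DataFrame({"category": [1, 1]})

col_1d = df["category"].values    # Series, so the array is one-dimensional
col_2d = df[["category"]].values  # one-column DataFrame, so the array is two-dimensional

print(type(col_1d), col_1d.shape, col_2d.shape)
```

Here col_1d has shape (2,) and col_2d has shape (2, 1), so no explicit reshape is needed in the second case.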