sklearn Logistic Regression "ValueError: Found array with dim 3. Estimator expected <= 2."
I am trying to solve problem 6 in this notebook. The task is to train a simple model on this data using 50, 100, 1000 and 5000 training samples with the LogisticRegression model from sklearn.linear_model.
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()
lr.fit(train_dataset, train_labels)
This is the code I am trying to run, and it gives me this error:
ValueError: Found array with dim 3. Estimator expected <= 2.
Any idea?
UPDATE 1: Updated the link to the Jupyter Notebook.
scikit-learn expects the training data passed to fit to be a 2D array of shape (n_samples, n_features). The dataset you are passing in is a 3D array, so you need to reshape it into a 2D array:
nsamples, nx, ny = train_dataset.shape
d2_train_dataset = train_dataset.reshape((nsamples,nx*ny))
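A minimal end-to-end sketch of this fix, using synthetic data in place of the notebook's dataset. The array sizes, the random labels, and the max_iter value are illustrative assumptions, not taken from the question:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-in for the notebook data: 100 "images" of 8x8 pixels.
rng = np.random.default_rng(0)
train_dataset = rng.normal(size=(100, 8, 8))
train_labels = rng.integers(0, 3, size=100)

# Flatten each image into one row; -1 lets NumPy infer nx * ny.
nsamples = train_dataset.shape[0]
d2_train_dataset = train_dataset.reshape(nsamples, -1)

lr = LogisticRegression(max_iter=500)
lr.fit(d2_train_dataset, train_labels)

# Any data passed to predict must be flattened the same way.
predictions = lr.predict(d2_train_dataset[:5])
print(d2_train_dataset.shape, predictions.shape)
```

The same reshape has to be applied to the validation and test arrays before calling predict or score, since the model was fitted on flattened rows.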
With LSTM, GRU, and TCN layers, return_sequences must be set to False in the last such layer before the Dense layer. Otherwise that layer outputs a 3D array (one output per time step), which is one of the conditions that leads to this error message.
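To illustrate the shapes involved, here is a NumPy-only sketch. The array below is a made-up stand-in for a recurrent layer's output, not an actual Keras model, and the sizes are assumptions chosen for readability:

```python
import numpy as np

# Hypothetical output of a recurrent layer that returns the full sequence:
# shape (batch, timesteps, units).
seq_output = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)

# Returning only the last output corresponds to keeping the final time step,
# which gives a 2D array of shape (batch, units).
last_output = seq_output[:, -1, :]

print(seq_output.ndim, last_output.ndim)
```

If the model itself cannot be changed, slicing the predictions this way before passing them to a scikit-learn function is one possible workaround, assuming only the final time step is of interest.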
If you arrived at this question while using an LSTM or another RNN on two or more time series, this might be a solution.
If you want the error for each predicted series separately, for example because you are predicting two completely different time series, you can do the following:
import numpy as np
from sklearn.metrics import mean_squared_error
# Any sklearn function that takes 2D data only
# 3D data
real = np.array([
[
[1,60],
[2,70],
[3,80]
],
[
[2,70],
[3,80],
[4,90]
]
])
pred = np.array([
[
[1.1,62.1],
[2.1,72.1],
[3.1,82.1]
],
[
[2.1,72.1],
[3.1,82.1],
[4.1,92.1]
]
])
# Error/Some Metric on Feature 1 (slicing out one feature gives a 2D array):
print(mean_squared_error(real[:,:,0], pred[:,:,0]))  # about 0.01
# Error/Some Metric on Feature 2:
print(mean_squared_error(real[:,:,1], pred[:,:,1]))  # about 4.41
For additional information on the slicing used here, see the NumPy indexing documentation.
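As an alternative to slicing one feature at a time, the 3D arrays can be reshaped to 2D so that each feature becomes a column, and the metric can then report one value per column. This is a sketch under the assumption that the last axis holds the features, as in the example data above; the arrays are redefined here so the snippet runs on its own:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Same values as the example above: shape (samples, timesteps, features).
real = np.array([[[1, 60], [2, 70], [3, 80]],
                 [[2, 70], [3, 80], [4, 90]]], dtype=float)
pred = real + np.array([0.1, 2.1])  # constant offset per feature

# Collapse samples and time steps into rows; keep features as columns.
n_features = real.shape[-1]
real_2d = real.reshape(-1, n_features)
pred_2d = pred.reshape(-1, n_features)

# multioutput="raw_values" returns one error per column (feature).
per_feature = mean_squared_error(real_2d, pred_2d, multioutput="raw_values")
print(per_feature)
```

This yields the same per-feature errors as the slicing approach, in a single call, and scales to more than two features without extra code.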