GridSearch for an estimator inside a OneVsRestClassifier

I want to run GridSearchCV on an SVC model that uses the one-vs-all strategy. For the latter part, I can just do this:

model_to_set = OneVsRestClassifier(SVC(kernel="poly"))

My problem is with the parameters. Let's say I want to try the following values:

parameters = {"C":[1,2,4,8], "kernel":["poly","rbf"],"degree":[1,2,3,4]}

In order to perform GridSearchCV, I should do something like:

 cv_generator = StratifiedKFold(y, k=10)
 model_tunning = GridSearchCV(model_to_set, param_grid=parameters, score_func=f1_score, n_jobs=1, cv=cv_generator)

However, when I execute it I get:

Traceback (most recent call last):
  File "/.../main.py", line 66, in <module>
    argclass_sys.set_model_parameters(model_name="SVC", verbose=3, file_path=PATH_ROOT_MODELS)
  File "/.../base.py", line 187, in set_model_parameters
    model_tunning.fit(self.feature_encoder.transform(self.train_feats), self.label_encoder.transform(self.train_labels))
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 354, in fit
    return self._fit(X, y)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 392, in _fit
    for clf_params in grid for train, test in cv)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 473, in __call__
    self.dispatch(function, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 296, in dispatch
    job = ImmediateApply(func, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 124, in __init__
    self.results = func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 85, in fit_grid_point
    clf.set_params(**clf_params)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/base.py", line 241, in set_params
    % (key, self.__class__.__name__))
ValueError: Invalid parameter kernel for estimator OneVsRestClassifier

Basically, since the SVC is inside a OneVsRestClassifier and that's the estimator I send to the GridSearchCV, the SVC's parameters can't be accessed.

In order to accomplish what I want, I see two solutions:

  1. When creating the SVC, somehow tell it not to use the one-vs-one strategy but the one-vs-all.
  2. Somehow tell GridSearchCV that the parameters belong to the estimator inside the OneVsRestClassifier.

I have yet to find a way to do either of these. Do you know if there's a way to do either of them? Or could you suggest another way to get the same result?

Thanks!


When you use nested estimators with grid search, you can scope the parameters with __ as a separator. In this case the SVC model is stored as an attribute named estimator inside the OneVsRestClassifier model, so its parameters are addressed as estimator__C, estimator__kernel and so on:

from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
from sklearn.grid_search import GridSearchCV
from sklearn.metrics import f1_score

iris = load_iris()

model_to_set = OneVsRestClassifier(SVC(kernel="poly"))

parameters = {
    "estimator__C": [1,2,4,8],
    "estimator__kernel": ["poly","rbf"],
    "estimator__degree":[1, 2, 3, 4],
}

model_tunning = GridSearchCV(model_to_set, param_grid=parameters,
                             score_func=f1_score)

model_tunning.fit(iris.data, iris.target)

print model_tunning.best_score_
print model_tunning.best_params_

That yields:

0.973290762737
{'estimator__kernel': 'poly', 'estimator__C': 1, 'estimator__degree': 2}
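If you are unsure which scoped names are valid, one way to check is to list them with get_params(). The following is a minimal sketch, assuming a reasonably recent scikit-learn; the variable names model, param_names and nested are only illustrative and not part of the original answer.

```python
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

model = OneVsRestClassifier(SVC(kernel="poly"))

# get_params() returns every settable parameter, including the nested
# ones of the wrapped SVC, which are prefixed with "estimator__".
param_names = sorted(model.get_params().keys())
nested = [name for name in param_names if name.startswith("estimator__")]
print(nested)

# set_params() accepts the same scoped names that GridSearchCV uses
# internally when it tries each grid point.
model.set_params(estimator__C=4, estimator__kernel="rbf")
print(model.estimator.C, model.estimator.kernel)
```

Any key that appears in that list can be used as a key in param_grid; an unscoped name such as kernel is what triggers the ValueError from the question.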

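The same estimator__ prefix works with other base estimators, for example an SGDClassifier:

from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier

param_grid = {"estimator__alpha": [10**-5, 10**-3, 10**-1, 10**1, 10**2]}

# Note: loss='log' was renamed to 'log_loss' in scikit-learn 1.1.
clf = OneVsRestClassifier(SGDClassifier(loss='log', penalty='l1'))

model = GridSearchCV(clf, param_grid, scoring='f1_micro', cv=2, n_jobs=-1)

# x_train_multilabel and y_train are your own training features and labels.
model.fit(x_train_multilabel, y_train)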

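For Python 3 and newer scikit-learn versions, use the following code. GridSearchCV is now imported from sklearn.model_selection, and the score_func argument has been replaced by scoring: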

from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

iris = load_iris()

model_to_set = OneVsRestClassifier(SVC(kernel="poly"))

parameters = {
    "estimator__C": [1,2,4,8],
    "estimator__kernel": ["poly","rbf"],
    "estimator__degree":[1, 2, 3, 4],
}

model_tunning = GridSearchCV(model_to_set, param_grid=parameters,
                             scoring='f1_weighted')

model_tunning.fit(iris.data, iris.target)

print(model_tunning.best_score_)
print(model_tunning.best_params_)
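One possible refinement of the grid above: degree only affects the poly kernel, so the single dictionary also fits rbf once per degree value with identical results. A sketch of an alternative, assuming the same iris setup, is to pass param_grid as a list of dictionaries so each kernel gets only the parameters it uses. The names parameters, n_candidates and search are illustrative, and cv=5 is an arbitrary choice.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, ParameterGrid
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

iris = load_iris()

# One sub-grid per kernel: degree is searched only for the poly kernel.
parameters = [
    {
        "estimator__kernel": ["poly"],
        "estimator__C": [1, 2, 4, 8],
        "estimator__degree": [1, 2, 3, 4],
    },
    {
        "estimator__kernel": ["rbf"],
        "estimator__C": [1, 2, 4, 8],
    },
]

# ParameterGrid expands the grids the same way GridSearchCV does,
# so its length is the number of candidates that will be fitted.
n_candidates = len(ParameterGrid(parameters))
print(n_candidates)

search = GridSearchCV(OneVsRestClassifier(SVC()), param_grid=parameters,
                      scoring="f1_weighted", cv=5)
search.fit(iris.data, iris.target)
print(search.best_params_)
```

This searches 16 poly candidates plus 4 rbf candidates rather than the 32 combinations of the single-dictionary grid.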