Sklearn, gridsearch: how to print out progress during the execution?
I am using GridSearch from sklearn to optimize parameters of the classifier. There is a lot of data, so the whole process of optimization takes a while: more than a day. I would like to watch the performance of the already-tried combinations of parameters during the execution. Is it possible?
Set the verbose parameter in GridSearchCV to a positive number (the greater the number, the more detail you will get). For instance:
GridSearchCV(clf, param_grid, cv=cv, scoring='accuracy', verbose=10)
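For something you can run end to end, here is a minimal sketch of the same call. The iris dataset, the SVC classifier, and the small parameter grid are assumptions chosen only to keep the example quick; they are not from the question. The exact wording of the progress lines depends on your scikit-learn version.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy data and classifier, used here only for illustration.
X, y = load_iris(return_X_y=True)

# Hypothetical grid: 3 values of C x 2 kernels = 6 candidates.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# verbose > 0 makes the search report progress while it runs;
# larger values print more detail per fit.
search = GridSearchCV(SVC(), param_grid, cv=3, scoring="accuracy", verbose=3)
search.fit(X, y)

# After the search finishes, the per-candidate results are also available.
print(search.best_params_, search.best_score_)
```

With 6 candidates and 3 folds this runs 18 fits, so the progress output stays short enough to read while you decide which verbosity level suits a long job.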
I would just like to complement DavidS's answer.
To give you an idea, for a very simple case, this is how it looks with verbose=1:
Fitting 10 folds for each of 1 candidates, totalling 10 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 10 out of 10 | elapsed: 1.2min finished
And this is how it looks with verbose=10:
Fitting 10 folds for each of 1 candidates, totalling 10 fits
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.637, total= 7.1s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 7.0s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.630, total= 6.5s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 13.5s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.637, total= 6.5s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 20.0s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.637, total= 6.7s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 4 out of 4 | elapsed: 26.7s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.632, total= 7.9s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 5 out of 5 | elapsed: 34.7s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.622, total= 6.9s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 6 out of 6 | elapsed: 41.6s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.627, total= 7.1s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 7 out of 7 | elapsed: 48.7s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.628, total= 7.2s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 8 out of 8 | elapsed: 55.9s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.640, total= 6.6s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 9 out of 9 | elapsed: 1.0min remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.629, total= 6.6s
[Parallel(n_jobs=1)]: Done 10 out of 10 | elapsed: 1.2min finished
In my case, verbose=1 does the trick.