GridSearchCV vs Bayesian optimization

Solution 1:

There is no universally better option here; they are different approaches.

In Grid Search you try all possible hyperparameter combinations within some ranges.
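
For concreteness, here is a minimal sketch of an exhaustive search with scikit-learn's GridSearchCV; the SVC estimator, the parameter values, and the iris dataset are illustrative assumptions, not something prescribed by this answer:

```python
# Minimal sketch: exhaustive grid search with scikit-learn.
# Estimator, parameter ranges, and dataset are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Every combination in the grid is evaluated: 3 * 3 = 9 settings per CV fold.
param_grid = {
    "C": [0.1, 1, 10],
    "gamma": [0.01, 0.1, 1],
}

search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```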

In Bayesian optimization you don't try all the combinations; you search the hyperparameter space, learning from each trial to decide what to try next. This lets you avoid trying ALL the combinations.
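
A minimal sketch of the Bayesian alternative, here using scikit-optimize's BayesSearchCV (one library among several; this answer doesn't prescribe a particular tool). The search spaces and evaluation budget are illustrative assumptions:

```python
# Minimal sketch: Bayesian hyperparameter search with scikit-optimize.
# A surrogate model picks each new candidate based on results observed so far.
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from skopt import BayesSearchCV
from skopt.space import Real

X, y = load_iris(return_X_y=True)

search = BayesSearchCV(
    SVC(),
    {
        # Continuous search spaces instead of a fixed grid.
        "C": Real(1e-2, 1e2, prior="log-uniform"),
        "gamma": Real(1e-3, 1e1, prior="log-uniform"),
    },
    n_iter=25,  # only 25 evaluations, far fewer than an exhaustive grid
    cv=5,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```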

So the pro of Grid Search is that it is exhaustive, and the pro of Bayesian optimization is that it doesn't need to be. Basically: if you can afford it in terms of computing power, go for Grid Search; if the space to search is too big, go for Bayesian optimization.

Solution 2:

Grid search is known to be worse than random search for optimizing hyperparameters [1], both in theory and in practice. Never use grid search unless you are optimizing one parameter only. On the other hand, Bayesian optimization has been shown to outperform random search on various problems, including hyperparameter optimization [2]. However, that comparison leaves out several things: the generalization capabilities of models that use those hyperparameters, the effort of setting up Bayesian optimization compared to the much simpler random search, and the possibility of running random search in parallel.

So, in conclusion, my recommendation is: never use grid search; use random search if you just want to try a few hyperparameter settings and can evaluate them in parallel (or if you want the hyperparameters to generalize to different problems); and use Bayesian optimization if you want the best results and are willing to use a more advanced method.
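
As a minimal sketch of that recommended default, here is random search with scikit-learn's RandomizedSearchCV; the estimator, the sampling distributions, and the dataset are illustrative assumptions. Note `n_jobs=-1`, which exploits the parallelism mentioned above, since each random sample is independent:

```python
# Minimal sketch: random search with scikit-learn.
# Distributions, estimator, and dataset are illustrative assumptions.
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

search = RandomizedSearchCV(
    SVC(),
    param_distributions={
        "C": loguniform(1e-2, 1e2),
        "gamma": loguniform(1e-3, 1e1),
    },
    n_iter=25,      # fixed budget of random samples
    cv=5,
    n_jobs=-1,      # trivially parallel: samples don't depend on each other
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```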

[1] Random Search for Hyper-Parameter Optimization, Bergstra & Bengio 2012.

[2] Bayesian Optimization is Superior to Random Search for Machine Learning Hyperparameter Tuning: Analysis of the Black-Box Optimization Challenge 2020, Turner et al. 2021.