Title: On the Difficulty of DNN Hyperparameter Optimization Using Learning Curve Prediction
Abstract: With the recent success of deep learning in a variety of applications, efficiently tuning the hyperparameters of Deep Neural Networks (DNNs) with less effort has become a timely and practical topic. As an algorithmic solution, automatic hyperparameter optimization methods such as Bayesian optimization have gained popularity for achieving human-comparable or even human-surpassing performance. To further speed up hyperparameter optimization, the learning curves of DNNs can be predicted and used to terminate training early for a chosen hyperparameter setting when the expected training performance is not satisfactory. While previous studies show promising results, it is still unclear whether an effective general rule can be derived for a broad spectrum of DNN hyperparameter optimization problems. In this work, we consider hyperparameter optimization for MNIST and CIFAR-10, and for each task we analyze the characteristics of 20,000 learning curves that correspond to 20,000 different hyperparameter configurations. By investigating a large number of learning curves for a given task, we find that the shapes of learning curves can change drastically depending on the choice and range of hyperparameters. Therefore, using learning curves to improve speed is not a simple task and can depend on many factors. Based on our observations and analyses of the 20,000 learning curves, we design two early termination rules, ETR-1 and ETR-2, and show that the rules can be beneficial in the best case but can also be harmful. Our observations and experimental results highlight that hyperparameter optimization of DNNs using learning curve prediction is challenging. In particular, the results of recent studies that are based on at most thousands of learning curves from a limited number of tasks should be interpreted carefully, with attention to the task, DNN model, hyperparameter choice, and hyperparameter range.
Publication Year: 2018
Publication Date: 2018-10-01
Language: en
Type: article
Indexed In: ['crossref']
Access and Citation
Cited By Count: 11