Taming Hyper-parameters in Deep Learning Systems

Abstract

Deep learning (DL) systems expose many tuning parameters (“hyper-parameters”) that affect the performance and accuracy of trained models. Increasingly users struggle to configure hyper-parameters, and a substantial portion of time is spent tuning them empirically. We argue that future DL systems should be designed to help manage hyper-parameters. We describe how a distributed DL system can (i) remove the impact of hyper-parameters on both performance and accuracy, thus making it easier to decide on a good setting, and (ii) support more powerful dynamic policies for adapting hyper-parameters, which take monitored training metrics into account. We report results from prototype implementations that show the practicality of DL system designs that are hyper-parameter-friendly.

Publication
In ACM SIGOPS Operating Systems Review
Luo Mai
Luo Mai
Assistant Professor

My research interests include computer systems, machine learning and data management.

Related