**Paper:** [Effect of Optimizer Selection and Hyperparameter Tuning on Training Efficiency and LLM Performance](paper/optimizer_inclusions.pdf)
- The choice of optimization algorithm for training Large Language Models (LLMs) significantly impacts both training speed and final predictive performance. We show that the hyperparameter tuning protocol is critical when comparing optimizers for LLMs. Our work reveals that inclusion relationships between optimizers play a crucial role in practice and consistently predict optimizer performance.
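
As a rough illustration of what an inclusion relationship can mean, the sketch below assumes the common reading that optimizer A includes optimizer B when some hyperparameter setting of A reproduces B's updates exactly. The paper's own definition is not reproduced here, so treat this as an assumption. The function names (`sgd`, `momentum`, `grad`) and the toy quadratic objective are hypothetical and chosen only for the example.

```python
# Minimal sketch (assumed definition): heavy-ball momentum "includes" plain SGD,
# because setting beta = 0 makes the momentum update identical to the SGD update.

def sgd(grad_fn, x0, lr, steps):
    """Plain gradient descent on a scalar parameter."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad_fn(x)
    return x

def momentum(grad_fn, x0, lr, beta, steps):
    """Heavy-ball momentum; v accumulates an exponentially weighted gradient sum."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v + grad_fn(x)  # with beta = 0 this is just the current gradient
        x = x - lr * v
    return x

def grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2 (hypothetical example)."""
    return 2.0 * (x - 3.0)

x_sgd = sgd(grad, x0=0.0, lr=0.1, steps=20)
x_mom = momentum(grad, x0=0.0, lr=0.1, beta=0.0, steps=20)

# The two trajectories coincide, so any result SGD can reach is also reachable
# by momentum given the right hyperparameters.
print(x_sgd == x_mom)
```

Under this reading, a well-tuned momentum optimizer should never do worse than well-tuned SGD, since the tuning search space of the former contains the latter. Whether that holds in practice depends on how thoroughly the extra hyperparameters are tuned.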