Authors: Yang Cao, Xiaoyu Li, Zhao Song
This repository contains the official PyTorch implementation of the Grams optimizer.
We introduce Gradient Descent with Adaptive Momentum Scaling (Grams), a novel optimization algorithm that decouples the direction and magnitude of parameter updates in deep learning. Unlike traditional optimizers that directly integrate momentum into updates, Grams separates the update direction, derived from current gradients, from momentum, which is used solely for adaptive magnitude scaling. This approach enables Grams to achieve improved loss descent compared to state-of-the-art cautious and momentum-based optimizers.
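To make the idea concrete, here is a minimal, single-tensor sketch of a Grams-style update, written from the description above rather than copied from the packaged implementation: Adam-style moment estimates set the step magnitude, while the sign is taken from the current gradient. The function and variable names are illustrative only.

import torch

def grams_style_update(param, grad, exp_avg, exp_avg_sq, step,
                       lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Adam-style first and second moment estimates (updated in place)
    exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
    exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)

    # Bias correction, as in Adam
    m_hat = exp_avg / (1 - beta1 ** step)
    v_hat = exp_avg_sq / (1 - beta2 ** step)

    # Magnitude from the momentum-based Adam step, direction from the current gradient
    adam_step = m_hat / (v_hat.sqrt() + eps)
    update = adam_step.abs() * grad.sign()

    param.add_(update, alpha=-lr)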

Use the following command to install our PyTorch implementation of Grams:
pip install grams-pytorch
Switching from Adam/AdamW to Grams is simple and requires changing only two lines of code:
Before:
import torch
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=0.0)
After:
from grams import Grams
optimizer = Grams(model.parameters(), lr=1e-3, weight_decay=0.0)
Just import Grams and swap the optimizer—everything else remains the same!
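Because Grams is a drop-in replacement, the usual PyTorch training loop applies unchanged. The snippet below is a self-contained usage sketch: the linear model, synthetic data, and loss are placeholders, and it assumes Grams exposes the standard step()/zero_grad() optimizer interface.

import torch
import torch.nn as nn
from grams import Grams

# Toy model and synthetic data, purely for illustration
model = nn.Linear(10, 1)
optimizer = Grams(model.parameters(), lr=1e-3, weight_decay=0.0)
loss_fn = nn.MSELoss()

x = torch.randn(32, 10)
y = torch.randn(32, 1)

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()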
Please cite our work!
@inproceedings{cao2025grams,
title={Grams: Gradient Descent with Adaptive Momentum Scaling},
author={Yang Cao and Xiaoyu Li and Zhao Song},
booktitle={ICLR 2025 First Workshop on Scalable Optimization for Efficient and Adaptive Foundation Models},
year={2025},
url={https://openreview.net/forum?id=GmKQnpQdsc}
}