Trained MoBA Llama model #31

wxthu · 2025-04-19T13:52:30Z

Where can I find the fine-tuned Llama-8B-1M-MoBA and Llama-8B-1M-Full models mentioned in your paper? Thanks!

whitelez · 2025-04-20T11:08:28Z

Greetings.

Currently, we do not have plans to open-source either of these models.

wxthu · 2025-04-21T06:32:15Z

I would like to ask about the size of your dataset, how many GPU resources you used, and how long your fine-tuning process took.

wxthu closed this as completed Apr 26, 2025
