fix nits in docs #49
Conversation
I don't think you need to change anything here; just leaving some feedback on the changes you made to the core "what is good about this" pitch.
```diff
 ## Serverless Axolotl

-This repository gives the popular [`axolotl`](https://github.com/OpenAccess-AI-Collective/axolotl) fine-tuning library a serverless twist. It uses Modal's serverless infrastructure to run your fine-tuning jobs in the cloud, so you can train your models without worrying about building images or idling expensive GPU VMs.
+This repository gives the popular `axolotl` fine-tuning library a serverless twist. It uses Modal's serverless infrastructure to run your fine-tuning jobs in the cloud, so you can train your models without worrying about building images or idling expensive GPU VMs.
```
I guess technically, one still needs to worry about building images. It's fine, but we could continue refining the pitch once we clarify the goal of this repository. For example, one potential benefit of Modal is running a bunch of training jobs in parallel, but the current setup does not actually make that particularly easy.
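As a sketch of that parallel-jobs idea (not code from this repo; the app name, GPU choice, timeout, and `train` stub are all hypothetical, with `modal.App`, `@app.function`, and `Function.map` as the assumed Modal primitives):

```python
import modal

app = modal.App("parallel-finetune-sketch")  # hypothetical app name

@app.function(gpu="H100", timeout=6 * 60 * 60)
def train(config_path: str) -> str:
    # Placeholder body: a real version would invoke axolotl on config_path.
    return f"finished {config_path}"

@app.local_entrypoint()
def main():
    # Fan out one containerized training job per config, running in parallel.
    for result in train.map(["sql.yml", "chat.yml", "summarize.yml"]):
        print(result)
```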
```diff
-Any application written with Modal, including this one, can be trivially scaled across many GPUs in a reproducible and efficient manner. This makes any fine-tuning job you prototype with this repository production-ready.
+Any application written with Modal can be trivially scaled across many GPUs. This ensures that any fine-tuning job prototyped with this repository is ready for production.
```
We might have lost the thread on the argument here too. I think the main selling point for Modal in terms of "prototype to production" is that we (try to) offer the dev ex of local work, with fast iteration and minimal fuss about platform configuration, on the production infra in the cloud. So you never get something running locally only to have it break once you "ship it to prod". That is more than just autoscaling (although that's nice too).
```diff
@@ -62,7 +62,7 @@ modal run -q src.inference --run-name <run_tag>
 Our quickstart example trains a 7B model on a text-to-SQL dataset as a proof of concept (it takes just a few minutes). It uses DeepSpeed ZeRO stage 1 to use data parallelism across 2 H100s. Inference on the fine-tuned model displays conformity to the output structure (`[SQL] ... [/SQL]`). To achieve better results, you would need to use more data! Refer to the full development section below.

 > [!TIP]
-> DeepSpeed ZeRO-1 is not the best choice if your model doesn't comfortably fit on a single GPU. For larger models, we recommend DeepSpeed Zero stage 3 instead by changing the `deepspeed` configuration path. Modal mounts the [`deepspeed_configs` folder](https://github.com/OpenAccess-AI-Collective/axolotl/tree/main/deepspeed_configs) from the `axolotl` repository. You reference these configurations in your `config.yml` like so: `deepspeed: /root/axolotl/deepspeed_configs/zero3_bf16.json`. If you need to change these standard configurations, you will need to modify the `train.py` script to load your own custom deepspeed configuration.
+> You modify the `deepspeed` stage by changing the configuration path. Modal mounts the [`deepspeed_configs` folder](https://github.com/OpenAccess-AI-Collective/axolotl/tree/main/deepspeed_configs) from the `axolotl` repository. You reference these configurations in your `config.yml` like so: `deepspeed: /root/axolotl/deepspeed_configs/zero3_bf16.json`.
```
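For concreteness, a minimal `config.yml` fragment; the `zero3_bf16.json` path is taken from the tip above, while the alternative file names are assumptions based on the linked `deepspeed_configs` folder:

```yaml
# Select one of the standard DeepSpeed configs available at /root/axolotl.
# Assumed alternatives from the linked folder: zero1.json, zero2.json, zero3.json.
deepspeed: /root/axolotl/deepspeed_configs/zero3_bf16.json
```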
Super pedantically, we don't "mount" the deepspeed configs (mounting is a separate concept in Modal); they're just baked into the image we pull from Docker Hub.

Thanks for reverting the changes to the configs.
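To illustrate the distinction, a minimal sketch rather than this repo's actual code; the image tag and local path are hypothetical, and `Image.from_registry` / `Mount.from_local_dir` are the Modal APIs assumed here:

```python
import modal

# "Baked in": the deepspeed_configs directory already exists inside the
# Docker image pulled from Docker Hub, so no extra step is needed.
image = modal.Image.from_registry("winglian/axolotl:main-latest")  # hypothetical tag

# A Modal "mount", by contrast, attaches local files to the container at
# runtime; that is the separate concept the comment refers to.
configs = modal.Mount.from_local_dir(
    "deepspeed_configs",  # hypothetical local path
    remote_path="/root/axolotl/deepspeed_configs",
)
```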
Fixing nits discussed in #48