
fix nits in docs #49

Merged
merged 3 commits into main on Apr 23, 2024
Conversation

@hamelsmu (Contributor) commented Apr 22, 2024

Fixing nits discussed in #48

@hamelsmu hamelsmu requested a review from mwaskom April 22, 2024 20:01
@mwaskom (Collaborator) left a comment:
I don't think you need to change anything here; just leaving some feedback on the changes you made to the core "what is good about this" pitch.


## Serverless Axolotl

Before:

This repository gives the popular [`axolotl`](https://github.com/OpenAccess-AI-Collective/axolotl) fine-tuning library a serverless twist. It uses Modal's serverless infrastructure to run your fine-tuning jobs in the cloud, so you can train your models without worrying about building images or idling expensive GPU VMs.

After:

This repository gives the popular `axolotl` fine-tuning library a serverless twist. It uses Modal's serverless infrastructure to run your fine-tuning jobs in the cloud, so you can train your models without worrying about building images or idling expensive GPU VMs.
@mwaskom (Collaborator):

I guess technically, one still needs to worry about building images. It's fine, but we could continue refining the pitch once we clarify the goal of this repository. For example, one potential benefit of Modal is running a bunch of training jobs in parallel, but the current setup does not actually make that particularly easy.


Before:

Any application written with Modal, including this one, can be trivially scaled across many GPUs in a reproducible and efficient manner. This makes any fine-tuning job you prototype with this repository production-ready.

After:

Any application written with Modal can be trivially scaled across many GPUs. This ensures that any fine-tuning job prototyped with this repository is ready for production.
@mwaskom (Collaborator):

We might have lost the thread on the argument here too. I think the main selling point for Modal in terms of "prototype to production" is that we (try to) offer the dev ex of local work — with fast iteration and minimal fuss about platform configuration — on the production infra in the cloud. So you never get something running locally only to have it break once you "ship it to prod". That is more than just autoscaling (although that's nice too).

@@ -62,7 +62,7 @@ modal run -q src.inference --run-name <run_tag>
Our quickstart example trains a 7B model on a text-to-SQL dataset as a proof of concept (it takes just a few minutes). It uses DeepSpeed ZeRO stage 1 for data parallelism across 2 H100s. Inference on the fine-tuned model shows conformity to the output structure (`[SQL] ... [/SQL]`). To achieve better results, you would need to use more data! Refer to the full development section below.

> [!TIP]
> Before:
> DeepSpeed ZeRO-1 is not the best choice if your model doesn't comfortably fit on a single GPU. For larger models, we recommend DeepSpeed ZeRO stage 3 instead by changing the `deepspeed` configuration path. Modal mounts the [`deepspeed_configs` folder](https://github.com/OpenAccess-AI-Collective/axolotl/tree/main/deepspeed_configs) from the `axolotl` repository. You reference these configurations in your `config.yml` like so: `deepspeed: /root/axolotl/deepspeed_configs/zero3_bf16.json`. If you need to change these standard configurations, you will need to modify the `train.py` script to load your own custom DeepSpeed configuration.
> After:
> You modify the `deepspeed` stage by changing the configuration path. Modal mounts the [`deepspeed_configs` folder](https://github.com/OpenAccess-AI-Collective/axolotl/tree/main/deepspeed_configs) from the `axolotl` repository. You reference these configurations in your `config.yml` like so: `deepspeed: /root/axolotl/deepspeed_configs/zero3_bf16.json`.
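The `[SQL] ... [/SQL]` output structure that the quickstart checks for could be validated with a small helper along these lines (a hypothetical sketch for illustration, not code from the repository):

```python
import re

# Pattern for the [SQL] ... [/SQL] structure the fine-tuned model is
# expected to produce; DOTALL lets the statement span multiple lines,
# and the non-greedy .*? stops at the first closing tag.
SQL_PATTERN = re.compile(r"\[SQL\](.*?)\[/SQL\]", re.DOTALL)

def extract_sql(completion: str):
    """Return the SQL inside [SQL]...[/SQL], stripped of surrounding
    whitespace, or None if the completion does not conform."""
    match = SQL_PATTERN.search(completion)
    return match.group(1).strip() if match else None

print(extract_sql("[SQL] SELECT * FROM users; [/SQL]"))  # SELECT * FROM users;
print(extract_sql("no tags here"))                       # None
```

A helper like this makes it easy to measure how often the fine-tuned model's completions conform to the expected structure across a batch of prompts.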
@mwaskom (Collaborator):

Super pedantically, we don't "mount" the deepspeed configs (which is a separate concept in modal), they're just baked into the image we pull from Dockerhub.
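However the configs end up in the container, the reference in `config.yml` would look something like the following. Only the `deepspeed` path comes from the tip above; the other keys are hypothetical, shown just to situate the line in a config file:

```yaml
# Hypothetical fragment of an axolotl config.yml.
# Only the `deepspeed` path below is taken from the tip above;
# the base_model key is illustrative.
base_model: NousResearch/Llama-2-7b-hf
deepspeed: /root/axolotl/deepspeed_configs/zero3_bf16.json
```

Since the path points inside the container filesystem (`/root/axolotl/...`), it resolves to the configs shipped in the image regardless of whether they were mounted or baked in.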

@mwaskom (Collaborator) commented Apr 23, 2024

Thanks for reverting the changes to the configs.

@mwaskom mwaskom merged commit a883c1f into main Apr 23, 2024
5 checks passed