
Add web endpoint to the inference server #38

Merged
merged 4 commits into main on Mar 12, 2024

Conversation

erikbern
Contributor

@erikbern erikbern commented Mar 12, 2024

This lets you do

modal serve src.inference

in one window, and

curl 'https://modal-labs--example-axolotl-inference-web-dev.modal.run?input=what+time+is+it'

in another window.

it streams!

Note that most of the complexity here comes from supporting an unparametrized class constructor, which means we need to pick the run id automatically. To do that, it was slightly easier to change the argument to run_name rather than the full path. If run_name isn't provided (which it never is in the case of web endpoints, since we don't support parametrized functions with web endpoints), it just picks the latest run by doing a directory listing.
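The fallback described above can be sketched roughly like this. This is a plain-pathlib stand-in, not the code in this PR: `resolve_run_folder` is a hypothetical name, and the actual implementation lists a Modal volume rather than the local filesystem.

```python
from pathlib import Path

def resolve_run_folder(runs_dir: Path, run_name: str = "") -> Path:
    """Return the folder for run_name, or fall back to the most recent run."""
    if not run_name:
        # No run_name given (the web endpoint case): list the runs
        # directory and take the lexicographically last entry.
        runs = sorted(p.name for p in runs_dir.iterdir() if p.is_dir())
        if not runs:
            raise RuntimeError(f"no runs found in {runs_dir}")
        run_name = runs[-1]
    return runs_dir / run_name
```

Note that "last entry of a sorted listing" only tracks recency if run folder names sort chronologically (e.g. timestamped names); otherwise a modification-time sort would be needed.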

@erikbern erikbern force-pushed the erikbern/inference-web-endpoint branch from 3658c66 to 3aa7a9e Compare March 12, 2024 01:57
@erikbern erikbern requested a review from gongy March 12, 2024 02:00
@mwaskom
Collaborator

mwaskom commented Mar 12, 2024

I think the run_name change makes sense even just on its own, but can we update the README too?

src/inference.py Outdated
config = yaml.safe_load(f.read())
model_path = (Path(run_folder) / config["output_dir"] / "merged").resolve()

def __init__(self, run_name: str = "", model_dir: str = "/runs") -> None:
Collaborator

model_dir is a little confusing since the two things we store in volumes are (1) pretrained models and (2) finetuned models. They’re all models :)

Contributor Author

but everything under /runs is finetuned models right?

Collaborator

Arguably not since there’s config, training logs, metrics, etc. but it’s a pretty nitpicky point either way

Contributor Author

ok i'll rename it to run_dir

src/inference.py Outdated
run_name = VOLUME_CONFIG[self.model_dir].listdir("/")[-1].path

# Grab the output dir (usually "lora-out")
with open(f"/runs/{run_name}/config.yml") as f:
Collaborator

Looks like we have /runs hardcoded here even though it’s a parameter?

Contributor Author

yep let me fix
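The fix for the hardcoded path amounts to building the config path from the parameter instead of a literal "/runs". A minimal sketch, with a hypothetical helper name (the PR itself folds this into the class):

```python
from pathlib import Path

def config_path(run_dir: str, run_name: str) -> Path:
    # Build the config path from the run_dir parameter
    # instead of a hardcoded "/runs" prefix.
    return Path(run_dir) / run_name / "config.yml"
```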

@erikbern erikbern force-pushed the erikbern/inference-web-endpoint branch from 3aa7a9e to 29ca3b7 Compare March 12, 2024 02:53
@erikbern erikbern merged commit 0772e84 into main Mar 12, 2024
4 checks passed
@erikbern erikbern deleted the erikbern/inference-web-endpoint branch March 12, 2024 14:58