
Add web endpoint to the inference server #38

Merged
merged 4 commits into main on Mar 12, 2024

Conversation

erikbern
Contributor

@erikbern erikbern commented Mar 12, 2024

This lets you do

modal serve src.inference

in one window, and

curl 'https://modal-labs--example-axolotl-inference-web-dev.modal.run?input=what+time+is+it'

in another window.

it streams!

Note that most of the complexity here comes from supporting an unparametrized class constructor, which means we need to pick the run id automatically. To do that, it was slightly easier to change the argument to run_name rather than the full path. If run_name isn't provided (which it never is in the case of web endpoints, since we don't support parametrized functions with web endpoints), it just picks the latest run by doing a directory listing.
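The fallback described above can be sketched roughly like this. This is a plain-pathlib stand-in, not the code in this PR: `resolve_run_folder` is a hypothetical name, and the actual implementation lists a Modal volume rather than the local filesystem.

```python
from pathlib import Path

def resolve_run_folder(runs_dir: Path, run_name: str = "") -> Path:
    """Return the folder for run_name, or fall back to the most recent run."""
    if not run_name:
        # No run_name given (the web endpoint case): list the runs
        # directory and take the lexicographically last entry.
        runs = sorted(p.name for p in runs_dir.iterdir() if p.is_dir())
        if not runs:
            raise RuntimeError(f"no runs found in {runs_dir}")
        run_name = runs[-1]
    return runs_dir / run_name
```

Note that "last entry of a sorted listing" only tracks recency if run folder names sort chronologically (e.g. timestamped names); otherwise a modification-time sort would be needed.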

@erikbern erikbern force-pushed the erikbern/inference-web-endpoint branch from 3658c66 to 3aa7a9e Compare March 12, 2024 01:57
@erikbern erikbern requested a review from gongy March 12, 2024 02:00
@mwaskom
Collaborator

mwaskom commented Mar 12, 2024

I think the run_name change makes sense even just on its own, but can we update the README too?

src/inference.py Outdated
config = yaml.safe_load(f.read())
model_path = (Path(run_folder) / config["output_dir"] / "merged").resolve()

def __init__(self, run_name: str = "", model_dir: str = "/runs") -> None:
Collaborator

model_dir is a little confusing since the two things we store in volumes are (1) pretrained models and (2) finetuned models. They’re all models :)

Contributor Author

but everything under /runs is finetuned models right?

Collaborator

Arguably not since there’s config, training logs, metrics, etc. but it’s a pretty nitpicky point either way

Contributor Author

ok i'll rename it to run_dir

src/inference.py Outdated
run_name = VOLUME_CONFIG[self.model_dir].listdir("/")[-1].path

# Grab the output dir (usually "lora-out")
with open(f"/runs/{run_name}/config.yml") as f:
Collaborator

Looks like we have /runs hardcoded here even though it’s a parameter?

Contributor Author

yep let me fix
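The fix for the hardcoded path amounts to building the config path from the parameter instead of a literal "/runs". A minimal sketch, with a hypothetical helper name (the PR itself folds this into the class):

```python
from pathlib import Path

def config_path(run_dir: str, run_name: str) -> Path:
    # Build the config path from the run_dir parameter
    # instead of a hardcoded "/runs" prefix.
    return Path(run_dir) / run_name / "config.yml"
```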

@erikbern erikbern force-pushed the erikbern/inference-web-endpoint branch from 3aa7a9e to 29ca3b7 Compare March 12, 2024 02:53
@erikbern erikbern merged commit 0772e84 into main Mar 12, 2024
4 checks passed
@erikbern erikbern deleted the erikbern/inference-web-endpoint branch March 12, 2024 14:58