Add LoRa support to the txt2img and img2img pipelines #119

Open

stronk-dev wants to merge 24 commits into main from feature/loras
Conversation

@stronk-dev (Contributor) commented Jul 10, 2024:

Adds support for loading arbitrary embeddings, modules (like LCM), etc.

Fulfills livepeer/bounties#33

Still requires:
- Testing if it works
- Gracefully handling requests for non-existent LoRas
- Design decision: do we want to keep LoRas loaded, or always unload already-loaded weights like we do now?
- Design decision: use the current method of requesting LoRas, or explore other options?
- Design decision: abort inference if the loras parameter is invalid or one of the LoRas fails to load, or continue on like it does now?

LoRas can be loaded by passing a new loras parameter. In the current design this needs to be a string that is parseable as JSON. For example:

curl -X POST -H "Content-Type: application/json" localhost:8000/text-to-image -d '{"prompt":"light saber battle in the death star", "loras": "{ \"nerijs/pixel-art-xl\" : 1.2 }"}'
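
For reference, a minimal sketch of how such a parameter could be parsed and applied with diffusers, assuming a peft-backed diffusers install; the load_loras helper, the adapter-name derivation, and the pipeline setup are illustrative assumptions, not the PR's actual code:

import json

import torch
from diffusers import StableDiffusionXLPipeline

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

def load_loras(pipeline, loras_param: str):
    # The parameter arrives as a JSON string mapping repo id -> weight,
    # e.g. '{ "nerijs/pixel-art-xl" : 1.2 }'.
    loras = json.loads(loras_param)
    adapter_names, adapter_weights = [], []
    for repo_id, weight in loras.items():
        # Derive an adapter name from the repo id (illustrative convention;
        # peft adapter names should avoid "/" and ".").
        name = repo_id.replace("/", "_").replace(".", "_")
        pipeline.load_lora_weights(repo_id, adapter_name=name)
        adapter_names.append(name)
        adapter_weights.append(weight)
    # Activate all requested adapters with their requested weights.
    pipeline.set_adapters(adapter_names, adapter_weights=adapter_weights)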

@rickstaa force-pushed the main branch 3 times, most recently from cd1feb4 to 0d03040, on July 16, 2024 13:10
@stronk-dev (Contributor, Author) commented:
Did some testing:

2024-07-16 13:38:15,602 INFO:     Application startup complete.
2024-07-16 13:38:15,604 INFO:     Uvicorn running on https://0.0.0.0:8000 (Press CTRL+C to quit)
100%|██████████| 50/50 [00:06<00:00,  7.93it/s]
2024-07-16 13:38:23,625 INFO:     172.17.0.1:52384 - "POST /text-to-image HTTP/1.1" 200 OK
100%|██████████| 50/50 [00:08<00:00,  6.24it/s]
2024-07-16 13:39:10,578 INFO:     172.17.0.1:35758 - "POST /text-to-image HTTP/1.1" 200 OK
100%|██████████| 50/50 [00:06<00:00,  8.13it/s]
2024-07-16 13:39:22,707 INFO:     172.17.0.1:34084 - "POST /text-to-image HTTP/1.1" 200 OK
100%|██████████| 50/50 [00:08<00:00,  6.23it/s]
2024-07-16 13:39:36,599 INFO:     172.17.0.1:39376 - "POST /text-to-image HTTP/1.1" 200 OK

All images were generated on a 4090 with the SDXL base model, using the prompt "light saber battle in the death star". Requests 1 and 3 are without LoRas; requests 2 and 4 requested the nerijs/pixel-art-xl LoRa.

[img1-img4: generated outputs for the four requests above]

Interestingly, inference is a bit slower when using LoRas. I could have used better trigger words in the prompt, but it certainly seems like the LoRa was loaded.

One thing we might want to rethink is the way the loras parameter is passed:

curl -X POST -H "Content-Type: application/json" localhost:8000/text-to-image -d '{"prompt":"light saber battle in the death star", "loras": "{ \"nerijs/pixel-art-xl\" : 1.2 }"}'
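
A hypothetical alternative (not implemented in this PR) would be to accept a native JSON object rather than a JSON-encoded string, which avoids the escaped quotes:

curl -X POST -H "Content-Type: application/json" localhost:8000/text-to-image -d '{"prompt":"light saber battle in the death star", "loras": {"nerijs/pixel-art-xl": 1.2}}'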

@stronk-dev stronk-dev marked this pull request as ready for review July 16, 2024 13:51
@stronk-dev (Contributor, Author) commented:
If the user requests an invalid LoRa repo, it will print the error:

2024-07-16 14:14:03,062 - app.pipelines.util - WARNING - Unable to load LoRas for adapter 'nerijs/pixel-ar' (RepositoryNotFoundError)

We can make this more verbose by printing the entire exception. The runner will continue with inference, but without using the LoRa.
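
A rough sketch of that warn-and-continue pattern, catching the RepositoryNotFoundError raised by huggingface_hub (the helper name is illustrative, not the PR's actual code):

import logging

from huggingface_hub.utils import RepositoryNotFoundError

logger = logging.getLogger("app.pipelines.util")

def try_load_lora(pipeline, repo_id: str, adapter_name: str) -> bool:
    try:
        pipeline.load_lora_weights(repo_id, adapter_name=adapter_name)
        return True
    except RepositoryNotFoundError as e:
        # Log a warning and let inference continue without this LoRa.
        logger.warning(
            "Unable to load LoRas for adapter '%s' (%s)", repo_id, type(e).__name__
        )
        return False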

@stronk-dev (Contributor, Author) commented:
(As a side note: I think it would be useful if exceptions like that were collected and passed back. Make a best effort to complete inference and inform the user of any issues found during the job. Alternatively, we could abort inference.)

@eliteprox (Collaborator) commented:
> If the user requests an invalid LoRa repo, it will print the error:
>
> 2024-07-16 14:14:03,062 - app.pipelines.util - WARNING - Unable to load LoRas for adapter 'nerijs/pixel-ar' (RepositoryNotFoundError)
>
> We can make this more verbose by printing the entire exception. The runner will continue with inference, but without using the LoRa.

I like your LoRa input validation solution because it handles all incorrect values sufficiently. However, we might want to return these errors to the gateway later to inform the user. I think we should return the error from load_loras and return a bad request in the runner in case of an invalid LoRa or weight, so go-livepeer can return it. I like the error messages you have now; I don't think more detail is needed on them.

We could also hold off on making that change until we develop the go-livepeer side. @rickstaa, any thoughts on that approach?
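
A minimal sketch of that approach, assuming a FastAPI route in the runner; the wrapper and its wiring are assumptions, not the proposed diff:

from fastapi import HTTPException

def load_loras_or_abort(pipeline, loras_param: str):
    try:
        load_loras(pipeline, loras_param)  # hypothetical loader, as sketched above
    except Exception as e:
        # Abort inference and surface the reason to go-livepeer as a 400.
        raise HTTPException(status_code=400, detail=f"Unable to load LoRas: {e}")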

@eliteprox (Collaborator) commented Aug 7, 2024:

> Design decision: do we want to keep LoRas loaded, or always unload already-loaded weights like we do now?

VRAM usage looks great; I tested with multiple concurrent requests using different LoRas. I think this is working well as it is. If keeping LoRas loaded would improve inference time, I think we should backlog it as a pipeline improvement.
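
For reference, a sketch of the two options being weighed here, reusing the hypothetical adapter naming from the earlier sketch:

# Option A (current behavior): drop adapters loaded by the previous
# request before loading the ones for this request.
pipeline.unload_lora_weights()

# Option B (possible backlog item): keep downloaded weights loaded and
# only switch which adapters are active per request.
pipeline.set_adapters(["nerijs_pixel-art-xl"], adapter_weights=[1.2])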

> Design decision: use the current method of requesting LoRas, or explore other options?

This implementation appears to be working great; I tried a few LoRas from Hugging Face and they were downloaded automatically.

@eliteprox (Collaborator) left a review:

Thanks for the PR @stronk-dev! This is a nice addition to the image pipelines. I tested both text-to-image and image-to-image using the ByteDance/SDXL-Lightning model with two different LoRas and multiple images. The pipelines are working great with LoRa support.

See my comments above on the remaining design decisions. I think responding with an informative bad request response when invalid LoRa values are passed will help inform the user on the gateway side. If you can make that change (or we decide to do it during go-livepeer integration) and resolve conflicts then the PR looks good to me.

@eliteprox force-pushed the feature/loras branch 3 times, most recently from 6544ef0 to f6c7e94, on August 10, 2024 00:57
@rickstaa (Collaborator) commented:
Interesting 🤔! Not sure what went wrong during the OpenAPI spec configuration -> dcaa961. Maybe my generation code is no longer sufficient. I will revert dcaa961 and check on Monday.

@stronk-dev (Contributor, Author) commented:
> See my comments above on the remaining design decisions. I think responding with an informative bad request response when invalid LoRa values are passed will help inform the user on the gateway side. If you can make that change (or we decide to do it during go-livepeer integration) and resolve conflicts then the PR looks good to me.

Just to confirm: should we return a bad request error and abort inference for any of the exceptions in the load_loras function?

@eliteprox (Collaborator) left a review:

This change is ready to merge if there are no further optimizations. @stronk-dev, could you take a quick look? I've fully tested both the text-to-image and image-to-image pipelines with various LoRas.

@eliteprox (Collaborator) commented Aug 23, 2024:

> See my comments above on the remaining design decisions. I think responding with an informative bad request response when invalid LoRa values are passed will help inform the user on the gateway side. If you can make that change (or we decide to do it during go-livepeer integration) and resolve conflicts then the PR looks good to me.
>
> Just to confirm: should we return a bad request error and abort inference for any of the exceptions in the load_loras function?

Correct, and I'd like to send the specific message back to the gateway. I think this would help developers learn the API faster, but it's not required for this initial release of LoRa support in my opinion; open to suggestions.

@stronk-dev (Contributor, Author) commented:
Thanks for all the tweaks, @eliteprox! LGTM

@eliteprox (Collaborator) commented:
> Thanks for all the tweaks, @eliteprox! LGTM

@rickstaa I've resolved conflicts on runner.gen.go and regenerated the OpenAPI schema. This is ready for merge along with livepeer/go-livepeer#3154.
