[InstructBLIP] qformer_tokenizer is required input #33222
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
LMK when you want a review Amy!
@LysandreJik Yes please! Would also like a review from @zucchini-nlp, who I discussed this with and who has been doing lots of work with the processors recently
Thanks, looks good to me! We also have InstructBlipVideo which has the same processor, can you propagate changes there pls?
all_kwargs = {
    "common_kwargs": {"return_tensors": "pt"},
    "images_kwargs": {"size": {"height": 214, "width": 214}},
    "text_kwargs": {"padding": "max_length", "max_length": 76},
}
Might be missing smth here, I am not sure how these tests are passing if InstructBlip hasn't standardized the kwargs. If these pass without standardization, the tests are probably not written properly and we need to write better ones in another PR 🤔
EDIT: My bad, just noticed the skip_processor_without_typed_kwargs thing which skips these tests hehe
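For reference, here's a rough sketch of what these grouped kwargs exercise once a processor's call kwargs are standardized; the checkpoint and inputs are purely illustrative, and (as noted above) InstructBlip itself still skips these tests:

```python
import numpy as np
from PIL import Image
from transformers import AutoProcessor

# Illustrative checkpoint: any processor whose kwargs have already been
# standardized is expected to dispatch the nested groups the same way.
processor = AutoProcessor.from_pretrained("Salesforce/blip2-opt-2.7b")

image = Image.fromarray(np.zeros((224, 224, 3), dtype=np.uint8))

all_kwargs = {
    "common_kwargs": {"return_tensors": "pt"},
    "images_kwargs": {"size": {"height": 214, "width": 214}},
    "text_kwargs": {"padding": "max_length", "max_length": 76},
}

# Each group is routed to the matching component: images_kwargs to the image
# processor, text_kwargs to the tokenizer, common_kwargs to every component.
inputs = processor(images=image, text="describe the image", **all_kwargs)
print(inputs["pixel_values"].shape)  # expected: (1, 3, 214, 214)
print(inputs["input_ids"].shape)     # expected: (1, 76)
```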
@zucchini-nlp Done! Added processor tests for InstructBlipVideoProcessor too
LGTM, thanks for adding instructBlipVideo
@@ -0,0 +1,428 @@
# Copyright 2023 The HuggingFace Team. All rights reserved.
nit: 2024 I guess :)
# Ignore copy
def prepare_image_inputs(self):
    """This function prepares a list of PIL images, or a list of numpy arrays if one specifies numpify=True,
Would be nice to test video inputs as well, but we can work on it later. Maybe I'll have to do some pre-standardization on videos first and then add proper tests everywhere
I updated this so that the method passes in a list of lists of frames
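Roughly, the change looks like this sketch (the helper name and dimensions are illustrative, not the exact test code):

```python
import numpy as np
from PIL import Image


# Sketch of the "list of lists of frames" structure described above: each inner
# list is one video's frames, and the outer list batches several videos.
def prepare_video_inputs(num_videos=2, num_frames=4, height=30, width=40):
    videos = []
    for _ in range(num_videos):
        frames = [
            Image.fromarray(np.random.randint(0, 256, (height, width, 3), dtype=np.uint8))
            for _ in range(num_frames)
        ]
        videos.append(frames)
    return videos  # list[list[PIL.Image.Image]]
```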
tests/models/instructblipvideo/test_processor_instructblipvideo.py
Thanks @amyeroberts! Nice tests
qformer_present = "qformer_tokenizer" in self.attributes
if qformer_present:
    self.attributes.remove("qformer_tokenizer")

outputs = super().save_pretrained(save_directory, **kwargs)

if qformer_present:
    self.attributes += ["qformer_tokenizer"]
return outputs
Out of curiosity, why don't we want the qformer_tokenizer to be saved here?
What happens in `processor.save_pretrained` is that it iterates over the classes listed in `processor.attributes`. In the case of `qformer_tokenizer`, if it was saved out, its tokenizer files (e.g. `tokenizer.json`) would just overwrite the files of the `tokenizer` class earlier in the list. To avoid this, a workaround was done such that the `qformer_tokenizer` was saved to a subfolder, e.g. like here, and just the tokenizer was saved to the top level of the checkpoint.
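A minimal sketch of how those pieces fit together (the subfolder name and the explicit qformer save are assumptions based on the description above, not copied verbatim from the diff):

```python
import os

from transformers.processing_utils import ProcessorMixin


class InstructBlipLikeProcessor(ProcessorMixin):
    # Hypothetical, simplified processor for illustration; the real
    # InstructBlipProcessor defines more attributes and logic than shown here.
    attributes = ["image_processor", "tokenizer", "qformer_tokenizer"]

    def save_pretrained(self, save_directory, **kwargs):
        # Save the Q-Former tokenizer into its own subfolder so its files
        # (e.g. tokenizer.json) don't overwrite the main tokenizer's files.
        if hasattr(self, "qformer_tokenizer"):
            self.qformer_tokenizer.save_pretrained(
                os.path.join(save_directory, "qformer_tokenizer")
            )

        # Temporarily hide qformer_tokenizer from `attributes` so the base
        # ProcessorMixin.save_pretrained doesn't also write it at the top level.
        qformer_present = "qformer_tokenizer" in self.attributes
        if qformer_present:
            self.attributes.remove("qformer_tokenizer")

        outputs = super().save_pretrained(save_directory, **kwargs)

        if qformer_present:
            self.attributes += ["qformer_tokenizer"]
        return outputs
```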
* [InstructBLIP] qformer_tokenizer is required input
* Bit safer
* Add to instructblipvideo processor
* Fix up
* Use video inputs
* Update tests/models/instructblipvideo/test_processor_instructblipvideo.py
What does this PR do?
At the moment, `InstructBLIP` doesn't have `qformer_tokenizer` listed as one of its attributes. This means that the processor is not currently compatible with tests using the `ProcessorTesterMixin`, as the component won't be automatically loaded when constructing the processor.

As I believe the assumption that `processor.attributes` lists all the processing classes for the processor is a reasonable one, I've modified the processor here to include it. This does require some hacky logic for when the processor is saved, to avoid overwriting the tokenizer configs while still saving the `qformer_tokenizer` configs to a separate subfolder.

For future processors, I think it would be better for us to handle this more cleanly, such that many processing classes of the same type can be bundled and saved together easily and automatically. For example, having all the `qformer_tokenizer` files saved with a `qformer_` prefix at the top level. This would enable e.g. having image processors and video processors saved together too. This isn't possible for this processor because of backward compatibility, but we can consider what it would look like in the future.
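For what it's worth, tokenizers already expose a `filename_prefix` argument on `save_pretrained`, which is roughly the shape the prefix idea could take. A small illustrative sketch (the checkpoint is just an example stand-in for a Q-Former tokenizer, and whether the prefixing covers every file type would need checking):

```python
from transformers import AutoTokenizer

# Example stand-in tokenizer; a real Q-Former tokenizer would be saved the same way.
qformer_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Saves files like checkpoint/qformer-tokenizer_config.json,
# checkpoint/qformer-special_tokens_map.json, ... at the top level.
saved_files = qformer_tokenizer.save_pretrained("checkpoint", filename_prefix="qformer")
print(saved_files)
```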