Idefics2 notebooks

This folder contains notebooks regarding Idefics2, a powerful vision-language model developed by Hugging Face.

Notes

  1. I just uploaded a similar notebook for LLaVa. It works just as well, and I removed the addition of special tokens to make the logic simpler. The same can be done for Idefics2.

  2. The notebook currently included here is aimed at extraction use cases (image -> text or JSON).
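For the JSON case, the text generated by the model still has to be parsed after decoding. The snippet below is a minimal sketch of that step, not code from the notebook: the helper name `extract_first_json` and the sample string are hypothetical, and it assumes the generated text contains a JSON object that may be surrounded by extra tokens.

```python
import json


def extract_first_json(text):
    """Return the first JSON object found in `text`, or None if there is none.

    Hypothetical helper for illustration; not part of the notebook.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    return None


# Made-up example of decoded model output with surrounding text.
generated = 'Assistant: {"total": "12.50", "items": ["tea"]}</s>'
print(extract_first_json(generated))
```

Returning `None` instead of raising keeps an evaluation loop simple: malformed generations can be counted as failures rather than stopping the run.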

If you have a chatbot use case, I'd recommend taking a look at the experimental support for VLMs in the TRL library: