Image GPT Support #282

Closed
appvoid opened this issue Jun 24, 2023 · 2 comments

appvoid commented Jun 24, 2023

https://github.com/openai/image-gpt

I don't know how hard it would be to implement this model. It seems to have the same architecture as GPT-2.

What is fantastic, though, is the range of possibilities around this model. Just imagine having a Stable Diffusion-like model on your Raspberry Pi. And even though the 32x32 size is pretty limited, there are already tools like Real-ESRGAN that upscale the resolution up to 4 times!

128x128 high-quality images without any dependencies. Please at least consider it!

igpt-xl-miscellaneous-29-orig.png

igpt-xl-miscellaneous-29-3.png

igpt-xl-miscellaneous-29-2.png


appvoid commented Jun 24, 2023

I'm pretty bad at these things (I'm just a Python guy), but here's a high-level overview if anyone wants to try:

  • This involves reshaping each image into a 1D sequence of pixels and applying the transformer decoder to predict them.
  • They also define a BERT-style "mask" that randomly hides certain elements (pixels) in an image. I think we already have something similar here: bert.cpp
  • Parameter counts are S: 76M, M: 455M, L: 1362M (I don't know yet if they published XL)
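The first two points above can be sketched in plain Python. This is only a rough illustration, not the image-gpt code: the function names, the tiny 4x4 "image" of palette indices, the mask token of -1, and the 15% masking probability are all assumptions made for the example.

```python
import random


def flatten_image(pixels):
    """Flatten a 2D grid (list of rows) into a 1D sequence in raster order."""
    return [value for row in pixels for value in row]


def next_pixel_pairs(sequence):
    """Build (prefix, next_pixel) pairs for autoregressive prediction."""
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]


def bert_style_mask(sequence, mask_prob=0.15, mask_token=-1, seed=0):
    """Randomly hide positions; return the masked sequence and the hidden values.

    mask_prob, mask_token and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    masked = []
    targets = {}  # position -> original value the model should recover
    for index, value in enumerate(sequence):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[index] = value
        else:
            masked.append(value)
    return masked, targets


# A hypothetical 4x4 image whose pixels are already palette indices.
image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]

seq = flatten_image(image)               # 16 tokens, row by row
pairs = next_pixel_pairs(seq)            # decoder-style training pairs
masked, targets = bert_style_mask(seq)   # BERT-style training input

print(len(seq), len(pairs))
print(masked)
```

A 32x32 image flattened this way gives a sequence of 1024 tokens, which is why the small resolution matters so much for the context length.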

To give an idea on how good this could be:

Test image (64x64)
test
Native 4x upscaling (256x256)
output

I used an ncnn implementation of Real-ESRGAN to upscale it. This could be a good opportunity to do a pretty cool spinoff with the ncnn community, showing that similar open source projects can coexist and work together.


appvoid commented Jun 27, 2023

Closing this, as there does not seem to be any interest in it in general. It appears to be just an image completion tool, and since projects like this already exist, it isn't worth trying, at least for now.

@appvoid appvoid closed this as completed Jun 27, 2023
CCLDArjun pushed a commit to CCLDArjun/ggml that referenced this issue Dec 18, 2023
* working but ugly

* add arg flag, not working on embedding mode

* typo

* Working! Thanks to @nullhook

* make params argument instead of hardcoded boolean. remove useless time check

* start doing the instructions but not finished. This probably doesnt compile

* Embeddings extraction support

---------

Co-authored-by: Georgi Gerganov <[email protected]>