Highlights

  • Pro

Organizations

@PyTorchKR @PyTorchKorea
A reading list of video generation

300 23 Updated Jul 9, 2024

LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR team.

JavaScript 215 13 Updated Jun 27, 2024
Python 3,800 249 Updated Mar 15, 2024

Official implementation of SEED-LLaMA (ICLR 2024).

Python 517 29 Updated Apr 11, 2024

Official implementation of Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

Python 343 14 Updated Jun 15, 2024

4M: Massively Multimodal Masked Modeling

Python 1,304 69 Updated Jul 5, 2024

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …

Python 3,814 287 Updated Apr 30, 2024

Rethinking Interactive Image Segmentation with Low Latency, High Quality, and Diverse Prompts (CVPR 2024)

Python 55 3 Updated Jun 26, 2024

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,078 275 Updated May 4, 2024

OCR, layout analysis, reading order, line detection in 90+ languages

Python 9,091 564 Updated Jul 8, 2024

Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents

Python 25 1 Updated Mar 19, 2024

Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data

Python 29 1 Updated Mar 12, 2024

Code for the paper "pix2gestalt: Amodal Segmentation by Synthesizing Wholes" (CVPR 2024)

Python 120 8 Updated May 3, 2024

MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer

Python 171 10 Updated Apr 3, 2024

Machine Theory of Mind Reading List. Built upon EMNLP Findings 2023 Paper: Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models

87 5 Updated Feb 7, 2024

[CVPR 2024] Official PyTorch implementation of "ECLIPSE: Revisiting the Text-to-Image Prior for Efficient Image Generation"

Python 59 9 Updated May 1, 2024
Python 1,696 51 Updated Jun 28, 2024

[CVPR 24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloading the trained model checkpoints, and example notebooks / gra…

Python 172 4 Updated May 12, 2024

Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥

Python 994 71 Updated Feb 13, 2024

Collection of notebook guides created by the Brev.dev team!

Jupyter Notebook 1,561 252 Updated Jun 24, 2024

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Python 4,202 215 Updated Jun 14, 2024

Set-of-Mark Prompting for LMMs

Python 1,016 80 Updated Jun 5, 2024

Python datetimes made easy

Python 6,121 372 Updated Jun 1, 2024
Python 145 4 Updated Feb 14, 2024

Open reproduction of MUSE for fast text2image generation.

Python 305 24 Updated Jun 1, 2024

A Jupyter widget for annotating images with bounding boxes

Python 101 18 Updated May 31, 2023
Python 4,505 762 Updated Jul 9, 2024
Python 107 14 Updated Feb 17, 2024

An unofficial Python package that returns responses from Google Bard using a cookie value.

Python 5,353 532 Updated Apr 24, 2024