ANU
- Australia Canberra
- https://gewelsji.github.io
Starred repositories
[CVPR 24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloading the trained model checkpoints, and example notebooks / gra…
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
A high-throughput and memory-efficient inference and serving engine for LLMs
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Advances in recent large vision language models (LVLMs)
(ECCV 2024) Open-Vocabulary Camouflaged Object Segmentation
[AAAI2024] Official implementation of SurgicalSAM
Efficient Multimodal Large Language Models: A Survey
A curated list of practical guide resources of Medical LLMs (Medical LLMs Tree, Tables, and Papers)
Statistics and visualization of acceptance information and main keywords of CVPR 2022 accepted papers, for the main computer vision conferences (CVPR/ICCV/WACV...)
[ICML 2024] Official repository of the paper: "Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset"
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference,…
Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
The medical imaging meta-learning toolbox allows building models that learn to learn in a setting with diverse tasks. It also provides code for working with the MIMeta Dataset as well as simple bas…
An efficient, flexible and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
Official PyTorch code of "Grounded Question-Answering in Long Egocentric Videos", accepted by CVPR 2024.
Collection of AWESOME vision-language models for vision tasks
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
A Framework of Small-scale Large Multimodal Models
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.