Stars
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Official repo of the paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).
Large Language Models are Temporal and Causal Reasoners for Video Question Answering (EMNLP 2023)
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
Official code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation
Awesome-LLM-RAG: a curated list of advanced retrieval augmented generation (RAG) in Large Language Models
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
[NeurIPS 2023] Exploring Diverse In-Context Configurations for Image Captioning
[ICLR 2024, Spotlight] Sentence-level Prompts Benefit Composed Image Retrieval
[CVPR 2024] Official PyTorch implementation of the paper "One For All: Video Conversation is Feasible Without Video Instruction Tuning"
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
torch_quantizer is an out-of-the-box quantization tool for PyTorch models on the CUDA backend, specially optimized for diffusion models.
A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.
FastGPT is a knowledge-based platform built on LLMs that offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, le…
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
Cross-modal few-shot adaptation with CLIP
ImageBind One Embedding Space to Bind Them All
Code for visualizing the loss landscape of neural nets
[CVPR 2024] 🎬💭 chat with over 10K frames of video!
[CVPR 2024] A framework to fine-tune LLaMAs on instruction-following tasks and get many Stitched LLaMAs with a customized number of parameters, e.g., Stitched LLaMA 8B, 9B, and 10B...
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
A general representation model across vision, audio, and language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities