Stars
A paper list on multimodal and large language models, used only to record papers I read in the daily arXiv feed for personal reference.
Official code for Interpret Your Decision: Logical Reasoning Regularization for Generalization in Visual Classification (NeurIPS 2024 Spotlight)
Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"
Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊
Codebase for Aria - an Open Multimodal Native MoE
[Preprint] TRACE: Temporal Grounding Video LLM via Causal Event Modeling
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024)
Reading notes about Multimodal Large Language Models, Large Language Models, and Diffusion Models
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
Inference Code for Paper "Harder Tasks Need More Experts: Dynamic Routing in MoE Models"
Writing AI Conference Papers: A Handbook for Beginners
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture
A VideoQA dataset based on the videos from ActivityNet
Qwen2-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Video Question Answering via Gradually Refined Attention over Appearance and Motion
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
Triton-based implementation of Sparse Mixture of Experts.
VILA - a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
Long Context Transfer from Language to Vision