SIAT-CAS
Stars
LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024)
The official implementation of the ICLR 2020 paper "Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering".
O1 Replication Journey: A Strategic Progress Report – Part I
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Generative Representational Instruction Tuning
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
Curated list of papers and resources focused on 3D Gaussian Splatting, intended to keep pace with the anticipated surge of research in the coming months.
xFinder: Robust and Pinpoint Answer Extraction for Large Language Models
A resource repository for machine unlearning in large language models
📰 Must-read papers and blogs on LLM-based Long Context Modeling 🔥
Reproduced LLaVA-NeXT with training code and scripts.
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
An open-source implementation for training LLaVA-NeXT.
[ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models
Official code for the paper "Mantis: Multi-Image Instruction Tuning"
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
Accelerating the development of large multimodal models (LMMs) with lmms-eval
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
ReFT: Representation Finetuning for Language Models