- Shenzhen, China
- @felix1987_
Starred repositories
Improving Text Embedding of Language Models Using Contrastive Fine-tuning
Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"
Official code for our paper, "LoRA-Pro: Are Low-Rank Adapters Properly Optimized?"
A Blazing Fast AI Gateway. Route to 200+ LLMs with 1 fast & friendly API.
Fast inference from large language models via speculative decoding
Utilities intended for use with Llama models.
Implementation of Infini-Transformer in Pytorch
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024)
EfficientViT is a new family of vision models for efficient high-resolution vision.
Work-in-progress vector search SQLite extension that runs anywhere.
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
aohan237 / TF-ID
Forked from ai8hyf/TF-ID. TF-ID: Table/Figure IDentifier for academic papers
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793
Unofficial PyTorch/🤗Transformers(Gemma/Llama3) implementation of Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
A pytorch port of Google's RETSim model used in UniSim
This repository combines the CPO and SimPO methods for better reference-free preference learning.
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton
Gemma 2B with 10M context length using Infini-attention.
Efficient Infinite Context Transformers with Infini-attention Pytorch Implementation + QwenMoE Implementation + Training Script + 1M context keypass retrieval
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143)