Stars
Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs
Awesome LLM Plaza: daily tracking of all sorts of awesome LLM topics, e.g., LLM for coding, robotics, reasoning, multimodality, etc.
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and…
Efficient retrieval head analysis with triton flash attention that supports topK probability
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
[ACL 2024] A Prospector of Long-Dependency Data for Large Language Models
This is the official implementation of "CAPE: Context-Adaptive Positional Encoding for Length Extrapolation"
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.
Sequence Parallel Attention for Long Context LLM Model Training and Inference
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
The official repo for "LLoCo: Learning Long Contexts Offline"
Go ahead and axolotl questions
Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality"
LongHeads: Multi-Head Attention is Secretly a Long Context Processor
Official implementation for the paper "LongEmbed: Extending Embedding Models for Long Context Retrieval"