dwzhu-pku

Highlights

  • Pro

Organizations

@PKU-TANGENT

Official GitHub repo for the paper "Compression Represents Intelligence Linearly"

Python 111 3 Updated Jun 9, 2024

The memory layer for Personalized AI

Python 15,399 1,546 Updated Jul 23, 2024

The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism

Python 18 Updated Jul 17, 2024

Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs".

Python 128 10 Updated Jul 8, 2024

Fast and memory-efficient exact attention

Python 12,549 1,116 Updated Jul 23, 2024
Python 122 6 Updated Jul 23, 2024

Speeds up long-context LLM inference by computing attention with approximate, dynamic sparse methods, reducing pre-filling latency by up to 10x on an A100 while maintaining accuracy.

Python 569 19 Updated Jul 18, 2024

Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs

Python 20 2 Updated Jul 2, 2024

Awesome LLM Plaza: daily tracking of all sorts of awesome LLM topics, e.g. LLMs for coding, robotics, reasoning, multimodal, etc.

98 5 Updated Jul 20, 2024

LOFT: A 1 Million+ Token Long-Context Benchmark

106 3 Updated Jun 21, 2024

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

1,438 68 Updated Jul 3, 2024
Python 17 1 Updated Jun 23, 2024

Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement

Python 11 Updated Jul 17, 2024

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and…

Python 6,311 1,225 Updated Jul 23, 2024

Efficient retrieval head analysis with triton flash attention that supports topK probability

Jupyter Notebook 12 Updated Jun 15, 2024
Jupyter Notebook 387 26 Updated Jul 19, 2024

Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

118 5 Updated Jun 12, 2024

The repo for In-context Autoencoder

Jupyter Notebook 61 2 Updated May 11, 2024

✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

315 11 Updated Jun 18, 2024

[ACL 2024] A Prospector of Long-Dependency Data for Large Language Models

Python 39 Updated Jul 23, 2024
Python 12 1 Updated Jul 20, 2024

This is the official implementation of "CAPE: Context-Adaptive Positional Encoding for Length Extrapolation"

Python 6 Updated May 23, 2024

This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?

Python 359 20 Updated Jul 19, 2024

🐚 OpenDevin: Code Less, Make More

Python 28,821 3,328 Updated Jul 23, 2024

BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.

Jupyter Notebook 116 12 Updated Jun 28, 2024

Sequence Parallel Attention for Long Context LLM Model Training and Inference

Python 228 8 Updated Jun 27, 2024

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

3,104 116 Updated Jun 26, 2024