The Memory layer for your AI apps

Python 22,375 2,062 Updated Oct 18, 2024

An (unofficial) implementation of Focal Loss, as described in the RetinaNet paper, generalized to the multi-class case.

Python 222 25 Updated Jan 22, 2024

Material for gpu-mode lectures

Jupyter Notebook 2,735 268 Updated Oct 17, 2024

Open source annotation tool for machine learning practitioners.

Python 9,500 1,721 Updated Oct 11, 2024

Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Python 17,031 1,168 Updated Oct 19, 2024

AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents

Python 13,536 1,803 Updated Oct 18, 2024

Generative Agents: Interactive Simulacra of Human Behavior

17,099 2,188 Updated Aug 5, 2024

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

3,496 148 Updated Sep 25, 2024

Python 168 9 Updated May 31, 2024

SuperCLUE-Role: a native Chinese role-playing evaluation benchmark

21 Updated Apr 3, 2024

NTK scaled version of ALiBi position encoding in Transformer.

66 3 Updated Aug 16, 2023

The official Meta Llama 3 GitHub site

Python 26,714 3,026 Updated Aug 12, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 28,725 4,259 Updated Oct 19, 2024

Demo of Tongyi Qianwen (Qwen) inference deployment with vLLM

Python 424 61 Updated Mar 28, 2024

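《代码随想录》 LeetCode problem-solving guide: a recommended order for working through 200 classic problems, about 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, and more than 50 mind maps, with versions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer confusing! 🔥🔥 Take a look and you will wish you had found it sooner! 🚀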

Shell 51,280 11,425 Updated Oct 18, 2024

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,515 198 Updated Oct 18, 2024

KenLM: Faster and Smaller Language Model Queries

C++ 2,501 512 Updated Jul 30, 2024

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Python 1,130 67 Updated Oct 14, 2024

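text2vec, text to vector. A text vector representation tool that converts text into vector matrices and implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, ready to use out of the box.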

Python 4,440 395 Updated Sep 8, 2024

A Clash client for Windows, with Mihomo support

C# 4,810 596 Updated Jun 29, 2024

Various examples for different articles

HTML 163 92 Updated Oct 16, 2024

CoSENT、STS、SentenceBERT

Python 161 21 Updated Jul 10, 2023

Ongoing research training transformer models at scale

Python 10,323 2,310 Updated Oct 19, 2024

📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

2,673 182 Updated Oct 15, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,440 956 Updated Oct 15, 2024

Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)

Python 2,616 271 Updated Aug 14, 2024

A natural language interface for computers

Python 52,635 4,649 Updated Oct 15, 2024

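A pure C++, cross-platform LLM acceleration library that can be called from Python; ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models; runs smoothly on mobile devices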

C++ 3,299 336 Updated Oct 15, 2024