Zihuatanejo · (UTC +08:00) · https://nagi.fun/ · @Nag1ovo
LLM
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
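The core of BPE training is small enough to sketch here. The following is an illustrative toy, not tiktoken's or the minbpe repo's actual code: start from raw bytes, then repeatedly find the most frequent adjacent token pair and merge it into a new token id.

```python
# Toy BPE training loop (illustrative sketch, not a real tokenizer implementation).
from collections import Counter

def most_frequent_pair(ids):
    """Count adjacent token pairs and return the most common one."""
    pairs = Counter(zip(ids, ids[1:]))
    return max(pairs, key=pairs.get)

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

text = "aaabdaaabac"
ids = list(text.encode("utf-8"))   # start from the 256 raw byte values
merges = {}
for step in range(3):
    pair = most_frequent_pair(ids)
    new_id = 256 + step            # new token ids begin after the byte range
    ids = merge(ids, pair, new_id)
    merges[pair] = new_id
```

Encoding a new string then replays the learned merges in order; decoding inverts the `merges` table back down to bytes.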
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
LlamaIndex is a data framework for your LLM applications
A lightweight, standalone C++ inference engine for Google's Gemma models.
The official PyTorch implementation of Google's Gemma models
A repository for pretraining from scratch plus SFT of a small-parameter Chinese LLaMa2; a single 24 GB GPU is enough to train a chat-llama2 with basic Chinese Q&A ability.
Modeling, training, eval, and inference code for OLMo
A universal local knowledge base solution based on a vector database and GPT-3.5.
A fast cross-platform AI inference engine 🤖 using Rust 🦀 and WebGPU 🎮
From-scratch implementation of a sparse mixture-of-experts language model, inspired by Andrej Karpathy's makemore :)
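The routing mechanism behind such sparse mixture-of-experts layers, top-k gating, fits in a few lines. This is a hedged sketch with made-up names and shapes, not the repo's code: a gate scores each token against every expert, only the k highest-scoring experts run, and their outputs are mixed by renormalized gate weights.

```python
# Toy top-k gating for a sparse mixture-of-experts layer (illustrative sketch).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def moe_forward(x, w_gate, experts, k=2):
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ w_gate                          # (tokens, n_experts) gate scores
    topk = np.argsort(logits, axis=-1)[:, -k:]   # indices of the k largest logits
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        weights = softmax(logits[t, topk[t]])    # renormalize over chosen experts
        for w, e in zip(weights, topk[t]):
            out[t] += w * experts[e](x[t])       # only k experts run per token
    return out

rng = np.random.default_rng(0)
d, n_experts, tokens = 4, 3, 5
w_gate = rng.normal(size=(d, n_experts))
expert_mats = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda v, m=m: v @ m for m in expert_mats]  # each expert: a linear map
x = rng.normal(size=(tokens, d))
y = moe_forward(x, w_gate, experts)
```

The sparsity is what makes MoE economical: compute per token scales with k, not with the total number of experts.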
Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.
An unnecessarily tiny implementation of GPT-2 in NumPy.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Python package for easily interfacing with chat apps, with robust features and minimal code complexity.
Daily updated LLM papers. Subscriptions welcome 👏 and give it a star 🌟 if you like it.
📖A curated list of Awesome LLM Inference Paper with codes, TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention etc.
LLM Finetuning with peft
<Beat AI>, also known as <零生万物>, is an AI primer written for software engineers that walks you through writing AI by hand. From neural networks to large models, from high-level design to low-level principles, from engineering implementation to algorithms: after finishing it, you'll find AI is not as unattainable or unbeatable as you imagined. Just beat it!
Chinese NLP solutions (large models, data, models, training, inference).
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Doing simple retrieval from LLM models at various context lengths to measure accuracy
Video+code lecture on building nanoGPT from scratch
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence