- Shenzhen, China
- https://www.fullstackmemo.com
Stars
PyTorch model training code covering single precision, half precision, mixed precision, single-GPU, and multi-GPU (DP / DDP) setups, plus FSDP and DeepSpeed, comparing training speed and GPU memory usage across methods.
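For context on the mixed-precision part, here is a minimal sketch of one PyTorch AMP training step; the model, data, and hyperparameters are toy stand-ins, not the repo's code, and a CUDA device is assumed.

```python
# Minimal sketch of a mixed-precision (AMP) training step in PyTorch.
import torch
from torch import nn

model = nn.Linear(512, 10).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()  # scales the loss to avoid fp16 gradient underflow

x = torch.randn(32, 512, device="cuda")
y = torch.randint(0, 10, (32,), device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast():  # run eligible ops in reduced precision
    loss = nn.functional.cross_entropy(model(x), y)
scaler.scale(loss).backward()    # backward pass on the scaled loss
scaler.step(optimizer)           # unscales gradients; skips the step on inf/NaN
scaler.update()
```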
A comprehensive library for implementing LLMs, including a unified training pipeline and thorough model evaluation.
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Awesome machine learning model compression research papers, quantization, tools, and learning material.
A coding-free framework built on PyTorch for reproducible deep learning studies. 🏆25 knowledge distillation methods presented at CVPR, ICLR, ECCV, NeurIPS, ICCV, etc. are implemented so far. 🎁 Train…
🪢 Open source LLM engineering platform: Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.
Intelligence for Kubernetes. World's most promising Kubernetes Visualization Tool for Developer and Platform Engineering teams.
Must-read Papers on Large Language Model (LLM) Continual Learning
Continual Learning of Large Language Models: A Comprehensive Survey
A plug-and-play library for parameter-efficient-tuning (Delta Tuning)
An Open-Source Framework for Prompt-Learning.
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and…
Phase 3 of the Chinese Alpaca LLM project (Chinese Llama-3 LLMs), developed from Meta Llama 3
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
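The core of BPE training is a simple merge loop: repeatedly count adjacent symbol pairs and merge the most frequent one. A minimal Sennrich-style sketch follows; the toy vocabulary is illustrative and not taken from the repo.

```python
# Minimal sketch of the BPE training loop: count pairs, merge the most frequent.
import re
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs; vocab maps space-separated symbols to word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for pair in zip(symbols, symbols[1:]):
            pairs[pair] += freq
    return pairs

def merge_vocab(vocab, pair):
    """Merge every standalone occurrence of the pair into a single symbol."""
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

vocab = {"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6, "w i d e s t </w>": 3}
for _ in range(5):
    pairs = get_pair_counts(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)  # most frequent adjacent pair
    vocab = merge_vocab(vocab, best)
    print("merged", best)
```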
[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models
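The paper's difficulty score compares conditioned and unconditioned answer perplexities; the sketch below shows only the plain-perplexity building block with a causal LM. The model choice ("gpt2") and scoring recipe are assumptions for illustration, not the paper's exact setup.

```python
# Illustrative only: plain perplexity of a text under a causal language model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean per-token cross-entropy
    return torch.exp(loss).item()

print(perplexity("The quick brown fox jumps over the lazy dog."))
```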
Python 3.9+ installers that support Windows 7 SP1 and Windows Server 2008 R2
DSPy: The framework for programming—not prompting—foundation models
An implementation of the paper "Attention Is All You Need".
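The paper's central operation is scaled dot-product attention; a minimal PyTorch sketch (toy tensor shapes are assumptions, not the repo's code):

```python
# Minimal sketch of scaled dot-product attention from "Attention Is All You Need".
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq, d_k)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # similarity, scaled by sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))  # block disallowed positions
    return torch.softmax(scores, dim=-1) @ v  # weighted sum of values

q = k = v = torch.randn(2, 4, 16, 32)  # batch=2, heads=4, seq=16, d_k=32
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([2, 4, 16, 32])
```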
Unsupervised text tokenizer for Neural Network-based text generation.
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)