Starred repositories
Learning eBPF, published by O'Reilly - out now! Here's where you'll find a VM config for the examples, and more
⚡️SwanLab: your ML experiment notebook. Track and visualize your entire machine learning workflow.
Fast and memory-efficient exact attention
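🔥 A comprehensive collection of classic programming books, covering: computer systems and networks, system architecture, algorithms and data structures, front-end development, back-end development, mobile development, databases, testing, projects and teams, programmer career development, job interviews, and more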
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
We want to create a repo to illustrate the usage of transformers in Chinese
A robust web archive analytics toolkit
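Deep Learning 500 Questions: explains, in question-and-answer form, common questions on probability, linear algebra, machine learning, deep learning, computer vision, and other hot topics, to help the author and any readers who need it. The book has 18 chapters and more than 500,000 characters. Given the author's limited expertise, readers are kindly asked to point out any errors. To be continued... For collaboration, contact [email protected]. All rights reserved; infringement will be prosecuted. Tan 2018.06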
[EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which achieves up to 20x compression with minimal performance loss.
a unified scheduler for online and offline tasks
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
⭐️ NLP Algorithms with transformers lib. Supporting Text-Classification, Text-Generation, Information-Extraction, Text-Matching, RLHF, SFT etc.
Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"
BetterAndBetter is a macOS app with many features.
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
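Chinese and English sensitive words, language detection, Chinese and international mobile/landline number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation database, character-decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English imitation of Chinese pronunciation, Wang Feng lyrics generator, job title lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, continuous English text segmentation, various Chinese word vectors, company name collection, classical poetry database, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, …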
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Retrieval and Retrieval-augmented LLMs
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.