LLM
Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
Let ChatGPT teach your own chatbot in hours with a single GPU!
Examples and guides for using the OpenAI API
🦜🔗 Build context-aware reasoning applications
LlamaIndex is a data framework for your LLM applications
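text2vec, text to vector. A text embedding toolkit that converts text into vector matrices; it implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, ready to use out of the box.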
Chatbot for documentation that allows you to chat with your data. Privately deployable, it provides AI knowledge sharing and integrates knowledge into your AI workflow.
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
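Use ChatGPT to summarize arXiv papers. Accelerates the full research workflow: uses ChatGPT for full-paper summarization, professional translation, polishing, reviewing, and review responses.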
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Label Studio is a multi-type data labeling and annotation tool with standardized output format
We want to create a repo to illustrate the usage of transformers in Chinese.
An introductory LLM tutorial for developers; the Chinese edition of Andrew Ng's large language model course series.
HuggingLLM, Hugging Future.
A guidance language for controlling large language models.
Library for fast text representation and classification.
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm series models)
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
Prompt framework made to optimise conversations with LLMs.
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering