All about large language models
- my-alpaca: reproduce Alpaca
- multi-turn-alpaca: train Alpaca on multi-turn dialogue datasets
- alpaca-rlhf: train multi-turn Alpaca with RLHF (Reinforcement Learning from Human Feedback), based on DeepSpeed Chat
- my-autocrit: experiments with autocrit
- try-large-models: experiments with various large models
- my-rl: learn reinforcement learning with Tianshou
- 2023-Challenges and Applications of Large Language Models [paper]
- 2023-A Survey of Large Language Models [paper]
- Blog
- Paper
- Rotary
- ALiBi [paper]
- Survey
- RMSNorm
- Layer Normalization
- Pre-LN
- Post-LN
- Sandwich-LN
- DeepNorm
- SwiGLU (RoPE, RMSNorm, and SwiGLU are sketched in code after this group)
- GeLUs
- Swish
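Below is a minimal PyTorch sketch of three of the building blocks listed above: rotary position embeddings (RoPE, half-split pairing), RMSNorm, and a SwiGLU feed-forward block in Pre-LN ordering. Dimensions and hyperparameters are illustrative only, not taken from any particular model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotary position embedding: rotate each channel pair by a position-dependent
    angle so attention scores depend on relative position (half-split pairing)."""
    _, t, d = x.shape
    half = d // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)      # [half]
    angles = torch.arange(t, dtype=torch.float32)[:, None] * freqs[None, :]  # [t, half]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)


class RMSNorm(nn.Module):
    """Root-mean-square norm: rescale by the RMS of the activations with a
    learned gain; unlike LayerNorm, no mean subtraction and no bias."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps) * self.weight


class SwiGLU(nn.Module):
    """Gated feed-forward block: SiLU(x W) gates x V, then projects back down."""

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.w = nn.Linear(dim, hidden, bias=False)   # gate projection
        self.v = nn.Linear(dim, hidden, bias=False)   # value projection
        self.out = nn.Linear(hidden, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.out(F.silu(self.w(x)) * self.v(x))


if __name__ == "__main__":
    x = torch.randn(2, 16, 512)
    q = rope(x)                                  # RoPE is applied to queries/keys inside attention
    h = x + SwiGLU(512, 1376)(RMSNorm(512)(x))   # Pre-LN FFN sub-block with residual
    print(q.shape, h.shape)
```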
- BPE [paper] (toy merge loop sketched below)
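A toy version of the BPE merge loop from the paper above: count adjacent symbol pairs and repeatedly merge the most frequent pair. The vocabulary and the number of merges are illustrative; production tokenizers implement the same idea far more efficiently.

```python
from collections import Counter


def get_pair_counts(vocab: dict) -> Counter:
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(pair: tuple, vocab: dict) -> dict:
    """Replace every occurrence of `pair` with the merged symbol.
    Naive string replace is fine for this toy example."""
    old = " ".join(pair)
    new = "".join(pair)
    return {word.replace(old, new): freq for word, freq in vocab.items()}


# Words are pre-split into characters, with an end-of-word marker </w>.
vocab = {"l o w </w>": 5, "l o w e r </w>": 2,
         "n e w e s t </w>": 6, "w i d e s t </w>": 3}
for _ in range(10):
    pairs = get_pair_counts(vocab)
    best = max(pairs, key=pairs.get)
    vocab = merge_pair(best, vocab)
    print("merged:", best)
```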
- 2020-Scaling Laws for Neural Language Models [paper]
- T0
- FLAN
- Flan-LM
- BLOOMZ & mT0
- ChatGPT
- Alpaca: A Strong, Replicable Instruction-Following Model
- Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality
- Koala: A Dialogue Model for Academic Research
- alpaca-lora
- ChatGLM-6B
- Firefly
- thai-buffala-lora-7b-v0-1
- multi-turn-alpaca
- Open-Assistant
- Chinese-ChatLLaMA
- BELLE
- Chinese-LLaMA-Alpaca
- Luotuo-Chinese-LLM
- Chinese-Vicuna
- Chinese-alpaca-lora
- Japanese-Alpaca-LoRA
- 2023-ChatDoctor: A medical chat model fine-tuned on LLaMA model using medical domain knowledge
- HuaTuo (华驼): a LLaMA model fine-tuned with Chinese medical knowledge
- LawGPT_zh (獬豸): a large language model for the Chinese legal domain
- 2023-Recalpaca: Low-rank LLaMA instruct-tuning for recommendation
- 2023-A Survey of Domain Specialization for Large Language Models [paper]
- 2017-Proximal Policy Optimization Algorithms [paper] (see the GAE/PPO sketch after this group)
- 2016-Asynchronous methods for deep reinforcement learning [paper]
- 2015-High-dimensional continuous control using generalized advantage estimation [paper]
- 2015-mlr-Trust Region Policy Optimization [paper]
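PPO and GAE above are the reinforcement-learning backbone of most RLHF pipelines. A compact, single-trajectory sketch of generalized advantage estimation and the PPO clipped surrogate loss follows; it is a toy, not a full trainer.

```python
import torch


def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one trajectory.
    rewards: [T], values: [T+1] (bootstrap value appended)."""
    T = rewards.shape[0]
    adv = torch.zeros(T)
    last = 0.0
    for t in reversed(range(T)):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD residual
        last = delta + gamma * lam * last                        # exponentially weighted sum
        adv[t] = last
    return adv


def ppo_loss(logp_new, logp_old, adv, clip=0.2):
    """Clipped surrogate: keep the policy ratio within [1 - clip, 1 + clip]."""
    ratio = torch.exp(logp_new - logp_old)
    unclipped = ratio * adv
    clipped = torch.clamp(ratio, 1 - clip, 1 + clip) * adv
    return -torch.min(unclipped, clipped).mean()


rewards = torch.tensor([0.0, 0.0, 1.0])
values = torch.tensor([0.1, 0.2, 0.5, 0.0])  # includes the bootstrap value
adv = gae(rewards, values)
print(ppo_loss(torch.randn(3), torch.randn(3), adv))  # random log-probs, for illustration
```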
- 2023-Reward Design with Language Models [paper]
- 2022-Scaling Laws for Reward Model Overoptimization [paper]
- autocrit
- 2023-On The Fragility of Learned Reward Functions [paper]
- 2021-LoRA- Low-Rank Adaptation of Large Language Models [paper]
- 2023-RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment [paper]
- 2023-Preference Ranking Optimization for Human Alignment [paper]
- 2023-Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization [paper]
- 2023-Fine-Grained Human Feedback Gives Better Rewards for Language Model Training [paper]
- 2023-Chain of Hindsight Aligns Language Models with Feedback [paper]
- 2023-Training Socially Aligned Language Models in Simulated Human Society [paper]
- 2023-Let’s Verify Step by Step [paper]
- 2023-The False Promise of Imitating Proprietary LLMs [paper]
- 2023-AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback [paper]
- 2023-LIMA- Less Is More for Alignment [paper]
- 2023-RRHF: Rank Responses to Align Language Models with Human Feedback without tears [paper] [code]
- 2022-Solving math word problems with process- and outcome-based feedback [paper]
- 2022-Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback [paper]
- 2022-Training language models to follow instructions with human feedback [paper]
- 2022-Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned [paper]
- 2022-LaMDA- Language Models for Dialog Applications [paper]
- 2022-Constitutional AI- Harmlessness from AI Feedback [paper]
- 2021-A general language assistant as a laboratory for alignment [paper]
- 2021-Ethical and social risks of harm from language models [paper]
- 2020-nips-Learning to summarize from human feedback [paper]
- 2019-Fine-Tuning Language Models from Human Preferences [paper]
- 2018-Scalable agent alignment via reward modeling: a research direction [paper]
- Reinforcement Learning for Language Models [blog]
- 2017-nips-Deep reinforcement learning from human preferences [paper] (see the reward-model loss sketch after this group)
- 2016-Concrete problems in AI safety [paper]
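Several papers in this group train a reward model on pairwise human preferences. Below is a minimal sketch of the standard Bradley-Terry-style loss (as used in, e.g., "Deep reinforcement learning from human preferences" and "Learning to summarize from human feedback"); the scores here are hypothetical reward-model outputs.

```python
import torch
import torch.nn.functional as F


def preference_loss(chosen_rewards: torch.Tensor, rejected_rewards: torch.Tensor) -> torch.Tensor:
    """-log sigmoid(r_chosen - r_rejected), averaged over the batch:
    pushes the reward of the human-preferred response above the other."""
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()


chosen = torch.tensor([1.2, 0.3])    # reward-model scores for preferred responses
rejected = torch.tensor([0.4, 0.9])  # scores for dispreferred responses
print(preference_loss(chosen, rejected))
```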
- 2022-naacl-MetaICL- Learning to Learn In Context [paper]
- 2022-iclr-Multitask Prompted Training Enables Zero-Shot Task Generalization [paper]
- 2023-Tree of Thoughts: Deliberate Problem Solving with Large Language Models [paper]
- 2023-Guiding Large Language Models via Directional Stimulus Prompting [paper]
- 2023-ICLR-Self-Consistency Improves Chain of Thought Reasoning in Language Models [paper] (see the majority-vote sketch after this group)
- 2023-Is Prompt All You Need? No. A Comprehensive and Broader View of Instruction Learning [paper]
- 2021-Pre-train, Prompt, and Predict- A Systematic Survey of Prompting Methods in Natural Language Processing [paper]
- 2023-Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition [paper]
- 2022-ACL-PPT- Pre-trained Prompt Tuning for Few-shot Learning [paper]
- 2022-ACL-P-Tuning- Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks [paper]
- 2021-EMNLP-The Power of Scale for Parameter-Efficient Prompt Tuning [paper]
- 2021-acl-Prefix-Tuning- Optimizing Continuous Prompts for Generation [paper]
- 2021-GPT Understands, Too [paper]
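A sketch of the self-consistency idea from the ICLR 2023 paper above: sample several reasoning paths at nonzero temperature and majority-vote over the final answers. `sample_answer` is a hypothetical stand-in for an actual LLM call.

```python
from collections import Counter
import random


def sample_answer(question: str) -> str:
    # Placeholder for: sample one chain-of-thought completion (temperature > 0)
    # and extract its final answer.
    return random.choice(["42", "42", "41"])


def self_consistent_answer(question: str, n_samples: int = 10) -> str:
    """Majority vote over sampled final answers."""
    votes = Counter(sample_answer(question) for _ in range(n_samples))
    return votes.most_common(1)[0][0]


print(self_consistent_answer("What is 6 * 7?"))
```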
- 2023-OpenAGI: When LLM Meets Domain Experts [paper]
- 2023-WebCPM: Interactive Web Search for Chinese Long-form Question Answering [paper]
- 2023-Evaluating Verifiability in Generative Search Engines [paper]
- 2023-Enabling Large Language Models to Generate Text with Citations [paper]
- 2022-ACL-Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora [paper]
- 2022-findings-acl-ELLE: Efficient Lifelong Pre-training for Emerging Data [paper]
- langchain
- GitHub
- 2023-Check Your Facts and Try Again- Improving Large Language Models with External Knowledge and Automated Feedback [paper]
- 2022-Teaching language models to support answers with verified quotes
- 2021-WebGPT: Browser-assisted question-answering with human feedback [paper]
- 2021-Improving language models by retrieving from trillions of tokens
- 2020-REALM: retrieval-augmented language model pre-training
- 2020-Retrieval-augmented generation for knowledge-intensive NLP tasks (see the retrieval sketch below)
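The retrieval-augmented papers above share one basic loop: embed the query, retrieve the top-k passages, and condition generation on them. A toy sketch follows; `embed` and `generate` are hypothetical stand-ins for real model calls.

```python
import numpy as np


def embed(text: str) -> np.ndarray:
    # Toy deterministic "embedding"; a real system uses a trained encoder.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(64)


def retrieve(query: str, corpus: list, k: int = 2) -> list:
    """Rank passages by dot-product similarity to the query embedding."""
    q = embed(query)
    scored = sorted(corpus, key=lambda p: -float(embed(p) @ q))
    return scored[:k]


def generate(prompt: str) -> str:
    return f"[LLM completion conditioned on: {prompt[:60]}...]"  # placeholder


corpus = ["Passage about RLHF.", "Passage about BPE.", "Passage about LoRA."]
question = "How does LoRA work?"
context = "\n".join(retrieve(question, corpus))
print(generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"))
```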
- RedPajama-Data
- C4
- Pile
- ROOTS
- Wudao Corpora
- Large Scale Chinese Corpus for NLP (大规模中文自然语言处理语料)
- CSL: A Large-scale Chinese Scientific Literature Dataset
- Chinese book corpus collection
- Chinese Open Instruction Generalist (COIG)
- Medical datasets
- Financial data
- ChatAlpaca
- InstructionZoo
- FlagInstruct
- fnlp/moss-002-sft-data
- SuperCLUE: a comprehensive evaluation benchmark for general-purpose Chinese large models
- Open LLMs benchmark: a plan for standardized evaluation of large-model capabilities
- PromptCBLUE: an evaluation benchmark for Chinese medical large models
- GLUE, SuperGLUE, SQuAD, CoQA, WMT, LAMBADA, ROUGE, CUGE (the BAAI Index), MMLU, HellaSwag, OpenBookQA, ARC, TriviaQA, TruthfulQA
- 2023-A Pretrainer’s Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity [paper]
- 2023-DoReMi: Optimizing data mixtures speeds up language model pretraining
- 2023-Data selection for language models via importance resampling
- 2022-Self-Instruct- Aligning Language Models with Self-Generated Instructions [paper]
- 2022-acl-Deduplicating training data makes language models better [paper] (toy near-duplicate filter below)
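A toy near-duplicate filter in the spirit of the deduplication paper above, using character n-gram Jaccard similarity as a stand-in for the paper's suffix-array and MinHash machinery.

```python
def ngrams(text: str, n: int = 5) -> set:
    """Set of character n-grams of the text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a: str, b: str) -> float:
    A, B = ngrams(a), ngrams(b)
    return len(A & B) / max(1, len(A | B))


def dedup(docs: list, threshold: float = 0.8) -> list:
    """Keep a document only if it is not too similar to any kept document."""
    kept = []
    for doc in docs:
        if all(jaccard(doc, k) < threshold for k in kept):
            kept.append(doc)
    return kept


docs = ["the cat sat on the mat", "the cat sat on the mat!", "a completely different line"]
print(dedup(docs))  # drops the near-duplicate second document
```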
- 2023-findings-acl-Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation [paper]
- 2023-Harnessing the Power of LLMs in Practice- A Survey on ChatGPT and Beyond [paper]
- 2023-INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models [paper]
- LLMZoo: a project that provides data, models, and evaluation benchmarks for large language models
- 2023-Evaluating ChatGPT's Information Extraction Capabilities- An Assessment of Performance, Explainability, Calibration, and Faithfulness [paper]
- 2023-Towards Better Instruction Following Language Models for Chinese- Investigating the Impact of Training Data and Evaluation [paper]
- PandaLM
- lm-evaluation-harness
- BIG-bench
- 2023-HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models [paper]
- 2023-C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models [paper]
- 2023-Safety Assessment of Chinese Large Language Models [paper]
- 2022-Holistic Evaluation of Language Models [paper]
- helpfulness
- honesty
- harmlessness
- truthfulness
- robustness
- Bias, Toxicity and Misinformation
- Existing evaluations usually cover only common, established NLP tasks; countless other tasks, such as writing emails, go unevaluated
- Pythia: Interpreting Autoregressive Transformers Across Time and Scale
- 2023-Inspecting and Editing Knowledge Representations in Language Models [paper]
- ChatGPT
- ERNIE Bot (文心一言)
- Tongyi Qianwen (通义千问)
- AgentGPT
- HuggingGPT
- AutoGPT
- MiniGPT-4
- ShareGPT
- Character.AI
- LLaVA
- Video-LLaMA
- ChatPaper
- 2023-AnnoLLM- Making Large Language Models to Be Better Crowdsourced Annotators [paper]
- 2022-Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks [paper]
- 2023-Sentiment Analysis in the Era of Large Language Models- A Reality Check [paper] [GitHub]
- 2023-Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT
- 2023-Is ChatGPT a Good Sentiment Analyzer? A Preliminary Study
- 2023-LLMs to the Moon? Reddit Market Sentiment Analysis with Large Language Models
- 2023-Is GPT-3 a Good Data Annotator? [paper]
- 2022-Language models in the loop: Incorporating prompting into weak supervision [paper]
- 2023-Unifying Large Language Models and Knowledge Graphs: A Roadmap [paper]
- 2020-ICLR-Neural text generation with unlikelihood training [paper]
- 2021-findings-emnlp-GeDi- Generative Discriminator Guided Sequence Generation [paper]
- 2021-ACL-DExperts- Decoding-Time Controlled Text Generation with Experts and Anti-Experts [paper]
- 2021-ICLR-Mirostat- a neural text decoding algorithm that directly controls perplexity [paper]
- 2022-NIPS-A Contrastive Framework for Neural Text Generation [paper]
- 2022-ACL-Length Control in Abstractive Summarization by Pretraining Information Selection [paper]
- 2020-Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation [paper]
- 2023-ICLR-GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers [paper]
- 2023-QLORA: Efficient Finetuning of Quantized LLMs [paper] (see the quantization + LoRA sketch below)
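A toy illustration of the QLoRA recipe above: keep the base weight frozen in quantized form and train only a low-rank LoRA update (cf. the LoRA paper earlier in this list). Assumptions: symmetric per-tensor int8 absmax quantization stands in for the paper's 4-bit NormalFloat, with no double quantization or paged optimizers.

```python
import torch
import torch.nn as nn


def quantize_absmax_int8(w: torch.Tensor):
    """Symmetric per-tensor int8 quantization: scale by max |w|."""
    scale = w.abs().max() / 127.0
    q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
    return q, scale


class QLoRALinear(nn.Module):
    def __init__(self, w: torch.Tensor, r: int = 8, alpha: int = 16):
        super().__init__()
        q, scale = quantize_absmax_int8(w)
        self.register_buffer("q_weight", q)  # frozen base weight, stored in int8
        self.scale_w = scale
        out_dim, in_dim = w.shape
        self.A = nn.Parameter(torch.randn(r, in_dim) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(out_dim, r))        # up-projection, zero-init
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.q_weight.float() * self.scale_w  # dequantize on the fly
        # base output + low-rank update; at init B = 0, so behavior matches the base
        return x @ w.T + self.scale * (x @ self.A.T @ self.B.T)


layer = QLoRALinear(torch.randn(256, 128))
print(layer(torch.randn(4, 128)).shape)  # torch.Size([4, 256])
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # only A and B train
```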
- How to add external knowledge to GPT/LLM models?
- How to correctly reproduce InstructGPT / RLHF?
- How much does ChatGPT improve over SOTA on individual NLP dataset tasks?
- 10 key tricks that affect PPO performance (with a concise PyTorch implementation of PPO)
- A new research direction: an overview of preference-based reinforcement learning
- Why is it so hard to train large models?
- How to develop small models on top of large deep learning models, and how to combine large and small models?
- open-llms: a list of open LLMs available for commercial use
- safe-rlhf
- Awesome-Multimodal-Large-Language-Models
- Awesome-Chinese-LLM