Stars
Chat凉宫春日 (Chat-Haruhi-Suzumiya): an open-source role-playing chatbot, by Cheng Li, Ziang Leng, and others.
AI novel writing: generates Chinese web fiction such as fantasy and romance. A Chinese pretrained generative model built on my RWKV model, similar to GPT-2. RWKV for Chinese novel generation.
[ECCV2024] Official inference code for the papers "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Mu…
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization…
Sequence-parallel attention for long-context LLM training and inference.
📰 Must-read papers and blogs on LLM-based Long Context Modeling 🔥
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Gemma 2B with 10M context length using Infini-attention.
A personal reimplementation of Google's Infini-Transformer, using a small 2B model. The project includes both model and training code.
An unofficial PyTorch implementation of 'Efficient Infinite Context Transformers with Infini-attention'.
PyTorch implementation of Efficient Infinite Context Transformers with Infini-attention, plus a QwenMoE implementation, a training script, and 1M-context passkey retrieval.
A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Implementation of Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Unofficial PyTorch/🤗 Transformers (Gemma/Llama3) implementation of Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143); a sketch of the mechanism appears after this list.
A repository sharing the literature on long-context large language models, including methodologies and evaluation benchmarks.
A curated list of open-source Chinese large language models, focusing on smaller models that can be privately deployed at lower training cost, covering base models, vertical-domain fine-tuning and applications, datasets, tutorials, and more.
FinGLM: an open, public-interest, long-term financial large-model project that leverages open source to advance "AI + finance".
The paper list accompanying the 86-page survey "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
Firefly: a training tool for large language models, supporting Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models.
This repo contains the syllabus of the Hugging Face Deep Reinforcement Learning Course.
[TMLR 2024] Efficient Large Language Models: A Survey
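Several of the Infini-attention repos starred above implement the same core mechanism from "Leave No Context Behind" (arXiv:2404.07143): each attention layer keeps a fixed-size compressive memory that is read via linear attention, blended with ordinary local attention through a learned gate, and updated segment by segment. The sketch below (PyTorch) shows the linear-update variant; the function name infini_attention_segment, the tensor layout, and the epsilon guard are illustrative assumptions, not code from any listed repo.

import torch
import torch.nn.functional as F

def elu_plus_one(x):
    # Non-negative feature map sigma(x) = ELU(x) + 1 used by the paper
    return F.elu(x) + 1.0

def infini_attention_segment(q, k, v, memory, z, beta):
    # q, k, v: (batch, heads, seg_len, head_dim)
    # memory:  (batch, heads, head_dim, head_dim)  compressive memory M
    # z:       (batch, heads, head_dim, 1)         normalization term
    # beta:    (heads,)                            learned mixing gate
    sq, sk = elu_plus_one(q), elu_plus_one(k)

    # Retrieve from the memory built over previous segments:
    # A_mem = sigma(Q) M / (sigma(Q) z)
    a_mem = (sq @ memory) / (sq @ z + 1e-6)

    # Ordinary causal dot-product attention within the current segment
    a_local = F.scaled_dot_product_attention(q, k, v, is_causal=True)

    # Blend the two streams with a learned per-head scalar gate
    g = torch.sigmoid(beta).view(1, -1, 1, 1)
    out = g * a_mem + (1.0 - g) * a_local

    # Update memory and normalizer with this segment's keys and values:
    # M <- M + sigma(K)^T V ;  z <- z + sum over positions of sigma(K)
    memory = memory + sk.transpose(-2, -1) @ v
    z = z + sk.sum(dim=2).unsqueeze(-1)
    return out, memory, z

# Usage sketch: four 256-token segments give 1024 tokens of effective
# context while attention cost stays bounded per segment.
b, h, d, seg = 1, 8, 64, 256
memory = torch.zeros(b, h, d, d)
z = torch.zeros(b, h, d, 1)
beta = torch.zeros(h)
for _ in range(4):
    q, k, v = (torch.randn(b, h, seg, d) for _ in range(3))
    out, memory, z = infini_attention_segment(q, k, v, memory, z, beta)

Because the memory is a fixed-size (head_dim x head_dim) matrix per head, the per-segment cost stays constant no matter how many segments have been processed, which is what lets these repos advertise million-token context windows.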