Python 150 9 Updated May 31, 2024

Chat凉宫春日 (Chat-Haruhi-Suzumiya), an open-source role-playing chatbot by Cheng Li, Ziang Leng, and others.

Jupyter Notebook 1,723 154 Updated Apr 4, 2024

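AI novel writing: generates xuanhuan (fantasy), romance, and other web fiction. A Chinese pretrained generative model built on my RWKV model, which is similar to GPT-2. RWKV for Chinese novel generation.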

Python 2,852 511 Updated Sep 17, 2023

[ECCV2024] This is the official inference code for the papers "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Mu…

Jupyter Notebook 455 18 Updated Jul 13, 2024

Explores the limits of AI novel-writing ability using large language models (LLMs) and multi-agent systems.

Python 53 12 Updated May 13, 2024
Python 182 31 Updated May 3, 2024

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…

Python 1,699 278 Updated Aug 2, 2024

Sequence Parallel Attention for Long Context LLM Model Training and Inference

Python 244 9 Updated Jun 27, 2024

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

679 28 Updated Jul 24, 2024

LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.

Python 1,443 87 Updated Nov 7, 2023

[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

Python 566 56 Updated Jun 1, 2024

Gemma 2B with 10M context length using Infini-attention.

Python 911 57 Updated May 12, 2024

LoRA fine-tuning of the chatglm3-6b model.

Python 67 11 Updated Apr 18, 2024

This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and training code.

Python 49 5 Updated Apr 20, 2024
Jupyter Notebook 3 Updated Apr 22, 2024

An unofficial PyTorch implementation of 'Efficient Infinite Context Transformers with Infini-attention'

Python 28 4 Updated Jul 21, 2024

Efficient Infinite Context Transformers with Infini-attention Pytorch Implementation + QwenMoE Implementation + Training Script + 1M context keypass retrieval

Python 54 5 Updated May 9, 2024
Dockerfile 10 Updated Jul 13, 2024

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Python 28,312 3,472 Updated Aug 2, 2024

Implementation of Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Jupyter Notebook 2 1 Updated Apr 25, 2024

Unofficial PyTorch/🤗Transformers(Gemma/Llama3) implementation of Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Python 317 27 Updated Apr 23, 2024

PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143)

Python 258 22 Updated May 4, 2024

A repository sharing the literature on long-context large language models, including methodologies and evaluation benchmarks

Jupyter Notebook 229 11 Updated Jul 30, 2024

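A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at low cost, covering base models, domain-specific fine-tunes and applications, datasets, and tutorials.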

13,895 1,277 Updated Jul 21, 2024

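FinGLM: dedicated to building an open, non-profit, and long-lasting financial large-model project, using open source and openness to advance "AI + finance".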

HTML 1,636 240 Updated May 8, 2024

The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

5,904 354 Updated Jul 28, 2024

Firefly: a large-model training tool that supports training Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 5,394 486 Updated Jul 16, 2024

This repo contains the syllabus of the Hugging Face Deep Reinforcement Learning Course.

MDX 3,761 574 Updated Jul 15, 2024

[TMLR 2024] Efficient Large Language Models: A Survey

885 75 Updated Aug 2, 2024