Stars
Chat凉宫春日 (Chat-Haruhi-Suzumiya): an open-source role-playing chatbot, by Cheng Li, Ziang Leng, and others.
AI novel writing: generates Chinese web fiction such as fantasy and romance. A Chinese pretrained generative model built on my RWKV model, similar to GPT-2. RWKV for Chinese novel generation.
[ECCV2024] Official inference code for the papers "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Mu…
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization…
Sequence-parallel attention for long-context LLM training and inference.
📰 Must-read papers and blogs on LLM-based Long Context Modeling 🔥
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Gemma 2B with 10M context length using Infini-attention.
A personal reimplementation of Google's Infini-Transformer, using a small 2B model. The project includes both model and training code.
An unofficial PyTorch implementation of 'Efficient Infinite Context Transformers with Infini-attention'.
PyTorch implementation of Efficient Infinite Context Transformers with Infini-attention, plus a QwenMoE implementation, a training script, and 1M-context passkey retrieval.
A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Implementation of Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Unofficial PyTorch/🤗 Transformers (Gemma/Llama3) implementation of Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143); a sketch of the mechanism appears after this list.
A repository sharing the literature on long-context large language models, including methodologies and evaluation benchmarks.
A curated list of open-source Chinese large language models, focusing on smaller models that can be privately deployed at lower training cost, covering base models, vertical-domain fine-tuning and applications, datasets, tutorials, and more.
FinGLM: an open, public-interest, long-term financial large-model project that leverages open source to advance "AI + finance".
The paper list accompanying the 86-page survey "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
Firefly: a training tool for large language models, supporting Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models.
This repo contains the syllabus of the Hugging Face Deep Reinforcement Learning Course.
[TMLR 2024] Efficient Large Language Models: A Survey
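Several of the Infini-attention repos starred above implement the same core mechanism from "Leave No Context Behind" (arXiv:2404.07143): each attention layer keeps a fixed-size compressive memory that is read via linear attention, blended with ordinary local attention through a learned gate, and updated segment by segment. The sketch below (PyTorch) shows the linear-update variant; the function name infini_attention_segment, the tensor layout, and the epsilon guard are illustrative assumptions, not code from any listed repo.

import torch
import torch.nn.functional as F

def elu_plus_one(x):
    # Non-negative feature map sigma(x) = ELU(x) + 1 used by the paper
    return F.elu(x) + 1.0

def infini_attention_segment(q, k, v, memory, z, beta):
    # q, k, v: (batch, heads, seg_len, head_dim)
    # memory:  (batch, heads, head_dim, head_dim)  compressive memory M
    # z:       (batch, heads, head_dim, 1)         normalization term
    # beta:    (heads,)                            learned mixing gate
    sq, sk = elu_plus_one(q), elu_plus_one(k)

    # Retrieve from the memory built over previous segments:
    # A_mem = sigma(Q) M / (sigma(Q) z)
    a_mem = (sq @ memory) / (sq @ z + 1e-6)

    # Ordinary causal dot-product attention within the current segment
    a_local = F.scaled_dot_product_attention(q, k, v, is_causal=True)

    # Blend the two streams with a learned per-head scalar gate
    g = torch.sigmoid(beta).view(1, -1, 1, 1)
    out = g * a_mem + (1.0 - g) * a_local

    # Update memory and normalizer with this segment's keys and values:
    # M <- M + sigma(K)^T V ;  z <- z + sum over positions of sigma(K)
    memory = memory + sk.transpose(-2, -1) @ v
    z = z + sk.sum(dim=2).unsqueeze(-1)
    return out, memory, z

# Usage sketch: four 256-token segments give 1024 tokens of effective
# context while attention cost stays bounded per segment.
b, h, d, seg = 1, 8, 64, 256
memory = torch.zeros(b, h, d, d)
z = torch.zeros(b, h, d, 1)
beta = torch.zeros(h)
for _ in range(4):
    q, k, v = (torch.randn(b, h, seg, d) for _ in range(3))
    out, memory, z = infini_attention_segment(q, k, v, memory, z, beta)

Because the memory is a fixed-size (head_dim x head_dim) matrix per head, the per-segment cost stays constant no matter how many segments have been processed, which is what lets these repos advertise million-token context windows.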