Stars
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Making large AI models cheaper, faster and more accessible
[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs
The hub for EleutherAI's work on interpretability and learning dynamics
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
The official code of the EMNLP 2022 paper "SCROLLS: Standardized CompaRison Over Long Language Sequences".
Doing simple retrieval from LLMs at various context lengths to measure accuracy
[ACL 2024] LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
EleutherAI / DeeperSpeed
Forked from microsoft/DeepSpeed. DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
A MAD laboratory to improve AI architecture designs 🧪
CoreNet: A library for training deep neural networks
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
Understand and test language model architectures on synthetic tasks.
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
A framework for few-shot evaluation of language models.
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference,…
Official implementation of "Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring the Design of Next-generation Neuromorphic Chips" (ICLR 2024)
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (V…
Official implementation of "Spike-driven Transformer" (NeurIPS 2023)
Implementation of "SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks"
SpikingJelly is an open-source deep learning framework for Spiking Neural Networks (SNNs), based on PyTorch.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Datasets, Transforms and Models specific to Computer Vision