Stars
Tools to download and clean up Common Crawl data
I used time series analysis to predict the behavior and patterns of passengers at a bus stop. Data visualizations include time-series plots.
Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference,…
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …
📚 A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its application
A latent text-to-image diffusion model
Open-Sora: Democratizing Efficient Video Production for All
High-Resolution Image Synthesis with Latent Diffusion Models
Fetch papers from https://papers.cool/, and sort them by the number of clicks from AI researchers.
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
This repository provides the code and model checkpoints of the research paper: Scalable Pre-training of Large Autoregressive Image Models
Official inference library for Mistral models
Emu Series: Generative Multimodal Models from BAAI
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
[ACL 2024 Demo] Official GitHub repo for UltraEval: An open source framework for evaluating foundation models.
A high-throughput and memory-efficient inference and serving engine for LLMs
microsoft / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
A framework for few-shot evaluation of language models.
The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
JavaScript BPE Tokenizer Encoder Decoder for OpenAI's GPT-2 / GPT-3 / GPT-4 / GPT-4o. Port of OpenAI's tiktoken with additional features.