Starred repositories
LLaVA-JP is a Japanese VLM trained with the LLaVA method
World modeling challenge for humanoid robots
Open-MAGVIT2: Democratizing Autoregressive Visual Generation
[CVPR 2024] On the Content Bias in Fréchet Video Distance
commaVQ is a dataset of compressed driving video
A Generalizable World Model for Autonomous Driving
Code to blur human faces and vehicle license plates in videos and images using the SoTA object detection model YOLOv8
Implementation of MagViT2 Tokenizer in Pytorch
Stable Video Diffusion Training Code and Extensions.
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
A collection of papers on world models for autonomous driving.
"Improving Mathematical Reasoning with Process Supervision" by OpenAI
Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Generation - in Pytorch
Image Generation using VQVAE and GPT Models
Open-Sora: Democratizing Efficient Video Production for All
A curated list of foundation models for vision and language tasks
[IEEE T-PAMI] All you need for End-to-end Autonomous Driving
OpenMMLab Detection Toolbox and Benchmark
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Overview of Japanese LLMs
A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://codellama.h2o.ai/
A Supervised and Semi-Supervised Object Detection Library for YOLO Series
Compiler for LightGBM gradient-boosted trees, based on LLVM. Speeds up prediction by ≥10x.
WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)
SAM with text prompt