- UCLA
- Los Angeles, California
- https://evelinehong.github.io
- @yining_hong
Stars
Auto1111 extension implementing text2video diffusion models (like ModelScope or VideoCrafter) using only Auto1111 webui dependencies
Dynamic Thresholding (CFG Scale Fix) for Stable Diffusion (SwarmUI, ComfyUI, and Auto WebUI)
[ECCV 2024] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Stable Video Diffusion Training Code and Extensions.
MotionDirector: Motion Customization of Text-to-Video Diffusion Models.
Official repo for VGen: a holistic video generation ecosystem built on diffusion models
Finetune ModelScope's Text To Video model using Diffusers 🧨
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
Official implementation for CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding
Official code release for ConceptGraphs
Code for 3D-LLM: Injecting the 3D World into Large Language Models
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
evelinehong / 3D-CLR-Official
Forked from zsh2000/3D-CLR
[CVPR 2023] Code for "3D Concept Learning and Reasoning from Multi-View Images"
Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"
[ICCV'23 Workshop] SAM3D: Segment Anything in 3D Scenes
An open-source framework for training large multimodal models.
Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV 2023)
Code and documentation to train Stanford's Alpaca models, and generate the data.
gradslam is an open source differentiable dense SLAM library for PyTorch
Code Release of "3D Concept Grounding on Neural Fields (NeurIPS2022)"
Implementation of the DeepMind Flamingo vision-language model, based on Hugging Face language models and ready for training
Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of DeepMind, in PyTorch
🔥Urban-scale point cloud dataset (CVPR 2021 & IJCV 2022)
Direct voxel grid optimization for fast radiance field reconstruction.