- Shanghai Jiao Tong University
- Shanghai, China ↔️ Fuzhou, China
- https://tonyfang.net/
- @FangGalaxies
Stars
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Controllable video and image generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
Theia: Distilling Diverse Vision Foundation Models for Robot Learning
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis (ECCV 2024 Oral) - Official Implementation
TensorRT implementation of Depth-Anything V1, V2
[IROS 2024] 📈 RISE: 3D Perception Makes Real-World Robot Imitation Simple and Effective
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
Pre-training Reusable Representations for Robotic Manipulation Using Diverse Human Video Data
Official repository for "VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training"
The repository for the largest and most comprehensive empirical study of visual foundation models for Embodied AI (EAI).
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Source code for Twitter's Recommendation Algorithm
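A practical interaction interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins, project analysis and self-explanation for Python, C++, and other codebases, PDF/LaTeX paper translation and summarization, parallel querying of multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…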
RDK (Robotic Development Kit) for Flexiv robots. Supports C++ and Python. Compatible with Linux, macOS, and Windows.
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Robot bimanual manipulation / dual-arm manipulation
A playbook for systematically maximizing the performance of deep learning models.
[ECCV2022] Masked Autoencoders for Point Cloud Self-supervised Learning
Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Generation - in Pytorch
[WACV 2023] XNeRF: Explicit Neural Radiance Field for Multi-Scene 360° Insufficient RGB-D Views
Denoising Diffusion Probabilistic Models
code for the SE3 Transformers paper: https://arxiv.org/abs/2006.10503
[ICLR 2023] Equivariant Descriptor Fields: SE(3)-Equivariant Energy-Based Models for End-to-End Visual Robotic Manipulation Learning