RT-X

PyTorch implementation of the RT-1-X and RT-2-X models from the paper "Open X-Embodiment: Robotic Learning Datasets and RT-X Models".

Both model architectures are implemented and exposed in code as the classes RTX1 and RTX2.

Paper Link

Appreciation

Lucidrains
Agorians

Install

pip install rtx-torch

Usage

RTX1 takes text instructions and videos as input:

import torch
from rtx.rtx1 import RTX1

model = RTX1()

# dummy video batch; the shape is assumed to be (batch, channels, frames, height, width)
video = torch.randn(2, 3, 6, 224, 224)

instructions = ["bring me that apple sitting on the table", "please pass the butter"]

# compute the train logits
train_logits = model.train(video, instructions)

# set the model to evaluation mode
model.model.eval()

# compute the eval logits with a conditional scale of 3
eval_logits = model.run(video, instructions, cond_scale=3.0)
print(eval_logits.shape)
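
The `cond_scale` argument above suggests classifier-free guidance, where the model's text-conditioned prediction is pushed away from its unconditioned prediction. The following is a minimal sketch of the common formulation of that blend, assuming RTX1 follows it; `apply_guidance` is a hypothetical helper written for illustration and is not part of rtx-torch.

```python
def apply_guidance(cond_logits, null_logits, cond_scale=1.0):
    """Blend conditional and unconditional logits (classifier-free guidance).

    Uses only arithmetic, so it works on plain floats or on torch tensors.
    This is an illustrative helper, not a function from rtx-torch.
    """
    # cond_scale = 1.0 returns the conditional logits unchanged;
    # larger values move the result further from the unconditional logits.
    return null_logits + (cond_logits - null_logits) * cond_scale


# Toy scalar values (hypothetical numbers, not real model outputs)
cond, null = 2.0, 0.5
print(apply_guidance(cond, null, cond_scale=1.0))  # 2.0
print(apply_guidance(cond, null, cond_scale=3.0))  # 5.0
```

With a scale of 3.0, the gap between the conditional and unconditional values is tripled, which is why higher scales tend to make the output follow the instruction more strongly.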

RTX-2 takes in images and text and interleaves them to form multi-modal sentences:

import torch
from rtx import RTX2

# dummy inputs: one RGB image and one sequence of 1024 token IDs drawn from [0, 20000)
img = torch.randn(1, 3, 256, 256)
text = torch.randint(0, 20000, (1, 1024))

model = RTX2()
output = model(img, text)
print(output)
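
To make the idea of a "multi-modal sentence" concrete, here is a toy sketch of interleaving image tokens into a text sequence. It only illustrates the general concept under simplifying assumptions: the `<img>` placeholder, the string "tokens", and the `interleave` function are all hypothetical and do not describe how RTX2 is implemented internally.

```python
IMAGE_PLACEHOLDER = "<img>"  # hypothetical marker, not an rtx-torch token


def interleave(text_tokens, image_tokens):
    """Replace each image placeholder in text_tokens with the image's tokens."""
    sequence = []
    for token in text_tokens:
        if token == IMAGE_PLACEHOLDER:
            # splice the image tokens in at the placeholder position
            sequence.extend(image_tokens)
        else:
            sequence.append(token)
    return sequence


# Strings stand in for what would be embedding vectors in a real model
text = ["pick", "up", "the", "object", "in", IMAGE_PLACEHOLDER, "now"]
patches = ["patch_0", "patch_1", "patch_2"]
print(interleave(text, patches))
```

In a real model the entries would be embedding vectors rather than strings, but the ordering logic is the same: image content occupies positions inside the token sequence alongside the words.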

