GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 201 apps, and 1.4K app combos.

OpenGVLab/GUI-Odyssey

GUI Odyssey

This repository is the official implementation of GUI Odyssey.

GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices
Quanfeng Lu, Wenqi Shao✉️⭐️, Zitao Liu, Fanqing Meng, Boxuan Li, Botong Chen, Siyuan Huang, Kaipeng Zhang, Yu Qiao, Ping Luo✉️
✉️ Wenqi Shao ([email protected]) and Ping Luo ([email protected]) are corresponding authors.
⭐️ Wenqi Shao is project leader.

💡 News

🔆 Introduction

GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 201 apps, and 1.4K app combos.

[Figure: overview]

🛠️ Data collection pipeline

GUI Odyssey comprises six categories of navigation tasks. For each category, we construct instruction templates with items and apps selected from a predefined pool, resulting in a vast array of unique instructions for annotating GUI episodes. Human demonstrations on an Android emulator capture the metadata of each episode in a comprehensive format. After rigorous quality checks, GUI Odyssey includes 7,735 validated cross-app GUI navigation episodes.

[Figure: pipeline]

📝 Statistics

| Splits | # Episodes | # Unique Prompts | # Avg. Steps | Data location | Model |
| --- | --- | --- | --- | --- | --- |
| Total | 7,735 | 7,735 | 15.4 | GUI-Odyssey | OdysseyAgent |
| Train-Random & Test-Random | 5,802 / 1,933 | 5,802 / 1,933 | 15.4 / 15.2 | random_split.json | OdysseyAgent-Random |
| Train-Task & Test-Task | 6,719 / 1,016 | 6,719 / 1,016 | 15.0 / 17.6 | task_split.json | OdysseyAgent-Task |
| Train-Device & Test-Device | 6,473 / 1,262 | 6,473 / 1,262 | 15.4 / 15.0 | device_split.json | OdysseyAgent-Device |
| Train-App & Test-App | 6,596 / 1,139 | 6,596 / 1,139 | 15.4 / 15.3 | app_split.json | OdysseyAgent-App |

💫 Dataset Access

The full GUI Odyssey dataset is hosted on Hugging Face.

Clone the entire dataset from Huggingface:

git clone https://huggingface.co/datasets/OpenGVLab/GUI-Odyssey

Then move the cloned dataset into the ./data directory. After that, the structure of ./data should look like this:

GUI-Odyssey
├── data
│   ├── annotations
│   │   └── *.json
│   ├── screenshots
│   │   └── data_*
│   │        └── *.png
│   ├── splits
│   │   ├── app_split.json
│   │   ├── device_split.json
│   │   ├── random_split.json
│   │   └── task_split.json
│   ├── format_converter.py
│   └── preprocessing.py
└── ...
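
Before running the preprocessing step, it can help to confirm that the files landed where the tree above expects them. The following is a minimal sketch, assuming it is run from the repository root; `missing_entries` and `EXPECTED_SPLITS` are hypothetical helper names, not something the repository provides.

```python
from pathlib import Path

# Split files listed in the directory tree above.
EXPECTED_SPLITS = (
    "app_split.json",
    "device_split.json",
    "random_split.json",
    "task_split.json",
)


def missing_entries(data_dir):
    """Return the expected paths (relative to data_dir) that are absent."""
    root = Path(data_dir)
    expected = ["annotations", "screenshots", "format_converter.py", "preprocessing.py"]
    expected += [f"splits/{name}" for name in EXPECTED_SPLITS]
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries("./data")
    print("layout OK" if not missing else f"missing: {missing}")
```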

Then organize the screenshots folder:

cd data
python preprocessing.py

Finally, the structure of ./data should look like this:

GUI-Odyssey
├── data
│   ├── annotations
│   │   └── *.json
│   ├── screenshots
│   │   └── *.png
│   ├── splits
│   │   ├── app_split.json
│   │   ├── device_split.json
│   │   ├── random_split.json
│   │   └── task_split.json
│   ├── format_converter.py
│   └── preprocessing.py
└── ...
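
Comparing the two directory trees above, the visible change is that the PNG files move out of the `data_*` subfolders and sit directly under `screenshots`. The sketch below illustrates that flattening only; it is not the repository's `preprocessing.py`, which may do more, and `flatten_screenshots` is a hypothetical name. Running `python preprocessing.py` remains the documented step.

```python
import shutil
from pathlib import Path


def flatten_screenshots(screenshots_dir):
    """Move screenshots/data_*/*.png up into screenshots/; return the count moved."""
    root = Path(screenshots_dir)
    moved = 0
    for sub in sorted(root.glob("data_*")):
        if not sub.is_dir():
            continue
        for png in sorted(sub.glob("*.png")):
            target = root / png.name
            # Assumption: file names are unique across subfolders; stop if not.
            if target.exists():
                raise FileExistsError(f"refusing to overwrite {target}")
            shutil.move(str(png), str(target))
            moved += 1
        # Remove the subfolder only once it is empty.
        if not any(sub.iterdir()):
            sub.rmdir()
    return moved
```

The function refuses to overwrite an existing file rather than silently dropping a screenshot, since a name collision would indicate that the uniqueness assumption does not hold.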

⚙️ Detailed Data Information

Please refer to this.

🚀 Quick Start

Please refer to this to get started quickly.

📖 Release Process

  • Dataset
    • Screenshots of GUI Odyssey
    • Annotations of GUI Odyssey
    • Split files of GUI Odyssey
  • Code
    • Data preprocessing code
    • Inference code
  • Models

🖊️ Citation

If you find GUI Odyssey useful in your project or research, please cite our paper using the following BibTeX entry. Thanks!

@misc{lu2024gui,
      title={GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices}, 
      author={Quanfeng Lu and Wenqi Shao and Zitao Liu and Fanqing Meng and Boxuan Li and Botong Chen and Siyuan Huang and Kaipeng Zhang and Yu Qiao and Ping Luo},
      year={2024},
      eprint={2406.08451},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
