Kosmos2.5

My implementation of Kosmos-2.5, the model from Microsoft Research described in the paper "KOSMOS-2.5: A Multimodal Literate Model".

Appreciation

Lucidrains
Agorians

Install

pip install kosmos2-torch

Usage

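import torch
from kosmos.model import Kosmos

# Dummy inputs: one 256x256 RGB image and one sequence of 1024 token IDs in [0, 20000)
img = torch.randn(1, 3, 256, 256)
text = torch.randint(0, 20000, (1, 1024))

# Build the model with its default settings and run a forward pass
model = Kosmos()
output = model(img, text)
print(output)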

Dataset Strategy

The table below summarizes the datasets used in the paper "KOSMOS-2.5: A Multimodal Literate Model", listing the modality, size, domain, and source of each:

Dataset	Modality	# Samples	Domain	Source
IIT-CDIP	Text + Layout	27.6M pages	Scanned documents	Link
arXiv papers	Text + Layout	20.9M pages	Research papers	Link
PowerPoint slides	Text + Layout	6.2M pages	Presentation slides	Web crawl
General PDF	Text + Layout	155.2M pages	Diverse PDF files	Web crawl
Web screenshots	Text + Layout	100M pages	Webpage screenshots	Link
README	Text + Markdown	2.9M files	GitHub README files	Link
DOCX	Text + Markdown	1.1M pages	Word documents	Web crawl
LaTeX	Text + Markdown	3.7M pages	Research papers	Link
HTML	Text + Markdown	6.3M pages	Webpages	Link
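
To get a feel for how the corpus is balanced, the counts in the table can be tallied per modality and turned into per-dataset shares. The snippet below is only a sketch: the names `DATASETS`, `totals_by_modality`, and `mixture_shares` are made up for illustration, the README row is counted in files while the others are counted in pages, and size-proportional shares are an assumption here, not a sampling scheme taken from the paper.

```python
# Sample counts in millions, copied from the table above.
# Note: README is counted in files, every other row in pages.
DATASETS = [
    # (name, modality, samples_in_millions)
    ("IIT-CDIP", "Text + Layout", 27.6),
    ("arXiv papers", "Text + Layout", 20.9),
    ("PowerPoint slides", "Text + Layout", 6.2),
    ("General PDF", "Text + Layout", 155.2),
    ("Web screenshots", "Text + Layout", 100.0),
    ("README", "Text + Markdown", 2.9),
    ("DOCX", "Text + Markdown", 1.1),
    ("LaTeX", "Text + Markdown", 3.7),
    ("HTML", "Text + Markdown", 6.3),
]


def totals_by_modality(rows):
    """Sum the sample counts (in millions) for each modality."""
    totals = {}
    for _name, modality, count in rows:
        totals[modality] = totals.get(modality, 0.0) + count
    return totals


def mixture_shares(rows):
    """Return each dataset's fraction of the overall sample count.

    Size-proportional shares are an illustrative assumption only.
    """
    grand_total = sum(count for _name, _modality, count in rows)
    return {name: count / grand_total for name, _modality, count in rows}


if __name__ == "__main__":
    for modality, total in totals_by_modality(DATASETS).items():
        print(f"{modality}: {total:.1f}M samples")
    for name, share in mixture_shares(DATASETS).items():
        print(f"{name}: {share:.1%} of all samples")
```

Summed this way, the layout-oriented data is far larger than the markdown-oriented data, and General PDF is the largest single source.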

License

MIT

Citations

@misc{2309.11419,
    Author = {Tengchao Lv and Yupan Huang and Jingye Chen and Lei Cui and Shuming Ma and Yaoyao Chang and Shaohan Huang and Wenhui Wang and Li Dong and Weiyao Luo and Shaoxiang Wu and Guoxin Wang and Cha Zhang and Furu Wei},
    Title = {Kosmos-2.5: A Multimodal Literate Model},
    Year = {2023},
    Eprint = {arXiv:2309.11419},
}

