Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs

Zhu, Jinguo; Zhu, Xizhou; Wang, Wenhai; Wang, Xiaohua; Li, Hongsheng; Wang, Xiaogang; Dai, Jifeng

arXiv:2206.04674 (cs)

[Submitted on 9 Jun 2022 (v1), last revised 5 Jul 2022 (this version, v2)]

Abstract: To build an artificial neural network that resembles a biological intelligence system, recent works have unified numerous tasks into a generalist model, which processes various tasks with shared parameters and has no task-specific modules. While generalist models achieve promising results on various benchmarks, they suffer performance degradation on some tasks compared with task-specialized models. In this work, we find that interference among different tasks and modalities is the main cause of this phenomenon. To mitigate such interference, we introduce Conditional Mixture-of-Experts (Conditional MoEs) to generalist models. Routing strategies under different levels of conditions are proposed to take both training/inference cost and generalization ability into account. By incorporating the proposed Conditional MoEs, the recently proposed generalist model Uni-Perceiver can effectively mitigate the interference across tasks and modalities, and achieves state-of-the-art results on a series of downstream tasks via prompt tuning on 1% of downstream data. Moreover, the introduction of Conditional MoEs preserves the ability of generalist models to conduct zero-shot inference on new tasks, e.g., video-text retrieval and video captioning. Code and pre-trained generalist models shall be released.
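
The idea of routing "under different levels of conditions" can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `conditional_moe`, the linear gate, the top-k selection, and the plain linear experts are all assumptions made for illustration. The only point it shows is that the gate may read a routing condition that differs from the token itself, for example a task identifier shared by every token of a sample.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def conditional_moe(tokens, condition, gate_w, expert_ws, top_k=2):
    """Hypothetical sparse MoE layer whose gate reads `condition`, not `tokens`.

    tokens:    (n, d) token features passed through the experts
    condition: (n, c) routing features (the token itself, a task one-hot, ...)
    gate_w:    (c, E) gate weights, one column per expert
    expert_ws: (E, d, d) one linear map per expert (assumed for simplicity)
    """
    logits = condition @ gate_w                            # (n, E) gate scores
    top_idx = np.argsort(-logits, axis=-1)[:, :top_k]      # (n, k) chosen experts
    top_logits = np.take_along_axis(logits, top_idx, axis=-1)
    weights = softmax(top_logits, axis=-1)                 # renormalise over top-k

    out = np.zeros_like(tokens)
    for slot in range(top_k):
        for e in range(expert_ws.shape[0]):
            mask = top_idx[:, slot] == e                   # tokens sent to expert e
            if mask.any():
                out[mask] += weights[mask, slot:slot + 1] * (tokens[mask] @ expert_ws[e])
    return out, top_idx
```

Passing `condition=tokens` gives ordinary token-level routing, where each token picks its own experts. Passing a task one-hot repeated for every token gives a coarser condition: all tokens of that task activate the same experts, which is one plausible way to trade routing flexibility against cost.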

Comments:	Code shall be released at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2206.04674 [cs.CV]
	(or arXiv:2206.04674v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2206.04674

Submission history

From: Jifeng Dai
[v1] Thu, 9 Jun 2022 17:59:59 UTC (1,572 KB)
[v2] Tue, 5 Jul 2022 07:56:01 UTC (1,574 KB)
