Cross-task weakly supervised learning from instructional videos

Zhukov, Dimitri; Alayrac, Jean-Baptiste; Cinbis, Ramazan Gokberk; Fouhey, David; Laptev, Ivan; Sivic, Josef

arXiv:1903.08225v1 (cs)

[Submitted on 19 Mar 2019 (this version), latest version 29 Apr 2019 (v2)]

Abstract: In this paper we investigate learning visual models for the steps of ordinary tasks using weak supervision via instructional narrations and an ordered list of steps, instead of strong supervision via temporal annotations. At the heart of our approach is the observation that weakly supervised learning may be easier if a model shares components while learning different steps: 'pour egg' should be trained jointly with other tasks involving 'pour' and 'egg'. We formalize this in a component model for recognizing steps and a weakly supervised learning framework that can learn this model under temporal constraints from narration and the list of steps. Past data does not permit a systematic study of sharing, so we also gather a new dataset, CrossTask, aimed at assessing cross-task sharing. Our experiments demonstrate that sharing across tasks improves performance, especially when done at the component level, and that our component model can parse previously unseen tasks by virtue of its compositionality.
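The component-sharing idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the word-level split of a step name, the averaging of per-component linear scores, and all names and toy numbers below (`step_components`, `step_score`, `component_weights`, `features`) are assumptions made for illustration only. It shows how steps such as 'pour egg' and 'pour milk' could reuse one 'pour' parameter vector, and how a step never seen in training could still be scored from known components.

```python
from collections import defaultdict


def step_components(step):
    """Split a step name such as 'pour egg' into word-level components (assumed scheme)."""
    return step.lower().split()


def build_component_index(steps):
    """Map each component to the list of steps that share it."""
    index = defaultdict(list)
    for step in steps:
        for comp in step_components(step):
            index[comp].append(step)
    return dict(index)


def step_score(step, component_weights, features):
    """Score a step as the mean of its components' linear scores (assumed rule)."""
    comps = step_components(step)
    total = 0.0
    for comp in comps:
        weights = component_weights[comp]
        total += sum(w * x for w, x in zip(weights, features))
    return total / len(comps)


# Hypothetical toy weights: one vector per component, shared by every step using it.
component_weights = {
    "pour": [1.0, 0.0, 0.0],
    "egg": [0.0, 1.0, 0.0],
    "milk": [0.0, 0.0, 1.0],
    "whisk": [0.5, 0.5, 0.0],
}

steps = ["pour egg", "pour milk", "whisk egg"]
index = build_component_index(steps)

# Hypothetical feature vector for one video segment.
features = [0.2, 0.9, 0.1]
scores = {step: step_score(step, component_weights, features) for step in steps}

# 'whisk milk' is not in `steps`, but both of its components are known.
unseen_score = step_score("whisk milk", component_weights, features)
```

Because each component owns a single weight vector, any training signal for 'pour egg' would also update the parameters used by 'pour milk' and 'whisk egg', which is the kind of sharing the abstract describes.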

Comments:	10 pages, 7 figures, to be published in proceedings of the CVPR, 2019
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1903.08225 [cs.CV]
	(or arXiv:1903.08225v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1903.08225

Submission history

From: Dimitri Zhukov
[v1] Tue, 19 Mar 2019 19:30:29 UTC (3,198 KB)
[v2] Mon, 29 Apr 2019 10:08:57 UTC (4,650 KB)
