SafeSora Dataset

SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset

PKU-Alignment Team @ Peking University

Dataset Composition

Figure 1: Proportion of multi-label classifications for prompts

Why SafeSora Dataset?

The multimodal nature of text-to-video models presents new challenges for AI alignment, including the scarcity of datasets suitable for alignment and the inherent complexity of multimodal data. To mitigate the risk of harmful outputs from large vision models, we introduce the SafeSora dataset to promote research on aligning text-to-video generation with human values. The dataset has the following features:

Data Point Example
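As an illustrative sketch only (not the official loader), a single preference record could be inspected with the Hugging Face datasets library as below; the repository id and the field contents are assumptions based on the descriptions in this card.

    from datasets import load_dataset

    # Hypothetical hub id; replace with the actual SafeSora release path if it differs.
    dataset = load_dataset("PKU-Alignment/SafeSora", split="train")

    sample = dataset[0]
    print(sample.keys())  # assumed fields: prompt text, the two generated videos, and preference labels
    print(sample)         # inspect one text-video preference record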

Annotation Pipeline

Figure 3: Left - Video generation pipeline: both the original and augmented prompts are used with five video generation models to generate multiple videos, forming T-V pairs. Right - Two-stage annotation: the annotation process is organized along two distinct dimensions and two sequential stages. In the initial heuristic stage, crowd workers are guided to annotate 4 sub-dimensions of helpfulness and 12 sub-categories of harmlessness. In the subsequent stage, they provide decoupled preferences between two T-V pairs along the helpfulness and harmlessness dimensions.
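To make the two-stage annotation concrete, here is a minimal Python sketch of how one annotated comparison could be represented; the class and field names are illustrative assumptions rather than the dataset's actual schema.

    from dataclasses import dataclass, field
    from typing import Dict

    @dataclass
    class TVPairAnnotation:
        """Stage 1 (heuristic) labels for a single text-video (T-V) pair."""
        prompt: str
        video_path: str
        helpfulness_subdims: Dict[str, int] = field(default_factory=dict)   # 4 sub-dimensions of helpfulness
        harmlessness_labels: Dict[str, bool] = field(default_factory=dict)  # 12 sub-categories of harmlessness

    @dataclass
    class PreferenceComparison:
        """Stage 2: decoupled preferences between two T-V pairs."""
        pair_a: TVPairAnnotation
        pair_b: TVPairAnnotation
        helpfulness_preference: str   # "a" or "b"
        harmlessness_preference: str  # "a" or "b"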

Inspiring Future Research 

Figure 4: Left - T-V Moderation: T-V Moderation incorporates the user's text input as part of its evaluation criteria, allowing it to filter out more potentially harmful multi-modal responses. The agreement ratio between T-V Moderation trained on the multi-label data of the SafeSora training set and human judgment on the test set is 82.94%. Right - Preference Reward Model: Based on our dataset, we also train a reward model that focuses on the helpfulness dimension and a cost model that focuses on the harmlessness dimension. The agreement ratio with crowd workers is 65.29% for the reward model and 72.41% for the cost model.
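As a small illustration of how the agreement ratios above can be computed, the sketch below compares model-predicted preferences with crowd-worker preferences; the example inputs are placeholders, not outputs of the released models.

    def agreement_ratio(model_choices, human_choices):
        """Fraction of comparisons where the model prefers the same T-V pair as the human annotator."""
        assert len(model_choices) == len(human_choices)
        matches = sum(m == h for m, h in zip(model_choices, human_choices))
        return matches / len(human_choices)

    # Placeholder predictions from a reward model vs. crowd-worker preferences on three comparisons.
    model_choices = ["a", "b", "a"]
    human_choices = ["a", "b", "b"]
    print(f"agreement: {agreement_ratio(model_choices, human_choices):.2%}")  # -> 66.67%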

Citation

@misc{dai2024safesora,
      title={SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset},
      author={Josef Dai and Tianle Chen and Xuyao Wang and Ziran Yang and Taiye Chen and Jiaming Ji and Yaodong Yang},
      year={2024},
      eprint={2406.14477},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}