We designed an end-to-end framework that encourages an interactive narrative experience based on keyword control and image generation.

Stry233/Visual-Story-Generation-Based-on-Emotional-and-Keyword-Scheme

Visual Story Generation Based on Emotional and Keyword Scheme

[Paper] [Model Card] [Deployment Demo]

Overview

[Figure: main pipeline]

In this work, we propose a narrative generation pipeline for co-creating visual stories with users. The pipeline lets the user control the events and emotions in the generated content.

The pipeline has two parts: narrative generation and image generation. In narrative generation, we plan the narrative from the keywords and emotional trends of the existing sentences and generate the next story sentence. In image generation, we use both Disco Diffusion and Stable Diffusion to create a visually appealing image that captures the story's main plot. We also apply object recognition so that objects appearing in the images can be mentioned in later story development.
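The loop described above can be sketched as one function that runs a single round of the story. This is a minimal illustration, not the repository's implementation: `StoryState`, `story_step`, and the three injected callables are hypothetical names, and the stub functions stand in for the real language, diffusion, and object-recognition models.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StoryState:
    """Running state of a co-created story (hypothetical structure)."""
    sentences: List[str] = field(default_factory=list)
    images: List[str] = field(default_factory=list)        # e.g. image file paths
    seen_objects: List[str] = field(default_factory=list)  # objects recognized in images


def story_step(
    state: StoryState,
    keywords: List[str],
    emotion: str,
    generate_sentence: Callable[[List[str], List[str], str], str],
    generate_image: Callable[[str], str],
    detect_objects: Callable[[str], List[str]],
) -> StoryState:
    """Run one round: next sentence, then its image, then object recognition."""
    # Narrative generation: condition on the context, chosen keywords, and emotion.
    sentence = generate_sentence(state.sentences, keywords, emotion)
    state.sentences.append(sentence)

    # Image generation: illustrate the new sentence.
    image = generate_image(sentence)
    state.images.append(image)

    # Object recognition: remember new objects so they can be offered
    # as keywords for later sentences.
    for obj in detect_objects(image):
        if obj not in state.seen_objects:
            state.seen_objects.append(obj)
    return state


# Stub components standing in for the real models (assumptions for illustration).
def fake_sentence(context: List[str], keywords: List[str], emotion: str) -> str:
    return f"[{emotion}] " + " ".join(keywords)


def fake_image(sentence: str) -> str:
    return f"image_{len(sentence)}.png"


def fake_detect(image: str) -> List[str]:
    return ["shell", "beach"]


state = story_step(StoryState(), ["Marcus", "shell"], "joy",
                   fake_sentence, fake_image, fake_detect)
# Feed a recognized object back in as a keyword for the next sentence.
state = story_step(state, state.seen_objects[:1] + ["crab"], "surprise",
                   fake_sentence, fake_image, fake_detect)
```

Passing the models in as callables keeps the control flow separate from any particular checkpoint, so the text or image backend can be swapped without touching the loop.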

Model Card

| Domain | Name | Description | Language model type | Model Card | 🤗 link |
| --- | --- | --- | --- | --- | --- |
| Suggester | Emotion Suggester | Fine-tuned on StoryCommonsense to suggest the sentiment of the next sentence | DeBERTa-v2-xlarge | Yuetian/deberta-finetuned-next-sentence-emotion | Hugging Face |
| Suggester | Emotion Suggester | Fine-tuned on StoryCommonsense to suggest the sentiment of the next sentence | BERT-base-uncased | Yuetian/bert-base-uncased-finetuned-plutchik-emotion | Hugging Face |
| Suggester | Keyword Suggester | Fine-tuned on ROCStories to suggest named entities for the next sentence | OPT-1.3B | Stay tuned | Stay tuned |
| Text pipe | Next-sentence generator | Takes the context, keywords, and sentiment together and generates the next sentence in a ROCStories style | T5-base-finetuned-commenGen | Yuetian/T5-finetuned-storyCommonsense | Hugging Face |
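As a rough sketch of how the published checkpoints might be loaded with the Hugging Face `transformers` library, consider the following. Several things here are assumptions rather than documented behavior: that the BERT emotion checkpoint exposes a sequence-classification head, and, in particular, the input layout produced by the hypothetical helper `build_generator_input`, which is a placeholder and not the format the T5 checkpoint was fine-tuned on. Check each model card for the expected input format before relying on this.

```python
from typing import List


def build_generator_input(context: List[str], keywords: List[str], emotion: str) -> str:
    """Join context, keywords, and emotion into one input string.

    NOTE: this layout is a hypothetical placeholder, not the format the
    fine-tuned checkpoint was trained on.
    """
    return (
        f"context: {' '.join(context)} "
        f"keywords: {', '.join(keywords)} "
        f"emotion: {emotion}"
    )


def suggest_emotion(context: List[str]):
    """Score candidate emotions for the next sentence (assumes a classification head)."""
    # Imported lazily; requires `pip install transformers torch`.
    from transformers import pipeline

    classifier = pipeline(
        "text-classification",
        model="Yuetian/bert-base-uncased-finetuned-plutchik-emotion",
        top_k=None,  # return a score for every label
    )
    return classifier(" ".join(context))


def generate_next_sentence(
    context: List[str], keywords: List[str], emotion: str, max_new_tokens: int = 40
) -> str:
    """Generate the next story sentence with the T5 checkpoint."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    name = "Yuetian/T5-finetuned-storyCommonsense"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)
    inputs = tokenizer(
        build_generator_input(context, keywords, emotion), return_tensors="pt"
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


prompt = build_generator_input(
    ["Marcus was collecting shells on the beach."], ["shell"], "joy"
)
```

The model-loading functions download weights on first use, so they are defined but not called here; only the prompt helper runs when the snippet is executed.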

GUI demo

[Figure: screenshot of the GUI demo]

We provide a simple demo of the deployed version of our framework here. Please refer to the Q&A section for more information.

Evaluation

[Figure: evaluation results]

We show the performance distributions of the baseline model and the prompt-optimized model across 3,748 sets of experiments under different metrics. In each figure, the blue box on the left represents our method and the orange box on the right represents the baseline model.

Examples

Here are several example stories you can generate using this framework.

| # | Sentence | Image |
| --- | --- | --- |
| 0 | Marcus was collecting shells on the beach. | image |
| 1 | He picked up a large beautiful shell. | image |
| 2 | He put it in his pocket to save for later. | image |
| 3 | Suddenly he felt a sharp pinch. | image |
| 4 | A crab was inside the shell pinching his leg. | image |

Citation

@misc{chen2023visual,
      title={Visual Story Generation Based on Emotion and Keywords}, 
      author={Yuetian Chen and Ruohua Li and Bowen Shi and Peiru Liu and Mei Si},
      year={2023},
      eprint={2301.02777},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}
