Named entity recognition followed by relation extraction for German cooking recipes resulting in step by step cooking guides.

leonidasbothmer/ReciParse

ReciParse 🌯

🎯 Target

Creating Step by Step Guides from unstructured German Cooking Recipes

🌐 Website

Check the model's results at www.reciparse.de

🛠️ Models and Scores

  1. Named Entity Recognition (NER) Micro-F1: 94.6%
  2. Relation Extraction (REL) Micro-F1: 74.3%
  3. Inter-Annotator Agreement (F1): 95%
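
For readers unfamiliar with the metric, the following is a minimal sketch of how a micro-averaged F1 score can be computed over predicted and gold annotations. It is not the evaluation code from this repository: the function name `micro_f1`, the `(start, end, label)` tuple layout, and the labels `INGREDIENT` and `TOOL` are assumptions made for illustration only.

```python
def micro_f1(gold_docs, pred_docs):
    """Micro-averaged F1 over documents.

    Each document is a set of (start, end, label) tuples (assumed layout).
    Counts are pooled across all documents before computing precision and
    recall, which is what makes the average "micro".
    """
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        tp += len(gold & pred)   # exact matches on span and label
        fp += len(pred - gold)   # predicted but not in gold
        fn += len(gold - pred)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels and spans, for illustration only.
gold_docs = [{(0, 1, "INGREDIENT"), (3, 4, "TOOL")}]
pred_docs = [{(0, 1, "INGREDIENT"), (2, 4, "TOOL")}]
score = micro_f1(gold_docs, pred_docs)
```

In this toy case one of two predictions matches exactly, so precision and recall are both 0.5. Because the relation extractor consumes the NER predictions, a span that NER gets wrong can also cost a relation downstream, which is one plausible reason the REL score sits below the NER score.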

💬 Project Description

When following recipes, most users appreciate a clear and easy-to-follow instruction set. Manually rewriting recipe texts from various sources so that they contain only the most relevant information and follow a straightforward "step-by-step" structure is laborious work, so we explored potential automated solutions. This repository contains our experiments in the extraction of semantic information from German recipe texts using Machine Learning.

To demonstrate a first proof of concept, we implemented a full natural language processing (NLP) pipeline, from conceptualizing a task-appropriate annotation scheme, through modelling with modern NLP methods, to developing an illustrative web application front end, which can be accessed via https://www.reciparse.de/ . Despite the inherent ambiguities of linguistic annotation, we managed to achieve an Inter-Annotator Agreement of over 95%.

Our models are two sequentially stacked neural networks using transfer learning based on transformer architectures. The first is a "named entity recognizer" (NER), which identifies spans of text and categorizes them into recipe-relevant classes. These entity predictions serve as the basis upon which the second model, the "relations component" (REL), assigns entities to the appropriate steps and resolves cross-sentence references. We achieved a micro-averaged F-score of 94.6% for NER, while REL reaches 74.3%. This project lays a foundation for future work on the topic and demonstrates that this task can, in principle, be approximated by machine learning.
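
The data flow of the two stacked components can be sketched as follows. This is only an illustration of how entity predictions feed the relation step: `toy_ner` and `toy_rel` are hypothetical rule-based stand-ins for the transformer models, and the labels (`ACTION`, `INGREDIENT`, `TOOL`, `ARG`) and class names are assumptions, not the project's annotation scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    start: int   # token index, inclusive
    end: int     # token index, exclusive
    label: str   # hypothetical entity class


@dataclass(frozen=True)
class Relation:
    head: Entity   # the action an entity belongs to
    child: Entity  # the entity assigned to that action
    label: str     # hypothetical relation class


def toy_ner(tokens):
    """Stand-in for the NER model: a plain dictionary lookup."""
    lexicon = {"schneiden": "ACTION", "Zwiebel": "INGREDIENT", "Messer": "TOOL"}
    return [Entity(i, i + 1, lexicon[t]) for i, t in enumerate(tokens) if t in lexicon]


def toy_rel(entities):
    """Stand-in for the REL model: attach each entity to the nearest action."""
    actions = [e for e in entities if e.label == "ACTION"]
    relations = []
    for entity in entities:
        if entity.label == "ACTION" or not actions:
            continue
        nearest = min(actions, key=lambda a: abs(a.start - entity.start))
        relations.append(Relation(nearest, entity, "ARG"))
    return relations


def build_steps(tokens, relations):
    """Group related entities under their action to form step records."""
    steps = {}
    for rel in relations:
        action = tokens[rel.head.start]
        steps.setdefault(action, []).append(tokens[rel.child.start])
    return steps


tokens = "Zwiebel mit dem Messer schneiden".split()
entities = toy_ner(tokens)          # stage 1: entity spans and classes
relations = toy_rel(entities)       # stage 2: consumes stage 1 output
steps = build_steps(tokens, relations)
```

The second stage only ever sees what the first stage produced, so entities that NER misses cannot take part in any relation.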
