Low-resource Information Extraction 🚀

🍎 This repository is a collection of papers on low-resource information extraction (NER, RE, EE), categorized into three paradigms.

🤗 We strongly encourage researchers who want to share their work with the community to open a pull request and add their papers to this repository!

📖 Survey Paper: Information Extraction in Low-Resource Scenarios: Survey and Perspective (2023) [paper]

🗂️ Slides:

Data-Efficient Knowledge Graph Construction, 高效知识图谱构建 (Tutorial at CCKS 2022) [slides]
Efficient and Robust Knowledge Graph Construction (Tutorial at AACL-IJCNLP 2022) [paper, slides]
Open-Environment Knowledge Graph Construction and Reasoning: Challenges, Approaches, and Opportunities (Tutorial at IJCAI 2023) [paper, slides]

🛠️ Toolkits:

DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population [paper, project]
OpenUE: An Open Toolkit of Universal Extraction from Text [paper, project]
OpenNRE [project]

Contents

0. Related Surveys/Analysis on Low-resource IE
0. Low-resource IE Datasets
1. Exploiting Higher-resource Data
2. Developing Stronger Data-Efficient Models
3. Optimizing Data and Models Together
How to Cite

Related Surveys/Analysis on Low-resource IE

Information Extraction

NER

A Survey on Recent Advances in Named Entity Recognition from Deep Learning Models (COLING 2018) [paper]
A Survey on Deep Learning for Named Entity Recognition (TKDE, 2020) [paper]

RE

A Survey on Neural Relation Extraction (Science China Technological Sciences, 2020) [paper]
Relation Extraction: A Brief Survey on Deep Neural Network Based Methods (ICSIM 2021) [paper]
Deep Neural Network-Based Relation Extraction: An Overview (Neural Computing and Applications, 2022) [paper]
Revisiting Relation Extraction in the era of Large Language Models (ACL 2023) [paper]

EE

A Survey of Event Extraction From Text (IEEE Access, 2019) [paper]
What is Event Knowledge Graph: A Survey (TKDE, 2022) [paper]
A Survey on Deep Learning Event Extraction: Approaches and Applications (TNNLS, 2022) [paper]
Low Resource Event Extraction: A Survey (2022) [paper]
Few-shot Event Detection: An Empirical Study and a Unified View (ACL 2023) [paper]
Exploring the Feasibility of ChatGPT for Event Extraction (arXiv, 2023) [paper]

General IE

From Information to Knowledge: Harvesting Entities and Relationships from Web Sources (PODS 2010) [paper]
Knowledge Base Population: Successful Approaches and Challenges (ACL 2011) [paper]
Advances in Automated Knowledge Base Construction (NAACL-HLT 2012, AKBC-WEKEX workshop) [paper]
Information Extraction (IEEE Intelligent Systems, 2015) [paper]
Populating Knowledge Bases (Part of The Information Retrieval Series book series, 2018) [paper]
A Survey on Open Information Extraction (COLING 2018) [paper]
A Survey on Automatically Constructed Universal Knowledge Bases (Journal of Information Science, 2020) [paper]
A Survey on Knowledge Graphs: Representation, Acquisition and Applications (TNNLS, 2021) [paper]
A Survey of Information Extraction Based on Deep Learning (Applied Sciences, 2022) [paper]
Generative Knowledge Graph Construction: A Review (EMNLP 2022) [paper]
Multi-Modal Knowledge Graph Construction and Application: A Survey (TKDE, 2022) [paper]
A Survey on Multimodal Knowledge Graphs: Construction, Completion and Applications (Mathematics, 2023) [paper]
Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples! (arXiv, 2023) [paper]
Evaluating ChatGPT’s Information Extraction Capabilities: An Assessment of Performance, Explainability, Calibration, and Faithfulness (arXiv, 2023) [paper]
Is Information Extraction Solved by ChatGPT? An Analysis of Performance, Evaluation Criteria, Robustness and Errors (arXiv, 2023) [paper]
LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities (arXiv, 2023) [paper]

Low-resource NLP

A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios (NAACL 2021) [paper]
Few-Shot Named Entity Recognition: An Empirical Baseline Study (EMNLP 2021) [paper]
A Survey on Low-Resource Neural Machine Translation (IJCAI 2021) [paper]
Revisiting Few-shot Relation Classification: Evaluation Data and Classification Schemes (TACL, 2021) [paper]

Low-resource Learning

A Survey of Zero-Shot Learning: Settings, Methods, and Applications (TIST, 2019) [paper]
Generalizing from a Few Examples: A Survey on Few-shot Learning (ACM Computing Surveys, 2021) [paper]
Knowledge-aware Zero-Shot Learning: Survey and Perspective (IJCAI 2021) [paper]
Zero-shot and Few-shot Learning with Knowledge Graphs: A Comprehensive Survey (2021) [paper]
A Survey on Machine Learning from Few Samples (Pattern Recognition, 2023) [paper]

Low-resource IE Datasets

Low-resource NER

{Few-NERD}: Few-NERD: A Few-shot Named Entity Recognition Dataset (ACL 2021) [paper, data]

Low-resource RE

{FewRel}: FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation (EMNLP 2018) [paper, data]
{FewRel2.0}: FewRel 2.0: Towards More Challenging Few-Shot Relation Classification (EMNLP 2019) [paper, data]
{Entail-RE}: Low-resource Extraction with Knowledge-aware Pairwise Prototype Learning (Knowledge-Based Systems, 2022) [paper, data]
{LREBench}: Towards Realistic Low-resource Relation Extraction: A Benchmark with Empirical Baseline Study (EMNLP 2022, Findings) [paper, data]

Low-resource EE

{FewEvent}: Meta-Learning with Dynamic-Memory-Based Prototypical Network for Few-Shot Event Detection (WSDM 2020) [paper, data]
{Causal-EE}: Low-resource Extraction with Knowledge-aware Pairwise Prototype Learning (Knowledge-Based Systems, 2022) [paper, data]
{OntoEvent}: OntoED: Low-resource Event Detection with Ontology Embedding (ACL 2021) [paper, data]

1 Exploiting Higher-resource Data

Weakly Supervised Augmentation

Distant Supervision for Relation Extraction without Labeled Data (ACL 2009) [paper]
Neural Relation Extraction with Selective Attention over Instances (ACL 2016) [paper]
Automatically Labeled Data Generation for Large Scale Event Extraction (ACL 2017) [paper]
Adversarial Training for Weakly Supervised Event Detection (NAACL 2019) [paper]
Local Additivity Based Data Augmentation for Semi-supervised NER (EMNLP 2020) [paper]
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision (KDD 2020) [paper]
Gradient Imitation Reinforcement Learning for Low Resource Relation Extraction (EMNLP 2021) [paper]
Noisy-Labeled NER with Confidence Estimation (NAACL 2021) [paper]
ANEA: Distant Supervision for Low-Resource Named Entity Recognition (ICLR 2021, Workshop of Practical Machine Learning For Developing Countries) [paper]
Finding Influential Instances for Distantly Supervised Relation Extraction (COLING 2022) [paper]

Multi-modal Augmentation

Visual Attention Model for Name Tagging in Multimodal Social Media (ACL 2018) [paper]
Cross-media Structured Common Space for Multimedia Event Extraction (ACL 2020) [paper]
Joint Multimedia Event Extraction from Video and Article (EMNLP 2021, Findings) [paper]
Multimodal Relation Extraction with Efficient Graph Alignment (MM 2021) [paper]
Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion (SIGIR 2022) [paper]
Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction (NAACL 2022, Findings) [paper]

Multi-lingual Augmentation

Neural Relation Extraction with Multi-lingual Attention (ACL 2017) [paper]
Improving Low Resource Named Entity Recognition using Cross-lingual Knowledge Transfer (IJCAI 2018) [paper]
Event Detection via Gated Multilingual Attention Mechanism (AAAI 2018) [paper]
Adapting Pre-trained Language Models to African Languages via Multilingual Adaptive Fine-Tuning (COLING 2022) [paper]
Cross-lingual Transfer Learning for Relation Extraction Using Universal Dependencies (Computer Speech & Language, 2022) [paper]
Language Model Priming for Cross-Lingual Event Extraction (AAAI 2022) [paper]

Auxiliary Knowledge Enhancement

(1) Text

Improving Event Detection via Open-domain Trigger Knowledge (ACL 2020) [paper]
MapRE: An Effective Semantic Mapping Approach for Low-resource Relation Extraction (EMNLP 2021) [paper]
Distilling Discrimination and Generalization Knowledge for Event Detection via Delta-Representation Learning (ACL 2021) [paper]
MELM: Data Augmentation with Masked Entity Language Modeling for Low-Resource NER (ACL 2022) [paper]
Mask-then-Fill: A Flexible and Effective Data Augmentation Framework for Event Extraction (EMNLP 2022, Findings) [paper]
GDA: Generative Data Augmentation Techniques for Relation Extraction Tasks (ACL 2023, Findings) [paper]
Enhancing Few-shot NER with Prompt Ordering based Data Augmentation (arXiv, 2023) [paper]
STAR: Boosting Low-Resource Event Extraction by Structure-to-Text Data Generation with Large Language Models (arXiv, 2023) [paper]

(2) KG

Leveraging FrameNet to Improve Automatic Event Detection (ACL 2016) [paper]
DOZEN: Cross-Domain Zero Shot Named Entity Recognition with Knowledge Graph (SIGIR 2021) [paper]

(3) Ontology & Logical Rules

Logic-guided Semantic Representation Learning for Zero-Shot Relation Classification (COLING 2020) [paper]
OntoED: Low-resource Event Detection with Ontology Embedding (ACL 2021) [paper]
Neuralizing Regular Expressions for Slot Filling (EMNLP 2021) [paper]
Low-resource Extraction with Knowledge-aware Pairwise Prototype Learning (Knowledge-Based Systems, 2022) [paper]
Leveraging Open Information Extraction for Improving Few-Shot Trigger Detection Domain Transfer (arXiv, 2023) [paper]

2 Developing Stronger Data-Efficient Models

Meta Learning

For Low-resource NER

Few-shot Classification in Named Entity Recognition Task (SAC 2019) [paper]
Enhanced Meta-Learning for Cross-Lingual Named Entity Recognition with Minimal Resources (AAAI 2020) [paper]
MetaNER: Named Entity Recognition with Meta-Learning (WWW 2020) [paper]
Few-Shot Named Entity Recognition via Meta-Learning (TKDE, 2022) [paper]

For Low-resource RE

Hybrid Attention-Based Prototypical Networks for Noisy Few-Shot Relation Classification (AAAI 2019) [paper]
Few-shot Relation Extraction via Bayesian Meta-learning on Relation Graphs (ICML 2020) [paper]
Bridging Text and Knowledge with Multi-Prototype Embedding for Few-Shot Relational Triple Extraction (COLING 2020) [paper]
Meta-Information Guided Meta-Learning for Few-Shot Relation Classification (COLING 2020) [paper]
Pre-training to Match for Unified Low-shot Relation Extraction (ACL 2022) [paper]
Generative Meta-Learning for Zero-Shot Relation Triplet Extraction (arXiv, 2023) [paper]

For Low-resource EE

Meta-Learning with Dynamic-Memory-Based Prototypical Network for Few-Shot Event Detection (WSDM 2020) [paper]
Adaptive Knowledge-Enhanced Bayesian Meta-Learning for Few-shot Event Detection (ACL 2021, Findings) [paper]
Few-Shot Event Detection with Prototypical Amortized Conditional Random Field (ACL 2021, Findings) [paper]
Zero- and Few-Shot Event Detection via Prompt-Based Meta Learning (ACL 2023) [paper]

Transfer Learning

(1) Class-related Semantics

Zero-Shot Transfer Learning for Event Extraction (ACL 2018) [paper]
Transfer Learning for Named-Entity Recognition with Neural Networks (LREC 2018) [paper]
Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks (NAACL 2019) [paper]
Relation Adversarial Network for Low Resource Knowledge Graph Completion (WWW 2020) [paper]
Graph Learning Regularization and Transfer Learning for Few-Shot Event Detection (SIGIR 2021) [paper]
LearningToAdapt with Word Embeddings: Domain Adaptation of Named Entity Recognition Systems (Information Processing and Management, 2021) [paper]

(2) Pre-trained Language Representations

Matching the Blanks: Distributional Similarity for Relation Learning (ACL 2019) [paper]
Exploring Pre-trained Language Models for Event Extraction and Generation (ACL 2019) [paper]
Coarse-to-Fine Pre-training for Named Entity Recognition (EMNLP 2020) [paper]
CLEVE: Contrastive Pre-training for Event Extraction (ACL 2021) [paper]
Unleash GPT-2 Power for Event Detection (ACL 2021) [paper]
Few-shot Named Entity Recognition with Self-describing Networks (ACL 2022) [paper]
Unleashing Pre-trained Masked Language Model Knowledge for Label Signal Guided Event Detection (DASFAA 2023) [paper]
Learning In-context Learning for Named Entity Recognition (ACL 2023) [paper]

Prompt Learning

(1) Vanilla Prompt Learning

Template-Based Named Entity Recognition Using BART (ACL 2021, Findings) [paper]
LightNER: A Lightweight Tuning Paradigm for Low-resource NER via Pluggable Prompting (COLING 2022) [paper]
COPNER: Contrastive Learning with Prompt Guiding for Few-shot Named Entity Recognition (COLING 2022) [paper]
Template-free Prompt Tuning for Few-shot NER (NAACL 2022) [paper]
RelationPrompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction (ACL 2022, Findings) [paper]
Dynamic Prefix-Tuning for Generative Template-based Event Extraction (ACL 2022) [paper]
Prompt for Extraction? PAIE: Prompting Argument Interaction for Event Argument Extraction (ACL 2022) [paper]
Prompt-Learning for Cross-Lingual Relation Extraction (IJCNN 2023) [paper]

(2) Augmented Prompt Learning

PTR: Prompt Tuning with Rules for Text Classification (AI Open, 2022) [paper]
KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction (WWW 2022) [paper]
Ontology-enhanced Prompt-tuning for Few-shot Learning (WWW 2022) [paper]
DEGREE: A Data-Efficient Generation-Based Event Extraction Model (NAACL 2022) [paper]
AugPrompt: Knowledgeable Augmented-Trigger Prompt for Few-Shot Event Classification (Information Processing & Management, 2022) [paper]
Schema-aware Reference as Prompt Improves Data-Efficient Relational Triple and Event Extraction (SIGIR 2023) [paper]
A Composable Generative Framework based on Prompt Learning for Various Information Extraction Tasks (IEEE Transactions on Big Data, 2023) [paper]
PromptNER: Prompt Locating and Typing for Named Entity Recognition (ACL 2023) [paper]
AMPERE: AMR-Aware Prefix for Generation-Based Event Argument Extraction Model (ACL 2023) [paper]
MsPrompt: Multi-step Prompt Learning for Debiasing Few-shot Event Detection (arXiv, 2023) [paper]
PromptNER: A Prompting Method for Few-shot Named Entity Recognition via k Nearest Neighbor Search (arXiv, 2023) [paper]

3 Optimizing Data and Models Together

Multi-task Learning

(1) NER, Named Entity Normalization (NEN)

A Neural Multi-Task Learning Framework to Jointly Model Medical Named Entity Recognition and Normalization (AAAI 2019) [paper]
MTAAL: Multi-Task Adversarial Active Learning for Medical Named Entity Recognition and Normalization (AAAI 2021) [paper]
An End-to-End Progressive Multi-Task Learning Framework for Medical Named Entity Recognition and Normalization (ACL 2021) [paper]

(2) NER, RE

GraphRel: Modeling Text as Relational Graphs for Joint Entity and Relation Extraction (ACL 2019) [paper]
CopyMTL: Copy Mechanism for Joint Extraction of Entities and Relations with Multi-Task Learning (AAAI 2020) [paper]
Joint Entity and Relation Extraction Model based on Rich Semantics (Neurocomputing, 2021) [paper]

(3) NER, RE, EE

Entity, Relation, and Event Extraction with Contextualized Span Representations (EMNLP 2019) [paper]
InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction (arXiv, 2023) [paper]

(4) Word Sense Disambiguation (WSD), Event Detection (ED)

Similar but not the Same: Word Sense Disambiguation Improves Event Detection via Neural Representation Matching (EMNLP 2018) [paper]

(5) NER, RE, EE & Other Structured Prediction Tasks

DeepStruct: Pretraining of Language Models for Structure Prediction (ACL 2022, Findings) [paper]
SPEECH: Structured Prediction with Energy-Based Event-Centric Hyperspheres (ACL 2023) [paper]
RexUIE: A Recursive Method with Explicit Schema Instructor for Universal Information Extraction [paper]

Task Reformulation

QA/MRC

Zero-Shot Relation Extraction via Reading Comprehension (CoNLL 2017) [paper]
Entity-Relation Extraction as Multi-Turn Question Answering (ACL 2019) [paper]
A Unified MRC Framework for Named Entity Recognition (ACL 2020) [paper]
Event Extraction as Machine Reading Comprehension (EMNLP 2020) [paper]
Event Extraction by Answering (Almost) Natural Questions (EMNLP 2020) [paper]
Learning to Ask for Data-Efficient Event Argument Extraction (AAAI 2022, Student Abstract) [paper]
Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors (ACL 2023, Findings) [paper]
Zero-Shot Information Extraction via Chatting with ChatGPT (arXiv, 2023) [paper]

Text-to-Structure Generation

Text2Event: Controllable Sequence-to-Structure Generation for End-to-end Event Extraction (ACL 2021) [paper]
Structured Prediction as Translation between Augmented Natural Languages (ICLR 2021) [paper]
Unified Structure Generation for Universal Information Extraction (ACL 2022) [paper]
LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model (NeurIPS 2022) [paper]
Universal Information Extraction as Unified Semantic Matching (AAAI 2023) [paper]
CODE4STRUCT: Code Generation for Few-Shot Structured Prediction from Natural Language (ACL 2023) [paper]
CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors (ACL 2023) [paper]
Easy-to-Hard Learning for Information Extraction (ACL 2023, Findings) [paper]
How to Unleash the Power of Large Language Models for Few-shot Relation Extraction? (ACL 2023, SustaiNLP Workshop) [paper]
GPT-RE: In-context Learning for Relation Extraction using Large Language Models (arXiv, 2023) [paper]

Retrieval Augmentation

Retrieval-based Low-resource IE

Few-shot Intent Classification and Slot Filling with Retrieved Examples (NAACL 2021) [paper]
Relation Extraction as Open-book Examination: Retrieval-enhanced Prompt Tuning (SIGIR 2022, Short Paper) [paper]
Decoupling Knowledge from Memorization: Retrieval-Augmented Prompt Learning (NeurIPS 2022, Spotlight) [paper]
Retrieval-Augmented Generative Question Answering for Event Argument Extraction (EMNLP 2022) [paper]
Event Extraction With Dynamic Prefix Tuning and Relevance Retrieval (TKDE, 2023) [paper]

Retrieval-based Language Models in Low-resource Scenarios

KNN-BERT: Fine-Tuning Pre-Trained Models with KNN Classifier (2021) [paper]
Few-shot Learning with Retrieval Augmented Language Models (2022, Meta AI, Atlas) [paper]

How to Cite

📋 Thank you very much for your interest in our survey work. If you use or extend our survey, please cite the following paper:

@article{2023_LowResIE,
    author    = {Shumin Deng and
                 Ningyu Zhang and
                 Bryan Hooi},
    title     = {Information Extraction in Low-Resource Scenarios: Survey and Perspective},
    journal   = {CoRR},
    volume    = {abs/2202.08063},
    year      = {2023},
    url       = {https://arxiv.org/abs/2202.08063}
}
