
Awesome-KGLLM

A collection of papers and resources about knowledge graph enhanced large language models (KGLLM)

Recently, ChatGPT, a representative large language model (LLM), has gained considerable attention for its powerful emergent abilities. Some researchers suggest that LLMs could replace structured knowledge bases such as knowledge graphs (KGs) and serve as parameterized knowledge bases. However, while LLMs are proficient at learning probabilistic language patterns from large corpora and at conversing with humans, they, like earlier smaller pre-trained language models (PLMs), still struggle to recall facts when generating knowledge-grounded content. To overcome these limitations, researchers have proposed enhancing data-driven PLMs with knowledge-based KGs, incorporating explicit factual knowledge into PLMs and thereby improving their ability to generate text that requires factual knowledge and to provide more informed responses to user queries. We therefore review studies on enhancing PLMs with KGs, detailing existing knowledge graph enhanced pre-trained language models (KGPLMs) and their applications. Inspired by existing studies on KGPLMs, we propose enhancing LLMs with KGs by developing knowledge graph enhanced large language models (KGLLMs). KGLLMs offer a way to strengthen LLMs' factual reasoning ability, opening up new avenues for LLM research.
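To make the idea concrete, the sketch below shows the simplest form of post-training enhancement: retrieving KG triples relevant to a query and serializing them into the model's prompt. The in-memory triple store, string-matching retrieval rule, and prompt template are illustrative assumptions for this repository's taxonomy, not the method of any particular paper listed below.

```python
# A minimal sketch of post-training KG enhancement via knowledge-based
# prompting. The triples, retrieval rule, and template are illustrative.

KG = [
    ("Barack Obama", "spouse", "Michelle Obama"),
    ("Michelle Obama", "born_in", "Chicago"),
]

def retrieve_triples(question: str, kg=KG):
    """Naive retrieval: keep triples whose head or tail entity
    appears verbatim in the question."""
    return [t for t in kg if t[0] in question or t[2] in question]

def build_prompt(question: str) -> str:
    """Serialize retrieved triples into the prompt so the LLM can ground
    its answer in explicit facts instead of parametric memory alone."""
    facts = "\n".join(f"{h} {r} {t}" for h, r, t in retrieve_triples(question))
    return f"Known facts:\n{facts}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("Where was Barack Obama's spouse born?"))
```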

The organization of these papers follows our survey: ChatGPT is not Enough: Enhancing Large Language Models with Knowledge Graphs for Fact-aware Language Modeling

If you find any mistakes or have any suggestions, please let us know by email: [email protected]

If you find our survey useful for your research, please cite the following paper:

@article{KGLLM,
  title={ChatGPT is not Enough: Enhancing Large Language Models with Knowledge Graphs for Fact-aware Language Modeling},
  author={Yang, Linyao and Chen, Hongyang and Li, Zhao and Ding, Xiao and Wu, Xindong},
  journal={arXiv preprint arXiv:2306.11489},
  year={2023}
}

Overview

In this repository, we collect recent advances in knowledge graph enhanced large language models. According to the stage at which KGs are incorporated into the model, existing methods can be categorized into before-training enhancement, during-training enhancement, and post-training enhancement methods.

Table of Contents

Before-training Enhancement KGPLMs

Expand Input Structures

  • K-BERT: Enabling Language Representation with Knowledge Graph (AAAI, 2020) [paper]
  • CoLAKE: Contextualized Language and Knowledge Embedding (COLING, 2020) [paper]
  • CN-HIT-IT.NLP at SemEval-2020 Task 4: Enhanced Language Representation with Multiple Knowledge Triples (SemEval, 2020) [paper]

Enrich Input Information

  • LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention (EMNLP, 2020) [paper]
  • E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT (EMNLP, 2020) [paper]
  • Knowledge-Aware Language Model Pretraining [paper]
  • OAG-BERT: Towards a Unified Backbone Language Model for Academic Knowledge Services (KDD, 2022) [paper]
  • DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for Natural Language Understanding (AAAI, 2022) [paper]

Generate New Data

  • Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models [paper]
  • KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation (EMNLP, 2020) [paper]
  • Barack's Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling (ACL, 2019) [paper]
  • ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning (AAAI, 2019) [paper]
  • KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation (TACL, 2021) [paper]

Optimize Word Masks

  • ERNIE: Enhanced Language Representation with Informative Entities (ACL, 2019) [paper]
  • Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model (ICLR, 2020) [paper]
  • Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning (EMNLP, 2020) [paper]
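As a concrete illustration of the word-mask optimization idea behind the papers above, the following sketch masks whole entity mentions as units (rather than independent subword tokens), so the pre-training objective forces the model to recall entities. The whitespace tokenization, span indices, and masking rate are illustrative assumptions, not the exact recipe of any listed paper.

```python
import random

# A minimal sketch of entity-level masking: each entity mention is
# masked as a whole span, so predicting it requires factual recall
# rather than completing a partially visible subword sequence.

def mask_entities(tokens, entity_spans, mask_token="[MASK]", rate=0.15):
    """entity_spans: list of (start, end) token indices, end exclusive.
    Each span is masked in full with probability `rate` (illustrative)."""
    tokens = list(tokens)
    for start, end in entity_spans:
        if random.random() < rate:
            for i in range(start, end):  # mask the whole span together
                tokens[i] = mask_token
    return tokens

sent = "Bob Dylan wrote Blowin' in the Wind in 1962".split()
spans = [(0, 2), (3, 7)]  # "Bob Dylan", "Blowin' in the Wind"
print(mask_entities(sent, spans))
```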

During-training Enhancement KGPLMs

Incorporate Knowledge Encoders

  • ERNIE: Enhanced Language Representation with Informative Entities (ACL, 2019) [paper]
  • ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation [paper]
  • BERT-MK: Integrating Graph Contextualized Knowledge into Pre-trained Language Models (AI Open, 2021) [paper]
  • JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering (NAACL, 2022) [paper]
  • Knowledge-Enriched Transformer for Emotion Detection in Textual Conversations (EMNLP-IJCNLP, 2019) [paper]
  • Relational Memory-Augmented Language Models (TACL, 2022) [paper]
  • QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering (NAACL, 2021) [paper]
  • GreaseLM: Graph REASoning Enhanced Language Models for Question Answering (ICLR, 2022) [paper]
  • KLMo: Knowledge Graph Enhanced Pretrained Language Model with Fine-Grained Relationships (EMNLP, 2021) [paper]

Insert Knowledge Encoding Layers

  • K-BERT: Enabling Language Representation with Knowledge Graph (AAAI, 2020) [paper]
  • CoLAKE: Contextualized Language and Knowledge Embedding (COLING, 2020) [paper]
  • Knowledge Enhanced Contextual Word Representations (EMNLP-IJCNLP, 2019) [paper]
  • JAKET: Joint Pre-training of Knowledge Graph and Language Understanding (AAAI, 2022) [paper]
  • KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning (AAAI, 2021) [paper]

Add Independent Adapters

  • K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters (ACL-IJCNLP, 2021) [paper]
  • Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers (DeeLIO, 2020) [paper]
  • Parameter-Efficient Domain Knowledge Integration from Multiple Sources for Biomedical Pre-trained Language Models (EMNLP, 2021) [paper]
  • Commonsense Knowledge Graph-Based Adapter for Aspect-Level Sentiment Classification (Neurocomputing, 2023) [paper]
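The adapter approaches above share a common pattern: a small bottleneck module is trained on knowledge-related objectives while the PLM backbone stays frozen. Below is a minimal PyTorch sketch of such a bottleneck adapter; the layer sizes and residual placement are illustrative and only approximate designs like K-Adapter, which builds its adapters from full transformer layers.

```python
import torch
import torch.nn as nn

# A minimal sketch of a bottleneck knowledge adapter. Only the adapter's
# parameters are trained on knowledge objectives; the PLM whose hidden
# states flow through it is assumed frozen.

class KnowledgeAdapter(nn.Module):
    def __init__(self, hidden_size=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)  # down-project
        self.up = nn.Linear(bottleneck, hidden_size)    # up-project
        self.act = nn.GELU()

    def forward(self, hidden_states):
        # Residual connection: the backbone representation passes through
        # unchanged, plus a small learned knowledge-specific delta.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

h = torch.randn(2, 16, 768)          # (batch, seq_len, hidden)
print(KnowledgeAdapter()(h).shape)   # torch.Size([2, 16, 768])
```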

Modify Pre-training Task

  • SenseBERT: Driving Some Sense into BERT (ACL, 2020) [paper]
  • ERNIE: Enhanced Language Representation with Informative Entities (ACL, 2019) [paper]
  • LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention (EMNLP, 2020) [paper]
  • OAG-BERT: Towards a Unified Backbone Language Model for Academic Knowledge Services (KDD, 2022) [paper]
  • Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model (ICLR, 2020) [paper]
  • Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning (EMNLP, 2020) [paper]
  • ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning (ACL-IJCNLP, 2021) [paper]
  • SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge (EMNLP, 2020) [paper]

Post-training Enhancement KGPLMs

Fine-tune PLMs with Knowledge

  • KALA: Knowledge-Augmented Language Model Adaptation (NAACL, 2022) [paper]
  • Pre-trained Language Models with Domain Knowledge for Biomedical Extractive Summarization (Knowledge-Based Systems, 2022) [paper]
  • KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning (EMNLP-IJCNLP, 2019) [paper]
  • Enriching Contextualized Language Model from Knowledge Graph for Biomedical Information Extraction (Briefings in Bioinformatics, 2021) [paper]
  • Incorporating Commonsense Knowledge Graph in Pretrained Models for Social Commonsense Tasks (DeeLIO, 2020) [paper]

Generate Knowledge-based Prompts

  • Benchmarking Knowledge-Enhanced Commonsense Question Answering via Knowledge-to-Text Transformation (AAAI, 2021) [paper]
  • Enhanced Story Comprehension for Large Language Models through Dynamic Document-Based Knowledge Graphs (AAAI, 2022) [paper]
  • Knowledge Prompting in Pre-trained Language Model for Natural Language Understanding (EMNLP, 2022) [paper]
