Code Repo for "Rationale-based Opinion Summarization"

leehaoyuan/RATION

RATION

This repository presents the implementation of the NAACL 2024 paper:

Rationale-based Opinion Summarization,
Haoyuan Li and Snigdha Chaturvedi

Data and Model

Download the data file from this link and unzip it into the data folder. It contains the subsets of the Space and Yelp datasets used in RATION's experiments, along with RATION's intermediate outputs. It also includes the training data for the specificity estimation model and the textual alignment models.

Download all model files from this link and unzip them into the model folder. They include the checkpoints for SemAE and Snippext, as well as the specificity estimation model and the textual alignment model used by RATION.

The generated rationale-based opinion summaries are in the ration_summary folder. The rationales extracted by RATION and GPT-3.5 are in the rationales folder. The conventional opinion summaries generated by SemAE are in the conv_summary folder.

Environment

RATION depends on SemAE and Snippext. However, these two repos use an older version of PyTorch that is not compatible with the rest of RATION's code. Therefore, create one environment for these two repos by following their own instructions (denoted as old_env), and create another environment based on the following instructions:

  • Python version: python3.8

  • Dependencies: Use the requirements.txt file and conda/pip to install all necessary dependencies. E.g., for pip:

      pip install -U pip
      pip install -U setuptools
      pip install -r requirements.txt 
    

This environment is denoted as new_env.

Using RATION to Generate Rationale-based Opinion Summaries

To generate rationale-based opinion summaries from scratch, follow the steps below.
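Each of the four steps below corresponds to one script per dataset, with the first step run under old_env and the rest under new_env. As a convenience, the steps could be chained with a small driver such as the following sketch. It assumes that old_env and new_env are conda environments with exactly those names and that it is launched from the repository root; the driver itself is not part of the repository.

```python
import subprocess

# Script paths are the ones named in this README. Treating old_env/new_env
# as conda environment names is an assumption of this sketch.
STEPS = [
    ("old_env", "script/extract_opinion_{d}.sh"),
    ("new_env", "script/gen_ration_cand_{d}.sh"),
    ("new_env", "script/gen_ration_{d}.sh"),
    ("new_env", "script/gen_summary_{d}.sh"),
]


def build_commands(dataset):
    """Return the four pipeline commands, in order, for 'space' or 'yelp'."""
    if dataset not in ("space", "yelp"):
        raise ValueError("dataset must be 'space' or 'yelp'")
    return [
        ["conda", "run", "-n", env, "sh", path.format(d=dataset)]
        for env, path in STEPS
    ]


def run_pipeline(dataset):
    """Run each step in order and stop at the first failing script."""
    for cmd in build_commands(dataset):
        subprocess.run(cmd, check=True)
```

Calling run_pipeline("space") would then execute the Space scripts one after another. Keeping build_commands separate makes it easy to print and inspect the commands before running anything.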

1. Extract Representative Opinions

RATION uses SemAE to generate conventional opinion summaries and uses Snippext to extract representative opinions from these summaries. Under old_env, please run sh script/extract_opinion_space.sh for the Space dataset or sh script/extract_opinion_yelp.sh for the Yelp dataset. Compared to the original implementation of SemAE, RATION additionally requires that each extracted summary sentence contains at least one opinion identified by Snippext. You may use other summarization models to generate the conventional opinion summaries.
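The extra restriction on summary sentences can be pictured as a simple filter. The sketch below is not the repository's code: the function name and the dictionary that maps each sentence to its extracted opinions are hypothetical stand-ins for Snippext's output.

```python
def keep_sentences_with_opinions(sentences, opinions_by_sentence):
    """Keep only sentences for which at least one opinion was extracted.

    opinions_by_sentence is a hypothetical mapping from a sentence to the
    list of opinions extracted from it (empty or missing means none).
    """
    return [s for s in sentences if opinions_by_sentence.get(s)]


# Made-up example data, for illustration only.
sentences = [
    "The staff is friendly.",
    "We stayed for three nights.",
]
opinions_by_sentence = {
    "The staff is friendly.": ["friendly staff"],
    "We stayed for three nights.": [],
}
kept = keep_sentences_with_opinions(sentences, opinions_by_sentence)
```

Here only the first sentence survives, because no opinion was extracted from the second one.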

2. Extract Rationale Candidates for Representative Opinions

RATION extracts review sentences as rationale candidates for each representative opinion based on the textual alignment model. Under new_env, please run sh script/gen_ration_cand_space.sh for the Space dataset or sh script/gen_ration_cand_yelp.sh for the Yelp dataset. The scripts also run automatic evaluations for the rationale candidate sets. The values of --self_sim_thres are tuned so that each entity has 8 rationale candidate sets on average. The values of --entail_thres are tuned so that the rationale candidate sets of each entity together cover 30% of review sentences on average.
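If you adapt RATION to a new dataset, --entail_thres would need to be re-tuned against the 30% coverage target. The sketch below shows one way to do that with a grid search. It assumes that a review sentence counts as covered when its alignment score with at least one opinion reaches the threshold, and that scores are available as one opinions-by-sentences matrix per entity; both the data layout and the function names are assumptions, not the repository's implementation.

```python
def average_coverage(scores_by_entity, thres):
    """Mean over entities of the fraction of review sentences that are a
    candidate for at least one opinion at the given threshold."""
    fractions = []
    for scores in scores_by_entity:  # scores[i][j]: opinion i vs. sentence j
        n_sent = len(scores[0])
        covered = sum(
            1 for j in range(n_sent)
            if any(row[j] >= thres for row in scores)
        )
        fractions.append(covered / n_sent)
    return sum(fractions) / len(fractions)


def tune_threshold(scores_by_entity, grid, target=0.30):
    """Pick the grid value whose average coverage is closest to the target."""
    return min(
        grid,
        key=lambda t: abs(average_coverage(scores_by_entity, t) - target),
    )


# Made-up scores for a single entity: 2 opinions x 10 review sentences.
toy_scores = [[
    [0.95, 0.92, 0.10, 0.20, 0.60, 0.10, 0.10, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.91, 0.75, 0.10, 0.55, 0.10, 0.10, 0.10, 0.10],
]]
best = tune_threshold(toy_scores, grid=[0.5, 0.7, 0.9])
```

On this toy input a threshold of 0.9 covers 3 of the 10 sentences, so it is the grid value that matches the 30% target. The same pattern could be applied to --self_sim_thres with the number of candidate sets per entity as the target.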

3. Extract Rationales for Representative Opinions

RATION extracts k rationales for each representative opinion from the rationale candidate sets. Under new_env, please run sh script/gen_ration_space.sh for the Space dataset or sh script/gen_ration_yelp.sh for the Yelp dataset. The scripts also evaluate the extracted rationales and run automatic evaluations for the rationale candidate sets. The values of --entail_thres are set to the same values as in the previous step.
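The paper's abstract describes the rationale extraction as Gibbs-sampling-based. To give a feel for that kind of procedure, the sketch below samples a size-k subset by repeatedly resampling one slot while the other k-1 slots stay fixed. It is a generic illustration only: the function names, the toy scoring function, and the temperature and sweep parameters are all assumptions, and RATION's actual scoring of relatedness, specificity, popularity, and diversity is not reproduced here.

```python
import math
import random


def gibbs_select(candidates, k, score_fn, n_sweeps=50, temperature=1.0, seed=0):
    """Gibbs-style sampling of k distinct candidate indices.

    Each step resamples one slot conditioned on the other k-1 slots, with
    probability proportional to exp(score / temperature). The best-scoring
    subset seen during sampling is returned as a sorted list of indices.
    """
    rng = random.Random(seed)
    n = len(candidates)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of candidates")
    state = rng.sample(range(n), k)
    best, best_score = list(state), score_fn(state)
    for _ in range(n_sweeps):
        for slot in range(k):
            others = state[:slot] + state[slot + 1:]
            options = [i for i in range(n) if i not in others]
            scores = [score_fn(others + [i]) for i in options]
            top = max(scores)  # subtract the max for numerical stability
            weights = [math.exp((s - top) / temperature) for s in scores]
            state[slot] = rng.choices(options, weights=weights, k=1)[0]
            current = score_fn(state)
            if current > best_score:
                best, best_score = list(state), current
    return sorted(best)


# Hypothetical toy objective: reward per-candidate quality and penalize
# picking near-duplicates (candidates that share a group).
quality = [0.9, 0.85, 0.2, 0.8, 0.1]
group = [0, 0, 1, 2, 1]


def toy_score(selected):
    redundancy = len(selected) - len({group[i] for i in selected})
    return sum(quality[i] for i in selected) - redundancy


candidates = ["c0", "c1", "c2", "c3", "c4"]  # placeholder sentences
picked = gibbs_select(candidates, k=2, score_fn=toy_score)
```

The score function is expected to depend only on which indices are selected, not on their order, since the sampler evaluates it on reordered lists.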

4. Generate Rationale-based Opinion Summaries

RATION generates rationale-based opinion summaries combining representative opinions and their rationales. Under new_env, please run sh script/gen_summary_space.sh for the Space dataset or sh script/gen_summary_yelp.sh for the Yelp dataset.
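Conceptually, this step pairs each representative opinion with its rationales. The sketch below illustrates that pairing with made-up data. The function name, the input structures, and the indented bullet layout are assumptions for illustration and may differ from the files the scripts write to the ration_summary folder.

```python
def build_summary(opinions, rationales_by_opinion):
    """Pair each representative opinion with its extracted rationales.

    The output layout (opinion line followed by indented rationale lines)
    is a hypothetical format chosen for readability.
    """
    lines = []
    for opinion in opinions:
        lines.append(opinion)
        for rationale in rationales_by_opinion.get(opinion, []):
            lines.append("  - " + rationale)
    return "\n".join(lines)


# Made-up example data, for illustration only.
opinions = ["The staff is friendly."]
rationales_by_opinion = {
    "The staff is friendly.": ["The front desk upgraded our room for free."],
}
summary = build_summary(opinions, rationales_by_opinion)
```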

Train Specificity Estimation Model

We fine-tune the deberta-base model using the data from this link. Under new_env, please run sh script/train_specificity.sh.

Train Textual Alignment Model

We fine-tune the roberta-large model with masked language modeling and classification objectives, using data from this link for the Space dataset and data from this link for the Yelp dataset. Under new_env, please run sh script/train_align_space.sh for the Space dataset or sh script/train_align_yelp.sh for the Yelp dataset.

Citation

@inproceedings{li-chaturvedi-2024-rationale,
    title = "Rationale-based Opinion Summarization",
    author = "Li, Haoyuan  and
      Chaturvedi, Snigdha",
    editor = "Duh, Kevin  and
      Gomez, Helena  and
      Bethard, Steven",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
    month = jun,
    year = "2024",
    address = "Mexico City, Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.naacl-long.458",
    pages = "8267--8285",
    abstract = "Opinion summarization aims to generate concise summaries that present popular opinions of a large group of reviews. However, these summaries can be too generic and lack supporting details. To address these issues, we propose a new paradigm for summarizing reviews, rationale-based opinion summarization. Rationale-based opinion summaries output the representative opinions as well as one or more corresponding rationales. To extract good rationales, we define four desirable properties: relatedness, specificity, popularity, and diversity and present a Gibbs-sampling-based method to extract rationales. Overall, we propose RATION, an unsupervised extractive system that has two components: an Opinion Extractor (to extract representative opinions) and Rationales Extractor (to extract corresponding rationales). We conduct automatic and human evaluations to show that rationales extracted by RATION have the proposed properties and its summaries are more useful than conventional summaries. The implementation of our work is available at https://github.com/leehaoyuan/RATION.",
}
