RATION

This repository presents the implementation of the NAACL 2024 paper:

Rationale-based Opinion Summarization,
Haoyuan Li and Snigdha Chaturvedi

Data and Model

Download the data file from this link and unzip it into the data folder. It contains the subsets of the Space and Yelp datasets used in the RATION experiments, the intermediate outputs of RATION, and the training data for the specificity estimation model and the textual alignment models.

Download all model files from this link and unzip them into the model folder. They include the checkpoints for SemAE and Snippext, as well as the specificity estimation model and the textual alignment models used by RATION.

The generated rationale-based opinion summaries are in the ration_summary folder. The rationales extracted by RATION and GPT-3.5 are in the rationales folder. The conventional opinion summaries generated by SemAE are in the conv_summary folder.
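After unzipping, a quick sanity check of the folder layout can catch a misplaced archive early. The sketch below is only illustrative: the folder names come from the description above, while the helper name `missing_folders` and the assumption that every folder sits at the repository root are not part of the original code.

```python
from pathlib import Path

# Folder names taken from the description above. That they all live at the
# repository root is an assumption.
EXPECTED_FOLDERS = ["data", "model", "ration_summary", "rationales", "conv_summary"]


def missing_folders(repo_root="."):
    """Return the expected folders that do not exist under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_folders()
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected folders are present.")
```

Run it from the repository root; an empty result means every folder named above was found.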

Environment

RATION depends on SemAE and Snippext. However, these two repositories use an older version of PyTorch that is not compatible with the rest of the RATION code. Therefore, create one environment for SemAE and Snippext by following the instructions in their repositories (denoted as old_env), and create a second environment by following the instructions below:

Python version: python3.8
Dependencies: Use the requirements.txt file and conda/pip to install all necessary dependencies. E.g., for pip:
```
  pip install -U pip
  pip install -U setuptools
  pip install -r requirements.txt 
```

This environment is denoted as new_env.

Using RATION to Generate Rationale-based Opinion Summaries

To generate rationale-based opinion summaries from scratch, follow the steps below.
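The steps in this section can be summarized as an ordered list of commands, each tied to the environment it should run in. The sketch below only builds that list and does not execute anything; the script paths and environment names are the ones given in this README, while the `STEPS` table and the `pipeline_commands` helper are hypothetical names introduced for illustration.

```python
# (environment, script template) pairs in the order of the four steps.
# Script names and environments are taken from this README.
STEPS = [
    ("old_env", "script/extract_opinion_{dataset}.sh"),
    ("new_env", "script/gen_ration_cand_{dataset}.sh"),
    ("new_env", "script/gen_ration_{dataset}.sh"),
    ("new_env", "script/gen_summary_{dataset}.sh"),
]


def pipeline_commands(dataset):
    """Return (environment, shell command) pairs for 'space' or 'yelp'."""
    if dataset not in ("space", "yelp"):
        raise ValueError("dataset must be 'space' or 'yelp'")
    return [(env, "sh " + template.format(dataset=dataset)) for env, template in STEPS]


if __name__ == "__main__":
    for env, command in pipeline_commands("space"):
        print(f"[{env}] {command}")
```

Printing the commands instead of running them keeps the environment switch between step 1 and the later steps explicit.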

1. Extract Representative Opinions

RATION uses SemAE to generate conventional opinion summaries and uses Snippext to extract representative opinions from them. Under old_env, run sh script/extract_opinion_space.sh for the Space dataset or sh script/extract_opinion_yelp.sh for the Yelp dataset. Compared to the original implementation of SemAE, RATION additionally requires that each extracted summary sentence contain at least one opinion identified by Snippext. You may use other summarization models to generate the conventional opinion summaries.
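The extra restriction on summary sentences amounts to a simple filter. The following is a minimal sketch of that idea, not the repository's code: it assumes the opinions found for each sentence are already available as a list, and the function name and the example data are hypothetical.

```python
def keep_sentences_with_opinions(sentences, opinions_per_sentence):
    """Keep only the sentences for which at least one opinion was extracted.

    sentences: list of summary sentences.
    opinions_per_sentence: list of opinion lists, aligned with sentences.
    """
    if len(sentences) != len(opinions_per_sentence):
        raise ValueError("inputs must have the same length")
    return [
        sentence
        for sentence, opinions in zip(sentences, opinions_per_sentence)
        if len(opinions) > 0
    ]


if __name__ == "__main__":
    # Illustrative data only.
    sentences = ["The staff is friendly.", "We stayed for three nights."]
    opinions = [["friendly staff"], []]
    print(keep_sentences_with_opinions(sentences, opinions))
```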

2. Extract Rationale Candidates for Representative Opinions

RATION extracts review sentences as rationale candidates for each representative opinion based on the textual alignment model. Under new_env, run sh script/gen_ration_cand_space.sh for the Space dataset or sh script/gen_ration_cand_yelp.sh for the Yelp dataset. The scripts also run automatic evaluations of the rationale candidate sets. The values of --self_sim_thres are tuned so that each entity has 8 rationale candidate sets on average. The values of --entail_thres are tuned so that the rationale candidate sets of each entity cover 30% of its review sentences on average.
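The README does not say how the thresholds were tuned. One plausible way to reach a coverage target such as the 30% mentioned above is a grid search over candidate thresholds, sketched below. Everything here is an assumption made for illustration: the per-sentence score (the best alignment score of a review sentence against any representative opinion), the function names, and the example values.

```python
def mean_coverage(scores_by_entity, threshold):
    """Average, over entities, of the fraction of review sentences whose
    score reaches the threshold. Each entity needs at least one score."""
    fractions = [
        sum(score >= threshold for score in scores) / len(scores)
        for scores in scores_by_entity.values()
    ]
    return sum(fractions) / len(fractions)


def tune_threshold(scores_by_entity, target=0.30, steps=100):
    """Grid-search the threshold in [0, 1] whose mean coverage is closest
    to the target; ties go to the lowest threshold."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda t: abs(mean_coverage(scores_by_entity, t) - target),
    )


if __name__ == "__main__":
    # Illustrative scores only, one list of review-sentence scores per entity.
    example = {
        "entity_a": [0.95, 0.80, 0.40, 0.30, 0.10],
        "entity_b": [0.90, 0.60, 0.50, 0.20, 0.05],
    }
    best = tune_threshold(example)
    print(best, mean_coverage(example, best))
```

The same pattern would apply to --self_sim_thres, with the number of candidate sets per entity as the quantity being matched instead of coverage.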

3. Extract Rationales for Representative Opinions

RATION extracts k rationales for each representative opinion from the rationale candidate sets. Under new_env, run sh script/gen_ration_space.sh for the Space dataset or sh script/gen_ration_yelp.sh for the Yelp dataset. The scripts also run automatic evaluations of the rationales and of the rationale candidate sets. The values of --entail_thres are set to the same values as in the previous step.

4. Generate Rationale-based Opinion Summaries

RATION generates rationale-based opinion summaries by combining the representative opinions with their rationales. Under new_env, run sh script/gen_summary_space.sh for the Space dataset or sh script/gen_summary_yelp.sh for the Yelp dataset.
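Conceptually, a rationale-based summary pairs each representative opinion with its rationales. The sketch below shows one possible text layout for such a pairing. The README does not describe the output format of the summary scripts, so the layout, the function name `format_summary`, and the example sentences are all assumptions.

```python
def format_summary(opinion_rationales):
    """Render each representative opinion followed by its rationales.

    opinion_rationales: list of (opinion, list_of_rationales) pairs.
    """
    lines = []
    for opinion, rationales in opinion_rationales:
        lines.append(opinion)
        for rationale in rationales:
            lines.append("  - " + rationale)
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative data only.
    summary = format_summary(
        [
            ("The staff is friendly.", ["The front desk clerk upgraded our room for free."]),
            ("The location is convenient.", ["The hotel is a five-minute walk from the station."]),
        ]
    )
    print(summary)
```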

Train Specificity Estimation Model

We fine-tune the deberta-base model using the data from this link. Under new_env, run sh script/train_specificity.sh.

Train Textual Alignment Model

We fine-tune the roberta-large model with masked language modeling and classification objectives, using data from this link for the Space dataset and data from this link for the Yelp dataset. Under new_env, run sh script/train_align_space.sh for the Space dataset or sh script/train_align_yelp.sh for the Yelp dataset.

Citation

@inproceedings{li-chaturvedi-2024-rationale,
    title = "Rationale-based Opinion Summarization",
    author = "Li, Haoyuan  and
      Chaturvedi, Snigdha",
    editor = "Duh, Kevin  and
      Gomez, Helena  and
      Bethard, Steven",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
    month = jun,
    year = "2024",
    address = "Mexico City, Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.naacl-long.458",
    pages = "8267--8285",
    abstract = "Opinion summarization aims to generate concise summaries that present popular opinions of a large group of reviews. However, these summaries can be too generic and lack supporting details. To address these issues, we propose a new paradigm for summarizing reviews, rationale-based opinion summarization. Rationale-based opinion summaries output the representative opinions as well as one or more corresponding rationales. To extract good rationales, we define four desirable properties: relatedness, specificity, popularity, and diversity and present a Gibbs-sampling-based method to extract rationales. Overall, we propose RATION, an unsupervised extractive system that has two components: an Opinion Extractor (to extract representative opinions) and Rationales Extractor (to extract corresponding rationales). We conduct automatic and human evaluations to show that rationales extracted by RATION have the proposed properties and its summaries are more useful than conventional summaries. The implementation of our work is available at https://github.com/leehaoyuan/RATION.",
}
