Commit: Update README.md
jason9693 committed Feb 5, 2023
1 parent 87df79e commit a904c08
Showing 1 changed file with 14 additions and 6 deletions.

README.md
@@ -25,7 +25,7 @@ You can download benchmark set **APEACH**. `APEACH/test.csv` in this repositor
![](resource/dist_lengths.png)

## Paper
-- https://arxiv.org/pdf/2202.12459.pdf
+- https://aclanthology.org/2022.findings-emnlp.525/

## Experiment Code
<a href="https://colab.research.google.com/drive/1djd0fuoMYIaf7VCHaLQIziJi4_yBJruP#scrollTo=VPR24ysr5Q7k"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="base"/></a>
@@ -46,11 +46,19 @@ We also share BEST model of our dataset which we trained in this experiment as c

## Citation
```
-@article{yang2022apeach,
-    title={APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets},
-    author={Yang, Kichang and Jang, Wonjun and Cho, Won Ik},
-    journal={arXiv preprint arXiv:2202.12459},
-    year={2022}
+@inproceedings{yang-etal-2022-apeach,
+    title = "{APEACH}: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets",
+    author = "Yang, Kichang  and
+      Jang, Wonjun  and
+      Cho, Won Ik",
+    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2022",
+    month = dec,
+    year = "2022",
+    address = "Abu Dhabi, United Arab Emirates",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2022.findings-emnlp.525",
+    pages = "7076--7086",
+    abstract = "In hate speech detection, developing training and evaluation datasets across various domains is the critical issue. Whereas, major approaches crawl social media texts and hire crowd-workers to annotate the data. Following this convention often restricts the scope of pejorative expressions to a single domain lacking generalization. Sometimes domain overlap between training corpus and evaluation set overestimate the prediction performance when pretraining language models on low-data language. To alleviate these problems in Korean, we propose APEACH that asks unspecified users to generate hate speech examples followed by minimal post-labeling. We find that APEACH can collect useful datasets that are less sensitive to the lexical overlaps between the pretraining corpus and the evaluation set, thereby properly measuring the model performance.",
}
```
