Towards A Reliable Ground-Truth For Biased Language Detection

Spinde, Timo; Krieger, David; Plank, Manuel; Gipp, Bela

arXiv:2112.07421 (cs)

[Submitted on 14 Dec 2021 (v1), last revised 17 Dec 2021 (this version, v2)]

Abstract: Reference texts such as encyclopedias and news articles can exhibit biased language when subjective writing replaces objective reporting. Existing methods to detect bias mostly rely on annotated data to train machine learning models. However, low annotator agreement and low comparability are substantial drawbacks of available media bias corpora. To evaluate data collection options, we collect and compare labels obtained from two popular crowdsourcing platforms. Our results demonstrate that existing crowdsourcing approaches yield data of insufficient quality, underlining the need for a trained expert framework to gather a more reliable dataset. By creating such a framework and gathering a first dataset, we improve Krippendorff's $\alpha$ from 0.144 (crowdsourcing labels) to 0.419 (expert labels). We conclude that detailed annotator training increases data quality, improving the performance of existing bias detection systems. We will continue to extend our dataset in the future.
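
For readers unfamiliar with the agreement measure quoted in the abstract, the following is a minimal sketch of Krippendorff's $\alpha$ for nominal labels, computed from a coincidence matrix. It is not the authors' code: the function name `krippendorff_alpha_nominal`, the choice of the nominal distance metric, and the toy labels are assumptions made purely for illustration, and the toy result has no relation to the values reported above.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data (illustrative sketch).

    `units` is a list with one entry per annotated item; each entry is the
    list of labels that item received (annotators may be missing).
    """
    # Coincidence matrix: every ordered pair of labels given to the same
    # item by different annotators counts 1 / (m - 1), m = labels on the item.
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a single label carries no agreement information
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label value and overall number of pairable labels.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one item with two or more labels")

    # Observed and expected disagreement (both scaled by n, which cancels).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one label value occurs")
    return 1.0 - observed / expected


# Hypothetical toy data: four sentences, two annotators each.
toy = [["biased", "biased"], ["biased", "neutral"],
       ["neutral", "neutral"], ["neutral", "neutral"]]
print(round(krippendorff_alpha_nominal(toy), 3))  # → 0.533
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance, which is the scale on which the reported change from 0.144 to 0.419 should be read.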

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2112.07421 [cs.CL]
	(or arXiv:2112.07421v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2112.07421

Submission history

From: Timo Spinde
[v1] Tue, 14 Dec 2021 14:13:05 UTC (721 KB)
[v2] Fri, 17 Dec 2021 11:35:03 UTC (719 KB)
