Inducing anxiety in large language models can induce bias

Coda-Forno, Julian; Witte, Kristin; Jagadish, Akshay K.; Binz, Marcel; Akata, Zeynep; Schulz, Eric

Computer Science > Computation and Language

arXiv:2304.11111 (cs)

[Submitted on 21 Apr 2023 (v1), last revised 15 Oct 2024 (this version, v2)]

Abstract: Large language models (LLMs) are transforming research on machine learning while galvanizing public debates. Understanding not only when these models work well and succeed but also why they fail and misbehave is of great societal relevance. We propose to turn the lens of psychiatry, a framework used to describe and modify maladaptive behavior, to the outputs produced by these models. We focus on twelve established LLMs and subject them to a questionnaire commonly used in psychiatry. Our results show that six of the latest LLMs respond robustly to the anxiety questionnaire, producing comparable anxiety scores to humans. Moreover, the LLMs' responses can be predictably changed by using anxiety-inducing prompts. Anxiety-induction not only influences LLMs' scores on an anxiety questionnaire but also influences their behavior in a previously-established benchmark measuring biases such as racism and ageism. Importantly, greater anxiety-inducing text leads to stronger increases in biases, suggesting that how anxiously a prompt is communicated to large language models has a strong influence on their behavior in applied settings. These results demonstrate the usefulness of methods taken from psychiatry for studying the capable algorithms to which we increasingly delegate authority and autonomy.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2304.11111 [cs.CL]
	(or arXiv:2304.11111v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2304.11111

Submission history

From: Julian Coda-Forno
[v1] Fri, 21 Apr 2023 16:29:43 UTC (1,328 KB)
[v2] Tue, 15 Oct 2024 14:20:51 UTC (5,875 KB)