IAPT: Instruction-Aware Prompt Tuning for Large Language Models

Zhu, Wei; Tian, Aaron Xuxiang; Yin, Congrui; Ni, Yuan; Wang, Xiaoling; Xie, Guotong

arXiv:2405.18203 (cs)

[Submitted on 28 May 2024 (v1), last revised 7 Jun 2024 (this version, v2)]

Abstract: Soft prompt tuning is a widely studied parameter-efficient fine-tuning method. However, it has a clear drawback: many soft tokens must be inserted into the input sequences to guarantee downstream performance. As a result, soft prompt tuning has received less attention than low-rank adaptation (LoRA) in the large language model (LLM) era. In this work, we propose a novel prompt tuning method, Instruction-Aware Prompt Tuning (IAPT), that requires only four soft tokens. First, we install a parameter-efficient soft prompt generator at each Transformer layer to generate idiosyncratic soft prompts for each input instruction. The generated soft prompts can be seen as a semantic summary of the input instructions and can effectively guide the output generation. Second, the soft prompt generators are modules with a bottleneck architecture consisting of a self-attention pooling operation, two linear projections, and an activation function. Pilot experiments show that prompt generators at different Transformer layers require different activation functions. Thus, we propose to learn the idiosyncratic activation functions for prompt generators automatically with the help of rational functions. We have conducted experiments on various tasks, and the experimental results demonstrate that (a) our IAPT method can outperform the recent baselines with comparable tunable parameters, and (b) our IAPT method is more efficient than LoRA under the single-backbone multi-tenant setting.
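
The bottleneck generator described in the abstract (self-attention pooling, a down-projection, an activation, and an up-projection) can be illustrated with a rough sketch. This is not the authors' implementation: the abstract gives no formulas, so the learned-query form of the pooling, the "safe" rational form P(x) / (1 + |Q(x)|), and all names, shapes, and coefficients below are assumptions chosen only for illustration.

```python
import math
import random


def rational(x, num, den):
    """Rational activation P(x) / (1 + |Q(x)|).

    ASSUMPTION: this 'safe' form (denominator never zero) is one common
    parameterization; the abstract does not say which form IAPT uses.
    num holds P's coefficients from degree 0 up; den holds Q's from degree 1 up.
    """
    p = sum(a * x ** i for i, a in enumerate(num))
    q = sum(b * x ** (j + 1) for j, b in enumerate(den))
    return p / (1.0 + abs(q))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def attention_pool(H, queries):
    """Pool token hidden states H into one vector per query.

    ASSUMPTION: pooling uses one learned query vector per soft token and
    softmax weights over the instruction tokens.
    """
    pooled = []
    for q in queries:
        scores = [sum(qi * hi for qi, hi in zip(q, h)) for h in H]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        pooled.append(
            [sum(w / total * h[d] for w, h in zip(weights, H)) for d in range(len(H[0]))]
        )
    return pooled


def generate_soft_prompt(H, queries, W_down, W_up, num, den):
    """Bottleneck: pool -> down-project -> rational activation -> up-project."""
    prompts = []
    for v in attention_pool(H, queries):
        hidden = [rational(x, num, den) for x in matvec(W_down, v)]
        prompts.append(matvec(W_up, hidden))
    return prompts


def rand_matrix(rows, cols):
    """Small random matrix standing in for trained weights or hidden states."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
d, r, n_tokens, n_soft = 8, 2, 5, 4   # hidden size, bottleneck size, instruction length, soft tokens
H = rand_matrix(n_tokens, d)          # hypothetical hidden states of the instruction at one layer
queries = rand_matrix(n_soft, d)      # hypothetical pooling queries, one per soft token
W_down, W_up = rand_matrix(r, d), rand_matrix(d, r)
num, den = [0.0, 1.0, 0.1], [0.1]     # illustrative coefficients; these would be learned per layer

soft_prompt = generate_soft_prompt(H, queries, W_down, W_up, num, den)
print(len(soft_prompt), len(soft_prompt[0]))  # 4 8
```

In this sketch each Transformer layer would hold its own `queries`, `W_down`, `W_up`, `num`, and `den`, so the activation shape can differ across layers while the generator adds only a small number of parameters relative to the backbone.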

Comments:	Accepted by ACL-2024
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2405.18203 [cs.CL]
	(or arXiv:2405.18203v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2405.18203

Submission history

From: Wei Zhu
[v1] Tue, 28 May 2024 14:11:01 UTC (601 KB)
[v2] Fri, 7 Jun 2024 06:41:18 UTC (605 KB)
