Investigating Forgetting in Pre-Trained Representations Through Continual Learning

Luo, Yun; Yang, Zhen; Bai, Xuefeng; Meng, Fandong; Zhou, Jie; Zhang, Yue

Abstract: Representation forgetting refers to the drift of contextualized representations during continual training. Intuitively, representation forgetting can affect the general knowledge stored in pre-trained language models (LMs), but its concrete effect remains unclear. In this paper, we study the effect of representation forgetting on the generality of pre-trained language models, i.e., their potential capability for tackling future downstream tasks. Specifically, we design three metrics, overall generality destruction (GD), syntactic knowledge forgetting (SynF), and semantic knowledge forgetting (SemF), to measure the evolution of general knowledge in continual learning. Through extensive experiments, we find that generality is destroyed in various pre-trained LMs and that syntactic and semantic knowledge is forgotten through continual learning. Based on our experiments and analysis, we further derive two insights into alleviating general knowledge forgetting: 1) training on general linguistic tasks first can mitigate general knowledge forgetting; 2) a hybrid continual learning method can mitigate generality destruction and maintain more general knowledge than methods that consider only rehearsal or regularization.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2305.05968 [cs.CL]
	(or arXiv:2305.05968v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2305.05968
