SeqTrans: Automatic Vulnerability Fix via Sequence to Sequence Learning

Chi, Jianlei; Qu, Yu; Liu, Ting; Zheng, Qinghua; Yin, Heng

arXiv:2010.10805 (cs)

[Submitted on 21 Oct 2020 (v1), last revised 22 Mar 2022 (this version, v3)]

Abstract: Software vulnerabilities are now reported at an unprecedented rate, owing to recent advances in automated vulnerability hunting tools. However, fixing vulnerabilities still depends mainly on programmers' manual effort: developers must understand the vulnerability deeply while affecting the system's functionality as little as possible.
In this paper, building on advances in Neural Machine Translation (NMT), we present a novel approach called SeqTrans that exploits historical vulnerability fixes to suggest repairs and automatically fix source code. To capture the contextual information around the vulnerable code, we leverage data flow dependencies to construct code sequences and feed them into a state-of-the-art transformer model. A fine-tuning strategy is introduced to overcome the small sample size problem. We evaluate SeqTrans on a dataset containing 1,282 commits that fix 624 vulnerabilities in 205 Java projects. Results show that SeqTrans outperforms the latest techniques, achieving an accuracy of 23.3% on statement-level fixes and 25.3% on CVE-level fixes. We also examine the results in depth and observe that the NMT model performs very well on certain kinds of vulnerabilities, such as CWE-287 (Improper Authentication) and CWE-863 (Incorrect Authorization).
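
To make the idea of building a code sequence from data flow dependencies concrete, here is a minimal Python sketch. It is an illustration under stated assumptions and not the SeqTrans implementation. The `Statement` class, the hand-annotated def/use sets, the one-hop lookup, and the whitespace tokenization are all hypothetical simplifications. A real pipeline would derive definitions and uses from a Java parser.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    """One source statement with hand-annotated def/use sets (toy input)."""
    text: str
    defs: frozenset = frozenset()
    uses: frozenset = frozenset()


def reaching_definitions(statements, target_index):
    """Return indices of the nearest earlier statements that define each
    variable used by the target statement (straight-line code, one hop)."""
    needed = set(statements[target_index].uses)
    found = []
    # Walk backwards so the closest definition of each variable wins.
    for i in range(target_index - 1, -1, -1):
        hit = needed & statements[i].defs
        if hit:
            found.append(i)
            needed -= hit
    return sorted(found)


def build_sequence(statements, target_index):
    """Concatenate the defining statements and the target statement
    into one whitespace-tokenized sequence."""
    indices = reaching_definitions(statements, target_index) + [target_index]
    tokens = []
    for i in indices:
        tokens.extend(statements[i].text.split())
    return tokens


# Hypothetical Java-like method body, already split into spaced tokens.
method = [
    Statement("String user = request . getParameter ( NAME ) ;",
              frozenset({"user"}), frozenset({"request"})),
    Statement("int retries = 3 ;", frozenset({"retries"})),
    Statement("String path = base + user ;",
              frozenset({"path"}), frozenset({"base", "user"})),
    Statement("File f = new File ( path ) ;",
              frozenset({"f"}), frozenset({"path"})),
]

sequence = build_sequence(method, 3)
print(" ".join(sequence))
```

In this toy input the statement that creates the `File` depends only on the definition of `path`, so the unrelated `retries` line is left out of the sequence. Pairs of such sequences taken before and after a fix could then serve as source and target for a sequence-to-sequence model.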

Comments:	22 pages, 20 figures, 7 tables
Subjects:	Cryptography and Security (cs.CR); Software Engineering (cs.SE)
Cite as:	arXiv:2010.10805 [cs.CR]
	(or arXiv:2010.10805v3 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2010.10805

Submission history

From: Jianlei Chi
[v1] Wed, 21 Oct 2020 07:49:08 UTC (2,123 KB)
[v2] Tue, 1 Jun 2021 06:17:30 UTC (2,931 KB)
[v3] Tue, 22 Mar 2022 12:45:39 UTC (3,812 KB)
