[Submitted on 4 Aug 2017 (v1), last revised 16 Mar 2018 (this version, v5)]
Title: Minimap2: pairwise alignment for nucleotide sequences
Abstract: Motivation: Recent advances in sequencing technologies promise ultra-long reads of $\sim$100 kilobases (kb) on average, full-length mRNA or cDNA reads in high throughput, and genomic contigs over 100 megabases (Mb) in length. Existing alignment programs either cannot process such data at scale or do so inefficiently, which calls for the development of new alignment algorithms.
Results: Minimap2 is a general-purpose alignment program to map DNA or long mRNA sequences against a large reference database. It works with accurate short reads of $\ge$100 bp in length, $\ge$1 kb genomic reads at an error rate of $\sim$15%, full-length noisy Direct RNA or cDNA reads, and assembly contigs or closely related full chromosomes of hundreds of megabases in length. Minimap2 performs split-read alignment, employs a concave gap cost for long insertions and deletions (INDELs), and introduces new heuristics to reduce spurious alignments. It is 3-4 times faster than mainstream short-read mappers at comparable accuracy and $\ge$30 times faster at higher accuracy for both genomic and mRNA reads, surpassing most aligners specialized in one type of alignment.
Availability and implementation: this https URL
Contact: [email protected]
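The abstract mentions a concave gap cost for long INDELs without giving its form. As a minimal sketch, assuming a two-piece affine cost (the minimum of two affine functions, which is one common way to obtain a concave penalty), the idea can be illustrated as follows. The function names and the parameter values below are illustrative assumptions, not taken from this page.

```python
def affine_gap_cost(length, gap_open, gap_extend):
    """Classic affine cost: a fixed opening penalty plus a per-base extension."""
    return gap_open + length * gap_extend


def two_piece_gap_cost(length, open_short=4, ext_short=2,
                       open_long=24, ext_long=1):
    """Concave gap cost built as the minimum of two affine pieces.

    The "short" piece is cheap to open but costly to extend; the "long"
    piece is costly to open but cheap to extend. All four parameter
    values are assumed here purely for illustration.
    """
    if length <= 0:
        return 0  # no gap, no penalty
    return min(
        affine_gap_cost(length, open_short, ext_short),
        affine_gap_cost(length, open_long, ext_long),
    )


if __name__ == "__main__":
    # Compare a single affine cost with the two-piece cost as gaps grow.
    print("length  affine  two-piece")
    for length in (1, 10, 20, 50, 1000):
        single = affine_gap_cost(length, 4, 2)
        concave = two_piece_gap_cost(length)
        print(f"{length:6d}  {single:6d}  {concave:9d}")
```

With these assumed values the two pieces cross at a gap length of 20: shorter gaps are scored by the first piece, while longer gaps switch to the second, so a 1000-base deletion is penalized far less than a single affine cost would penalize it. That is the property that lets an aligner keep a long INDEL inside one alignment rather than breaking it apart.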
Submission history
From: Heng Li
[v1] Fri, 4 Aug 2017 13:35:34 UTC (48 KB)
[v2] Fri, 25 Aug 2017 09:55:44 UTC (52 KB)
[v3] Mon, 6 Nov 2017 16:12:44 UTC (60 KB)
[v4] Tue, 2 Jan 2018 00:34:55 UTC (62 KB)
[v5] Fri, 16 Mar 2018 16:14:54 UTC (63 KB)