Benchmark on difference recurrence relation

This repository contains benchmark programs for the difference-recurrence DP algorithm used in the GABA nucleotide sequence alignment library. For an overview and evaluation of adaptive banded DP, the other algorithm used in the GABA library, see https://github.com/ocxtal/adaptivebandbench.

Derivation of the difference recurrence

The semi-global variant of Gotoh's algorithm is shown below. It calculates an alignment path whose left end is fixed (aligned) and whose right end is free.
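One common way to write this recurrence is sketched here. The notation is an assumption rather than the README's own: $a$ and $b$ are the two sequences, $s(a_i, b_j)$ is the substitution score, $S$ is the score matrix, $E$ and $F$ are the horizontal-gap and vertical-gap matrices, and a gap of length $k$ costs $G_o + k G_e$.

$$
\begin{aligned}
S_{i,j} &= \max\{\, S_{i-1,j-1} + s(a_i, b_j),\; E_{i,j},\; F_{i,j} \,\} \\
E_{i,j} &= \max\{\, S_{i,j-1} - G_o,\; E_{i,j-1} \,\} - G_e \\
F_{i,j} &= \max\{\, S_{i-1,j} - G_o,\; F_{i-1,j} \,\} - G_e \\
S_{0,0} &= 0, \quad S_{i,0} = -(G_o + i\,G_e), \quad S_{0,j} = -(G_o + j\,G_e), \quad E_{i,0} = F_{0,j} = -\infty
\end{aligned}
$$

In this sketch the boundary row and column pin the left end of the path to the origin, while the right end is left free by taking the best-scoring cell as the end point.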

The four difference variables shown below are introduced to represent the differences between horizontally and vertically adjacent scores.
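A possible set of definitions is sketched here, assuming $S$, $E$ and $F$ denote the score, horizontal-gap and vertical-gap matrices of Gotoh's algorithm, with $i$ indexing rows and $j$ indexing columns. The index conventions are an assumption and may differ from the ones used in the GABA library.

$$
\begin{aligned}
\Delta H_{i,j} &= S_{i,j} - S_{i,j-1}, & \Delta V_{i,j} &= S_{i,j} - S_{i-1,j}, \\
\Delta E_{i,j} &= E_{i,j} - S_{i,j-1}, & \Delta F_{i,j} &= F_{i,j} - S_{i-1,j}
\end{aligned}
$$

With these definitions $\Delta H$ and $\Delta E$ compare a value with the score of the cell immediately to its left, and $\Delta V$ and $\Delta F$ compare it with the score of the cell immediately above.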

The original recurrence is transformed into a difference form.
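As a sketch of what such a difference form can look like, assume $\Delta H_{i,j} = S_{i,j} - S_{i,j-1}$, $\Delta V_{i,j} = S_{i,j} - S_{i-1,j}$, $\Delta E_{i,j} = E_{i,j} - S_{i,j-1}$, $\Delta F_{i,j} = F_{i,j} - S_{i-1,j}$, and the usual affine-gap updates $E_{i,j} = \max\{S_{i,j-1} - G_o, E_{i,j-1}\} - G_e$ and $F_{i,j} = \max\{S_{i-1,j} - G_o, F_{i-1,j}\} - G_e$. Subtracting the neighbouring score from both sides of each update then gives the following, where $A_{i,j} = S_{i,j} - S_{i-1,j-1}$ is a helper introduced only for illustration:

$$
\begin{aligned}
\Delta E_{i,j} &= \max\{\, -G_o,\; \Delta E_{i,j-1} - \Delta H_{i,j-1} \,\} - G_e \\
\Delta F_{i,j} &= \max\{\, -G_o,\; \Delta F_{i-1,j} - \Delta V_{i-1,j} \,\} - G_e \\
A_{i,j} &= \max\{\, s(a_i, b_j),\; \Delta E_{i,j} + \Delta V_{i,j-1},\; \Delta F_{i,j} + \Delta H_{i-1,j} \,\} \\
\Delta H_{i,j} &= A_{i,j} - \Delta V_{i,j-1} \\
\Delta V_{i,j} &= A_{i,j} - \Delta H_{i-1,j}
\end{aligned}
$$

No absolute score appears on the right-hand side, so every quantity involved stays small regardless of the sequence length.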

The four difference variables are bounded by constants which are determined by the scoring parameters.
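The bounds can be checked numerically. The sketch below assumes the definitions $\Delta H_{i,j} = S_{i,j} - S_{i,j-1}$, $\Delta V_{i,j} = S_{i,j} - S_{i-1,j}$, $\Delta E_{i,j} = E_{i,j} - S_{i,j-1}$, $\Delta F_{i,j} = F_{i,j} - S_{i-1,j}$, a match score $M$, a mismatch penalty $X$, and a gap of length $k$ costing $G_o + k G_e$. Under these assumptions it tests $-(G_o + G_e) \le \Delta H, \Delta V \le M + G_o + G_e$ and $-(G_o + G_e) \le \Delta E, \Delta F \le -G_e$. The function name, the parameter values and the bounds themselves are illustrative and are not taken from the GABA source.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical scoring parameters, chosen only for this sketch. */
enum { M = 2, X = 3, GO = 5, GE = 1 };

static int max2(int a, int b) { return a > b ? a : b; }
static int in_range(int v, int lo, int hi) { return lo <= v && v <= hi; }

/* Fills the Gotoh matrices for two pseudo-random sequences of length n and
 * returns 1 when all four differences stay inside the assumed bounds. */
int differences_are_bounded(int n, unsigned seed)
{
    int w = n + 1, ok = 1;
    char *a = malloc(n), *b = malloc(n);
    int *S = malloc(sizeof(int) * w * w);   /* S_{i,j} is stored at S[i * w + j] */
    int *F = malloc(sizeof(int) * w);       /* F of the previous row, per column */

    for (int k = 0; k < n; k++) {           /* small deterministic LCG */
        seed = seed * 1103515245u + 12345u; a[k] = "ACGT"[(seed >> 16) & 3];
        seed = seed * 1103515245u + 12345u; b[k] = "ACGT"[(seed >> 16) & 3];
    }
    S[0] = 0;
    for (int k = 1; k <= n; k++) {
        S[k] = S[k * w] = -(GO + k * GE);   /* boundary row and column */
        F[k] = INT_MIN / 2;                 /* F_{0,j} = -infinity */
    }
    for (int i = 1; i <= n; i++) {
        int E = INT_MIN / 2;                /* E_{i,0} = -infinity */
        for (int j = 1; j <= n; j++) {
            int left = S[i * w + j - 1], up = S[(i - 1) * w + j];
            int s = (a[i - 1] == b[j - 1]) ? M : -X;
            E = max2(left - GO, E) - GE;
            F[j] = max2(up - GO, F[j]) - GE;
            int cur = max2(S[(i - 1) * w + j - 1] + s, max2(E, F[j]));
            S[i * w + j] = cur;
            ok &= in_range(cur - left,  -(GO + GE), M + GO + GE);  /* dH */
            ok &= in_range(cur - up,    -(GO + GE), M + GO + GE);  /* dV */
            ok &= in_range(E - left,    -(GO + GE), -GE);          /* dE */
            ok &= in_range(F[j] - up,   -(GO + GE), -GE);          /* dF */
        }
    }
    free(a); free(b); free(S); free(F);
    return ok;
}
```

Calling `differences_are_bounded(200, 1u)` from a `main` function exercises about 40,000 cells. Because the ranges depend only on the scoring parameters, each difference fits in a few bits however long the sequences are.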

Adding gap-penalty offsets and modifying the E and F differences simplifies the formula, enabling faster calculation on superscalar processors.
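One possible offset form is sketched below. It is not necessarily the form used in the GABA library. It assumes the definitions $\Delta H_{i,j} = S_{i,j} - S_{i,j-1}$, $\Delta V_{i,j} = S_{i,j} - S_{i-1,j}$, $\Delta E_{i,j} = E_{i,j} - S_{i,j-1}$, $\Delta F_{i,j} = F_{i,j} - S_{i-1,j}$ and adds $G_o + G_e$ to each of them, so that all stored values are non-negative and fit in unsigned 8-bit cells. The function checks every cell against a plain Gotoh fill. All names and parameter values are hypothetical.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scoring parameters, chosen only for this sketch (needs X <= 2 * C). */
enum { M = 2, X = 3, GO = 5, GE = 1, C = GO + GE };

static int max2(int a, int b) { return a > b ? a : b; }
static int max3(int a, int b, int c) { return max2(max2(a, b), c); }

/* Returns 1 when the offset difference form reproduces every horizontal and
 * vertical score difference of a plain Gotoh fill on pseudo-random sequences. */
int diff_matches_gotoh(int n, unsigned seed)
{
    int w = n + 1, ok = 1;
    char *a = malloc(n), *b = malloc(n);
    int *S = malloc(sizeof(int) * w * w);   /* reference scores, S[i * w + j] */
    int *F = malloc(sizeof(int) * w);       /* reference F of the previous row */
    uint8_t *dh_up = malloc(w), *dv_up = malloc(w), *df_up = malloc(w);

    for (int k = 0; k < n; k++) {           /* small deterministic LCG */
        seed = seed * 1103515245u + 12345u; a[k] = "ACGT"[(seed >> 16) & 3];
        seed = seed * 1103515245u + 12345u; b[k] = "ACGT"[(seed >> 16) & 3];
    }
    S[0] = 0;
    for (int k = 1; k <= n; k++) {
        S[k] = S[k * w] = -(GO + k * GE);   /* boundary row and column */
        F[k] = INT_MIN / 2;                 /* F_{0,j} = -infinity */
        dh_up[k] = (k == 1) ? 0 : GO;       /* dH + C on the boundary row */
        dv_up[k] = GO; df_up[k] = 0;        /* seeds that yield dF + C = 0 in row 1 */
    }
    for (int i = 1; i <= n; i++) {
        int E = INT_MIN / 2;                /* E_{i,0} = -infinity */
        uint8_t dv_left = (i == 1) ? 0 : GO;   /* dV + C on the boundary column */
        uint8_t dh_left = GO, de_left = 0;     /* seeds that yield dE + C = 0 in column 1 */
        for (int j = 1; j <= n; j++) {
            int s = (a[i - 1] == b[j - 1]) ? M : -X;
            /* plain Gotoh fill, used as the reference */
            E = max2(S[i * w + j - 1] - GO, E) - GE;
            F[j] = max2(S[(i - 1) * w + j] - GO, F[j]) - GE;
            S[i * w + j] = max3(S[(i - 1) * w + j - 1] + s, E, F[j]);
            /* offset difference form: only max, add and subtract on small values */
            uint8_t de = max2(de_left + GO, dh_left) - dh_left;
            uint8_t df = max2(df_up[j] + GO, dv_up[j]) - dv_up[j];
            uint8_t A  = max3(s + 2 * C, de + dv_left, df + dh_up[j]);
            uint8_t dh = A - dv_left;
            uint8_t dv = A - dh_up[j];
            /* removing the offset must give back the true score differences */
            ok &= (dh - C == S[i * w + j] - S[i * w + j - 1]);
            ok &= (dv - C == S[i * w + j] - S[(i - 1) * w + j]);
            dh_up[j] = dh; dv_up[j] = dv; df_up[j] = df;
            dh_left = dh; dv_left = dv; de_left = de;
        }
    }
    free(a); free(b); free(S); free(F);
    free(dh_up); free(dv_up); free(df_up);
    return ok;
}
```

In this sketch the inner update needs no comparison against absolute scores and no sign handling. That is the property that makes such a recurrence easy to map onto narrow SIMD lanes.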

The bounding constants are also modified.

This transformation is inspired by Loving's bit-parallel global alignment algorithm. In contrast to Loving's algorithm, which keeps difference values in multiple bit vectors, ours keeps the difference variables in contiguous cells in a set of SIMD registers. The GABA library implements the last recurrence relation and its linear-gap penalty variant, combined with the adaptive banded heuristic.

Speed benchmark

The matrix calculation and traceback speed of the linear-gap and affine-gap penalty implementations found in the GABA library is measured and compared with the adaptive banded implementations without the difference recurrence and with the adaptive banded variant of Myers' bit-parallel edit distance algorithm (see Myers, 1999 and Kimura, 2010).

The benchmark program is built with the waf build framework. The following command line makes waf build the program with the Intel C Compiler (icc).

$ ./waf configure CC=icc build

The results, shown in the table and figure below, were obtained with the AVX2-enabled build of the GABA library on a 3.6 GHz Intel Haswell processor. The fastest was the linear-gap penalty GABA implementation, which was 5% faster than the adaptive banded bit-parallel edit distance implementation. The difference recurrence accelerated the non-difference implementation by a factor of 2. The affine-gap penalty implementations exhibited a similar trend. The fill-in loop took 18 clock cycles on average in the linear-gap implementation and 25 in the affine-gap one. The traceback ran in 7 cycles per loop in both implementations.

Performance notices

The fill-in loop of the affine-gap implementation in the GABA library uses at most 16 SIMD vectors at the same time when it is compiled with the SSE4.1 option. The Intel C Compiler (icc) backend manages to keep all the vectors in the 16 xmm registers, whereas the other compilers, gcc and clang, generate one or two register spills and reloads. It is confirmed that this (and some other factors) causes a 10-20% performance degradation in both the linear and affine implementations.

License

Apache v2.

