%WANT_KATEX
<h2>Applications & benchmarks</h2>
<p>
Here we illustrate FLINT's performance on various benchmarks and
give examples of research problems that were
solved with FLINT's help.
</p>
<h3>Arithmetic with huge numbers</h3>
<h4>Digits of pi</h4>
<p>
FLINT uses <a href="https://gmplib.org">GMP</a> and
<a href="https://mpfr.org">MPFR</a> under the hood but notably also
provides its own arbitrary-size integer and floating-point code
which can be 2-10 times faster in many typical applications.
A basic benchmark problem is to compute $\pi$ to 100 million digits.</p>
<table style="margin-left: auto; margin-right: auto;">
<tr><th>Implementation</th> <th>Time</th></tr>
<tr><td>GMP (<a href="https://gmplib.org/pi-with-gmp">GMP-chudnovsky</a>)</td> <td>62.1 s</td></tr>
<tr><td>MPFR (builtin function <tt>mpfr_const_pi</tt>)</td> <td>172.9 s</td></tr>
<tr><td>FLINT (builtin function <tt>arb_const_pi</tt>)</td> <td>31.6 s</td></tr>
<tr><td>FLINT (builtin function <tt>arb_const_pi</tt>) (8 threads)</td> <td>9.34 s</td></tr>
</table>
<div style="text-align:center">
<p><i>Benchmark machine: 8-core AMD Ryzen 7 PRO 5850U CPU (Zen 3).</i></p>
</div>
<p>
FLINT is about twice as fast as GMP and five times faster than MPFR on this benchmark.
When allowed to use multiple threads, FLINT is 6.6x faster than GMP (which is
single-threaded only)
and 18x faster
than MPFR.
Note that due to the nature of the algorithm
used to compute $\pi$, this benchmark heavily exercises numbers of all sizes,
not just 100-million-digit numbers.
</p>
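<p>
The standard algorithm here, used by both GMP-chudnovsky and FLINT's <tt>arb_const_pi</tt>,
is the Chudnovsky series evaluated with binary splitting. The scheme is easy to
illustrate in plain Python with big integers (an illustrative sketch only; the
real implementations are optimized C code, and FLINT additionally certifies the error):
</p>

```python
from math import isqrt

def pi_digits(digits):
    """Compute floor(pi * 10**digits) with the Chudnovsky series,
    evaluated by binary splitting.  Plain-Python sketch of the scheme
    used by GMP-chudnovsky and arb_const_pi (their C implementations
    add many refinements)."""
    n = digits // 14 + 2           # each series term adds ~14.18 digits

    def bsplit(a, b):
        # Combine terms k = a .. b-1 of the series into (P, Q, R).
        if b == a + 1:
            P = -(6*a - 5) * (2*a - 1) * (6*a - 1)
            Q = 10939058860032000 * a**3       # = 640320^3 / 24
            R = P * (545140134*a + 13591409)
            return P, Q, R
        m = (a + b) // 2
        P1, Q1, R1 = bsplit(a, m)
        P2, Q2, R2 = bsplit(m, b)
        return P1 * P2, Q1 * Q2, Q2 * R1 + P1 * R2

    P, Q, R = bsplit(1, n)
    scale = 10**(digits + 10)                  # 10 guard digits
    sqrt10005 = isqrt(10005 * scale * scale)   # floor(sqrt(10005) * scale)
    pi_scaled = (426880 * sqrt10005 * Q) // (13591409 * Q + R)
    return pi_scaled // 10**10

print(str(pi_digits(50))[:11])   # 31415926535
```

<p>
Binary splitting keeps the big-integer multiplications balanced in size, which is
exactly where fast integer multiplication pays off; this is why the benchmark
exercises numbers of all sizes and not just 100-million-digit ones.
</p>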
<p>
For comparison, FLINT is about half as fast as the special-purpose, closed-source
pi-computing program <a href="https://www.numberworld.org/y-cruncher/">y-cruncher</a>
(which completes this benchmark in 18.3 s on a single core and in 4.45 s on 8 cores);
on the other hand, FLINT can compute far more general mathematical objects.
</p>
<h4>Elliptic curve primality proving</h4>
<p>As of May 2023, the largest number that has been certified
prime using a primality test for numbers of general form
(as opposed to special tests such as Lucas-Lehmer for Mersenne primes)
is the 86453-digit repunit 11111...11111.
This was <a href="https://www.multiprecision.org/cm/ecpp.html">achieved</a> by Andreas Enge
using the elliptic curve primality proving algorithm (ECPP)
implemented in his CM software.
CM uses FLINT for certain operations such as
exponentiation in the ring $(\mathbf{Z}/n \mathbf{Z})[x] / f$
(done using FLINT's <tt>fmpz_mod_poly</tt> type)
as part of the Cantor-Zassenhaus step.
</p>
<p>FLINT itself currently includes provable primality tests such as APRCL
suitable for smaller numbers (up to a few thousand digits)
as well as probable prime tests such as Miller-Rabin and BPSW
useful for numbers of any size.</p>
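<p>
To give a flavor of how a probable-prime test works, here is a pure-Python
sketch of the Miller-Rabin strong pseudoprime test (FLINT's actual tests are
implemented in C; the fixed base set below is one particular choice which
happens to make the answer provably correct for all $n &lt; 3.3 \cdot 10^{24}$):
</p>

```python
def is_probable_prime(n, bases=(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)):
    """Miller-Rabin strong probable-prime test.  With these fixed bases
    the answer is provably correct for all n < 3.3 * 10**24; for larger
    n it is a probable-prime test."""
    if n < 2:
        return False
    for p in bases:                 # handle small primes and their multiples
        if n % p == 0:
            return n == p
    d, s = n - 1, 0                 # write n - 1 = d * 2^s with d odd
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):      # repeated squaring; look for -1 mod n
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False            # a witnesses that n is composite
    return True

# The Mersenne prime 2^61 - 1 passes; its composite neighbor does not.
print(is_probable_prime(2**61 - 1), is_probable_prime(2**61 + 1))
```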
<h3>Exact linear algebra</h3>
<h4>Integer matrix multiplication</h4>
<p>
Matrix multiplication is a common kernel for other operations.
Here we compare timings to multiply two 100 × 100 matrices with <i>b</i>-bit integer entries.
</p>
<table style="margin-left: auto; margin-right: auto;">
<tr><th>Language, integer type </th><th> Algorithm </th><th> $b = 10$</th><th> $b = 100$</th><th> $b = 1000$</th></tr>
<tr><td>Julia, <tt>BigInt</tt> </td><td> Naive (built-in <tt>*</tt> for <tt>Array</tt>) </td><td> 0.224</td><td> 0.356</td><td> 0.542</td></tr>
<tr><td>Python, <tt>int</tt> </td><td> Naive (<tt>@</tt> for <tt>numpy.array</tt>) </td><td> 0.0375 </td><td> 0.0731</td><td> 0.890</td></tr>
<tr><td>Python, <tt>gmpy2.mpz</tt> </td><td> Naive (<tt>@</tt> for <tt>numpy.array</tt>) </td><td> 0.0479 </td><td> 0.0537</td><td> 0.192</td></tr>
<tr><td>Pari/GP, <tt>t_INT</tt></td><td> Naive ($n^3 \times$ (<tt>c += a * b</tt>)) </td><td> 0.211</td><td> 0.219</td><td> 0.369</td></tr>
<tr><td>Pari/GP, <tt>t_INT</tt></td><td> Optimized (built-in <tt>*</tt> for <tt>t_MAT</tt>) </td><td> 0.00147 </td><td> 0.00920</td><td> 0.102</td></tr>
<tr><td>GMP (C++ wrapper), <tt>mpz_class</tt></td><td> Naive ($n^3 \times$ (<tt>c += a * b</tt>)) </td><td> 0.0287 </td><td> 0.0294</td><td> 0.169</td></tr>
<tr><td>GMP, <tt>mpz_t</tt></td><td> Naive ($n^3 \times$ (<tt>mpz_addmul</tt>)) </td><td> 0.0122 </td><td> 0.0172</td><td> 0.159</td></tr>
<tr><td>FLINT, <tt>fmpz_t</tt></td><td> Naive ($n^3 \times$ (<tt>fmpz_addmul</tt>)) </td><td> 0.00395 </td><td> 0.0213</td><td> 0.163</td></tr>
<tr><td>FLINT, <tt>fmpz_t</tt></td><td> Optimized (<tt>fmpz_mat_mul</tt>) </td><td> 0.0000678</td><td> 0.00235</td><td> 0.0456</td></tr>
<tr><td>FLINT, <tt>fmpz_t</tt></td><td> Optimized (<tt>fmpz_mat_mul</tt>), 8 threads </td><td> 0.0000675</td><td> 0.000800</td><td> 0.0142</td></tr>
</table>
<div style="text-align:center">
<p>
<i>
Timing results are given in seconds.
Benchmark machine: 8-core AMD Ryzen 7 PRO 5850U CPU (Zen 3).
Note: for this benchmark, we configured FLINT to use the BLAS backend for modular matrix multiplications.</i></p>
</div>
<p>
The main takeaway is that an optimized integer matrix multiplication algorithm
such as that provided by FLINT's <tt>fmpz_mat_mul</tt>
is much faster (sometimes by more than a factor of 1000) than a naive loop doing $n^3$ big-integer multiplications and additions.
For the matrices in this benchmark, FLINT's <tt>fmpz_mat_mul</tt> uses a multimodular algorithm.
Pari/GP uses a similar algorithm, but it is somewhat slower than FLINT.
</p>
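<p>
The idea of the multimodular algorithm is simple: reduce the entries modulo
several word-size primes, multiply each modular image (where very fast modular
or BLAS code applies), and reconstruct the integer entries by the Chinese
remainder theorem. A pure-Python sketch; the fixed prime list and naive modular
products below are simplified stand-ins, whereas FLINT chooses enough primes
from bounds on the entries and dispatches the modular products to optimized code:
</p>

```python
from functools import reduce

# Three primes just below 2^31 (a simplification: FLINT picks as many
# word-size primes as the entry bounds require).
PRIMES = (2**31 - 1, 2**31 - 19, 2**31 - 61)

def mat_mul_multimodular(A, B, primes=PRIMES):
    """Multiply integer matrices the multimodular way: reduce mod several
    primes, multiply each modular image, and recover the entries by the
    Chinese remainder theorem, lifting to the symmetric range so that
    negative entries come back correctly.  Assumes every entry of A*B
    has absolute value below M/2, where M is the product of the primes."""
    n, k, m = len(A), len(B), len(B[0])
    M = reduce(lambda x, y: x * y, primes)
    products = []
    for p in primes:
        # In FLINT this inner product runs in fast modular (or BLAS) code.
        products.append([[sum(A[i][t] * B[t][j] for t in range(k)) % p
                          for j in range(m)] for i in range(n)])

    def crt(residues):
        x = 0
        for p, r in zip(primes, residues):
            Mp = M // p
            x = (x + r * Mp * pow(Mp, -1, p)) % M
        return x - M if x > M // 2 else x   # symmetric lift

    return [[crt([prod[i][j] for prod in products])
             for j in range(m)] for i in range(n)]

A = [[3, -1], [7, 5]]
B = [[2, -4], [6, 8]]
print(mat_mul_multimodular(A, B))   # [[0, -20], [44, 12]]
```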
<p>
When implementing the naive algorithm, FLINT's <tt>fmpz_t</tt> is about 3× faster than GMP's
<tt>mpz_t</tt> for small entries
(and at worst 20% slower for medium-size entries),
thanks to an inline representation for values that fit in a single machine word.
</p>
<h3>Univariate polynomial arithmetic</h3>
<h4>Tabulating Weil polynomials</h4>
<p><a href="https://kskedlaya.org/">Kiran S. Kedlaya</a> reports:</p>
<blockquote>My code in the Sage library for tabulating Weil polynomials makes extensive use of FLINT univariate polynomials over $\mathbf{Z}$, particularly to count real roots using Sturm-Habicht sequences. A concrete application appears in my three recent papers on the relative class number one problem for function fields [<a href="https://doi.org/10.1007/s40993-022-00369-y">I</a>, <a href="https://arxiv.org/abs/2206.02084">II</a>, <a href="https://arxiv.org/abs/2208.11277">III</a>], where I end up running this code at scale to tabulate millions of Weil polynomials.</blockquote>
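<p>
The Sturm-sequence technique mentioned in the quote counts the distinct real
roots of a squarefree polynomial from sign changes along a polynomial remainder
sequence. Here is a minimal sketch over $\mathbf{Q}$; the production code works
over $\mathbf{Z}$ with Sturm-Habicht (subresultant) sequences to avoid fraction
blow-up, but the principle is the same:
</p>

```python
from fractions import Fraction

def count_real_roots(coeffs):
    """Count distinct real roots of a squarefree polynomial via a
    classical Sturm sequence.  `coeffs` lists coefficients from the
    constant term upward."""
    def trim(p):
        while p and p[-1] == 0:
            p.pop()
        return p

    def deriv(p):
        return [Fraction(i) * c for i, c in enumerate(p)][1:]

    def neg_rem(a, b):
        # Negated remainder of a divided by b (coefficient lists).
        a = a[:]
        while len(a) >= len(b):
            q = a[-1] / b[-1]
            shift = len(a) - len(b)
            for i, c in enumerate(b):
                a[i + shift] -= q * c
            trim(a)
            if not a:
                break
        return [-c for c in a]

    chain = [trim([Fraction(c) for c in coeffs])]
    chain.append(trim(deriv(chain[0])))
    while chain[-1]:
        chain.append(neg_rem(chain[-2], chain[-1]))
    chain.pop()                      # drop the final zero polynomial

    def sign_changes(at_plus_inf):
        signs = []
        for p in chain:
            s = 1 if p[-1] > 0 else -1
            if not at_plus_inf and (len(p) - 1) % 2 == 1:
                s = -s               # odd degree flips sign at -infinity
            signs.append(s)
        return sum(1 for x, y in zip(signs, signs[1:]) if x != y)

    # Sturm's theorem: roots in (a, b) = V(a) - V(b); here (-inf, +inf).
    return sign_changes(False) - sign_changes(True)

# x^3 - x = x(x-1)(x+1) has three real roots; x^2 + 1 has none.
print(count_real_roots([0, -1, 0, 1]), count_real_roots([1, 0, 1]))
```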
<h4>Congruent numbers below one trillion</h4>
<p>In 2009, FLINT was used together
with <a href="https://web.maths.unsw.edu.au/~davidharvey/code/zn_poly/">zn_poly</a>
for a large power series computation to determine the <a href="https://www.aimath.org/news/congruentnumbers/">congruent numbers up to $10^{12}$</a>,
a result which was widely reported in the press ("A Trillion Triangles").
</p>
<h4>Solving polynomial systems</h4>
<p>FLINT's integer and univariate polynomial arithmetic are used
in the <a href="https://msolve.lip6.fr/">msolve</a> library
for solving multivariate systems. For example, the authors write:
</p>
<blockquote>to tackle large degrees, we revisit asymptotically fast
algorithms for Taylor shift (...) and implement them carefully using the
FFT-based multiplication of FLINT for univariate
polynomials with integer coefficients. This is a major difference
with other implementations because of the (wrong) belief that asymptotically
fast algorithms are useless in this context. This allows us to obtain a
univariate solver which outperforms
the state of the art on examples coming from our computations</blockquote>
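<p>
A Taylor shift replaces $p(x)$ by $p(x+c)$. The classical methods cost $O(n^2)$
coefficient operations, while the asymptotically fast approach mentioned in the
quote reduces the work to polynomial multiplications, which can then use
FFT-based algorithms. A sketch of the divide-and-conquer version, with
schoolbook multiplication standing in for fast multiplication:
</p>

```python
def poly_mul(a, b):
    """Schoolbook product; FLINT would use FFT-based multiplication here,
    which is what makes the recursion below asymptotically fast."""
    res = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            res[i + j] += x * y
    return res

def taylor_shift(p, c):
    """Coefficients of p(x + c) by divide and conquer: split
    p = p0 + x^m * p1, shift both halves recursively, and glue with one
    multiplication by (x + c)^m."""
    n = len(p)
    if n <= 2:
        # degree <= 1: p(x + c) = (p0 + p1*c) + p1*x
        return p[:] if n < 2 else [p[0] + p[1] * c, p[1]]
    m = n // 2
    low = taylor_shift(p[:m], c)
    high = taylor_shift(p[m:], c)
    # (x + c)^m by binary powering
    pw, base, e = [1], [c, 1], m
    while e:
        if e & 1:
            pw = poly_mul(pw, base)
        base = poly_mul(base, base)
        e >>= 1
    res = poly_mul(pw, high)
    for i, v in enumerate(low):
        res[i] += v
    return res

# Shift p(x) = x^2 by c = 1: (x + 1)^2 = x^2 + 2x + 1.
print(taylor_shift([0, 0, 1], 1))   # [1, 2, 1]
```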
<h3>Multivariate polynomial arithmetic</h3>
<p>
FLINT includes fast, multithreaded code for multivariate polynomials
over $\mathbf{Z}$, $\mathbf{Q}$ and $\mathbf{Z}/n\mathbf{Z}$ and
other coefficient rings.
Some older (2019) benchmark results are available (<a href="https://wbhart.blogspot.com/2019/08/parallel-multivariate-arithmetic-final.html">part 1</a>,
<a href="https://wbhart.blogspot.com/2019/08/update-on-trip-timings-for-multivariate.html">part 2</a>)
demonstrating favorable performance compared to Maple, Trip and Giac
on both sparse and dense problems.
</p>
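<p>
Sparse multivariate polynomials are essentially maps from exponent vectors to
coefficients. The dict-based sketch below shows the term-by-term multiplication
idea; FLINT's <tt>mpoly</tt> types instead pack exponent vectors into machine
words and use heap-based merging, parallelized across threads:
</p>

```python
def mpoly_mul(p, q):
    """Multiply sparse multivariate polynomials stored as
    {exponent_tuple: coefficient} dicts, dropping terms that cancel."""
    res = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))  # add exponent vectors
            c = res.get(e, 0) + c1 * c2
            if c:
                res[e] = c
            else:
                res.pop(e, None)                      # exact cancellation
    return res

# (x + y) * (x - y) = x^2 - y^2; the cross terms cancel.
p = {(1, 0): 1, (0, 1): 1}
q = {(1, 0): 1, (0, 1): -1}
print(mpoly_mul(p, q))   # {(2, 0): 1, (0, 2): -1}
```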
<h4>Reduction of Feynman integrals</h4>
<p>
<a href="https://gitlab.com/feynmanintegrals/fire">FIRE</a> is a program which performs integration-by-parts reduction of
Feynman integrals.
As of version 6, FIRE allows using FLINT as a backend for multivariate polynomial arithmetic.
An example benchmark problem is the "three-loop banana diagram", about which the authors <a href="https://arxiv.org/abs/2311.02370">write</a>:
</p>
<div style="text-align:center">
<img src="banana.svg" style="width:250px; max-width:25%" />
</div>
<blockquote>
We consider the unequal mass case, with four different internal masses $m_1, m_2, m_3, m_4$. The external mass is set to a numerical value, $p^2=1$. We reduce a tensor integral whose numerator is a dot product between the loop momentum for the $m_2$ line and the loop momentum for the $m_3$ line. The execution times are shown in Table 5. Unlike the previous benchmarks, this one has an unusually large number of kinematic scales. In this case, FLINT and Symbolica backends offer dramatic speedups, up to about 9 times, over the Fermat backend.
<table style="margin-left:auto; margin-right:auto">
<colgroup>
<col span="1" style="width: 50%;">
<col span="1" style="width: 50%;">
</colgroup>
<thead>
<tr>
<th style="text-align: center;">Simplifier</th>
<th style="text-align: center;">Time (1000 s)</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center;">FLINT</td>
<td style="text-align: center;">0.62, 1.01</td>
</tr>
<tr>
<td style="text-align: center;">Symbolica</td>
<td style="text-align: center;">1.02, 1.66</td>
</tr>
<tr>
<td style="text-align: center;">Fermat</td>
<td style="text-align: center;">5.36, 9.23</td>
</tr>
</tbody>
</table>
<p style="text-align: center; font-size: smaller;">Time taken for FIRE to perform the IBP reduction for the 3-loop banana diagram with unequal internal masses, with a rank-1 numerator, for three different simplifier backends. Every result is shown as two numbers, indicating the time measured on the two different machines.</p>
</blockquote>
<h3>Handling algebraic and exact numbers</h3>
<h4>Equality of two large expressions</h4>
<p>
Computer algebra systems often struggle
to manipulate exact numbers like $\sqrt{2 + \sqrt{3}}$
when the expressions become large.
FLINT can sometimes handle such problems more easily.
In 2020 a SageMath user
<a href="https://ask.sagemath.org/question/52653/equality-of-algebraic-numbers-given-by-huge-symbolic-expressions/">reported</a>
that SageMath did not finish in six hours when asked to check
the equality $A = B$ of two algebraic numbers
which arose in a matrix computation.
The expressions for the numbers involve nested square roots and have about
7000 terms.
The current version of FLINT proves this equality in four seconds.</p>
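<p>
The exact-algebra flavor of such a verification can be seen on a toy denesting:
to prove $\sqrt{2+\sqrt{3}} = (\sqrt{6}+\sqrt{2})/2$, it suffices to square the
right-hand side exactly in the number field $\mathbf{Q}(\sqrt{2},\sqrt{3})$ and
note that both sides are positive. FLINT's <tt>qqbar</tt>/Calcium machinery
automates this kind of reasoning for far more general algebraic numbers; the
tiny class below is only an illustration:
</p>

```python
from fractions import Fraction as F

class QF:
    """Element a + b*sqrt(2) + c*sqrt(3) + d*sqrt(6) of Q(sqrt2, sqrt3),
    with exact rational coordinates.  Toy model of exact number-field
    arithmetic."""
    def __init__(self, a=0, b=0, c=0, d=0):
        self.v = (F(a), F(b), F(c), F(d))
    def __mul__(self, o):
        a, b, c, d = self.v
        e, f, g, h = o.v
        return QF(a*e + 2*b*f + 3*c*g + 6*d*h,   # rational part
                  a*f + b*e + 3*(c*h + d*g),     # coefficient of sqrt(2)
                  a*g + c*e + 2*(b*h + d*f),     # coefficient of sqrt(3)
                  a*h + d*e + b*g + c*f)         # coefficient of sqrt(6)
    def __eq__(self, o):
        return self.v == o.v

x = QF(0, F(1, 2), 0, F(1, 2))     # (sqrt(6) + sqrt(2)) / 2
print(x * x == QF(2, 0, 1, 0))     # True: exactly 2 + sqrt(3)
```

<p>
Since both sides are positive reals, squaring is reversible and the identity
$\sqrt{2+\sqrt{3}} = (\sqrt{6}+\sqrt{2})/2$ follows.
</p>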
<h4>Cutting squares into similar rectangles</h4>
<div style="text-align:center">
<img src="squarecutting.png" style="width:600px; max-width:80%" />
</div>
<p>
Ian Henderson has a <a href="https://ianhenderson.org/similar-rectangles/">nice writeup</a>
about enumerating all ways to cut a square into $n$ similar rectangles.
This problem was solved up to $n = 8$ with an exhaustive computer search
using the exact real numbers in FLINT/Calcium to represent roots of polynomials.
</p>
<h3>Numerical computation</h3>
<p>The ball arithmetic in FLINT (formerly the separate Arb library)
is used in diverse applications (physics, biology, engineering, number theory, etc.) requiring
extremely precise and reliable numerical computation.
</p>
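<p>
Ball arithmetic tracks each real number as a midpoint together with an error
radius, so every operation produces an enclosure guaranteed to contain the true
result. A toy version with exact rational midpoints (FLINT's <tt>arb</tt> type
instead uses arbitrary-precision floating-point midpoints together with
directed rounding):
</p>

```python
from fractions import Fraction as F

class Ball:
    """A real number known to lie in [mid - rad, mid + rad].  Using exact
    Fractions here sidesteps the rounding-error bookkeeping that arb does
    with directed rounding."""
    def __init__(self, mid, rad=0):
        self.mid, self.rad = F(mid), F(rad)
    def __add__(self, o):
        return Ball(self.mid + o.mid, self.rad + o.rad)
    def __sub__(self, o):
        return Ball(self.mid - o.mid, self.rad + o.rad)
    def __mul__(self, o):
        # |xy - mx*my| <= |mx|*ry + |my|*rx + rx*ry
        return Ball(self.mid * o.mid,
                    abs(self.mid) * o.rad + abs(o.mid) * self.rad
                    + self.rad * o.rad)
    def contains_zero(self):
        return abs(self.mid) <= self.rad

# An enclosure of sqrt(2) to within 1e-10; squaring it and subtracting 2
# gives a ball straddling 0 (consistent with equality), while subtracting
# 3 gives a ball bounded away from 0, which *proves* sqrt(2)^2 != 3.
s = Ball(F(14142135624, 10**10), F(1, 10**10))
print(((s * s) - Ball(2)).contains_zero())   # True
print(((s * s) - Ball(3)).contains_zero())   # False
```

<p>
This asymmetry is the key point: a ball excluding zero proves a strict
inequality outright, while a ball containing zero merely fails to refute
equality.
</p>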
<h4>Certified ODE solving and analytic continuation</h4>
<p>
The <a href="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/mkauers/ore_algebra/">ore_algebra</a> package
for SageMath includes tools developed by
<a href="https://marc.mezzarobba.net/">Marc Mezzarobba</a> for computing rigorous arbitrary-precision solutions of holonomic ODEs
and analytically continuing such solutions through the complex plane.
FLINT/Arb is used for the underlying arithmetic with complex numbers
and matrices, polynomials and power series.
Applications include:</p>
<ul>
<li>the construction of a large <a href="https://mathexp.eu/lairez/quarticdb">database of quartic surfaces</a>,</li>
<li><a href="https://arxiv.org/abs/2209.10962">simplification of differential operators for Feynman integrals</a>,</li>
<li><a href="https://arxiv.org/abs/1904.11705">computing the volumes of semi-algebraic sets</a> in $n$ dimensions (see illustration), and</li>
<li>proving uniqueness of the <a href="https://arxiv.org/abs/2011.08155">Canham model for biomembranes</a>.</li>
</ul>
<div style="text-align:center">
<img src="torus.jpg" style="width:250px" />
</div>
<h4>Accurate Gaunt factors for quadrupole radiation</h4>
<p>
Josef Pradler and Lukas Semmelrock
write in <i>The Astrophysical Journal</i> (2021) about
computing <a href="https://doi.org/10.3847/1538-4357/ac0898">accurate Gaunt factors for nonrelativistic quadrupole bremsstrahlung</a>:
<blockquote>The calculation of hypergeometric functions of large imaginary arguments is a difficult task [...].
A particular strength of [Arb] is the rigorous computation of hypergeometric functions [...]; common software frameworks such as <i>Mathematica</i> appear not able to yield results
for $|\nu_i| \gtrsim 100$ for all required function values in a reasonable amount of time</blockquote>
<blockquote>
We use Arb to calculate the hypergeometric functions, Sommerfeld factors, and all
coefficients; in short, the entire expression.
This is required, because, first, products such as $S_i S_f \times |{}_2F_1(i \nu_f, i \nu_i; 1; z)|^2$ can be
of the form $\text{huge} \times \text{tiny}$, and second, because of the occurrence
of cancellations in the final linear combination of these terms.
Because of factors $\exp(\pm \nu_{i,f})$ contained in $S_{i,f}$, it turns out that
the required precision is approximately $|\nu_{i,f}|$. In our tabulation
we therefore evaluate the ingredients with up to $10^4$ digits of
precision.
</blockquote>
<h4>The ternary Goldbach conjecture</h4>
<p>FLINT/Arb has been used for certain computations in Harald Helfgott's
<a href="https://webusers.imj-prg.fr/~harald.helfgott/anglais/book.html">solution of the ternary Goldbach problem</a>,
for instance to
establish
precise bounds for integrals involving the Riemann zeta function
in the complex plane.
The computations are described in more detail in Helfgott's book
and on his website.</p>
<div style="text-align:center">
<img src="zplot2.svg" style="width:60%" /><br/>
<i>Figure: a short subsegment of one of Helfgott's integrals: $\int_{-\tfrac{1}{4}+8i}^{-\tfrac{1}{4}+40000i} \left|\frac{F_{19}(s+\tfrac{1}{2}) F_{19}(s+1)}{s^2}\right| |ds|$, where $F_N(s) = \zeta(s) \prod_{p\leq N} (1 - p^{-s})$.</i>
</div>
<h4>The Riemann hypothesis</h4>
<p>
The highest verification of the Riemann hypothesis to date
was done
<a href="https://arxiv.org/abs/2004.09765">in 2020 by Dave Platt and Tim Trudgian</a>,
reaching height $3 \cdot 10^{12}$.
Parts of their computations used FLINT/Arb ball arithmetic
<blockquote>[...] in place of full interval arithmetic whence there is a space saving of roughly 50%, which
make applications more cache friendly.</blockquote>
</p>
<h4>The de Bruijn-Newman constant</h4>
<p>
FLINT/Arb was used in a large distributed computation
as part of the "polymath" project to <a href="https://terrytao.wordpress.com/2018/01/24/polymath-proposal-upper-bounding-the-de-bruijn-newman-constant/">bound the de Bruijn-Newman constant</a>.
A contributor to that effort wrote:</p>
<blockquote>
Very impressed by the robustness of your software as well as by the tremendous speed gains that it has brought us (up till a factor 20-30 faster than Pari/GP). So far we haven't encountered a single bug in the software. Well done!
</blockquote>
<h4>Traveling wave solutions of PDEs</h4>
<div style="text-align:center">
<img src="BH-u.svg" style="width:300px" /><br/>
</div>
<p>
Joel Dahne and Javier Gómez-Serrano
have <a href="https://arxiv.org/abs/2205.00802">proved the existence</a> of a periodic highest, cusped, traveling wave solution for the
Burgers-Hilbert equation,
and more recently also for the fractional KdV equation.
The interval computations in the proofs took 10,000 CPU hours to run on a 384-core cluster,
employing power series arithmetic and special functions
provided by FLINT/Arb. The authors write:
</p>
<blockquote>These bounds are highly non-trivial, in
particular $\delta_{0}$ is given by the supremum of a function on
the interval $[0, \pi]$ which attains its maximum around
$10^{-5000}$. [...] Note that
these numbers are extremely small, too small to be represented even
in standard quadruple precision <tt>binary128</tt>, though Arb has
no problem handling them.
</blockquote>