Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make Printf's general format C11 compliant #40689

Merged
merged 1 commit into from
May 11, 2021

Conversation

bicycle1885
Copy link
Member

@bicycle1885 bicycle1885 commented May 3, 2021

In general, rounding a floating-point number before formatting is considered harmful, because it may introduce some rounding error. As an extreme example, the largest normal number 1.7976931348623157e308 cannot be correctly formatted with precision 1 if we round it before sending it to Ryu, because the rounded result would be larger than the largest representable number, as shown below:

julia> using Printf

julia> @sprintf "%.1g" floatmax()  # expected: "2e+308"
"0"

N1570 (the final draft of C11, §7.21.6.1, page 313) defines the general format (%g and %G) as follows:
image

This PR makes the general format follow this standard. It may make the format slower because the current algorithm formats the same number twice in order to get the exponent part. I couldn't come up with a better way.

Also, this fixes #39748.

EDIT: This depends on #40685 to pass tests.

@quinnj
Copy link
Member

quinnj commented May 3, 2021

Hmmmm, would you mind posting some benchmarks? It seems unfortunate that we'd have to call Ryu.writexp twice in certain cases, and Ryu.writeexp then Ryu.writefixed in others. Admittedly I've been slowing in responding to the original issue here, but I was imagining doing the work inside Ryu.writeshortest itself, where there are already some rounding routines and such.

@bicycle1885
Copy link
Member Author

I quickly benchmarked the performance with the following script:

using Random
Random.seed!(1234)
N = 500_000
x = [randn(N); exp.(randn(N) * 100)]
shuffle!(x)

using Printf
function g6(out, x)
    for xx in x
        @printf out "%.6g" xx
    end
end

using BenchmarkTools
@btime g6($devnull, $x)

PR:

~/w/julia (printf-general|…) $ ./julia g.jl
  223.687 ms (2000000 allocations: 411.99 MiB)
~/w/julia (printf-general|…) $ ./julia g.jl
  227.702 ms (2000000 allocations: 411.99 MiB)
~/w/julia (printf-general|…) $ ./julia g.jl
  226.366 ms (2000000 allocations: 411.99 MiB)

master (8fdd137):

~/w/julia (master|…) $ ./julia g.jl
  216.545 ms (2000000 allocations: 411.99 MiB)
~/w/julia (master|…) $ ./julia g.jl
  215.115 ms (2000000 allocations: 411.99 MiB)
~/w/julia (master|…) $ ./julia g.jl
  214.878 ms (2000000 allocations: 411.99 MiB)

Surely it slowed the throughput down, but the difference is not devastating, which was much smaller than I expected. I agree that this should be fixed inside Ryu, but if it takes time I think this slowdown is tolerable for the time being.

@bicycle1885
Copy link
Member Author

Even if fixed-point and scientific cases are segregated, the result seems to be the same:

using Random
Random.seed!(1234)
N = 500_000
x = randn(N)
y = exp.(randn(N) * 100)
z = shuffle!([x; y])

using Printf
function g6(out, x)
    for xx in x
        @printf out "%.6g" xx
    end
end

using BenchmarkTools
@btime g6($devnull, $x)
@btime g6($devnull, $y)
@btime g6($devnull, $z)

PR:

~/w/julia (printf-general|…) $ ./julia g.jl
  100.871 ms (1000000 allocations: 205.99 MiB)
  123.202 ms (1000000 allocations: 205.99 MiB)
  229.081 ms (2000000 allocations: 411.99 MiB)
~/w/julia (printf-general|…) $ ./julia g.jl
  98.763 ms (1000000 allocations: 205.99 MiB)
  120.594 ms (1000000 allocations: 205.99 MiB)
  225.540 ms (2000000 allocations: 411.99 MiB)

master:

~/w/julia (master|…) $ ./julia g.jl
  108.733 ms (1000000 allocations: 205.99 MiB)
  105.132 ms (1000000 allocations: 205.99 MiB)
  218.649 ms (2000000 allocations: 411.99 MiB)
~/w/julia (master|…) $ ./julia g.jl
  106.367 ms (1000000 allocations: 205.99 MiB)
  104.395 ms (1000000 allocations: 205.99 MiB)
  215.350 ms (2000000 allocations: 411.99 MiB)

@bicycle1885
Copy link
Member Author

I'm a bit skeptical of this result. Maybe I'm doing something wrong. Could anyone reproduce?

@bicycle1885
Copy link
Member Author

#40755 should be merged before this change.

@quinnj
Copy link
Member

quinnj commented May 11, 2021

Here are the results I get.

master:

julia> @btime g6($devnull, $x)
  163.121 ms (2000000 allocations: 411.99 MiB)

  julia> @btime g6($devnull, $x)
  80.925 ms (1000000 allocations: 205.99 MiB)

julia> @btime g6($devnull, $y)
  78.093 ms (1000000 allocations: 205.99 MiB)

julia> @btime g6($devnull, $z)
  165.143 ms (2000000 allocations: 411.99 MiB)

PR:

 julia> @btime g6($devnull, $x)
  194.712 ms (2000000 allocations: 411.99 MiB)

julia> @btime g6($devnull, $x)
  81.821 ms (1000000 allocations: 205.99 MiB)

julia> @btime g6($devnull, $y)
  107.219 ms (1000000 allocations: 205.99 MiB)

julia> @btime g6($devnull, $z)
  196.549 ms (2000000 allocations: 411.99 MiB)

Copy link
Member

@quinnj quinnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I'm fine merging because it at least fixes the rounding cases w/o too bad performance hit. The solution just feels bad though, because we're using writeexp just to get the exponent, then twiddling w/ that to figure out if we should do writefixed or writeexp. And we're not even using writeshortest any more, which has a bunch of the same logic builtin.

But it's fine; it's tricky code to dig into, and like I said, the perf here isn't bad at all, so we should merge it for now.

@quinnj quinnj merged commit 35d7571 into JuliaLang:master May 11, 2021
@bicycle1885 bicycle1885 deleted the printf-general branch May 11, 2021 03:57
@bicycle1885
Copy link
Member Author

Thank you!

@KristofferC
Copy link
Sponsor Member

Would this be good to backport? In that case, please add the backport label.

@bicycle1885 bicycle1885 added the backport 1.6 Change should be backported to release-1.6 label May 11, 2021
@bicycle1885
Copy link
Member Author

Yes, I think this should be backported. %g of the latest release is unreliable and I'm not sure when it fails. If this is going to be backported, #40685 and #40755 should be included as well (so I've added backport labels to them).

@KristofferC KristofferC mentioned this pull request May 11, 2021
45 tasks
@KristofferC KristofferC removed the backport 1.6 Change should be backported to release-1.6 label Jul 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

printf output - change from 1.5.3 to 1.6.0-rc1
3 participants