Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

re-enable BigInt hexadecimal literals #23546

Merged
merged 6 commits into from
Oct 11, 2020
Merged

re-enable BigInt hexadecimal literals #23546

merged 6 commits into from
Oct 11, 2020

Conversation

rfourquet
Copy link
Member

@rfourquet rfourquet commented Sep 1, 2017

As this reverts a change which apparently was uncontroversial (but not supported by solid arguments IMO, see below), I apologize in advance to re-open the discussion, and to try to have this decision reconsidered.

This tries to fix what appears to me as a useless and frustrating limitation: the impossibility to input arbitrary-precision numbers as hexadecimal literals. My first attempt (which proves my willingness to try to respect the earlier decision) at fixing this was to create a BigUInt <: Unsigned type to hold arbitrary-precision unsigned numbers (a thin wrapper over BigInt). After a couple hours of coding, it became clear that, even if the code complexity overhead was not big, no-one would want to maintain this new type, at least not in Base (BigInt alone is complex enough). But after all, BigInt is enough for most purposes where one needs a big Unsigned (e.g. as bit fields), which is helped by the fact that the BigInt uses "two-complement" semantics. Except that they can encode negative values. Having specific printing routines to represent them in hexadecimal would help using them as Unsigned: e.g.

julia> bighex(big(10)) # ...0 represents infinitely many 0 on the left
"0x...0a"

julia> bigbin(~big(1))
"0b...10"

For this last number, "...1110" is the mathematical representation of -2 using 2-adic numbers, i.e. there is some soundness to it.

So I claim that representing and inputing BigInt with hexadecimal/octal/binary literals has nothing wrong in itself. IMHO, enabling the literal 0x100000000000000000000000000000000 to parse requires only to document that hex-literals don't necessarily produce Unsigned numbers, so I updated the docs here.

The arguments for disabling parsing BigInt hex literals that I can recollect (from #11105 and #11130):

  1. it could possibly lead to hard to find bugs

I'm not conviced honestly; sure, more complexity brings more possibilities for bugs; but Julia didn't use this argument to have only one integer type in Base...

  1. "It would be great to remove the BigInt dependency from the frontend" / "bigints from literals make it harder to deploy a Julia runtime without needing gmp."

Well, as long as we parse decimal BigInt literals, this will remain the case.

  1. having hex BigInt literals is pretty strange

it may appear so, until one needs them.

So my argument is that it is useful to have hex-literals for BigInt, and that it's surprising only if one believes that hex-literals must produce Unsigned numbers.

@JeffBezanson
Copy link
Sponsor Member

I agree. It's not a law of the universe that hexadecimal must give Unsigned numbers, and I see nothing wrong with writing big integers in hex.

@rfourquet
Copy link
Member Author

I re-added the tests and updated the docs which where modified in #12760.

@@ -235,16 +235,18 @@ For users coming to Julia from R, these are some noteworthy differences:
a larger size type, such as `Int64` (if `Int` is `Int32`), `Int128`, or the arbitrarily large
`BigInt` type. There are no numeric literal suffixes, such as `L`, `LL`, `U`, `UL`, `ULL` to indicate
unsigned and/or signed vs. unsigned. Decimal literals are always signed, and hexadecimal literals
(which start with `0x` like C/C++), are unsigned. Hexadecimal literals also, unlike C/C++/Java
(which start with `0x` like C/C++), are unsigned, unless when they encode more than 128 bits,
in which case they are of type ``BigInt``. Hexadecimal literals also, unlike C/C++/Java
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

``BigInt`` -> `BigInt`

@rfourquet rfourquet added the domain:bignums BigInt and BigFloat label Sep 2, 2017
@rfourquet
Copy link
Member Author

Ok to merge?

@rfourquet
Copy link
Member Author

Jeff was supporting the idea, but without explicit approval, I'm not sure to dare to merge it mysef. So gentle bump!

@StefanKarpinski
Copy link
Sponsor Member

This can be re-enabled at any point since it's an error now. I think the weird thing about these is that they use an input syntax that is supposed to produce unsigned integers but produces a signed integer type. Let's let this wait – we can always have the debate later and enable this in 1.x.

@StefanKarpinski
Copy link
Sponsor Member

Oh, I see the connection to the MT printing, just read that.

@rfourquet
Copy link
Member Author

I'm very fine to wait, it was just convenient to flesh out a nicer print for MersenneTwister. As I said in the OP, I tried hard to avoid that by starting coding UBigInt, but it's a dead-end, as BigInt fulfils most of the needs that would be accomplished by a UBigInt, except the parsing doesn't work for now.

@StefanKarpinski
Copy link
Sponsor Member

Given that this provides a motivation for the hex syntax for BigInts, we could just do that.

@rfourquet rfourquet force-pushed the rf/0x-BigInt branch 3 times, most recently from aa8655d to 713c714 Compare October 4, 2020 13:18
@rfourquet
Copy link
Member Author

Besides #25129 (the seed of a MersenneTwister, which is a Vector{UInt32}, is printed represented as an integer in hexadecimal format), I got a couple more use cases, involving the BitIntegers package

  1. While manipulating cryptographic hashes stored e.g. as UInt256, it's nice to be able to input these hashes directly in the REPL and have that be interpreted as an BigInt
  2. building on 1), the SafeREPL package allows to reinterpret BigInt literals typed at the REPL as UInt256 for example; but this needs to not be erroring on an hexadecimal number which doesn't fit an UInt128.

Given that support was expressed, I will merge soon barring objections.

@rfourquet
Copy link
Member Author

Yet another argument for this change: Julia itself uses long hexadecimal digits to represent primitive types:

julia> primitive type UInt200 200 end

julia> reinterpret(UU, rand(UInt8, 25))[]
UU(0x616395f3cd92da91f682131e0a681693b380f61127cb303bc7)

julia> 0x616395f3cd92da91f682131e0a681693b380f61127cb303bc7
ERROR: syntax: Hex or binary literal too large for UInt128
[...]

That would be cool to be able to manipulate 0x616395f3cd92da91f682131e0a681693b380f61127cb303bc7 at the REPL, and have it be interpreted as a BigInt.

@rfourquet rfourquet merged commit d83d81e into master Oct 11, 2020
@rfourquet rfourquet deleted the rf/0x-BigInt branch October 11, 2020 12:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:bignums BigInt and BigFloat
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants