jv: Add some support for 64 bit ints in a very conservative way #1246

Open: wants to merge 1 commit into base: master

@dequis commented Oct 3, 2016

This adds an extra int64_t field to jv, which is only written when
parsing number literals (and only if they are integers within range),
and only read when printing.

Any operations done over those numbers will downgrade them to a double
with 53 bit precision. This is intentional for the scope of this patch:

    $ echo 111111111111111111 | jq -c '[., 1*.]'
    [111111111111111111,111111111111111100]

For ints between 53 and 64 bits, this matches the behavior of awk,
as suggested in the bug tracker by tischwa:

    $ echo 111111111111111111 | awk '{print $1, 1*$1}'
    111111111111111111 111111111111111104
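
The downgrade described above can be reproduced in plain C. The sketch below is only an illustration (the helper name `through_double` is made up here and is not part of the patch): it converts an int64_t to a double and back, which is what any arithmetic on the number amounts to. Doubles near 1.1e17 are spaced 16 apart, so the nearest representable value to 111111111111111111 is 111111111111111104, the value awk prints. Rounding that same value to 17 significant digits gives 111111111111111100, which is consistent with the jq output, assuming jq prints doubles with 17 significant digits.

```c
#include <stdint.h>

/* Hypothetical helper: round-trip an int64_t through a double.
 * Assumes n is not so close to INT64_MAX that the double rounds up to
 * 2^63, since converting such a double back to int64_t is undefined. */
static int64_t through_double(int64_t n) {
    double d = (double)n;   /* rounds to the nearest representable double */
    return (int64_t)d;      /* exact: d already holds an integer value */
}
```

Integers up to 2^53 survive the round trip unchanged; above that, only some of them do.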

Fixes issue #369 (at least partially)

I tried to add tests, but the expected values in jq.test seem to be parsed by jq itself, so the checks end up comparing the double values, not the int64 ones.

This is an itch I've needed to scratch for a long time. Using jq as a pretty printer for JSON, I had almost gotten used to big numbers getting mangled. Not anymore!

@coveralls

Coverage increased (+0.4%) to 85.776% when pulling cf11ac0 on dequis:64bit into 0b82185 on stedolan:master.

    double number;
    struct {
        double dbl;
        int64_t int64;

Contributor

This is not desirable, unfortunately, as it increases sizeof(jv).

What I would recommend instead is a new jv kind flag that represents a signed 64-bit integer. Then add jv_int64() (creates a jv from an int64_t), jv_int64_value() (returns an int64_t given a numeric jv). For backwards compatibility I'd have jv_get_kind() return JV_KIND_NUMBER, and then add a jv_get_number_kind() that reports the numeric kind. jv_number_value() and jv_int64_value() should both accept numbers of any kind, and do the appropriate conversion.

This would also allow us to add uint64_t.
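
A minimal sketch of that suggestion, under loud assumptions: the type and function names below (`num`, `num_from_int64`, `num_int64_value`, and so on) are hypothetical stand-ins, and this is not jq's actual jv layout. The point it illustrates is that a kind flag plus a union lets the integer share storage with the double, so the payload does not grow, and that both accessors can accept either kind and convert.

```c
#include <stdint.h>

/* Hypothetical names throughout; not jq's real jv struct or API. */
typedef enum { NUM_DOUBLE, NUM_INT64 } num_kind;

typedef struct {
    unsigned char kind;   /* a num_kind, kept in one byte like a flag */
    union {
        double  dbl;
        int64_t i64;
    } u;                  /* both members occupy the same 8 bytes */
} num;

static num num_from_double(double d) {
    num n; n.kind = NUM_DOUBLE; n.u.dbl = d; return n;
}

static num num_from_int64(int64_t i) {
    num n; n.kind = NUM_INT64; n.u.i64 = i; return n;
}

static num_kind num_get_kind(num n) { return (num_kind)n.kind; }

/* Accepts either kind; integers are converted (possibly losing precision). */
static double num_double_value(num n) {
    return n.kind == NUM_INT64 ? (double)n.u.i64 : n.u.dbl;
}

/* Accepts either kind; doubles are truncated toward zero and clamped,
 * because casting an out-of-range double to int64_t is undefined. */
static int64_t num_int64_value(num n) {
    if (n.kind == NUM_INT64) return n.u.i64;
    if (n.u.dbl != n.u.dbl) return 0;                        /* NaN */
    if (n.u.dbl >= 9223372036854775808.0) return INT64_MAX;  /* >= 2^63 */
    if (n.u.dbl <= -9223372036854775808.0) return INT64_MIN; /* <= -2^63 */
    return (int64_t)n.u.dbl;
}
```

Clamping and mapping NaN to 0 are choices made for this sketch only; the comment above does not say how out-of-range conversions should behave.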

All arithmetic and math functions would continue to operate on doubles, though perhaps the arithmetic operations could eventually be made to operate on integers (falling back on doubles in the event of overflow/underflow).
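
As a rough illustration of what "falling back on doubles in the event of overflow" might look like for addition, here is a sketch. The `add_result` type and `add_int64_or_double` function are hypothetical and not part of any proposed jq API. The overflow test is done before the addition because signed overflow is undefined behavior in C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: an exact int64_t sum or a double fallback. */
typedef struct {
    bool    is_int;
    int64_t i;   /* valid when is_int is true  */
    double  d;   /* valid when is_int is false */
} add_result;

static add_result add_int64_or_double(int64_t a, int64_t b) {
    add_result r = { false, 0, 0.0 };
    /* Detect overflow without performing the overflowing addition. */
    bool overflow = (b > 0 && a > INT64_MAX - b) ||
                    (b < 0 && a < INT64_MIN - b);
    if (overflow) {
        r.d = (double)a + (double)b;   /* approximate, but well defined */
    } else {
        r.is_int = true;
        r.i = a + b;                   /* exact */
    }
    return r;
}
```

With this shape, `111111111111111111 + 1` stays exact, while `INT64_MAX + 1` silently becomes the double 2^63.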

@nicowilliams
Contributor

Hi. Thanks for this contribution. I left a comment on jv.h. Let us know what you think.

@dequis
Author

dequis commented Jan 23, 2017

omg you're alive

@nicowilliams
Contributor

heheh, yeah, I'm alive. Sorry for the absence :( I've been heads down in other things.

@nicowilliams
Contributor

I'm working on an alternative version of this where a numeric jv can only be a double, an int64_t, or a uint64_t. The parser will use strtoumax() or similar when a numeric value has no exponent and no decimal point, and will produce a 64-bit integer, signed or unsigned as necessary, if the integer is small enough to fit in 64 bits. The dumper will, of course, trivially dump 64-bit ints. There will be new jv functions to go with all of this.
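
The parsing rule described above could be sketched roughly as follows. This is not the code from the alternative patch: `literal`, `lit_kind`, and `parse_literal` are names invented for the example, and the input is assumed to be a number token that has already been validated as JSON. Only `strtoimax()`, `strtoumax()`, `strtod()`, and `strpbrk()` are real library calls.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for this sketch only. */
typedef enum { LIT_DOUBLE, LIT_INT64, LIT_UINT64 } lit_kind;

typedef struct {
    lit_kind kind;
    union {
        double   dbl;
        int64_t  i64;
        uint64_t u64;
    } u;
} literal;

/* Assumes s is an already validated JSON number token. */
static literal parse_literal(const char *s) {
    literal out;
    char *end;

    /* No decimal point and no exponent: try to keep it as an integer. */
    if (strpbrk(s, ".eE") == NULL) {
        errno = 0;
        if (s[0] == '-') {
            /* Note: "-0" becomes integer 0 here, dropping the sign. */
            intmax_t v = strtoimax(s, &end, 10);
            if (errno == 0 && end != s && *end == '\0' &&
                v >= INT64_MIN && v <= INT64_MAX) {
                out.kind = LIT_INT64;
                out.u.i64 = (int64_t)v;
                return out;
            }
        } else {
            uintmax_t v = strtoumax(s, &end, 10);
            if (errno == 0 && end != s && *end == '\0' && v <= UINT64_MAX) {
                if (v <= (uintmax_t)INT64_MAX) {
                    out.kind = LIT_INT64;    /* prefer signed when it fits */
                    out.u.i64 = (int64_t)v;
                } else {
                    out.kind = LIT_UINT64;
                    out.u.u64 = (uint64_t)v;
                }
                return out;
            }
        }
    }

    /* Fractions, exponents, and integers too large for 64 bits. */
    out.kind = LIT_DOUBLE;
    out.u.dbl = strtod(s, NULL);
    return out;
}
```

The extra `strpbrk()` scan and the range checks give a feel for where the feared slowdown could come from, though whether it is measurable would need benchmarking.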

My main fear is that this will slow things down. I may add macros for disabling this feature.

@dequis
Author

dequis commented Jan 28, 2017

Why is it not acceptable to increase sizeof(jv) anyway? Is there some sort of API/ABI guarantee for other projects?

@nicowilliams
Contributor

There's no ABI constraint on the size of a jv. But jvs are rather compact -- as compact as they can be, really, and this is useful because we pass them on the C and jq stacks a lot. There are some issues with jvs though, mostly the size of the size and offset fields, which are way too small and have bad overflow semantics, but even for fixing those I'd be reluctant to change the size of a jv.

@nicowilliams
Contributor

Another thing is that having multiple number representations at once in any one jv seems like asking for trouble? Better just have one.

@fadado

fadado commented Jan 28, 2017

> Another thing is that having multiple number representations at once in any one jv seems like asking for trouble? Better just have one.

I like bitwise operations on integers a lot, but I also liked XSLT in the past, and XSLT has only one kind of "number". Could jq do the same?

Speaking of XSLT, how about implementing translate?

JJOR

@nicowilliams
Contributor

BTW, your submission is awesome. I'm running with it and modifying it to make the int value a part of the u union in the jv, but please don't feel bad about that.

@nicowilliams
Contributor

@fadado jq will only have one kind of number. The idea here is that when the input was an integer that fits in an int64_t, that should be the internal representation, and when printed it should be printed the way a printf function would do it, which is to say with no exponent and no decimal point. (Hmm, though we need to be careful not to have any internationalization, e.g., thousands separators!)
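
For illustration, printing an int64_t "the way a printf function would" might look like the sketch below. `format_int64` is a hypothetical helper, not a jq function. As far as I know, a plain `%d`-style conversion does not apply locale grouping unless the POSIX `'` flag is given, so this format string should not produce thousands separators.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: write n into buf as a bare decimal integer.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the buffer is too small or formatting fails. */
static int format_int64(char *buf, size_t len, int64_t n) {
    int written = snprintf(buf, len, "%" PRId64, n);
    if (written < 0 || (size_t)written >= len)
        return -1;
    return written;
}
```

A 21-byte buffer is enough for any int64_t, since the longest value, INT64_MIN, takes 20 characters plus the terminator.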

@nicowilliams
Contributor

@dequis Check out #1327. What do you think?
