Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Numeric literal suffixes #10493

Merged
merged 17 commits into from
Dec 13, 2021
Merged

Conversation

Aidan63
Copy link
Contributor

@Aidan63 Aidan63 commented Nov 19, 2021

Closes #10481

Opening up a draft with my progress so far to make sure I'm not going off in the wrong direction.

Ast constant int and float expressions now contain an optional suffix string. During typing the suffix is inspected and in the case of i32 or f64 the constant is passed through as is and with i64 or u32 an appropriate expression is inserted. This does mean that suffixes don't make it into the typed ast which I'm not sure is ideal but does simplify things.

In many cases I've discarded the suffix or just inserted None to get it to compile, while things seem to work pretty well as is in my basic testing I'll have to go through and understand whats going on in many of those places and make sure its right. I've made a note of things which I know about / need to do below.

@RealyUniqueName
Copy link
Member

RealyUniqueName commented Nov 19, 2021

I think this should be implemented by adding new constructors to Ast.constant and Type.tconstant.
Maybe one constructor per suffix or | Number of value * suffix. Or maybe even replace | Float and | Int in favor of | Number

@Simn
Copy link
Member

Simn commented Nov 19, 2021

The Ast.constant part is fine like this. I see no reason to add anything else because it is sufficient to describe the syntax and is not a very intrusive change.

I'm not sure about the Type.tconstant situation, but I think that's somewhat out-of-scope for this PR.

@Aidan63
Copy link
Contributor Author

Aidan63 commented Nov 19, 2021

I originally was thinking of having the suffix in tconstant and using some sort of filter to generate the Int64 make and UInt casts but decided against it as you'd just end up with None suffixes everywhere after the filter which doesn't seem very useful.

Managed to work around the i64 reification issue by manually constructing the EConst CInt enum instead of using makeExpr. Not sure why the i64 suffix triggers the value_to_expr exception, printing out the value reports it as a VInt32 the same as all the other suffixes.

@Aidan63
Copy link
Contributor Author

Aidan63 commented Nov 20, 2021

I've added support for the integer suffixes on hex literals, also if you append i64 to a hex literal it now allows up to 16 hex characters as opposed to 8 for other suffixes or un-suffixed hex literals (closes #5150).

As for parseInt / parseFloat I don't think it makes sense to support suffixes in the input strings as you can't change the output type of parseInt if you input 123i64. Had a quick check against other languages like C# and java are they don't seem to support the suffixes in their parse functions either.

@Aidan63 Aidan63 marked this pull request as ready for review November 20, 2021 23:17
@@ -568,11 +568,11 @@ let rec constructor_side_effects e =

let type_constant basic c p =
match c with
| Int s ->
| Int (s,_) ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this function should ignore the suffixes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, wasn't sure what to do here since currently the suffixes are handled just before (mainly for the simplicity of creating the int64 make and uint cast with the untyped ast).

Would it be better to try and manually create the texprs here instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this situation makes me think that we should consider dealing with the tconstant part of this as well, but that would add some complexity to everything...

Generally, I haven't had great experiences with syntactic rewrites like this. However, I acknowledge that it's very convenient here.

Ultimately, we're going to need some sort of mk_int64 : int64 -> texpr function that doesn't require the full typer context. Which is already a problem because Int64.make is a pretty high-level function that relies on inline. I'll have to think about how to approach this.

@Simn
Copy link
Member

Simn commented Nov 23, 2021

I propose that we merge this now and then look into our internal handling of literals as a second step. The changes here shouldn't break anything, at worst there's some part that doesn't work correctly yet.

Could you resolve the conflict?

@@ -111,6 +111,25 @@ let is_valid_identifier s =
with Exit ->
false

let split_int_suffix s =
let is_signed = String.contains s 'i' in
match String.index_opt s (if is_signed then 'i' else 'u') with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't try String.index show better performance? I think numeric constants are quite numerous in hx sources.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unlikely to ever be measurable, and it's not clear if the alternative is actually better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess the match could be changed to something like match (try String.index s 'i' with _ -> (try String.index s 'u' with _ -> -1)) with ... which is less readable at a glance for me, not sure if any performance difference would really be noticable to make it worthwhile

@@ -345,8 +345,8 @@ let rec load_instance' ctx (t,p) allow_no_params =
| TPExpr e ->
let name = (match fst e with
| EConst (String(s,_)) -> "S" ^ s
| EConst (Int i) -> "I" ^ i
| EConst (Float f) -> "F" ^ f
| EConst (Int (i,_)) -> "I" ^ i
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may lead to misses for types like MyType<123i32> and MyType<123i64>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, we can just append the suffix here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this makes me realize that we might have to sanitize the suffixes somewhere if they aren't typed as expressions. I think we could sneak in invalid strings via macros otherwise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, yes, I think I see what you mean. macro : MyType<123i40> allows the i40 suffix into the type parameter (I'm currently ignoring the suffix in that part of the reify since I wasn't sure when it was called until now).

I'll look into not ignoring the suffixes in this case and adding some sort of check, not sure if there are any other places you could add invalid suffixes as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a check in macro api decoding to ensure the suffixes of any created numeric constants are valid. Not sure if there's a better place to put these checks so its not done both in the typer and at macro decoding.

@Aidan63
Copy link
Contributor Author

Aidan63 commented Nov 23, 2021

Noticed that with the i32 suffix literals could still be changed to floats if they were too large for 32 bits. Changed it to error instead since changing to a float doesn't make much sense if you're explicitly using the i32 suffix.

@nanjizal
Copy link
Contributor

  • D allowIntOverflow
    is likely to be a request compiler feature if it errors.

@skial skial mentioned this pull request Nov 24, 2021
1 task
@RealyUniqueName RealyUniqueName added this to the Release 4.3 milestone Nov 29, 2021
@Simn Simn merged commit 07678d7 into HaxeFoundation:development Dec 13, 2021
@Aidan63 Aidan63 deleted the integer-suffixes branch December 15, 2021 19:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement numeric literal suffixes
4 participants