Minor fixes for dump printing #2342

rvagg · 2023-12-01T13:24:30Z

- uint64s should use an unsigned print so as not to cast at the edges
- FloatParser<T>::ParseHex should be able to handle nan, nan:* and ±inf too
- remove extraneous . in the float hex format; 0x1.p+0 is not correct
- added some tests of known-good actual values to augment the round-tripping tests; these match C++ output except for the signalling NaN extra data. They also match Go, except for NaN and Inf casing and an additional zero pad that Go adds on the significand for some odd reason.

sbc100 · 2023-12-01T17:45:03Z

test/dump/const.txt

 00004b: 80 7f | 
 00004d: 1a | drop
- 00004e: 42 7f | i64.const -1
+ 00004e: 42 7f | i64.const 18446744073709551615


Are you sure it makes more sense to show these as unsigned?

The numbers are read as signed values in the binary reader, so maybe it makes sense to show them as signed in the disassembly too?

wabt/src/binary-reader.cc

Line 827 in 2428315

CHECK_RESULT(ReadS64Leb128(&value, "i64.const value"));
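To illustrate why the bytes `42 7f` read as -1 when decoded as a signed value, here is a minimal sketch of signed LEB128 decoding. `DecodeS64Leb128` is a hypothetical helper written for this example, not wabt's `ReadS64Leb128`, and it assumes well-formed input of at most 10 bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not wabt's ReadS64Leb128): decode one signed LEB128
// value from the start of `bytes`. Assumes well-formed input (<= 10 bytes).
int64_t DecodeS64Leb128(const std::vector<uint8_t>& bytes) {
  uint64_t result = 0;
  unsigned shift = 0;
  uint8_t byte = 0;
  for (size_t i = 0; i < bytes.size(); ++i) {
    byte = bytes[i];
    // Each byte contributes its low 7 bits, least significant group first.
    if (shift < 64) result |= static_cast<uint64_t>(byte & 0x7f) << shift;
    shift += 7;
    if ((byte & 0x80) == 0) break;  // high bit clear: this was the last byte
  }
  // Sign-extend from bit 6 of the final byte if the value is negative.
  if (shift < 64 && (byte & 0x40)) {
    result |= ~uint64_t{0} << shift;
  }
  return static_cast<int64_t>(result);
}
```

The single payload byte `7f` sign-extends to all ones, so the same 64 bits are -1 when viewed as signed and 18446744073709551615 when viewed as unsigned.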

sbc100

Is there some kind of bug in the literal parser that you are fixing here? If so, I wonder why it was not triggered by any of the spec tests? Are we missing a spec test maybe?

rvagg · 2023-12-02T01:14:16Z

This comes from my unorthodox usage of wabt: I'm doing some binary parsing and using the dump output as part of my test fixturing, to make sure I can produce the same output for the same binary. In reimplementing some of the printing, I'm seeing some of the rough edges here.

- The 0x1.p... hex printing: I believe this is a simple flaw, as I don't see it in anyone else's interpretation of hex float representations. The current tests for hex printing simply involve round-tripping floats through the printer and back through the parser, and the extraneous . doesn't trip anything up (reasonable for a non-strict parser). It only becomes a problem when you're testing for specific output. Hence my inclusion of some tests that go from bytes to specific text output for the hex printing, just to lock it in for the main cases and edges.
- The nan and infinity inclusion in ParseFloat isn't being used here because there's no reason to parse hex floats in here. I included it for completeness of the implementation, and to get my spec tests working cleanly.
- The i64.const printing: I get the argument that they could be signed. It's probably not a strong argument either way, since they are technically uninterpreted; maybe the strongest argument is that they can be signed coming in from the text format?

Either way, it's the const printing inconsistency that's the problem because i32.const are printed as unsigned already, while i64.const are signed:

wabt/src/binary-reader-objdump.cc

Line 787 in 2428315

LogOpcode("%u", value);

i.e. current test fixtures for the dump even show this discrepancy - see input at the top vs output at the bottom:

$ git grep -E 'i(32|64)\.const ' -- test/dump/const.txt
test/dump/const.txt:    i32.const 0
test/dump/const.txt:    i32.const -2147483648
test/dump/const.txt:    i32.const 4294967295
test/dump/const.txt:    i32.const -0x80000000
test/dump/const.txt:    i32.const 0xffffffff
test/dump/const.txt:    i64.const 0
test/dump/const.txt:    i64.const -9223372036854775808
test/dump/const.txt:    i64.const 18446744073709551615
test/dump/const.txt:    i64.const -0x8000000000000000
test/dump/const.txt:    i64.const 0xffffffffffffffff
test/dump/const.txt: 000019: 41 00                      | i32.const 0
test/dump/const.txt: 00001c: 41 80 80 80 80 78          | i32.const 2147483648
test/dump/const.txt: 000023: 41 7f                      | i32.const 4294967295
test/dump/const.txt: 000026: 41 80 80 80 80 78          | i32.const 2147483648
test/dump/const.txt: 00002d: 41 7f                      | i32.const 4294967295
test/dump/const.txt: 000030: 42 00                      | i64.const 0
test/dump/const.txt: 000033: 42 80 80 80 80 80 80 80 80 | i64.const -9223372036854775808
test/dump/const.txt: 00003f: 42 7f                      | i64.const -1
test/dump/const.txt: 000042: 42 80 80 80 80 80 80 80 80 | i64.const -9223372036854775808
test/dump/const.txt: 00004e: 42 7f                      | i64.const -1
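The discrepancy in the grep output comes down to which format specifier the same bit pattern is printed with. The sketch below shows this for 64-bit values; `FormatUnsigned64` and `FormatSigned64` are hypothetical names for this example, and `PRIu64`/`PRId64` are used as a portable stand-in for a `%llu`-style format.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helpers: render one 64-bit pattern under both interpretations.
std::string FormatUnsigned64(uint64_t bits) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%" PRIu64, bits);
  return buf;
}

std::string FormatSigned64(uint64_t bits) {
  char buf[32];
  // Reinterpret the same bits as two's complement before printing.
  std::snprintf(buf, sizeof(buf), "%" PRId64, static_cast<int64_t>(bits));
  return buf;
}
```

For example, `FormatUnsigned64(0xffffffffffffffff)` gives `18446744073709551615`, while `FormatSigned64` on the same bits gives `-1`, which is the difference between the i32.const and i64.const lines above.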

The one minor wrinkle in printing them signed is that they are represented unsigned internally and the path to printing is a straightforward one through BinaryReaderObjdumpDisassemble::OnOpcodeUint32 and BinaryReaderObjdumpDisassemble::OnOpcodeUint64. If we do them signed then they'd both need special cases for the consts, which is doable, just messes with the model a bit, as per spec:

In the abstract syntax, they are represented as unsigned values.

Although, for the text spec we have:

For the other integer instructions, the use of two’s complement for the signed interpretation means that they behave the same regardless of signedness.

So 🤷 are we text or abstract/binary at the point of dumping?

sbc100 · 2023-12-02T01:50:43Z

Could you upload a separate PR for the i64.const printing? I agree it should be consistent with i32.const.

I'm not sure I understand the other two items yet..

rvagg · 2023-12-02T02:57:20Z

Done, leaving this with just the ParseHex change and the extraneous . in the hex format. If it makes it easier to stomach I can back out the inf and nan stuff; that just seemed like a logical thing to do because the method is called "ParseHex" but it can't parse all of the hex that WriteHex produces, hence the asymmetry. Backing it out will just mean that my test cases need to special-case the "nan*" and "*inf" cases to get the job done.

Going deeper into the . thing in the hexfloat format, the case being addressed here is specifically about subnormals (sorry, I wasn't clear on that to begin with).

The . is technically allowed by the spec I believe, so the parser being lax is fine, but the printer is inconsistent, both within itself and across other implementations (i.e. I've not found another implementation yet that does this). I believe that the formalised grammar defined for IEEE 754 looks like this: https://observablehq.com/@jrus/hexfloat#cell-631 - {hexDigit}* "." {hexDigit}+ | {hexDigit}+ "." | {hexDigit}+ and the WebAssembly spec appears to suggest roughly the same thing wrt ., with the ? indicator.
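As a rough check of the significand grammar quoted above, the three alternatives can be written as a regular expression. This is only a sketch: `IsHexSignificand` is a hypothetical name, and it covers just the significand digits, not the `0x` prefix, sign, or exponent.

```cpp
#include <regex>
#include <string>

// Hypothetical checker for the quoted significand grammar:
//   {hexDigit}* "." {hexDigit}+ | {hexDigit}+ "." | {hexDigit}+
bool IsHexSignificand(const std::string& text) {
  static const std::regex kPattern(
      R"((?:[0-9a-fA-F]*\.[0-9a-fA-F]+|[0-9a-fA-F]+\.|[0-9a-fA-F]+))");
  // regex_match requires the whole string to match one alternative.
  return std::regex_match(text, kPattern);
}
```

Under this reading, `1.` is accepted by the second alternative, which is why a lax parser tolerates `0x1.p-149` even though printers normally emit `0x1p-149`.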

What's happening here, and why it's not being picked up by existing tests, is that this branch is only encountered for subnormals, which aren't covered in the dump tests. They are covered in the const.wast spec tests, but not as hexfloat output. I've added a new commit to this branch that passes with the current main version of literal.cc but will fail on this fixed version. You can see the 0x1.p entries in there now, looking out of place, because further up the non-subnormal cases without exponent have just 0x1p. So in the branch I'm addressing in my fix, it comes straight after the subnormal adjustment of sig.

Clear as mud I'm sure.

rvagg · 2023-12-02T05:30:35Z

Digging further into the parse hex thing I've done here, after tracing the normal path of incoming tokens, I can see nan and inf being handled entirely separately in the text lexer and coming in to literal.cc with an indicator that separates the parsing path. So, I've backed out that change that adds extra logic inside ParseHex, leaving just the . fix and the test changes in here. I hope that simplifies things a little bit. The only material change now is omitting . for subnormals that have no significand.

And I can flip those failing tests around when you're happy with the change, I just want to demonstrate what it's doing now with those.
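To make the intended output concrete, here is a minimal sketch of f32 hexfloat formatting that normalises subnormals and only emits the . when there are fractional digits. It is not the code in literal.cc; `FormatHexF32` is a hypothetical name, and infinity and NaN are deliberately left unhandled.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical formatter (not wabt's WriteHex): render finite f32 bits as a
// hexfloat, omitting the '.' when the fraction is empty.
std::string FormatHexF32(uint32_t bits) {
  const bool negative = (bits >> 31) != 0;
  const uint32_t biased_exp = (bits >> 23) & 0xff;
  uint32_t sig = bits & 0x7fffff;
  if (biased_exp == 0xff) return std::string();  // inf/nan: out of scope here

  std::string out = negative ? "-0x" : "0x";
  if (biased_exp == 0 && sig == 0) return out + "0p+0";

  int exp = static_cast<int>(biased_exp) - 127;
  if (biased_exp == 0) {
    // Subnormal: shift until the implicit leading 1 appears at bit 23.
    exp = -126;
    while ((sig & 0x800000) == 0) {
      sig <<= 1;
      --exp;
    }
    sig &= 0x7fffff;  // drop the now-explicit leading 1
  }

  // 23 fraction bits shifted left by 1 give six whole hex digits.
  char digits[8];
  std::snprintf(digits, sizeof(digits), "%06x", static_cast<unsigned>(sig << 1));
  std::string frac(digits);
  while (!frac.empty() && frac.back() == '0') frac.pop_back();

  out += "1";
  if (!frac.empty()) out += "." + frac;  // no '.' when nothing follows it
  char exp_buf[16];
  std::snprintf(exp_buf, sizeof(exp_buf), "p%+d", exp);
  return out + exp_buf;
}
```

With this sketch, the bit pattern 0x00000001 formats as `0x1p-149` and 0x00000003 as `0x1.8p-148`.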

rvagg · 2023-12-09T05:05:14Z

@sbc100 I've fixed the tests with the new subnormal additions so that they pass now. Below is a capture of what it looks like with subnormals added but printed in the format that the version on main produces. The change here is now very minimal aside from test additions. PTAL.

Capture of test failure with subnormals included

- test/dump/hexfloat_f32.txt
  STDOUT MISMATCH:
  --- expected
  +++ actual
  @@ -35,17 +35,17 @@
    000078: 1a                         | drop
    000079: 43 80 80 7f c0             | f32.const -0x1.ff01p+1
    00007e: 1a                         | drop
  - 00007f: 43 01 00 00 00             | f32.const 0x1.p-149
  + 00007f: 43 01 00 00 00             | f32.const 0x1p-149
    000084: 1a                         | drop
  - 000085: 43 01 00 00 80             | f32.const -0x1.p-149
  + 000085: 43 01 00 00 80             | f32.const -0x1p-149
    00008a: 1a                         | drop
  - 00008b: 43 01 00 00 00             | f32.const 0x1.p-149
  + 00008b: 43 01 00 00 00             | f32.const 0x1p-149
    000090: 1a                         | drop
  - 000091: 43 01 00 00 80             | f32.const -0x1.p-149
  + 000091: 43 01 00 00 80             | f32.const -0x1p-149
    000096: 1a                         | drop
  - 000097: 43 02 00 00 00             | f32.const 0x1.p-148
  + 000097: 43 02 00 00 00             | f32.const 0x1p-148
    00009c: 1a                         | drop
  - 00009d: 43 02 00 00 80             | f32.const -0x1.p-148
  + 00009d: 43 02 00 00 80             | f32.const -0x1p-148
    0000a2: 1a                         | drop
    0000a3: 43 03 00 00 00             | f32.const 0x1.8p-148
    0000a8: 1a                         | drop

- test/dump/hexfloat_f64.txt
  STDOUT MISMATCH:
  --- expected
  +++ actual
  @@ -39,41 +39,41 @@
    0000cc: 1a                         | drop
    0000cd: 44 00 00 00 00 00 00 00 80 | f64.const -0x0p+0
    0000d6: 1a                         | drop
  - 0000d7: 44 01 00 00 00 00 00 00 00 | f64.const 0x1.p-1074
  + 0000d7: 44 01 00 00 00 00 00 00 00 | f64.const 0x1p-1074
    0000e0: 1a                         | drop
  - 0000e1: 44 01 00 00 00 00 00 00 80 | f64.const -0x1.p-1074
  + 0000e1: 44 01 00 00 00 00 00 00 80 | f64.const -0x1p-1074
    0000ea: 1a                         | drop
  - 0000eb: 44 01 00 00 00 00 00 00 00 | f64.const 0x1.p-1074
  + 0000eb: 44 01 00 00 00 00 00 00 00 | f64.const 0x1p-1074
    0000f4: 1a                         | drop
  - 0000f5: 44 01 00 00 00 00 00 00 80 | f64.const -0x1.p-1074
  + 0000f5: 44 01 00 00 00 00 00 00 80 | f64.const -0x1p-1074
    0000fe: 1a                         | drop
  - 0000ff: 44 01 00 00 00 00 00 00 00 | f64.const 0x1.p-1074
  + 0000ff: 44 01 00 00 00 00 00 00 00 | f64.const 0x1p-1074
    000108: 1a                         | drop
  - 000109: 44 01 00 00 00 00 00 00 80 | f64.const -0x1.p-1074
  + 000109: 44 01 00 00 00 00 00 00 80 | f64.const -0x1p-1074
    000112: 1a                         | drop
  - 000113: 44 01 00 00 00 00 00 00 00 | f64.const 0x1.p-1074
  + 000113: 44 01 00 00 00 00 00 00 00 | f64.const 0x1p-1074
    00011c: 1a                         | drop
  - 00011d: 44 01 00 00 00 00 00 00 80 | f64.const -0x1.p-1074
  + 00011d: 44 01 00 00 00 00 00 00 80 | f64.const -0x1p-1074
    000126: 1a                         | drop
  - 000127: 44 01 00 00 00 00 00 00 00 | f64.const 0x1.p-1074
  + 000127: 44 01 00 00 00 00 00 00 00 | f64.const 0x1p-1074
    000130: 1a                         | drop
  - 000131: 44 01 00 00 00 00 00 00 80 | f64.const -0x1.p-1074
  + 000131: 44 01 00 00 00 00 00 00 80 | f64.const -0x1p-1074
    00013a: 1a                         | drop
  - 00013b: 44 02 00 00 00 00 00 00 00 | f64.const 0x1.p-1073
  + 00013b: 44 02 00 00 00 00 00 00 00 | f64.const 0x1p-1073
    000144: 1a                         | drop
  - 000145: 44 02 00 00 00 00 00 00 80 | f64.const -0x1.p-1073
  + 000145: 44 02 00 00 00 00 00 00 80 | f64.const -0x1p-1073
    00014e: 1a                         | drop
  - 00014f: 44 02 00 00 00 00 00 00 00 | f64.const 0x1.p-1073
  + 00014f: 44 02 00 00 00 00 00 00 00 | f64.const 0x1p-1073
    000158: 1a                         | drop
  - 000159: 44 02 00 00 00 00 00 00 80 | f64.const -0x1.p-1073
  + 000159: 44 02 00 00 00 00 00 00 80 | f64.const -0x1p-1073
    000162: 1a                         | drop
  - 000163: 44 02 00 00 00 00 00 00 00 | f64.const 0x1.p-1073
  + 000163: 44 02 00 00 00 00 00 00 00 | f64.const 0x1p-1073
    00016c: 1a                         | drop
  - 00016d: 44 02 00 00 00 00 00 00 80 | f64.const -0x1.p-1073
  + 00016d: 44 02 00 00 00 00 00 00 80 | f64.const -0x1p-1073
    000176: 1a                         | drop
  - 000177: 44 02 00 00 00 00 00 00 00 | f64.const 0x1.p-1073
  + 000177: 44 02 00 00 00 00 00 00 00 | f64.const 0x1p-1073
    000180: 1a                         | drop
  - 000181: 44 02 00 00 00 00 00 00 80 | f64.const -0x1.p-1073
  + 000181: 44 02 00 00 00 00 00 00 80 | f64.const -0x1p-1073
    00018a: 1a                         | drop
    00018b: 44 03 00 00 00 00 00 10 00 | f64.const 0x1.0000000000003p-1022
    000194: 1a                         | drop

rvagg force-pushed the rvagg/dump-fixes branch 2 times, most recently from eeaec87 to 2e1b9b5 Compare December 1, 2023 13:41

sbc100 reviewed Dec 1, 2023

rvagg added a commit to rvagg/wabt that referenced this pull request Dec 2, 2023

fix: uint64 opcodes print as %llu

2008bd8

Ref: WebAssembly#2342

rvagg mentioned this pull request Dec 2, 2023

fix: uint64 opcodes print as %llu #2344

Merged

rvagg force-pushed the rvagg/dump-fixes branch from 2e1b9b5 to d676e0f Compare December 2, 2023 01:58

sbc100 pushed a commit that referenced this pull request Dec 2, 2023

[wasm-objdump] Print i64 values as unsigned (#2344)

6780262

This matches the behaviour of i32 printing. Ref: #2342

rvagg force-pushed the rvagg/dump-fixes branch from 96930b2 to 7d48b53 Compare December 2, 2023 05:28

rvagg added a commit to rvagg/kasm that referenced this pull request Dec 9, 2023

make const tests pass

176172c

with the help of some wabt upstream fixes: WebAssembly/wabt#2342

rvagg force-pushed the rvagg/dump-fixes branch from 7d48b53 to f5768fe Compare December 9, 2023 04:51

rvagg added 2 commits December 9, 2023 15:53

fix(hexfloat): extraneous '.' when no exponent, add some fixed known-…

9a24954

…good cases

test(hexfloat): add subnormals to f??.const dump output

33ed5f8

rvagg force-pushed the rvagg/dump-fixes branch from f5768fe to 33ed5f8 Compare December 9, 2023 04:53
