Multiple issues in arm64 integer simd test subject #1082

Open
rfalke opened this issue Sep 21, 2021 · 5 comments

rfalke commented Sep 21, 2021

Version: f7fcefa

Subject: https://github.com/rfalke/decompiler-subjects/tree/master/from_holdec/stress_arm64/arm64_macho_simdInt_single_inst__1_var

In general, the subject is constructed so that:

  • for each function, the decompiler should produce only 3 memory writes, each with the value 0
  • the functions don't take any parameters
  • the return value is ignored by the caller and explicitly set to zero in the function

Here I'm choosing an instruction which Reko understands. Other functions have similar issues, and there are functions with instructions Reko doesn't understand (see #1081).

Input:

000000010000f5a8 <_inst_165_var_0>:
10000f5a8: 00 0f 8c d2  mov     x0, #24696
10000f5ac: a0 73 a9 f2  movk    x0, #19357, lsl #16
10000f5b0: 80 12 c6 f2  movk    x0, #12436, lsl #32
10000f5b4: 60 63 f8 f2  movk    x0, #49947, lsl #48
10000f5b8: 0c 00 67 9e  fmov    d12, x0
10000f5bc: c0 ba 9e d2  mov     x0, #62934
10000f5c0: 20 3c b5 f2  movk    x0, #43489, lsl #16
10000f5c4: 00 5a cd f2  movk    x0, #27344, lsl #32
10000f5c8: 60 ae f9 f2  movk    x0, #52595, lsl #48
10000f5cc: 0c 00 af 9e  fmov.d  v12[1], x0
10000f5d0: 80 b3 9e d2  mov     x0, #62876
10000f5d4: 60 6f ad f2  movk    x0, #27515, lsl #16
10000f5d8: 00 68 c3 f2  movk    x0, #6976, lsl #32
10000f5dc: a0 cc f5 f2  movk    x0, #44645, lsl #48
10000f5e0: 1f 00 67 9e  fmov    d31, x0
10000f5e4: 60 27 82 d2  mov     x0, #4411
10000f5e8: c0 ef a3 f2  movk    x0, #8062, lsl #16
10000f5ec: 40 c3 c1 f2  movk    x0, #3610, lsl #32
10000f5f0: 20 6d ff f2  movk    x0, #64361, lsl #48
10000f5f4: 1f 00 af 9e  fmov.d  v31[1], x0
10000f5f8: 60 7b 9a d2  mov     x0, #54235
10000f5fc: 80 e9 b7 f2  movk    x0, #48972, lsl #16
10000f600: 40 2c d9 f2  movk    x0, #51554, lsl #32
10000f604: 60 7e ff f2  movk    x0, #64499, lsl #48
10000f608: 1f 00 67 9e  fmov    d31, x0
10000f60c: 00 47 93 d2  mov     x0, #39480
10000f610: 60 19 b6 f2  movk    x0, #45259, lsl #16
10000f614: e0 8e c6 f2  movk    x0, #13431, lsl #32
10000f618: 00 80 ff f2  movk    x0, #64512, lsl #48
10000f61c: 1f 00 af 9e  fmov.d  v31[1], x0
10000f620: ec 8f 3f 6e  cmeq.16b        v12, v31, v31
10000f624: 8c 1d 3f 6e  eor.16b v12, v12, v31
10000f628: 8c 1d 3f 6e  eor.16b v12, v12, v31
10000f62c: 80 01 ae 9e  fmov.d  x0, v12[1]
10000f630: 81 01 66 9e  fmov    x1, d12
10000f634: e4 03 1f aa  mov     x4, xzr
10000f638: e2 ff 9f d2  mov     x2, #65535
10000f63c: e2 ff bf f2  movk    x2, #65535, lsl #16
10000f640: e2 ff df f2  movk    x2, #65535, lsl #32
10000f644: e2 ff ff f2  movk    x2, #65535, lsl #48
10000f648: 00 00 02 cb  sub     x0, x0, x2
10000f64c: e2 ff 9f d2  mov     x2, #65535
10000f650: e2 ff bf f2  movk    x2, #65535, lsl #16
10000f654: e2 ff df f2  movk    x2, #65535, lsl #32
10000f658: e2 ff ff f2  movk    x2, #65535, lsl #48
10000f65c: 21 00 02 cb  sub     x1, x1, x2
10000f660: 82 01 00 b0  adrp    x2, #200704
10000f664: 42 60 00 91  add     x2, x2, #24
10000f668: 40 00 00 f9  str     x0, [x2]
10000f66c: 82 01 00 b0  adrp    x2, #200704
10000f670: 42 80 00 91  add     x2, x2, #32
10000f674: 41 00 00 f9  str     x1, [x2]
10000f678: 82 01 00 b0  adrp    x2, #200704
10000f67c: 42 40 00 91  add     x2, x2, #16
10000f680: 44 00 00 f9  str     x4, [x2]
10000f684: e0 03 1f 2a  mov     w0, wzr
10000f688: c0 03 5f d6  ret

Output:

// 000000010000F5A8: void _inst_165_var_0(Register word64 x1, Register (ptr64 Eq_2429) q12_64_64, Register (ptr64 Eq_2430) q31_64_64)
// Called from:
//      _main
void _inst_165_var_0(word64 x1, struct Eq_2429 * q12_64_64, struct Eq_2430 * q31_64_64)
{
	*((char *) &q12_64_64->a4B9D6078->qw0000 + 1) = 14804293844532852182;
	*((char *) &q31_64_64->a6B7BF59C->qw0000 + 1) = 18116026481434825019;
	Eq_2441 q31_34 = (char *) q31_64_64 - 1085484069;
	*((word128) q31_34 + 1) = 18158571386229725752;
	g_qw40018 = (__cmeq(q31_34, q31_34) ^ q31_34 ^ q31_34)[1].qw0000 - ~0x00;
	g_qw40020 = x1 - ~0x00;
	g_qw40010 = 0x00;
}

Issues:

  • Reko thinks that x1 needs to be passed in, although it is set in the function
  • Reko thinks that q12_64_64 and q31_64_64 need to be passed in, although they are set in the function
  • No idea where a4B9D6078 comes from
  • 0x00 suggests a byte value, but it isn't one. It should be either 0x0 (or, even better, 0) or 0x00000....00
  • Isn't x1 - ~0x00 the same as x1 + 1?!
  • Should calculate __cmeq(q31_34, q31_34) as 0xfff...ff
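
For reference, the expected result of _inst_165_var_0 can be checked with a small simulation. This is only a sketch in Python, not Reko code: the helper names (movk_chain, cmeq_16b, sub64, inst_165_var_0) are made up for illustration, registers are modelled as plain integers, and only the instructions from the listing that affect the three stores are modelled.

```python
MASK64 = (1 << 64) - 1


def movk_chain(h0, h1, h2, h3):
    """Build the 64-bit constant produced by mov + three movk instructions."""
    return h0 | (h1 << 16) | (h2 << 32) | (h3 << 48)


def cmeq_16b(a, b):
    """Byte-wise compare-equal over 16 lanes; equal lanes become 0xFF."""
    out = 0
    for lane in range(16):
        shift = 8 * lane
        if ((a >> shift) & 0xFF) == ((b >> shift) & 0xFF):
            out |= 0xFF << shift
    return out


def sub64(a, b):
    """64-bit wrapping subtraction, like `sub x0, x0, x2`."""
    return (a - b) & MASK64


def inst_165_var_0():
    # Last values written to v31 (low half via fmov d31, high half via fmov.d v31[1]).
    lo = movk_chain(54235, 48972, 51554, 64499)
    hi = movk_chain(39480, 45259, 13431, 64512)
    v31 = (hi << 64) | lo

    v12 = cmeq_16b(v31, v31)  # comparing a register with itself: all ones
    v12 ^= v31                # eor.16b v12, v12, v31
    v12 ^= v31                # second eor cancels the first

    x0 = (v12 >> 64) & MASK64  # fmov.d x0, v12[1]
    x1 = v12 & MASK64          # fmov   x1, d12
    x4 = 0                     # mov    x4, xzr

    # x2 is loaded with 0xFFFFFFFFFFFFFFFF before each sub.
    return sub64(x0, MASK64), sub64(x1, MASK64), x4


if __name__ == "__main__":
    print(inst_165_var_0())
```

Under these assumptions all three stored values come out as 0, which is what the subject is built to produce. The sub64 helper also shows why x1 - ~0x00 is the same as x1 + 1 in 64-bit arithmetic: subtracting 2^64 - 1 and adding 1 differ by exactly 2^64.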

rfalke commented Sep 21, 2021

Because of an error in the generator, I had to re-run it. The lines above no longer match the master branch of https://github.com/rfalke/decompiler-subjects.

rfalke commented Sep 21, 2021

There are various functions which produce (almost) the 3 expected memory writes of 0:

// 000000010002AB2C: void _inst_687_var_0(Register (ptr64 Eq_10148) q1_64_64, Register (ptr64 Eq_10149) q31_64_64)
// Called from:
//      _main
void _inst_687_var_0(struct Eq_10148 * q1_64_64, struct Eq_10149 * q31_64_64)
{
	*((char *) &q1_64_64->a2E492072->qw0000 + 1) = 6253327452446453412;
	*((char *) &q31_64_64->aA711593A->qw0000 + 1) = 16669493564404935579;
	Mem43[0x0000000100040020<p64>:word64] = 0x00[1];
	g_t40028.u1 = 0x00;
	g_qw40018 = 0x00;
}

Issues:

  • the parameters are still wrongly guessed
  • 0x00[1] looks strange
  • Mem43[0x0000000100040020<p64>:word64] looks strange
  • the first two assignments don't have a side effect and can be removed.

rfalke commented Sep 22, 2021

Another bug fix in the generator, but the issues are still visible.

uxmal added a commit that referenced this issue Dec 7, 2021
A pair of swapped operands was causing lots of the errors reported in #1082.

uxmal commented Dec 7, 2021

Commits 458ffdd and earlier have improved the code generation tremendously, although there are still issues. I will leave this open for now, with the understanding that there are many sub-issues in this issue, and so it will take some time to fix them all.

rfalke commented Aug 26, 2023

Sample outputs of version 0.11.4.0-931ca7d:


// correct!
void inst_300_var_0()
{
	Mem45[0x000000010004C010<p64>:word64] = 0x00;
	Mem48[0x000000010004C018<p64>:word64] = 0x00;
	Mem51[0x000000010004C008<p64>:word64] = 0x00;
	Mem54[0x000000010004C000<p64>:word64] = 0x00;
}

// incorrect computation
void inst_301_var_0()
{
	Mem32[0x000000010004C010<p64>:word64] = 1367546121348104947;
	Mem35[0x000000010004C018<p64>:word64] = 0x00;
	Mem38[0x000000010004C008<p64>:word64] = 0x00;
	Mem41[0x000000010004C000<p64>:word64] = 0x00;
}

// missing computation of __shl
void inst_405_var_0()
{
	word64 d11_29 = __shl<word32[2]>(15989307599098311744, 20);
	Mem50[0x000000010004C010<p64>:word64] = (SEQ(12735396823387534501, d11_29) ^ 0BA12058E2704FF12DDE56E8BE0B57440)[1] - 13407785148733521682;
	Mem53[0x000000010004C018<p64>:word64] = (d11_29 ^ 15989307599098311744) - 3843099403073451072;
	Mem56[0x000000010004C008<p64>:word64] = 0x00;
	Mem59[0x000000010004C000<p64>:word64] = 0x00;
}

So, in general, this is still an issue.
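
To illustrate the missing __shl computation in inst_405_var_0, here is a rough Python sketch of what folding it could look like. It assumes that __shl<word32[2]> means a left shift applied independently to each of the two 32-bit lanes of the 64-bit value; shl_lanes and inst_405_second_store are hypothetical names, not Reko functions.

```python
MASK64 = (1 << 64) - 1


def shl_lanes(value, shift, lane_bits=32, total_bits=64):
    """Shift each lane left independently, discarding bits that leave the lane."""
    lane_mask = (1 << lane_bits) - 1
    out = 0
    for pos in range(0, total_bits, lane_bits):
        lane = (value >> pos) & lane_mask
        out |= ((lane << shift) & lane_mask) << pos
    return out


def inst_405_second_store():
    """Evaluate the expression stored to 0x000000010004C018 in the sample output."""
    c = 15989307599098311744
    d11 = shl_lanes(c, 20)  # __shl<word32[2]>(c, 20) under the lane-wise assumption
    return ((d11 ^ c) - 3843099403073451072) & MASK64


if __name__ == "__main__":
    print(hex(shl_lanes(15989307599098311744, 20)))
    print(inst_405_second_store())
```

Under that assumption, the store to 0x000000010004C018 folds to 0, which suggests that only the constant evaluation of __shl is missing for that write.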

uxmal added a commit that referenced this issue Dec 20, 2023
Part of the ongoing fixes for #1082
uxmal added a commit that referenced this issue Dec 20, 2023
uxmal added a commit that referenced this issue Dec 29, 2023