Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implicit memory operands that pushes values to stack should have displacement #510

Open
NaC-L opened this issue Jun 16, 2024 · 5 comments · May be fixed by #512
Open

Implicit memory operands that pushes values to stack should have displacement #510

NaC-L opened this issue Jun 16, 2024 · 5 comments · May be fixed by #512
Labels
A-decoder Area: Decoder C-enhancement Category: Enhancement of existing features

Comments

@NaC-L
Copy link
Contributor

NaC-L commented Jun 16, 2024

if we do push rsp operand info tells us that we push rsp to [rsp] however, it should be [rsp-8(size)].

https://www.felixcloutier.com/x86/push

IA-32 Architecture Compatibility ¶
For IA-32 processors from the Intel 286 on, the PUSH ESP instruction pushes the value of the ESP register as it existed before the instruction was executed. (This is also true for Intel 64 architecture, real-address and virtual-8086 modes of IA-32 architecture.) For the Intel® 8086 processor, the PUSH SP instruction pushes the new value of the SP register (that is the value after it has been decremented by 2).
mov rsp, 0x1008
push rsp 

is [0x1000] = 0x1008 and not [0x1000] = 0x1000

ZydisInfo.exe -64 -64 56
== [    BASIC ] ============================================================================================
   MNEMONIC: push [ENC: DEFAULT, MAP: DEFAULT, OPC: 0x56]
     LENGTH:  1
        SSZ: 64
       EOSZ: 64
       EASZ: 64
   CATEGORY: PUSH
    ISA-SET: I86
    ISA-EXT: BASE
 EXCEPTIONS: NONE
  OPTIMIZED: 56

== [ OPERANDS ] ============================================================================================
##       TYPE  VISIBILITY  ACTION      ENCODING   SIZE  NELEM  ELEMSZ  ELEMTYPE                        VALUE
--  ---------  ----------  ------  ------------   ----  -----  ------  --------  ---------------------------
 0   REGISTER    EXPLICIT       R        OPCODE     64      1      64       INT                          rsi
 1   REGISTER      HIDDEN      RW          NONE     64      1      64       INT                          rsp
 2     MEMORY      HIDDEN       W          NONE     64      1      64       INT  TYPE  =                 MEM
                                                                                 SEG   =                  ss
                                                                                 BASE  =                 rsp
                                                                                 INDEX =                none
                                                                                 SCALE =                   0
                                                                                 DISP  =  0x0000000000000000
--  ---------  ----------  ------  ------------   ----  -----  ------  --------  ---------------------------

== [      ATT ] ============================================================================================
   ABSOLUTE: push %rsi
   RELATIVE: push %rsi

== [    INTEL ] ============================================================================================
   ABSOLUTE: push rsi
   RELATIVE: push rsi

== [ SEGMENTS ] ============================================================================================
56
:..OPCODE
@flobernd flobernd added C-enhancement Category: Enhancement of existing features A-decoder Area: Decoder labels Jun 17, 2024
@flobernd
Copy link
Member

Hi @NaC-L,

we do not provide full semantics in most cases (sometimes it's completely impossible without a rich context; e.g. for certain R/E/FLAGS bits). The generic [RSP] operand is mainly there to indicate, that "some kind" of memory access happens. This is useful to detect dirty registers and as well allowed me to indicate different R/W/RW for the action on the register itself and the action on the memory access.

However, I agree that we could relatively easily improve that case as it's a static condition. We just have to check a bunch of other instructions as well to check, if they behave the same (like CALL, PUSH2, ...). If it's too much effort to implement, I would rather not do it at the moment.

@NaC-L
Copy link
Contributor Author

NaC-L commented Jun 19, 2024

Behavior is the same with CALL, PUSHA/PUSHAD PUSHF/PUSHFD/PUSHFQ as far as I know, no other instruction behaves like this. I'm working on a solution by modifying OperandDefinitions.inc.

@ZehMatt
Copy link
Contributor

ZehMatt commented Jun 19, 2024

I think it is correct as it is, the value is written to [rsp] after rsp is modified and not before, also check the pseudo code https://www.felixcloutier.com/x86/push#operation.

@NaC-L
Copy link
Contributor Author

NaC-L commented Jun 19, 2024

Hey @ZehMatt,

while psuedocode states that rsp is decremented first then value is written to the address, few lines above its mentioned:

The PUSH ESP instruction pushes the value of the ESP register as it existed before the instruction was executed. If a PUSH instruction uses a memory operand in which the ESP register is used for computing the operand address, the address of the operand is computed before the ESP register is decremented.

and
For IA-32 processors from the Intel 286 on, the PUSH ESP instruction pushes the value of the ESP register as it existed before the instruction was executed. (This is also true for Intel 64 architecture, real-address and virtual-8086 modes of IA-32 architecture.) For the Intel® 8086 processor, the PUSH SP instruction pushes the new value of the SP register (that is the value after it has been decremented by 2).

as my original post states

(you wouldnt move a value to rsp and push but assume you would for demonstration sake)
mov rsp, 0x1008
push rsp 

would result in [0x1000] = 0x1008 and not [0x1000] = 0x1000, you can confirm this by executing push rsp in a debugger and observing the values, it would push the value of rsp as it existed before executing the instruction.

its the intel manual being unclear again

@ZehMatt
Copy link
Contributor

ZehMatt commented Jun 20, 2024

I think you are assuming that SRC is just an alias for RSP in the pseudo code, the operand has to be read first, if we go by read/write order then it will work out, also as @flobernd previously mentioned Zydis doesn't have the full semantics but its good enough as it is if you respect the read/write order, like this example:

  1. SRC = Operand[0] (R)
  2. RSP = RSP - 8 (RW)
  3. [RSP] = SRC (W)

@NaC-L NaC-L linked a pull request Jun 20, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-decoder Area: Decoder C-enhancement Category: Enhancement of existing features
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants