do not rely on section info when DT_DYNAMIC is available #12732

Open
ret2libc opened this issue Jan 10, 2019 · 18 comments
Labels
ELF high-priority High priority bugs
Projects

Comments

@ret2libc
Contributor

r2r broken test: https://github.com/radare/radare2-regressions/commit/19ccc45154ca69041b8432147197738a453419e5

The ELF parser should not rely on sections to compute relocations, because the loader does not use them. The loader uses the information provided in the DT_DYNAMIC segment (program header type PT_DYNAMIC).

@ret2libc ret2libc added the ELF label Jan 10, 2019
@ret2libc ret2libc added the bug label Apr 16, 2019
@ghost

ghost commented Apr 2, 2020

Hi, could you point me to where I should look in the code if I want to override this behavior with a lookup on the DT_DYNAMIC segment?

@ret2libc
Contributor Author

ret2libc commented Apr 2, 2020

@08a Hi, I would start from Elf_(r_bin_elf_get_relocs), defined in libr/bin/format/elf/elf.c. As you can see, it iterates over the sections looking for .rela or .rel sections. As I said, loaders don't really care about sections; they use the information in the DT_DYNAMIC segment to find relocations (I cannot give you the exact details as I don't remember them anymore, but I guess the entries PLTRELSZ, PLTREL, JMPREL, RELA, RELASZ, RELAENT, RELACOUNT may help). Looking at the source code of the Linux dynamic loader may also help.
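
To make the dynamic-entry lookup above concrete, here is a minimal sketch of walking a DT_NULL-terminated dynamic array and collecting the relocation-related tags. It is not radare2 code: `dyn_reloc_info`, `collect_dyn_relocs` and `rela_count` are hypothetical names, the types come from the glibc `<elf.h>` header (so this assumes a Linux build), and only the 64-bit RELA case is covered.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the relocation-related dynamic entries. */
typedef struct {
    uint64_t rela;     /* DT_RELA: address of the RELA table */
    uint64_t relasz;   /* DT_RELASZ: total size of that table in bytes */
    uint64_t relaent;  /* DT_RELAENT: size of one RELA entry */
    uint64_t jmprel;   /* DT_JMPREL: address of the PLT relocations */
    uint64_t pltrelsz; /* DT_PLTRELSZ: size of the PLT relocations in bytes */
    uint64_t pltrel;   /* DT_PLTREL: DT_REL or DT_RELA, the PLT entry format */
} dyn_reloc_info;

/* Walk at most `max` entries, stopping at the DT_NULL terminator.
 * Missing entries stay at 0 here, which is a simplification. */
static dyn_reloc_info collect_dyn_relocs(const Elf64_Dyn *dyn, size_t max) {
    dyn_reloc_info info = {0};
    assert(dyn != NULL);
    for (size_t i = 0; i < max && dyn[i].d_tag != DT_NULL; i++) {
        switch (dyn[i].d_tag) {
        case DT_RELA:     info.rela = dyn[i].d_un.d_ptr; break;
        case DT_RELASZ:   info.relasz = dyn[i].d_un.d_val; break;
        case DT_RELAENT:  info.relaent = dyn[i].d_un.d_val; break;
        case DT_JMPREL:   info.jmprel = dyn[i].d_un.d_ptr; break;
        case DT_PLTRELSZ: info.pltrelsz = dyn[i].d_un.d_val; break;
        case DT_PLTREL:   info.pltrel = dyn[i].d_un.d_val; break;
        default: break; /* tags unrelated to relocations are ignored */
        }
    }
    return info;
}

/* Number of entries in the main RELA table, or 0 if it is not described. */
static size_t rela_count(const dyn_reloc_info *info) {
    assert(info != NULL);
    return info->relaent ? (size_t)(info->relasz / info->relaent) : 0;
}
```

The addresses collected this way are virtual addresses, so they still have to be translated to file offsets before the tables can be read from the file.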

@ghost

ghost commented Apr 4, 2020

@ret2libc There is an interesting function in libr/bin/format/elf/elf.c, init_dynamic_section. I was wondering if I could modify this function and the struct ELFOBJ? I could add DT_RELA, DT_REL, ..

@ghost

ghost commented Apr 8, 2020

I am moving the list of possible refactorings here:

  • rel_cache_new and Elf_(r_bin_elf_get_relocs) (same behaviour, different data structure; maybe replace one with the new implementation or reuse it inside rel_cache_new)
  • move struct dynamic_relocation_section inside ELFOBJ
  • move the function get_dynamic_info inside init_dynamic_section
  • remove is_rela from ELFOBJ

@ghost

ghost commented Apr 9, 2020

I am trying to remove the struct ht_rel_t but in the function get_import_addr_ppc k is used:
rel->k.
What is the purpose of the field?

@ghost

ghost commented Apr 9, 2020

I have no idea what get_import_addr is searching for.

@ret2libc
Contributor Author

ret2libc commented Apr 9, 2020

I am trying to remove the struct ht_rel_t but in the function get_import_addr_ppc k is used:
rel->k.
What is the purpose of the field?

AFAICS that is the position of the reloc entry in the relocations. IIRC ppc has some relocation types that compute the final address based on the position of the relocation entry (or something like that, I don't remember exactly, sorry).

@ghost

ghost commented Apr 9, 2020

I am not sure what the function get_import_addr is doing. Do you have any idea, or maybe some documentation?

@radare
Collaborator

radare commented Apr 9, 2020 via email

@ghost

ghost commented Apr 9, 2020

It's documented in C

This part of the code base is real spaghetti code. Without any context, it took me 3 hours to understand that it was the address inside the PLT.

I am rewriting this part and I was wondering how to get the base address of the GOT.
My idea was to take the first lazy binding inside the GOT.

            ;-- section..got:
            0x00022c58      .qword 0x0000000000022a58 ; section..dynamic ; segment.DYNAMIC ; [22] -rw- section size 920 named .got
            0x00022c60      .qword 0x0000000000000000
            0x00022c68      .qword 0x0000000000000000
            ;-- reloc.__cxa_atexit:
            0x00022c70      .qword 0x0000000000004036                  ; RELOC 64 __cxa_atexit

Deref the ptr

        ╎   0x00004036      6800000000     push 0
        └─< 0x0000403b      e9e0ffffff     jmp section..plt
            ;-- section..text:

Do some magic

            ;-- section..plt:
        ┌─> 0x00004020      ff353aec0100   push qword [0x00022c60]     ; [12] -r-x section size 32 named .plt
        ╎   0x00004026      ff253cec0100   jmp qword [0x00022c68]      ; [0x22c68:8]=0
        ╎   0x0000402c      0f1f4000       nop dword [rax]
        ╎   0x00004030      ff253aec0100   jmp qword [reloc.__cxa_atexit] ; [0x22c70:8]=0x4036 ; "6@"
        ╎   0x00004036      6800000000     push 0
        └─< 0x0000403b      e9e0ffffff     jmp section..plt
            ;-- section..text:

But I can't read the GOT with r_buf_read_at (bin->b, first_lazy_entry, buf, 0x8); the function seems to fail. Is this normal behavior?
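
For reference, the GOT slot used by a PLT stub in the listing above can be recovered from the stub bytes alone. On x86-64, `ff 25 disp32` is `jmp qword [rip+disp32]`, and RIP points at the next instruction, so the slot is the instruction address plus 6 plus the signed displacement. For the stub at 0x4030 that gives 0x4030 + 6 + 0x1ec3a = 0x22c70, which matches reloc.__cxa_atexit. A minimal sketch follows; `plt_jmp_got_slot` is a hypothetical helper, not a radare2 function, and it only handles this one x86-64 encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode `jmp qword [rip+disp32]` (bytes ff 25 xx xx xx xx) located at
 * virtual address insn_vaddr. On success, store the address of the GOT
 * slot in *slot and return 1. Return 0 for any other byte sequence. */
static int plt_jmp_got_slot(const uint8_t *insn, size_t len,
                            uint64_t insn_vaddr, uint64_t *slot) {
    assert(insn != NULL && slot != NULL);
    if (len < 6 || insn[0] != 0xff || insn[1] != 0x25) {
        return 0;
    }
    /* The displacement is a little-endian signed 32-bit value. */
    uint32_t raw = (uint32_t)insn[2] | ((uint32_t)insn[3] << 8) |
                   ((uint32_t)insn[4] << 16) | ((uint32_t)insn[5] << 24);
    int64_t disp = (int64_t)(int32_t)raw;
    /* RIP-relative: the base is the address of the next instruction. */
    *slot = insn_vaddr + 6 + (uint64_t)disp;
    return 1;
}
```

The result is a virtual address, so reading the slot from the file still needs a virtual-address-to-file-offset translation first.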

@ret2libc
Contributor Author

Hi @08a . Why are you rewriting that part of the code? Why do you need new logic to compute the addresses when there is already one?

By the way, about using sections vs segments, I think that function is partially wrong as well, since it relies on sections by name instead of using the dynamic entries.

@ghost

ghost commented Apr 10, 2020

I was refactoring rel_cache_new and I discovered this function, which, as you said, is wrong because it relies on sections by name.

@ghost

ghost commented Apr 10, 2020

My only problem right now is getting the value in the GOT.

@ret2libc
Contributor Author

@08a I think the existing code is already doing that, so I suggest that you adapt the code but change as little as possible. For example, I think you can find the code in get_import_addr_x86 in the R_386_JMP_SLOT switch-case. In general, I believe the approach in RBin to get what you want is to get the file offset of the section and read it from there. If you already have the virtual address and you want to get the file offset, you should use Elf_(r_bin_elf_v2p_new) and then you can read data at that address in the file.
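
The virtual-address-to-file-offset step described above can be sketched with the PT_LOAD program headers, which is also what a loader relies on. This is only an illustration of the idea, not the implementation of Elf_(r_bin_elf_v2p_new): `vaddr_to_offset` is a hypothetical helper, it uses the glibc `<elf.h>` types (a Linux assumption), and it returns UINT64_MAX when the address is not backed by file data.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Translate a virtual address to a file offset using PT_LOAD segments.
 * Only the first p_filesz bytes of a segment exist in the file; the rest
 * (for example .bss) has no file offset, so UINT64_MAX is returned. */
static uint64_t vaddr_to_offset(const Elf64_Phdr *phdr, size_t phnum,
                                uint64_t vaddr) {
    assert(phdr != NULL || phnum == 0);
    for (size_t i = 0; i < phnum; i++) {
        const Elf64_Phdr *p = &phdr[i];
        if (p->p_type != PT_LOAD) {
            continue;
        }
        if (vaddr >= p->p_vaddr && vaddr - p->p_vaddr < p->p_filesz) {
            return p->p_offset + (vaddr - p->p_vaddr);
        }
    }
    return UINT64_MAX;
}
```

A read of a GOT entry would then pass the returned offset, not the virtual address, to the buffer read, after checking that it is not UINT64_MAX.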

@ghost

ghost commented Apr 11, 2020

[XX] db/formats/elf/symbols symbols with no sections header information 3
R2_NOPLUGINS=1 radare2 -escr.utf8=0 -escr.color=0 -escr.interactive=0 -N -Qc is ../bins/elf/libmemalloc-dump-mem
-- stdout

I don't get the same result as the expected output, but I am pretty sure my result is valid. It is the only ARM test that fails.

-1    0x00000000 0x00000000 GLOBAL FUNC   16       imp.__cxa_finalize
-2    0x00000000 0x00000000 GLOBAL FUNC   16       imp.__cxa_atexit
+1    0x00001050 0x00001050 GLOBAL FUNC   16       imp.__cxa_finalize
+2    0x0000105c 0x0000105c GLOBAL FUNC   16       imp.__cxa_atexit

I can find addresses that the previous implementation couldn't find (it was based on section names).

Can I have your opinion on this?

@ghost

ghost commented Apr 11, 2020

[XX] db/formats/elf/symbols symbols non common LD script
R2_NOPLUGINS=1 radare2 -escr.utf8=0 -escr.color=0 -escr.interactive=0 -N -Qc is ../bins/elf/analysis/custom_ldscript
-- stdout
--- .a  2020-04-11 15:45:26.501957104 +0200
+++ .b  2020-04-11 15:45:26.501957104 +0200
@@ -26,7 +26,7 @@
22   0x00200844 0x01c00844 LOCAL  SECT   0        .custom_text
23   ---------- 0x00000000 LOCAL  SECT   0        .comment
24   ---------- 0x00000000 LOCAL  FILE   0        custom_ldscript.c
-25   ---------- 0x00000000 LOCAL  FILE   0
+25   ---------- 0x00000000 LOCAL  FILE   0
26   0x00000660 0x00600660 LOCAL  NOTYPE 0        __init_array_end
27   0x00000660 0x00600660 LOCAL  OBJ    0        _DYNAMIC
28   0x00000660 0x00600660 LOCAL  NOTYPE 0        __init_array_start

And this test result seems strange.
It was just some whitespace.

@ghost

ghost commented Apr 28, 2020

@ret2libc Hi, I was starting a small refactoring. The idea was to limit the usage of Elf_(Dyn) *dyn_buf; inside ELFOBJ.

Some entries like DT_RUNPATH could have a valid value of 0, but inside struct Elf_(r_bin_elf_dynamic_info) the default value is 0. So we need to initialize the struct with the value -1.

The only problem is that the definitions of Elf_(Addr) and Elf_(Xword) have different sizes:

    /* Types for signed and unsigned 64-bit quantities.  */
    typedef uint64_t Elf32_Xword;
    typedef int64_t  Elf32_Sxword;
    typedef uint64_t Elf64_Xword;
    typedef int64_t  Elf64_Sxword;

    /* Type of addresses.  */
    typedef uint32_t Elf32_Addr;
    typedef uint64_t Elf64_Addr;

So I was wondering which you prefer:
Create 2 macros, Elf_(Xword_MAX) and Elf_(Addr_MAX),
or define the struct Elf_(r_bin_elf_dynamic_info) with ut64?

The last option is to forget the problem and switch to another task.

@ret2libc
Contributor Author

Create 2 macros, Elf_(Xword_MAX) and Elf_(Addr_MAX)

I'd say this one.

@XVilka XVilka added the high-priority High priority bugs label Nov 13, 2020
@XVilka XVilka added this to To do in RBin via automation Nov 13, 2020
@trufae trufae removed the bug label Jun 5, 2022
4 participants