Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hide safepoint prologue and CFI instructions from reflection. #49948

Merged
merged 6 commits into from
May 26, 2023

Conversation

maleadt
Copy link
Member

@maleadt maleadt commented May 24, 2023

Fixes #49918 (comment)

Before:

julia> @code_llvm 1+1
;  @ int.jl:87 within `+`
define i64 @"julia_+_137"(i64 signext %0, i64 signext %1) #0 {
top:
  %2 = call {}*** inttoptr (i64 7010020880 to {}*** (i64)*)(i64 262) #5
  %ptls_field3 = getelementptr inbounds {}**, {}*** %2, i64 2
  %3 = bitcast {}*** %ptls_field3 to i64***
  %ptls_load45 = load i64**, i64*** %3, align 8
  %4 = getelementptr inbounds i64*, i64** %ptls_load45, i64 2
  %safepoint = load i64*, i64** %4, align 8
  fence syncscope("singlethread") seq_cst
  %5 = load volatile i64, i64* %safepoint, align 8
  fence syncscope("singlethread") seq_cst
  %6 = add i64 %1, %0
  ret i64 %6
}

After:

julia> @code_llvm 1+1
;  @ int.jl:87 within `+`
define i64 @"julia_+_424"(i64 signext %0, i64 signext %1) #0 {
top:
  %2 = add i64 %1, %0
  ret i64 %2
}

Same for native code:

julia> @code_native 1+1
	.section	__TEXT,__text,regular,pure_instructions
	.build_version macos, 13, 0
	.globl	"_julia_+_320"                  ; -- Begin function julia_+_320
	.p2align	2
"_julia_+_320":                         ; @"julia_+_320"
; ┌ @ int.jl:87 within `+`
	.cfi_startproc
; %bb.0:                                ; %top
	add	x0, x1, x0
	ret
	.cfi_endproc
; └
                                        ; -- End function
.subsections_via_symbols

julia> @code_native 1+1
	.section	__TEXT,__text,regular,pure_instructions
	.build_version macos, 13, 0
	.globl	"_julia_+_377"                  ; -- Begin function julia_+_377
	.p2align	2
"_julia_+_377":                         ; @"julia_+_377"
; ┌ @ int.jl:87 within `+`
	.cfi_startproc
; %bb.0:                                ; %top
	stp	x20, x19, [sp, #-32]!           ; 16-byte Folded Spill
	.cfi_def_cfa_offset 32
	stp	x29, x30, [sp, #16]             ; 16-byte Folded Spill
	.cfi_offset w30, -8
	.cfi_offset w29, -16
	.cfi_offset w19, -24
	.cfi_offset w20, -32
	mov	x19, x1
	mov	x20, x0
	mov	x8, #28176
	movk	x8, #41428, lsl #16
	movk	x8, #1, lsl #32
	mov	w0, #262
	blr	x8
	ldr	x8, [x0, #16]
	ldr	x8, [x8, #16]
	; COMPILER BARRIER
	ldr	xzr, [x8]
	; COMPILER BARRIER
	add	x0, x19, x20
	ldp	x29, x30, [sp, #16]             ; 16-byte Folded Reload
	ldp	x20, x19, [sp], #32             ; 16-byte Folded Reload
	ret
	.cfi_endproc
; └
                                        ; -- End function
.subsections_via_symbols

For code_llvm we could hook up the toggle to raw, i.e., do still show the safepoint code when setting raw=true (although I'm not sure how useful that would be). code_native doesn't have a raw kwarg though...

@maleadt maleadt requested a review from KristofferC May 24, 2023 20:20
@vchuravy
Copy link
Sponsor Member

For code_llvm we could hook up the toggle to raw, i.e., do still show the safepoint code when setting raw=true (although I'm not sure how useful that would be). code_native doesn't have a raw kwarg though...

Yes having a raw option for code_native would be okay with me.

@gbaraldi
Copy link
Member

+1 for having raw=true add the safepoints back, it's quite useful for getting the modules out of julia to play with in godbolt/opt

@maleadt
Copy link
Member Author

maleadt commented May 25, 2023

Yes having a raw option for code_native would be okay with me.

Looks like code_native puts that functionality under dump_module...

If dump_module is false, do not print metadata such as rodata or directives.

I'll hook it up to that for now, but maybe we should rename the option?


EDIT: ah, that won't work. Well, it will work for code_llvm, but code_native is special: dump_module changes how assembly is generated, if set to true we use LLVM to emit assembly code (which always includes the assembly headers, cfi instructions, etc), whereas for dump_module=false we just disassemble the code that's been compiled by the JIT, which includes the safepoint. So with this design, we can't both hide the module metadata, and the safepoint instructions.

@maleadt
Copy link
Member Author

maleadt commented May 25, 2023

@vtjnash Would it be possible to re-use our own disassembler, jl_dump_fptr_asm, with binary output from jl_dump_fptr_asm (i.e., raw_mc=true)? I'm not sure where to get the debug info from, which the current implementation looks up from the fptr (jl_DI_for_fptr).

@maleadt maleadt changed the title Hide safepoint from reflection. Hide safepoint prologue and CFI instructions from reflection. May 25, 2023
@maleadt
Copy link
Member Author

maleadt commented May 25, 2023

Adding nounwind gets rid of the CFI instructions, so I added that here:

julia> @code_native 1+1
	.section	__TEXT,__text,regular,pure_instructions
	.build_version macos, 13, 0
	.globl	"_julia_+_214"                  ; -- Begin function julia_+_214
	.p2align	2
"_julia_+_214":                         ; @"julia_+_214"
; ┌ @ int.jl:87 within `+`
; %bb.0:                                ; %top
	add	x0, x1, x0
	ret
; └
                                        ; -- End function
.subsections_via_symbols

Still some verbose module output, but AsmPrinter doesn't seem customizable enough to hide those.

str = _dump_function_linfo_native(linfo, world, wrapper, syntax, debuginfo, binary, params)
else
# if we don't want the module metadata, just disassemble what our JIT has.
# TODO: make it possible to hide the safepoint prologue here too.
# once that works, it would be good to default to dump_module=false
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dump_module=false will always give sometimes wrong or misleading answers and lacks configuration, so it typically is a bad default. (Also, TODO items should be issues, not code comments.)

Copy link
Member Author

@maleadt maleadt May 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dump_module=false will always give sometimes wrong or misleading answers

Why is that? I was actually looking into using that disassembly path for code we get from jl_get_llvmf_defn, i.e., to allow configuration. The output is slightly nicer.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disassembly is an art, not a science. It is right most of the time, but often makes mistakes because doing it right is can actually be harder than turing-complete.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I just hadn't run into actual issues. But you're right that it's probably better not to default to the disassembler.

Do you think it would be useful to upstream my changes that make it possible to disassemble arbitrary objects (i.e. not only fptrs we look up through the JIT), so that its possible to use the disassembler with non-default codegen options? May also be useful for external compilers, to be able to disassemble externally-produced objects while using the Julia assembly annotator.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see which changes you mean here, but yes, in general it is reasonable to want to ask for disassembly of some bytes in an arbitrary range, as long as we can detect the unwind info exists for it (via asking debuginfo.cpp if it knows about the code)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't committed it, but was working on it until you said the disassembler path is not what we want to use. Something like:

extern "C" JL_DLLEXPORT_CODEGEN
jl_value_t *jl_dump_bin_asm_impl(void* bin, size_t len, const char* asm_variant, const char *debuginfo, char binary)
{
    std::string code;
    raw_string_ostream stream(code);

    // read the object file
    auto buf = llvm::MemoryBuffer::getMemBuffer(llvm::StringRef(static_cast<const char*>(bin), len));
    auto objectFileOrError = llvm::object::ObjectFile::createObjectFile(buf->getMemBufferRef());
    if (!objectFileOrError) {
        jl_printf(JL_STDERR, "WARNING: failed to load object file: %s\n", llvm::toString(objectFileOrError.takeError()).c_str());
        return jl_an_empty_string;
    }
    auto objectFile = objectFileOrError.get().get();
    auto context = DWARFContext::create(*objectFile).release();

    // find the first text section
    object::SectionRef TextSection;
    uint64_t TextAddr = -1;
    for (auto &Section : objectFile->sections()) {
        if (!Section.isText())
            continue;
        uint64_t Addr = Section.getAddress();
        if (Addr < TextAddr) {
            TextAddr = Addr;
            TextSection = Section;
        }
    }
    if (TextAddr == -1) {
        jl_printf(JL_STDERR, "WARNING: no text section found\n");
        return jl_an_empty_string;
    }

    // determine the location in memory of the code
    auto sectionContents = TextSection.getContents();
    uintptr_t code_ptr = reinterpret_cast<std::uintptr_t>(sectionContents->data());
    size_t code_size = sectionContents->size();

    // determine the offset between the code in memory and the code in the object file
    int64_t slide = 0;
    if (auto *OF = dyn_cast<const object::COFFObjectFile>(objectFile)) {
        slide = OF->getImageBase() - (int64_t)code_ptr;
    }
    else {
        slide = -(int64_t)code_ptr;
    }

    // Dump assembly code
    jl_ptls_t ptls = jl_current_task->ptls;
    int8_t gc_state = jl_gc_safe_enter(ptls);
    jl_dump_asm_internal(
            code_ptr, code_size, slide,
            TextSection, context,
            stream,
            asm_variant,
            debuginfo,
            binary);
    jl_gc_safe_leave(ptls, gc_state);

    return jl_pchar_to_string(stream.str().data(), stream.str().size());
}

To be cleaned-up, etc. If useful, we could add this under a code_name(; disassemble=true) flag or something.

stdlib/InteractiveUtils/src/codeview.jl Outdated Show resolved Hide resolved
src/disasm.cpp Outdated Show resolved Hide resolved
@maleadt maleadt force-pushed the tb/reflection_safepoint branch 2 times, most recently from 474b394 to 64e43ce Compare May 26, 2023 12:17
@vtjnash vtjnash added the status:merge me PR is reviewed. Merge when all tests are passing label May 26, 2023
@maleadt maleadt merged commit 23e0b2d into master May 26, 2023
2 checks passed
@maleadt maleadt deleted the tb/reflection_safepoint branch May 26, 2023 17:09
@giordano giordano removed the status:merge me PR is reviewed. Merge when all tests are passing label Jun 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants