code_llvm: print source locations #22819

Merged
merged 2 commits into from
Jul 27, 2017

@vtjnash
Member

vtjnash commented Jul 15, 2017

Supplements the printing of LLVM IR with source location information.

With inspiration from #19342

julia> code_llvm(sum, (Vector{Bool},))

; Function: sum
; Filename: reduce.jl
define i64 @julia_sum_62091(%jl_value_t addrspace(10)* dereferenceable(40)) #0 {
top:
    ; Filename: abstractarray.jl
    ; Source line: 66
  %1 = addrspacecast %jl_value_t addrspace(10)* %0 to %jl_value_t addrspace(11)*
  %2 = bitcast %jl_value_t addrspace(11)* %1 to %jl_value_t addrspace(10)* addrspace(11)*
  %3 = getelementptr %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)* addrspace(11)* %2, i64 3
  %4 = bitcast %jl_value_t addrspace(10)* addrspace(11)* %3 to i64 addrspace(11)*
  %5 = load i64, i64 addrspace(11)* %4, align 8
    ; Filename: reduce.jl
    ; Source line: 701
  %6 = icmp slt i64 %5, 1
  br i1 %6, label %L26, label %if.lr.ph

if.lr.ph:                                         ; preds = %top
    ; Source line: 702
  %7 = bitcast %jl_value_t addrspace(11)* %1 to i8* addrspace(11)*
  %8 = load i8*, i8* addrspace(11)* %7, align 8
    ; Source line: 701
  %min.iters.check = icmp ult i64 %5, 16
  br i1 %min.iters.check, label %scalar.ph, label %min.iters.checked

min.iters.checked:                                ; preds = %if.lr.ph
  %n.vec = and i64 %5, -16
  %cmp.zero = icmp eq i64 %n.vec, 0
  %ind.end = or i64 %n.vec, 1
  br i1 %cmp.zero, label %scalar.ph, label %vector.ph

vector.ph:                                        ; preds = %min.iters.checked
  br label %vector.body

vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %vec.phi = phi <4 x i64> [ zeroinitializer, %vector.ph ], [ %28, %vector.body ]
  %vec.phi4 = phi <4 x i64> [ zeroinitializer, %vector.ph ], [ %29, %vector.body ]
  %vec.phi5 = phi <4 x i64> [ zeroinitializer, %vector.ph ], [ %30, %vector.body ]
  %vec.phi6 = phi <4 x i64> [ zeroinitializer, %vector.ph ], [ %31, %vector.body ]
  %9 = phi i64 [ 1, %vector.ph ], [ %10, %vector.body ]
  %10 = add i64 %9, 16
    ; Source line: 702
  %11 = add i64 %9, -1
  %12 = getelementptr i8, i8* %8, i64 %11
  %13 = bitcast i8* %12 to <4 x i8>*
  %wide.load = load <4 x i8>, <4 x i8>* %13, align 1
  %14 = getelementptr i8, i8* %12, i64 4
  %15 = bitcast i8* %14 to <4 x i8>*
  %wide.load10 = load <4 x i8>, <4 x i8>* %15, align 1
  %16 = getelementptr i8, i8* %12, i64 8
  %17 = bitcast i8* %16 to <4 x i8>*
  %wide.load11 = load <4 x i8>, <4 x i8>* %17, align 1
  %18 = getelementptr i8, i8* %12, i64 12
  %19 = bitcast i8* %18 to <4 x i8>*
  %wide.load12 = load <4 x i8>, <4 x i8>* %19, align 1
  %20 = zext <4 x i8> %wide.load to <4 x i64>
  %21 = zext <4 x i8> %wide.load10 to <4 x i64>
  %22 = zext <4 x i8> %wide.load11 to <4 x i64>
  %23 = zext <4 x i8> %wide.load12 to <4 x i64>
  %24 = and <4 x i64> %20, <i64 1, i64 1, i64 1, i64 1>
  %25 = and <4 x i64> %21, <i64 1, i64 1, i64 1, i64 1>
  %26 = and <4 x i64> %22, <i64 1, i64 1, i64 1, i64 1>
  %27 = and <4 x i64> %23, <i64 1, i64 1, i64 1, i64 1>
  %28 = add <4 x i64> %24, %vec.phi
  %29 = add <4 x i64> %25, %vec.phi4
  %30 = add <4 x i64> %26, %vec.phi5
  %31 = add <4 x i64> %27, %vec.phi6
    ; Source line: 701
  %index.next = add i64 %index, 16
  %32 = icmp eq i64 %index.next, %n.vec
  br i1 %32, label %middle.block, label %vector.body

middle.block:                                     ; preds = %vector.body
    ; Source line: 702
  %bin.rdx = add <4 x i64> %29, %28
  %bin.rdx13 = add <4 x i64> %30, %bin.rdx
  %bin.rdx14 = add <4 x i64> %31, %bin.rdx13
  %rdx.shuf = shufflevector <4 x i64> %bin.rdx14, <4 x i64> undef, <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
  %bin.rdx15 = add <4 x i64> %bin.rdx14, %rdx.shuf
  %rdx.shuf16 = shufflevector <4 x i64> %bin.rdx15, <4 x i64> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
  %bin.rdx17 = add <4 x i64> %bin.rdx15, %rdx.shuf16
  %33 = extractelement <4 x i64> %bin.rdx17, i32 0
  %cmp.n = icmp eq i64 %5, %n.vec
    ; Source line: 701
  br i1 %cmp.n, label %L26.loopexit, label %scalar.ph

scalar.ph:                                        ; preds = %middle.block, %min.iters.checked, %if.lr.ph
  %bc.resume.val = phi i64 [ %ind.end, %middle.block ], [ 1, %if.lr.ph ], [ 1, %min.iters.checked ]
  %bc.merge.rdx = phi i64 [ %33, %middle.block ], [ 0, %if.lr.ph ], [ 0, %min.iters.checked ]
  br label %if

if:                                               ; preds = %scalar.ph, %if
  %n.03 = phi i64 [ %bc.merge.rdx, %scalar.ph ], [ %40, %if ]
  %"#temp#.02" = phi i64 [ %bc.resume.val, %scalar.ph ], [ %34, %if ]
  %34 = add i64 %"#temp#.02", 1
    ; Source line: 702
  %35 = add i64 %"#temp#.02", -1
  %36 = getelementptr i8, i8* %8, i64 %35
  %37 = load i8, i8* %36, align 1
  %38 = zext i8 %37 to i64
  %39 = and i64 %38, 1
  %40 = add i64 %39, %n.03
    ; Source line: 701
  %41 = icmp eq i64 %"#temp#.02", %5
  br i1 %41, label %L26.loopexit, label %if

L26.loopexit:                                     ; preds = %middle.block, %if
  %.lcssa = phi i64 [ %40, %if ], [ %33, %middle.block ]
    ; Source line: 363
  br label %L26

L26:                                              ; preds = %L26.loopexit, %top
  %n.0.lcssa = phi i64 [ 0, %top ], [ %.lcssa, %L26.loopexit ]
  ret i64 %n.0.lcssa
}
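
The output above repeats a `; Filename:` or `; Source line:` comment only when the value differs from the previous instruction. A minimal stdlib-only sketch of that deduplication follows. It does not use the LLVM API and is not the code from this PR; the `Instr` struct and `annotate` function are hypothetical names introduced for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one printed instruction plus its debug location.
struct Instr {
    std::string text; // the instruction as it would be printed
    std::string file; // source file name, empty if unknown
    unsigned line;    // source line, 0 if unknown
};

// Print each instruction, preceded by "; Filename:" / "; Source line:"
// comments only when the value changed since the previous instruction.
std::string annotate(const std::vector<Instr> &instrs)
{
    std::ostringstream out;
    std::string last_file;
    unsigned last_line = 0;
    for (const Instr &i : instrs) {
        if (!i.file.empty() && i.file != last_file) {
            out << "    ; Filename: " << i.file << "\n";
            last_file = i.file;
            last_line = 0; // force the line to be reprinted for a new file
        }
        if (i.line != 0 && i.line != last_line) {
            out << "    ; Source line: " << i.line << "\n";
            last_line = i.line;
        }
        out << "  " << i.text << "\n";
    }
    return out.str();
}
```

Feeding it two instructions at `abstractarray.jl:66` followed by one at `reduce.jl:701` would print the filename and line once for the first pair, then both again for the third instruction.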

src/disasm.cpp Outdated
#include "debuginfo.h"

class LineNumberAnnotatedWriter : public AssemblyAnnotationWriter {
DILocation *InstrLoc = nullptr;
Contributor

4 space indent

src/disasm.cpp Outdated
@@ -494,4 +721,15 @@ void jl_dump_asm_internal(uintptr_t Fptr, size_t Fsize, int64_t slide,
DisInfo.createSymbols();
}
}

extern "C" JL_DLLEXPORT
LLVMDisasmContextRef jl_LLVMCreateDisasm(const char *TripleName, void *DisInfo, int TagType, LLVMOpInfoCallback GetOpInfo, LLVMSymbolLookupCallback SymbolLookUp)
Contributor

line length

src/disasm.cpp Outdated

void addSubprogram(const Function *F, DISubprogram *SP)
{
Subprogram[F] = SP;
Contributor

4 space indent

Supplements the printing of LLVM IR with source location information
@vtjnash
Member Author

vtjnash commented Jul 19, 2017

I may have gone slightly overboard in fixing Tony's latest review comment, and now print out all available line info. The purpose of this is so that we can see whether it looks right when making changes to the debugging info, and to help with matching up source code with emitted instructions:

julia> code_llvm(sum, (Vector{Bool},))

; Function sum
; Location: reduce.jl
define i64 @julia_sum_61779(%jl_value_t addrspace(10)* dereferenceable(40)) #0 {
top:
; Location: reduce.jl:363
; Function countnz; {
; Location: reduce.jl:725
; Function count; {
; Location: reduce.jl:701
; Function eachindex; {
; Location: abstractarray.jl:763
; Function indices1; {
; Location: abstractarray.jl:73
; Function indices; {
; Location: abstractarray.jl:66
  %1 = addrspacecast %jl_value_t addrspace(10)* %0 to %jl_value_t addrspace(11)*
  %2 = bitcast %jl_value_t addrspace(11)* %1 to %jl_value_t addrspace(10)* addrspace(11)*
  %3 = getelementptr %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)* addrspace(11)* %2, i64 3
  %4 = bitcast %jl_value_t addrspace(10)* addrspace(11)* %3 to i64 addrspace(11)*
  %5 = load i64, i64 addrspace(11)* %4, align 8
;}}}
  %6 = icmp slt i64 %5, 1
  br i1 %6, label %L26, label %if.lr.ph

if.lr.ph:                                         ; preds = %top
; Location: reduce.jl:702
  %7 = bitcast %jl_value_t addrspace(11)* %1 to i8* addrspace(11)*
  %8 = load i8*, i8* addrspace(11)* %7, align 8
; Location: reduce.jl:701
  %min.iters.check = icmp ult i64 %5, 4
  br i1 %min.iters.check, label %scalar.ph, label %min.iters.checked

min.iters.checked:                                ; preds = %if.lr.ph
  %n.vec = and i64 %5, -4
  %cmp.zero = icmp eq i64 %n.vec, 0
  %ind.end = or i64 %n.vec, 1
  br i1 %cmp.zero, label %scalar.ph, label %vector.ph

vector.ph:                                        ; preds = %min.iters.checked
  br label %vector.body

vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %vec.phi = phi <2 x i64> [ zeroinitializer, %vector.ph ], [ %20, %vector.body ]
  %vec.phi4 = phi <2 x i64> [ zeroinitializer, %vector.ph ], [ %21, %vector.body ]
  %9 = phi i64 [ 1, %vector.ph ], [ %10, %vector.body ]
  %10 = add i64 %9, 4
; Location: reduce.jl:702
  %11 = add i64 %9, -1
  %12 = getelementptr i8, i8* %8, i64 %11
  %13 = bitcast i8* %12 to <2 x i8>*
  %wide.load = load <2 x i8>, <2 x i8>* %13, align 1
  %14 = getelementptr i8, i8* %12, i64 2
  %15 = bitcast i8* %14 to <2 x i8>*
  %wide.load6 = load <2 x i8>, <2 x i8>* %15, align 1
  %16 = zext <2 x i8> %wide.load to <2 x i64>
  %17 = zext <2 x i8> %wide.load6 to <2 x i64>
  %18 = and <2 x i64> %16, <i64 1, i64 1>
  %19 = and <2 x i64> %17, <i64 1, i64 1>
  %20 = add <2 x i64> %18, %vec.phi
  %21 = add <2 x i64> %19, %vec.phi4
; Location: reduce.jl:701
  %index.next = add i64 %index, 4
  %22 = icmp eq i64 %index.next, %n.vec
  br i1 %22, label %middle.block, label %vector.body

middle.block:                                     ; preds = %vector.body
; Location: reduce.jl:702
  %bin.rdx = add <2 x i64> %21, %20
  %rdx.shuf = shufflevector <2 x i64> %bin.rdx, <2 x i64> undef, <2 x i32> <i32 1, i32 undef>
  %bin.rdx7 = add <2 x i64> %bin.rdx, %rdx.shuf
  %23 = extractelement <2 x i64> %bin.rdx7, i32 0
  %cmp.n = icmp eq i64 %5, %n.vec
; Location: reduce.jl:701
  br i1 %cmp.n, label %L26.loopexit, label %scalar.ph

scalar.ph:                                        ; preds = %middle.block, %min.iters.checked, %if.lr.ph
  %bc.resume.val = phi i64 [ %ind.end, %middle.block ], [ 1, %if.lr.ph ], [ 1, %min.iters.checked ]
  %bc.merge.rdx = phi i64 [ %23, %middle.block ], [ 0, %if.lr.ph ], [ 0, %min.iters.checked ]
  br label %if

if:                                               ; preds = %scalar.ph, %if
  %n.03 = phi i64 [ %bc.merge.rdx, %scalar.ph ], [ %30, %if ]
  %"#temp#.02" = phi i64 [ %bc.resume.val, %scalar.ph ], [ %24, %if ]
  %24 = add i64 %"#temp#.02", 1
; Location: reduce.jl:702
  %25 = add i64 %"#temp#.02", -1
  %26 = getelementptr i8, i8* %8, i64 %25
  %27 = load i8, i8* %26, align 1
  %28 = zext i8 %27 to i64
  %29 = and i64 %28, 1
  %30 = add i64 %29, %n.03
; Location: reduce.jl:701
  %31 = icmp eq i64 %"#temp#.02", %5
  br i1 %31, label %L26.loopexit, label %if

L26.loopexit:                                     ; preds = %middle.block, %if
  %.lcssa = phi i64 [ %30, %if ], [ %23, %middle.block ]
;}}
  br label %L26

L26:                                              ; preds = %L26.loopexit, %top
  %n.0.lcssa = phi i64 [ 0, %top ], [ %.lcssa, %L26.loopexit ]
  ret i64 %n.0.lcssa
}

julia> code_native(sum, (Vector{Bool},))
	.text
; Function <invalid> {
; Location: reduce.jl
	pushq	%rbp
	movq	%rsp, %rbp
;}
; Function sum {
; Location: reduce.jl:363
; Function countnz; {
; Location: reduce.jl:725
; Function count; {
; Location: reduce.jl:701
; Function eachindex; {
; Location: abstractarray.jl:763
; Function indices1; {
; Location: abstractarray.jl:73
; Function indices; {
; Location: abstractarray.jl:66
	movq	24(%rdi), %rcx
	xorl	%eax, %eax
;}}}
	testq	%rcx, %rcx
	jle	L193
; Location: reduce.jl:702
	movq	(%rdi), %r8
	xorl	%eax, %eax
	movl	$1, %esi
; Location: reduce.jl:701
	cmpq	$4, %rcx
	jb	L156
	movq	%rcx, %rdi
	andq	$-4, %rdi
	xorl	%eax, %eax
	movq	%rcx, %rdx
	andq	$-4, %rdx
	movl	$1, %esi
	je	L156
	movq	%rdi, %rsi
	orq	$1, %rsi
	leaq	2(%r8), %rax
	pxor	%xmm0, %xmm0
	movabsq	$140293241959392, %rdi  # imm = 0x7F9890D9F3E0
; Location: reduce.jl:702
	movdqa	(%rdi), %xmm2
	movq	%rdx, %rdi
	pxor	%xmm1, %xmm1
	nop
L96:
	pmovzxbq	-2(%rax), %xmm3 # xmm3 = mem[0],zero,zero,zero,zero,zero,zero,zero,mem[1],zero,zero,zero,zero,zero,zero,zero
	pmovzxbq	(%rax), %xmm4   # xmm4 = mem[0],zero,zero,zero,zero,zero,zero,zero,mem[1],zero,zero,zero,zero,zero,zero,zero
	pand	%xmm2, %xmm3
	pand	%xmm2, %xmm4
	paddq	%xmm3, %xmm0
	paddq	%xmm4, %xmm1
; Location: reduce.jl:701
	addq	$4, %rax
	addq	$-4, %rdi
	jne	L96
; Location: reduce.jl:702
	paddq	%xmm0, %xmm1
	pshufd	$78, %xmm1, %xmm0       # xmm0 = xmm1[2,3,0,1]
	paddq	%xmm1, %xmm0
	movd	%xmm0, %rax
	cmpq	%rdx, %rcx
; Location: reduce.jl:701
	je	L193
L156:
	incq	%rcx
	subq	%rsi, %rcx
	leaq	-1(%r8,%rsi), %rdx
	nopw	(%rax,%rax)
; Location: reduce.jl:702
L176:
	movzbl	(%rdx), %esi
	andl	$1, %esi
	addq	%rsi, %rax
; Location: reduce.jl:701
	incq	%rdx
	decq	%rcx
	jne	L176
;}}
L193:
	popq	%rbp
	retq
	nopw	%cs:(%rax,%rax)
;}
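
The nested `; Function ...; {` and `;}}}` comments in the listings above can be thought of as a diff between the inlining chain of one instruction and that of the previous one. The sketch below illustrates that idea with the standard library only. It is an assumption about how such a printer could work, not the PR's `DILineInfoPrinter`; `Frame`, `ScopePrinter`, `update`, and `finish` are hypothetical names, and unlike the real output it opens the outermost function the same way as inlined ones.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of an instruction's inlining chain.
struct Frame {
    std::string func;
    std::string file;
    unsigned line;
};

// Emits scope comments by comparing each inlining chain (outermost frame
// first) with the chain seen for the previous instruction.
class ScopePrinter {
    std::vector<Frame> prev;

    static bool same(const Frame &a, const Frame &b)
    {
        return a.func == b.func && a.file == b.file && a.line == b.line;
    }

public:
    std::string update(const std::vector<Frame> &cur)
    {
        std::ostringstream out;
        // Frames that are identical in both chains stay open.
        std::size_t keep = 0;
        while (keep < prev.size() && keep < cur.size() &&
               same(prev[keep], cur[keep]))
            ++keep;
        // Same function but a new position: keep the scope, report the move.
        bool moved = keep < prev.size() && keep < cur.size() &&
                     prev[keep].func == cur[keep].func;
        if (moved)
            ++keep;
        // Close every frame that is no longer active.
        if (prev.size() > keep)
            out << ";" << std::string(prev.size() - keep, '}') << "\n";
        if (moved)
            out << "; Location: " << cur[keep - 1].file << ":"
                << cur[keep - 1].line << "\n";
        // Open every frame that is new in this chain.
        for (std::size_t i = keep; i < cur.size(); ++i) {
            out << "; Function " << cur[i].func << "; {\n";
            out << "; Location: " << cur[i].file << ":" << cur[i].line << "\n";
        }
        prev = cur;
        return out.str();
    }

    // Close whatever is still open at the end of the function.
    std::string finish()
    {
        std::string out;
        if (!prev.empty())
            out = ";" + std::string(prev.size(), '}') + "\n";
        prev.clear();
        return out;
    }
};
```

With the six-frame chain from `sum` down to `indices`, followed by the three-frame chain ending in `count`, `update` returns `;}}}` for the second call. Changing only the innermost line to 702 then yields a single `; Location: reduce.jl:702` line.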

@tkelman
Contributor

tkelman commented Jul 19, 2017

Can we have a test for that new feature?

@vtjnash
Member Author

vtjnash commented Jul 20, 2017

We don't have a good way to unit test these, and the output depends on LLVM optimizations and upstream representations, so we've been reluctant to commit too much to the exact representation this function returns (see the code under the `# code_native / code_llvm (issue #8239)` comment). Especially since this is intended only for interactive use, the goal is to show information on a best-effort basis rather than to guarantee that it is exactly right.

LinePrinter.emit_finish(Out);
}

// print an llvm IR acquired from jl_get_llvmf
Member

get instead of print?

Member

You shouldn't approve the PR if you request changes... 😉

Member Author

I think print (as in -to-string) is sufficiently clear as a verb, in context here.

// print an llvm IR acquired from jl_get_llvmf
// warning: this takes ownership of, and destroys, f->getParent()
extern "C" JL_DLLEXPORT
const jl_value_t *jl_dump_function_ir(void *f, bool strip_ir_metadata, bool dump_module)
Member

maybe confine the NFC code movements to a separate commit next time, this is kinda annoying to review with GitHub's diff view.

return 0;
}

// print a native disassembly for the function starting at fptr
Member

idem

@maleadt
Member

maleadt commented Jul 24, 2017

Huh, how did I miss emitInstructionAnnot.. This is a much better approach!

I don't like the current DILineInfoPrinter format though: it is pretty verbose and the braces are hard to follow. Maybe add some indentation, structure characters, and/or frame numbers, e.g.:

define i64 @julia_sum_61779(%jl_value_t addrspace(10)* dereferenceable(40)) #0 {
top:
; frame 0 (sum @ reduce.jl:363)
; └frame 1 (countnz @ reduce.jl:725)
;  └frame 2 (count @ reduce.jl:701)
;   └frame 3 (eachindex @ abstractarray.jl:763)
;    └frame 4 (indices1 @ abstractarray.jl:73)
;     └frame 5 (indices @ abstractarray.jl:66)
  %1 = addrspacecast %jl_value_t addrspace(10)* %0 to %jl_value_t addrspace(11)*
  %2 = bitcast %jl_value_t addrspace(11)* %1 to %jl_value_t addrspace(10)* addrspace(11)*
  %3 = getelementptr %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)* addrspace(11)* %2, i64 3
  %4 = bitcast %jl_value_t addrspace(10)* addrspace(11)* %3 to i64 addrspace(11)*
  %5 = load i64, i64 addrspace(11)* %4, align 8
;     └end frame 5 (indices)
;    └end frame 4 (indices1)
;   └end frame 3 (eachindex)
  %6 = icmp slt i64 %5, 1
  br i1 %6, label %L26, label %if.lr.ph


if.lr.ph:                                         ; preds = %top
;   reduce.jl:702 (frame 2)
  %7 = bitcast %jl_value_t addrspace(11)* %1 to i8* addrspace(11)*
  %8 = load i8*, i8* addrspace(11)* %7, align 8
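
The opening lines of the frame-numbered layout proposed above could be produced along these lines. This is a stdlib-only sketch of the suggestion, not code from the PR; `Frame` and `render_frames` are hypothetical names, and the closing `end frame` lines are left out for brevity.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of an inlining chain.
struct Frame {
    std::string func;
    std::string file;
    unsigned line;
};

// Render a chain (outermost frame first) as numbered, indented comment
// lines, one per frame, with a corner character marking each nested frame.
std::string render_frames(const std::vector<Frame> &chain)
{
    // UTF-8 encoding of U+2514 (box drawings light up and right).
    static const char *const corner = "\xE2\x94\x94";
    std::ostringstream out;
    for (std::size_t i = 0; i < chain.size(); ++i) {
        out << "; ";
        if (i > 0)
            out << std::string(i - 1, ' ') << corner;
        out << "frame " << i << " (" << chain[i].func << " @ "
            << chain[i].file << ":" << chain[i].line << ")\n";
    }
    return out.str();
}
```

Each nesting level adds one column of indentation, so the depth of an instruction's inlining chain can be read from the comment's left margin instead of by counting braces.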

@vtjnash
Member Author

vtjnash commented Jul 24, 2017

That looks nice, although we don't have the nice pretty syntax coloring 😛

OK to merge, and encourage further experimentation in future PRs?

@vtjnash vtjnash merged commit a2d375f into master Jul 27, 2017
@vtjnash vtjnash deleted the jn/asmwriter_di branch July 27, 2017 05:03
4 participants