Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix imaging value for generating sysimages for multiple targets #48809

Merged
merged 4 commits into from
Apr 12, 2023

Conversation

ven-k
Copy link
Member

@ven-k ven-k commented Feb 27, 2023

Fixes imaging_default function, so that sysimages for multiple architectures can be built without setting pkgimages=yes

Current issue:

For versions >1.9-dev,
For a use-case like:

julia> sysimage= unsafe_string(Base.JLOptions().image_file)

julia>  run(`./julia
               --sysimage=$sysimage
               --cpu-target='generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)'
               --output-o=mock.o mock.jl
               --pkgimages=no`)

fatal: error thrown and no exception handler available.
ErrorException("More than one command line CPU targets specified without a `--output-` flag specified")

[30365] signal (11.1): Segmentation fault
in expression starting at none:0

This is because, with pkgimages=no, even if output-o is passed, the imaging_default returns false. This in-turn causes issues while generating sysimages for multiple cpu-targets

After this PR, these work:

  1. Sysimages with multiple targets and no pkgimages
julia> run(`julia --sysimage=$sysimage --cpu-target='generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)' --output-o=mock.o mock.jl --pkgimages=no`)
  1. Pkgimages=yes and multiple targets.
    From the docs, there should be a unified cache. It's unclear to me if the current change results an issue in that regard. An alternative would be changing to return jl_options.image_codegen || jl_generating_output().
julia> run(`julia --cpu-target='generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)' --pkgimages=yes`)

@vchuravy
Copy link
Sponsor Member

Could you add a test here?

@vtjnash vtjnash added the status:merge me PR is reviewed. Merge when all tests are passing label Feb 27, 2023
@@ -597,6 +597,34 @@ let exename = `$(Base.julia_cmd()) --startup-file=no --color=no`
"Int(Base.JLOptions().fast_math)"`)) == JL_OPTIONS_FAST_MATH_DEFAULT
end

# --pkgimages with multiple cpu targets
@testset let v = readchomperrors(`$exename
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably needs a guard for x86_64 or a different list for supported architectures.

end

@test success(`$exename
--cpu-target='generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)'
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay this should be propagated to the target process.

cpu_target = unsafe_string(opts.cpu_target)

As should JULIA_CPU_TARGET.

cpu_target = get(ENV, "JULIA_CPU_TARGET", nothing)

@@ -597,6 +597,34 @@ let exename = `$(Base.julia_cmd()) --startup-file=no --color=no`
"Int(Base.JLOptions().fast_math)"`)) == JL_OPTIONS_FAST_MATH_DEFAULT
end

# --pkgimages with multiple cpu targets
@testset let v = readchomperrors(`$exename
--cpu-target='generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)'
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
--cpu-target='generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)'
--cpu-target='generic;generic'

--cpu-target='generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)'
--output-o=$object_file $outputo_file
--pkgimages=no`)
end
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test sounds slow, and should not be included without further reconsideration of performance

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I've noticed that this test adds ~160s. It is too much time for 1 single test.
So, for easy dropping, I've added it in a dedicated commit.

I couldn't think of any other simpler test for output-o, which is faster, though.

Copy link
Member Author

@ven-k ven-k Feb 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vtjnash How about this: we intentionally don't pass the sysimage arg. When it fails we know everything before that passed, including our case. This takes only 0.05s

# Sysimage with multiple cpu targets
@testset "Sysimage for multiple microarchitectures" begin
    julia_path = joinpath(Sys.BINDIR, Base.julia_exename())
    outputo_file = tempname()
    write(outputo_file, "1")
    object_file = tempname() * ".o"

    # This is to test that even with `pkgimages=no`, we can create object file
    # with multiple cpu-targets
    # The cmd is checked for --object-o as soon as it is run. So, to avoid long
    # testing times, intentionally don't pass sysimage; when we reach the
    # corresponding error, we know that `check_cmdline` has already passed
    let v = readchomperrors(`$julia_path
        --cpu-target='native;native'
        --output-o=$object_file $outputo_file
        --pkgimages=no`)

        @test v[1] == false
        @test v[2] == ""

        err = first(split(v[3], "\n\n"))
        @test err != "fatal: error thrown and no exception handler available.\nErrorException(\"More than one command line CPU targets specified without a `--output-` flag specified\")"
        @test err == "ERROR: File \"boot.jl\" not found"
    end
end

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, though let's do @test !contains(err, "More than one command line CPU targets specified") so it will still work if other minor aspects change (though the second test for == will also catch those)

@DilumAluthge
Copy link
Member

This PR is marked as merge me, but it seems that there are still some unresolved conversations. So I'll remove the label.

@vtjnash or @vchuravy Can you add the merge me label back once this PR is ready?

@DilumAluthge DilumAluthge removed the status:merge me PR is reviewed. Merge when all tests are passing label Feb 28, 2023
@DilumAluthge
Copy link
Member

Also, looks like the PowerPC build is broken @vchuravy

@DilumAluthge DilumAluthge added the kind:bugfix This change fixes an existing bug label Feb 28, 2023
@KristofferC KristofferC mentioned this pull request Mar 3, 2023
50 tasks
@DilumAluthge
Copy link
Member

@vchuravy Do you want me to mark the PowerPC build as "allow fail"?

(The PowerPC tests are already marked as "allow fail", but currently the PowerPC build is marked as "regular".)

@vchuravy
Copy link
Sponsor Member

vchuravy commented Mar 5, 2023

No the PPC build is not allowed to fail.

Edit: I do not see how this change got impacted the architecture, but we should not downgrade the platform.

@KristofferC KristofferC mentioned this pull request Mar 7, 2023
52 tasks
@KristofferC KristofferC added this to the 1.9 milestone Mar 30, 2023
@DilumAluthge
Copy link
Member

It seems like this PR is a release blocker for 1.9. Are we willing to let a broken PowerPC build block this PR? The website says that PowerPC is a Tier 3 platform.

@vchuravy

This comment was marked as off-topic.

@gbaraldi

This comment was marked as off-topic.

@vchuravy

This comment was marked as off-topic.

@vtjnash

This comment was marked as off-topic.

@vchuravy
Copy link
Sponsor Member

That's a good observation. This PR also seems to change JIT compliation when --pkgimages=yes

 fence syncscope("singlethread") seq_cst                                                                                                                                                                                                                                                                                     
;  @ REPL[1]:1 within `my_cd`                                                                                                                                                                                                                                                                                                 
  %11 = load volatile i64, i64* %safepoint, align 8                                                                                                                                                                                                                                                                           
  fence syncscope("singlethread") seq_cst                                                                                                                                                                                                                                                                                     
;  @ REPL[1]:2 within `my_cd`                                                                                                                                                                                                                                                                                                 
; ┌ @ c.jl:234 within `unsafe_convert` @ pointer.jl:58                                                                                                                                                                                                                                                                        
   %12 = load {}**, {}*** bitcast ({}** @"jl_sym#.#2" to {}***), align 8                                                                                                                                                                                                                                                      
   %13 = getelementptr inbounds {}*, {}** %12, i64 3
   %14 = ptrtoint {}** %13 to i64
; └
  %15 = load atomic i32 (i64, i32)*, i32 (i64, i32)** bitcast (void ()** @jlplt_open_176_got to i32 (i64, i32)**) unordered, align 8
  %16 = call i32 %15(i64 %14, i32 0)

versus

  fence syncscope("singlethread") seq_cst                                                                                                                                                                                                                                                                                     
;  @ REPL[1]:1 within `my_cd`                                                                                                                                                                                                                                                                                                 
  %11 = load volatile i64, i64* %safepoint, align 8                                                                                                                                                                                                                                                                           
  fence syncscope("singlethread") seq_cst                                                                                                                                                                                                                                                                                     
;  @ REPL[1]:2 within `my_cd`                                                                                                                                                                                                                                                                                                 
  %12 = call i32 inttoptr (i64 140735334611648 to i32 (i64, i32)*)(i64 140735201562464, i32 0)                                                                                                                                                                                                                                
;  @ REPL[1]:4 within `my_cd`                                                                                                                                                                                                                                                                                                 
; ┌ @ c.jl:215 within `unsafe_convert` @ pointer.jl:60                                                                    

src/jitlayers.h Outdated Show resolved Hide resolved
@KristofferC KristofferC mentioned this pull request Apr 9, 2023
26 tasks
@staticfloat
Copy link
Sponsor Member

@ven-k can you please confirm that this PR fixes your original issue?

Copy link
Sponsor Member

@vchuravy vchuravy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, modulo checking that we are not using imaging mode in normal JIT mode.

@staticfloat
Copy link
Sponsor Member

LGTM, modulo checking that we are not using imaging mode in normal JIT mode.

I ran a check with @code_native of a few different functions (including ones that generated AVX instructions) with both --pkgimages=yes and --pkgimages=no. I couldn't find any meaningful difference, so I'm calling it good.

@staticfloat staticfloat merged commit fc4b079 into JuliaLang:master Apr 12, 2023
@ven-k
Copy link
Member Author

ven-k commented Apr 12, 2023

With this I can pass --cpu-targets and generate target-compatible sysimages.


Note that when it is added to env via JULIA_CPU_TARGET (Currently PackageCompiler does that for >=v1.9), on relocating I got a segfault. (But considering passing the --cpu-targets works, not a blocking issue for my use case)

./julia-fc4b079f69/bin/julia -J fc4b079f69_8.so

[5504] signal (7.2): Bus error
in expression starting at none:0
unknown function (ip: 0x7fb2ea3c75d3)
unknown function (ip: 0x7fb2ea3ab428)
unknown function (ip: 0x7fb2ea3ae61a)
unknown function (ip: 0x7fb2ea3b9d46)
_dl_catch_exception at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7fb2ea3b9609)
unknown function (ip: 0x7fb2ea39934b)
_dl_catch_exception at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_dl_catch_error at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7fb2ea399b58)
dlopen at /lib/x86_64-linux-gnu/libdl.so.2 (unknown line)
ijl_load_dynamic_library at /cache/build/default-amdci5-0/julialang/julia-master/src/dlload.c:355
ijl_preload_sysimg_so at /cache/build/default-amdci5-0/julialang/julia-master/src/staticdata.c:2641
_finish_julia_init at /cache/build/default-amdci5-0/julialang/julia-master/src/init.c:827
julia_init at /cache/build/default-amdci5-0/julialang/julia-master/src/init.c:819
jl_repl_entrypoint at /cache/build/default-amdci5-0/julialang/julia-master/src/jlapi.c:711
main at /cache/build/default-amdci5-0/julialang/julia-master/cli/loader_exe.c:58
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x401098)
Allocations: 0 (Pool: 0; Big: 0); GC: 0

[5504] signal (7.2): Bus error
in expression starting at none:0
unknown function (ip: 0x7fb2ea3c75d3)
unknown function (ip: 0x7fb2ea3ab428)
unknown function (ip: 0x7fb2ea3ae61a)
unknown function (ip: 0x7fb2ea3b9d46)
_dl_catch_exception at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7fb2ea3b9609)
unknown function (ip: 0x7fb2ea39934b)
_dl_catch_exception at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_dl_catch_error at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7fb2ea399b58)
dlopen at /lib/x86_64-linux-gnu/libdl.so.2 (unknown line)
ijl_load_dynamic_library at /cache/build/default-amdci5-0/julialang/julia-master/src/dlload.c:355
ijl_preload_sysimg_so at /cache/build/default-amdci5-0/julialang/julia-master/src/staticdata.c:2641
_finish_julia_init at /cache/build/default-amdci5-0/julialang/julia-master/src/init.c:827
julia_init at /cache/build/default-amdci5-0/julialang/julia-master/src/init.c:819
jl_repl_entrypoint at /cache/build/default-amdci5-0/julialang/julia-master/src/jlapi.c:711
main at /cache/build/default-amdci5-0/julialang/julia-master/cli/loader_exe.c:58
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x401098)
Allocations: 0 (Pool: 0; Big: 0); GC: 0
Bus error (core dumped)

@DilumAluthge
Copy link
Member

@vchuravy @staticfloat Would it be possible to make JULIA_CPU_TARGET support the same values as --cpu-targets?

KristofferC pushed a commit that referenced this pull request Apr 13, 2023
Fix imaging value for generating sysimages for multiple targets

(cherry picked from commit fc4b079)
@KristofferC KristofferC removed the backport 1.9 Change should be backported to release-1.9 label Apr 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bugfix This change fixes an existing bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants