Seg Fault in compile #9475

Closed
samuela opened this issue Dec 27, 2014 · 16 comments

@samuela
Contributor

samuela commented Dec 27, 2014

I have a script that looks something like this:

function test_calc_margs_and_logZ(Ntests = 5)
    function _brute(xl::AbstractMatrix, θ::AbstractMatrix, γ::AbstractMatrix, D::Int, K::Int)
        Tl = size(xl, 2)

        joint_states = Array(Any, Tl)
        push!(joint_states, (3, 4))

        # Iterate through all possible y sequences                                                                 
        @eval (@nloops $Tl _t ((_) -> 1:$D) begin
            yl = @ntuple $Tl _t
            stuff = (yl, $_brute_unnorm_logprob($xl, yl, $θ, $γ))
            println(stuff)
            $joint_states[prod(yl)] = stuff
#            push!($joint_states, 4)                                                                               
        end)

        joint_states
    end

    for i = 1:Ntests
        K = 5
        D = 2
        T = 3
        xl = randn(K, T)
        θ = randn(D, D)
        γ = randn(D, K)
        println(_brute(xl, θ, γ, D, K))
    end
end

test_calc_margs_and_logZ()

Attempting to run gives me:

signal (11): Segmentation fault
copy_ast at /gpfs/home/skainswo/julia/src/ast.c:695
copy_ast at /gpfs/home/skainswo/julia/src/ast.c:732
copy_ast at /gpfs/home/skainswo/julia/src/ast.c:751
copy_ast at /gpfs/home/skainswo/julia/src/ast.c:751
copy_ast at /gpfs/home/skainswo/julia/src/ast.c:743
jl_prepare_ast at /gpfs/home/skainswo/julia/src/ast.c:874
typeinf at ./inference.jl:1298
jlcall_typeinf_4804 at /gpfs/home/skainswo/julia/usr/bin/../lib/julia/sys.so (unknown line)
jl_apply_generic at /gpfs/home/skainswo/julia/src/gf.c:1411
typeinf_ext at ./inference.jl:1216
jl_apply_generic at /gpfs/home/skainswo/julia/src/gf.c:1411
jl_type_infer at /gpfs/home/skainswo/julia/src/gf.c:394
jl_toplevel_eval_flex at /gpfs/home/skainswo/julia/src/toplevel.c:508
jl_f_top_eval at /gpfs/home/skainswo/julia/src/builtins.c:399
_brute at /gpfs/home/skainswo/242chalearn/julia/crf_test.jl:31
jlcall__brute_19808 at  (unknown line)
jl_apply at /gpfs/home/skainswo/julia/src/gf.c:1431
test_calc_margs_and_logZ at /gpfs/home/skainswo/242chalearn/julia/crf_test.jl:52
test_calc_margs_and_logZ at /gpfs/home/skainswo/242chalearn/julia/crf_test.jl:24
jlcall_test_calc_margs_and_logZ_19804 at  (unknown line)
jl_apply at /gpfs/home/skainswo/julia/src/gf.c:1431
jl_apply at /gpfs/home/skainswo/julia/src/interpreter.c:66
eval at /gpfs/home/skainswo/julia/src/interpreter.c:207
jl_toplevel_eval_flex at /gpfs/home/skainswo/julia/src/toplevel.c:498
jl_parse_eval_all at /gpfs/home/skainswo/julia/src/toplevel.c:544
jl_load at /gpfs/home/skainswo/julia/src/toplevel.c:580
include at ./boot.jl:245
jl_apply_generic at /gpfs/home/skainswo/julia/src/gf.c:1411
include_from_node1 at loading.jl:128
jl_apply at /gpfs/home/skainswo/julia/src/gf.c:1431
process_options at ./client.jl:285
_start at ./client.jl:354
jlcall__start_17227 at /gpfs/home/skainswo/julia/usr/bin/../lib/julia/sys.so (unknown line)
jl_apply_generic at /gpfs/home/skainswo/julia/src/gf.c:1411
unknown function (ip: 4203497)
julia_trampoline at /gpfs/home/skainswo/julia/src/init.c:1027
unknown function (ip: 4201121)
__libc_start_main at /lib64/libc.so.6 (unknown line)
unknown function (ip: 4200793)
Segmentation fault (core dumped)
@samuela
Contributor Author

samuela commented Dec 27, 2014

Also, if anyone knows how I can fill the joint_states array from within the @nloops block, let me know!

@timholy
Sponsor Member

timholy commented Dec 28, 2014

I don't think you can do this, because the Cartesian macros run at parse time (e.g., @nloops expands code to generate the requested number of loops), but Tl is only known at runtime. You'd have to use the Dict trick to cache the generated function (see https://github.com/timholy/Cartesian.jl#supplying-the-dimensionality-from-functions).

But I haven't looked into the segfault.
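
A minimal sketch of the caching idea mentioned above, assuming current Julia (1.x) syntax rather than the 0.3 syntax used in this thread. It generates the loop nest once per depth N with @eval, stores the resulting function in a Dict, and reuses it on later calls. The names GENERATED, sequence_lister, and list_sequences are hypothetical and are not taken from Cartesian.jl.

```julia
using Base.Cartesian

# Hypothetical cache: one generated function per loop depth N.
const GENERATED = Dict{Int,Function}()

# Return (and cache) a function that lists all N-tuples over 1:D.
function sequence_lister(N::Int)
    get!(GENERATED, N) do
        # $N is interpolated first, so @nloops sees a literal depth.
        @eval function (D::Int)
            out = NTuple{$N,Int}[]
            @nloops $N t (d -> 1:D) begin
                push!(out, (@ntuple $N t))
            end
            return out
        end
    end
end

# invokelatest lets a freshly generated function be called right away.
list_sequences(N::Int, D::Int) = Base.invokelatest(sequence_lister(N), D)
```

With this layout the @eval cost is paid once per distinct N, so repeated calls such as list_sequences(3, 2) reuse the cached function instead of recompiling.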

@samuela
Contributor Author

samuela commented Dec 28, 2014

Right, that's why I'm using @eval. At the very least, I can run

@eval (@nloops $Tl _t ((_) -> 1:$D) begin
    yl = @ntuple $Tl _t
    stuff = (yl, $_brute_unnorm_logprob($xl, yl, $θ, $γ))
    println(stuff)
end)

without any issue and it works as expected. What's the difference between println and any other side-effect-ful function like setindex!?

@timholy
Sponsor Member

timholy commented Dec 28, 2014

You're right, I missed the @eval. So it looks valid, but not recommended. (Do you realize it will recompile every time you call this?) To debug, I'd recommend splitting out the inner function and commenting out lines until you find a minimal case.

If you're on julia 0.4, you should be able to do this without any macro-fu with the eachindex/IndexIterator/CartesianIndex infrastructure.
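
As a rough illustration of the macro-free route, here is a sketch that assumes current Julia (1.x), where the iterator is called CartesianIndices; the 0.4-era names mentioned above differed. The function name all_sequences is hypothetical.

```julia
# Enumerate every length-Tl sequence over the alphabet 1:D without macros.
function all_sequences(D::Int, Tl::Int)
    # A Tl-dimensional index box with D entries along each axis.
    box = CartesianIndices(ntuple(_ -> D, Tl))
    # Convert each CartesianIndex to a plain tuple and flatten to a vector.
    return vec([Tuple(I) for I in box])
end

for yl in all_sequences(2, 3)
    println(yl)
end
```

Because Tl is a runtime value here, the result is not type-stable, but nothing is compiled per call beyond the usual specialization.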

@samuela
Contributor Author

samuela commented Dec 29, 2014

Yeah it's far from ideal. I'm not using 0.4 yet so I ended up just using something along the lines of

map(x -> 1 + digits(x, D, Tl), 0:(D^Tl - 1))

I figured I should at least report the segfault though.
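
For readers on a current Julia (1.x), the same digit-counting workaround might look like the sketch below. It assumes the keyword form of digits (base and pad), which replaced the positional arguments used above, and it requires D >= 2. The name sequences_via_digits is hypothetical.

```julia
# Treat each integer 0:(D^Tl - 1) as a Tl-digit number in base D;
# adding 1 shifts the digits from 0:(D-1) to the labels 1:D.
sequences_via_digits(D::Int, Tl::Int) =
    [digits(x, base = D, pad = Tl) .+ 1 for x in 0:(D^Tl - 1)]

for yl in sequences_via_digits(2, 3)
    println(yl)
end
```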

@samuela
Contributor Author

samuela commented Dec 29, 2014

Ok here's a minimal test case

using Base.Cartesian

arr = Array(Any, 1)

@eval @eval $arr[1] = 1
@eval @eval $arr = 1

Either of the @eval lines will produce the same segfault.

@timholy
Sponsor Member

timholy commented Dec 30, 2014

This is very helpful, and many thanks for reporting it.

Interestingly, this segfaults:

arr = Array(Any, 1)
@eval @eval $arr[1] = 1

but this does not:

arr = Array(Any, 1)
@eval $arr[1] = 1
@eval @eval $arr[1] = 1

@ihnorton
Member

The cause is trying to copy an undefined array element:

(gdb) f 0
#0  jl_copy_ast (expr=0x0) at ast.c:772
772     if (jl_is_expr(expr)) {
(gdb) f 1
#1  0x00007ffff7993cd9 in jl_copy_ast (expr=0x55a5970) at ast.c:795
795             jl_cellset(na, i, jl_copy_ast(jl_cellref(a,i)));
(gdb) p jl_(expr)
Array{Any, 1}[#<null>]

A mildly unsatisfying fix:

diff --git a/src/ast.c b/src/ast.c
index e1ae91e..767d9bd 100644
--- a/src/ast.c
+++ b/src/ast.c
@@ -769,7 +769,10 @@ static jl_value_t *copy_ast(jl_value_t *expr, jl_tuple_t *sp, int do_sp)

 DLLEXPORT jl_value_t *jl_copy_ast(jl_value_t *expr)
 {
-    if (jl_is_expr(expr)) {
+    if (expr == NULL) {
+        return NULL;
+    }
+    else if (jl_is_expr(expr)) {
         jl_expr_t *e = (jl_expr_t*)expr;
         size_t i, l = jl_array_len(e->args);
         jl_expr_t *ne = NULL;
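
The undefined element behind the NULL pointer can be observed from the Julia side. The sketch below assumes current Julia (1.x) syntax, where Array(Any, 1) is written Vector{Any}(undef, 1), and uses a hypothetical helper, copy_skipping_undef, to mirror the idea of the patch (skip unset slots instead of dereferencing them). It is an illustration, not the code in ast.c.

```julia
# A freshly allocated Vector{Any} holds unassigned (null) slots.
arr = Vector{Any}(undef, 1)
println(isassigned(arr, 1))   # the slot is still unset here

# Hypothetical helper: copy only the slots that have been assigned,
# leaving the others unset, analogous to returning NULL for a NULL input.
function copy_skipping_undef(a::Vector{Any})
    out = Vector{Any}(undef, length(a))
    for i in eachindex(a)
        if isassigned(a, i)
            out[i] = a[i]
        end
    end
    return out
end

arr[1] = 1                    # after assignment the slot is defined
println(isassigned(arr, 1))
```

This also matches the observation earlier in the thread that running `@eval $arr[1] = 1` first avoids the crash: by then the array no longer contains an unset element to copy.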

@timholy
Sponsor Member

timholy commented Dec 30, 2014

Why do you need the 2nd jl_is_expr(expr)?

@ihnorton
Member

@timholy not sure I follow? (the patch would move that condition to else if and check for NULL first)

@vtjnash
Sponsor Member

vtjnash commented Dec 30, 2014

@ihnorton i don't see anything wrong with that. checking for NULL is a normal conditional, and it's currently missing here

@ihnorton
Member

@vtjnash is it possible that expr could be some non-NULL but still invalid value? (due to 9147)

@timholy
Sponsor Member

timholy commented Dec 30, 2014

@ihnorton, I read a - as a +. I'll try to find something to clean the gunk off my monitor (and maybe my glasses, too). Carry on 😄.

@vtjnash
Sponsor Member

vtjnash commented Dec 30, 2014

@ihnorton no, julia never leaves a jl_value_t* field uninitialized (it is always set to NULL). an isbits field could be uninitialized, but this code shouldn't come across any unboxed data.
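
The boxed versus isbits distinction can be seen from Julia with an incompletely initialized object. This is a sketch assuming current Julia (1.x) syntax; the type Holder is hypothetical.

```julia
# Hypothetical type with one boxed field and one isbits field,
# constructed without setting either.
mutable struct Holder
    boxed::Any   # stored as a pointer; starts out unset (null)
    raw::Int     # stored inline; contents are arbitrary until assigned
    Holder() = new()
end

h = Holder()
println(isdefined(h, :boxed))  # the boxed field reports as undefined
println(isdefined(h, :raw))    # isbits fields always report as defined

# Reading the unset boxed field raises an error rather than crashing.
caught = try
    h.boxed
    false
catch err
    err isa UndefRefError
end
println(caught)
```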

@ihnorton
Member

@vtjnash thanks. I'll check this in then if CI passes (just pushed a branch).

@tkelman
Contributor

tkelman commented Dec 30, 2014

@ihnorton I don't know if there's a good way of telling only Travis but not AppVeyor to skip a build. Right now I have AppVeyor configured to not build branches other than master and release-0.3, but if you want it to build a personal test branch you can add the branch name to the whitelist in appveyor.yml of that branch.
