Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend stacktrace for failed tests not directly under testset #49451

Merged

Conversation

serenity4
Copy link
Contributor

@serenity4 serenity4 commented Apr 21, 2023

Issue

When tests are factored out into functions, the stacktrace for failed or errored @tests only includes the frames related to the macrocall inside that function, discarding frames higher up. Because information about the callsite of such testing functions is not preserved, it makes finding failing tests much harder as there is no line indicating where the function was called. This is especially inconvenient when these testing functions are called in multiple places.

Basic example:

using Test

function test_smallprime(x)
  @test x < 10
  @test x in (2, 3, 5, 7)
end

@testset "Tests" begin
  test_smallprime(7)
  test_smallprime(1)
  test_smallprime(10)
  test_smallprime(13)
end
Stacktrace on master

Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:5
Expression: x in (2, 3, 5, 7)
 Evaluated: 1 in (2, 3, 5, 7)

Stacktrace:
 [1] macro expansion
   @ ~/.asdf/installs/julia/nightly/share/julia/stdlib/v1.10/Test/src/Test.jl:478 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:5
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:4
  Expression: x < 10
   Evaluated: 10 < 10

Stacktrace:
 [1] macro expansion
   @ ~/.asdf/installs/julia/nightly/share/julia/stdlib/v1.10/Test/src/Test.jl:478 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:4
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:5
  Expression: x in (2, 3, 5, 7)
   Evaluated: 10 in (2, 3, 5, 7)

Stacktrace:
 [1] macro expansion
   @ ~/.asdf/installs/julia/nightly/share/julia/stdlib/v1.10/Test/src/Test.jl:478 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:5
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:4
  Expression: x < 10
   Evaluated: 13 < 10

Stacktrace:
 [1] macro expansion
   @ ~/.asdf/installs/julia/nightly/share/julia/stdlib/v1.10/Test/src/Test.jl:478 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:4
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:5
  Expression: x in (2, 3, 5, 7)
   Evaluated: 13 in (2, 3, 5, 7)

Stacktrace:
 [1] macro expansion
   @ ~/.asdf/installs/julia/nightly/share/julia/stdlib/v1.10/Test/src/Test.jl:478 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:5
Test Summary: | Pass  Fail  Total  Time
Tests         |    3     5      8  0.6s

As one can see, the stacktraces give no clue about where to look besides that testing function.

Stacktrace on this PR

Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:5
  Expression: x in (2, 3, 5, 7)
   Evaluated: 1 in (2, 3, 5, 7)

Stacktrace:
 [1] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:502 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:5
 [3] macro expansion
   @ ~/julia/src/stdlib/Test/example.jl:10 [inlined]
 [4] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1536 [inlined]
 [5] top-level scope
   @ ~/julia/src/stdlib/Test/example.jl:9
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:4
  Expression: x < 10
   Evaluated: 10 < 10

Stacktrace:
 [1] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:502 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:4
 [3] macro expansion
   @ ~/julia/src/stdlib/Test/example.jl:11 [inlined]
 [4] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1536 [inlined]
 [5] top-level scope
   @ ~/julia/src/stdlib/Test/example.jl:9
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:5
  Expression: x in (2, 3, 5, 7)
   Evaluated: 10 in (2, 3, 5, 7)

Stacktrace:
 [1] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:502 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:5
 [3] macro expansion
   @ ~/julia/src/stdlib/Test/example.jl:11 [inlined]
 [4] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1536 [inlined]
 [5] top-level scope
   @ ~/julia/src/stdlib/Test/example.jl:9
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:4
  Expression: x < 10
   Evaluated: 13 < 10

Stacktrace:
 [1] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:502 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:4
 [3] macro expansion
   @ ~/julia/src/stdlib/Test/example.jl:12 [inlined]
 [4] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1536 [inlined]
 [5] top-level scope
   @ ~/julia/src/stdlib/Test/example.jl:9
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:5
  Expression: x in (2, 3, 5, 7)
   Evaluated: 13 in (2, 3, 5, 7)

Stacktrace:
 [1] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:502 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:5
 [3] macro expansion
   @ ~/julia/src/stdlib/Test/example.jl:12 [inlined]
 [4] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1536 [inlined]
 [5] top-level scope
   @ ~/julia/src/stdlib/Test/example.jl:9
Test Summary: | Pass  Fail  Total  Time
Tests         |    3     5      8  0.1s
ERROR: LoadError: Some tests did not pass: 3 passed, 5 failed, 0 errored, 0 broken.
in expression starting at /home/serenity4/julia/src/stdlib/Test/example.jl:8

This is more informative. Note that we include 3 additional frames instead of just one; that is due to inlining which groups these 3 frames into one stacktrace index. This makes the stacktrace more verbose, so we will have to make sure that we only include more frames when necessary.

Proposed implementation

The solution implemented in this PR improves backtrace scrubbing such that if the source location of a @test macrocall is not directly in scope of an outer @testset macrocall, then all frames higher up until a testset is reached are included.

@serenity4 serenity4 changed the title Include frames between test and testset in stacktraces Extend stacktrace for failed tests not directly under testset Apr 21, 2023
Copy link
Sponsor Member

@timholy timholy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic work, @serenity4, this is a very welcome improvement.

This looks ready to me. One question, though: lookup is expensive, and to me it looks like it's possible to call lookup on the backtrace twice. If it's a problem (not sure it is), a potential solution is to cache the result. To determine if this matters, have you timed how long Julia's own test suite takes with this change vs the implementation currently on master?

stdlib/Test/test/runtests.jl Show resolved Hide resolved
@serenity4
Copy link
Contributor Author

A quick profiling indicates that most of the time of is indeed spent in lookup, and notably for Error test results which always scrub the backtrace (Fail results only do it when printing, if printing is enabled). I think that is enough to motivate a more parsimonious use of this function. I'll see if it is possible to restructure the code in a way that avoids duplicate calls, and if it gets ugly we could indeed use a simple cache.

@serenity4
Copy link
Contributor Author

I reduced a bunch of potentially duplicated lookups, but there were very few and I did not notice performance improvements in practice. Nevertheless, I think the implementation is slightly cleaner so I'd leave it with the refactors.

The slow part is actually looking up stackframes all the way until the @testset expansion, which is only performed once, and includes a bunch of C frames (I've seen cases range from ~8 to ~20 frames in-between @test and @testset). Before, we were only looking up stackframes until the @test macro expansion; going until @testset requires more work and results roughly in a 30% increase in evaluation time for a replica of the example introduced in this PR (8 failed tests + 1 errored test). On the bright side, the increase in time only affects failed and errored tests which are not directly under a @testset. Local tests confirm that performance remains unaffected otherwise.

I don't think we can really improve performance without trying to further optimize lookup, and given that the difference will not be observed in most cases I think it is a reasonable cost for the extra frames.

@timholy
Copy link
Sponsor Member

timholy commented Apr 24, 2023

LGTM. Just need to fix the mingw32 test errors and I think this is good to go.

@serenity4
Copy link
Contributor Author

Should be all good now.

Copy link
Sponsor Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

@timholy timholy added the status:merge me PR is reviewed. Merge when all tests are passing label Apr 28, 2023
@timholy timholy merged commit 5304baa into JuliaLang:master May 3, 2023
@serenity4 serenity4 deleted the serenity4/improve-test-stacktraces branch May 3, 2023 09:57
@serenity4
Copy link
Contributor Author

Thanks for the review @timholy and @vtjnash! And thanks to @Seelengrab who pointed me to the right direction :)

@Seelengrab
Copy link
Contributor

I think this is a great step towards making julia testsuites more modular, by making the stacktraces of failed cases easily readable :) Thanks for implementing this!

@aviatesk
Copy link
Sponsor Member

aviatesk commented May 4, 2023

I noticed this PR seems to break something like @testset "..." include("path/to/test/file").

MRE would be:

  1. echo "using Test; @test false" >! foo.jl
  2. julia -e 'using Test; @testset "foo" include("foo.jl")'

Before:

foo: Test Failed at /Users/aviatesk/julia/julia1/foo.jl:1
  Expression: false
Stacktrace:
 [1] top-level scope
   @ /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Test/src/Test.jl:464
 [2] include(fname::String)
   @ Base.MainInclude ./client.jl:476
 [3] top-level scope
   @ /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Test/src/Test.jl:1363
 [4] eval
   @ ./boot.jl:368 [inlined]
 [5] exec_options(opts::Base.JLOptions)
   @ Base ./client.jl:276
 [6] _start()
   @ Base ./client.jl:522
Test Summary: | Fail  Total  Time
foo           |    1      1  0.4s
ERROR: Some tests did not pass: 0 passed, 1 failed, 0 errored, 0 broken.

Now:

foo: Test Failed at /Users/aviatesk/julia/julia1/foo.jl:1
  Expression: false
foo: Error During Test at none:1
  Got exception outside of a @test
  LoadError: TypeError: in typeassert, expected Int64, got a value of type Nothing
  Stacktrace:
    [1] test_callsite(bt::Vector{Union{Ptr{Nothing}, Base.InterpreterIP}}, file_ts::String, file_t::String)
      @ Test ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:67
    [2] test_location(bt::Vector{Union{Ptr{Nothing}, Base.InterpreterIP}}, file_ts::String, file_t::String)
      @ Test ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:58 [inlined]
    [3] scrub_backtrace(bt::Vector{Union{Ptr{Nothing}, Base.InterpreterIP}}, file_ts::String, file_t::String)
      @ Test ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:93
    [4] record(ts::Test.DefaultTestSet, t::Union{Test.Error, Test.Fail}; print_result::Bool)
      @ Test ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1083
    [5] record(ts::Test.DefaultTestSet, t::Test.Error)
      @ Test ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1076 [inlined]
    [6] do_test(result::Test.ExecutionResult, orig_expr::Any)
      @ Test ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:703
    [7] top-level scope
      @ ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:511
    [8] include(fname::String)
      @ Base.MainInclude ./client.jl:478
    [9] top-level scope
      @ ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1545
   [10] eval(m::Module, e::Expr)
      @ Core ./boot.jl:378 [inlined]
   [11] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:280
   [12] _start()
      @ Base ./client.jl:541
  in expression starting at /Users/aviatesk/julia/julia1/foo.jl:1
Test Summary: | Error  Total  Time
foo           |     1      1  0.6s
ERROR: Some tests did not pass: 0 passed, 0 failed, 1 errored, 0 broken.

Note that we now have an internal error within Test.jl

@serenity4
Copy link
Contributor Author

serenity4 commented May 4, 2023

Thanks for reporting this, it seems like originally this case was also not well supported (see how it includes frames until _start()), and I may have been too optimistic about stacktraces being reasonably structured (note the absence of any macro expansion in the original stacktrace).

One quick fix would be to remove the type assertions and be more relaxed, to avoid errors from occurring.

For a better fix, we might want to take a look at why no macro expansions are present in the stacktrace in the first place, and then see how easily we could change that.

I'll take a deeper look this evening and make a PR with either of the fixes depending on how it goes.

@KristofferC
Copy link
Sponsor Member

Ref #44757

@serenity4
Copy link
Contributor Author

For a better fix, we might want to take a look at why no macro expansions are present in the stacktrace in the first place, and then see how easily we could change that.

From #35360 (comment) it seems like that is not going to be easy, so I decided to go with the simpler fix for now in #49633.

@DilumAluthge DilumAluthge removed the status:merge me PR is reviewed. Merge when all tests are passing label May 5, 2023
@vtjnash
Copy link
Sponsor Member

vtjnash commented May 25, 2023

After #44995

$ ./julia -e 'using Test; @testset "foo" include("../julia/foo.jl")'
foo: Test Failed at /data/vtjnash/julia/foo.jl:1
  Expression: false

Stacktrace:
 [1] macro expansion
   @ /data/vtjnash/julia/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:672 [inlined]
 [2] top-level scope
   @ /data/vtjnash/julia/foo.jl:508
 [3] include(fname::String)
   @ Base.MainInclude ./client.jl:478
 [4] macro expansion
   @ /data/vtjnash/julia/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1547 [inlined]
 [5] top-level scope
   @ none:1543
 [6] eval(m::Module, e::Expr)
   @ Core ./boot.jl:383 [inlined]
 [7] exec_options(opts::Base.JLOptions)
   @ Base ./client.jl:280
 [8] _start()
   @ Base ./client.jl:541
Test Summary: | Fail  Total  Time
foo           |    1      1  0.6s
ERROR: Some tests did not pass: 0 passed, 1 failed, 0 errored, 0 broken.

vs. before

$ ./julia -e 'using Test; @testset "foo" include("../julia/foo.jl")'
foo: Test Failed at /data/vtjnash/julia/foo.jl:1
  Expression: false

Stacktrace:
 [1] top-level scope
   @ /data/vtjnash/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:512
 [2] include(fname::String)
   @ Base.MainInclude ./client.jl:478
 [3] top-level scope
   @ /data/vtjnash/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1546
 [4] eval(m::Module, e::Expr)
   @ Core ./boot.jl:383 [inlined]
 [5] exec_options(opts::Base.JLOptions)
   @ Base ./client.jl:280
 [6] _start()
   @ Base ./client.jl:541
Test Summary: | Fail  Total  Time
foo           |    1      1  0.7s
ERROR: Some tests did not pass: 0 passed, 1 failed, 0 errored, 0 broken.

notice we now have the missing top-level scope frame for foo.jl, though still appear to struggle a bit with deriving the file name for the @testset (since it is just from -e) and appear to accidentally combine it with the line number from @testset, and then still print the whole stack back to _start

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants