-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix :noshift construction of an empty SubString #51923
Conversation
Also, unless I was doing something wrong it appears that |
@@ -37,7 +37,7 @@ struct SubString{T<:AbstractString} <: AbstractString | |||
return new(s, i-1, nextind(s,j)-i) | |||
end | |||
function SubString{T}(s::T, i::Int, j::Int, ::Val{:noshift}) where T<:AbstractString |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole method feels kind of weird... Is there any benchmark or anything to show the nextind
call is significant which I guess is the reason for this to exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method isn't about avoiding the overhead, but being able to construct a SubString
with the exact same offset
and ncodeunits
as another substring. This was added as part of AnnotatedStrings because it's very useful for e.g. taking regex matches on the underlying string and "lifting" them to the AnnotatedString
without calling Expr(:new, ...)
to avoid the SubString inner constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should be some internal thing to AnnotatedStrings (that maybe uses Expr(:new)
) then because it feels weird to me to have a constructor that is undocumented and doesn't do what the docs of the constructor says that it does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just something like
julia> Base.SubString{String}("fαr", 1, 1)
"f"
julia> Base.SubString{String}("fαr", 1, 1, Val{:noshift}())
"Error showing value of type SubString{String}:
ERROR: BoundsError: attempt to access 1-codeunit SubString{String} at index [3]
Stacktrace:
[1] checkbounds
@ ./strings/basic.jl:214 [inlined]
[2] iterate
@ ./strings/substring.jl:92 [inlined]
[3] popfirst!
@ ./iterators.jl:1492 [inlined]
[4] iterate (repeats 2 times)
@ ./iterators.jl:1503 [inlined]
[5] escape_string(io::IOContext{Base.TTY}, s::SubString{String}, esc::Tuple{Char, Char}; keep::Tuple{})
@ Base ./strings/io.jl:430
[6] escape_string
@ ./strings/io.jl:428 [inlined]
[7] print_quoted
@ ./strings/io.jl:462 [inlined]
[8] show
@ ./strings/io.jl:197 [inlined]
feels non ideal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially had eval(Expr(:new, ...
but was told by someone (I forget if it was Stefan, Jameson, or someone else) that it would be preferable to have a variant inner constructor.
My 2c: I just need a way of directly contructing a substring without the index shifting, and I don't care how so much as that it's possible and that it works. We probably want to avoid eval
.
b406eeb
to
e97dd5a
Compare
Since this is a bugfix, would it be possible to merge this and redirect the design conversation into a separate issue/subsequent PR? I've just added some basic tests for the |
An empty constructed SubString is over the range 0:0. The :noshift constructor is intended to allow for copying the raw range of a SubString without modification. However, when trying to copy an empty SubString, the 0:0 range runs afoul of the boundscheck. This can be adjusted for by eliding the boundscheck when the range is 0:0.
It's nigh trivial, but why not have the test coverage.
e97dd5a
to
b164413
Compare
Regarding the "Merge when all tests are passing" part of the merge me label, I've just noticed the CI failures aren't unrelated here, but that there was a |
Confirmation: I can't see any sign of related failures in the failed CI tasks. |
I just re-triggered the failed job just to check |
This fixes an edge case in the recently introduced
:noshift
substring constructor.