bugfix for regex matches ending with non-ASCII #26831

Merged 1 commit on Apr 19, 2018

Conversation

stevengj
Member

Fixes #26829, fixes #26199.

To backport to 0.6 (or earlier versions … this bug has been around for a while), replace `endof` with `sizeof` in the corresponding line of `regex.jl`.

@stevengj added the labels domain:unicode (Related to unicode characters and encodings), kind:bugfix (This change fixes an existing bug), and backport pending 0.6 on Apr 17, 2018
@stevengj
Member Author

Wow, green CI.

@stevengj
Member Author

Looks like the bug was introduced in Julia 0.2, actually, since 892188b replaced `m.offset + length(m.match.data)` with `prev_match.offset + endof(prev_match.match)`.
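
For readers unfamiliar with the distinction, here is a rough model of the offset arithmetic, written in Python for illustration and not taken from `regex.jl`. The helpers `endof`, `sizeof`, and `is_char_boundary` are hypothetical stand-ins that mimic what the Julia 0.6 functions of the same name return for a UTF-8 string, and the sample text and match are made up.

```python
def endof(s: str) -> int:
    """Stand-in for Julia 0.6 `endof`: 1-based byte index where the LAST character starts."""
    data = s.encode("utf-8")
    if not data:
        return 0
    return len(data) - len(s[-1].encode("utf-8")) + 1


def sizeof(s: str) -> int:
    """Stand-in for Julia `sizeof` on a string: total number of bytes."""
    return len(s.encode("utf-8"))


def is_char_boundary(text: str, index: int) -> bool:
    """True if the 1-based byte `index` starts a character (or is one past the end)."""
    data = text.encode("utf-8")
    if index == len(data) + 1:
        return True
    # UTF-8 continuation bytes look like 0b10xxxxxx.
    return (data[index - 1] & 0xC0) != 0x80


text = "éa"      # 'é' occupies bytes 1-2, 'a' is byte 3
match = "é"      # assumed previous match, ending in a non-ASCII character
offset = 1       # 1-based byte index where that match starts

buggy_next = offset + endof(match)    # points into the middle of 'é'
fixed_next = offset + sizeof(match)   # points at 'a', just past the match

print(buggy_next, is_char_boundary(text, buggy_next))
print(fixed_next, is_char_boundary(text, fixed_next))
```

When the previous match ends in a multi-byte character, the `endof`-based offset falls on a continuation byte inside that character, while the `sizeof`-based offset falls on the first byte after the match. For ASCII-only matches the two agree, which is presumably why the bug went unnoticed for so long.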

@martinholters
Member

backport pending 0.2?

@stevengj
Member Author

Okay to merge?

Sponsor Member

@StefanKarpinski StefanKarpinski left a comment


Hard to object—it fixes a clear bug.

@StefanKarpinski StefanKarpinski merged commit aed8a84 into JuliaLang:master Apr 19, 2018
@stevengj stevengj deleted the uniregex branch April 19, 2018 12:06
mbauman added a commit that referenced this pull request Apr 19, 2018
* origin/master: (22 commits)
  separate `isbitstype(::Type)` from `isbits` (#26850)
  bugfix for regex matches ending with non-ASCII (#26831)
  [NewOptimizer] track inbounds state as a per-statement flag
  change default LOAD_PATH and DEPOT_PATH (#26804, fix #25709)
  Change url scheme to https (#26835)
  [NewOptimizer] inlining: Refactor todo object
  inference: enable CodeInfo method_for_inference_limit_heuristics support (#26822)
  [NewOptimizer] Fix _apply elision (#26821)
  add test case from issue #26607, cfunction with no args (#26838)
  add `do` in front-end deparser. fixes #17781 (#26840)
  Preserve CallInst metadata in LateLowerGCFrame pass.
  Improve differences from R documentation (#26810)
  reserve syntax that could be used for computed field types (#18466) (#26816)
  Add support for Atomic{Bool} (Fix #26542). (#26597)
  Remove argument restriction on dims2string and inds2string (#26799) (#26817)
  remove some unnecessary `eltype` methods (#26791)
  optimize: ensure merge_value_ssa doesn't drop PiNodes
  inference: improve tmerge for Conditional and Const
  ensure more iterators stay type-stable
  code loading docs (#26787)
  ...
ararslan pushed a commit that referenced this pull request Apr 26, 2018
ararslan pushed a commit that referenced this pull request Apr 26, 2018
ararslan pushed a commit that referenced this pull request Apr 27, 2018
ararslan pushed a commit that referenced this pull request May 8, 2018
Labels
domain:unicode Related to unicode characters and encodings kind:bugfix This change fixes an existing bug
Development

Successfully merging this pull request may close these issues.

Issue with eachmatch using unicode and regex pipes
Issue with eachmatch
4 participants