-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simple/naive CodeInfo validation pass #22938
Conversation
Perhaps |
base/codeinfovalidation.jl
Outdated
# This file is a part of Julia. License is MIT: https://julialang.org/license | ||
|
||
const VALID_EXPR_HEADS = Symbol[:call, :invoke, :static_parameter, :line, :gotoifnot, :(=), | ||
:method, :const, :null, :new, :return, :the_exception, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another possible rule: method
is only valid in top-level code (i.e. inside a function with no arguments).
base/codeinfovalidation.jl
Outdated
""" | ||
function validate_code_info(c::CodeInfo) | ||
!(c.inferred) && (c.slottypes != nothing) && return 5 | ||
(length(c.slotnames) < 1 || c.slotnames[1] != Symbol("#self#")) && return 8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first slot can have a different name for code like (f::FuncType)(...) = ...
.
base/codeinfovalidation.jl
Outdated
nslots = length(slotnums) | ||
nssavals = length(ssavals) | ||
length(c.slotflags) != nslots && return 1 | ||
length(c.slotnames) != nslots && return 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These checks are too strict. I don't think every slot needs to be referenced. But should check length(c.slotflags) == length(c.slotnames)
.
base/codeinfovalidation.jl
Outdated
elseif x.head == :(=) && !(is_valid_lhs(x.args[1])) | ||
error_code = 9 | ||
return true | ||
elseif x.head == :call && !(all(is_valid_call_arg(i) for i in x.args[2:end])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_valid_call_arg
should also be checked on the argument to :gotoifnot
exprs.
base/codeinfovalidation.jl
Outdated
|
||
is_valid_lhs(lhs) = isa(lhs, SlotNumber) || isa(lhs, SSAValue) || isa(lhs, GlobalRef) | ||
|
||
is_valid_call_arg(arg) = !(isa(arg, Expr)) || arg.head != :gotoifnot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many more things are invalid here, e.g. LineNumberNode, LabelNode, GotoNode, various other expr heads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this was just a start. I'll add the ones you listed.
Maybe someday we can just replace this function with isa(arg, SSAValue) || isa(arg, SlotNumber)
😛
Cool! This will be nice to have. I think there should be an additional method for this that accepts a |
base/codeinfovalidation.jl
Outdated
error_code = 0 | ||
walkast(c.code) do x | ||
if isa(x, Expr) | ||
if !(in(x.head, VALID_EXPR_HEADS)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super minor point, but parentheses aren't required after !
when the condition isn't a compound expression of some kind. For consistency with the rest of Base it might be nice to just use !in(x.head, VALID_EXPR_HEADS)
and likewise elsewhere in here.
I believe the latest commits implement all of the comments here besides Biggest change so far was @vtjnash's suggestion. Now, |
Alright, added some tests that pass locally (let's see how Travis fares...). Still have these questions from the OP:
|
Anything left to be done here besides hooking up a call to this thing somewhere (presumably somewhere in inference, see my question above)? Tests pass locally, but I'll rebase to see if we can't get the CI badges green... |
base/codevalidation.jl
Outdated
InvalidCodeError(errno, msg) = InvalidCodeError(errno, msg, nothing) | ||
|
||
""" | ||
validate_code!(errors::Vector{>:InvalidCodeError}, c::CodeInfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused, why is this a vector of any supertype of InvalidCodeError?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the element type can be "anything", as long as I can push objects of type InvalidCodeError
into it...at least that was my intent, did I screw something up?
base/codevalidation.jl
Outdated
""" | ||
function validate_code!(errors::Vector{>:InvalidCodeError}, m::Method) | ||
if length(m.sig.parameters) != m.nargs | ||
push!(errors, InvalidCodeError(11, "number of types in method signature does not match number of arguments", (length(m.sig.parameters), m.nargs))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrap long lines
base/codevalidation.jl
Outdated
push!(errors, InvalidCodeError(7, "not all SSAValues in AST have a type in ssavaluetypes", missing)) | ||
end | ||
else | ||
if c.slottypes != nothing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!==
test/inference.jl
Outdated
|
||
# InvalidCodeError 13: encountered Expr head `:method` in non-top-level code | ||
|
||
# TODO: This is a tough case to test an isolation... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in isolation
base/codevalidation.jl
Outdated
- `h != :method` for any subexpression head `h` if `m.nargs > 0` | ||
""" | ||
function validate_code!(errors::Vector{>:InvalidCodeError}, m::Method) | ||
if length(m.sig.parameters) != m.nargs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need Base.unwrap_unionall(m.sig).parameters
.
base/codevalidation.jl
Outdated
msg = "number of types in method signature does not match number of arguments" | ||
push!(errors, InvalidCodeError(11, msg, (length(m.sig.parameters), m.nargs))) | ||
end | ||
if m.isva != (last(m.sig.parameters) <: Vararg{Any}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use Base.isvatuple
. Vararg
is not a first-class type and this check will become an error eventually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two tests aren't quite correct. If isva
is true, at most you can say about length(unwrap_unionall(m.sig.parameters))
is that it is >= m.nargs - 1
. Otherwise, it is true that they must be exactly equal.
base/codevalidation.jl
Outdated
if !in(x.head, VALID_EXPR_HEADS) | ||
push!(errors, InvalidCodeError(1, "encountered invalid expression head", x.head)) | ||
elseif x.head == :(=) && !is_valid_lhs(x.args[1]) | ||
push!(errors, InvalidCodeError(2, "encountered invalid LHS value", x.args[1])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_valid_call_arg
can be renamed is_valid_rvalue
, and also used to check the RHS of assignments here.
base/codevalidation.jl
Outdated
- all assigned-to slots have bit flag 2 set in their respective slotflags | ||
""" | ||
function validate_code!(errors::Vector{>:InvalidCodeError}, c::CodeInfo) | ||
ssavals = SSAValue[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should be a Set
?
I don't feel a strong need to call this anywhere by default (except maybe in a debug build?). More stuff that could potentially slow down the compiler further is the last thing we need. |
base/codevalidation.jl
Outdated
push!(errors, InvalidCodeError(1, "encountered invalid expression head", x.head)) | ||
elseif x.head == :(=) && !is_valid_lhs(x.args[1]) | ||
push!(errors, InvalidCodeError(2, "encountered invalid LHS value", x.args[1])) | ||
elseif x.head == :call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recurse on invoke here too
base/codevalidation.jl
Outdated
msg = "number of types in method signature does not match number of arguments" | ||
push!(errors, InvalidCodeError(11, msg, (length(m.sig.parameters), m.nargs))) | ||
end | ||
if m.isva != (last(m.sig.parameters) <: Vararg{Any}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two tests aren't quite correct. If isva
is true, at most you can say about length(unwrap_unionall(m.sig.parameters))
is that it is >= m.nargs - 1
. Otherwise, it is true that they must be exactly equal.
Okay, I did a refactor to implement the suggestions here (thanks for the review!) and some suggestions Jameson gave me offline. The tests are only half-updated to reflect the new changes; I'll update them/add more tests tomorrow (and probably move them to their own file).
I'm down to add that; what's the right way to do so? |
Took a stab at enabling the validation pass for debug builds. If enabled, it just prints any encountered invalidities as warnings; we can switch to throwing errors later if we want. Do we test debug builds as part of CI? Also, I updated the tests so that they pass locally and moved them to their own file. |
base/codevalidation.jl
Outdated
# This file is a part of Julia. License is MIT: https://julialang.org/license | ||
|
||
# Expr head => argument count bounds | ||
const VALID_EXPR_HEADS = Pair{Symbol,UnitRange{Int}}[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a Dict
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It originally was, before I realized that Dict
isn't defined yet. Core.Inference
is a strange, strange world 😛
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right. ObjectIdDict
should work though.
Finally got around to running the full test suite on the debug build. After fixing some bugs, the only It seems like a bunch of these warnings were triggered by tests like this one: # issue #15809 --- TODO: this code should be disallowed
function f15809()
global g15809
g15809(x::T) where {T} = T
end
f15809()
@test g15809(2) === Int Given the TODO there, I'm assuming the validator was actually correct? Might be good for somebody to give that constraint one last check. Otherwise, I think this is good to go. |
Given that tests are passing and it's been a few days, I'll plan on merging this tomorrow (barring anybody finding any other issues). |
This PR implements
validate_code_info(c::CodeInfo)
, a barebones IR validation pass that returns0
ifc
is valid or a positive integer error code otherwise. I can change the return convention to whatever we want; returning integers was just the easiest way to get started.The validation pass is in its own file, which is
include
d fromBase.Core.Inference
. Is inference the right location for this?I'm also not sure where exactly
validate_code_info
should be called.ref #22440