Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple/naive CodeInfo validation pass #22938

Merged
merged 24 commits into from
Aug 11, 2017
Merged

Simple/naive CodeInfo validation pass #22938

merged 24 commits into from
Aug 11, 2017

Conversation

jrevels
Copy link
Member

@jrevels jrevels commented Jul 24, 2017

This PR implements validate_code_info(c::CodeInfo), a barebones IR validation pass that returns 0 if c is valid or a positive integer error code otherwise. I can change the return convention to whatever we want; returning integers was just the easiest way to get started.

The validation pass is in its own file, which is included from Base.Core.Inference. Is inference the right location for this?

I'm also not sure where exactly validate_code_info should be called.

ref #22440

@jrevels jrevels added the needs tests Unit tests are required for this change label Jul 24, 2017
@vtjnash
Copy link
Member

vtjnash commented Jul 24, 2017

I can change the return convention to whatever we want; returning integers was just the easiest way to get started.

Perhaps Vector{Exception}? (The IR is valid if isempty(errors).)

# This file is a part of Julia. License is MIT: https://julialang.org/license

const VALID_EXPR_HEADS = Symbol[:call, :invoke, :static_parameter, :line, :gotoifnot, :(=),
:method, :const, :null, :new, :return, :the_exception,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possible rule: method is only valid in top-level code (i.e. inside a function with no arguments).

"""
function validate_code_info(c::CodeInfo)
!(c.inferred) && (c.slottypes != nothing) && return 5
(length(c.slotnames) < 1 || c.slotnames[1] != Symbol("#self#")) && return 8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first slot can have a different name for code like (f::FuncType)(...) = ....

nslots = length(slotnums)
nssavals = length(ssavals)
length(c.slotflags) != nslots && return 1
length(c.slotnames) != nslots && return 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These checks are too strict. I don't think every slot needs to be referenced. But should check length(c.slotflags) == length(c.slotnames).

elseif x.head == :(=) && !(is_valid_lhs(x.args[1]))
error_code = 9
return true
elseif x.head == :call && !(all(is_valid_call_arg(i) for i in x.args[2:end]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_valid_call_arg should also be checked on the argument to :gotoifnot exprs.


is_valid_lhs(lhs) = isa(lhs, SlotNumber) || isa(lhs, SSAValue) || isa(lhs, GlobalRef)

is_valid_call_arg(arg) = !(isa(arg, Expr)) || arg.head != :gotoifnot
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many more things are invalid here, e.g. LineNumberNode, LabelNode, GotoNode, various other expr heads.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this was just a start. I'll add the ones you listed.

Maybe someday we can just replace this function with isa(arg, SSAValue) || isa(arg, SlotNumber) 😛

@JeffBezanson
Copy link
Member

Cool! This will be nice to have.

I think there should be an additional method for this that accepts a Method object, in order to validate that nargs and isva are consistent with the type signature and CodeInfo.

error_code = 0
walkast(c.code) do x
if isa(x, Expr)
if !(in(x.head, VALID_EXPR_HEADS))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super minor point, but parentheses aren't required after ! when the condition isn't a compound expression of some kind. For consistency with the rest of Base it might be nice to just use !in(x.head, VALID_EXPR_HEADS) and likewise elsewhere in here.

@jrevels
Copy link
Member Author

jrevels commented Jul 24, 2017

I believe the latest commits implement all of the comments here besides validate_code_info(::Method) and the :method head check (which I'm going to tackle next). Might be buggy since I haven't written tests yet.

Biggest change so far was @vtjnash's suggestion. Now, validate_code_info no longer returns immediately after finding a bug, but keeps chugging along, pushing uncovered bugs into a vector that gets returned at the end. Wasn't sure if performance is a big deal here or not; if so, CodeInfoError can be changed to store indices into a preallocated message table instead of taking in a string (assuming those allocations don't already get compiled away - I thought at one point they didn't, but that might be fixed by now).

@jrevels jrevels removed the needs tests Unit tests are required for this change label Jul 26, 2017
@jrevels
Copy link
Member Author

jrevels commented Jul 26, 2017

Alright, added some tests that pass locally (let's see how Travis fares...).

Still have these questions from the OP:

The validation pass is in its own file, which is included from Base.Core.Inference. Is inference the right location for this?

I'm also not sure where exactly validate_code_info should be called.

@jrevels
Copy link
Member Author

jrevels commented Jul 31, 2017

Anything left to be done here besides hooking up a call to this thing somewhere (presumably somewhere in inference, see my question above)?

Tests pass locally, but I'll rebase to see if we can't get the CI badges green...

InvalidCodeError(errno, msg) = InvalidCodeError(errno, msg, nothing)

"""
validate_code!(errors::Vector{>:InvalidCodeError}, c::CodeInfo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused, why is this a vector of any supertype of InvalidCodeError?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the element type can be "anything", as long as I can push objects of type InvalidCodeError into it...at least that was my intent, did I screw something up?

"""
function validate_code!(errors::Vector{>:InvalidCodeError}, m::Method)
if length(m.sig.parameters) != m.nargs
push!(errors, InvalidCodeError(11, "number of types in method signature does not match number of arguments", (length(m.sig.parameters), m.nargs)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrap long lines

push!(errors, InvalidCodeError(7, "not all SSAValues in AST have a type in ssavaluetypes", missing))
end
else
if c.slottypes != nothing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!==


# InvalidCodeError 13: encountered Expr head `:method` in non-top-level code

# TODO: This is a tough case to test an isolation...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in isolation

- `h != :method` for any subexpression head `h` if `m.nargs > 0`
"""
function validate_code!(errors::Vector{>:InvalidCodeError}, m::Method)
if length(m.sig.parameters) != m.nargs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need Base.unwrap_unionall(m.sig).parameters.

msg = "number of types in method signature does not match number of arguments"
push!(errors, InvalidCodeError(11, msg, (length(m.sig.parameters), m.nargs)))
end
if m.isva != (last(m.sig.parameters) <: Vararg{Any})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use Base.isvatuple. Vararg is not a first-class type and this check will become an error eventually.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two tests aren't quite correct. If isva is true, at most you can say about length(unwrap_unionall(m.sig.parameters)) is that it is >= m.nargs - 1. Otherwise, it is true that they must be exactly equal.

if !in(x.head, VALID_EXPR_HEADS)
push!(errors, InvalidCodeError(1, "encountered invalid expression head", x.head))
elseif x.head == :(=) && !is_valid_lhs(x.args[1])
push!(errors, InvalidCodeError(2, "encountered invalid LHS value", x.args[1]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_valid_call_arg can be renamed is_valid_rvalue, and also used to check the RHS of assignments here.

- all assigned-to slots have bit flag 2 set in their respective slotflags
"""
function validate_code!(errors::Vector{>:InvalidCodeError}, c::CodeInfo)
ssavals = SSAValue[]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should be a Set?

@JeffBezanson
Copy link
Member

I don't feel a strong need to call this anywhere by default (except maybe in a debug build?). More stuff that could potentially slow down the compiler further is the last thing we need.

push!(errors, InvalidCodeError(1, "encountered invalid expression head", x.head))
elseif x.head == :(=) && !is_valid_lhs(x.args[1])
push!(errors, InvalidCodeError(2, "encountered invalid LHS value", x.args[1]))
elseif x.head == :call
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recurse on invoke here too

msg = "number of types in method signature does not match number of arguments"
push!(errors, InvalidCodeError(11, msg, (length(m.sig.parameters), m.nargs)))
end
if m.isva != (last(m.sig.parameters) <: Vararg{Any})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two tests aren't quite correct. If isva is true, at most you can say about length(unwrap_unionall(m.sig.parameters)) is that it is >= m.nargs - 1. Otherwise, it is true that they must be exactly equal.

@jrevels
Copy link
Member Author

jrevels commented Aug 1, 2017

Okay, I did a refactor to implement the suggestions here (thanks for the review!) and some suggestions Jameson gave me offline. The tests are only half-updated to reflect the new changes; I'll update them/add more tests tomorrow (and probably move them to their own file).

except maybe in a debug build?

I'm down to add that; what's the right way to do so?

@jrevels
Copy link
Member Author

jrevels commented Aug 2, 2017

Took a stab at enabling the validation pass for debug builds. If enabled, it just prints any encountered invalidities as warnings; we can switch to throwing errors later if we want. Do we test debug builds as part of CI?

Also, I updated the tests so that they pass locally and moved them to their own file.

# This file is a part of Julia. License is MIT: https://julialang.org/license

# Expr head => argument count bounds
const VALID_EXPR_HEADS = Pair{Symbol,UnitRange{Int}}[
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a Dict?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It originally was, before I realized that Dict isn't defined yet. Core.Inference is a strange, strange world 😛

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right. ObjectIdDict should work though.

@jrevels
Copy link
Member Author

jrevels commented Aug 8, 2017

Finally got around to running the full test suite on the debug build. After fixing some bugs, the only InvalidCodeError warnings I got were all encountered Expr head :method in non-top-level code (i.e. nargs > 0).

It seems like a bunch of these warnings were triggered by tests like this one:

# issue #15809 --- TODO: this code should be disallowed
function f15809()
    global g15809
    g15809(x::T) where {T} = T
end
f15809()
@test g15809(2) === Int

Given the TODO there, I'm assuming the validator was actually correct? Might be good for somebody to give that constraint one last check.

Otherwise, I think this is good to go.

@jrevels
Copy link
Member Author

jrevels commented Aug 10, 2017

Given that tests are passing and it's been a few days, I'll plan on merging this tomorrow (barring anybody finding any other issues).

@jrevels jrevels merged commit 1fcc47c into master Aug 11, 2017
@jrevels jrevels deleted the jr/irvalidator branch August 11, 2017 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants