Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

disallow methods of local functions in different blocks #15602

Open
wbhart opened this issue Mar 23, 2016 · 31 comments
Open

disallow methods of local functions in different blocks #15602

wbhart opened this issue Mar 23, 2016 · 31 comments
Assignees
Labels
compiler:lowering Syntax lowering (compiler front end, 2nd stage) kind:bug Indicates an unexpected problem or unintended behavior kind:minor change Marginal behavior change acceptable for a minor release needs docs Documentation for this change is required

Comments

@wbhart
Copy link

wbhart commented Mar 23, 2016

In Julia 0.5.0-dev+3171 the following code creates a bogus duplicate function definition warning.

function myfun(n::Int)
     if n < 3
        doit() = 3
     else
        doit() = 5
     end 
end
@JeffBezanson
Copy link
Sponsor Member

This isn't allowed any more; I'll have to come up with a better error message. Now you need to do

function myfun(n::Int)
     if n < 3
        doit = ()->3
     else
        doit = ()->5
     end 
end

@JeffBezanson JeffBezanson self-assigned this Mar 24, 2016
@JeffBezanson JeffBezanson added kind:bug Indicates an unexpected problem or unintended behavior compiler:lowering Syntax lowering (compiler front end, 2nd stage) labels Jun 16, 2016
@ChrisRackauckas
Copy link
Member

Is there a reason why this is no longer allowed?

@JeffBezanson
Copy link
Sponsor Member

Yes; this restriction makes it possible to hoist construction of f's method table to the top level. A clearer example is

function myfun()
    f() = 0
    if rand(Bool)
        f(x::Int) = 1
    end
    f(...)
end

Here, f would sometimes have one method and sometimes have two methods. That would require rebuilding f's method table on each call to myfun, which is too slow to be useful. The work-around is to use separate functions, as in my comment above, or

function myfun()
    f1() = 0
    f2() = f1()
    f2(x::Int) = 1
    f = rand(Bool) ? f2 : f1
    f(...)
end

@ChrisRackauckas
Copy link
Member

ChrisRackauckas commented Jun 16, 2016

Yeah, I found those workarounds. Thanks for explaining it. Now that I know there's a reason it's easier to remember.

I think that this should say something like Warning: functions cannot be conditionally defined. Last method definition is chosen for each dispatch. and then have a link to somewhere in documentation which elaborates more on it (I think there should be a whole page which elaborates on errors and warnings, especially for more obscure ones like this which are best explained via examples).

However, to make it a little more user friendly, my example

function g(a)
  if a
    f() = 2
  else
    f() = 3
  end
  return f
end
f = g(true)
f() # Returns 3

shouldn't error with g(false), but should just have the same answer as g(true) and give the same warning.

@ViralBShah
Copy link
Member

Perhaps this should be documented in NEWS.md as @josefsachsconning pointed out.

@josefsachsconning
Copy link
Contributor

Documentation pretty crucial, since this is a breaking change.

@josefsachsconning
Copy link
Contributor

Would it not be better for this to generate an error instead of a warning?

@StefanKarpinski
Copy link
Sponsor Member

Would it be possible to automatically do the renaming that the workaround implements?

@JeffBezanson
Copy link
Sponsor Member

It's iffy; the general case is very complicated. For example

function f()
    if a
        f(x::A) = ...
    end
    if b
        f(x::B) = ...
    end
    if c
        f(x::C) = ...
    end
    ...
end

describes 2^n different functions.

@JeffBezanson
Copy link
Sponsor Member

JeffBezanson commented Oct 3, 2016

Thinking about it some more, I could almost imagine treating definitions in different basic blocks like separate functions. In other words this:

function g(a)
  if a
    f() = 2
    f(x) = 4
  else
    f() = 3
  end
  return f
end

would return a function with two methods (returning 2 or 4) for g(true), and a function with one method (returning 3) for g(false). That would be backwards-compatible for cases like the OP. That very well might be the only viable option other than giving an error.

@ChrisRackauckas
Copy link
Member

ChrisRackauckas commented Oct 3, 2016

If you're treating definitions in different blocks like separate functions, wouldn't that break:

function g(a,b)
  if a
    f() = 2
  end
  if b
    f(x) = 4
  end
  f
end

as in, what would it return for g(true,true)? (Or am I misunderstanding it?)

@JeffBezanson
Copy link
Sponsor Member

That case is already broken. For now at least, our choices are to disallow it (since it doesn't do anything reasonable) or give it another meaning. Have you used that pattern? In my informal survey of packages during this change I don't think I found any uses of it, though there were several uses of the pattern in the OP.

For reference, my proposal would return a function with the one method f(x)=4 for g(true,true). It would be equivalent to

_f_1() = 2
_f_2(x) = 4
function g(a,b)
  if a
    f = _f_1
  end
  if b
    f = _f_2
  end
  f
end

@StefanKarpinski
Copy link
Sponsor Member

That seems extremely confusing. How about detecting if we're in a case that can be refactored like that while giving the expected behavior and error only if that's not the case? I.e. support the if/else case (which does occur) but raise an error in the if/if case (which seems not to).

@ChrisRackauckas
Copy link
Member

ChrisRackauckas commented Oct 3, 2016

@JeffBezanson I did have something like that before, i.e. adding dispatches conditionally. But I think one reason why you won't find any uses of it is because everything related to this issue has such non-intuitive behavior that any package is probably avoiding anything like this (for example, I changed some codes away from conditional definitions like this to all using anonymous function when I did the change from v0.4 to v0.5 a few months ago). I think that checking which packages are using the broken design pattern, seeing none are using it, and then declaring that it must not be wanted is kind of a self-fulfilling prophecy.

That said, I don't think it's a necessary design pattern. I've avoided it now for awhile and haven't needed it since. I like what @StefanKarpinski is suggesting: you know that the design pattern is going to have issues, so instead of letting odd behavior happen, throw an error. Otherwise it's one hell of a bug to debug. I'd rather be told to avoid it rather than try to make it work.

@JeffBezanson
Copy link
Sponsor Member

@ChrisRackauckas I did that survey while working on this change, before it was pushed to master, so it was 0.4 code only.

But yes, my default choice here would be to give an error (#15602 (comment)), I just thought I'd throw out another alternative. Allowing the else case is an interesting possibility, but of course breaks under simple rewrites like

if x
   ...
end
if !x
   ...
end

I figured it might be easier to reason about what is in separate blocks, vs. which blocks the compiler can tell are mutually exclusive.

@ChrisRackauckas
Copy link
Member

Got it. Thanks for the clarification.

@jessebett
Copy link
Contributor

Just ran into this for the first time and had a very difficult time debugging it (with print line debugging):

const a=true
function blah()
    if a==true
        println("here")
        f_() = 1
        return f_
    elseif a==false
        println("there")
        f_() = 2
        return f_
    end
end
blah()()

This function returns:

here
2

Which makes it appear that it is only evaluating the first conditional block, and never the second, yet the function definition in the second is returned.
Also, I had expected that since a is a const here it would be known during compile time and result in the correct conditional branch being compiled. Not the case.

@ChrisRackauckas
Copy link
Member

The one thing that gets me about this is that they are now functionally different in v1.0. With anonymous functions, you can add dispatches in the global scope using a fun little hack:

julia> f = (u) -> u
#3 (generic function with 1 method)

julia> (::typeof(f))(u,v) = u*v

julia> f(1,2)
2

but you cannot do so in a function scope:

function foo()
    f = (u) -> u
    (::typeof(f))(u,v) = u*v
    f
end

which incentivizes the use of non-anonymous functions in local scopes like in DiffEqFlux:

https://github.com/JuliaDiffEq/DiffEqFlux.jl/blob/master/src/Flux/neural_de.jl#L4-L5

while I would like to stick to the nice and safe anonymous functions here (and indeed @jessebett uncovered this issue DiffEqFlux examples), I don't think I can. There probably is some underlying deep compiler reason for it, but I do remember that this was allowed in v0.6 so I'm not sure what happened to disallow it in v1.0.

@JeffBezanson
Copy link
Sponsor Member

JeffBezanson commented Jul 6, 2019

Anonymous-ness does not make a difference here; the construct (::typeof(f))(x) = ... adds a globally-visible method and so isn't allowed in a local scope independent of whether f is an anonymous function. The way to think about it is that any anonymous function can be rewritten to a named function with a compiler-generated unique name. For example f = u -> u can be rewritten to

_f1337(u) = u
f = _f1337

That approach can be safely used to add multiple methods to an anonymous function. It's really just that there's no syntax for an anonymous function with multiple methods. One could write a macro for that though, e.g.

f = @multianon begin
    (u) -> u
    (u, v) -> 2u + v
end

With the DiffEq example, say you are tempted to write

function neural_ode(model,x,tspan,
                    args...;kwargs...)
  p = destructure(model)
  if condition
    dudt_(u::TrackedArray,p,t) = restructure(model,p)(u)
    dudt_(u::AbstractArray,p,t) = Flux.data(restructure(model,p)(u))
  else
    dudt_(u::TrackedArray,p,t) = something_else1
    dudt_(u::AbstractArray,p,t) = something_else2
  end
end

The correct way to write it is:

function neural_ode(model,x,tspan,
                    args...;kwargs...)
  p = destructure(model)
  dudt_1(u::TrackedArray,p,t) = restructure(model,p)(u)
  dudt_1(u::AbstractArray,p,t) = Flux.data(restructure(model,p)(u))
  dudt_2(u::TrackedArray,p,t) = something_else1
  dudt_2(u::AbstractArray,p,t) = something_else2
  if condition
    dudt_1
  else
    dudt_2
  end
end

@antoine-levitt
Copy link
Contributor

Setting aside the question of how to support this better, can the obviously broken case be made to error? The example above by @bkamins is particularly bad; no warning at all is delivered, the if false version gives a strange error (doit not defined), and the if true version an incorrect result.

@natschil
Copy link

natschil commented Feb 10, 2020

I agree with @antoine-levitt , the fact that code inside the else block of an if true construction can affect the outside scope is worthy of at least a warning. I just spent quite a lot of time trying trying to debug code due to this issue (as have other commenters here), this needs to be documented better.

@StefanKarpinski
Copy link
Sponsor Member

Agree. It would be great if someone took on figuring out how to make this warn in the problematic cases, or better still an error since it's unlikely to be doing the right thing in any code doing this.

@daviehh
Copy link
Contributor

daviehh commented Jun 25, 2020

Recently ran into this as well... how about making the default option to warn such overwrites, i.e. change this to 1 or JL_OPTIONS_WARN_OVERWRITE_ON

0, // method overwrite warning

then add a way to programmatically suppress the warning/error message when the overwrite is intentional, such as Revise.jl? e.g. a macro like @overwrite. Currently, the warning messages are printed in c

julia/src/gf.c

Lines 1497 to 1510 in 52c55d7

jl_printf(s, "WARNING: Method definition ");
jl_static_show_func_sig(s, (jl_value_t*)newentry->sig);
jl_printf(s, " in module %s", jl_symbol_name(oldmod->name));
print_func_loc(s, oldvalue);
jl_printf(s, " overwritten");
if (oldmod != newmod)
jl_printf(s, " in module %s", jl_symbol_name(newmod->name));
print_func_loc(s, method);
jl_printf(s, ".\n");
jl_uv_flush(s);
}
if (jl_options.incremental && jl_generating_output())
jl_printf(JL_STDERR, " ** incremental compilation may be fatally broken for this module **\n\n");
}

One possible way to implement is by adding a function to flip the warning setting,

JL_DLLEXPORT void jl_options_flip_warn_overwrite(void)
{
    jl_options.warn_overwrite ^= 1;
}

then in julia, the warning can be turned on/off via ccall(:jl_options_flip_warn_overwrite, Cvoid, ()). There should be more elegant solutions...

vtjnash added a commit that referenced this issue Jul 10, 2020
Always show the warning for anonymous functions, but update the verbiage
to give additional information. This warning can still be avoided by
explicitly calling delete_method first.

fixes #32635
fixes #35140
refs #15602
vtjnash added a commit that referenced this issue Jul 10, 2020
Always show the warning for anonymous functions, but update the verbiage
to give additional information. This warning can still be avoided by
explicitly calling delete_method first.

fixes #32635
fixes #35140
refs #15602
StefanKarpinski pushed a commit that referenced this issue Jul 10, 2020
Always show the warning for anonymous functions, but update the verbiage
to give additional information. This warning can still be avoided by
explicitly calling delete_method first.

fixes #32635
fixes #35140
refs #15602
simeonschaub pushed a commit to simeonschaub/julia that referenced this issue Aug 11, 2020
Always show the warning for anonymous functions, but update the verbiage
to give additional information. This warning can still be avoided by
explicitly calling delete_method first.

fixes JuliaLang#32635
fixes JuliaLang#35140
refs JuliaLang#15602
@andyferris
Copy link
Member

I ran into this issue today. I would love to see this error out at lowering time.

@c42f
Copy link
Member

c42f commented Feb 3, 2021

It looks like this was improved in Julia 1.6 — #36609 gives a warning now.

I made this to an error in #39498 if we think that's compatible enough to merge into Julia 1.7.

@StefanKarpinski
Copy link
Sponsor Member

What is the issue preventing a more comprehensive error here? Do we just not know the precise conditions under which we should allow / disallow this kind of construct? What about disallowing method definitions for an inner function unless they are all in the same basic block?

@iago-lito
Copy link

As a new user, I've stumbled accross one case here that I find very dangerous, because Julia produced definite undesired behaviour with no warning (i.e. evaluating g(A, 5) to 25 instead of 10).

@bkamins
Copy link
Member

bkamins commented Oct 9, 2023

Could someone please comment on the plans related to this issue? Thank you! (I am asking because the problem is often raised by new Julia users)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:lowering Syntax lowering (compiler front end, 2nd stage) kind:bug Indicates an unexpected problem or unintended behavior kind:minor change Marginal behavior change acceptable for a minor release needs docs Documentation for this change is required
Projects
None yet
Development

No branches or pull requests