
Compilation time grows quickly with size of functions #16434

Closed
toivoh opened this issue May 18, 2016 · 3 comments
Labels: performance Must go faster


toivoh commented May 18, 2016

Consider the following code,

```julia
function code_big_function(n::Int)
    code = [:( $(symbol("x$k")) = $(symbol("x$(k-1)")) + 1 ) for k=1:n]
    quote
        let
            function f(x0)
                $(code...)
                return $(symbol("x$n"))
            end
        end
    end
end

@show code_big_function(5)

ns = [10, 100, 1000, 2000, 5000]
ts = Float64[]

for n in ns
    code = code_big_function(n)
    f = eval(code)
    data = @timed f(0)
    push!(ts, data[2])
end

display(vcat(["n", "t", "t/n", "t/n^2"]',
             hcat(ns, ts, ts./ns, ts./ns.^2)))
```

which creates functions like

```julia
function f(x0)
    x1 = x0 + 1
    x2 = x1 + 1
    x3 = x2 + 1
    x4 = x3 + 1
    x5 = x4 + 1
    return x5
end
```

and records the compilation time as a function of the number of computations n.

The result I got was (with an 81 days old Julia 0.5)

     "n"    "t"         "t/n"        "t/n^2"  
   10.0    0.00408641  0.000408641  4.08641e-5
  100.0    0.0340387   0.000340387  3.40387e-6
 1000.0    0.779312    0.000779312  7.79312e-7
 2000.0    2.5837      0.00129185   6.45926e-7
 5000.0   25.0702      0.00501404   1.00281e-6

i.e., the compilation time seems to scale at least quadratically with the size of the function. Reducing the amount of optimization with `julia -O0` or `julia -O1` doesn't seem to help either.

Is there anything that can be done about this?


vtjnash commented May 18, 2016

fwiw, runtime performance of that function will also be poor due to the size of the function. Modern processors are much faster at doing loops than long straight-line code.
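To make that point concrete (a sketch, not code from the thread): the generated straight-line computation collapses to a constant-size loop body, so both compile time and instruction-cache footprint become independent of `n`.

```julia
# A loop-based equivalent of the generated straight-line function:
# the body is constant-sized regardless of n, so the compiler has a
# fixed amount of code to process and the CPU runs a tight loop.
function f_loop(x0, n)
    x = x0
    for _ in 1:n
        x += 1   # same step as the generated x_k = x_{k-1} + 1
    end
    return x
end

f_loop(0, 5000)  # same result as the 5000-statement generated f
```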

@JeffBezanson

Might be related to #6685.

The time seems to be in type inference, perhaps mostly type_annotate!. That code does an O(n*m) loop, where n is the number of statements and m is the number of slots. We might be able to use a sparser structure for this.
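A schematic illustration of the cost (hypothetical code, not the actual `type_annotate!` implementation): in the output of `code_big_function`, every statement introduces a fresh slot, so the number of slots `m` grows with the number of statements `n`, and an O(n*m) pass becomes O(n^2).

```julia
# Hypothetical dense annotation pass: visit every (statement, slot)
# pair. When each statement defines a new slot, m ≈ n and the total
# work grows quadratically with function size.
function dense_pass_work(n_statements::Int, n_slots::Int)
    work = 0
    for s in 1:n_statements
        for v in 1:n_slots
            work += 1   # stand-in for per-(statement, slot) bookkeeping
        end
    end
    return work
end

dense_pass_work(5000, 5000)  # 25_000_000 pair visits for a 5000-line f
```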


toivoh commented May 19, 2016

That would be great!

In the meantime, I have been experimenting with splitting up the code.
Here's a slightly bigger example that creates a type like

```julia
type T
    x0::Float64
    x1::Float64
    x2::Float64
    x3::Float64
    x4::Float64
    x5::Float64
end
```

and then creates functions to operate on a subset of the variables of the type at a time (100 variables and 100 statements at a time). It generates and compiles a function that will operate on the last 100 variables in this case:

```julia
module BigFunction5

function code_big_type(Tname::Symbol, n::Int)
    code = [:($(symbol("x$k"))::Float64) for k=0:n]
    quote
        type $Tname
            $(code...)
            $Tname() = new($(zeros(n+1)...))
        end
        $Tname
    end
end

function code_function(ks)
    code = [:(v.$(symbol("x$k")) = v.$(symbol("x$(k-1)")) + 1) for k in ks]
    quote
        let
            function f(v)
                $(code...)
            end
        end
    end
end

m = 100
ns = [100, 1000, 2000, 5000, 10000]
ts = Float64[]

for n in ns
    T = eval(code_big_type(Symbol("T$n"), n))
    v = T()

    f = eval(code_function((n-m+1):n))
#    @time f = eval(code_function(1:m))
    data = @timed f(v)
    push!(ts, data[2])
end

display(vcat(["n", "t"]',
             hcat(ns, ts)))

end
```

Time for compiling the function that operates on the last 100 variables grows substantially with the size of the type T:

      "n"   "t"    
   100.0   0.144561
  1000.0   0.202715
  2000.0   0.268287
  5000.0   0.499345
 10000.0   0.961893

On the other hand, the time for compiling the function that operates on the first 100 variables stays constant as the type grows. Could this behavior be related?
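The splitting idea can also be sketched without a wrapper type (a sketch written for current Julia, hence `Symbol` rather than the 0.5-era `symbol`; all names here are hypothetical): generate the n statements in chunks of m, compile one small function per chunk, and chain them, so each compiled function stays O(m) in size.

```julia
# Build the n-step increment computation as ceil(n/m) chunk functions
# of at most m statements each; each chunk takes the running value
# as input and returns its last variable.
function make_chunks(n::Int, m::Int)
    funcs = Function[]
    for lo in 1:m:n
        hi = min(lo + m - 1, n)
        code = [:( $(Symbol("x$k")) = $(Symbol("x$(k-1)")) + 1 ) for k in lo:hi]
        f = eval(quote
            function (x0)
                $(Symbol("x$(lo-1)")) = x0   # bind the chunk's input variable
                $(code...)
                return $(Symbol("x$hi"))
            end
        end)
        push!(funcs, f)
    end
    return funcs
end

function run_chunks(funcs, x0)
    x = x0
    for f in funcs
        x = Base.invokelatest(f, x)   # each f was eval'd in a newer world
    end
    return x
end

run_chunks(make_chunks(1000, 100), 0)  # == 1000, but compiles 10 small functions
```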

@JeffBezanson JeffBezanson added the performance Must go faster label May 28, 2016
@JeffBezanson JeffBezanson self-assigned this May 28, 2016
JeffBezanson added a commit that referenced this issue May 29, 2016
fix #16434, superlinear compiler run time on large functions