Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

option to not delay TCPServer socket creation #24115

Merged
merged 2 commits into from
Oct 25, 2017

Conversation

tanmaykm
Copy link
Member

This change adds an option to TCPServer socket constructor to not delay creation of the socket. This will allow TCP servers to set socket options like SO_REUSEPORT before the socket is bound and listened on.

With early initialization:

tan@tanlt ~/temp/soreuseport $ julia ./soreuseport.jl false &
[1] 23061
tan@tanlt ~/temp/soreuseport $ creating socket bound to 8010...
has reuseport
tcp_reuseport returned 0
listening...
accepting...

tan@tanlt ~/temp/soreuseport $ julia ./soreuseport.jl false
creating socket bound to 8010...
has reuseport
tcp_reuseport returned 0
listening...
accepting...

But that is not possible with delayed initialization:

tan@tanlt ~/temp/soreuseport $ julia ./soreuseport.jl true &
[1] 22977
tan@tanlt ~/temp/soreuseport $ creating socket bound to 8010...
has reuseport
tcp_reuseport returned -1
listening...
accepting...

tan@tanlt ~/temp/soreuseport $ julia ./soreuseport.jl true
creating socket bound to 8010...
has reuseport
tcp_reuseport returned -1
listening...
ERROR: LoadError: listen: address already in use (EADDRINUSE)
Stacktrace:
 [1] uv_error at ./libuv.jl:68 [inlined]
 [2] #listen#369 at ./stream.jl:933 [inlined]
 [3] listen at ./stream.jl:933 [inlined]
 [4] run_server(::Bool) at /home/tan/temp/soreuseport/soreuseport.jl:13
 [5] include_relative(::Module, ::String) at ./loading.jl:533
 [6] include(::Module, ::String) at ./sysimg.jl:14
 [7] process_options(::Base.JLOptions) at ./client.jl:325
 [8] _start() at ./client.jl:391
in expression starting at /home/tan/temp/soreuseport/soreuseport.jl:22

My soreuseport.jl is:

function run_server(delay)
    println("creating socket bound to 8010...")
    s = Base.TCPServer(; delay=delay)
    if ccall(:jl_has_so_reuseport, Int32, ()) == 1
        println("has reuseport")
        rc = ccall(:jl_tcp_reuseport, Int32, (Ptr{Void},), s.handle)
        println("tcp_reuseport returned ", rc)
    else
        println("no reuseport")
    end
    bind(s, Base.InetAddr(ip"0.0.0.0", 8010))
    println("listening...")
    listen(s)
    println("accepting...")
    s1 = accept(s)
    println("accepted, closing...")
    close(s1)
    close(s)
    println("done")
end

run_server(parse(Bool, ARGS[1]))

This change adds an option to TCPServer socket constructor to not delay creation of the socket. This will allow TCP servers to set socket options like `SO_REUSEPORT` before the socket is bound and listened on.

With early initialization:

```
tan@tanlt ~/temp/soreuseport $ julia ./soreuseport.jl false &
[1] 23061
tan@tanlt ~/temp/soreuseport $ creating socket bound to 8010...
has reuseport
tcp_reuseport returned 0
listening...
accepting...

tan@tanlt ~/temp/soreuseport $ julia ./soreuseport.jl false
creating socket bound to 8010...
has reuseport
tcp_reuseport returned 0
listening...
accepting...
```

But that is not possible with delayed initialization:

```
tan@tanlt ~/temp/soreuseport $ julia ./soreuseport.jl true &
[1] 22977
tan@tanlt ~/temp/soreuseport $ creating socket bound to 8010...
has reuseport
tcp_reuseport returned -1
listening...
accepting...

tan@tanlt ~/temp/soreuseport $ julia ./soreuseport.jl true
creating socket bound to 8010...
has reuseport
tcp_reuseport returned -1
listening...
ERROR: LoadError: listen: address already in use (EADDRINUSE)
Stacktrace:
 [1] uv_error at ./libuv.jl:68 [inlined]
 [2] #listen#369 at ./stream.jl:933 [inlined]
 [3] listen at ./stream.jl:933 [inlined]
 [4] run_server(::Bool) at /home/tan/temp/soreuseport/soreuseport.jl:13
 [5] include_relative(::Module, ::String) at ./loading.jl:533
 [6] include(::Module, ::String) at ./sysimg.jl:14
 [7] process_options(::Base.JLOptions) at ./client.jl:325
 [8] _start() at ./client.jl:391
in expression starting at /home/tan/temp/soreuseport/soreuseport.jl:22
```

My `soreuseport.jl` is:

```
function run_server(delay)
    println("creating socket bound to 8010...")
    s = Base.TCPServer(; delay=delay)
    if ccall(:jl_has_so_reuseport, Int32, ()) == 1
        println("has reuseport")
        rc = ccall(:jl_tcp_reuseport, Int32, (Ptr{Void},), s.handle)
        println("tcp_reuseport returned ", rc)
    else
        println("no reuseport")
    end
    bind(s, Base.InetAddr(ip"0.0.0.0", 8010))
    println("listening...")
    listen(s)
    println("accepting...")
    s1 = accept(s)
    println("accepted, closing...")
    close(s1)
    close(s)
    println("done")
end

run_server(parse(Bool, ARGS[1]))
```
Copy link
Sponsor Member

@StefanKarpinski StefanKarpinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Needs docs and ideally a test even if only to test that the keyword is accepted, if not that it does something in particular.

if ccall(:jl_has_so_reuseport, Int32, ()) == 1
@test 0 == ccall(:jl_tcp_reuseport, Int32, (Ptr{Void},), s.handle)
end
end
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe test the positive case as well? (Unless that's already done somewhere else)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is tested elsewhere as that's the default.

base/socket.jl Outdated

# Keyword arg "delay": if true, libuv delays creation of socket fd till bind.
# It can be set to false if there is a need to set socket options before
# further calls to `bind` and `listen`, e.g. `SO_REUSEPORT`.
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant as part of the doc string for the TCPServer constructor, although a comment is better than nothing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TCPServer is not exported, so I think this should be okay for now.

@tanmaykm
Copy link
Member Author

Can this be merged now?
It will be good to have this backported to 0.6 as well.

@aviks
Copy link
Member

aviks commented Oct 25, 2017

Bump, anything preventing this from being merged?

@StefanKarpinski StefanKarpinski merged commit 306c02c into JuliaLang:master Oct 25, 2017
@aviks
Copy link
Member

aviks commented Nov 6, 2017

@ararslan Any chance getting this into 0.6.2 please?

@ararslan
Copy link
Member

ararslan commented Nov 6, 2017

Yes, will do.

@tkelman
Copy link
Contributor

tkelman commented Nov 7, 2017

A new keyword argument is a feature, not a bugfix.

@tanmaykm
Copy link
Member Author

tanmaykm commented Nov 7, 2017

TCPServer constructor is not exported, and there is no change in the default behavior.

@tkelman
Copy link
Contributor

tkelman commented Nov 7, 2017

So? Shouldn't encourage writing code that works on 0.6.2 that wouldn't work on 0.6.1 unless there's a legitimate bug involved.

@StefanKarpinski
Copy link
Sponsor Member

Strictly, @tkelman is right, of course, but @tanmaykm is also right that this is a new option to an unexported type, which is about as mild as a feature gets. What's at stake here? What is this feature needed for? Is this effectively a bug fix?

@aviks
Copy link
Member

aviks commented Nov 7, 2017

So this feature is needed to run a Julia web server with real world load. Currently, a Julia web server can effectively only serve one connection at a time (even though the webserver is written with @async). This allows us to serve multiple connections on one machine. So in some ways this is crucial for realistic deploys.

The actual underlying feature (SO_REUSEPORT) has been in Julia (via libuv) for a long time. However, it cannot be used, due to an incorrect order of initialisation (to simplify things a bit). This PR changes the order of initialisation based on the keyword argument, thus allowing the feature to be actually used. So yes, this could be seen as a bugfix, since the feature has been added, but was unable to be used.

I think we can safely say that a new keyword argument is pretty low risk, there are bugfixes with much higher risk (as we've seen recently). Hence I would argue for its inclusion.

@StefanKarpinski
Copy link
Sponsor Member

Could we just change the incorrect initialization order and consider that a bugfix?

@tkelman
Copy link
Contributor

tkelman commented Nov 7, 2017

Or carry a local copy of the modified version of the constructor, which should work correctly even on 0.6.0 or 0.6.1.

@tanmaykm
Copy link
Member Author

tanmaykm commented Nov 8, 2017

To answer Stefan's question, I consider the keyword argument as the proper fix.

The bug was that Julia allowed only one way of constructing server socket (delayed), where no options could be set until it is listened on. While delayed creation is useful to avoid locking resources until use, we also need early creation to be able to set certain options.

Without the keyword we will have to make all sockets initialize early, which I think is worse.

@tanmaykm
Copy link
Member Author

tanmaykm commented Nov 8, 2017

TCPServer() is an internal API. Exported APIs listen and listenany are the only ones that use this. The next step should be to have these exported APIs changed. I would agree with not backporting those to v0.6.

I would however argue for backporting this. It would allow packages that are willing to depend on this internal API in > v0.6.2 onwards to get this feature without calling a bunch of other internal APIs.

There is no impact on packages that choose to be compatible across v0.6. They can either not use reuseport at all and be oblivious of this change, or implement all of the constructor on their own.

@tkelman
Copy link
Contributor

tkelman commented Nov 8, 2017

There shouldn't be such a thing as an API that only exists in 0.6.2, internal or otherwise. Feature freeze for release-0.6 was in February.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants