net: listening on dual-stack UDP socket sometimes silently fails on macOS #67226

Open
marten-seemann opened this issue May 7, 2024 · 4 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin

@marten-seemann
Contributor

Go version

go version go1.22.0 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/marten/Library/Caches/go-build'
GOENV='/Users/marten/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/marten/src/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/marten/src/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/Users/marten/bin/go1.22ex'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/Users/marten/bin/go1.22ex/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.22.0'
GCCGO='gccgo'
AR='ar'
CC='clang'
CXX='clang++'
CGO_ENABLED='1'
GOMOD='/Users/marten/src/go/src/github.com/quic-go/udp-test/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/q0/b5ynf00142l7bl9sp8y098zr0000gn/T/go-build1330860487=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

I start a dual-stack UDP listener:

net.ListenUDP("udp", &net.UDPAddr{IP: net.IPv4zero, Port: 0})

Then I send a UDP datagram (from localhost) to this listener and assert that this packet is received. At the same time, I capture all traffic on localhost using Wireshark.

https://gist.github.com/marten-seemann/bfa811133331b9c053137fd5df12638d

What did you see happen?

Running the test 10000 times, I pretty reliably get a test failure on macOS. The datagram is sent (as confirmed by the Wireshark trace), but it never arrives at the listener.

This is not caused by UDP packet loss (which shouldn't happen on localhost anyway). To make sure that this is not the cause of the problem, I added another test (TestUDPUnconnectedDualStackWithRetransmission), which retransmits the datagram up to 50 times.


This bug is the source of a lot of flakiness in quic-go's test suite. We're running a lot of UDP transfers to test all facets of the protocol, enough to hit this bug on every other CI run or so.

What did you expect to see?

I expect the packet to be received reliably. This test should never fail.

@marten-seemann marten-seemann changed the title net: listen on dual-stack UDP socket sometimes silently fails on macOS net: listening on dual-stack UDP socket sometimes silently fails on macOS May 7, 2024
@cherrymui
Member

Thanks for the report. Does the failure happen only on macOS, and not on other OSes? Does the failure reproduce if you don't capture traffic in Wireshark?

https://gist.github.com/marten-seemann/bfa811133331b9c053137fd5df12638d

Could you share a full buildable version of the test? Thanks.

cc @ianlancetaylor @neild

@cherrymui cherrymui added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 7, 2024
@cherrymui cherrymui added this to the Backlog milestone May 7, 2024
@marten-seemann
Contributor Author

Does the failure happen only on macOS?

As far as I can tell, yes. I tested it on Linux and Windows (on CI), and it seems to work there.

Does the failure reproduce if you don't capture traffic in Wireshark?

Yes.

Could you share a full buildable version of the test? Thanks.

Sure! I updated the Gist: https://gist.github.com/marten-seemann/bfa811133331b9c053137fd5df12638d.

@marten-seemann
Contributor Author

From what I can tell, this seems to be a macOS bug. I rewrote the test in C, and I'm seeing the same failure there: https://gist.github.com/marten-seemann/67eecb83006fdc020456821b69112385 (see comment there for how to run it).

@neild
Contributor

neild commented Jun 24, 2024

This seems likely to be the cause of https://go.dev/issue/29225 as well.
