Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

threads: force connect when adding thread #109

Merged
merged 1 commit into from
Nov 20, 2019

Conversation

jsign
Copy link
Contributor

@jsign jsign commented Nov 20, 2019

@sanderpick , here's a PR to show a bug that I'd like your confirmation.
The current PR has the fix, but let me explain the situation.

If you run TestE2EWithThreads test (with boostrap commented as shown) without the patch here showed in threads.go, the test fail with:
store_test.go:65: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing failed to find any peer in table"
If you re-enable boostrap in the test, the test pass OK.

From a conceptual POV, this didn't made sense. Why I would need to boostrap with cafes (since are the defaults now) to let two peers sync to each other, where one of them joined with a thread link?

Was quite hard to figure out since the error was too generic and didn't give a clue on where was the problem.

The problem was in lgs, err := t.service.getLogs(ctx, id, pid), just below the patch I did in threads.go. The problem there was that the gRPC connection failed since it doesn't know how to resolve the peer.ID addrs.

What I added was to explicitly connect to the peer corresponding to the Thread link, so to avoid that problem. This solves the problem... now unboostrapped peers sync correctly.

I wanted to know your opinion... this seems reasonable to me, since why not connect to the peer of the thread link?. But I'm not sure that is the right place, or something else should have done this job correctly. The good part is that is easy to reproduce: take this PR, and comment only the changes in thread.go, then go run ./eventstore -run TestE2EWithThreads.

@sanderpick
Copy link
Member

Ah, good catch! We must have not seen this in the normal thread tests because the thread services are always bootstrapped. Furthermore, their address info is manually swapped :)

Without the manual address swap, in the bootstapped case, the routed host falls back to the DHT and queries for the other peers info, which can be found via the bootstrappers. That's my hunch why your test works, but as you observed, bootstrapping should not be required.

@jsign
Copy link
Contributor Author

jsign commented Nov 20, 2019

Force pushed just because of signing, and cleaning the solution: return err instead of panic, delete commented bootstrap in test. Test passed!

@jsign jsign marked this pull request as ready for review November 20, 2019 23:30
@jsign jsign merged commit 2d227e6 into textileio:master Nov 20, 2019
@jsign jsign deleted the forceconnect branch November 20, 2019 23:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants