Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tickets may not be returned to the pool and isTimeout race condition #67

Merged
merged 9 commits into from
Aug 15, 2017

Conversation

cfchou
Copy link
Contributor

@cfchou cfchou commented Jun 25, 2017

In Go(), the original logic is one task goroutine acquires a ticket and then another timer goroutine returns it. Albeit it's unlikely, there's a chance that a ticket may never be returned when timer tries to return a ticket before task acquires one.

@cfchou
Copy link
Contributor Author

cfchou commented Jun 25, 2017

I check the reason why travis failed. It's the same issue as #59. Except for 1.6.0, other versions run just fine.

@cfchou
Copy link
Contributor Author

cfchou commented Jun 26, 2017

I found that a potential race condition could happen when the task goroutine passed !cmd.isTimedOut() and then immediately the timer goroutine set cmd.timeout = true.

My solution is to use a local sync.Once to ensure only the faster goroutine runs errWithFallback() and circuit.ReportEvent().

@cfchou cfchou changed the title tickets may not be returned to the pool tickets may not be returned to the pool and a race condition Jun 26, 2017
@cfchou cfchou changed the title tickets may not be returned to the pool and a race condition tickets may not be returned to the pool and isTimeout race condition Jun 26, 2017
@cfchou cfchou mentioned this pull request Jun 27, 2017
@afex
Copy link
Owner

afex commented Aug 15, 2017

thanks very much for this fix!

@afex afex merged commit f118cd9 into afex:master Aug 15, 2017
This was referenced Sep 26, 2017
vermapratyush added a commit to myteksi/hystrix-go that referenced this pull request Sep 29, 2017
Adds a buffer on requests before throwing ErrMaxConcurrency.
The Timeout is still adhered to, timer starts when the request is submitted to hystrix-go, not when execution starts. This basically implements MaxQueueSize as present in the Netflix's Hystrix

The default value of MaxQueueSize is 50 (5 * DefaultMaxConcurrency), although can be overridden when initialising circuit.

In addition to the request buffer, the PR includes a different way to solving for afex#67 . It uses channels instead of sync.Once as it makes the Go() function simpler. Also some go-lint fixes.

We have been running this in production at GrabTaxi for a while, and there seems to be no side effects.

> This also includes a rebase on afex/hystrix-go
vermapratyush added a commit to myteksi/hystrix-go that referenced this pull request Sep 30, 2017
Adds a buffer on requests before throwing ErrMaxConcurrency.
The Timeout is still adhered to, timer starts when the request is submitted to hystrix-go, not when execution starts. This basically implements MaxQueueSize as present in the Netflix's Hystrix

The default value of MaxQueueSize is 50 (5 * DefaultMaxConcurrency), although can be overridden when initialising circuit.

In addition to the request buffer, the PR includes a different way to solving for afex#67 . It uses channels instead of sync.Once as it makes the Go() function simpler. Also some go-lint fixes.

We have been running this in production at GrabTaxi for a while, and there seems to be no side effects.

> This also includes a rebase on afex/hystrix-go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants