Straggling Task Detection Improvement #53

Open
zhangpengshan opened this issue Oct 10, 2014 · 0 comments

Currently, if a task/container runs over the threshold three times, it is killed, and fail-over reruns the task on another machine. But in our cluster, which is very busy, there is sometimes a single slow task that blocks all the other tasks. Better detection is needed to catch this kind of task.

Proposal: instead of letting the user set the threshold, collect metrics from all workers in each iteration, and if a worker is too many standard deviations slower than the others, kill it.
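
A minimal sketch of the standard-deviation check described above, written in Python for illustration only. It assumes the collected metric is each worker's time for one iteration. The function name `find_stragglers` and the sensitivity factor `k` are hypothetical and are not part of the existing code base.

```python
import statistics


def find_stragglers(iteration_times, k=2.0):
    """Return ids of workers whose time exceeds mean + k * stddev.

    iteration_times: dict mapping a worker id to its time for one iteration.
    k: hypothetical sensitivity factor (an assumption, not from the issue).
    """
    if len(iteration_times) < 2:
        return []  # a spread cannot be estimated from a single worker
    times = list(iteration_times.values())
    mean = statistics.mean(times)
    stddev = statistics.pstdev(times)  # population standard deviation
    if stddev == 0:
        return []  # every worker took exactly the same time
    cutoff = mean + k * stddev
    # Only slow workers are flagged; unusually fast workers are ignored.
    return sorted(w for w, t in iteration_times.items() if t > cutoff)


# Example: five workers finish in about 10 s, one takes 60 s.
times = {"w1": 10, "w2": 11, "w3": 9, "w4": 10, "w5": 10, "w6": 60}
print(find_stragglers(times))
```

One caveat with this approach: with only a few workers, the slow task itself inflates the standard deviation, so `k` has to stay small. The check could also be combined with the existing three-strikes rule so that a worker is killed only after being flagged in several iterations.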

zhangpengshan added this to the 0.7.0 milestone Feb 3, 2015