Scheduler ignores current resource consumption levels while scheduling #6467

Open
gjoseph92 opened this issue May 27, 2022 · 1 comment · May be fixed by #6468
Labels: discussion (Discussing a topic with no specific actions yet)

Comments

@gjoseph92 (Collaborator)

Currently on the scheduler, when a task is assigned to a worker and consumes resources, the consumption is recorded in one place. When deciding whether a task can be assigned to a worker, resource availability is checked in a different place. As a result, current resource consumption levels are not considered in task scheduling.

Current scheduling appears to consider only which workers could run a task in principle: do they have enough of the resource to ever run this task, even if none of it is available right now?

Considering resources like GPUs, I suppose this makes sense: queuing extra tasks onto workers is beneficial so there's no idleness. Still, it's a little surprising. And the fact that worker_objective doesn't take current resource consumption into account seems likely to cause bad scheduling, since we could easily assign a task to a worker whose resource is currently used up, when there are other workers with the resource available.


When a task gets assigned to a worker, consume_resources only adjusts the count in WorkerState.used_resources:

for r, required in ts.resource_restrictions.items():
    ws.used_resources[r] += required

But when SchedulerState.valid_workers determines which workers can run a task, it only checks self.resources[resource][address] and never looks at WorkerState.used_resources:

for resource, required in ts.resource_restrictions.items():
    dr: dict = self.resources.get(resource)  # type: ignore
    if dr is None:
        self.resources[resource] = dr = {}

    sw: set = set()
    for addr, supplied in dr.items():
        if supplied >= required:
            sw.add(addr)

So tasks will not enter the no-worker state just because all resources in the cluster are currently used up.

Instead, as usual, more tasks will get queued onto workers than they can run at once, and each worker is responsible for running only as many of them at a time as its resources allow.
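For illustration, a minimal sketch of that behavior (the resource name "FOO", the worker sizes, and the sleep task are assumptions for the example, not taken from a real workload):

import time

from distributed import Client, LocalCluster

def slow(i):
    time.sleep(1)
    return i

if __name__ == "__main__":
    # Two workers, each advertising one unit of an illustrative resource "FOO"
    # (assuming LocalCluster forwards the `resources` worker kwarg).
    cluster = LocalCluster(n_workers=2, threads_per_worker=2, resources={"FOO": 1})
    client = Client(cluster)

    # Each task requires a full unit of "FOO", so only two can run at once
    # cluster-wide. Per the behavior described above, the scheduler still hands
    # all ten tasks out immediately: the excess sits in the workers' queues
    # rather than in the no-worker state, and placement does not consider how
    # much "FOO" is currently free on each worker.
    futures = client.map(slow, range(10), resources={"FOO": 1})
    print(client.gather(futures))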


  • Is this intentional?
  • Why do we track resource counts in both self.resources and WorkerState.resources?
  • Why do we bother tracking WorkerState.used_resources if it's never actually used for scheduling decisions?

Note that changing this behavior would likely provide a viable temporary solution for #6360, a very common pain point for many users.
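To make the idea concrete, here is one possible shape of such a change, written as a standalone sketch rather than the actual patch in #6468 (the helper name is made up; attribute names are taken from the snippets above):

def workers_with_free_resources(state, ts):
    # Sketch only: like the valid_workers loop quoted above, but compare the
    # requirement against what is currently free (supply minus
    # WorkerState.used_resources) instead of the total supply.
    # Assumes ts.resource_restrictions is non-empty.
    candidates = None
    for resource, required in ts.resource_restrictions.items():
        enough = set()
        for addr, supplied in state.resources.get(resource, {}).items():
            used = state.workers[addr].used_resources.get(resource, 0)
            if supplied - used >= required:
                enough.add(addr)
        candidates = enough if candidates is None else candidates & enough
    return candidates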

cc @mrocklin @fjetter

@gjoseph92 added the discussion label and removed the bug label on May 27, 2022
@elementace

FWIW, I'm running into a problem where XGBoost does some odd per-node thread management in my cluster: if the sum of the 'njobs' values of all XGBoost training tasks assigned to a node exceeds that node's number of vCPUs, XGBoost ends up using only 1 vCPU in total for all of those tasks.

Hence I was looking to worker resource management to solve this problem, by making each task require the whole resource of the worker.

The problem stemming from this ticket is that the first task completes by itself using the full compute capacity (all 32 cores), but the scheduler then tries to run all of the remaining tasks at the same time, without assessing the available resources and queuing them one at a time accordingly. As a result, CPU utilisation drops to 1 of 32 cores.
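For reference, a minimal sketch of the kind of setup described above (the resource name "XGB", the scheduler address, and the parameter grid are assumptions):

from distributed import Client

# Assumes each 32-vCPU worker in the cluster was started with
# `--resources "XGB=1"` (the name "XGB" is made up for the example).
client = Client("tcp://scheduler-address:8786")  # placeholder address

def train(params):
    ...  # xgboost training here, with njobs set to the worker's 32 vCPUs

param_grid = [{"max_depth": d} for d in (3, 6, 9)]  # hypothetical parameters

# Each task claims the whole "XGB" unit, intending that only one training
# task runs on a worker at a time.
futures = [client.submit(train, p, resources={"XGB": 1}) for p in param_grid]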

Has there been any progress on this in discussions, @gjoseph92? @mrocklin?
