Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug in parsing of tasks in monitor, it was not using flatbuffers. #372

Merged
merged 1 commit into from
Mar 16, 2017
Merged

Fix bug in parsing of tasks in monitor, it was not using flatbuffers. #372

merged 1 commit into from
Mar 16, 2017

Conversation

robertnishihara
Copy link
Collaborator

@robertnishihara robertnishihara commented Mar 16, 2017

This should address #354. When we updated some of the Redis payloads to be serialized using flatbuffers (in #341), we forgot to update monitor.py. This means that whenever RAY.TASK_TABLE_GET was called from cleanup_task_table in monitor.py, the monitor incorrectly parsed the local scheduler ID, and since the local scheduler ID was garbage, the monitor assumed that the task had been assigned to a dead node and so marked the task as lost. This caused the an attempt to reconstruct the task, which was fatal error because the task was an actor task.

So presumably the test failure was happening when the following conditions were met:

  1. A task was already present in Redis when the monitor started up.
  2. ray.get was called on the task triggering reconstruction.
  3. The task was an actor task, since those currently can't be reconstructed.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/305/
Test PASSed.

@pcmoritz pcmoritz merged commit 3333e1d into ray-project:master Mar 16, 2017
@pcmoritz pcmoritz deleted the reconstructionfailurefix branch March 16, 2017 03:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants