[Serve] Make the checkpoint and recover only from GCS #26753
This is also mentioned in the documentation. Can you take care of that?
I think we can also get rid of most of the code under serve/storage.
Also, feel free to remove S3 and the other stores in the storage folder, since they are no longer used.
@sihanwang41 The DCO build failed. You need to sign off on your commits. Use this to squash the changes here into one signed-off commit:
Signed-off-by: simon-mo <[email protected]>
Signed-off-by: Sihan Wang <[email protected]>
…rch-extra-index-url
* 'master' of https://github.com/ray-project/ray: (26 commits)
  [runtime env] plugin refactor[6/n]: java api refactor (ray-project#26783)
  easy test? (ray-project#26905)
  [core] Introduce a flag which allows a longer timeout for raylet when GCS restarts. (ray-project#26919)
  [RLLib] Record framework and algorithm used by an RLlib run. (ray-project#26956)
  [train] set split locality_hints (ray-project#26973)
  [Serve] Make the checkpoint and recover only from GCS (ray-project#26753)
  [AIR DOC] minor tweaks to checkpoint user guide for clarity and consistency subheadings (ray-project#26937)
  [tune] Only sync down from cloud if needed (ray-project#26725)
  [Core] Refactoring Ray DAG object scanner (ray-project#26917)
  [air] Raise error on path-like access for Checkpoints (ray-project#26970)
  [AIR] Enable other notebooks previously marked with # REGRESSION (ray-project#26896)
  [RLlib] Simplify agent collector (ray-project#26803)
  [Datasets] Automatically cast tensor columns when building Pandas blocks. (ray-project#26924)
  [Workflow] Fix flaky example(ray-project#26960)
  [dashboard] Update cluster_activities endpoint to use pydantic. (ray-project#26609)
  [air] Un-revert "[air] remove unnecessary logs + improve repr for result" (ray-project#26942)
  [setup-dev] Add flag to skip symlink certain folders (ray-project#26899)
  [air/tune/docs] Cont. convert Tune examples to use Tuner.fit() (ray-project#26959)
  [air/tune/docs] Change Tuner() occurences in rest of ray/tune (ray-project#26961)
  [data] set iter_batches default batch_size (ray-project#26955)
  ...
Signed-off-by: ddelange <[email protected]>
) Signed-off-by: klwuibm <[email protected]>
) Signed-off-by: Catch-Bull <[email protected]>
) Signed-off-by: Rohan138 <[email protected]>
) Signed-off-by: Frank Luan <[email protected]>
) Signed-off-by: Scott Graham <[email protected]>
) Signed-off-by: Stefan van der Kleij <[email protected]>
Why are these changes needed?
Make GCS the only way to checkpoint and recover. (Not yet well tested for external storage.)
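To illustrate the idea of checkpointing to and recovering from a single store, here is a minimal sketch. It does not use any Ray or Serve API: `FakeGcsKV`, `Controller`, `set_replicas`, and the checkpoint key are all hypothetical names invented for this example, and the in-memory dict only stands in for a GCS-backed key-value store.

```python
import pickle
from typing import Dict, Optional


class FakeGcsKV:
    """Hypothetical in-memory stand-in for a GCS-backed key-value store."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


class Controller:
    """Toy controller that checkpoints to, and recovers from, one store only."""

    CHECKPOINT_KEY = "serve-controller-checkpoint"  # hypothetical key name

    def __init__(self, kv: FakeGcsKV) -> None:
        self._kv = kv
        self.deployments: Dict[str, int] = {}
        self._recover()

    def _recover(self) -> None:
        # On startup, load the last checkpoint if one exists.
        raw = self._kv.get(self.CHECKPOINT_KEY)
        if raw is not None:
            self.deployments = pickle.loads(raw)

    def set_replicas(self, name: str, num_replicas: int) -> None:
        # Update state, then write the whole state back as a checkpoint.
        self.deployments[name] = num_replicas
        self._kv.put(self.CHECKPOINT_KEY, pickle.dumps(self.deployments))


kv = FakeGcsKV()
first = Controller(kv)
first.set_replicas("model_a", 2)

# Simulate a controller restart: a new instance reads the same store.
second = Controller(kv)
recovered = second.deployments
```

With only one store in play, there is no need for a pluggable storage layer (S3 and so on), which is the simplification the review comments above point at.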
Checks
I've run scripts/format.sh to lint the changes in this PR.