forked from ray-project/ray
[AIR] Introduce DatasetIterator for bulk and streaming ingest (ray-project#31470)

Introduces ray.air.DatasetIterator, which exposes the same iteration-based interfaces as Dataset:

- iter_batches()
- to_tf()
- iter_torch_batches()
- stats()

This interface replaces Dataset and DatasetPipeline as the default data iterator interface in AIR trainers. Since both bulk and streaming ingest now use the same interface, this PR also hard-deprecates use_stream_api and stream_window_size (previously experimental). These are now replaced with a single max_object_store_memory_fraction, the fraction of Ray's object store memory to use. The value defaults to -1, meaning bulk ingest.

This also simplifies the configs for specifying bulk/streaming ingest with global shuffle. Previously, global_shuffle=True would shuffle once before training (using Dataset) or once before each epoch (using DatasetPipeline). Now the preprocessed dataset is always shuffled once before each epoch (using DatasetPipeline).

For backwards compatibility in v2.3, DatasetIterator currently forwards unsupported methods to Dataset or DatasetPipeline.

Signed-off-by: Stephanie Wang <[email protected]>
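The single-knob configuration described in the commit message can be pictured with a small pure-Python sketch. It does not call Ray: `IngestPlan` and `resolve_ingest_mode` are hypothetical names, and both the accepted range `(0, 1]` and the conversion of the fraction into a byte budget are assumptions made for illustration. Only the `-1` default meaning bulk ingest comes from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IngestPlan:
    """Hypothetical result type for this sketch; not part of Ray."""
    mode: str                    # "bulk" or "streaming"
    window_bytes: Optional[int]  # assumed streaming budget; None for bulk


def resolve_ingest_mode(max_object_store_memory_fraction: float,
                        object_store_bytes: int) -> IngestPlan:
    """Map the single config value to an ingest plan (illustrative only)."""
    # -1 is the default named in the commit message: bulk ingest.
    if max_object_store_memory_fraction == -1:
        return IngestPlan("bulk", None)
    # Assumption: any other value must be a fraction in (0, 1].
    if not 0 < max_object_store_memory_fraction <= 1:
        raise ValueError("expected -1 or a fraction in (0, 1]")
    # Assumption: the fraction scales the object store size into a budget.
    window = int(max_object_store_memory_fraction * object_store_bytes)
    return IngestPlan("streaming", window)
```

Under these assumptions, `resolve_ingest_mode(-1, 1000)` selects bulk ingest, while `resolve_ingest_mode(0.25, 1000)` selects streaming with a 250-byte budget.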
1 parent 05d6f30, commit 835d1d5
Showing 30 changed files with 987 additions and 234 deletions.
@@ -0,0 +1,16 @@
.. _dataset-iterator-api:

DatasetIterator API
===================

.. currentmodule:: ray.data

.. autoclass:: DatasetIterator

.. autosummary::
   :toctree: doc/

   DatasetIterator.iter_batches
   DatasetIterator.iter_torch_batches
   DatasetIterator.to_tf
   DatasetIterator.stats
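To show how a training loop might consume the iteration interface documented in this file, here is a toy stand-in written in plain Python. `ToyIterator` is hypothetical and is not `ray.data.DatasetIterator`. It only mimics the shape of `iter_batches()` and `stats()`, and its `global_shuffle` flag is an assumption used to mirror the "shuffle once before each epoch" behaviour described in the commit message.

```python
import random
from typing import Iterator, List, Sequence


class ToyIterator:
    """Hypothetical stand-in for illustration; NOT the Ray class."""

    def __init__(self, rows: Sequence[int], global_shuffle: bool = False,
                 seed: int = 0) -> None:
        self._rows = list(rows)
        self._global_shuffle = global_shuffle
        self._rng = random.Random(seed)
        self._epochs = 0

    def iter_batches(self, batch_size: int = 2) -> Iterator[List[int]]:
        """Yield one epoch of batches, reshuffling first if requested."""
        rows = list(self._rows)
        if self._global_shuffle:
            # Shuffle a copy once at the start of every epoch.
            self._rng.shuffle(rows)
        self._epochs += 1
        for start in range(0, len(rows), batch_size):
            yield rows[start:start + batch_size]

    def stats(self) -> str:
        """Return a short summary string (toy counterpart of stats())."""
        return f"epochs started: {self._epochs}"


def train(it: ToyIterator, num_epochs: int) -> int:
    """Consume the iterator the same way for every epoch; count rows seen."""
    seen = 0
    for _ in range(num_epochs):
        for batch in it.iter_batches(batch_size=2):
            seen += len(batch)
    return seen
```

The loop in `train` does not need to know whether the rows were materialized up front or produced in windows, which is the point of having one iterator interface for both ingest modes.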