[Train] Hard deprecate BatchPredictor (ray-project#38209)
Upgrades the deprecation of BatchPredictor from emitting a warning to raising a DeprecationWarning exception.

Closes ray-project#37035
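
To illustrate what "hard deprecation" means here, the following minimal sketch contrasts a soft deprecation (emit a warning and continue) with a hard one (raise `DeprecationWarning` as an exception). The class names `SoftDeprecatedPredictor` and `HardDeprecatedPredictor`, the helper `try_construct`, and the message text are hypothetical and are not taken from the Ray source; the actual change in this commit may be structured differently.

```python
import warnings


class SoftDeprecatedPredictor:
    """Hypothetical class: still usable, but warns on construction."""

    def __init__(self):
        warnings.warn(
            "SoftDeprecatedPredictor is deprecated.",
            DeprecationWarning,
            stacklevel=2,
        )


class HardDeprecatedPredictor:
    """Hypothetical class: construction always fails."""

    def __init__(self):
        # DeprecationWarning is a subclass of Exception, so it can be raised.
        raise DeprecationWarning("HardDeprecatedPredictor is deprecated.")


def try_construct(cls):
    """Return 'ok' if cls() succeeds, else the name of the raised exception."""
    with warnings.catch_warnings():
        # Silence emitted warnings so only raised exceptions are observed.
        warnings.simplefilter("ignore", DeprecationWarning)
        try:
            cls()
        except DeprecationWarning as exc:
            return type(exc).__name__
    return "ok"


print(try_construct(SoftDeprecatedPredictor))
print(try_construct(HardDeprecatedPredictor))
```

Under a hard deprecation, callers can no longer ignore the message and must migrate. In this commit the documentation examples move to `Dataset.map_batches` with a callable wrapper class.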

---------

Signed-off-by: amogkam <[email protected]>
amogkam committed Aug 9, 2023
1 parent b2dd418 commit 0416259
Showing 36 changed files with 81 additions and 2,386 deletions.
9 changes: 0 additions & 9 deletions doc/BUILD
@@ -224,7 +224,6 @@ py_test_run_all_subdirectory(
     size = "large",
     include = ["source/train/doc_code/*.py"],
     exclude = [
-        "source/train/doc_code/predictors.py", # Too large
         "source/train/doc_code/hf_trainer.py", # Too large
     ],
     extra_srcs = [],
@@ -261,14 +260,6 @@ py_test_run_all_subdirectory(
 # Run GPU tests
 # --------------

-py_test_run_all_subdirectory(
-    size = "large",
-    include = ["source/train/doc_code/predictors.py"],
-    exclude = [],
-    extra_srcs = [],
-    tags = ["exclusive", "team:ml", "ray_air", "gpu"],
-)
-
 py_test(
     name = "pytorch_resnet_finetune",
     size = "large",
74 changes: 1 addition & 73 deletions doc/source/data/batch_inference.rst
@@ -506,76 +506,4 @@ The rest of the logic looks the same as in the `Quickstart <#quickstart>`_.
 .. testoutput::
     :options: +MOCK

-    {'predictions': 0.9969483017921448}
-
-Benchmarks
-----------
-
-Below we document key performance benchmarks for common batch prediction workloads.
-
-
-XGBoost Batch Prediction
-~~~~~~~~~~~~~~~~~~~~~~~~
-
-This task uses the BatchPredictor module to process different amounts of data
-using an XGBoost model.
-
-We test out the performance across different cluster sizes and data sizes.
-
-- `XGBoost Prediction Script`_
-- `XGBoost Cluster Configuration`_
-
-.. TODO: Add script for generating data and running the benchmark.
-.. list-table::
-
-    * - **Cluster Setup**
-      - **Data Size**
-      - **Performance**
-      - **Command**
-    * - 1 m5.4xlarge node (1 actor)
-      - 10 GB (26M rows)
-      - 275 s (94.5k rows/s)
-      - `python xgboost_benchmark.py --size 10GB`
-    * - 10 m5.4xlarge nodes (10 actors)
-      - 100 GB (260M rows)
-      - 331 s (786k rows/s)
-      - `python xgboost_benchmark.py --size 100GB`
-
-
-GPU image batch prediction
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-This task uses the BatchPredictor module to process different amounts of data
-using a Pytorch pre-trained ResNet model.
-
-We test out the performance across different cluster sizes and data sizes.
-
-- `GPU image batch prediction script`_
-- `GPU prediction small cluster configuration`_
-- `GPU prediction large cluster configuration`_
-
-.. list-table::
-
-    * - **Cluster Setup**
-      - **Data Size**
-      - **Performance**
-      - **Command**
-    * - 1 g4dn.8xlarge node
-      - 1 GB (1623 images)
-      - 46.12 s (35.19 images/sec)
-      - `python gpu_batch_inference.py --data-directory=1G-image-data-synthetic-raw --data-format=raw`
-    * - 1 g4dn.8xlarge node
-      - 20 GB (32460 images)
-      - 285.2 s (113.81 images/sec)
-      - `python gpu_batch_inference.py --data-directory=20G-image-data-synthetic-raw --data-format=raw`
-    * - 4 g4dn.12xlarge nodes
-      - 100 GB (162300 images)
-      - 304.01 s (533.86 images/sec)
-      - `python gpu_batch_inference.py --data-directory=100G-image-data-synthetic-raw --data-format=raw`
-
-.. _`XGBoost Prediction Script`: https://github.com/ray-project/ray/blob/a241e6a0f5a630d6ed5b84cce30c51963834d15b/release/air_tests/air_benchmarks/workloads/xgboost_benchmark.py#L63-L71
-.. _`XGBoost Cluster Configuration`: https://github.com/ray-project/ray/blob/a241e6a0f5a630d6ed5b84cce30c51963834d15b/release/air_tests/air_benchmarks/xgboost_compute_tpl.yaml#L6-L24
-.. _`GPU image batch prediction script`: https://github.com/ray-project/ray/blob/master/release/air_tests/air_benchmarks/workloads/gpu_batch_inference.py#L18-L49
-.. _`GPU prediction small cluster configuration`: https://github.com/ray-project/ray/blob/master/release/air_tests/air_benchmarks/compute_gpu_1_cpu_16_aws.yaml#L6-L15
-.. _`GPU prediction large cluster configuration`: https://github.com/ray-project/ray/blob/master/release/air_tests/air_benchmarks/compute_gpu_4x4_aws.yaml#L6-L15
+    {'predictions': 0.9969483017921448}
18 changes: 15 additions & 3 deletions doc/source/data/doc_code/preprocessors.py
@@ -84,13 +84,25 @@


 # __predictor_start__
-from ray.train.batch_predictor import BatchPredictor
 from ray.train.xgboost import XGBoostPredictor

 test_dataset = ray.data.from_items([{"x": x} for x in range(2, 32, 3)])

-batch_predictor = BatchPredictor.from_checkpoint(checkpoint, XGBoostPredictor)
-predicted_probabilities = batch_predictor.predict(test_dataset)
+
+class XGBoostPredictorWrapper:
+    def __init__(self, checkpoint):
+        self.predictor = XGBoostPredictor.from_checkpoint(checkpoint)
+
+    def __call__(self, batch):
+        return self.predictor.predict(batch)
+
+
+predicted_probabilities = test_dataset.map_batches(
+    XGBoostPredictorWrapper,
+    compute=ray.data.ActorPoolStrategy(size=2),
+    fn_constructor_kwargs={"checkpoint": checkpoint},
+    batch_format="pandas",
+)
 predicted_probabilities.show()
 # {'predictions': 0.09843720495700836}
 # {'predictions': 5.604666709899902}
2 changes: 1 addition & 1 deletion doc/source/data/preprocessors.rst
@@ -119,7 +119,7 @@ Predictor
 A ``Predictor`` can be constructed from a saved ``Checkpoint``. If the ``Checkpoint`` contains a ``Preprocessor``,
 then the ``Preprocessor`` calls ``transform_batch`` on input batches prior to performing inference.

-In the following example, we show the Batch Predictor flow.
+In the following example, we show the batch inference flow.

 .. literalinclude:: doc_code/preprocessors.py
     :language: python
27 changes: 0 additions & 27 deletions doc/source/ray-air/api/predictor.rst
@@ -48,33 +48,6 @@ Supported Data Formats
     predictor.Predictor.preferred_batch_format
     ~predictor.DataBatchType

-
-Batch Predictor
----------------
-
-Constructor Options
-~~~~~~~~~~~~~~~~~~~
-
-.. autosummary::
-    :toctree: doc/
-
-    batch_predictor.BatchPredictor
-
-.. autosummary::
-    :toctree: doc/
-
-    batch_predictor.BatchPredictor.from_checkpoint
-    batch_predictor.BatchPredictor.from_pandas_udf
-
-Batch Prediction API
-~~~~~~~~~~~~~~~~~~~~
-
-.. autosummary::
-    :toctree: doc/
-
-    batch_predictor.BatchPredictor.predict
-    batch_predictor.BatchPredictor.predict_pipelined
-
 .. _air_framework_predictors:

 Built-in Predictors for Library Integrations
130 changes: 1 addition & 129 deletions doc/source/ray-air/computer-vision.rst
@@ -10,8 +10,6 @@ This guide explains how to perform common computer vision tasks like:
 * `Reading image data`_
 * `Transforming images`_
 * `Training vision models`_
-* `Batch predicting images`_
-* `Serving vision models`_

 Reading image data
 ------------------
@@ -202,130 +200,4 @@ Training vision models
             :end-before: __tensorflow_trainer_stop__
             :dedent:

-For more information, check out :ref:`the Ray Train documentation <train-docs>`.
-
-Creating checkpoints
---------------------
-
-:class:`Checkpoints <ray.train.Checkpoint>` are required for batch inference and model
-serving. They contain model state and optionally a preprocessor.
-
-If you're going from training to prediction, don't create a new checkpoint.
-:meth:`Trainer.fit() <ray.train.trainer.BaseTrainer.fit>` returns a
-:class:`~ray.train.Result` object. Use
-:attr:`Result.checkpoint <ray.train.Result.checkpoint>` instead.
-
-.. tab-set::
-
-    .. tab-item:: Torch
-
-        To create a :class:`~ray.train.torch.TorchCheckpoint`, pass a Torch model and
-        the :class:`~ray.data.preprocessor.Preprocessor` you created in `Transforming images`_
-        to :meth:`TorchCheckpoint.from_model() <ray.train.torch.TorchCheckpoint.from_model>`.
-
-        .. literalinclude:: ./doc_code/computer_vision.py
-            :start-after: __torch_checkpoint_start__
-            :end-before: __torch_checkpoint_stop__
-            :dedent:
-
-    .. tab-item:: TensorFlow
-
-        To create a :class:`~ray.train.tensorflow.TensorflowCheckpoint`, pass a TensorFlow model and
-        the :class:`~ray.data.preprocessor.Preprocessor` you created in `Transforming images`_
-        to :meth:`TensorflowCheckpoint.from_model() <ray.train.tensorflow.TensorflowCheckpoint.from_model>`.
-
-        .. literalinclude:: ./doc_code/computer_vision.py
-            :start-after: __tensorflow_checkpoint_start__
-            :end-before: __tensorflow_checkpoint_stop__
-            :dedent:
-
-
-Batch predicting images
------------------------
-
-:class:`~ray.train.batch_predictor.BatchPredictor` lets you perform inference on large
-image datasets.
-
-.. tab-set::
-
-    .. tab-item:: Torch
-
-        To create a :class:`~ray.train.batch_predictor.BatchPredictor`, call
-        :meth:`BatchPredictor.from_checkpoint <ray.train.batch_predictor.BatchPredictor.from_checkpoint>` and pass the checkpoint
-        you created in `Creating checkpoints`_.
-
-        .. literalinclude:: ./doc_code/computer_vision.py
-            :start-after: __torch_batch_predictor_start__
-            :end-before: __torch_batch_predictor_stop__
-            :dedent:
-
-    .. tab-item:: TensorFlow
-
-        To create a :class:`~ray.train.batch_predictor.BatchPredictor`, call
-        :meth:`BatchPredictor.from_checkpoint <ray.train.batch_predictor.BatchPredictor.from_checkpoint>` and pass the checkpoint
-        you created in `Creating checkpoints`_.
-
-        .. literalinclude:: ./doc_code/computer_vision.py
-            :start-after: __tensorflow_batch_predictor_start__
-            :end-before: __tensorflow_batch_predictor_stop__
-            :dedent:
-
-Serving vision models
----------------------
-
-:class:`~ray.serve.Deployment` lets you
-deploy a model to an endpoint and make predictions over the Internet.
-
-Deployments use :ref:`HTTP adapters <serve-http>` to define how HTTP messages are converted to model
-inputs. For example, :func:`~ray.serve.http_adapters.json_to_ndarray` converts HTTP messages like this:
-
-.. code-block::
-
-    {"array": [[1, 2], [3, 4]]}
-
-To NumPy ndarrays like this:
-
-.. code-block::
-
-    array([[1., 2.],
-           [3., 4.]])
-
-.. tab-set::
-
-    .. tab-item:: Torch
-
-        To deploy a Torch model to an endpoint, create a predictor from the checkpoint you created in `Creating checkpoints`_
-        and serve via a Ray Serve deployment.
-
-        .. literalinclude:: ./doc_code/computer_vision.py
-            :start-after: __torch_serve_start__
-            :end-before: __torch_serve_stop__
-            :dedent:
-
-        Then, make a request to classify an image.
-
-        .. literalinclude:: ./doc_code/computer_vision.py
-            :start-after: __torch_online_predict_start__
-            :end-before: __torch_online_predict_stop__
-            :dedent:
-
-        For more in-depth examples, read about :ref:`Ray Serve <serve-getting-started>`.
-
-    .. tab-item:: TensorFlow
-
-        To deploy a TensorFlow model to an endpoint, use the checkpoint you created in `Creating checkpoints`_
-        to create a Ray Serve deployment serving the model.
-
-        .. literalinclude:: ./doc_code/computer_vision.py
-            :start-after: __tensorflow_serve_start__
-            :end-before: __tensorflow_serve_stop__
-            :dedent:
-
-        Then, make a request to classify an image.
-
-        .. literalinclude:: ./doc_code/computer_vision.py
-            :start-after: __tensorflow_online_predict_start__
-            :end-before: __tensorflow_online_predict_stop__
-            :dedent:
-
-        For more information, see :ref:`Ray Serve <serve-getting-started>`.
+For more information, check out :ref:`the Ray Train documentation <train-docs>`.
28 changes: 0 additions & 28 deletions doc/source/ray-air/doc_code/computer_vision.py
@@ -23,13 +23,11 @@ def test(*, framework: str, datasource: str):
         preprocessor, per_epoch_preprocessor = create_torch_preprocessors()
         train_torch_model(dataset, preprocessor, per_epoch_preprocessor)
         checkpoint = create_torch_checkpoint(preprocessor)
-        batch_predict_torch(dataset, checkpoint)
         online_predict_torch(checkpoint)
     if framework == "tensorflow":
         preprocessor, per_epoch_preprocessor = create_tensorflow_preprocessors()
         train_tensorflow_model(dataset, preprocessor, per_epoch_preprocessor)
         checkpoint = create_tensorflow_checkpoint(preprocessor)
-        batch_predict_tensorflow(dataset, checkpoint)
         online_predict_tensorflow(checkpoint)


@@ -327,32 +325,6 @@ def create_tensorflow_checkpoint(preprocessor):
     return checkpoint


-def batch_predict_torch(dataset, checkpoint):
-    # __torch_batch_predictor_start__
-    from ray.train.batch_predictor import BatchPredictor
-    from ray.train.torch import TorchPredictor
-
-    predictor = BatchPredictor.from_checkpoint(checkpoint, TorchPredictor)
-    predictor.predict(dataset, feature_columns=["image"], keep_columns=["label"])
-    # __torch_batch_predictor_stop__
-
-
-def batch_predict_tensorflow(dataset, checkpoint):
-    # __tensorflow_batch_predictor_start__
-    import tensorflow as tf
-
-    from ray.train.batch_predictor import BatchPredictor
-    from ray.train.tensorflow import TensorflowPredictor
-
-    predictor = BatchPredictor.from_checkpoint(
-        checkpoint,
-        TensorflowPredictor,
-        model_definition=tf.keras.applications.resnet50.ResNet50,
-    )
-    predictor.predict(dataset, feature_columns=["image"], keep_columns=["label"])
-    # __tensorflow_batch_predictor_stop__
-
-
 def online_predict_torch(checkpoint):
     # __torch_serve_start__
     from io import BytesIO