[Data] [Docs] Adding in references to explain how to use credentials with Ray Data (ray-project#44205)

Signed-off-by: Matthew Owen <[email protected]>
omatthew98 committed Mar 21, 2024
1 parent 4ee4a69 commit e1d7025
Showing 2 changed files with 26 additions and 3 deletions.
14 changes: 13 additions & 1 deletion doc/source/data/loading-data.rst
@@ -209,6 +209,10 @@ To read formats other than Parquet, see the :ref:`Input/Output reference <input-output>`.
petal.width double
variety string

Ray Data relies on PyArrow for authentication with Amazon S3. For more on how to configure
your credentials to be compatible with PyArrow, see their
`S3 Filesystem docs <https://arrow.apache.org/docs/python/filesystems.html#s3>`_.
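
As a minimal sketch, you can also pass an explicit S3 filesystem object with credentials to the read call; the bucket name, key values, and region below are hypothetical placeholders:

import ray
from pyarrow import fs

# Hypothetical credentials; if you omit them, PyArrow falls back to
# the standard AWS credential chain (environment variables,
# ~/.aws/credentials, or instance metadata).
filesystem = fs.S3FileSystem(
    access_key="YOUR_ACCESS_KEY_ID",
    secret_key="YOUR_SECRET_ACCESS_KEY",
    region="us-west-2",
)
ds = ray.data.read_parquet(
    "s3://my-bucket/iris.parquet",
    filesystem=filesystem,
)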

.. tab-item:: GCS

To read files from Google Cloud Storage, install the
@@ -227,7 +231,7 @@ To read formats other than Parquet, see the :ref:`Input/Output reference <input-output>`.

filesystem = gcsfs.GCSFileSystem(project="my-google-project")
ds = ray.data.read_parquet(
"s3:https://anonymous@ray-example-data/iris.parquet",
"gcs:https://anonymous@ray-example-data/iris.parquet",
filesystem=filesystem
)

@@ -243,6 +247,10 @@ To read formats other than Parquet, see the :ref:`Input/Output reference <input-output>`.
petal.width double
variety string

Ray Data relies on PyArrow for authentication with Google Cloud Storage. For more on how
to configure your credentials to be compatible with PyArrow, see their
`GCS Filesystem docs <https://arrow.apache.org/docs/python/filesystems.html#google-cloud-storage-file-system>`_.
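
As a minimal sketch, gcsfs can also take an explicit service-account key file; the project name and key path below are hypothetical:

import gcsfs
import ray

# Hypothetical key path; if token is omitted, gcsfs uses your
# application-default credentials.
filesystem = gcsfs.GCSFileSystem(
    project="my-google-project",
    token="/path/to/service-account-key.json",
)
ds = ray.data.read_parquet(
    "gcs://my-bucket/iris.parquet",
    filesystem=filesystem,
)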

.. tab-item:: ABS

To read files from Azure Blob Storage, install the
@@ -277,6 +285,10 @@ To read formats other than Parquet, see the :ref:`Input/Output reference <input-output>`.
petal.width double
variety string

Ray Data relies on PyArrow for authentication with Azure Blob Storage. For more on how
to configure your credentials to be compatible with PyArrow, see their
`fsspec-compatible filesystems docs <https://arrow.apache.org/docs/python/filesystems.html#using-fsspec-compatible-filesystems-with-arrow>`_.
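
As a minimal sketch, adlfs accepts explicit account credentials; the account name, key, and container below are hypothetical:

import adlfs
import ray

# Hypothetical credentials; adlfs alternatively accepts a SAS token
# or a connection string.
filesystem = adlfs.AzureBlobFileSystem(
    account_name="my-storage-account",
    account_key="YOUR_ACCOUNT_KEY",
)
ds = ray.data.read_parquet(
    "az://my-container/iris.parquet",
    filesystem=filesystem,
)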

Reading files from NFS
~~~~~~~~~~~~~~~~~~~~~~

15 changes: 13 additions & 2 deletions doc/source/data/saving-data.rst
@@ -47,6 +47,8 @@ with your cloud service provider. Then, call a method like
:meth:`Dataset.write_parquet <ray.data.Dataset.write_parquet>` and specify a URI with
the appropriate scheme. The URI can point to buckets or folders.

To write data to formats other than Parquet, read the :ref:`Input/Output reference <input-output>`.

.. tab-set::

.. tab-item:: S3
@@ -62,6 +64,10 @@

ds.write_parquet("s3:https://my-bucket/my-folder")

Ray Data relies on PyArrow for authentication with Amazon S3. For more on how to configure
your credentials to be compatible with PyArrow, see their
`S3 Filesystem docs <https://arrow.apache.org/docs/python/filesystems.html#s3>`_.
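
As a minimal sketch, writes can also go through an explicit S3 filesystem object; the bucket and key values below are hypothetical placeholders:

import ray
from pyarrow import fs

ds = ray.data.from_items([{"x": 1}, {"x": 2}])

# Hypothetical credentials; omit them to use the default AWS
# credential chain instead.
filesystem = fs.S3FileSystem(
    access_key="YOUR_ACCESS_KEY_ID",
    secret_key="YOUR_SECRET_ACCESS_KEY",
)
ds.write_parquet("s3://my-bucket/my-folder", filesystem=filesystem)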

.. tab-item:: GCS

To save data to Google Cloud Storage, install the
@@ -83,6 +89,10 @@
filesystem = gcsfs.GCSFileSystem(project="my-google-project")
ds.write_parquet("gcs:https://my-bucket/my-folder", filesystem=filesystem)

Ray Data relies on PyArrow for authentication with Google Cloud Storage. For more on how
to configure your credentials to be compatible with PyArrow, see their
`GCS Filesystem docs <https://arrow.apache.org/docs/python/filesystems.html#google-cloud-storage-file-system>`_.
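
As a minimal sketch, the same gcsfs filesystem works for writes; the key path below is hypothetical:

import gcsfs
import ray

ds = ray.data.from_items([{"x": 1}, {"x": 2}])

filesystem = gcsfs.GCSFileSystem(
    project="my-google-project",
    token="/path/to/service-account-key.json",  # hypothetical key file
)
ds.write_parquet("gcs://my-bucket/my-folder", filesystem=filesystem)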

.. tab-item:: ABS

To save data to Azure Blob Storage, install the
@@ -104,8 +114,9 @@
filesystem = adlfs.AzureBlobFileSystem(account_name="azureopendatastorage")
ds.write_parquet("az:https://my-bucket/my-folder", filesystem=filesystem)

-To write data to formats other than Parquet, read the
-:ref:`Input/Output reference <input-output>`.
+Ray Data relies on PyArrow for authentication with Azure Blob Storage. For more on how
+to configure your credentials to be compatible with PyArrow, see their
+`fsspec-compatible filesystems docs <https://arrow.apache.org/docs/python/filesystems.html#using-fsspec-compatible-filesystems-with-arrow>`_.
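
As a minimal sketch, the same adlfs filesystem works for writes; the account name and key below are hypothetical:

import adlfs
import ray

ds = ray.data.from_items([{"x": 1}, {"x": 2}])

filesystem = adlfs.AzureBlobFileSystem(
    account_name="my-storage-account",
    account_key="YOUR_ACCOUNT_KEY",  # hypothetical credentials
)
ds.write_parquet("az://my-container/my-folder", filesystem=filesystem)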

Writing data to NFS
~~~~~~~~~~~~~~~~~~~
