[hotfix][docs][s3] Clarify wording of S3 Filesystem support
This closes apache#12043
rmetzger committed May 8, 2020
1 parent 4959ebf commit 7c5ac35
Showing 1 changed file with 8 additions and 3 deletions.
11 changes: 8 additions & 3 deletions docs/ops/filesystems/s3.md
@@ -60,10 +60,15 @@ Flink provides two file systems to talk to Amazon S3, `flink-s3-fs-presto` and `
 Both implementations are self-contained with no dependency footprint, so there is no need to add Hadoop to the classpath to use them.

 - `flink-s3-fs-presto`, registered under the scheme *s3://* and *s3p://*, is based on code from the [Presto project](https://prestodb.io/).
-You can configure it the same way you can [configure the Presto file system](https://prestodb.io/docs/0.187/connector/hive.html#amazon-s3-configuration) by placing adding the configurations to your `flink-conf.yaml`. Presto is the recommended file system for checkpointing to S3.
+You can configure it using [the same configuration keys as the Presto file system](https://prestodb.io/docs/0.187/connector/hive.html#amazon-s3-configuration), by adding the configurations to your `flink-conf.yaml`. The Presto S3 implementation is the recommended file system for checkpointing to S3.

 - `flink-s3-fs-hadoop`, registered under *s3://* and *s3a://*, based on code from the [Hadoop Project](https://hadoop.apache.org/).
-The file system can be [configured exactly like Hadoop's s3a](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#S3A) by placing adding the configurations to your `flink-conf.yaml`. It is the only S3 file system with support for the [StreamingFileSink]({{ site.baseurl}}/dev/connectors/streamfile_sink.html).
+The file system can be [configured using Hadoop's s3a configuration keys](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#S3A) by adding the configurations to your `flink-conf.yaml`.
+
+For example, Hadoop has a `fs.s3a.connection.maximum` configuration key. If you want to change it, you need to put `s3.connection.maximum: xyz` to the `flink-conf.yaml`. Flink will internally translate this back to `fs.s3a.connection.maximum`. There is no need to pass configuration parameters using Hadoop's XML configuration files.
+
+It is the only S3 file system with support for the [StreamingFileSink]({{ site.baseurl}}/dev/connectors/streamfile_sink.html).
+

 Both `flink-s3-fs-hadoop` and `flink-s3-fs-presto` register default FileSystem
 wrappers for URIs with the *s3://* scheme, `flink-s3-fs-hadoop` also registers
@@ -111,7 +116,7 @@ s3.endpoint: your-endpoint-hostname

 ## Configure Path Style Access

-Some of the S3 compliant object stores might not have virtual host style addressing enabled by default. In such cases, you will have to provide the property to enable path style access in `flink-conf.yaml`.
+Some S3 compliant object stores might not have virtual host style addressing enabled by default. In such cases, you will have to provide the property to enable path style access in `flink-conf.yaml`.

 {% highlight yaml %}
 s3.path.style.access: true
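
The key translation described in the first hunk (an `s3.`-prefixed key in `flink-conf.yaml` mapped back to Hadoop's `fs.s3a.` key) can be sketched roughly as follows. This is a simplified illustration in Python, not Flink's actual implementation; the function name `to_hadoop_keys` and the sample values are hypothetical.

```python
# Illustrative only: a simplified model of the key translation described in
# the docs change, NOT Flink's actual code.
FLINK_PREFIX = "s3."       # prefix used in flink-conf.yaml, per the docs text
HADOOP_PREFIX = "fs.s3a."  # prefix used by Hadoop's s3a configuration keys


def to_hadoop_keys(flink_conf: dict) -> dict:
    """Map every `s3.*` entry to its `fs.s3a.*` counterpart; skip other keys."""
    return {
        HADOOP_PREFIX + key[len(FLINK_PREFIX):]: value
        for key, value in flink_conf.items()
        if key.startswith(FLINK_PREFIX)
    }


# Hypothetical contents of a flink-conf.yaml, already parsed into a dict.
conf = {
    "s3.connection.maximum": "64",         # assumed value, for illustration
    "s3.path.style.access": "true",
    "taskmanager.numberOfTaskSlots": "4",  # unrelated key, not translated
}

hadoop_conf = to_hadoop_keys(conf)
print(hadoop_conf)
```

Under this model, `s3.connection.maximum` ends up as `fs.s3a.connection.maximum`, which matches the example in the docs text, and no Hadoop XML configuration file is involved.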