Clarify default values behaviour for metrics in GCS
JTaky committed May 8, 2024
1 parent 07f00e1 commit 617b14b
Showing 3 changed files with 38 additions and 38 deletions.
24 changes: 12 additions & 12 deletions docs/content.zh/docs/deployment/filesystems/gcs.md
@@ -79,18 +79,18 @@ You can also set `gcs-connector` options directly in the Hadoop `core-site.xml`

`flink-gs-fs-hadoop` can also be configured by setting the following options in [Flink configuration file]({{< ref "docs/deployment/config#flink-配置文件" >}}):

| Key | Description |
|---------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| gs.writer.temporary.bucket.name | Set this property to choose a bucket to hold temporary blobs for in-progress writes via `RecoverableWriter`. If this property is not set, temporary blobs will be written to same bucket as the final file being written. In either case, temporary blobs are written with the prefix `.inprogress/`. <br><br> It is recommended to choose a separate bucket in order to [assign it a TTL](https://cloud.google.com/storage/docs/lifecycle), to provide a mechanism to clean up orphaned blobs that can occur when restoring from check/savepoints.<br><br>If you do use a separate bucket with a TTL for temporary blobs, attempts to restart jobs from check/savepoints after the TTL interval expires may fail.
| gs.writer.chunk.size | Set this property to [set the chunk size](https://cloud.google.com/java/docs/reference/google-cloud-core/latest/com.google.cloud.WriteChannel#com_google_cloud_WriteChannel_setChunkSize_int_) for writes via `RecoverableWriter`. <br><br>If not set, a Google-determined default chunk size will be used. |
| gs.filesink.entropy.enabled | Set this property to improve performance due to hotspotting issues on GCS. This option defines whether to enable entropy injection in filesink gcs path. If this is enabled, entropy in the form of temporary object id will be injected in beginning of the gcs path of the temporary objects. The final object path remains unchanged. |
| gs.http.connect-timeout | Set this property to [set the connection timeout](https://cloud.google.com/java/docs/reference/google-cloud-core/latest/com.google.cloud.http.HttpTransportOptions.Builder#com_google_cloud_http_HttpTransportOptions_Builder_setConnectTimeout_int_) for java-storage client. |
| gs.http.read-timeout | Set this property to [set the content read timeout](https://cloud.google.com/java/docs/reference/google-cloud-core/latest/com.google.cloud.http.HttpTransportOptions.Builder#com_google_cloud_http_HttpTransportOptions_Builder_setReadTimeout_int_) from connection established via java-storage client. |
| gs.retry.max-attempt | Set this property to [define the maximum number of retry attempts](https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getMaxAttempts__) to perform. |
| gs.retry.init-rpc-timeout | Set this property to [set the timeout](https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getInitialRpcTimeout__) for the initial RPC. Subsequent calls will use this value adjusted according to the gs.retry.rpc-timeout-multiplier. |
| gs.retry.rpc-timeout-multiplier | Set this property to [controls the change in RPC timeout](https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getRpcTimeoutMultiplier__). The timeout of the previous call is multiplied by the RpcTimeoutMultiplier to calculate the timeout for the next call. |
| gs.retry.max-rpc-timeout | Set this property to [put a limit](https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getMaxRpcTimeout__) on the value of the RPC timeout, so that the max rpc timeout can't increase the RPC timeout higher than this amount. |
| gs.retry.total-timeout | Set this property to [change the total duration](https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getTotalTimeout__) during which retries could be attempted. |
| Key | Description |
|---------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| gs.writer.temporary.bucket.name | Set this property to choose a bucket to hold temporary blobs for in-progress writes via `RecoverableWriter`. If this property is not set, temporary blobs will be written to same bucket as the final file being written. In either case, temporary blobs are written with the prefix `.inprogress/`. <br><br> It is recommended to choose a separate bucket in order to [assign it a TTL](https://cloud.google.com/storage/docs/lifecycle), to provide a mechanism to clean up orphaned blobs that can occur when restoring from check/savepoints.<br><br>If you do use a separate bucket with a TTL for temporary blobs, attempts to restart jobs from check/savepoints after the TTL interval expires may fail. |
| gs.writer.chunk.size | Set this property to [set the chunk size](https://cloud.google.com/java/docs/reference/google-cloud-core/latest/com.google.cloud.WriteChannel#com_google_cloud_WriteChannel_setChunkSize_int_) for writes via `RecoverableWriter`. <br><br>If not set, a Google-determined default chunk size will be used. |
| gs.filesink.entropy.enabled | Set this property to improve performance due to hotspotting issues on GCS. This option defines whether to enable entropy injection in filesink gcs path. If this is enabled, entropy in the form of temporary object id will be injected in beginning of the gcs path of the temporary objects. The final object path remains unchanged. |
| gs.http.connect-timeout | Set this property to [set the connection timeout](https://cloud.google.com/java/docs/reference/google-cloud-core/latest/com.google.cloud.http.HttpTransportOptions.Builder#com_google_cloud_http_HttpTransportOptions_Builder_setConnectTimeout_int_) for java-storage client. GCS default will be used if not configured. |
| gs.http.read-timeout | Set this property to [set the content read timeout](https://cloud.google.com/java/docs/reference/google-cloud-core/latest/com.google.cloud.http.HttpTransportOptions.Builder#com_google_cloud_http_HttpTransportOptions_Builder_setReadTimeout_int_) from connection established via java-storage client. GCS default will be used if not configured. |
| gs.retry.max-attempt | Set this property to [define the maximum number of retry attempts](https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getMaxAttempts__) to perform. GCS default will be used if not configured. |
| gs.retry.init-rpc-timeout | Set this property to [set the timeout](https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getInitialRpcTimeout__) for the initial RPC. Subsequent calls will use this value adjusted according to the gs.retry.rpc-timeout-multiplier. GCS default will be used if not configured. |
| gs.retry.rpc-timeout-multiplier | Set this property to [controls the change in RPC timeout](https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getRpcTimeoutMultiplier__). The timeout of the previous call is multiplied by the RpcTimeoutMultiplier to calculate the timeout for the next call. GCS default will be used if not configured. |
| gs.retry.max-rpc-timeout | Set this property to [put a limit](https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getMaxRpcTimeout__) on the value of the RPC timeout, so that the max rpc timeout can't increase the RPC timeout higher than this amount. GCS default will be used if not configured. |
| gs.retry.total-timeout | Set this property to [change the total duration](https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getTotalTimeout__) during which retries could be attempted. GCS default will be used if not configured. |

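The interaction between `gs.retry.init-rpc-timeout`, `gs.retry.rpc-timeout-multiplier` and `gs.retry.max-rpc-timeout` described in the table can be illustrated with a small sketch. This is a simplified model of the description only, not the client library's actual retry code: the function name `rpc_timeouts` and the sample values are hypothetical, timeouts are plain seconds, and the effect of `gs.retry.total-timeout` is deliberately left out.

```python
def rpc_timeouts(init_timeout_s, multiplier, max_timeout_s, max_attempts):
    """Return the per-attempt RPC timeout (in seconds) for each attempt.

    Simplified model: the first attempt uses the initial timeout, and each
    later attempt multiplies the previous timeout by the multiplier, capped
    at the maximum RPC timeout. The total-timeout budget is not modelled.
    """
    timeouts = []
    timeout = init_timeout_s
    for _ in range(max_attempts):
        timeouts.append(timeout)
        # Grow the timeout for the next attempt, but never beyond the cap.
        timeout = min(timeout * multiplier, max_timeout_s)
    return timeouts


# Hypothetical values: 10 s initial timeout, multiplier 2, 60 s cap, 5 attempts.
print(rpc_timeouts(10.0, 2.0, 60.0, 5))  # [10.0, 20.0, 40.0, 60.0, 60.0]
```

With a multiplier of 1 the timeout stays constant across attempts, so the cap only matters when the multiplier is greater than 1.
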
### Authentication to access GCS
