
[FLINK-22518][docs-zh] Translate the documents of High Availability into Chinese (#16084)

This fixes #16084.
movesan committed Jun 18, 2021
1 parent 3ae6801 commit fc73b3f
Showing 4 changed files with 90 additions and 108 deletions.
4 changes: 2 additions & 2 deletions docs/content.zh/docs/deployment/ha/_index.md
@@ -1,5 +1,5 @@
---
title: High Availablity
title: High Availability
bookCollapseSection: true
weight: 6
---
@@ -20,4 +20,4 @@ software distributed under the License is distributed on an
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
-->
49 changes: 22 additions & 27 deletions docs/content.zh/docs/deployment/ha/kubernetes_ha.md
@@ -1,5 +1,5 @@
---
title: Kubernetes HA Services
title: Kubernetes High Availability Services
weight: 3
type: docs
aliases:
@@ -24,52 +24,50 @@ specific language governing permissions and limitations
under the License.
-->

# Kubernetes HA Services
# Kubernetes High Availability Services

Flink's Kubernetes HA services use [Kubernetes](https://kubernetes.io/) for high availability services.
Flink's Kubernetes HA services use [Kubernetes](https://kubernetes.io/) to provide high availability services.

Kubernetes high availability services can only be used when deploying to Kubernetes.
Consequently, they can be configured when using [standalone Flink on Kubernetes]({{< ref "docs/deployment/resource-providers/standalone/kubernetes" >}}) or the [native Kubernetes integration]({{< ref "docs/deployment/resource-providers/native_kubernetes" >}})
Kubernetes high availability services can only be used when deploying to Kubernetes. Consequently, they can be configured when using either [standalone Flink on Kubernetes]({{< ref "docs/deployment/resource-providers/standalone/kubernetes" >}}) or the [native Kubernetes integration]({{< ref "docs/deployment/resource-providers/native_kubernetes" >}}).

## Prerequisites
## Prerequisites

In order to use Flink's Kubernetes HA services you must fulfill the following prerequisites:
In order to use Flink's Kubernetes HA services, you must fulfill the following prerequisites:

- Kubernetes >= 1.9.
- Service account with permissions to create, edit, delete ConfigMaps.
Take a look at how to configure a service account for [Flink's native Kubernetes integration]({{< ref "docs/deployment/resource-providers/native_kubernetes" >}}#rbac) and [standalone Flink on Kubernetes]({{< ref "docs/deployment/resource-providers/standalone/kubernetes" >}}#kubernetes-high-availability-services) for more information.
- A service account with permissions to create, edit, and delete ConfigMaps. For more information, see how to configure a service account for [Flink's native Kubernetes integration]({{< ref "docs/deployment/resource-providers/native_kubernetes" >}}#rbac) and for [standalone Flink on Kubernetes]({{< ref "docs/deployment/resource-providers/standalone/kubernetes" >}}#kubernetes-high-availability-services). A minimal RBAC sketch is shown after this list.
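As a rough illustration of the ConfigMap permissions mentioned above, the following is a minimal RBAC `Role` sketch. It is not taken from the Flink documentation; the role name, namespace, and exact verb list are assumptions, so adapt them to your setup and bind the role to the service account used by your Flink pods.

```yaml
# Hypothetical RBAC Role granting the ConfigMap access that Flink's Kubernetes
# HA services rely on. Name, namespace, and verbs are illustrative assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: flink-ha-configmap-access   # placeholder name
  namespace: default                # adjust to the namespace of your Flink deployment
rules:
  - apiGroups: [""]                 # core API group, where ConfigMaps live
    resources: ["configmaps"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
```

Bind this role to the service account via a `RoleBinding`, or follow the approach described on the linked RBAC pages.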


## Configuration
## Configuration

In order to start an HA-cluster you have to configure the following configuration keys:
In order to start an HA cluster, you have to set the following configuration keys:

- [high-availability]({{< ref "docs/deployment/config" >}}#high-availability-1) (required):
The `high-availability` option has to be set to `KubernetesHaServicesFactory`.
- [high-availability]({{< ref "docs/deployment/config" >}}#high-availability-1) (required):
The `high-availability` option has to be set to `KubernetesHaServicesFactory`.

```yaml
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
```
- [high-availability.storageDir]({{< ref "docs/deployment/config" >}}#high-availability-storagedir) (required):
JobManager metadata is persisted in the file system `high-availability.storageDir` and only a pointer to this state is stored in Kubernetes.
- [high-availability.storageDir]({{< ref "docs/deployment/config" >}}#high-availability-storagedir) (required):
JobManager metadata is persisted to the file system path configured by `high-availability.storageDir`, and only a pointer to this state is stored in Kubernetes.

```yaml
high-availability.storageDir: s3:///flink/recovery
```

The `storageDir` stores all metadata needed to recover a JobManager failure.
- [kubernetes.cluster-id]({{< ref "docs/deployment/config" >}}#kubernetes-cluster-id) (required):
In order to identify the Flink cluster, you have to specify a `kubernetes.cluster-id`.
The `storageDir` stores all metadata needed to recover from a JobManager failure.

- [kubernetes.cluster-id]({{< ref "docs/deployment/config" >}}#kubernetes-cluster-id) (required):
In order to identify the Flink cluster, you have to specify a `kubernetes.cluster-id`.

```yaml
kubernetes.cluster-id: cluster1337
```

### Example configuration
### Example configuration

Configure high availability mode in `conf/flink-conf.yaml`:
Configure high availability mode in `conf/flink-conf.yaml`:

```yaml
kubernetes.cluster-id: <cluster-id>
@@ -79,11 +77,8 @@ high-availability.storageDir: hdfs:///flink/recovery

{{< top >}}
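A filled-in version of this minimal configuration might look like the following sketch, combining the three required keys described above; the cluster id and the HDFS storage path are placeholder values, not prescribed by the documentation.

```yaml
# Minimal Kubernetes HA configuration sketch; the cluster id and path are placeholders.
kubernetes.cluster-id: my-flink-cluster
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: hdfs:///flink/recovery
```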

## High availability data clean up
## High availability data clean up

To keep HA data while restarting the Flink cluster, simply delete the deployment (via `kubectl delete deployment <cluster-id>`).
All the Flink cluster related resources will be deleted (e.g. JobManager Deployment, TaskManager pods, services, Flink conf ConfigMap).
HA related ConfigMaps will be retained because they do not set the owner reference.
When restarting the cluster, all previously running jobs will be recovered and restarted from the latest successful checkpoint.
To keep HA data while restarting the Flink cluster, simply delete the deployment (via `kubectl delete deployment <cluster-id>`). All the Flink cluster related resources will be deleted (e.g. JobManager Deployment, TaskManager pods, services, Flink conf ConfigMap). HA related ConfigMaps will be retained because they do not set the owner reference. When restarting the cluster, all previously running jobs will be recovered and restarted from the latest successful checkpoint.

{{< top >}}
{{< top >}}
54 changes: 24 additions & 30 deletions docs/content.zh/docs/deployment/ha/overview.md
@@ -26,56 +26,50 @@ specific language governing permissions and limitations
under the License.
-->

# High Availability
# High Availability

JobManager High Availability (HA) hardens a Flink cluster against JobManager failures.
This feature ensures that a Flink cluster will always continue executing your submitted jobs.
JobManager High Availability (HA) hardens a Flink cluster against JobManager failures.
This feature ensures that a Flink cluster will always continue executing your submitted jobs.

## JobManager High Availability
## JobManager High Availability

The JobManager coordinates every Flink deployment.
It is responsible for both *scheduling* and *resource management*.
The JobManager coordinates every Flink deployment. It is responsible for both *scheduling* and *resource management*.

By default, there is a single JobManager instance per Flink cluster.
This creates a *single point of failure* (SPOF): if the JobManager crashes, no new programs can be submitted and running programs fail.
By default, there is a single JobManager instance per Flink cluster. This creates a *single point of failure* (SPOF): if the JobManager crashes, no new programs can be submitted and running programs fail.

With JobManager High Availability, you can recover from JobManager failures and thereby eliminate the *SPOF*.
You can configure high availability for every cluster deployment.
See the [list of available high availability services](#high-availability-services) for more information.
With JobManager High Availability, you can recover from JobManager failures and thereby eliminate the single point of failure. You can configure high availability for every cluster deployment.
For more information, see the [high availability services](#high-availability-services).

### How to make a cluster highly available
### How to make a cluster highly available

The general idea of JobManager High Availability is that there is a *single leading JobManager* at any time and *multiple standby JobManagers* to take over leadership in case the leader fails.
This guarantees that there is *no single point of failure* and programs can make progress as soon as a standby JobManager has taken leadership.
The general idea of JobManager High Availability is that there is a *single leading JobManager* at any time, and *multiple standby JobManagers* take over leadership if the leader fails. This guarantees that there is *no single point of failure*, and programs can continue running as soon as a standby JobManager has taken over leadership.

As an example, consider the following setup with three JobManager instances:
The following is an example setup with three JobManager instances:

{{< img src="/fig/jobmanager_ha_overview.png" class="center" >}}

Flink's [high availability services](#high-availability-services) encapsulate the required services to make everything work:
* **Leader election**: Selecting a single leader out of a pool of `n` candidates
* **Service discovery**: Retrieving the address of the current leader
* **State persistence**: Persisting state which is required for the successor to resume the job execution (JobGraphs, user code jars, completed checkpoints)
Flink[高可用服务](#high-availability-services) 封装了所需的服务,使一切可以正常工作:
* **领导者选举**:从 `n` 个候选者中选出一个领导者
* **服务发现**:检索当前领导者的地址
* **状态持久化**:继承程序恢复作业所需的持久化状态(JobGraphs、用户代码jar、已完成的检查点)

{{< top >}}

## High Availability Services
<a name="high-availability-services" />

Flink ships with two high availability service implementations:
## High Availability Services

* [ZooKeeper]({{< ref "docs/deployment/ha/zookeeper_ha" >}}):
ZooKeeper HA services can be used with every Flink cluster deployment.
They require a running ZooKeeper quorum.
Flink ships with two high availability service implementations:

* [Kubernetes]({{< ref "docs/deployment/ha/kubernetes_ha" >}}):
Kubernetes HA services only work when running on Kubernetes.

* [ZooKeeper]({{< ref "docs/deployment/ha/zookeeper_ha" >}}): ZooKeeper HA services can be used with every Flink cluster deployment. They require a running ZooKeeper quorum (a minimal configuration sketch follows this list).

* [Kubernetes]({{< ref "docs/deployment/ha/kubernetes_ha" >}}): Kubernetes HA services only work when running on Kubernetes.
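As a rough sketch of what the ZooKeeper variant mentioned above involves, the `conf/flink-conf.yaml` keys typically look like the following; the quorum addresses and storage path are placeholders, and the full list of options is documented on the linked ZooKeeper HA page.

```yaml
# Hypothetical minimal ZooKeeper HA configuration; addresses and paths are placeholders.
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181
high-availability.storageDir: hdfs:///flink/recovery
```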

{{< top >}}

## High Availability data lifecycle
## High Availability data lifecycle

In order to recover submitted jobs, Flink persists metadata and the job artifacts.
The HA data will be kept until the respective job either succeeds, is cancelled or fails terminally.
Once this happens, all the HA data, including the metadata stored in the HA services, will be deleted.
In order to recover submitted jobs, Flink persists metadata and the job artifacts. The HA data will be kept until the respective job either succeeds, is cancelled, or fails terminally. Once this happens, all the HA data, including the metadata stored in the HA services, will be deleted.

{{< top >}}