[FLINK-29668][Gelly] Remove Gelly #21096

Merged: 3 commits, Oct 19, 2022
14 changes: 14 additions & 0 deletions docs/content.zh/docs/dev/dataset/cluster_execution.md
@@ -24,6 +24,20 @@ specific language governing permissions and limitations
under the License.
-->

{{< hint warning >}}
Starting with Flink 1.12, the DataSet API has been soft-deprecated.

We recommend that you use the [Table API and SQL]({{< ref "docs/dev/table/overview" >}}) to run efficient
batch pipelines in a fully unified API. The Table API is well integrated with common batch connectors and
catalogs.

Alternatively, you can use the DataStream API with `BATCH` [execution mode]({{< ref "docs/dev/datastream/execution_mode" >}}).
The linked section also outlines cases where it still makes sense to use the DataSet API, but those cases
will become rarer as development progresses, and the DataSet API will eventually be removed. Please also
see [FLIP-131](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741) for
background information on this decision.
{{< /hint >}}

# Cluster Execution


14 changes: 14 additions & 0 deletions docs/content.zh/docs/dev/dataset/examples.md
@@ -24,6 +24,20 @@ specific language governing permissions and limitations
under the License.
-->

{{< hint warning >}}
Starting with Flink 1.12, the DataSet API has been soft-deprecated.

We recommend that you use the [Table API and SQL]({{< ref "docs/dev/table/overview" >}}) to run efficient
batch pipelines in a fully unified API. The Table API is well integrated with common batch connectors and
catalogs.

Alternatively, you can use the DataStream API with `BATCH` [execution mode]({{< ref "docs/dev/datastream/execution_mode" >}}).
The linked section also outlines cases where it still makes sense to use the DataSet API, but those cases
will become rarer as development progresses, and the DataSet API will eventually be removed. Please also
see [FLIP-131](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741) for
background information on this decision.
{{< /hint >}}

# Batch Examples

The following examples demonstrate Flink applications ranging from a simple WordCount to graph algorithms. The code samples illustrate the use of [Flink's DataSet API]({{< ref "docs/dev/dataset/overview" >}}).
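As a language-neutral sketch of the WordCount logic referenced above (plain Python, not the Flink API; the real example composes Flink's `flatMap`, `groupBy`, and `sum` operators):

```python
from collections import Counter

def word_count(lines):
    """Split lines into lowercase words and count occurrences,
    mirroring the flatMap -> groupBy -> sum structure of the
    Flink WordCount example."""
    words = (word.lower() for line in lines for word in line.split())
    return Counter(words)

counts = word_count(["To be or not to be"])
# counts["to"] == 2, counts["be"] == 2
```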
14 changes: 14 additions & 0 deletions docs/content.zh/docs/dev/dataset/hadoop_compatibility.md
@@ -25,6 +25,20 @@ specific language governing permissions and limitations
under the License.
-->

{{< hint warning >}}
Starting with Flink 1.12, the DataSet API has been soft-deprecated.

We recommend that you use the [Table API and SQL]({{< ref "docs/dev/table/overview" >}}) to run efficient
batch pipelines in a fully unified API. The Table API is well integrated with common batch connectors and
catalogs.

Alternatively, you can use the DataStream API with `BATCH` [execution mode]({{< ref "docs/dev/datastream/execution_mode" >}}).
The linked section also outlines cases where it still makes sense to use the DataSet API, but those cases
will become rarer as development progresses, and the DataSet API will eventually be removed. Please also
see [FLIP-131](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741) for
background information on this decision.
{{< /hint >}}

# Hadoop Compatibility

Flink is compatible with Apache Hadoop MapReduce interfaces and therefore allows
21 changes: 20 additions & 1 deletion docs/content.zh/docs/dev/dataset/iterations.md
@@ -24,13 +24,32 @@ specific language governing permissions and limitations
under the License.
-->

{{< hint warning >}}
Starting with Flink 1.12, the DataSet API has been soft-deprecated.

For iterative algorithms, we recommend that you check out [Flink ML Iterations](https://nightlies.apache.org/flink/flink-ml-docs-stable/docs/development/iteration/)
as a potential replacement.

We recommend that you use the [Table API and SQL]({{< ref "docs/dev/table/overview" >}}) to run efficient
batch pipelines in a fully unified API. The Table API is well integrated with common batch connectors and
catalogs.

Alternatively, you can use the DataStream API with `BATCH` [execution mode]({{< ref "docs/dev/datastream/execution_mode" >}}).
The linked section also outlines cases where it still makes sense to use the DataSet API, but those cases
will become rarer as development progresses, and the DataSet API will eventually be removed. Please also
see [FLIP-131](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741) for
background information on this decision.
{{< /hint >}}

# Iterations

Iterative algorithms occur in many domains of data analysis, such as *machine learning* or *graph analysis*. Such algorithms are crucial to realizing the promise of Big Data: extracting meaningful information out of your data. With increasing interest in running these kinds of algorithms on very large data sets, there is a need to execute iterations in a massively parallel fashion.

Flink programs implement iterative algorithms by defining a **step function** and embedding it into a special iteration operator. There are two variants of this operator: **Iterate** and **Delta Iterate**. Both operators repeatedly invoke the step function on the current iteration state until a certain termination condition is reached.

Here, we provide background on both operator variants and outline their usage. The [programming guide](index.html) explains how to implement the operators in both Scala and Java. We also support both **vertex-centric and gather-sum-apply iterations** through Flink's graph processing API, [Gelly]({{< ref "docs/libs/gelly/overview" >}}).
Here, we provide background on both operator variants and outline their usage.
The [programming guide]({{< ref "docs/dev/dataset/overview" >}}) explains how to implement the
operators in both Scala and Java.
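As an illustration of the operator semantics described above, here is a minimal Python sketch of a bulk iteration: the step function is applied to the whole iteration state until the maximum iteration count is reached or a termination criterion holds. Names are illustrative, not Flink API:

```python
def bulk_iterate(initial_state, step, max_iterations, terminate=None):
    """Repeatedly apply `step` to the full iteration state, mirroring
    the Iterate operator: stop after `max_iterations`, or earlier if
    the optional termination criterion is satisfied."""
    state = initial_state
    for _ in range(max_iterations):
        next_state = step(state)
        if terminate is not None and terminate(state, next_state):
            return next_state
        state = next_state
    return state

# Example: Newton's iteration for sqrt(2), terminating on convergence.
result = bulk_iterate(
    1.0,
    step=lambda x: 0.5 * (x + 2.0 / x),
    max_iterations=100,
    terminate=lambda old, new: abs(old - new) < 1e-12,
)
```

A Delta Iterate differs in that the step function consumes and produces only a *delta* of changed elements, which is merged into a solution set between iterations.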

The following table provides an overview of both operators:

14 changes: 14 additions & 0 deletions docs/content.zh/docs/dev/dataset/local_execution.md
@@ -24,6 +24,20 @@ specific language governing permissions and limitations
under the License.
-->

{{< hint warning >}}
Starting with Flink 1.12, the DataSet API has been soft-deprecated.

We recommend that you use the [Table API and SQL]({{< ref "docs/dev/table/overview" >}}) to run efficient
batch pipelines in a fully unified API. The Table API is well integrated with common batch connectors and
catalogs.

Alternatively, you can use the DataStream API with `BATCH` [execution mode]({{< ref "docs/dev/datastream/execution_mode" >}}).
The linked section also outlines cases where it still makes sense to use the DataSet API, but those cases
will become rarer as development progresses, and the DataSet API will eventually be removed. Please also
see [FLIP-131](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741) for
background information on this decision.
{{< /hint >}}

# Local Execution

Flink can run on a single machine, even in a single Java Virtual Machine. This allows users to test and debug Flink programs locally. This section gives an overview of the local execution mechanisms.
14 changes: 14 additions & 0 deletions docs/content.zh/docs/dev/dataset/transformations.md
@@ -25,6 +25,20 @@ specific language governing permissions and limitations
under the License.
-->

{{< hint warning >}}
Starting with Flink 1.12, the DataSet API has been soft-deprecated.

We recommend that you use the [Table API and SQL]({{< ref "docs/dev/table/overview" >}}) to run efficient
batch pipelines in a fully unified API. The Table API is well integrated with common batch connectors and
catalogs.

Alternatively, you can use the DataStream API with `BATCH` [execution mode]({{< ref "docs/dev/datastream/execution_mode" >}}).
The linked section also outlines cases where it still makes sense to use the DataSet API, but those cases
will become rarer as development progresses, and the DataSet API will eventually be removed. Please also
see [FLIP-131](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741) for
background information on this decision.
{{< /hint >}}

# DataSet Transformations

This document gives a deep-dive into the available transformations on DataSets. For a general introduction to the
14 changes: 14 additions & 0 deletions docs/content.zh/docs/dev/dataset/zip_elements_guide.md
@@ -25,6 +25,20 @@ specific language governing permissions and limitations
under the License.
-->

{{< hint warning >}}
Starting with Flink 1.12, the DataSet API has been soft-deprecated.

We recommend that you use the [Table API and SQL]({{< ref "docs/dev/table/overview" >}}) to run efficient
batch pipelines in a fully unified API. The Table API is well integrated with common batch connectors and
catalogs.

Alternatively, you can use the DataStream API with `BATCH` [execution mode]({{< ref "docs/dev/datastream/execution_mode" >}}).
The linked section also outlines cases where it still makes sense to use the DataSet API, but those cases
will become rarer as development progresses, and the DataSet API will eventually be removed. Please also
see [FLIP-131](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741) for
background information on this decision.
{{< /hint >}}

<a name="zipping-elements-in-a-dataset"></a>

# Zipping Elements in a DataSet
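The two-pass scheme behind `DataSetUtils.zipWithIndex`, as documented for this guide (first count the elements in each partition, then assign consecutive ids from per-partition offsets), can be sketched in plain Python. The partition layout and function names here are illustrative, not Flink API:

```python
def zip_with_index(partitions):
    """Assign consecutive ids across partitions in two passes:
    pass 1 counts elements per partition, pass 2 labels each
    element with its partition's offset plus its local position."""
    counts = [len(p) for p in partitions]
    offsets = [sum(counts[:i]) for i in range(len(partitions))]
    return [
        [(offsets[i] + j, x) for j, x in enumerate(part)]
        for i, part in enumerate(partitions)
    ]

indexed = zip_with_index([["a", "b"], ["c"], ["d", "e"]])
# e.g. partition 1 yields [(2, "c")]; ids 0..4 overall
```

`zipWithUniqueId` avoids the counting pass by deriving non-consecutive but unique ids from the task id, which is cheaper when consecutiveness is not required.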
4 changes: 3 additions & 1 deletion docs/content.zh/docs/dev/table/overview.md
@@ -29,7 +29,9 @@ under the License.

Apache Flink offers two relational APIs for unified stream and batch processing: the Table API and SQL. The Table API is a query API for Scala and Java that lets you compose relational operators such as selection, filtering, and joins in a very intuitive way. Flink SQL implements standard SQL based on [Apache Calcite](https://calcite.apache.org). Queries specified in either interface have the same semantics and produce the same results, whether the input is continuous (streaming) or bounded (batch).

The Table API and SQL are tightly integrated with each other, as well as with the DataStream API. You can easily switch between these APIs and between libraries built on top of them. For example, you can first run pattern matching over a DataStream with [CEP]({{< ref "docs/libs/cep" >}}) and then analyze the matched results with the Table API, or you can scan, filter, and aggregate a batch table with SQL and then run a [Gelly graph algorithm]({{< ref "docs/libs/gelly/overview" >}}) on the preprocessed data.
The Table API and SQL are tightly integrated with each other, as well as with the DataStream API. You can easily switch between these APIs and between libraries built on top of them.
For instance, you can detect patterns from a table using [`MATCH_RECOGNIZE` clause]({{< ref "docs/dev/table/sql/queries/match_recognize" >}})
and later use the DataStream API to build alerting based on the matched patterns.

## Table Program Dependencies

23 changes: 0 additions & 23 deletions docs/content.zh/docs/libs/gelly/_index.md

This file was deleted.

186 changes: 0 additions & 186 deletions docs/content.zh/docs/libs/gelly/bipartite_graph.md

This file was deleted.
