[FLINK-22880][table] Remove 'blink' term from code base
This removes all mentions of the term "blink" from the code base.
In order to reduce user confusion, the term should no longer be used;
refer to "Flink SQL" or "Flink Table API" instead.

This closes apache#16374.
twalthr committed Jul 6, 2021
1 parent 012dc6a commit 312fe4c
Showing 147 changed files with 315 additions and 432 deletions.
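For readers scanning the diff below, this is the user-facing pattern the updated docs converge on: the planner is no longer selected by name anywhere. A minimal sketch (assuming PyFlink, as used in most of the touched examples):

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

# streaming TableEnvironment -- no planner setting such as 'planner: blink' is needed
stream_settings = EnvironmentSettings.in_streaming_mode()
stream_table_env = TableEnvironment.create(stream_settings)

# batch TableEnvironment
batch_settings = EnvironmentSettings.in_batch_mode()
batch_table_env = TableEnvironment.create(batch_settings)
```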
3 changes: 1 addition & 2 deletions docs/content.zh/docs/connectors/table/hive/hive_catalog.md
@@ -147,7 +147,6 @@ Add all Hive dependencies to `/lib` dir in Flink distribution, and modify SQL CL
```yaml

execution:
planner: blink
type: streaming
...
current-catalog: myhive # set the HiveCatalog as the current catalog of the session
@@ -394,4 +393,4 @@ Something to note about the type mapping:
## Scala Shell
NOTE: since blink planner is not well supported in Scala Shell at the moment, it's **NOT** recommended to use Hive connector in Scala Shell.
Note: It's **NOT** recommended to use the Hive connector in the Scala Shell.
1 change: 0 additions & 1 deletion docs/content.zh/docs/connectors/table/hive/hive_dialect.md
@@ -40,7 +40,6 @@ SQL 方言可以通过 `table.sql-dialect` 属性指定。因此你可以通过
```yaml

execution:
planner: blink
type: batch
result-mode: table

@@ -97,7 +97,6 @@ To use a Hive User Defined Function, user have to

- set a HiveCatalog backed by Hive Metastore that contains that function as current catalog of the session
- include a jar that contains that function in Flink's classpath
- use Blink planner.

## Using Hive User Defined Functions

5 changes: 0 additions & 5 deletions docs/content.zh/docs/connectors/table/hive/overview.md
@@ -39,8 +39,6 @@ Flink 与 Hive 的集成包含两个层面。
`HiveCatalog`的设计提供了与 Hive 良好的兼容性,用户可以"开箱即用"的访问其已有的 Hive 数仓。
您不需要修改现有的 Hive Metastore,也不需要更改表的数据位置或分区。

* 我们强烈建议用户使用 [Blink planner]({{< ref "docs/dev/table/overview" >}}#dependency-structure) 与 Hive 集成。

## 支持的Hive版本

Flink 支持一下的 Hive 版本。
@@ -303,8 +301,6 @@ export HADOOP_CLASSPATH=`hadoop classpath`

通过 TableEnvironment 或者 YAML 配置,使用 [Catalog 接口]({{< ref "docs/dev/table/catalogs" >}}) 和 [HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}})连接到现有的 Hive 集群。

请注意,虽然 HiveCatalog 不需要特定的 planner,但读写Hive表仅适用于 Blink planner。因此,强烈建议您在连接到 Hive 仓库时使用 Blink planner。

以下是如何连接到 Hive 的示例:

{{< tabs "2ca7cad8-0b84-45db-92d9-a75abd8808e7" >}}
@@ -367,7 +363,6 @@ tableEnv.use_catalog("myhive")
```yaml

execution:
planner: blink
...
current-catalog: myhive # set the HiveCatalog as the current catalog of the session
current-database: mydatabase
1 change: 0 additions & 1 deletion docs/content.zh/docs/connectors/table/jdbc.md
@@ -414,7 +414,6 @@ t_env.use_catalog("mypg")
```yaml

execution:
planner: blink
...
current-catalog: mypg # 设置 JdbcCatalog 为会话的当前 catalog
current-database: mydb
12 changes: 4 additions & 8 deletions docs/content.zh/docs/dev/python/table/intro_to_table_api.md
@@ -91,11 +91,11 @@ table_env.execute_sql("INSERT INTO print SELECT * FROM datagen").wait()
```python
from pyflink.table import EnvironmentSettings, TableEnvironment

# create a blink streaming TableEnvironment
# create a streaming TableEnvironment
env_settings = EnvironmentSettings.in_streaming_mode()
table_env = TableEnvironment.create(env_settings)

# or create a blink batch TableEnvironment
# or create a batch TableEnvironment
env_settings = EnvironmentSettings.in_batch_mode()
table_env = TableEnvironment.create(env_settings)
```
@@ -112,10 +112,6 @@ table_env = TableEnvironment.create(env_settings)
* 管理 Python 依赖,更多细节可查阅 [依赖管理]({{< ref "docs/dev/python/dependency_management" >}})
* 提交作业执行

目前有2个可用的执行器 : flink 执行器 和 blink 执行器。

你应该在当前程序中显式地设置使用哪个执行器,建议尽可能使用 blink 执行器。

{{< top >}}

创建表
@@ -132,7 +128,7 @@ table_env = TableEnvironment.create(env_settings)
```python
from pyflink.table import EnvironmentSettings, TableEnvironment

# 创建 blink 批 TableEnvironment
# 创建 批 TableEnvironment
env_settings = EnvironmentSettings.in_batch_mode()
table_env = TableEnvironment.create(env_settings)

@@ -196,7 +192,7 @@ print('Now the type of the "id" column is %s.' % type)
```python
from pyflink.table import EnvironmentSettings, TableEnvironment

# 创建 blink 流 TableEnvironment
# 创建 流 TableEnvironment
env_settings = EnvironmentSettings.in_streaming_mode()
table_env = TableEnvironment.create(env_settings)

2 changes: 1 addition & 1 deletion docs/content.zh/docs/dev/python/table/table_environment.md
@@ -50,7 +50,7 @@ table_env = TableEnvironment.create(env_settings)
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.table import StreamTableEnvironment

# create a blink streaming TableEnvironment from a StreamExecutionEnvironment
# create a streaming TableEnvironment from a StreamExecutionEnvironment
env = StreamExecutionEnvironment.get_execution_environment()
table_env = StreamTableEnvironment.create(env)
```
5 changes: 2 additions & 3 deletions docs/content.zh/docs/dev/python/table/udfs/python_udfs.md
@@ -235,7 +235,7 @@ def iterable_func(x):

A user-defined aggregate function (_UDAGG_) maps scalar values of multiple rows to a new scalar value.

**NOTE:** Currently the general user-defined aggregate function is only supported in the GroupBy aggregation and Group Window Aggregation of the blink planner in streaming mode. For batch mode, it's currently not supported and it is recommended to use the [Vectorized Aggregate Functions]({{< ref "docs/dev/python/table/udfs/vectorized_python_udfs" >}}#vectorized-aggregate-functions).
**NOTE:** Currently the general user-defined aggregate function is only supported in the GroupBy aggregation and Group Window Aggregation in streaming mode. For batch mode, it's currently not supported and it is recommended to use the [Vectorized Aggregate Functions]({{< ref "docs/dev/python/table/udfs/vectorized_python_udfs" >}}#vectorized-aggregate-functions).

The behavior of an aggregate function is centered around the concept of an accumulator. The _accumulator_
is an intermediate data structure that stores the aggregated values until a final aggregation result
@@ -416,8 +416,7 @@ A user-defined table aggregate function (_UDTAGG_) maps scalar values of multipl
The returned record may consist of one or more fields. If an output record consists of only a single field,
the structured record can be omitted, and a scalar value can be emitted that will be implicitly wrapped into a row by the runtime.

**NOTE:** Currently the general user-defined table aggregate function is only supported in the GroupBy aggregation
of the blink planner in streaming mode.
**NOTE:** Currently the general user-defined table aggregate function is only supported in the GroupBy aggregation in streaming mode.

Similar to an [aggregate function](#aggregate-functions), the behavior of a table aggregate is centered around the concept of an accumulator.
The accumulator is an intermediate data structure that stores the aggregated values until a final aggregation result is computed.
@@ -77,8 +77,6 @@ table_env.sql_query("SELECT add(bigint, bigint) FROM MyTable")
<span class="label label-info">注意</span> 向量化聚合函数不支持部分聚合,而且一个组或者窗口内的所有数据,
在执行的过程中,会被同时加载到内存,所以需要确保所配置的内存大小足够容纳这些数据。

<span class="label label-info">注意</span> 向量化聚合函数只支持运行在 Blink Planner 上。

以下示例显示了如何定一个自己的向量化聚合函数,该函数计算一列的平均值,并在 `GroupBy Aggregation`, `GroupBy Window Aggregation`
and `Over Window Aggregation` 使用它:

6 changes: 3 additions & 3 deletions docs/content.zh/docs/dev/table/common.md
@@ -183,11 +183,11 @@ val tEnv = TableEnvironment.create(settings)
```python
from pyflink.table import EnvironmentSettings, TableEnvironment

# create a blink streaming TableEnvironment
# create a streaming TableEnvironment
env_settings = EnvironmentSettings.in_streaming_mode()
table_env = TableEnvironment.create(env_settings)

# create a blink batch TableEnvironment
# create a batch TableEnvironment
env_settings = EnvironmentSettings.in_batch_mode()
table_env = TableEnvironment.create(env_settings)

@@ -310,7 +310,7 @@ table_env.register_table("projectedTable", proj_table)

**注意:** 从传统数据库系统的角度来看,`Table` 对象与 `VIEW` 视图非常像。也就是,定义了 `Table` 的查询是没有被优化的,
而且会被内嵌到另一个引用了这个注册了的 `Table`的查询中。如果多个查询都引用了同一个注册了的`Table`,那么它会被内嵌每个查询中并被执行多次,
也就是说注册了的`Table`的结果**不会**被共享(注:Blink 计划器的`TableEnvironment`会优化成只执行一次)
也就是说注册了的`Table`的结果**不会**被共享。

{{< top >}}

@@ -115,7 +115,6 @@ Yen 1

时态表
-----
<span class="label label-danger">注意</span> 仅 Blink planner 支持此功能。

Flink 使用主键约束和事件时间来定义一张版本表和版本视图。

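To make the sentence above concrete — a primary key constraint plus an event-time attribute is what defines a versioned table — here is a hedged sketch using the `upsert-kafka` connector; the table name, topic, and connection properties are assumptions for illustration, not taken from the doc:

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

table_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Hypothetical versioned table: PRIMARY KEY plus event-time watermark.
# Connector, topic, and server addresses are placeholders only.
table_env.execute_sql("""
    CREATE TABLE currency_rates (
        currency STRING,
        rate DECIMAL(38, 10),
        update_time TIMESTAMP(3),
        WATERMARK FOR update_time AS update_time,
        PRIMARY KEY (currency) NOT ENFORCED
    ) WITH (
        'connector' = 'upsert-kafka',
        'topic' = 'rates',
        'properties.bootstrap.servers' = 'localhost:9092',
        'key.format' = 'json',
        'value.format' = 'json'
    )
""")
```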
2 changes: 0 additions & 2 deletions docs/content.zh/docs/dev/table/config.md
@@ -91,8 +91,6 @@ Flink SQL> SET 'table.exec.mini-batch.size' = '5000';
{{< /tab >}}
{{< /tabs >}}

<span class="label label-danger">注意</span> 目前,key-value 配置项仅被 Blink planner 支持。

### 执行配置

以下选项可用于优化查询执行的性能。
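The `SET` line in this hunk's context is one of the standard mini-batch options; as a hedged sketch (the option keys are standard Flink configuration, the values are examples only), the same keys can also be set programmatically on the table configuration:

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

table_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# key-value options on the table configuration; values are illustrative
config = table_env.get_config().get_configuration()
config.set_string("table.exec.mini-batch.enabled", "true")
config.set_string("table.exec.mini-batch.allow-latency", "5 s")
config.set_string("table.exec.mini-batch.size", "5000")
```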
6 changes: 3 additions & 3 deletions docs/content.zh/docs/dev/table/sql/set.md
@@ -43,11 +43,11 @@ The following examples show how to run a `SET` statement in SQL CLI.
{{< tabs "set" >}}
{{< tab "SQL CLI" >}}
```sql
Flink SQL> SET 'table.planner' = 'blink';
Flink SQL> SET 'table.local-time-zone' = 'Europe/Berlin';
[INFO] Session property has been set.

Flink SQL> SET;
'table.planner' = 'blink'
'table.local-time-zone' = 'Europe/Berlin'
```
{{< /tab >}}
{{< /tabs >}}
@@ -58,6 +58,6 @@ Flink SQL> SET;
SET ('key' = 'value')?
```

If no key and value are specified, it just print all the properties. Otherwise, set the key with specified value.
If no key and value are specified, it just prints all the properties. Otherwise, set the key with specified value.

{{< top >}}
2 changes: 1 addition & 1 deletion docs/content/docs/connectors/table/formats/raw.md
@@ -33,7 +33,7 @@ The Raw format allows to read and write raw (byte based) values as a single colu

Note: this format encodes `null` values as `null` of `byte[]` type. This may have limitation when used in `upsert-kafka`, because `upsert-kafka` treats `null` values as a tombstone message (DELETE on the key). Therefore, we recommend avoiding using `upsert-kafka` connector and the `raw` format as a `value.format` if the field can have a `null` value.

The Raw connector is built-in into the Blink planner, no additional dependencies are required.
The Raw connector is built-in, no additional dependencies are required.

Example
----------------
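Since the example block itself is collapsed in this view, here is a hedged sketch of what a table using the `raw` value format could look like; the topic, server address, and table name are made up for illustration:

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

table_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# single-column table whose Kafka record value is interpreted as a raw string
table_env.execute_sql("""
    CREATE TABLE nginx_log (
        log STRING
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'nginx_log',
        'properties.bootstrap.servers' = 'localhost:9092',
        'properties.group.id' = 'raw-format-demo',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'raw'
    )
""")
```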
3 changes: 1 addition & 2 deletions docs/content/docs/connectors/table/hive/hive_catalog.md
@@ -147,7 +147,6 @@ Add all Hive dependencies to `/lib` dir in Flink distribution, and modify SQL CL
```yaml

execution:
planner: blink
type: streaming
...
current-catalog: myhive # set the HiveCatalog as the current catalog of the session
Expand Down Expand Up @@ -394,4 +393,4 @@ Something to note about the type mapping:
## Scala Shell
NOTE: since blink planner is not well supported in Scala Shell at the moment, it's **NOT** recommended to use Hive connector in Scala Shell.
Note: It's **NOT** recommended to use Hive connector in Scala Shell.
5 changes: 2 additions & 3 deletions docs/content/docs/connectors/table/hive/hive_dialect.md
@@ -46,7 +46,6 @@ the `configuration` section of the yaml file for your SQL Client.
```yaml

execution:
planner: blink
type: batch
result-mode: table

@@ -59,10 +58,10 @@ You can also set the dialect after the SQL Client has launched.

```bash

Flink SQL> set table.sql-dialect=hive; -- to use hive dialect
Flink SQL> SET 'table.sql-dialect' = 'hive'; -- to use hive dialect
[INFO] Session property has been set.

Flink SQL> set table.sql-dialect=default; -- to use default dialect
Flink SQL> SET 'table.sql-dialect' = 'default'; -- to use default dialect
[INFO] Session property has been set.

```
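Besides the SQL CLI `SET` shown above, the dialect can also be switched programmatically; a minimal sketch assuming the PyFlink `SqlDialect` enum and `TableConfig.set_sql_dialect`:

```python
from pyflink.table import EnvironmentSettings, TableEnvironment, SqlDialect

table_env = TableEnvironment.create(EnvironmentSettings.in_batch_mode())

# switch to the Hive dialect for Hive-compatible DDL/DML ...
table_env.get_config().set_sql_dialect(SqlDialect.HIVE)

# ... and back to the default dialect
table_env.get_config().set_sql_dialect(SqlDialect.DEFAULT)
```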
1 change: 0 additions & 1 deletion docs/content/docs/connectors/table/hive/hive_functions.md
@@ -97,7 +97,6 @@ To use a Hive User Defined Function, user have to

- set a HiveCatalog backed by Hive Metastore that contains that function as current catalog of the session
- include a jar that contains that function in Flink's classpath
- use Blink planner.

## Using Hive User Defined Functions

8 changes: 0 additions & 8 deletions docs/content/docs/connectors/table/hive/overview.md
@@ -39,10 +39,6 @@ The second is to offer Flink as an alternative engine for reading and writing Hi
The `HiveCatalog` is designed to be “out of the box” compatible with existing Hive installations.
You do not need to modify your existing Hive Metastore or change the data placement or partitioning of your tables.

* Note that we highly recommend users using the [blink planner]({{< ref "docs/dev/table/overview" >}}#dependency-structure) with Hive integration.



## Supported Hive Versions

Flink supports the following Hive versions.
@@ -309,9 +305,6 @@ You're supposed to add dependencies as stated above at runtime.
Connect to an existing Hive installation using the [catalog interface]({{< ref "docs/dev/table/catalogs" >}})
and [HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}}) through the table environment or YAML configuration.

Please note while HiveCatalog doesn't require a particular planner, reading/writing Hive tables only works with blink planner.
Therefore it's highly recommended that you use blink planner when connecting to your Hive warehouse.

Following is an example of how to connect to Hive:

{{< tabs "5d3cc7e1-a304-4f9e-b36e-ff1f32394ec7" >}}
@@ -374,7 +367,6 @@ tableEnv.use_catalog("myhive")
```yaml

execution:
planner: blink
...
current-catalog: myhive # set the HiveCatalog as the current catalog of the session
current-database: mydatabase
1 change: 0 additions & 1 deletion docs/content/docs/connectors/table/jdbc.md
@@ -413,7 +413,6 @@ t_env.use_catalog("mypg")
```yaml

execution:
planner: blink
...
current-catalog: mypg # set the JdbcCatalog as the current catalog of the session
current-database: mydb
13 changes: 4 additions & 9 deletions docs/content/docs/dev/python/table/intro_to_table_api.md
@@ -89,11 +89,11 @@ The `TableEnvironment` is a central concept of the Table API and SQL integration
```python
from pyflink.table import EnvironmentSettings, TableEnvironment

# create a blink streaming TableEnvironment
# create a streaming TableEnvironment
env_settings = EnvironmentSettings.in_streaming_mode()
table_env = TableEnvironment.create(env_settings)

# or create a blink batch TableEnvironment
# or create a batch TableEnvironment
env_settings = EnvironmentSettings.in_batch_mode()
table_env = TableEnvironment.create(env_settings)
```
@@ -110,11 +110,6 @@ The `TableEnvironment` is responsible for:
* Managing Python dependencies, see [Dependency Management]({{< ref "docs/dev/python/dependency_management" >}}) for more details
* Submitting the jobs for execution

Currently there are 2 planners available: flink planner and blink planner.

You should explicitly set which planner to use in the current program.
We recommend using the blink planner as much as possible.

{{< top >}}

Create Tables
@@ -131,7 +126,7 @@ You can create a Table from a list object:
```python
from pyflink.table import EnvironmentSettings, TableEnvironment

# create a blink batch TableEnvironment
# create a batch TableEnvironment
env_settings = EnvironmentSettings.in_batch_mode()
table_env = TableEnvironment.create(env_settings)

@@ -195,7 +190,7 @@ You can create a Table using connector DDL:
```python
from pyflink.table import EnvironmentSettings, TableEnvironment

# create a blink stream TableEnvironment
# create a stream TableEnvironment
env_settings = EnvironmentSettings.in_streaming_mode()
table_env = TableEnvironment.create(env_settings)

2 changes: 1 addition & 1 deletion docs/content/docs/dev/python/table/table_environment.md
@@ -51,7 +51,7 @@ Alternatively, users can create a `StreamTableEnvironment` from an existing `Str
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.table import StreamTableEnvironment

# create a blink streaming TableEnvironment from a StreamExecutionEnvironment
# create a streaming TableEnvironment from a StreamExecutionEnvironment
env = StreamExecutionEnvironment.get_execution_environment()
table_env = StreamTableEnvironment.create(env)
```
5 changes: 2 additions & 3 deletions docs/content/docs/dev/python/table/udfs/python_udfs.md
@@ -235,7 +235,7 @@ def iterable_func(x):

A user-defined aggregate function (_UDAGG_) maps scalar values of multiple rows to a new scalar value.

**NOTE:** Currently the general user-defined aggregate function is only supported in the GroupBy aggregation and Group Window Aggregation of the blink planner in streaming mode. For batch mode, it's currently not supported and it is recommended to use the [Vectorized Aggregate Functions]({{< ref "docs/dev/python/table/udfs/vectorized_python_udfs" >}}#vectorized-aggregate-functions).
**NOTE:** Currently the general user-defined aggregate function is only supported in the GroupBy aggregation and Group Window Aggregation in streaming mode. For batch mode, it's currently not supported and it is recommended to use the [Vectorized Aggregate Functions]({{< ref "docs/dev/python/table/udfs/vectorized_python_udfs" >}}#vectorized-aggregate-functions).

The behavior of an aggregate function is centered around the concept of an accumulator. The _accumulator_
is an intermediate data structure that stores the aggregated values until a final aggregation result
@@ -417,8 +417,7 @@ A user-defined table aggregate function (_UDTAGG_) maps scalar values of multipl
The returned record may consist of one or more fields. If an output record consists of only a single field,
the structured record can be omitted, and a scalar value can be emitted that will be implicitly wrapped into a row by the runtime.

**NOTE:** Currently the general user-defined table aggregate function is only supported in the GroupBy aggregation
of the blink planner in streaming mode.
**NOTE:** Currently the general user-defined table aggregate function is only supported in the GroupBy aggregation in streaming mode.

Similar to an [aggregate function](#aggregate-functions), the behavior of a table aggregate is centered around the concept of an accumulator.
The accumulator is an intermediate data structure that stores the aggregated values until a final aggregation result is computed.
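To make the accumulator description above concrete, a hedged sketch of a general (non-vectorized) Python aggregate function; names and types are illustrative, and the interface assumed is PyFlink's `AggregateFunction`/`udaf`:

```python
from pyflink.common import Row
from pyflink.table import AggregateFunction, DataTypes
from pyflink.table.udf import udaf


class SumAgg(AggregateFunction):
    """The accumulator is a one-field Row holding the running sum."""

    def create_accumulator(self):
        return Row(0)

    def accumulate(self, accumulator, value):
        if value is not None:
            accumulator[0] += value

    def get_value(self, accumulator):
        return accumulator[0]

    def get_accumulator_type(self):
        return DataTypes.ROW([DataTypes.FIELD("total", DataTypes.BIGINT())])

    def get_result_type(self):
        return DataTypes.BIGINT()


# usable in a streaming GroupBy aggregation, e.g.
# table.group_by(col('key')).select(sum_udaf(col('value')))
sum_udaf = udaf(SumAgg())
```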
@@ -76,8 +76,6 @@ to [the relevant documentation]({{< ref "docs/dev/table/tableApi" >}}?code_tab=p

<span class="label label-info">Note</span> Pandas UDAF does not support partial aggregation. Besides, all the data for a group or window will be loaded into memory at the same time during execution and so you must make sure that the data of a group or window could fit into the memory.

<span class="label label-info">Note</span> Pandas UDAF is only supported in Blink Planner.

The following example shows how to define your own vectorized Python aggregate function which computes mean,
and use it in `GroupBy Aggregation`, `GroupBy Window Aggregation` and `Over Window Aggregation`:

4 changes: 2 additions & 2 deletions docs/content/docs/dev/table/common.md
@@ -183,11 +183,11 @@ val tEnv = TableEnvironment.create(settings)
```python
from pyflink.table import EnvironmentSettings, TableEnvironment

# create a blink streaming TableEnvironment
# create a streaming TableEnvironment
env_settings = EnvironmentSettings.in_streaming_mode()
table_env = TableEnvironment.create(env_settings)

# create a blink batch TableEnvironment
# create a batch TableEnvironment
env_settings = EnvironmentSettings.in_batch_mode()
table_env = TableEnvironment.create(env_settings)
