Commit

[FLINK-19749][docs] Improve the documentation in 'Table API' page, e.g. typo, sync between the English and Chinese doc, etc

This closes apache#13791.
RocMarshal authored and dianfu committed Oct 30, 2020
1 parent 0ff32f1 commit 9ee5d6d
Showing 2 changed files with 92 additions and 88 deletions.
24 changes: 12 additions & 12 deletions docs/dev/table/tableApi.md
@@ -24,7 +24,7 @@ under the License.

The Table API is a unified, relational API for stream and batch processing. Table API queries can be run on batch or streaming input without modifications. The Table API is a super set of the SQL language and is specially designed for working with Apache Flink. The Table API is a language-integrated API for Scala, Java and Python. Instead of specifying queries as String values as common with SQL, Table API queries are defined in a language-embedded style in Java, Scala or Python with IDE support like autocompletion and syntax validation.

-The Table API shares many concepts and parts of its API with Flink's SQL integration. Have a look at the [Common Concepts & API]({{ site.baseurl }}/dev/table/common.html) to learn how to register tables or to create a `Table` object. The [Streaming Concepts](./streaming) pages discuss streaming specific concepts such as dynamic tables and time attributes.
+The Table API shares many concepts and parts of its API with Flink's SQL integration. Have a look at the [Common Concepts & API]({% link dev/table/common.md %}) to learn how to register tables or to create a `Table` object. The [Streaming Concepts](./streaming) pages discuss streaming specific concepts such as dynamic tables and time attributes.

The following examples assume a registered table called `Orders` with attributes `(a, b, c, rowtime)`. The `rowtime` field is either a logical [time attribute](./streaming/time_attributes.html) in streaming or a regular timestamp field in batch.

@@ -621,7 +621,7 @@ Table result = orders.addColumns(concat($("c"), "sunny"));
<span class="label label-primary">Batch</span> <span class="label label-primary">Streaming</span>
</td>
<td>
-<p>Performs a field add operation. Existing fields will be replaced if add columns name is the same as the existing column name. Moreover, if the added fields have duplicate field name, then the last one is used. </p>
+<p>Performs a field add operation. Existing fields will be replaced if the added column name is the same as the existing column name. Moreover, if the added fields have duplicate field name, then the last one is used. </p>
{% highlight java %}
Table orders = tableEnv.from("Orders");
Table result = orders.addOrReplaceColumns(concat($("c"), "sunny").as("desc"));
@@ -687,7 +687,7 @@ val result = orders.addColumns(concat($"c", "Sunny"))
<span class="label label-primary">Batch</span> <span class="label label-primary">Streaming</span>
</td>
<td>
-<p>Performs a field add operation. Existing fields will be replaced if add columns name is the same as the existing column name. Moreover, if the added fields have duplicate field name, then the last one is used. </p>
+<p>Performs a field add operation. Existing fields will be replaced if the added column name is the same as the existing column name. Moreover, if the added fields have duplicate field name, then the last one is used. </p>
{% highlight scala %}
val orders = tableEnv.from("Orders");
val result = orders.addOrReplaceColumns(concat($"c", "Sunny") as "desc")
@@ -755,7 +755,7 @@ result = orders.add_columns(concat(orders.c, 'sunny'))
<span class="label label-primary">Batch</span> <span class="label label-primary">Streaming</span>
</td>
<td>
-<p>Performs a field add operation. Existing fields will be replaced if add columns name is the same as the existing column name. Moreover, if the added fields have duplicate field name, then the last one is used. </p>
+<p>Performs a field add operation. Existing fields will be replaced if the added column name is the same as the existing column name. Moreover, if the added fields have duplicate field name, then the last one is used. </p>
{% highlight python %}
from pyflink.table.expressions import concat

@@ -836,7 +836,7 @@ Table result = orders.groupBy($("a")).select($("a"), $("b").sum().as("d"));
{% highlight java %}
Table orders = tableEnv.from("Orders");
Table result = orders
-.window(Tumble.over(lit(5).minutes())).on($("rowtime")).as("w")) // define window
+.window(Tumble.over(lit(5).minutes()).on($("rowtime")).as("w")) // define window
.groupBy($("a"), $("w")) // group by key and window
// access window properties and aggregate
.select(
@@ -895,7 +895,7 @@ Table groupByDistinctResult = orders
// Distinct aggregation on time window group by
Table groupByWindowDistinctResult = orders
.window(Tumble
-.over(lit(5).minutes()))
+.over(lit(5).minutes())
.on($("rowtime"))
.as("w")
)
@@ -1264,7 +1264,7 @@ Table result = left.join(right)
.where(
and(
$("a").isEqual($("d")),
-$("ltime").isGreaterEqual($("rtime").minus(lit(5).minutes())),
+$("ltime").isGreaterOrEqual($("rtime").minus(lit(5).minutes())),
$("ltime").isLess($("rtime").plus(lit(10).minutes()))
))
.select($("a"), $("b"), $("e"), $("ltime"));
@@ -2740,7 +2740,7 @@ A session window is defined by using the `Session` class as follows:

### Over Windows

-Over window aggregates are known from standard SQL (`OVER` clause) and defined in the `SELECT` clause of a query. Unlike group windows, which are specified in the `GROUP BY` clause, over windows do not collapse rows. Instead over window aggregates compute an aggregate for each input row over a range of its neighboring rows.
+Over window aggregates are known from standard SQL (`OVER` clause) and defined in the `SELECT` clause of a query. Unlike group windows, which are specified in the `GROUP BY` clause, over windows do not collapse rows. Instead over window aggregates compute an aggregate for each input row over a range of its neighboring rows.
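
To make the contrast between collapsing and non-collapsing aggregation concrete, here is a minimal plain-Python sketch. It does not use the Flink API; the sample rows and the helper names `group_sum` and `over_running_sum` are hypothetical and only illustrate the semantics described above, assuming an unbounded-preceding window partitioned by `a` and ordered by `rowtime`.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for Orders: (a, b, rowtime).
orders = [
    ("x", 1, 1),
    ("y", 5, 2),
    ("x", 2, 3),
    ("x", 4, 4),
]


def group_sum(rows):
    """Grouped aggregation: rows are collapsed to one result per key."""
    totals = defaultdict(int)
    for a, b, _ in rows:
        totals[a] += b
    return dict(totals)


def over_running_sum(rows):
    """Over-window style aggregation: one result per input row.

    Each row carries the sum of b over all rows of its partition (same a)
    from the start of the partition up to and including the current row.
    """
    running = defaultdict(int)
    out = []
    for a, b, rowtime in sorted(rows, key=lambda r: r[2]):
        running[a] += b
        out.append((a, b, rowtime, running[a]))
    return out


print(group_sum(orders))         # {'x': 7, 'y': 5}
print(over_running_sum(orders))  # four rows, one per input row
```

The grouped result has two rows (one per key), while the over-window result keeps all four input rows and attaches a running sum to each.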

Over windows are defined using the `window(w: OverWindow*)` clause (using `over_window(*OverWindow)` in Python API) and referenced via an alias in the `select()` method. The following example shows how to define an over window aggregation on a table.

@@ -3177,7 +3177,7 @@ Table table = input
</td>
<td>
<p>Similar to a <b>GroupBy Aggregation</b>. Groups the rows on the grouping keys with the following running table aggregation operator to aggregate rows group-wise. The difference from an AggregateFunction is that TableAggregateFunction may return 0 or more records for a group. You have to close the "flatAggregate" with a select statement. And the select statement does not support aggregate functions.</p>
-<p>Instead of using <code>emitValue</code> to output results, you can also use the <code>emitUpdateWithRetract</code> method. Different from <code>emitValue</code>, <code>emitUpdateWithRetract</code> is used to emit values that have been updated. This method outputs data incrementally in retract mode, i.e., once there is an update, we have to retract old records before sending new updated ones. The <code>emitUpdateWithRetract</code> method will be used in preference to the <code>emitValue</code> method if both methods are defined in the table aggregate function, because the method is treated to be more efficient than <code>emitValue</code> as it can output values incrementally. See <a href="{{ site.baseurl }}/dev/table/functions/udfs.html#table-aggregation-functions">Table Aggregation Functions</a> for details.</p>
+<p>Instead of using <code>emitValue</code> to output results, you can also use the <code>emitUpdateWithRetract</code> method. Different from <code>emitValue</code>, <code>emitUpdateWithRetract</code> is used to emit values that have been updated. This method outputs data incrementally in retract mode, i.e., once there is an update, we have to retract old records before sending new updated ones. The <code>emitUpdateWithRetract</code> method will be used in preference to the <code>emitValue</code> method if both methods are defined in the table aggregate function, because the method is treated to be more efficient than <code>emitValue</code> as it can output values incrementally. See <a href="{% link dev/table/functions/udfs.md %}#table-aggregation-functions">Table Aggregation Functions</a> for details.</p>
{% highlight java %}
/**
* Accumulator for Top2.
@@ -3403,7 +3403,7 @@ val table = input
</td>
<td>
<p>Similar to a <b>GroupBy Aggregation</b>. Groups the rows on the grouping keys with the following running table aggregation operator to aggregate rows group-wise. The difference from an AggregateFunction is that TableAggregateFunction may return 0 or more records for a group. You have to close the "flatAggregate" with a select statement. And the select statement does not support aggregate functions.</p>
-<p>Instead of using <code>emitValue</code> to output results, you can also use the <code>emitUpdateWithRetract</code> method. Different from <code>emitValue</code>, <code>emitUpdateWithRetract</code> is used to emit values that have been updated. This method outputs data incrementally in retract mode, i.e., once there is an update, we have to retract old records before sending new updated ones. The <code>emitUpdateWithRetract</code> method will be used in preference to the <code>emitValue</code> method if both methods are defined in the table aggregate function, because the method is treated to be more efficient than <code>emitValue</code> as it can output values incrementally. See <a href="{{ site.baseurl }}/dev/table/functions/udfs.html#table-aggregation-functions">Table Aggregation Functions</a> for details.</p>
+<p>Instead of using <code>emitValue</code> to output results, you can also use the <code>emitUpdateWithRetract</code> method. Different from <code>emitValue</code>, <code>emitUpdateWithRetract</code> is used to emit values that have been updated. This method outputs data incrementally in retract mode, i.e., once there is an update, we have to retract old records before sending new updated ones. The <code>emitUpdateWithRetract</code> method will be used in preference to the <code>emitValue</code> method if both methods are defined in the table aggregate function, because the method is treated to be more efficient than <code>emitValue</code> as it can output values incrementally. See <a href="{% link dev/table/functions/udfs.md %}#table-aggregation-functions">Table Aggregation Functions</a> for details.</p>
{% highlight scala %}
import java.lang.{Integer => JInteger}
import org.apache.flink.table.api.Types
@@ -3576,9 +3576,9 @@ Please see the dedicated page about [data types](types.html).

Generic types and (nested) composite types (e.g., POJOs, tuples, rows, Scala case classes) can be fields of a row as well.

-Fields of composite types with arbitrary nesting can be accessed with [value access functions]({{ site.baseurl }}/dev/table/functions/systemFunctions.html#value-access-functions).
+Fields of composite types with arbitrary nesting can be accessed with [value access functions]({% link dev/table/functions/systemFunctions.md %}#value-access-functions).

-Generic types are treated as a black box and can be passed on or processed by [user-defined functions]({{ site.baseurl }}/dev/table/functions/udfs.html).
+Generic types are treated as a black box and can be passed on or processed by [user-defined functions]({% link dev/table/functions/udfs.md %}).

{% top %}
