<table class="configuration table table-bordered">
<thead>
<tr>
<th class="text-left" style="width: 20%">Key</th>
<th class="text-left" style="width: 15%">Default</th>
<th class="text-left" style="width: 10%">Type</th>
<th class="text-left" style="width: 55%">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><h5>execution.batch-shuffle-mode</h5></td>
<td style="word-wrap: break-word;">ALL_EXCHANGES_BLOCKING</td>
<td><p>Enum</p></td>
<td>Defines how data is exchanged between tasks when 'execution.runtime-mode' is set to BATCH, if the shuffling behavior has not been set explicitly for an individual exchange.<br />With pipelined exchanges, upstream and downstream tasks run simultaneously. In order to achieve lower latency, a result record is immediately sent to and processed by the downstream task. Thus, the receiver back-pressures the sender. The streaming mode always uses this exchange.<br />With blocking exchanges, upstream and downstream tasks run in stages. Records are persisted to some storage between stages. Downstream tasks then fetch these records after the upstream tasks have finished. Such an exchange reduces the resources required to execute the job, as it does not need to run upstream and downstream tasks simultaneously.<br /><br />Possible values:<ul><li>"ALL_EXCHANGES_PIPELINED": Upstream and downstream tasks run simultaneously. This leads to lower latency and more evenly distributed (but higher) resource usage across tasks.</li><li>"ALL_EXCHANGES_BLOCKING": Upstream and downstream tasks run sequentially. This reduces resource usage, as downstream tasks are started only after upstream tasks have finished.</li></ul></td>
</tr>
<tr>
<td><h5>execution.buffer-timeout</h5></td>
<td style="word-wrap: break-word;">100 ms</td>
<td>Duration</td>
<td>The maximum interval (in milliseconds) at which the output buffers are flushed. By default the output buffers flush frequently to provide low latency and a smooth developer experience. Setting this parameter results in three logical modes:<ul><li>A positive value triggers periodic flushing at that interval</li><li>0 triggers flushing after every record, thus minimizing latency</li><li>-1 triggers flushing only when the output buffer is full, thus maximizing throughput</li></ul></td>
</tr>
<tr>
<td><h5>execution.checkpointing.snapshot-compression</h5></td>
<td style="word-wrap: break-word;">false</td>
<td>Boolean</td>
<td>Whether to apply compression to the state snapshot data.</td>
</tr>
<tr>
<td><h5>execution.runtime-mode</h5></td>
<td style="word-wrap: break-word;">STREAMING</td>
<td><p>Enum</p></td>
<td>Runtime execution mode of DataStream programs. Among other things, this controls task scheduling, network shuffle behavior, and time semantics.<br /><br />Possible values:<ul><li>"STREAMING"</li><li>"BATCH"</li><li>"AUTOMATIC"</li></ul></td>
</tr>
</tbody>
</table>
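<p>The options above can be set in the cluster-wide configuration file. A minimal sketch of a <code>flink-conf.yaml</code> fragment combining them (the values shown are illustrative defaults for a batch job, not recommendations):</p>

```yaml
# flink-conf.yaml fragment (illustrative values)

# Run DataStream programs in batch mode.
execution.runtime-mode: BATCH

# In batch mode, stage exchanges so upstream and downstream
# tasks need not run simultaneously (lower resource usage).
execution.batch-shuffle-mode: ALL_EXCHANGES_BLOCKING

# Flush output buffers at most every 100 ms
# (0 = flush per record, -1 = flush only when a buffer is full).
execution.buffer-timeout: 100 ms

# Compress state snapshot data.
execution.checkpointing.snapshot-compression: true
```

<p>The same keys can also be set per job, for example via a <code>Configuration</code> object passed to the execution environment; the file-level settings act as deployment-wide defaults.</p>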