[FLINK-15999][doc] Move remaining text from runtime.md to flink-architecture.md

Also rename and reformat to get up-to-date with current naming.
aljoscha committed Feb 21, 2020
1 parent 938b64d commit 3608a6d
Showing 3 changed files with 68 additions and 178 deletions.
68 changes: 68 additions & 0 deletions docs/concepts/flink-architecture.md
@@ -70,3 +70,71 @@ command line process `./bin/flink run ...`.
<img src="{{ site.baseurl }}/fig/processes.svg" alt="The processes involved in executing a Flink dataflow" class="offset" width="80%" />

{% top %}

## Tasks and Operator Chains

For distributed execution, Flink *chains* operator subtasks together into
*tasks*. Each task is executed by one thread. Chaining operators together into
tasks is a useful optimization: it reduces the overhead of thread-to-thread
handover and buffering, and increases overall throughput while decreasing
latency. The chaining behavior can be configured; see the [chaining docs]({{
site.baseurl }}{% link dev/stream/operators/index.md
%}#task-chaining-and-resource-groups) for details.
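
For illustration, here is a minimal Java sketch of the chaining hooks those
docs describe; the class name and sample data are made up for the example,
while `disableOperatorChaining()`, `startNewChain()`, and `disableChaining()`
are the DataStream API methods in question.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ChainingExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // env.disableOperatorChaining(); // would turn chaining off job-wide

        env.fromElements("flink", "chaining", "example")
                .map(String::toUpperCase)
                .startNewChain()    // the map operator begins a fresh chain
                .filter(s -> s.length() > 5)
                .disableChaining()  // the filter never chains with neighbors
                .print();

        env.execute("chaining-example");
    }
}
```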

The sample dataflow in the figure below is executed with five subtasks, and
hence with five parallel threads.

<img src="{{ site.baseurl }}/fig/tasks_chains.svg" alt="Operator chaining into Tasks" class="offset" width="80%" />

{% top %}

## Task Slots and Resources

Each worker (TaskManager) is a *JVM process* and may execute one or more
subtasks in separate threads. To control how many tasks a worker accepts, a
worker has so-called **task slots** (at least one).

Each *task slot* represents a fixed subset of resources of the TaskManager. A
TaskManager with three slots, for example, will dedicate 1/3 of its managed
memory to each slot. Slotting the resources means that a subtask will not
compete with subtasks from other jobs for managed memory, but instead has a
certain amount of reserved managed memory. Note that no CPU isolation happens
here; currently slots only separate the managed memory of tasks.

By adjusting the number of task slots, users can define how subtasks are
isolated from each other. Having one slot per TaskManager means each task
group runs in a separate JVM (which can be started in a separate container, for
example). Having multiple slots means more subtasks share the same JVM. Tasks
in the same JVM share TCP connections (via multiplexing) and heartbeat
messages. They may also share data sets and data structures, thus reducing the
per-task overhead.
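
The slot count is set per TaskManager with the `taskmanager.numberOfTaskSlots`
option in `flink-conf.yaml`. As a minimal sketch, the same option can also be
passed to an embedded local cluster for experimentation (the class name and
pipeline below are illustrative):

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.TaskManagerOptions;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SlotConfigExample {
    public static void main(String[] args) throws Exception {
        // Equivalent to `taskmanager.numberOfTaskSlots: 3` in flink-conf.yaml,
        // applied here to a local in-JVM cluster.
        Configuration conf = new Configuration();
        conf.setInteger(TaskManagerOptions.NUM_TASK_SLOTS, 3);

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironment(3, conf);

        env.fromElements(1, 2, 3).print();
        env.execute("slot-config-example");
    }
}
```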

<img src="{{ site.baseurl }}/fig/tasks_slots.svg" alt="A TaskManager with Task Slots and Tasks" class="offset" width="80%" />

By default, Flink allows subtasks to share slots even if they are subtasks of
different tasks, so long as they are from the same job. The result is that one
slot may hold an entire pipeline of the job. Allowing this *slot sharing* has
two main benefits:

- A Flink cluster needs exactly as many task slots as the highest parallelism
used in the job. No need to calculate how many tasks (with varying
parallelism) a program contains in total.

- It is easier to get better resource utilization. Without slot sharing, the
non-intensive *source/map()* subtasks would block as many resources as the
resource-intensive *window* subtasks. With slot sharing, increasing the
base parallelism in our example from two to six yields full utilization of
the slotted resources, while making sure that the heavy subtasks are fairly
distributed among the TaskManagers.

<img src="{{ site.baseurl }}/fig/slot_sharing.svg" alt="TaskManagers with shared Task Slots" class="offset" width="80%" />

The APIs also include a *[resource group]({{ site.baseurl }}{% link
dev/stream/operators/index.md %}#task-chaining-and-resource-groups)* mechanism
which can be used to prevent undesirable slot sharing.
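
In the DataStream API, these resource groups are specified as *slot sharing
groups*. A minimal sketch (operator names and data are illustrative): placing
an operator into its own group prevents it, and the downstream operators that
inherit the group, from sharing slots with the operators before it.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SlotSharingExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("a", "bb", "ccc")
                .map(String::toUpperCase)     // stays in the "default" group
                .filter(s -> !s.isEmpty())
                .slotSharingGroup("isolated") // the filter (and downstream
                                              // operators, which inherit the
                                              // group) no longer share slots
                                              // with the operators above
                .print();

        env.execute("slot-sharing-example");
    }
}
```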

As a rule-of-thumb, a good default number of task slots would be the number of
CPU cores. With hyper-threading, each slot then takes 2 or more hardware
thread contexts.

{% top %}
79 changes: 0 additions & 79 deletions docs/concepts/runtime.md

This file was deleted.

99 changes: 0 additions & 99 deletions docs/concepts/runtime.zh.md

This file was deleted.
