Skip to content

Commit

Permalink
[FLINK-15999][doc] Move broadcast state explanation from api doc to c…
Browse files Browse the repository at this point in the history
…oncepts

This also reorders/massages the text a bit to make it fit in better.
  • Loading branch information
aljoscha committed Feb 21, 2020
1 parent 2ed9885 commit 1714379
Show file tree
Hide file tree
Showing 2 changed files with 19 additions and 15 deletions.
19 changes: 16 additions & 3 deletions docs/concepts/stateful-stream-processing.md
Original file line number Diff line number Diff line change
Expand Up @@ -104,6 +104,8 @@ as many Key Groups as the defined maximum parallelism. During execution each
parallel instance of a keyed operator works with the keys for one or more Key
Groups.

`TODO: potentially leave out Operator State and Broadcast State from concepts documentation`

## Operator State

*Operator State* (or *non-keyed state*) is state that is is bound to one
Expand All @@ -116,9 +118,20 @@ The Operator State interfaces support redistributing state among parallel
operator instances when the parallelism is changed. There can be different
schemes for doing this redistribution.

{% top %}

## State Types (Keyed State, Broadcast State)
## Broadcast State

*Broadcast State* is a special type of *Operator State*. It was introduced to
support use cases where some data coming from one stream is required to be
broadcasted to all downstream tasks, where it is stored locally and is used to
process all incoming elements on the other stream. As an example where
broadcast state can emerge as a natural fit, one can imagine a low-throughput
stream containing a set of rules which we want to evaluate against all elements
coming from another stream. Having the above type of use cases in mind,
broadcast state differs from the rest of operator states in that:
1. it has a map format,
2. it is only available to specific operators that have as inputs a
*broadcasted* stream and a *non-broadcasted* one, and
3. such an operator can have *multiple broadcast states* with different names.

{% top %}

Expand Down
15 changes: 3 additions & 12 deletions docs/dev/stream/state/broadcast_state.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,18 +25,9 @@ under the License.
* ToC
{:toc}

[Working with State](state.html) describes operator state which upon restore is either evenly distributed among the
parallel tasks of an operator, or unioned, with the whole state being used to initialize the restored parallel tasks.

A third type of supported *operator state* is the *Broadcast State*. Broadcast state was introduced to support use cases
where some data coming from one stream is required to be broadcasted to all downstream tasks, where it is stored locally
and is used to process all incoming elements on the other stream. As an example where broadcast state can emerge as a
natural fit, one can imagine a low-throughput stream containing a set of rules which we want to evaluate against all
elements coming from another stream. Having the above type of use cases in mind, broadcast state differs from the rest
of operator states in that:
1. it has a map format,
2. it is only available to specific operators that have as inputs a *broadcasted* stream and a *non-broadcasted* one, and
3. such an operator can have *multiple broadcast states* with different names.
In this section you will learn about how to use broadcast state in practise. Please refer to [Stateful Stream
Processing]({{site.baseurl}}{% link concepts/stateful-stream-processing.md %})
to learn about the concepts behind stateful stream processing.

## Provided APIs

Expand Down

0 comments on commit 1714379

Please sign in to comment.