[FLINK-6988] flink-connector-kafka-0.11 with exactly-once semantic #4239

Closed
pnowojski wants to merge 5 commits

Conversation

pnowojski
Contributor

@pnowojski pnowojski commented Jun 30, 2017

First four commits are from #4557 and #4561.

@pnowojski pnowojski force-pushed the kafka011 branch 4 times, most recently from 723a5be to 35ee552 Compare July 3, 2017 15:00
@pnowojski pnowojski changed the title [FLINK-6988] Initial flink-connector-kafka-0.11 with at-least-once semantic [FLINK-6988] flink-connector-kafka-0.11 with exactly-once semantic Jul 13, 2017
@pnowojski pnowojski force-pushed the kafka011 branch 3 times, most recently from 46b3b68 to 2cf5f3b Compare July 17, 2017 15:39
*
* <p><b>NOTE:</b> The implementation currently accesses partition metadata when the consumer
* is constructed. That means that the client that submits the program needs to be able to
* reach the Kafka brokers or ZooKeeper.</p>
Contributor

This NOTE is no longer valid and can be removed.

Contributor Author

Is it also true for 0.10?

Contributor

Yes. I have a separate PR which cleans that up for all Kafka versions.

*
* <p>Details about approach (a):
* Pre Kafka 0.11 producers only follow approach (a), allowing users to use the producer using the
* DataStream.addSink() method.
Contributor

"Pre Kafka 0.11 producers only follow approach (a)" is incorrect.
Kafka 0.10 also supports hybrid.

*/
private boolean logFailuresOnly;

private Semantic semantic;
Contributor

nit: also include Javadoc for consistency.
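For illustration only, a Javadoc along the lines of this nit could look like the following (the wording is a suggestion, not code from this PR):

/** Semantic (delivery guarantee) that this producer provides; see {@link Semantic}. */
private Semantic semantic;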

/** Number of unacknowledged records. */
private final AtomicLong pendingRecords = new AtomicLong();

private final Map<String, KafkaMetricMuttableWrapper> previouslyCreatedMetrics = new HashMap<>();
Contributor

nit: also include Javadoc for consistency.

* @param <TXN> Transaction to store all of the information required to handle a transaction (must be Serializable)
*/
@PublicEvolving
public abstract class TwoPhaseCommitSinkFunction<IN, TXN extends Serializable>
Contributor

I really like the idea of introducing this abstraction :)

Contributor

Overall, though, I would like to see unit tests specifically for this TwoPhaseCommitSinkFunction class.

// was triggered) and because there can be concurrent overlapping checkpoints
// (a new one is started before the previous fully finished).
//
// ==> There should never be a case where we have no pending transaction here
Contributor

Let's move this comment block to a Javadoc on the method.

Contributor Author

Hmm, why do you think so? This is a purely implementation detail, nothing that should bother the user of this class.

Contributor

Ok, makes sense. No strong objection here, can keep as is.

// checkpoint (temporary outage in the storage system) but
// could persist a successive checkpoint (the one notified here)
//
// - other (non Pravega sink) tasks could not persist their status during
Contributor

Is it required to mention Pravega here?

* Flink Sink to produce data into a Kafka topic. This producer is compatible with Kafka 0.11.x. By default producer
* will use {@link Semantic.EXACTLY_ONCE} semantic.
*
* <p>Implementation note: This producer is a hybrid between a regular sink function (a)
Contributor

Does this implementation really support the hybrid modes?
As far as I can understand it, FlinkKafkaProducer011 only extends TwoPhaseCommitSinkFunction, which doesn't support the hybrid modes.

Contributor Author
@pnowojski pnowojski Jul 19, 2017

Yes, according to all of the tests it passes.

The (b) version works by passing an instance of FlinkKafkaProducer011 as a SinkFunction to the KafkaStreamSink<IN> extends StreamSink<IN> class. Under the hood this StreamSink checks whether the SinkFunction actually implements the various checkpointing interfaces and, if so, calls the appropriate methods on FlinkKafkaProducer011.
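A rough sketch of that delegation (a simplified illustration, not the actual StreamSink source; userFunction stands for the wrapped FlinkKafkaProducer011):

// Simplified sketch: forward checkpoint hooks only if the wrapped
// SinkFunction implements the corresponding interface.
if (userFunction instanceof CheckpointedFunction) {
    ((CheckpointedFunction) userFunction).snapshotState(context);
}
if (userFunction instanceof CheckpointListener) {
    ((CheckpointListener) userFunction).notifyCheckpointComplete(checkpointId);
}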

// for the reasons discussed in the 'notifyCheckpointComplete()' method.

pendingCommitTransactionsState = context.getOperatorStateStore().getSerializableListState("pendingCommitTransactions");
pendingTransactionsState = context.getOperatorStateStore().getSerializableListState("pendingTransactions");
Contributor

getSerializableListState is deprecated and its usage is discouraged.
I would recommend allowing implementations to pass in either a TypeInformation or their own TypeSerializer for the transaction state holder.
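For illustration, a sketch of that suggestion, assuming the subclass provides a transactionSerializer (a TypeSerializer<TXN>, an assumption for this example); the state names mirror the reviewed code:

// Sketch only: explicit descriptors with a caller-provided serializer
// instead of the deprecated getSerializableListState(...).
pendingCommitTransactionsState = context.getOperatorStateStore().getListState(
    new ListStateDescriptor<>("pendingCommitTransactions", transactionSerializer));
pendingTransactionsState = context.getOperatorStateStore().getListState(
    new ListStateDescriptor<>("pendingTransactions", transactionSerializer));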

@tzulitai
Contributor

Thanks a lot for opening a pull request for this very important feature @pnowojski.
I did a rough first pass and had some comments I would like to clear out first (this is a big chunk of code, we would probably need to go through this quite a few times before it can be mergeable.)

Most notably, some comments so far:

  1. I think we need UTs for the TwoPhaseCommitSinkFunction. It alone is a very important addition (I would even prefer a separate PR for it and try to merge that first.)
  2. Serialization of the transaction state in TwoPhaseCommitSinkFunction needs to be changed
  3. Is the FlinkKafkaProducer011 actually supporting hybrid (normal sink function and writeToKafkaWithTimestamps as a custom sink operator)? From the looks of it, it doesn't seem like it.

@tzulitai
Contributor

tzulitai commented Jul 19, 2017

Regarding how I would proceed with this big contribution:
Let's first try to clean up the commits that are bundled all together here.

  1. I would first try to merge [FLINK-7174] Bump Kafka 0.10 dependency to 0.10.2.1 #4321 (the first 4 commits) and [misc] Commit read offsets in Kafka integration tests #4310 (af7ed19) and get those out of the way. Could you also prioritize in commenting on those first (I've left new comments there)?
  2. For a06cb94 (TwoPhaseCommitSinkFunction), could you open a separate PR with unit tests covered?
  3. After the above is all sorted out, we rebase this again.

@tzulitai
Contributor

tzulitai commented Jul 19, 2017

One other comment regarding the commits:
I would argue that df6d5e0 to 5ff8106 should not appear in the commit log history, since in the end we actually have a completely new producer for 011 anyway.
Also, 321a142 to 2cf5f3b should be squashed to a single commit for the addition of an "exactly-once producer for 011" (the new FlinkKafkaProducer implementation and exactly-once tests shouldn't stand alone as independent commits, IMO. FlinkKafkaProducer isn't used by other producer versions, and the exactly-once producer addition wouldn't be valid without the tests).

What do you think?

@pnowojski
Contributor Author

df6d5e0 to 5ff8106 should definitely be squashed, I left them only to make it easier for reviewers to follow the changes made in 0.11 vs 0.10 connectors (those changes would be invisible in one blob commit).

For 321a142 to 2cf5f3b I'm not sure about the first one; FlinkKafkaProducer is hacky enough that it could deserve a separate commit. That would make it stand out more if anyone looks at the commit history/changes in the future (it could hide in a larger change).

@tzulitai
Contributor

Ok :) I can agree that we keep 321a142 a separate commit.
For df6d5e0 to 5ff8106, I actually found it easier to ignore all that (because a lot of it is irrelevant in the end) and went straight to 41ad973.

Contributor Author
@pnowojski pnowojski left a comment

I have added tests for TwoPhaseCommitSinkFunction and opened a new PR #4368 for that; however, please check the responses to your comments here before moving to that new PR.

*
* <p><b>NOTE:</b> The implementation currently accesses partition metadata when the consumer
* is constructed. That means that the client that submits the program needs to be able to
* reach the Kafka brokers or ZooKeeper.</p>
Contributor Author

Is it also true for 0.10?

// was triggered) and because there can be concurrent overlapping checkpoints
// (a new one is started before the previous fully finished).
//
// ==> There should never be a case where we have no pending transaction here
Contributor Author

Hmm, why do you think so? This is a purely implementation detail, nothing that should bother the user of this class.

@zentol
Contributor

zentol commented Jul 28, 2017

Please add an entry to the MODULES_CONNECTORS variable in the tools/travis_mvn_watchdog.sh script.

@tzulitai
Contributor

tzulitai commented Aug 7, 2017

Now that the prerequisite PRs are merged, we can rebase this now :)

@pnowojski pnowojski force-pushed the kafka011 branch 5 times, most recently from cc48a21 to 35cc64f Compare August 8, 2017 12:37
@rangadi

rangadi commented Aug 9, 2017

How does the exactly-once sink handle a large gap between preCommit() and recoverAndCommit() in case of a recovery? The broker seems to abort a transaction after the transaction.max.timeout.ms timeout.

@pnowojski
Contributor Author

I think there is no way we can handle it other than increasing the timeout to some very large value. Or is there?
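For example, the producer-side transaction timeout can be raised through the producer Properties (the value below is purely illustrative; the broker's transaction.max.timeout.ms must be at least as large, otherwise the producer's setting is rejected):

Properties producerConfig = new Properties();
producerConfig.setProperty("bootstrap.servers", "localhost:9092");
// Illustrative value: keep transactions open for up to 1 hour to cover
// a long gap between preCommit() and recoverAndCommit().
producerConfig.setProperty("transaction.timeout.ms", String.valueOf(60 * 60 * 1000));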

public void initializeState(FunctionInitializationContext context) throws Exception {
availableTransactionalIds.clear();
for (int i = 0; i < kafkaProducersPoolSize; i++) {
availableTransactionalIds.add(UUID.randomUUID().toString());

Probably better to reuse stored ids rather than creating new ones each time. I am thinking of a case where a task goes into crash loop dying even before first commit.

Contributor

I think that makes sense, but I guess it's mostly because the current code isn't differentiating between used and unused transaction ids in the state. If we differentiated that, it would be possible to reuse stored ids.

Piotr, what do you think?

Contributor Author

That's a valid issue, however on its own this solution would not be enough. It would not work for the case where we first (1) scale down and then (2) scale up. On event (2), we would need to create new transactional ids, but we wouldn't know from which id we can start.

However I think we can deduce the starting point for new IDs using getUnionListState to track globally what the next available transactional id is.
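A sketch of that idea (names are illustrative, not taken from this PR): keep a "next transactional id" hint in union list state, so that after rescaling every subtask sees the globally largest value and new subtasks know where fresh ids may start.

// Sketch only: union-redistributed operator state gives each subtask
// the full list of hints on restore.
ListState<Long> nextTransactionalIdHintState = context.getOperatorStateStore().getUnionListState(
    new ListStateDescriptor<>("nextTransactionalIdHint", Long.class));
long nextFreeTransactionalId = 0;
for (Long hint : nextTransactionalIdHintState.get()) {
    nextFreeTransactionalId = Math.max(nextFreeTransactionalId, hint);
}
// New subtasks created on scale-up can derive fresh ids starting from nextFreeTransactionalId.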

@rangadi rangadi Aug 23, 2017

we wouldn't know from which id we can start.

Not sure if you need a 'start id'. You can just abort all of them whether there are any open transactions or not (in fact, if you open a new producer with the id, Kafka aborts any that are open). This is mainly for clarification, will leave it to you guys to decide on the specifics.

Contributor
@tzulitai tzulitai left a comment

Thanks for the follow-up revision @pnowojski.
I think the latest approach we're going for seems sane.

I've only checked the code on the producer side. I'm assuming that the other code (consumer, table sink / sources) is mostly identical to the other versions. From what I see, I think this is almost mergeable, minus some comments I had left inline.

* <li>{@link #NONE}</li>
*/
public enum Semantic {
/**
Contributor

nit: empty line before comment block.

* <li>increase size of {@link FlinkKafkaProducer}s pool</li>
*/
EXACTLY_ONCE,
/**
Contributor

nit: empty line before comment block.

* to be acknowledged by the Kafka producer on a checkpoint.
*/
AT_LEAST_ONCE,
/**
Contributor

nit: empty line before comment block.

/**
* Default number of KafkaProducers in the pool. See {@link Semantic#EXACTLY_ONCE}.
*/
public static final int DEFAULT_KAFKA_PRODUCERS_POOL_SIZE = 5;
Contributor

Could you briefly describe the reason for the number 5?
Why not use numConcurrentCheckpoints + 1 (as we discussed offline)?

Contributor Author

As I remember, the reason was that it is not easy/not possible at the moment to get this information in the operator. That should be follow-up work. Regardless of this, the code of this operator would look the same (because we have no guarantee that notifyCheckpointComplete always reaches us on time).

/**
* Pool of transactional ids backed up in state.
*/
private ListState<String> transactionalIdsState;
Contributor

We can probably make this transient also for documentation purposes.

availableTransactionalIds.add(UUID.randomUUID().toString());
}

super.initializeState(context);
Contributor

Could we initialize the base TwoPhaseCommitSink first?

Contributor Author

nope :(

this.kafkaProducer = kafkaProducer;
}

// TODO: is this used anywhere?
Contributor

I think this variant is used when using writeToKafkaWithTimestamps

private static final Logger LOG = LoggerFactory.getLogger(FlinkKafkaProducer.class);

private final KafkaProducer<K, V> kafkaProducer;
@Nullable
Contributor

nit: empty line before this field annotation.

if (!(producerId >= 0 && epoch >= 0)) {
throw new IllegalStateException(String.format("Incorrect values for producerId [%s] and epoch [%s]", producerId, epoch));
}
LOG.info("Attempting to resume transaction with producerId [%s] and epoch [%s]", producerId, epoch);
Contributor

Use {} placeholders instead of [%s]; SLF4J does not interpret %s formatting.
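The suggested fix, for reference:

LOG.info("Attempting to resume transaction with producerId [{}] and epoch [{}]", producerId, epoch);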


public void resumeTransaction(long producerId, short epoch) {
if (!(producerId >= 0 && epoch >= 0)) {
throw new IllegalStateException(String.format("Incorrect values for producerId [%s] and epoch [%s]", producerId, epoch));
Contributor

Can use Preconditions.checkState(...) here.
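For example, a sketch of that suggestion using org.apache.flink.util.Preconditions:

Preconditions.checkState(producerId >= 0 && epoch >= 0,
    "Incorrect values for producerId %s and epoch %s", producerId, epoch);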

@pnowojski pnowojski force-pushed the kafka011 branch 4 times, most recently from 5ced4ea to dfb7e24 Compare August 29, 2017 10:15
@aljoscha
Contributor

I did a first high-level review of the code. I think it's good so far!

Before we can merge this, however, we need a few more things around it:

  • A section in the Kafka doc about the new exactly-once mode, how it can be configured etc.
  • A big disclaimer (possibly in an "alert" box) about the interplay with the transaction timeout and what the caveats there are
  • A section in the Javadocs about the aforementioned caveats
  • A check that ensures that the transaction timeout is set to a reasonably high setting (say 1 hour) when exactly-once semantics are enabled. (With an override setting that allows the user to set a lower transaction time out if they want to.)

Also, this has interplay with #4616 but we can resolve that by merging them in any order and fixing up the later changes when merging.

@pnowojski pnowojski force-pushed the kafka011 branch 2 times, most recently from cbfc50d to e2d477f Compare September 4, 2017 13:11
@pnowojski
Contributor Author

pnowojski commented Sep 4, 2017

@aljoscha I have addressed your high-level comments and fixed some bugs. Please check the 5 latest commits (one of them is a new dependency on another PR: #4631).

@aljoscha
Contributor

aljoscha commented Sep 6, 2017

What were the bugs that you fixed?

@pnowojski
Contributor Author

Bugs in the tests (the ones you can see in the fixup commits).

@ariskk

ariskk commented Oct 4, 2017

We are really looking forward to this 👍

@pnowojski pnowojski force-pushed the kafka011 branch 2 times, most recently from ed98a07 to 32ff813 Compare October 4, 2017 09:20
@pnowojski
Contributor Author

@aljoscha rebased on latest master and integrated your changes

@aljoscha
Contributor

aljoscha commented Oct 9, 2017

Merged! 😃

Could you please close this PR?

@pnowojski
Contributor Author

Thanks :)
