Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-720] Enable WindowedWordCountIT on Flink runner in presubmit #2188

Merged
merged 1 commit into from
Mar 23, 2017

Conversation

kennknowles
Copy link
Member

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

  • Make sure the PR title is formatted like:
    [BEAM-<Jira issue #>] Description of pull request
  • Make sure tests pass via mvn clean verify. (Even better, enable
    Travis-CI on your fork and ensure the whole test matrix passes).
  • Replace <Jira issue #> in the title with the actual Jira issue
    number, if there is one.
  • If this contribution is large, please file an Apache
    Individual Contributor License Agreement.

@kennknowles
Copy link
Member Author

@aljoscha when I ran this locally with the in-process Flink it passed, so we might as well have it enabled to try to get some more coverage. Would be nice to have actual Flink cluster coverage of this.

@kennknowles
Copy link
Member Author

Also: if I understand correctly, --streaming=true still matters for FlinkRunner, so we should have two executions here, right?

@asfbot
Copy link

asfbot commented Mar 8, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8187/

Build result: FAILURE

[...truncated 1.12 MB...] ^/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall@2/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/StateSpecFunctions.java:59: error: reference not found * A helper class that is essentially a {@link Serializable} {@link AbstractFunction3}. ^Command line was: /usr/local/asfpackages/java/jdk1.8.0_121/jre/../bin/javadoc @options @packagesRefer to the generated Javadoc files in '/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall@2/runners/spark/target/apidocs' dir. at org.apache.maven.plugin.javadoc.AbstractJavadocMojo.executeJavadocCommandLine(AbstractJavadocMojo.java:5188) at org.apache.maven.plugin.javadoc.AbstractJavadocMojo.executeReport(AbstractJavadocMojo.java:2075) at org.apache.maven.plugin.javadoc.JavadocJar.execute(JavadocJar.java:188) ... 33 more2017-03-08T00:49:57.486 [ERROR] 2017-03-08T00:49:57.486 [ERROR] Re-run Maven using the -X switch to enable full debug logging.2017-03-08T00:49:57.486 [ERROR] 2017-03-08T00:49:57.486 [ERROR] For more information about the errors and possible solutions, please read the following articles:2017-03-08T00:49:57.486 [ERROR] [Help 1] https://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException2017-03-08T00:49:57.486 [ERROR] 2017-03-08T00:49:57.486 [ERROR] After correcting the problems, you can resume the build with the command2017-03-08T00:49:57.486 [ERROR] mvn -rf :beam-runners-sparkchannel stoppedSetting status of 5ee960c to FAILURE with url https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8187/ and message: 'Build finished. 'Using context: Jenkins: Maven clean install
--none--

@aljoscha
Copy link
Contributor

aljoscha commented Mar 8, 2017

tl;dr: LGTM!

Yes, the --streaming flag is considered but if you omit it the Flink runner will choose batch/streaming based on the whether there are any unbounded sources or not. WindowedWordCountIT would be executed as batch since there are no unbounded source.

However, even when you execute this job using --streaming=true it will still terminate because we only have bounded sources and the Flink streaming runner will terminate once all operators finish.

I would like to have a proper "streaming" test with unbounded sources where we manually cancel the job once the stopping condition is met. (Like the TestDataflowRunner does by waiting for the PAssert metrics to show that all PAsserts passed.) For this, however, we need metrics support in the Flink Runner and we need to extend the TestFlinkRunner in a similar way. By the way, how does the Dataflow runner know when the WindowedWordCountIT has finished?

Also, the success matcher set in WindowedWordCountIT is ignored by the TestFlinkRunner.

In the end, this still LGTM! Because some coverage is better than none.

@kennknowles
Copy link
Member Author

Since you mention a proper "streaming" test, this is most easy to do only with triggers that yield deterministic output, unless a runner implements TestStream which offers control over watermarks and processing time, and waiting for quiescence. The Spark runner has CreateStream which is almost the same but tweaked for Spark and we should unify those. For Dataflow, my thought was to try to implement TestStream fairly soon. For Flink, what do you think?

@aljoscha
Copy link
Contributor

aljoscha commented Mar 9, 2017

I think it will be quite hard to do for Flink because there is no upstream communication between operators. Maybe we can do it via instrumentation (via aggregators) and listening to them. How will it work in the Dataflow Runner?

@kennknowles
Copy link
Member Author

I don't have a plan yet, but I'll let you know what I learn :-)

@kennknowles kennknowles changed the title Enable WindowedWordCountIT on Flink runner in presubmit [BEAM-720] Enable WindowedWordCountIT on Flink runner in presubmit Mar 13, 2017
@kennknowles
Copy link
Member Author

R: @jasonkuster @aljoscha

Officially tagging with JIRA and reviewers. Sorry for the mix-up.

@kennknowles
Copy link
Member Author

retest this please

@coveralls
Copy link

Coverage Status

Coverage increased (+0.09%) to 70.143% when pulling 5ee960c on kennknowles:Flink-WindowedWordCount into 0fa1d90 on apache:master.

@asfbot
Copy link

asfbot commented Mar 13, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8358/
--none--

@jasonkuster
Copy link
Contributor

Jenkins passed and inspection of the logs confirms windowedwordcount executed.

LGTM

@davorbonaci
Copy link
Member

Is this merge-ready?

@jasonkuster
Copy link
Contributor

My understanding was yes.

@kennknowles
Copy link
Member Author

Yes. I will merge shortly.

@asfgit asfgit merged commit 5ee960c into apache:master Mar 23, 2017
asfgit pushed a commit that referenced this pull request Mar 23, 2017
@kennknowles kennknowles deleted the Flink-WindowedWordCount branch March 31, 2017 23:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants