-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-7407] Adds portable wordcount test for Python 3 with Flink runner. #8745
Conversation
a58d1e9
to
b0a58b8
Compare
R: @markflyhigh @angoenka |
Ping please, @markflyhigh @angoenka. CC: @aaltay . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tvalentyn !
Some minor comments.
'has Python %d.%d interpreter. See also: BEAM-7474.' % ( | ||
sys.version_info[0], sys.version_info[1])) | ||
|
||
return ('{user}-docker-apache.bintray.io/beam/python' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we log the image URL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
sdks/python/build.gradle
Outdated
@@ -77,11 +77,14 @@ task preCommitPy2() { | |||
dependsOn "lint" | |||
} | |||
|
|||
task portablePreCommit() { | |||
addPortableWordCountTask('BatchPy2', false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can infer batch or streaming from the flag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can, however, the portableWordCountExample definition would be more complicated. Do we meed that task at all? how is it used?
I can remove it if you prefer and remove one of the prefix parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can certainly remove portableWordCountExample but we would like to keep portableWordCount
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -1815,6 +1815,58 @@ class BeamModulePlugin implements Plugin<Project> { | |||
} | |||
} | |||
} | |||
|
|||
project.ext.addPortableWordCountTask = { String nameSuffix, boolean isStreaming -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use implicit argument to get nameSuffix and isStreaming as we do in https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L1649
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know, thanks. I find explicit way more reader-friendly. Unless there is a reason (long list of optional parameters, etc), I think explicit is better than implicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have similar feeling but I think in groovy land, implicit is more popular because we don't have to change method signature to add new arguments or remove arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL. I couldn't find such recommendation in Groovy style guide https://groovy-lang.org/style-guide.html. I moved the streaming parameter into an internal helper function, and it only has 1 parameter right now. Making it implicit requires adding another "Configuration" class, which at this point seems unnecessary boilerplate. I agree that passing maps is better when we have large number of parameters, similar how service requests/responses are JSON / ProtoBufs, but I don't think we should follow this everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @tvalentyn! General lgtm. Some minor thoughts.
|
||
project.task('portableWordCount' + nameSuffix) { | ||
dependsOn 'installGcpTest' | ||
dependsOn = ['installGcpTest'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duplicate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch.
else: | ||
version_suffix = '3' | ||
# TODO(BEAM-7474): Use an image which has correct Python minor version. | ||
logging.warning('Make sure that locally built Python SDK docker image ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it have to be locally built
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have a release process for this image yet, so we build it locally when we run the test.
@@ -1815,6 +1815,58 @@ class BeamModulePlugin implements Plugin<Project> { | |||
} | |||
} | |||
} | |||
|
|||
project.ext.addPortableWordCountTask = { String nameSuffix, boolean isStreaming -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay to have this task in BeamModulePlugin
for now. But my concern is that this may not be scalable if we want to add more variants (e.g. pipelines, configurations) in the future. We probably can make it more configurable or put it in separate groovy if more variants we are end up with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is my concern as well, this is being discussed on the mailing list, let's continue the discussion on https://lists.apache.org/thread.html/7c48200678c570c6e42d4933f8ea4f9324d82f7ae720b591a93ffcda@%3Cdev.beam.apache.org%3E.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, @angoenka @markflyhigh . PTAL.
else: | ||
version_suffix = '3' | ||
# TODO(BEAM-7474): Use an image which has correct Python minor version. | ||
logging.warning('Make sure that locally built Python SDK docker image ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have a release process for this image yet, so we build it locally when we run the test.
'has Python %d.%d interpreter. See also: BEAM-7474.' % ( | ||
sys.version_info[0], sys.version_info[1])) | ||
|
||
return ('{user}-docker-apache.bintray.io/beam/python' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
project.task('portableWordCount' + nameSuffix) { | ||
dependsOn 'installGcpTest' | ||
dependsOn = ['installGcpTest'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch.
@@ -1815,6 +1815,58 @@ class BeamModulePlugin implements Plugin<Project> { | |||
} | |||
} | |||
} | |||
|
|||
project.ext.addPortableWordCountTask = { String nameSuffix, boolean isStreaming -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know, thanks. I find explicit way more reader-friendly. Unless there is a reason (long list of optional parameters, etc), I think explicit is better than implicit.
@@ -1815,6 +1815,58 @@ class BeamModulePlugin implements Plugin<Project> { | |||
} | |||
} | |||
} | |||
|
|||
project.ext.addPortableWordCountTask = { String nameSuffix, boolean isStreaming -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is my concern as well, this is being discussed on the mailing list, let's continue the discussion on https://lists.apache.org/thread.html/7c48200678c570c6e42d4933f8ea4f9324d82f7ae720b591a93ffcda@%3Cdev.beam.apache.org%3E.
sdks/python/build.gradle
Outdated
@@ -77,11 +77,14 @@ task preCommitPy2() { | |||
dependsOn "lint" | |||
} | |||
|
|||
task portablePreCommit() { | |||
addPortableWordCountTask('BatchPy2', false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can, however, the portableWordCountExample definition would be more complicated. Do we meed that task at all? how is it used?
I can remove it if you prefer and remove one of the prefix parameter.
Thanks @tvalentyn |
0a346bd
to
1e96741
Compare
1e96741
to
9f2419b
Compare
Thanks, @angoenka, PTAL. |
Run Java_Examples_Dataflow PreCommit |
Run Python PreCommit |
Run Java PreCommit |
This PR extends the portable python precommit test suite with a Python 3 scenario.
Post-Commit Tests Status (on master branch)
Pre-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.