Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-11752] [dist] Move flink-python to opt #7843

Merged
merged 2 commits into from
Feb 27, 2019

Conversation

uce
Copy link
Contributor

@uce uce commented Feb 26, 2019

What is the purpose of the change

Move flink-python from /lib to /opt directory of distribution.

Verifying this change

$ find lib opt
lib
lib/log4j-1.2.17.jar
lib/slf4j-log4j12-1.7.15.jar
lib/flink-dist_2.11-1.9-SNAPSHOT.jar
opt
opt/flink-metrics-prometheus-1.9-SNAPSHOT.jar
opt/flink-cep-scala_2.11-1.9-SNAPSHOT.jar
opt/flink-sql-client_2.11-1.9-SNAPSHOT.jar
opt/flink-oss-fs-hadoop-1.9-SNAPSHOT.jar
opt/flink-swift-fs-hadoop-1.9-SNAPSHOT.jar
opt/flink-queryable-state-runtime_2.11-1.9-SNAPSHOT.jar
opt/flink-gelly-scala_2.11-1.9-SNAPSHOT.jar
opt/flink-metrics-statsd-1.9-SNAPSHOT.jar
opt/flink-metrics-datadog-1.9-SNAPSHOT.jar
opt/flink-ml_2.11-1.9-SNAPSHOT.jar
opt/flink-metrics-influxdb-1.9-SNAPSHOT.jar
opt/flink-s3-fs-presto-1.9-SNAPSHOT.jar
opt/flink-gelly_2.11-1.9-SNAPSHOT.jar
opt/flink-streaming-python_2.11-1.9-SNAPSHOT.jar
opt/flink-cep_2.11-1.9-SNAPSHOT.jar
opt/flink-metrics-slf4j-1.9-SNAPSHOT.jar
opt/flink-table_2.11-1.9-SNAPSHOT.jar
opt/flink-s3-fs-hadoop-1.9-SNAPSHOT.jar
opt/flink-metrics-graphite-1.9-SNAPSHOT.jar
opt/flink-python_2.11-1.9-SNAPSHOT.jar

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no (but moves dependency in distribution)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
  • The S3 file system connector: no

@uce uce requested a review from zentol February 26, 2019 21:38
@flinkbot
Copy link
Collaborator

flinkbot commented Feb 26, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

  • ✅ 1. The [description] looks good.
  • ✅ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ✅ 4. The change fits into the overall [architecture].
  • ✅ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@uce
Copy link
Contributor Author

uce commented Feb 27, 2019

The test failure seems unrelated to this change. I retriggered the build.

@zentol zentol self-assigned this Feb 27, 2019
@zentol
Copy link
Contributor

zentol commented Feb 27, 2019

@flinkbot approve description consensus attention

How does the pyflink.sh script react to the jar not being present?
It is not documented that you have to copy the jar from opt to lib anywhere. There might be a template for this in python streaming API docs.

@uce
Copy link
Contributor Author

uce commented Feb 27, 2019

@zentol I did not find any reference to the opt dependency in the Python streaming API docs.

For the scripts, we have the following situation:

  1. bin/pyflink.bat and bin/pyflink.sh expect flink-python in lib
  2. bin/pyflink-stream.sh expects flink-streaming-python in opt

For what did we actually need the flink-python dependency in lib? Doesn't it look like user code from Flink's perspective? If the current approach for pyflink-stream.sh works, I'd suggest to update the pyflink.sh and pyflink.bat script to expect the dependency in opt accordingly.

If we do need to have it in lib I'll update the docs to mention the required copy step and leave the scripts untouched.

@zentol
Copy link
Contributor

zentol commented Feb 27, 2019

If pyflink-stream.sh loads a jar from /opt then the script should be changed. Nothing in /opt should be accessed; it only serves as a convenient location for additional jars. It is also inconsistent with other libraries such as the sql-cli.

From a technical perspective, yes, flink-python behaves like user-code. However, from a users-perspective it's part of the system; after all they never pass the jar. One also shouldn't expect python users to be aware of this distinction.

@uce
Copy link
Contributor Author

uce commented Feb 27, 2019

But this means that it's unnecessary for users to copy the JAR to lib in the first place. And from a user's perspective, it's more convenient to simply use the JAR instead of requiring it to be copied somewhere. I would actually go the other way with this and adjust the pyflink scripts to use the jar in opt. This way, no user intervention is required and it works out of the box. Is that OK for you?

Actually, looking at sql-client.sh it also searches the JAR in opt.

@zentol
Copy link
Contributor

zentol commented Feb 27, 2019

I still think we shouldn't willy-nilly load classes from /opt; but if there's precedence go ahead and change the script as you suggested.

Copy link
Contributor

@zentol zentol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@flinkbot approve all

@uce uce merged commit d637d86 into apache:master Feb 27, 2019
@uce uce deleted the flink_11752-python_opt branch February 27, 2019 14:03
HuangZhenQiu pushed a commit to HuangZhenQiu/flink that referenced this pull request Apr 22, 2019
* [FLINK-11752] [dist] Move flink-python to opt

* [FLINK-11752] [dist] Point pyflink scripts to opt
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants