[FLINK-4118] The docker-flink image is outdated (1.0.2) and can be slimmed down #2176
Some of the changes include:
- Remove unneeded dependencies (nano, wget)
- Remove apt lists to reduce image size
- Reduce number of layers on the docker image (best docker practice)
- Remove useless variables and base the code in generic ones e.g. FLINK_HOME
- Change the default JDK from oracle to openjdk-8-jre-headless, based on two reasons:
  1. You cannot legally repackage the oracle jdk in docker images
  2. The open-jdk headless is more appropriate for a server image (no GUI stuff)
- Return port assignation to the standard FLINK one:
  Variable: docker-flink -> flink
  taskmanager.rpc.port: 6121 -> 6122
  taskmanager.data.port: 6122 -> 6121
  jobmanager.web.port: 8080 -> 8081
The docker images script was simplified and the image size was reduced. Previous image: Image after FLINK-4118
Sorry I had to rebase my previous PR but this is the definitive one.
We don't use the conf folder anymore for the docker image.
parallelization.degree.default: %parallelism%
# general configuration
sed -i -e "s/taskmanager.numberOfTaskSlots: 1/taskmanager.numberOfTaskSlots: `grep -c ^processor /proc/cpuinfo`/g" $FLINK_HOME/conf/flink-conf.yaml
This should only be executed on the JobManager. On the TaskManager it prints the following error, which might worry some users:
taskmanager_1 | sed: can't create temp file '/usr/local/flink/conf/flink-conf.yamlXXXXXX': Read-only file system
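A minimal sketch of what the suggestion above could look like, assuming the start script knows which role it is launching. The function name `set_task_slots`, its arguments, and the role string `jobmanager` are hypothetical and not taken from the PR:

```shell
# Hypothetical helper: rewrite taskmanager.numberOfTaskSlots only on the JobManager.
# $1 = role, $2 = path to flink-conf.yaml, $3 = optional slot count
# (defaults to the processor count read from /proc/cpuinfo, as in the diff).
set_task_slots() {
    role="$1"
    conf="$2"
    slots="${3:-$(grep -c ^processor /proc/cpuinfo)}"
    # Skip the edit on TaskManagers, where the config file may be read-only.
    if [ "$role" = "jobmanager" ]; then
        sed -i -e "s/taskmanager.numberOfTaskSlots: 1/taskmanager.numberOfTaskSlots: ${slots}/g" "$conf"
    fi
}
```

Called as `set_task_slots jobmanager "$FLINK_HOME/conf/flink-conf.yaml"`, this would perform the same substitution as the line under review, while a TaskManager invocation leaves the file untouched and prints nothing.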
Nice work! I didn't know how to use docker but I managed to set it up and use the new version on OS X without a problem. So it seems to work well, and the code is a lot simpler and the image is smaller. LGTM minus the one comment I had about the config file.
…ager Done as suggested in the Pull Request
Nice, I just fixed as you suggested. I have three questions:
Images are based on Ubuntu Trusty 14.04 and run Supervisord to stay alive when running containers.
docker build -t "flink" flink
On my machine this doesn't work, but this does: `docker build -t "flink" .`
Is this maybe a leftover from the earlier version where there was a `flink` directory?
(Earlier I was using `sh build.sh`, that's why I didn't spot the problem.)
Nice catch, just fixed :)
Re 1. I think this should be fine. There is some dynamic code generation but this uses Janino as a library so that shouldn't be a problem. Re 2. I'm not aware of any but it should be easy to add them if we find any in the future. Re 3. This is not easily possible since the
I'd be very happy if you'd like to work on adding a flag or setting to make the daemons start non-daemonized. It should not be too hard to add that, IMHO.
Awesome, thanks @aljoscha, let's merge!
One last thing I would like to try is running a job from an existing Flink installation using I suspect it has something to do with setting the right IP or network setting because Akka is very particular about the IP to which it is bound. Did you get this to work? I'm only managing to access the Web Dashboard.
Hi, I tested it with the basic word count and with the beam pipeline example that @ecesena put for his flink/beam demo. I don't know if it does not work because you are running on docker for mac, but check two things:
The kinglear file must be in all the nodes. I copy those like this:
I haven't seen that you can even execute that example without arguments:
Yes, this is exactly what I was trying on OS X. I'm quickly setting up a ubuntu VM to see if it works there.
You were right, I did exactly the same thing I did on OS X on a new Ubuntu 16.04 installation and it worked. 😃
- Upload a jar to the cluster
`scp -P 220 <your_jar> root@localhost:/<your_path>`
for i in $(docker ps --filter name=flink --format={{.ID}}); do
The jar only needs to be uploaded to the JobManager container, so something like this should suffice:
docker cp <your_jar> $(docker ps --filter name=flink_jobmanager --format={{.ID}}):/<your_path>
Nice I didn't know that flink took care of this, fix in mins.
Yep, the TaskManagers pull it from the JobManager which keeps it in a component called BlobManager.
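For illustration, the single-container upload discussed in this thread could be wrapped roughly as follows. This is only a sketch: the function name `upload_jar`, the `DRY_RUN` switch, and the optional explicit container argument are assumptions added here, while the `docker ps` filter and `docker cp` call mirror the command suggested above.

```shell
# Hypothetical wrapper: copy a jar into the JobManager container only.
# $1 = local jar, $2 = destination path inside the container,
# $3 = optional container ID (looked up via docker ps when omitted).
upload_jar() {
    jar="$1"
    dest="$2"
    container="${3:-$(docker ps --filter name=flink_jobmanager --format='{{.ID}}')}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it (assumed convenience for checking).
        echo "docker cp ${jar} ${container}:${dest}"
    else
        docker cp "${jar}" "${container}:${dest}"
    fi
}
```

With `DRY_RUN=1` set, the function only prints the `docker cp` command it would run, which makes it easy to check the resolved container ID before copying anything.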
I had two more comments about the README but after that it should be good to merge.
This should be ok now. In further PRs I expect to fix the daemon thing + maybe add a HA version using zookeeper of the docker-compose file.
That's great to hear! I'll write something on the Beam ML thread.
Great, thanks for your review.
I merged it, thanks again for your work!