Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-1605] Bundle all hadoop dependencies and shade guava away #454

Closed
wants to merge 6 commits into from

Conversation

rmetzger
Copy link
Contributor

@rmetzger rmetzger commented Mar 4, 2015

I don't have a working eclipse setup right now, so I didn't test this change with eclipse.

I would be very interested in some feedback from there.
Its working with IntelliJ.

@StephanEwen
Copy link
Contributor

Looks good, but also like a lot of magic. Can you briefly summarize here (and copy it to the wiki) how the setup works? I am especially confused by the extra YARN shaded versions (two of them).

Some remaining questions:

  • Why give up the ship directory? I think it was good to differentiate between ship and lib.
  • What is included in the final flink-dist fat jar? All streaming connectors? The hbase connector and hadoop compatibility?

@rmetzger
Copy link
Contributor Author

rmetzger commented Mar 5, 2015

I know, the stuff is very complicated.

I've documented the setup here: https://cwiki.apache.org/confluence/display/FLINK/Dependency+Shading+in+Flink
Please let me know if there are details missing in the text.

Regarding ship and lib: If you want I can undo the change (I've spend some time implementing the feature, so I'm actually in favor of keeping it).
I've removed it because I thought non-yarn users might be confused by this directory (i combined the hadoop2 and hadoop2-yarn binaries).

The final flink-dist fat jar contains:

  • the streaming connectors (my java6 pr will remove them from there
  • no hbase
  • no hadoopcompat

please don't merge the pull request! I've found an issue while writing the documentation. I'm currently verifying my fix.

@rmetzger
Copy link
Contributor Author

rmetzger commented Mar 9, 2015

The pull request has been rebased to master.

I'm looking for some feedback for this change.

<version>${hadoop.version}</version>
<exclusions>
<exclusion>
<groupId>asm</groupId>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I this groupId correct, or should it be org.ow2.asm ?

@StephanEwen
Copy link
Contributor

Here are some points to double check

  • You exclude all the flink-clients web docs from the uberjar. I would guess this breaks the webclient, because it uses the JAR file as its web root, or am I overlooking something?
  • The uberjar has the YARN session as the main class. Seems strange to make this one the main class of the Uberjar, when there are several main classes (JobManager, TaskManager, CliFrontend, WebClient, AppMaster, ...)
  • Previously, the shade plugin was deactivated for the quickstarts. Is that no longer necessary?
  • Comments inline about the asm groupId

If none of these points is actually an issue, then +1 to merge

@rmetzger
Copy link
Contributor Author

Thank you for reviewing the change.

  • The exclusions for the flink-clients were indeed wrong. Good catch!
  • It doesn't make much sense to set a main class for the uberjar. We don't use that property anywhere. I removed it.
  • I've deactivated shading again for the quickstarts ..
  • It seems that the asm package changed the groupId around the 4.0 release. I'll add both groupIds to the exclusions.

I've updated the pull requests. Lets see what travis says about it: https://travis-ci.org/rmetzger/flink/builds/53795738

@StephanEwen
Copy link
Contributor

Looks good to me.

+1 to merge, if Travis (or Bob the Builder) agrees

<artifactId>hadoop-common</artifactId>
<version>${hadoop.version}</version>
</dependency>
<!-- NOTE: We do not exclude hadoop's dependencies so that the tests start property (in particular jetty) -->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: properly

@tillrohrmann
Copy link
Contributor

That are some massive changes to the pom files. As far as I can tell it looks good to me. The Akka changes look good.

@rmetzger
Copy link
Contributor Author

Thanks for the review.
I'll address your comments and then merge it.

@hsaputra
Copy link
Contributor

How much time does this new shading add to the total compile? It used to be around 16-18mins for me using mvn clean install -DskipTests.

I just did merge today and it has been more than 25 mins and has not complete the build =(

EDIT:
Looks like false alarm bc my Linux box can compile this in about 8 mins, something wrong with my MBP setup/ env =(

@rmetzger
Copy link
Contributor Author

Hey Henry,
I'm sorry that the change is making your builds so much slower. My linux machine needs 5 minutes to build the current master.
I suspect there is something wrong on your MBP setup.

marthavk pushed a commit to marthavk/flink that referenced this pull request Jun 9, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants