[FLINK-1056] Use maven-assembly to build fat-jar in quickstarts
Also update documentation accordingly.
aljoscha committed Sep 25, 2014
1 parent d2c2cc3 commit c3c7e2d
Showing 8 changed files with 353 additions and 179 deletions.
13 changes: 10 additions & 3 deletions docs/java_api_quickstart.md
@@ -9,9 +9,11 @@ Start working on your Flink Java program in a few simple steps.


## Requirements

The only requirements are working __Maven 3.0.4__ (or higher) and __Java 6.x__ (or higher) installations.

## Create Project

Use one of the following commands to __create a project__:

<ul class="nav nav-tabs" style="border-bottom: none;">
@@ -36,6 +38,7 @@ Use one of the following commands to __create a project__:
</div>

## Inspect Project

There will be a new directory in your working directory. If you've used the _curl_ approach, the directory is called `quickstart`. Otherwise, it has the name of your artifactId.

The sample project is a __Maven project__, which contains two classes. _Job_ is a basic skeleton program and _WordCountJob_ a working example. Please note that the _main_ methods of both classes allow you to start Flink in a development/testing mode.
@@ -46,9 +49,12 @@ We recommend to __import this project into your IDE__ to develop and test it. If
A note to Mac OS X users: The default JVM heapsize for Java is too small for Flink. You have to manually increase it. Choose "Run Configurations" -> Arguments and write into the "VM Arguments" box: "-Xmx800m" in Eclipse.

## Build Project
If you want to __build your project__, go to your project directory and issue the `mvn clean package` command. You will __find a jar__ that runs on every Flink cluster in `target/flink-project-0.1-SNAPSHOT.jar`.

If you want to __build your project__, go to your project directory and issue the `mvn clean package` command. You will __find a jar__ that runs on every Flink cluster in __target/your-artifact-id-1.0-SNAPSHOT.jar__. There is also a fat-jar, __target/your-artifact-id-1.0-SNAPSHOT-flink-fat-jar.jar__, which
also contains all dependencies that get added to the Maven project.
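
The quickstart `pom.xml` changed in this commit configures a `Program-Class` manifest entry for the assembled jar. As a minimal sketch, assuming only the JDK's `java.util.jar` API, the entry could be read back from a built jar to check that it points at the intended entry class. `ProgramClassCheck` is a hypothetical helper and is not part of the quickstart.

```java
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

/** Hypothetical helper (not part of the quickstart): reads Program-Class from a jar manifest. */
final class ProgramClassCheck {

    /** Returns the Program-Class main attribute, or null if the manifest does not define it. */
    static String programClass(Manifest manifest) {
        return manifest.getMainAttributes().getValue("Program-Class");
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a built jar, for example the fat-jar under target/
        try (JarFile jar = new JarFile(args[0])) {
            Manifest manifest = jar.getManifest(); // null when the jar has no manifest
            String entry = manifest == null ? null : programClass(manifest);
            System.out.println(entry == null ? "no Program-Class entry" : entry);
        }
    }
}
```

Run it with the path of a jar produced by `mvn clean package` as the only argument. If the printed class name is not the program you intend to submit, the `Program-Class` value in the `pom.xml` is the place to look.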

## Next Steps

Write your application!

The quickstart project contains a WordCount implementation, the "Hello World" of Big Data processing systems. The goal of WordCount is to determine the frequencies of words in a text, e.g., how often the terms "the" or "house" occur in all Wikipedia texts.
@@ -121,6 +127,7 @@ public class LineSplitter extends FlatMapFunction<String, Tuple2<String, Integer
}
~~~

{% gh_link /flink-examples/flink-java-examples/src/main/java/org/apache/flink/example/java/wordcount/WordCount.java "Check GitHub" %} for the full example code.
{% gh_link /flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/wordcount/WordCount.java "Check GitHub" %} for the full example code.

For a complete overview over our API, have a look at the [Programming Guide](programming_guide.html) and [further example programs](examples.html). If you have any trouble, ask on our [Mailing List](http://mail-archives.apache.org/mod_mbox/incubator-flink-dev/). We are happy to provide help.

For a complete overview over our Java API, have a look at the [API Documentation](java_api_guide.html) and [further example programs](java_api_examples.html). If you have any trouble, ask on our [Mailing List](http://mail-archives.apache.org/mod_mbox/incubator-flink-dev/). We are happy to provide help.
62 changes: 57 additions & 5 deletions docs/scala_api_quickstart.md
@@ -8,10 +8,12 @@ title: "Quickstart: Scala API"
Start working on your Flink Scala program in a few simple steps.

## Requirements

The only requirements are working __Maven 3.0.4__ (or higher) and __Java 6.x__ (or higher) installations.


## Create Project

Use one of the following commands to __create a project__:

<ul class="nav nav-tabs" style="border-bottom: none;">
@@ -37,9 +39,10 @@ $ mvn archetype:generate \


## Inspect Project
There will be a __new directory in your working directory__. If you've used the _curl_ approach, the directory is called `quickstart`. Otherwise, it has the name of your artifactId.

The sample project is a __Maven project__, which contains a sample scala _job_ that implements Word Count. Please note that the _RunJobLocal_ and _RunJobRemote_ objects allow you to start Flink in a development/testing mode.</p>
There will be a new directory in your working directory. If you've used the _curl_ approach, the directory is called `quickstart`. Otherwise, it has the name of your artifactId.

The sample project is a __Maven project__, which contains two classes. _Job_ is a basic skeleton program and _WordCountJob_ a working example. Please note that the _main_ methods of both classes allow you to start Flink in a development/testing mode.

We recommend to __import this project into your IDE__. For Eclipse, you need the following plugins, which you can install from the provided Eclipse Update Sites:

@@ -57,10 +60,59 @@ The IntelliJ IDE also supports Maven and offers a plugin for Scala development.

## Build Project

If you want to __build your project__, go to your project directory and issue the`mvn clean package` command. You will __find a jar__ that runs on every Flink cluster in __target/flink-project-0.1-SNAPSHOT.jar__.
If you want to __build your project__, go to your project directory and issue the `mvn clean package` command. You will __find a jar__ that runs on every Flink cluster in __target/your-artifact-id-1.0-SNAPSHOT.jar__. There is also a fat-jar, __target/your-artifact-id-1.0-SNAPSHOT-flink-fat-jar.jar__, which
also contains all dependencies that get added to the Maven project.

## Next Steps

__Write your application!__
If you have any trouble, ask on our [Jira page](https://issues.apache.org/jira/browse/FLINK) (open an issue) or on our Mailing list. We are happy to provide help.
Write your application!

The quickstart project contains a WordCount implementation, the "Hello World" of Big Data processing systems. The goal of WordCount is to determine the frequencies of words in a text, e.g., how often the terms "the" or "house" occur in all Wikipedia texts.

__Sample Input__:

~~~bash
big data is big
~~~

__Sample Output__:

~~~bash
big 2
data 1
is 1
~~~

The following code shows the WordCount implementation from the Quickstart, which processes some text lines with a few operators (flatMap, map, groupBy, and sum) and prints the resulting words and counts to std-out.

~~~scala
object WordCountJob {
  def main(args: Array[String]) {

    // set up the execution environment
    val env = ExecutionEnvironment.getExecutionEnvironment

    // get input data
    val text = env.fromElements("To be, or not to be,--that is the question:--",
      "Whether 'tis nobler in the mind to suffer", "The slings and arrows of outrageous fortune",
      "Or to take arms against a sea of troubles,")

    val counts = text.flatMap { _.toLowerCase.split("\\W+") }
      .map { (_, 1) }
      .groupBy(0)
      .sum(1)

    // emit result
    counts.print()

    // execute program
    env.execute("WordCount Example")
  }
}
~~~
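
For comparison, the same tokenize-and-count logic can be sketched in plain Java without Flink. This is only an illustration under stated assumptions: `PlainWordCount` is a hypothetical class, it uses Java 8 streams (newer than the Java 6 minimum listed under Requirements), and it works on a local list rather than a distributed data set. Empty tokens are filtered out because `split("\\W+")` yields a leading empty string when a line starts with a non-word character.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical, Flink-free illustration of the WordCount logic. */
final class PlainWordCount {

    /** Lower-cases each line, splits it on non-word characters, and counts each word. */
    static Map<String, Long> count(List<String> lines) {
        return lines.stream()
                // corresponds to flatMap { _.toLowerCase.split("\\W+") }
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+")))
                // drop the empty token produced by leading punctuation
                .filter(word -> !word.isEmpty())
                // corresponds to map { (_, 1) }, groupBy(0), sum(1); TreeMap sorts by word
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // uses the sample input shown above
        count(Arrays.asList("big data is big"))
                .forEach((word, n) -> System.out.println(word + " " + n));
    }
}
```

For the sample input `big data is big`, this prints `big 2`, `data 1`, and `is 1`, one per line, which matches the sample output above.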

{% gh_link /flink-examples/flink-scala-examples/src/main/scala/org/apache/flink/examples/scala/wordcount/WordCount.scala "Check GitHub" %} for the full example code.

For a complete overview over our API, have a look at the [Programming Guide](programming_guide.html) and [further example programs](examples.html). If you have any trouble, ask on our [Mailing List](http://mail-archives.apache.org/mod_mbox/incubator-flink-dev/). We are happy to provide help.


@@ -17,15 +17,20 @@ specific language governing permissions and limitations
under the License.
-->

<archetype-descriptor xmlns="http://maven.apache.org/plugins/maven-archetype-plugin/archetype-descriptor/1.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/plugins/maven-archetype-plugin/archetype-descriptor/1.0.0 http://maven.apache.org/xsd/archetype-descriptor-1.0.0.xsd"
name="flink-quickstart-java">
<fileSets>
<fileSet filtered="true" packaged="true" encoding="UTF-8">
<directory>src/main/java</directory>
<includes>
<include>**/*.java</include>
</includes>
</fileSet>
</fileSets>
<archetype-descriptor
xmlns="http://maven.apache.org/plugins/maven-archetype-plugin/archetype-descriptor/1.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/plugins/maven-archetype-plugin/archetype-descriptor/1.0.0 http://maven.apache.org/xsd/archetype-descriptor-1.0.0.xsd"
name="flink-quickstart-java">
<fileSets>
<fileSet filtered="true" packaged="true" encoding="UTF-8">
<directory>src/main/java</directory>
<includes>
<include>**/*.java</include>
</includes>
</fileSet>
<fileSet encoding="UTF-8">
<directory>src/assembly</directory>
</fileSet>
</fileSets>
</archetype-descriptor>
@@ -61,22 +61,35 @@ under the License.
</dependency>
</dependencies>

<!-- We use the maven-jar-plugin to generate a runnable jar that you can
submit to your Flink cluster. -->
<!-- We use the maven-assembly plugin to create a fat jar that contains all dependencies
except Flink and its transitive dependencies. The resulting fat-jar can be executed
on a cluster. Change the value of Program-Class if your program entry point changes. -->
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>2.4</version>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.4.1</version>
<configuration>
<descriptors>
<descriptor>src/assembly/flink-fat-jar.xml</descriptor>
</descriptors>
<archive>
<manifestEntries>
<program-class>${package}.Job</program-class>
<Program-Class>${package}.Job</Program-Class>
</manifestEntries>
</archive>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>

<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
@@ -0,0 +1,39 @@
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
-->

<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2 http://maven.apache.org/xsd/assembly-1.1.2.xsd">
<id>flink-fat-jar</id>
<formats>
<format>jar</format>
</formats>
<includeBaseDirectory>false</includeBaseDirectory>
<dependencySets>
<dependencySet>
<outputDirectory>/</outputDirectory>
<useProjectArtifact>true</useProjectArtifact>
<excludes>
<exclude>org.apache.flink:*</exclude>
</excludes>
<useTransitiveFiltering>true</useTransitiveFiltering>
<unpack>true</unpack>
<scope>runtime</scope>
</dependencySet>
</dependencySets>
</assembly>
@@ -1,4 +1,3 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
@@ -17,15 +16,21 @@ KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<archetype-descriptor xsi:schemaLocation="http://maven.apache.org/plugins/maven-archetype-plugin/archetype-descriptor/1.0.0 http://maven.apache.org/xsd/archetype-descriptor-1.0.0.xsd" name="prj-scala-only"
xmlns="http://maven.apache.org/plugins/maven-archetype-plugin/archetype-descriptor/1.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<fileSets>
<fileSet encoding="UTF-8" filtered="true" packaged="true">
<directory>src/main/scala</directory>
<includes>
<include>**/*.scala</include>
</includes>
</fileSet>
</fileSets>

<archetype-descriptor
xmlns="http://maven.apache.org/plugins/maven-archetype-plugin/archetype-descriptor/1.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/plugins/maven-archetype-plugin/archetype-descriptor/1.0.0 http://maven.apache.org/xsd/archetype-descriptor-1.0.0.xsd"
name="flink-quickstart-java">
<fileSets>
<fileSet filtered="true" packaged="true" encoding="UTF-8">
<directory>src/main/scala</directory>
<includes>
<include>**/*.scala</include>
</includes>
</fileSet>
<fileSet encoding="UTF-8">
<directory>src/assembly</directory>
</fileSet>
</fileSets>
</archetype-descriptor>