[FLINK-16455][hive] Introduce flink-sql-connector-hive modules to provide hive uber jars


This closes apache#11328
JingsongLi authored Mar 11, 2020
1 parent 5afe1b5 commit bd18b87
Showing 11 changed files with 557 additions and 97 deletions.
80 changes: 31 additions & 49 deletions docs/dev/table/hive/index.md
@@ -92,6 +92,32 @@ to make the integration work in Table API program or SQL in SQL Client.
Alternatively, you can put these dependencies in a dedicated folder, and add them to classpath with the `-C`
or `-l` option for Table API program or SQL Client respectively.

Apache Hive is built on Hadoop, so you need to provide the Hadoop dependencies first. Please refer to
[Providing Hadoop classes]({{ site.baseurl }}/ops/deployment/hadoop.html#providing-hadoop-classes).

There are two ways to add Hive dependencies. The first is to use Flink's bundled Hive jars: choose a bundled Hive jar according to the version of the metastore you use. The second is to add each of the required jars separately, which can be useful if the Hive version you're using is not listed here.

#### Using bundled hive jar

The following table lists all available bundled Hive jars. Pick one and put it in the `/lib/` directory of the Flink distribution.

{% if site.is_stable %}

| Metastore version | Maven dependency | SQL Client JAR |
| :---------------- | :--------------------------- | :----------------------|
| 1.0.0 - 1.2.2 | `flink-connector-hive-1.2.2` | [Download](https://central.maven.org/maven2/org/apache/flink/flink-sql-connector-hive-1.2.2{{site.scala_version_suffix}}/{{site.version}}/flink-sql-connector-hive-1.2.2{{site.scala_version_suffix}}-{{site.version}}.jar) |
| 2.0.0 - 2.2.0 | `flink-connector-hive-2.2.0` | [Download](https://central.maven.org/maven2/org/apache/flink/flink-sql-connector-hive-2.2.0{{site.scala_version_suffix}}/{{site.version}}/flink-sql-connector-hive-2.2.0{{site.scala_version_suffix}}-{{site.version}}.jar) |
| 2.3.0 - 2.3.6 | `flink-connector-hive-2.3.6` | [Download](https://central.maven.org/maven2/org/apache/flink/flink-sql-connector-hive-2.3.6{{site.scala_version_suffix}}/{{site.version}}/flink-sql-connector-hive-2.3.6{{site.scala_version_suffix}}-{{site.version}}.jar) |
| 3.0.0 - 3.1.2 | `flink-connector-hive-3.1.2` | [Download](https://central.maven.org/maven2/org/apache/flink/flink-sql-connector-hive-3.1.2{{site.scala_version_suffix}}/{{site.version}}/flink-sql-connector-hive-3.1.2{{site.scala_version_suffix}}-{{site.version}}.jar) |

{% else %}

These tables are only available for stable releases.

{% endif %}
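
The table above can be read as a lookup from metastore version to bundled jar. The following is a minimal sketch of that lookup, assuming only the version ranges listed in the table. The helper name `bundled_hive_jar` and the tuple-based comparison are illustrative and not part of Flink.

```python
# Version ranges copied from the table above:
# (lowest metastore version, highest metastore version, bundled jar version).
BUNDLED_RANGES = [
    ((1, 0, 0), (1, 2, 2), "1.2.2"),
    ((2, 0, 0), (2, 2, 0), "2.2.0"),
    ((2, 3, 0), (2, 3, 6), "2.3.6"),
    ((3, 0, 0), (3, 1, 2), "3.1.2"),
]


def bundled_hive_jar(metastore_version):
    """Return the bundled jar name for a metastore version, or None.

    Hypothetical helper for illustration; Flink does not ship this function.
    """
    # "2.3.4" -> (2, 3, 4), so versions compare numerically, not as strings.
    parsed = tuple(int(part) for part in metastore_version.split("."))
    for low, high, jar_version in BUNDLED_RANGES:
        if low <= parsed <= high:
            return "flink-sql-connector-hive-" + jar_version
    # Not covered by any bundled jar: fall back to user defined dependencies.
    return None
```

A `None` result corresponds to the second option described above, adding each required jar separately.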

#### User defined dependencies

Please find the required dependencies for different Hive major versions below.


@@ -105,12 +131,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector. Contains flink-hadoop-compatibility and flink-orc jars
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.7.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-exec-2.3.4.jar

@@ -125,12 +145,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.6.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-metastore-1.0.0.jar
hive-exec-1.0.0.jar
@@ -151,12 +165,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.6.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-metastore-1.1.0.jar
hive-exec-1.1.0.jar
@@ -177,12 +185,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.6.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-metastore-1.2.1.jar
hive-exec-1.2.1.jar
@@ -203,12 +205,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.7.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-exec-2.0.0.jar

@@ -223,12 +219,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.7.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-exec-2.1.0.jar

@@ -243,12 +233,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.7.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-exec-2.2.0.jar

@@ -267,12 +251,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.8.3-{{ site.shaded_version }}.jar

// Hive dependencies
hive-exec-3.1.0.jar
libfb303-0.9.3.jar // libfb303 is not packed into hive-exec in some versions, need to add it separately
@@ -281,6 +259,11 @@ Please find the required dependencies for different Hive major versions below.
</div>
</div>

If you use the Hive version shipped with HDP or CDH, refer to the dependencies in the previous section and select a similar version.

You also need to specify the selected, supported "hive-version" in the YAML file, HiveCatalog and HiveModule.
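
One way to find a "similar version" for a vendor distribution is to reduce the vendor's version string to its leading `major.minor.patch` part and compare that with the versions listed in the previous section. The sketch below assumes that vendor strings begin with the upstream Hive version, as in the made-up examples `1.1.0-cdh5.16.2` and `3.1.0.3.1.4.0-315`. Neither the string formats nor the helper `base_hive_version` come from this document or from Flink.

```python
import re

# Assumption: a vendor version string starts with the upstream Hive version,
# written as three dot-separated numbers. This is not stated in the document.
_BASE_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)")


def base_hive_version(vendor_version):
    """Return the leading 'major.minor.patch' of a vendor version, or None.

    Hypothetical helper for illustration; it is not part of Flink.
    """
    match = _BASE_VERSION.match(vendor_version)
    if match is None:
        return None
    return ".".join(match.groups())
```

The result is only a starting point: it still has to be checked against the supported versions before it is used as "hive-version".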

### Program maven

If you are building your own program, you need the following dependencies in your mvn file.
It's recommended not to include these dependencies in the resulting jar file.
@@ -386,4 +369,3 @@ DDL to create Hive tables, views, partitions, functions within Flink will be sup
## DML

Flink supports DML writing to Hive tables. Please refer to details in [Reading & Writing Hive Tables]({{ site.baseurl }}/dev/table/hive/read_write_hive.html)

79 changes: 31 additions & 48 deletions docs/dev/table/hive/index.zh.md
@@ -92,6 +92,32 @@ to make the integration work in Table API program or SQL in SQL Client.
Alternatively, you can put these dependencies in a dedicated folder, and add them to classpath with the `-C`
or `-l` option for Table API program or SQL Client respectively.

Apache Hive is built on Hadoop, so you need to provide the Hadoop dependencies first. Please refer to
[Providing Hadoop classes]({{ site.baseurl }}/ops/deployment/hadoop.html#providing-hadoop-classes).

There are two ways to add Hive dependencies. The first is to use Flink's bundled Hive jars: choose a bundled Hive jar according to the version of the metastore you use. The second is to add each of the required jars separately, which can be useful if the Hive version you're using is not listed here.

#### Using bundled hive jar

The following table lists all available bundled Hive jars. Pick one and put it in the `/lib/` directory of the Flink distribution.

{% if site.is_stable %}

| Metastore version | Maven dependency | SQL Client JAR |
| :---------------- | :--------------------------- | :----------------------|
| 1.0.0 - 1.2.2 | `flink-connector-hive-1.2.2` | [Download](https://central.maven.org/maven2/org/apache/flink/flink-sql-connector-hive-1.2.2{{site.scala_version_suffix}}/{{site.version}}/flink-sql-connector-hive-1.2.2{{site.scala_version_suffix}}-{{site.version}}.jar) |
| 2.0.0 - 2.2.0 | `flink-connector-hive-2.2.0` | [Download](https://central.maven.org/maven2/org/apache/flink/flink-sql-connector-hive-2.2.0{{site.scala_version_suffix}}/{{site.version}}/flink-sql-connector-hive-2.2.0{{site.scala_version_suffix}}-{{site.version}}.jar) |
| 2.3.0 - 2.3.6 | `flink-connector-hive-2.3.6` | [Download](https://central.maven.org/maven2/org/apache/flink/flink-sql-connector-hive-2.3.6{{site.scala_version_suffix}}/{{site.version}}/flink-sql-connector-hive-2.3.6{{site.scala_version_suffix}}-{{site.version}}.jar) |
| 3.0.0 - 3.1.2 | `flink-connector-hive-3.1.2` | [Download](https://central.maven.org/maven2/org/apache/flink/flink-sql-connector-hive-3.1.2{{site.scala_version_suffix}}/{{site.version}}/flink-sql-connector-hive-3.1.2{{site.scala_version_suffix}}-{{site.version}}.jar) |

{% else %}

These tables are only available for stable releases.

{% endif %}

#### User defined dependencies

Please find the required dependencies for different Hive major versions below.


@@ -105,12 +131,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector. Contains flink-hadoop-compatibility and flink-orc jars
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.7.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-exec-2.3.4.jar

@@ -125,12 +145,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.6.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-metastore-1.0.0.jar
hive-exec-1.0.0.jar
@@ -151,12 +165,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.6.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-metastore-1.1.0.jar
hive-exec-1.1.0.jar
@@ -177,12 +185,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.6.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-metastore-1.2.1.jar
hive-exec-1.2.1.jar
@@ -203,12 +205,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.7.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-exec-2.0.0.jar

@@ -223,12 +219,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.7.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-exec-2.1.0.jar

@@ -243,12 +233,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.7.5-{{ site.shaded_version }}.jar

// Hive dependencies
hive-exec-2.2.0.jar

@@ -267,12 +251,6 @@ Please find the required dependencies for different Hive major versions below.
// Flink's Hive connector
flink-connector-hive{{ site.scala_version_suffix }}-{{ site.version }}.jar

// Hadoop dependencies
// You can pick a pre-built Hadoop uber jar provided by Flink, alternatively
// you can use your own hadoop jars. Either way, make sure it's compatible with your Hadoop
// cluster and the Hive version you're using.
flink-shaded-hadoop-2-uber-2.8.3-{{ site.shaded_version }}.jar

// Hive dependencies
hive-exec-3.1.0.jar
libfb303-0.9.3.jar // libfb303 is not packed into hive-exec in some versions, need to add it separately
@@ -281,6 +259,11 @@ Please find the required dependencies for different Hive major versions below.
</div>
</div>

If you use the Hive version shipped with HDP or CDH, refer to the dependencies in the previous section and select a similar version.

You also need to specify the selected, supported "hive-version" in the YAML file, HiveCatalog and HiveModule.

#### Program maven

If you are building your own program, you need the following dependencies in your mvn file.
It's recommended not to include these dependencies in the resulting jar file.