From 7d52ddc4d5f9c61d4b4795960450457712d93b4f Mon Sep 17 00:00:00 2001
From: Timo Walther <twalthr@apache.org>
Date: Wed, 8 Jul 2020 11:41:30 +0200
Subject: [PATCH] [hotfix][docs] Improve data type documentation for Scala
 users

---
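Note for reviewers: the recommendation to use boxed types rests on how a Scala `Int` parameter compiles to a JVM `int`, which is all that reflection can see. The sketch below is only an illustration of that point and is not Flink's extraction code. The class `NullabilitySketch`, its nested `PrimitiveEval` and `BoxedEval` classes, and the helper `describeEvalParameters` are hypothetical names made up for this note.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class NullabilitySketch {

    // Hypothetical stand-in for a function whose Scala signature uses `Int`,
    // which the Scala compiler emits as the JVM primitive `int`.
    public static class PrimitiveEval {
        public int eval(int a, int b) {
            return a + b;
        }
    }

    // Hypothetical stand-in for a function that uses boxed types and can
    // therefore receive and return null.
    public static class BoxedEval {
        public Integer eval(Integer a, Integer b) {
            if (a == null || b == null) {
                return null;
            }
            return a + b;
        }
    }

    // Labels each parameter of the first method named "eval" as NOT NULL when
    // it is a JVM primitive and as nullable otherwise. This is an assumption
    // about how a reflective extractor could reason, not Flink's actual logic.
    public static List<String> describeEvalParameters(Class<?> clazz) {
        List<String> result = new ArrayList<>();
        for (Method method : clazz.getDeclaredMethods()) {
            if (!method.getName().equals("eval")) {
                continue;
            }
            for (Class<?> parameter : method.getParameterTypes()) {
                String nullability = parameter.isPrimitive() ? " NOT NULL" : " nullable";
                result.add(parameter.getSimpleName() + nullability);
            }
            break;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(describeEvalParameters(PrimitiveEval.class));
        System.out.println(describeEvalParameters(BoxedEval.class));
    }
}
```

A primitive parameter has no way to represent a missing value, so only the boxed variant can accept `NULL` arguments.
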
 docs/dev/table/functions/udfs.md |  8 +++++++-
 docs/dev/table/types.md          | 12 +++++++++---
 docs/dev/table/types.zh.md       |  8 +++++++-
 3 files changed, 23 insertions(+), 5 deletions(-)
diff --git a/docs/dev/table/functions/udfs.md b/docs/dev/table/functions/udfs.md
index e212ea5ac6749..70516b7af76b8 100644
--- a/docs/dev/table/functions/udfs.md
+++ b/docs/dev/table/functions/udfs.md
@@ -207,6 +207,10 @@ Regular JVM method calling semantics apply. Therefore, it is possible to:
 - use object inheritance such as `eval(Object)` that takes both `LocalDateTime` and `Integer`,
 - and combinations of the above such as `eval(Object...)` that takes all kinds of arguments.
 
+If you intend to implement functions in Scala, add the `scala.annotation.varargs` annotation to evaluation
+methods that take variable arguments. Furthermore, it is recommended to use boxed primitives (e.g.
+`java.lang.Integer` instead of `Int`) to support `NULL`.
+
 The following snippets shows an example of an overloaded function:
 
 <div class="codetabs" markdown="1">
@@ -240,6 +244,8 @@ public static class SumFunction extends ScalarFunction {
 <div data-lang="Scala" markdown="1">
 {% highlight scala %}
 import org.apache.flink.table.functions.ScalarFunction
+import java.lang.Integer
+import java.lang.Double
 import scala.annotation.varargs
 
 // function with overloaded evaluation methods
@@ -280,7 +286,7 @@ If more advanced type inference logic is required, an implementer can explicitly
 
 The automatic type inference inspects the function's class and evaluation methods to derive data types for the arguments and result of a function. `@DataTypeHint` and `@FunctionHint` annotations support the automatic extraction.
 
-For a full list of classes that can be implicitly mapped to a data type, see the [data type section]({% link dev/table/types.md %}#data-type-annotations).
+For a full list of classes that can be implicitly mapped to a data type, see the [data type extraction section]({% link dev/table/types.md %}#data-type-extraction).
 
 **`@DataTypeHint`**
 
diff --git a/docs/dev/table/types.md b/docs/dev/table/types.md
index 698deca089d2c..f06e4dff353f2 100644
--- a/docs/dev/table/types.md
+++ b/docs/dev/table/types.md
@@ -1348,15 +1348,21 @@ DataTypes.NULL()
 |`java.lang.Object` | X     | X      | *Default*                            |
 |*any class*        |       | (X)    | Any non-primitive type.              |
 
-Data Type Annotations
----------------------
+Data Type Extraction
+--------------------
 
 At many locations in the API, Flink tries to automatically extract data type from class information using
 reflection to avoid repetitive manual schema work. However, extracting a data type reflectively is not always
 successful because logical information might be missing. Therefore, it might be necessary to add additional
 information close to a class or field declaration for supporting the extraction logic.
 
-The following table lists classes that can be implicitly mapped to a data type without requiring further information:
+The following table lists classes that can be implicitly mapped to a data type without requiring further information.
+
+If you intend to implement classes in Scala, *it is recommended to use boxed types* (e.g. `java.lang.Integer`)
+instead of Scala's primitives. Scala's primitives (e.g. `Int` or `Double`) are compiled to JVM primitives (e.g.
+`int`/`double`) and result in `NOT NULL` semantics as shown in the table below. Furthermore, Scala primitives that
+are used in generics (e.g. `java.util.Map[Int, Double]`) are erased during compilation and lead to class
+information similar to `java.util.Map[java.lang.Object, java.lang.Object]`.
 
 | Class                       | Data Type                           |
 |:----------------------------|:------------------------------------|
diff --git a/docs/dev/table/types.zh.md b/docs/dev/table/types.zh.md
index 4542db05dddaa..e9b4dcc7be7d8 100644
--- a/docs/dev/table/types.zh.md
+++ b/docs/dev/table/types.zh.md
@@ -1245,7 +1245,13 @@ DataTypes.NULL()
 
 Flink API 经常尝试使用反射自动从类信息中提取数据类型，以避免重复的手动定义模式工作。然而以反射方式提取数据类型并不总是成功的，因为可能会丢失逻辑信息。因此，可能有必要在类或字段声明附近添加额外信息以支持提取逻辑。
 
-下表列出了可以隐式映射到数据类型而无需额外信息的类：
+下表列出了可以隐式映射到数据类型而无需额外信息的类。
+
+If you intend to implement classes in Scala, *it is recommended to use boxed types* (e.g. `java.lang.Integer`)
+instead of Scala's primitives. Scala's primitives (e.g. `Int` or `Double`) are compiled to JVM primitives (e.g.
+`int`/`double`) and result in `NOT NULL` semantics as shown in the table below. Furthermore, Scala primitives that
+are used in generics (e.g. `java.util.Map[Int, Double]`) are erased during compilation and lead to class
+information similar to `java.util.Map[java.lang.Object, java.lang.Object]`.
 
 | 类                          | 数据类型                            |
 |:----------------------------|:------------------------------------|