Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-37533][SQL][FOLLOWUP] try_element_at should throw an error on 0 array index #34833

Closed

Conversation

gengliangwang
Copy link
Member

What changes were proposed in this pull request?

Change the new function try_element_at to throw an error on 0 array index

Why are the changes needed?

As @srielau points out in f705e52#commitcomment-61541454, array index can't be 0 in element_at/try_element_at and we should throw an error to remind users.

Does this PR introduce any user-facing change?

No, the new function is not released yet.

How was this patch tested?

Unit tests

case class TryElementAt(left: Expression, right: Expression, child: Expression)
extends RuntimeReplaceable {
def this(left: Expression, right: Expression) =
this(left, right, TryEval(ElementAt(left, right, failOnError = false)))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need TryEval any more. Moving the class next to ElementAt

@@ -1996,9 +1996,10 @@ case class ArrayPosition(left: Expression, right: Expression)
*/
@ExpressionDescription(
usage = """
_FUNC_(array, index) - Returns element of array at given (1-based) index. If index < 0,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's mention that 0 index will throw an error in the function description.

@gengliangwang gengliangwang changed the title [SPARK-37533][SQL] try_element_at should throw an error on 0 array index [SPARK-37533][SQL][FOLLOWUP] try_element_at should throw an error on 0 array index Dec 8, 2021
@SparkQA
Copy link

SparkQA commented Dec 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50481/

@SparkQA
Copy link

SparkQA commented Dec 8, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50481/

@SparkQA
Copy link

SparkQA commented Dec 8, 2021

Test build #146005 has finished for PR 34833 at commit b4f6bcb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class TryElementAt(left: Expression, right: Expression, child: Expression)

@HyukjinKwon
Copy link
Member

Merged to master.

gengliangwang referenced this pull request Dec 8, 2021
### What changes were proposed in this pull request?

Add New SQL functions `try_element_at`, which is identical to the `element_at` except that it returns null if error occurs

### Why are the changes needed?

After some data science on real world SQL queries, we found that some users are using the following queries
```
select ... where mapCol.key is not null
select ... where arrayCol[index] is not null
```

### Does this PR introduce _any_ user-facing change?

Yes, an alternative function `try_element_at` which will return null on array index out of bound or map key doesn't exist under ANSI mode.

### How was this patch tested?

Unit tests

Closes #34796 from gengliangwang/try_element_at.

Authored-by: Gengliang Wang <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants