
Add parser option for parsing SQL numeric literals as decimal #4102

Merged (8 commits) Nov 15, 2022

Conversation

andygrove (Member) commented Nov 3, 2022

Which issue does this PR close?

Part of #4072

Rationale for this change

The main motivation is to get benchmark query q6 returning the correct results, but this is also the correct behavior for SQL.

What changes are included in this PR?

  • Add a config option to enable parsing numbers as decimal. We cannot enable this by default until decimal support is further along.

Are there any user-facing changes?

@github-actions github-actions bot removed the core Core datafusion crate label Nov 3, 2022
@andygrove andygrove changed the title Parse SQL numeric literals as decimal for compliance with ANSI SQL Add parser option for parsing SQL numeric literals as decimal Nov 3, 2022
@andygrove andygrove marked this pull request as ready for review November 3, 2022 23:51
andygrove (Member Author)

@kmitchener @viirya @Dandandan PTAL

Comment on lines 2989 to 2990
"SELECT 1, 1.0, 0.1, .1, 12.34",
"Projection: Int64(1), Decimal128(Some(10),2,1), Decimal128(Some(1),2,1), Decimal128(Some(1),1,1), Decimal128(Some(1234),4,2)\
Member

0.1 should be Decimal(1, 1), I think.

scala> sql("select 0.1")
res0: org.apache.spark.sql.DataFrame = [0.1: decimal(1,1)]

Contributor

agree

Contributor

We should try to use the minimum precision needed to represent the value in the decimal literal.
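The reviewers' suggestion can be sketched as a small standalone helper. This is a hypothetical illustration, not DataFusion's actual implementation: it derives the minimal Decimal128 precision and scale from the literal string, so that 0.1 maps to Decimal(1, 1) and 1.0 to Decimal(2, 1), matching the Spark behavior quoted above.

```rust
/// Hypothetical helper: derive the minimal (precision, scale) for a SQL
/// numeric literal string such as "0.1", ".1", or "12.34".
fn decimal_precision_scale(literal: &str) -> (u8, u8) {
    // Strip an optional sign; it does not affect precision or scale.
    let s = literal.trim_start_matches(|c| c == '+' || c == '-');
    match s.split_once('.') {
        Some((int_part, frac_part)) => {
            // Leading zeros in the integer part carry no information,
            // so "0.1" has zero significant integer digits.
            let int_digits = int_part.trim_start_matches('0').len();
            let scale = frac_part.len();
            // Precision covers all digits, and must be at least the scale
            // (and at least 1), so "0.1" becomes Decimal(1, 1).
            let precision = (int_digits + scale).max(scale).max(1);
            (precision as u8, scale as u8)
        }
        None => {
            // Pure integer literal: precision is the digit count, scale 0.
            let digits = s.trim_start_matches('0').len().max(1);
            (digits as u8, 0)
        }
    }
}

fn main() {
    assert_eq!(decimal_precision_scale("0.1"), (1, 1));
    assert_eq!(decimal_precision_scale(".1"), (1, 1));
    assert_eq!(decimal_precision_scale("1.0"), (2, 1));
    assert_eq!(decimal_precision_scale("12.34"), (4, 2));
}
```

Under this scheme the projection in the test above would read Decimal128(Some(1), 1, 1) for both 0.1 and .1, which is what the reviewers are asking for.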

Comment on lines +2726 to +2732
let number = n.parse::<i128>().map_err(|_| {
DataFusionError::from(ParserError(format!(
"Cannot parse {} as i128 when building decimal",
n
)))
})?;
Ok(Expr::Literal(ScalarValue::Decimal128(Some(number), 38, 0)))
Member

Can we get the minimum precision to handle this number?

scala> sql("select 10000000000000000000")
res10: org.apache.spark.sql.DataFrame = [10000000000000000000: decimal(20,0)]
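Following this suggestion, a minimal sketch of the fix (function name and return shape are assumed for illustration; this is not the code as merged) would derive the precision from the digit count instead of hard-coding 38, so that 10000000000000000000 maps to Decimal(20, 0) as in the Spark example:

```rust
/// Hypothetical sketch: parse an integer literal into an i128 and use the
/// number of significant digits as the precision, rather than the maximum 38.
fn integer_literal_as_decimal(n: &str) -> Result<(i128, u8, u8), String> {
    let value: i128 = n
        .parse()
        .map_err(|_| format!("Cannot parse {} as i128 when building decimal", n))?;
    // Ignore a leading sign and leading zeros when counting digits;
    // "0" still needs precision 1.
    let precision = n
        .trim_start_matches('-')
        .trim_start_matches('0')
        .len()
        .max(1) as u8;
    // Integer literals always have scale 0.
    Ok((value, precision, 0))
}

fn main() {
    assert_eq!(
        integer_literal_as_decimal("10000000000000000000").unwrap(),
        (10_000_000_000_000_000_000i128, 20, 0)
    );
}
```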

Member Author

@viirya Could you take another look when you have time?

Member

I'd love to do it. Let me take another look.

}

/// Convert SQL data type to relational representation of data type
pub fn convert_data_type(sql_type: &SQLDataType) -> Result<DataType> {
Member

Was this committed accidentally?

Member

Not sure why this change adds this and convert_simple_data_type. Otherwise, the rest of the change looks good to me.

Member Author

Yeah, this is from a merge conflict. I will fix.

@andygrove andygrove merged commit 406c108 into apache:master Nov 15, 2022
@andygrove andygrove deleted the sql-decimals branch November 15, 2022 22:59
ursabot commented Nov 15, 2022

Benchmark runs are scheduled for baseline = ebb24c5 and contender = 406c108. 406c108 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
  • [Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
  • [Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
  • [Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
  • [Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
  • ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
  • test-mac-arm: Supported benchmark langs: C++, Python, R
  • ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
  • ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

viirya (Member) commented Nov 15, 2022

🎉
