Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect SQL generated for WHERE clause when using Native Query and DateTime Field Filter Variables: DATE_TRUNC #38384

Closed
mayanala-coxauto opened this issue Feb 2, 2024 · 6 comments
Labels
Database/Athena Priority:P1 Security holes w/o exploit, crashing, setup/upgrade, login, broken common features, correctness Querying/Native The SQL/native query editor Querying/Parameters & Variables Filter widgets, field filters, variables etc. Querying/Processor .Team/QueryProcessor :hammer_and_wrench: Type:Bug Product defects

Comments

@mayanala-coxauto
Copy link

Describe the bug

We have an Athena table with column 'partition_date_hour' string datatype. Data for that column looks like this in YYYY-MM-DD HH:00 format

partition_date_hour

2024-01-14 23:00
2024-01-14 23:00
2024-01-14 23:00
2024-01-14 23:00
2024-01-14 23:00
2024-01-14 23:00

We did casting at column level to Field Type -> Creation Timestamp and Cast to a specific data type -> Coercion/ISO8601 ->DateTime.

We created a query with variable p_date with Variable type -> Field Filter and Filter widget type: Date Filter

image

As you can see we get 0 results and following is the preview of the query,
SELECT
COUNT(*) AS "count"
FROM
"ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"
WHERE
"ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."account_id" = '183385486431'
AND "ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."action" = 'ALLOW'
AND "ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."httpsourcename" in ('ALB','APIGW','APPRUNNER','APPSYNC','CF','COGNITOIDP')
AND DATE_TRUNC('day', CAST("ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."partition_date_hour" AS timestamp)) BETWEEN timestamp '2024-01-15 18:00:00.000' AND timestamp '2024-01-15 19:00:00.000'

If we take DATE_TRUNC generated by metabase, it returns results.

To prove that, we created same query with 2 variables start_time and end time with Variable type -> Date

image

** Note: Above selection do not show the Hour selected but that is another bug I guess **

As you can see it returns results as expected 56,556 rows.

Preview of the above query is as follows

SELECT
COUNT(*) AS "count"
FROM
"ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"
WHERE
"ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."account_id" = '183385486431'
AND "ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."action" = 'ALLOW'
AND "ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."httpsourcename" in ('ALB','APIGW','APPRUNNER','APPSYNC','CF','COGNITOIDP')
AND CAST("ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."partition_date_hour"
AS timestamp)
BETWEEN timestamp '2024-01-15 18:00:00.000' and timestamp '2024-01-15 19:00:00.000'

So DATE_TRUNC is making the difference of results here.

To Reproduce

I'm not able to reproduce this issue in Sample Database as there is no column with Datetime stored as string YYYY-MM-DD HH:00 format. I'll try to think of other ways to reproduce it.

Expected behavior

Explained in Describe Bug section

Logs

No response

Information about your Metabase installation

{
  "browser-info": {
    "language": "en-US",
    "platform": "Win32",
    "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36",
    "vendor": "Google Inc."
  },
  "system-info": {
    "file.encoding": "UTF-8",
    "java.runtime.name": "OpenJDK Runtime Environment",
    "java.runtime.version": "11.0.21+9-LTS",
    "java.vendor": "Red Hat, Inc.",
    "java.vendor.url": "https://www.redhat.com/",
    "java.version": "11.0.21",
    "java.vm.name": "OpenJDK 64-Bit Server VM",
    "java.vm.version": "11.0.21+9-LTS",
    "os.name": "Linux",
    "os.version": "4.14.273-207.502.amzn2.x86_64",
    "user.language": "en",
    "user.timezone": "UTC"
  },
  "metabase-info": {
    "databases": [
      "h2",
      "athena"
    ],
    "hosting-env": "unknown",
    "application-database": "mysql",
    "application-database-details": {
      "database": {
        "name": "MySQL",
        "version": "5.7.12"
      },
      "jdbc-driver": {
        "name": "MariaDB Connector/J",
        "version": "2.7.6"
      }
    },
    "run-mode": "prod",
    "version": {
      "date": "2024-01-05",
      "tag": "v0.47.11",
      "branch": "?",
      "hash": "51935b1"
    },
    "settings": {
      "report-timezone": null
    }
  }
}

Severity

Blocking usage of timestamp in all queries as it returns incorrect results due to DATE_TRUNC

Additional context

No response

@cabarria
Copy link

Here is an example of the issue. Using the Notebook editor as shown below with a date-time range.
image
It produces this Query:
SELECT
DATE_TRUNC(
'hour',
CAST(
"ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."partition_date_hour" AS timestamp
)
) AS "partition_date_hour",
COUNT(*) AS "count"
FROM
"ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"
WHERE
CAST(
"ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."partition_date_hour" AS timestamp
) BETWEEN timestamp '2024-01-14 12:30 UTC'

AND timestamp '2024-01-16 12:30 UTC'
GROUP BY
DATE_TRUNC(
'hour',
CAST(
"ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."partition_date_hour" AS timestamp
)
)
ORDER BY
DATE_TRUNC(
'hour',
CAST(
"ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."partition_date_hour" AS timestamp
)
) ASC

Now using Native SQL with a field-filter variable for the date-time range as shown here:

image
image

It produces this query:
SELECT
COUNT(*) AS "count", DATE_TRUNC('hour', CAST("ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."partition_date_hour" AS timestamp)) AS "partition_date_hour"
FROM
"ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"
WHERE
1=1
AND DATE_TRUNC('day', CAST("ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."partition_date_hour" AS timestamp)) BETWEEN timestamp '2024-01-15 09:30:00.000' AND timestamp '2024-01-16 12:30:00.000'
GROUP BY DATE_TRUNC('hour', CAST("ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."partition_date_hour" AS timestamp))
ORDER BY DATE_TRUNC('hour', CAST("ca_secdatalakenp_waf_logs_fh_test"."waf_logs_fh_test_raw_h_v13"."partition_date_hour" AS timestamp)) ASC

Where is the DATE_TRUNC('day' coming from?

Here is the field metadata for partition_date_hour
image

@mayanala-coxauto mayanala-coxauto changed the title Using Variable type: Field Filter and Filter widget type: Date Filter returns 0 rows due to DATE_TRUNC added by Metabase SQL. Incorrect SQL generated for WHERE clause when using Native Query and DateTime Field Filter Variables: DATE_TRUNC Feb 14, 2024
@mayanala-coxauto
Copy link
Author

@flamber

@paoliniluis paoliniluis added Querying/Processor Database/Athena Priority:P1 Security holes w/o exploit, crashing, setup/upgrade, login, broken common features, correctness Querying/Parameters & Variables Filter widgets, field filters, variables etc. Querying/Native The SQL/native query editor and removed .Needs Triage labels Feb 14, 2024
@paoliniluis
Copy link
Contributor

was this working in a previous version?

@mayanala-coxauto
Copy link
Author

@paoliniluis We are not sure if it worked before as this is the first time we are trying such queries.

@Tony-metabase
Copy link
Contributor

Can you upgrade to 1.48.8 ... I think you might be hitting this one #38037

@camsaul camsaul added the .Team/QueryProcessor :hammer_and_wrench: label Mar 21, 2024
@perivamsi
Copy link
Contributor

@mayanala-coxauto are you still facing this issue? We are unable to reproduce it.

@perivamsi perivamsi reopened this Jun 27, 2024
@perivamsi perivamsi closed this as not planned Won't fix, can't repro, duplicate, stale Jun 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Database/Athena Priority:P1 Security holes w/o exploit, crashing, setup/upgrade, login, broken common features, correctness Querying/Native The SQL/native query editor Querying/Parameters & Variables Filter widgets, field filters, variables etc. Querying/Processor .Team/QueryProcessor :hammer_and_wrench: Type:Bug Product defects
Projects
None yet
Development

No branches or pull requests

6 participants