-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(looker): represent looker views dependencies from derived tables #21689
feat(looker): represent looker views dependencies from derived tables #21689
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. Join @rexledesma and the rest of your teammates on Graphite |
3596c68
to
a6a5d38
Compare
0ea1fd8
to
a4e4013
Compare
f24a790
to
e02258d
Compare
776a176
to
5adb9a7
Compare
5adb9a7
to
3f96520
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation looks good to me - would be good to test some of the failure cases (unparsable SQL and also templated SQL that can't be parsed) to make sure we get the expected fallback/no dependencies & error message.
# https://cloud.google.com/looker/docs/derived-tables | ||
derived_table_sql = view.get("derived_table", {}).get("sql") | ||
|
||
if derived_table_sql and "$" in derived_table_sql: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way we can more definitively check if this is a $
operator vs e.g. a $
embedded in a string? Can't imagine it'd be common, but there's probably some cases where it's appearing in a query that's not indicative of templating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silly example being some column where a user is concatenating a dollar sign string to an integer value to represent a cost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is the tricky part... Looker SQL is potentially parameterized by two differing mechanisms:
- The
${...}
substitution operator: https://cloud.google.com/looker/docs/sql-and-referring-to-lookml - Liquid variable referencing: https://cloud.google.com/looker/docs/liquid-variable-reference
We can probably detect the presence of (1) using regex, instead of a naive "$" in sql
check.
Unsure how to resolve (2) other than actually rendering the templates with an environment of variables.
# https://cloud.google.com/looker/docs/derived-tables | ||
derived_table_sql = view.get("derived_table", {}).get("sql") | ||
|
||
if derived_table_sql and "$" in derived_table_sql: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way we can more definitively check if this is a $
operator vs e.g. a $
embedded in a string? Can't imagine it'd be common, but there's probably some cases where it's appearing in a query that's not indicative of templating.
python_modules/libraries/dagster-looker/dagster_looker/asset_decorator.py
Outdated
Show resolved
Hide resolved
python_modules/libraries/dagster-looker/dagster_looker/asset_decorator.py
Outdated
Show resolved
Hide resolved
python_modules/libraries/dagster-looker/dagster_looker/asset_decorator.py
Show resolved
Hide resolved
python_modules/libraries/dagster-looker/dagster_looker/asset_decorator.py
Show resolved
Hide resolved
3f96520
to
89027bf
Compare
89027bf
to
2331be9
Compare
…dagster-io#21689) ## Summary & Motivation Introduce `sqlglot` to parse Looker views that are defined with a `derived_table`. The annoying part here is that the SQL here is potentially templated: https://cloud.google.com/looker/docs/sql-and-referring-to-lookml#substitution_operator_. In this case, don't give it upstream dependencies. In a future PR, we can potentially look at a way to use regex to template the names into file, using a context dictionary assembled from the total aggregate of `*.view.lkml` files. ## How I Tested These Changes pytest - SQL with the `$` operator will have an empty set of upstream dependencies - SQL without the `$` operator but cannot be parsed by `sqlglot` will have an empty set of upstream dependencies
Summary & Motivation
Introduce
sqlglot
to parse Looker views that are defined with aderived_table
.The annoying part here is that the SQL here is potentially templated: https://cloud.google.com/looker/docs/sql-and-referring-to-lookml#substitution_operator_. In this case, don't give it upstream dependencies.
In a future PR, we can potentially look at a way to use regex to template the names into file, using a context dictionary assembled from the total aggregate of
*.view.lkml
files.How I Tested These Changes
pytest
$
operator will have an empty set of upstream dependencies$
operator but cannot be parsed bysqlglot
will have an empty set of upstream dependencies