Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make GoogleCloudServiceAccountDictProfileMapping dataset profile argument optional #839

Conversation

oliverrmaa
Copy link
Contributor

@oliverrmaa oliverrmaa commented Feb 7, 2024

Description

This PR follows the methodology in #683 by modifying the GCP GoogleCloudServiceAccountDictProfileMapping() profile_args not require dataset as a required argument.

Related Issue(s)

Closes #837

Breaking Change?

Not to my knowledge.

Checklist

  • I have made corresponding changes to the documentation (if required)
  • I have added tests that prove my fix is effective or that my feature works

@oliverrmaa oliverrmaa requested a review from a team as a code owner February 7, 2024 06:54
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Feb 7, 2024
Copy link

netlify bot commented Feb 7, 2024

Deploy Preview for sunny-pastelito-5ecb04 canceled.

Name Link
🔨 Latest commit ef78fe9
🔍 Latest deploy log https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/6656ef226083a500096deac0

@dosubot dosubot bot added area:profile Related to ProfileConfig, like Athena, BigQuery, Clickhouse, Spark, Trino, etc profile:bigquery Related to BigQuery ProfileConfig labels Feb 7, 2024
@oliverrmaa
Copy link
Contributor Author

tagging @astro-cosmos-admins @tatiana for review, I wasn't able to tag anyone as reviewer since I don't have permissions so hopefully writing it as a comment is okay. Thank you

Copy link
Collaborator

@jbandoro jbandoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution @oliverrmaa!

I didn't realize "dataset"/"schema" was not required for the profiles file. I've changed it before by overriding the generate_schema_name as described in the docs where it by default concatenates the default target schema with the custom one which we found annoying.

If the default target schema from the profiles file is null do you still have to change the generate_schema_name macro?

Comment on lines +52 to +55
if "dataset" in self.profile_args:
profile["dataset"] = self.profile_args["dataset"]

return profile
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is needed since if dataset is already in profile_args it will be merged above in:

       {
            **self.mapped_params,
            "threads": 1,
            **self.profile_args,
        }

"type": "bigquery",
"method": "service-account-json",
"project": "my_project",
"dataset": None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if having None in the profile is OK when this is formatted to yaml in BaseProfile.get_profile_contents this would result in profile like:

dataset: null
keyfile_json:
  private_key: '{{ env_var(''COSMOS_CONN_GOOGLE_CLOUD_PLATFORM_PRIVATE_KEY'') }}'
  private_key_id: '{{ env_var(''COSMOS_CONN_GOOGLE_CLOUD_PLATFORM_PRIVATE_KEY_ID'')
    }}'
  type: service_account
method: service-account-json
project: my_project
threads: 1
type: bigquery

Would the dataset: null line above cause any issue?

Copy link

codecov bot commented Feb 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.40%. Comparing base (cb2a27a) to head (87d2e72).
Report is 6 commits behind head on main.

Current head 87d2e72 differs from pull request most recent head ef78fe9

Please upload reports for the commit ef78fe9 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #839      +/-   ##
==========================================
- Coverage   95.72%   95.40%   -0.33%     
==========================================
  Files          60       57       -3     
  Lines        2926     2762     -164     
==========================================
- Hits         2801     2635     -166     
- Misses        125      127       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jbandoro jbandoro added the status:awaiting-author Issue/PR is under discussion and waiting for author's input label Feb 21, 2024
Copy link
Collaborator

@tatiana tatiana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oliverrmaa thank you very much for your contribution! Is there any chance you could address @jbandoro 's feedback? We may be able to add this as part of the 1.4.0 release later this week if you are.

@tatiana tatiana added this to the 1.5.0 milestone May 17, 2024
@tatiana tatiana added the triage-needed Items need to be reviewed / assigned to milestone label May 17, 2024
@pankajastro
Copy link
Contributor

Hey @oliverrmaa, Thanks for your contribution! Could you please rebase and respond to @jbandoro's feedback? we're excited to see this move forward.

@pankajastro
Copy link
Contributor

Hi @oliverrmaa, I was wondering if you had a chance to read my previous comments. Are you still working on it, or would it be alright if I take it forward from here? Thank you!

@tatiana tatiana added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Jun 4, 2024
tatiana pushed a commit that referenced this pull request Jun 5, 2024
#1017)

This PR follows the methodology in
#683 by modifying
the GCP `GoogleCloudServiceAccountDictProfileMapping()` `profile_args`
not require a dataset as a required argument.

DAG RUN
<img width="1473" alt="Screenshot 2024-06-05 at 6 50 04 PM"
src="https://github.com/astronomer/astronomer-cosmos/assets/98807258/81b127b9-c5e5-4983-9efe-bbf00d81914f">

Co-authored-by: "Ollie Ma" <[email protected]> 
Original PR by @oliverrmaa: #839

Closes #837
@tatiana
Copy link
Collaborator

tatiana commented Jun 5, 2024

Since we consider this change very relevant, we ended up rebasing and addressing conflicts in a separate PR and adding @oliverrmaa as a co-author: #1017

@tatiana tatiana closed this Jun 5, 2024
@dosubot dosubot bot removed this from the Cosmos 1.5.0 milestone Jun 5, 2024
tatiana pushed a commit that referenced this pull request Jun 6, 2024
#1017)

This PR follows the methodology in
#683 by modifying
the GCP `GoogleCloudServiceAccountDictProfileMapping()` `profile_args`
not require a dataset as a required argument.

DAG RUN
<img width="1473" alt="Screenshot 2024-06-05 at 6 50 04 PM"
src="https://github.com/astronomer/astronomer-cosmos/assets/98807258/81b127b9-c5e5-4983-9efe-bbf00d81914f">

Co-authored-by: "Ollie Ma" <[email protected]> 
Original PR by @oliverrmaa: #839

Closes #837
@tatiana tatiana mentioned this pull request Jun 6, 2024
tatiana added a commit that referenced this pull request Jun 6, 2024
Bug fixes

* Fix the invocation mode for ``ExecutionMode.VIRTUALENV`` by @marco9663
in #1023
* Fix Cosmos ``enable_cache`` setting by @tatiana in #1025
* Make ``GoogleCloudServiceAccountDictProfileMapping`` dataset profile
arg optional by @oliverrmaa and @pankajastro in #839 and #1017
* Athena profile mapping set ``aws_session_token`` in profile only if it
exists by @pankajastro in #1022

Others

* Update dbt and Airflow conflicts matrix by @tatiana in #1026
* Enable Python 3.12 unittest by @pankajastro in #1018
* Improve error logging in ``DbtLocalBaseOperator`` by @davidsteinar in
#1004
* Add GitHub issue templates for bug reports and feature request by
@pankajkoti in #1009
* Add more fields in bug template to reduce turnaround in issue triaging
by @pankajkoti in #1027
* Fix ``dev/Dockerfile`` + Add ``uv pip install`` for faster build time
by @dwreeves in #997
* Drop support for Airflow 2.3 by @pankajkoti in #994
* Update Astro Runtime image by @RNHTTR in #988 and #989
* Enable ruff F linting by @pankajastro in #985
* Move Cosmos Airflow configuration to settings.py by @pankajastro in
#975
* Fix CI Issues by @tatiana in #1005
* Pre-commit hook updates in #1000, #1019

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Reeves <[email protected]>
Co-authored-by: Pankaj Koti <[email protected]>
Co-authored-by: davidsteinar <[email protected]>
Co-authored-by: David Steinar Asgrimsson <[email protected]>
Co-authored-by: Marco Yuen <[email protected]>
Co-authored-by: Pankaj Singh <[email protected]>
Co-authored-by: Ollie Ma <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:profile Related to ProfileConfig, like Athena, BigQuery, Clickhouse, Spark, Trino, etc epic-assigned profile:bigquery Related to BigQuery ProfileConfig size:M This PR changes 30-99 lines, ignoring generated files. stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed status:awaiting-author Issue/PR is under discussion and waiting for author's input triage-needed Items need to be reviewed / assigned to milestone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make GoogleCloudServiceAccountDictProfileMapping dataset profile argument optional
4 participants