[2/n][dagster-fivetran] Update DagsterFivetranTranslator and related classes for rework #25751

maximearmstrong · 2024-11-05T18:38:21Z

Summary & Motivation

~~Builds out a very barebones translator class for the new version of the Fivetran integration.~~

The implementation for this translator will be inspired by the DagsterFivetranTranslator introduced in #25557, but a new implementation is required to leverage the workspace context and state-backed definitions, which are incompatible with the current translator and the current way of building assets.

Edit after Ben's comment here:

Move things around under translator.py. This PR leverages the DagsterFivetranTranslator introduced in #25557. In a subsequent PR, FivetranWorkspaceData will implement the method to_fivetran_connector_table_props_data, which will map raw connector and destination data fetched from the Fivetran API into FivetranConnectorTableProps objects that are compatible with the translator. This process will match what we currently do.
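For illustration only, a rough sketch of the kind of mapping described above, written with plain-Python stand-ins. TableProps, to_table_props, and the payload keys ("destination_id", "tables", "database") are assumptions made for this example, not the actual FivetranConnectorTableProps fields or the Fivetran API response shape.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping, Optional


@dataclass(frozen=True)
class TableProps:
    """Simplified stand-in for FivetranConnectorTableProps (fields are assumptions)."""

    table: str
    connector_id: str
    database: Optional[str]


def to_table_props(
    connectors: Mapping[str, Mapping[str, Any]],
    destinations: Mapping[str, Mapping[str, Any]],
) -> List[TableProps]:
    """Flatten raw connector payloads into one props object per synced table."""
    props: List[TableProps] = []
    for connector_id, connector in connectors.items():
        # Look up the destination the connector writes to (hypothetical key name).
        destination = destinations.get(connector["destination_id"], {})
        for table in connector["tables"]:
            props.append(
                TableProps(
                    table=table,
                    connector_id=connector_id,
                    database=destination.get("database"),
                )
            )
    return props


# Hypothetical raw data, keyed by ID.
connectors = {
    "conn_1": {"destination_id": "dest_1", "tables": ["public.users", "public.orders"]}
}
destinations = {"dest_1": {"database": "analytics"}}
table_props = to_table_props(connectors, destinations)
```

Each resulting object could then be handed to the translator one at a time, which is the part this PR keeps unchanged.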

How I Tested These Changes

Tests will be added in subsequent PRs.

maximearmstrong · 2024-11-05T18:38:37Z

This stack of pull requests is managed by Graphite. Learn more about stacking.

maximearmstrong · 2024-11-05T18:54:44Z

python_modules/libraries/dagster-fivetran/dagster_fivetran/v2/__init__.py

@@ -1 +1,2 @@
 from dagster_fivetran.v2.resources import FivetranWorkspace as FivetranWorkspace
+from dagster_fivetran.v2.translator import DagsterFivetranTranslator as DagsterFivetranTranslator


Until the previous functions and resource are deprecated and removed, users would import the reworked integration with the following:

from dagster_fivetran.v2 import DagsterFivetranTranslator, FivetranWorkspace

The new version of the integration won't actually be v2, so perhaps another import path would be better?

This is mainly to avoid name collisions between both previous and new translators. We could also name this translator differently if we really want users to import with from dagster_fivetran import FivetranWorkspace, but Dagster[IntegrationName]Translator is the naming convention for new integrations.

I think we should just call it something different than *.v2. Maybe experimental for a release -> 1.0 the integration, then we switch the old stuff to be under .legacy and make the new stuff exist under the top level package. @PedramNavid and @C00ldudeNoonan would probably have thoughts.

That makes sense - I updated the folder name for experimental in the previous PR in this stack.

dpeng817

to your queue for some design questions

dpeng817 · 2024-11-05T22:06:15Z

python_modules/libraries/dagster-fivetran/dagster_fivetran/v2/translator.py

+    def get_connector_asset_key(self, data: FivetranContentData) -> AssetKey:
+        raise NotImplementedError()
+
+    def get_connector_spec(self, data: FivetranContentData) -> AssetSpec:


I'm assuming this might be inherited from existing patterns, but I do find this kinda awkward. Creates a weird footgun where you can implement get_connector_spec in a way that does not utilize get_connector_asset_key, right?

I wonder if we should just have get_connector_spec and then force people to use get_connector_spec(...).asset_key.

To make this more efficient, we could cache the get_connector_spec result per datum.
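A minimal sketch of the footgun described above, using plain-Python stand-ins rather than the real Dagster classes. Spec, BaseTranslator, and PrefixedTranslator are hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Spec:
    """Stand-in for an asset spec; only the key matters here."""

    key: Tuple[str, ...]


class BaseTranslator:
    def get_asset_key(self, table: str) -> Tuple[str, ...]:
        return tuple(table.split("."))

    def get_asset_spec(self, table: str) -> Spec:
        # The base class keeps both methods consistent.
        return Spec(key=self.get_asset_key(table))


class PrefixedTranslator(BaseTranslator):
    # Overrides the spec but forgets to route through get_asset_key,
    # so the two public methods now disagree about the key.
    def get_asset_spec(self, table: str) -> Spec:
        return Spec(key=("my_prefix", *table.split(".")))


translator = PrefixedTranslator()
key_from_method = translator.get_asset_key("public.users")
key_from_spec = translator.get_asset_spec("public.users").key
keys_disagree = key_from_method != key_from_spec
```

With a single get_asset_spec entry point, and the key always read from the returned spec, this mismatch cannot occur.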

I'm assuming this might be inherited from existing patterns, but I do find this kinda awkward.

Yes exactly. A similar discussion came up here.

I'm in favor of this change - I agree that this pattern is error prone.

It was kinda easier for a user to override the get_asset_key function, to add a prefix for instance. We could recommend doing something like the following instead, which looks a bit awkward but is less error prone.

class CustomDagsterTranslator(DagsterTranslator):
    def get_asset_spec(self, props) -> AssetSpec:
        asset_spec = super().get_asset_spec(props)
        return asset_spec._replace(asset_key=asset_spec.asset_key.with_prefix("my_prefix"))

@benpankow Any thoughts? I can update the other translators if we all agree on this.

I think we have better fxns for updating params on asset specs now too: replace_attributes top level fxn. But yea I think this pattern is good. Having an example people can copy paste, even if it looks slightly awkward, is much preferable to introducing a footgun.

Have we made the other integrations where this fxn exists public yet @maximearmstrong ? / would like to understand the status / experimentality there.

I'm on board with this change - I think our initial push for get_asset_key was to avoid doing extra work to construct an entire asset spec when we would discard most of it & just retrieve the key, but Chris made a good point that any asset whose key we're generating we'll also be generating a spec for, either before or after. As long as we can cache the spec generation, we don't end up doing extra work in that case.
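One way the caching mentioned here could look, as a sketch with stand-in types. Props, Spec, Translator, and SpecCache are hypothetical names for this example; where the cache would actually live in the integration is not decided in this thread.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Props:
    connector_id: str
    table: str


@dataclass(frozen=True)
class Spec:
    key: Tuple[str, ...]
    description: str


class Translator:
    def __init__(self) -> None:
        self.calls = 0  # counts how many specs were actually built

    def get_asset_spec(self, props: Props) -> Spec:
        self.calls += 1
        return Spec(
            key=tuple(props.table.split(".")),
            description=f"Synced by {props.connector_id}",
        )


class SpecCache:
    """Builds each spec at most once and derives the key from the spec."""

    def __init__(self, translator: Translator) -> None:
        self._translator = translator
        self._specs: Dict[Props, Spec] = {}

    def spec_for(self, props: Props) -> Spec:
        if props not in self._specs:
            self._specs[props] = self._translator.get_asset_spec(props)
        return self._specs[props]

    def key_for(self, props: Props) -> Tuple[str, ...]:
        # No separate key method: the key always comes from the cached spec.
        return self.spec_for(props).key


translator = Translator()
cache = SpecCache(translator)
props = Props(connector_id="conn_1", table="public.users")
key = cache.key_for(props)
spec = cache.spec_for(props)
```

Asking for the key first and the spec afterwards builds the spec only once, so dropping get_asset_key does not add work.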

That's fair - we can remove the get_asset_key method in favor of the replace_attributes fxn for customization.

Have we made the other integrations where this fxn exists public yet

The get_asset_spec method is used in the translator of each of the BI integrations, and these translators are not marked as experimental. I think that method is pretty stable at this point.

The get_asset_key method is also not marked as experimental for the BI integrations, so that implies a deprecation cycle, but the risk seems pretty low here.

Yea if we're going to do it, we should probably do so soon. Want to post a public discussion about it to make sure we get buy in from @schrockn @benpankow ; do we have anyone customizing BI translators to our knowledge yet?

To my knowledge, no one is customizing the Tableau and Power BI translators yet. @benpankow Do you know if anyone is customizing the Sigma and Looker ones?

I can post the public discussion with code snippets to get consensus on this.

For this PR, I will remove the get_asset_key method - we can easily add it depending on the outcome of the discussion.

yep that sounds good to me!

dpeng817 · 2024-11-05T22:07:05Z

python_modules/libraries/dagster-fivetran/dagster_fivetran/v2/translator.py

+        return self._context
+
+    def get_asset_key(self, data: FivetranContentData) -> AssetKey:
+        if data.content_type == FivetranContentType.CONNECTOR:


shouldn't we also have get_destination_asset_key?

We will leverage the previous translator instead. See updated PR description.

dpeng817 · 2024-11-05T22:07:55Z

python_modules/libraries/dagster-fivetran/dagster_fivetran/v2/translator.py

+            check.assert_never(data.content_type)
+
+    def get_asset_spec(self, data: FivetranContentData) -> AssetSpec:
+        if data.content_type == FivetranContentType.CONNECTOR:


shouldn't we also have get_destination_spec?

benpankow · 2024-11-05T22:33:18Z

python_modules/libraries/dagster-fivetran/dagster_fivetran/v2/translator.py

+        else:
+            check.assert_never(data.content_type)
+
+    def get_asset_spec(self, data: FivetranContentData) -> AssetSpec:


I'm not sure this matches the way that we model Fivetran assets, at least in the current form - each connector can sync one or many tables, each of which we represent as an asset. Destinations may not make sense to model as an asset at all, since they just represent a storage destination but not a specific e.g. table
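To make the modeling concrete, a small sketch of one connector fanning out into one asset per synced table. The nested schema_config shape and the table_asset_keys helper are assumptions for this example, loosely modeled on a schema/table configuration with enabled flags. Destinations produce no assets here.

```python
from typing import Any, List, Mapping, Tuple


def table_asset_keys(schema_config: Mapping[str, Any]) -> List[Tuple[str, str]]:
    """Return one (schema, table) key per enabled table in a connector's schema config."""
    keys: List[Tuple[str, str]] = []
    for schema in schema_config["schemas"].values():
        if not schema["enabled"]:
            continue
        for table in schema["tables"].values():
            if table["enabled"]:
                keys.append((schema["name_in_destination"], table["name_in_destination"]))
    return keys


# Hypothetical schema config for a single connector.
schema_config = {
    "schemas": {
        "public": {
            "name_in_destination": "public",
            "enabled": True,
            "tables": {
                "users": {"name_in_destination": "users", "enabled": True},
                "audit_log": {"name_in_destination": "audit_log", "enabled": False},
            },
        }
    }
}
keys = table_asset_keys(schema_config)
```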

Re-reading the code after reading your comment, that makes a lot of sense. I will adjust the code.

I moved things around so that we leverage the translator as it was implemented, while saving the state with the raw API data. See updated PR description for the details.

graphite-app · 2024-11-05T23:14:05Z

python_modules/libraries/dagster-fivetran/dagster_fivetran/experimental/translator.py

+    def get_asset_key(self, data: FivetranContentData) -> AssetKey:
+        if data.content_type == FivetranContentType.CONNECTOR:
+            return self.get_connector_asset_key(data)
+        else:
+            check.assert_never(data.content_type)
+
+    def get_asset_spec(self, data: FivetranContentData) -> AssetSpec:
+        if data.content_type == FivetranContentType.CONNECTOR:
+            return self.get_connector_spec(data)
+        else:
+            check.assert_never(data.content_type)


Both FivetranContentType.CONNECTOR and FivetranContentType.DESTINATION need to be handled in the get_asset_key and get_asset_spec methods. Consider adding elif branches for FivetranContentType.DESTINATION that call corresponding get_destination_asset_key and get_destination_spec methods. These methods should be added as NotImplemented stubs, similar to the existing connector methods. The check.assert_never should only be hit for invalid enum values that aren't part of FivetranContentType.

Spotted by Graphite Reviewer

re-requesting

benpankow · 2024-11-07T19:20:04Z

python_modules/libraries/dagster-fivetran/dagster_fivetran/translator.py

+
+
+@whitelist_for_serdes
+@record


Do we still want/need this class, FivetranContentType etc if it's not getting fed into the translator?

In particular I think having the content type isn't necessary, since we aren't exposing this data to the user directly

I think it's worth keeping FivetranContentType for a few reasons:

The content data for Fivetran is a JSON object returned by the API for both connectors and destinations. Using FivetranContentType, it's easier to keep track of the type of the FivetranContentData.

This pattern is used for Power BI and Tableau, where the content data is also a JSON object returned by an API.

The implementation of fetch_fivetran_workspace_data leverages the FivetranContentType enum to create the FivetranWorkspaceData.

We use FivetranContentType when creating FivetranContentData objects and the FivetranWorkspaceData object, to differentiate connectors and destinations. See usage in subsequent PR, in [4/n][dagster-fivetran] Implement fetch_fivetran_workspace_data #25788 and [5/n][dagster-fivetran] Implement FivetranWorkspaceData to FivetranConnectorTableProps method #25797

Overall, I'm in favor of keeping this to keep our design pattern for XWorkspaceData and fetch_x_workspace_data as consistent as possible across integrations.
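A rough sketch of the pattern being argued for, using dataclasses as stand-ins for the serializable records. ContentType, ContentData, WorkspaceData, and the "id" property key are simplified assumptions for this example, not the actual class definitions in the stack.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Mapping, Sequence


class ContentType(Enum):
    CONNECTOR = "connector"
    DESTINATION = "destination"


@dataclass(frozen=True)
class ContentData:
    """Raw JSON payload tagged with the kind of object it describes."""

    content_type: ContentType
    properties: Mapping[str, Any]


@dataclass(frozen=True)
class WorkspaceData:
    connectors_by_id: Mapping[str, ContentData]
    destinations_by_id: Mapping[str, ContentData]

    @classmethod
    def from_content_data(cls, content: Sequence[ContentData]) -> "WorkspaceData":
        connectors: Dict[str, ContentData] = {}
        destinations: Dict[str, ContentData] = {}
        for item in content:
            # The tag is what lets identical-looking JSON blobs be sorted apart.
            if item.content_type is ContentType.CONNECTOR:
                connectors[item.properties["id"]] = item
            else:
                destinations[item.properties["id"]] = item
        return cls(connectors_by_id=connectors, destinations_by_id=destinations)


workspace = WorkspaceData.from_content_data(
    [
        ContentData(ContentType.CONNECTOR, {"id": "conn_1"}),
        ContentData(ContentType.DESTINATION, {"id": "dest_1"}),
    ]
)
```

The enum stays internal to the state-building step, so it does not need to be exposed to users of the translator.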

dpeng817 · 2024-11-08T02:59:27Z

python_modules/libraries/dagster-fivetran/dagster_fivetran/translator.py

+    Subclass this class to implement custom logic for each type of Fivetran content.
+    """
+
+    def get_asset_key(self, props: FivetranConnectorTableProps) -> AssetKey:


I think we want to delete this now right?

Yes, done in 6eb1099

maximearmstrong mentioned this pull request Nov 5, 2024

[1/n][dagster-fivetran] Scaffold FivetranWorkspace for rework #25750

Merged

maximearmstrong marked this pull request as ready for review November 5, 2024 18:47

maximearmstrong commented Nov 5, 2024

maximearmstrong self-assigned this Nov 5, 2024

maximearmstrong requested review from benpankow, OwenKephart and dpeng817 November 5, 2024 18:55

dpeng817 previously requested changes Nov 5, 2024

benpankow reviewed Nov 5, 2024

maximearmstrong mentioned this pull request Nov 5, 2024

[3/n][dagster-fivetran] Implement FivetranClient for rework #25756

Merged

maximearmstrong force-pushed the maxime/rework-fivetran-1 branch from 8c291f8 to 4dfe34d Compare November 5, 2024 23:13

maximearmstrong force-pushed the maxime/rework-fivetran-2 branch from 7c954b2 to bbaf99d Compare November 5, 2024 23:13

graphite-app bot reviewed Nov 5, 2024

maximearmstrong force-pushed the maxime/rework-fivetran-1 branch from 4dfe34d to 24ed962 Compare November 6, 2024 21:23

maximearmstrong force-pushed the maxime/rework-fivetran-2 branch from bbaf99d to ed343b0 Compare November 6, 2024 21:23

maximearmstrong changed the title ~~[dagster-fivetran] Scaffold DagsterFivetranTranslator for rework~~ [dagster-fivetran] Update DagsterFivetranTranslator and related classes for rework Nov 6, 2024

maximearmstrong requested review from dpeng817 and benpankow November 6, 2024 22:32

maximearmstrong changed the title ~~[dagster-fivetran] Update DagsterFivetranTranslator and related classes for rework~~ [2/n][dagster-fivetran] Update DagsterFivetranTranslator and related classes for rework Nov 7, 2024

maximearmstrong force-pushed the maxime/rework-fivetran-1 branch from 24ed962 to 1c392d7 Compare November 7, 2024 18:15

maximearmstrong force-pushed the maxime/rework-fivetran-2 branch from 67c556e to 7197ca1 Compare November 7, 2024 18:15

maximearmstrong mentioned this pull request Nov 7, 2024

[4/n][dagster-fivetran] Implement fetch_fivetran_workspace_data #25788

Merged

benpankow reviewed Nov 7, 2024

maximearmstrong force-pushed the maxime/rework-fivetran-1 branch from 1c392d7 to c03162d Compare November 7, 2024 22:22

maximearmstrong force-pushed the maxime/rework-fivetran-2 branch from 7197ca1 to e21e3f0 Compare November 7, 2024 22:22

maximearmstrong mentioned this pull request Nov 7, 2024

[5/n][dagster-fivetran] Implement FivetranWorkspaceData to FivetranConnectorTableProps method #25797

Merged

dpeng817 reviewed Nov 8, 2024

dpeng817 approved these changes Nov 8, 2024

maximearmstrong force-pushed the maxime/rework-fivetran-1 branch from c03162d to 0d105eb Compare November 8, 2024 14:02

maximearmstrong force-pushed the maxime/rework-fivetran-2 branch from e21e3f0 to 6eb1099 Compare November 8, 2024 14:02

maximearmstrong force-pushed the maxime/rework-fivetran-1 branch from 0d105eb to e702849 Compare November 8, 2024 16:43

maximearmstrong force-pushed the maxime/rework-fivetran-2 branch from 6eb1099 to 5d21367 Compare November 8, 2024 16:44

This was referenced Nov 8, 2024

[6/n][dagster-fivetran] Implement FivetranWorkspaceDefsLoader #25807

Merged

[7/n][dagster-fivetran] Implement load_fivetran_asset_specs #25808

Merged

Base automatically changed from maxime/rework-fivetran-1 to master November 8, 2024 17:00

maximearmstrong added 6 commits November 8, 2024 12:00

[dagster-fivetran] Scaffold DagsterFivetranTranslator for rework

6ca6343

Lint

abc0107

Update translator post review

3f2e8a4

Update folder structure

f709655

Lint

bb5ddc0

Remove get_asset_key in DagsterFivetranTranslator

fed3154

maximearmstrong force-pushed the maxime/rework-fivetran-2 branch from 5d21367 to fed3154 Compare November 8, 2024 17:00

maximearmstrong merged commit b3e4e7c into master Nov 8, 2024
1 check passed

maximearmstrong deleted the maxime/rework-fivetran-2 branch November 8, 2024 17:40
