forked from dagster-io/dagster
add batch fetching of data version records (dagster-io#21798)
## Summary & Motivation

Asset graphs with large fan-in can incur a hefty data-fetching cost when used with data versions.

This PR fetches the asset record for a batched set of asset keys. The asset record has the last materialization record, and potentially the last observation record (in Plus), reducing the number of serial fetches we have to make to get the input data versions.

This batching of calls is only possible because we're not filtering the records (obs/mats) that we're fetching (either by partition or by storage id).

## How I Tested These Changes

Added an explicit fan-in data version test that checks the underlying data fetching calls. It went from 200 calls to `get_event_records` => 1 call of `get_asset_records`.
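
The difference between per-key fetching and a single batched fetch can be illustrated with a small, self-contained sketch. This is not the Dagster implementation: `FakeStorage`, `FakeAssetRecord`, and both helper functions are hypothetical stand-ins, and the method signatures are assumptions that only borrow the names `get_event_records` and `get_asset_records` from the commit message. The fan-in of 200 is chosen to echo the call count quoted above; the layout of the actual test is not shown here.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class FakeAssetRecord:
    # Hypothetical stand-in: carries only the latest data version for one asset.
    asset_key: str
    last_data_version: Optional[str]


@dataclass
class FakeStorage:
    # In-memory stand-in for an event log store that counts calls per method.
    versions: Dict[str, str]
    calls: Dict[str, int] = field(
        default_factory=lambda: {"get_event_records": 0, "get_asset_records": 0}
    )

    def get_event_records(self, asset_key: str) -> Optional[str]:
        # One round trip per asset key (assumed signature, not Dagster's).
        self.calls["get_event_records"] += 1
        return self.versions.get(asset_key)

    def get_asset_records(self, asset_keys: Iterable[str]) -> List[FakeAssetRecord]:
        # One round trip for the whole batch of keys (assumed signature).
        self.calls["get_asset_records"] += 1
        return [FakeAssetRecord(k, self.versions.get(k)) for k in asset_keys]


def input_versions_serial(storage: FakeStorage, keys: List[str]) -> Dict[str, Optional[str]]:
    # Serial approach: one storage call per upstream asset.
    return {k: storage.get_event_records(k) for k in keys}


def input_versions_batched(storage: FakeStorage, keys: List[str]) -> Dict[str, Optional[str]]:
    # Batched approach: a single storage call covering every upstream asset.
    records = storage.get_asset_records(keys)
    return {r.asset_key: r.last_data_version for r in records}


# A downstream asset with 200 upstream inputs (large fan-in).
upstream = [f"upstream_{i}" for i in range(200)]
storage = FakeStorage({k: f"v{i}" for i, k in enumerate(upstream)})

serial = input_versions_serial(storage, upstream)
batched = input_versions_batched(storage, upstream)
print(storage.calls)
```

Both helpers return the same mapping; only the number of storage round trips differs. As the commit message notes, this kind of batching relies on the fetch being unfiltered, so a sketch that filtered by partition or storage id would not collapse to one call in the same way.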
Showing 2 changed files with 107 additions and 43 deletions.