Change "free outputs" to also return MemoryDataSet entries from the catalog
#1900
Comments
1 task
The last line: are those explicitly defined MemoryDataSets or implicit ones?
I think this is a bug fix rather than a behavioral change. It happens when someone puts a MemoryDataSet in the catalog, because currently:
free_outputs = pipeline.outputs() - set(catalog.list())
# This will be changed to
free_outputs = pipeline.outputs() - set(catalog_excluding_memory_dataset.list())
noklam modified the milestones: Improve the Interactive Jupyter notebook workflow, Improving the debugging experience with Notebook (Mar 22, 2023)
7 tasks
Already implemented in #3475.
Description
session.run() currently returns "free outputs". What free_output means in the code is just "output that's not defined in the catalog", which is a subset of "output that's not a MemoryDataSet".
kedro/kedro/runner/runner.py, lines 78 to 91 in f491420
There was agreement that the "free outputs" output from session isn't very clear. It was suggested to simply return all output from nodes that is not consumed, even if it's defined in the catalog. However, this could lead to very large amounts of data being returned. Instead we'll change it to return all free outputs and additionally any MemoryDataSets that are defined in the catalog.
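To make the difference concrete, here is a minimal sketch of the current and the proposed selection logic. It uses plain Python sets and a dict as hypothetical stand-ins for the pipeline outputs and the catalog. The function names and the string type labels are assumptions for illustration only, not Kedro's actual API.

```python
# Hypothetical stand-ins: plain sets and a dict, not Kedro's real objects.
# Outputs produced by nodes that no other node consumes.
pipeline_outputs = {"model", "metrics", "report"}

# Catalog entries mapped to a dataset type name (illustrative labels only).
catalog = {
    "raw_data": "CSVDataSet",
    "model": "PickleDataSet",
    "metrics": "MemoryDataSet",  # explicitly registered in-memory dataset
}


def free_outputs(outputs, catalog):
    """Current behaviour: keep outputs that have no catalog entry at all."""
    return outputs - set(catalog)


def free_outputs_with_memory(outputs, catalog):
    """Proposed behaviour: also keep outputs registered as MemoryDataSet."""
    # Only catalog entries that are not in-memory datasets are excluded.
    non_memory = {name for name, kind in catalog.items() if kind != "MemoryDataSet"}
    return outputs - non_memory


current = free_outputs(pipeline_outputs, catalog)
proposed = free_outputs_with_memory(pipeline_outputs, catalog)
```

In this sketch `metrics` is dropped by the current logic because it has a catalog entry, but it is returned by the proposed logic because that entry is a MemoryDataSet. `model` is excluded in both cases since it is persisted.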
Context
#1802