Bnb/dev #95
…_index precomputed to data handler
…allel passes on single nodes.
…e when only 1 flist chunk.
…t ordering. needed to correct for this in the transform + rotate operation.
Parameters
----------
feature : str
    Dataset name to collect.
masked_meta : pd.DataFrame
This method docstring is confusing. What is the difference between masked_meta and masked_target_meta? Are the file paths a subset (chunk) of the full list of file paths? Is target_final_meta_file really a file path? (Also, target_final_meta_file is misnamed.)
masked_meta is for the file list chunk, and masked_target_meta is for the entire file list. Good catch on the misnaming; here it's not a file.
  extract_workers=None, compute_workers=None, load_workers=None,
- norm_workers=None, ti_workers=None):
+ norm_workers=None, ti_workers=None, handle_features=None):
Docstring for handle_features? Why provide it separately?
This was to enable precomputing for the forward pass pipeline. When I was getting threads working, these repeated calls to xarray caused issues.
Added stats cli and got process pool for passes on single nodes working.