Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add masking helpers to validation framework #140

Merged
merged 8 commits into from
Jul 23, 2018

Conversation

awst-baum
Copy link
Collaborator

Extends the existing MaskingAdapter so that it can be used reduce multi-column masking datasets to one column and do "not equal" comparisons. Adds a SelfMaskingAdapter that wraps a data reader and allows to mask the data while reading, based on a comparison operation on one column (column, operation, and threshold can be specified).

@sebhahn sebhahn requested a review from cpaulik July 18, 2018 19:49
threshold:
value to use as the threshold combined with the operator
column_name: string
name of the column to cut the read masking dataset to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this should be something like name of the column to apply the threshold to?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that sounds better (although the given column will be the only one remaining in the DataFrame after masking).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really? How I read the code the complete dataframe will be returned but masked by the boolean mask

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we talking about the changes in MaskingAdapter or about the SelfMaskingAdapter?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SelfMaskingAdapter. Line 101 of the diff.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, my bad. I've changed it.

return self.operator(data, self.threshold)
return self.__mask(data)

class SelfMaskingAdapter(object):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a small paragraph to the documentation about the SelfMaskingAdapter and the differences to the MaskingAdapter. I think that would help us in making it clearer what they both do. I think we might be able to combine them into one class since they are very similar. What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re documentation, sure, as soon as we've sorted out what we actually want to do. Re combining: pro: avoiding code repetition, contra: both adapters do something different and they yield a quite different result. My personal taste as a dev would be to have two separate classes for two separate concerns - but I could derive them from a common base class to avoid code repetition. If I do that, should I use an https://docs.python.org/2/library/abc.html ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the documentation I think we should edit the ipython notebooks and automatically include them in the sphinx documentation. For now editing https://github.com/TUW-GEO/pytesmo/blob/master/docs/validation_framework.ipynb and exporting it as rst should also work.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can keep the classes separate for now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've updated the docs - what do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I'll merge this after a quick rebase if all the tests still pass.

@cpaulik cpaulik merged commit 027fe55 into TUW-GEO:master Jul 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants