Add dataframe filter to filter tabular data #51

aiakide · 2023-07-07T15:25:33Z

📥 Pull Request Description

This PR adds DataframeFilter, an abstract class for filtering tabular data in a DfDataset.
It also adds a concrete implementation, NanDataframeFilter, which removes rows with NaN values in the input and target columns of the data description.
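
For readers who have not opened the diff, here is a minimal sketch of how such a filter pair could look with pandas. It is not the code merged in this PR: the class names follow the commit messages, but the `filter` method name, the constructor parameters `input_columns` and `target_columns`, and the use of `DataFrame.dropna` are assumptions made for illustration only. In the real implementation the columns come from the data description, which is not shown here.

```python
from abc import ABC, abstractmethod
from typing import Sequence

import pandas as pd


class DataframeFilter(ABC):
    """Hypothetical interface: takes a dataframe and returns a filtered copy."""

    @abstractmethod
    def filter(self, df: pd.DataFrame) -> pd.DataFrame:
        """Return the rows of ``df`` that pass this filter."""


class NanDataframeFilter(DataframeFilter):
    """Drops rows that have NaN in any of the given input or target columns."""

    def __init__(self, input_columns: Sequence[str], target_columns: Sequence[str]):
        # Assumption: column names are passed in directly instead of being
        # read from a data description object.
        self.columns = list(input_columns) + list(target_columns)

    def filter(self, df: pd.DataFrame) -> pd.DataFrame:
        # NaN values in columns outside ``self.columns`` are ignored.
        return df.dropna(subset=self.columns)


df = pd.DataFrame(
    {
        "x1": [1.0, None, 3.0],
        "x2": [4.0, 5.0, 6.0],
        "y": [0.0, 1.0, None],
        "note": [None, "a", "b"],  # not a feature column, NaN is tolerated here
    }
)

# Row 1 has NaN in "x1" and row 2 has NaN in "y", so only row 0 is kept.
filtered = NanDataframeFilter(input_columns=["x1", "x2"], target_columns=["y"]).filter(df)
```

Restricting `dropna` to a column subset keeps rows whose only missing values sit in columns the model never reads, which matches the description of filtering on input and target columns only.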

👀 Affected Areas

DataframeFilter (new)

📝 Checklist

Please make sure you've completed the following tasks before submitting this pull request:

Pre-commit hooks were executed
Changes have been reviewed by at least one other developer
Tests have been added or updated to cover the changes (only necessary if the changes affect the executable code)
All tests ran successfully
All merge conflicts are resolved
Documentation has been updated to reflect the changes
Any necessary migrations have been run

📌 Related Issues

None

🔗 Links

None

📷 Screenshots

None

## 📥 Pull Request Description

The following features and fixes will be part of the next release (`v0.6.0`)

- feat: Add lockfile name as attribute of `FileChecksumProcessor` (#46)
- fix: Remove temp directory from hydra search path. Add hydra config mapping factory (#47)
- fix: save result files from `tensorgraphanalyzer` at the correct place and implemented validation for that (#50)
- feat: Add dagster op for dataframe normalization (#48)
- feat: Add `NanDataframeFilter` to drop nan values of feature columns (#51)
- fix: Adjust supported python versions in `Getting Started` docs section

Additionally, there are several adjustments to the project organization

- Pull request template added
- Bug Report template added
- Code of Conduct added

## 👀 Affected Areas

- `FileChecksumProcessor`
- dagster ops
- `df_normalization`
- `DataframeFilter`
- `NanDataframeFilter`
- `tensorgraphanalyzer`
- docs

## 📝 Checklist

Please make sure you've completed the following tasks before submitting this pull request:

- [X] Pre-commit hooks were executed
- [ ] Changes have been reviewed by at least one other developer
- [X] Tests have been added or updated to cover the changes (only necessary if the changes affect the executable code)
- [X] All tests ran successfully
- [X] All merge conflicts are resolved
- [X] Documentation has been updated to reflect the changes
- [X] Any necessary migrations have been run

## 📌 Related Issues

_None_

## 🔗 Links

_None_

## 📷 Screenshots

_None_

aiakide added 2 commits July 7, 2023 17:13

feat: Add abstract dataframe filter

cd6af21

feat: Add NanDataframeFilter to drop nan values of feature columns

ab50399

aiakide requested review from dstalzjohn and ankeko July 7, 2023 15:25

dstalzjohn approved these changes Jul 7, 2023

aiakide and others added 2 commits July 7, 2023 17:34

feat: Add test for NanDataframeFilter

0673ac5

Merge branch 'develop' into feature/dataframe-filter

8c143f0

aiakide merged commit 3561200 into develop Jul 7, 2023
6 checks passed

aiakide deleted the feature/dataframe-filter branch July 7, 2023 20:54

aiakide mentioned this pull request Jul 10, 2023

Release v0.6.0 #52

Merged

7 tasks
