-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Data] Fix legacy do_write #32661
[Data] Fix legacy do_write #32661
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing! Could we add a test for this case?
How would you recommend adding a test for this? I don't see existing test cases for CustomDatasource to base off of. |
You may extend the |
Dataset tests seem to be failing (@jianoaix ) |
Apologies, I will investigate. Quick sanity check, how do you run the data tests locally? I am able to run basic tests with |
Yes, that's how we run tests locally. Can you try with `python python/ray/data/tests/test_dataset.py'? |
Running that in the same directory (top level ray) gives the same error:
|
The test failed is actually this one: https://buildkite.com/ray-project/oss-ci-build-pr/builds/12645 |
Signed-off-by: Matthew Tang <[email protected]>
73e8852
to
85f7b40
Compare
Signed-off-by: Matthew Tang <[email protected]>
To my understanding, the current issue is that various front-facing Datasources do not implement The original problem still stands that Still investigating a solution to this. |
Yeah, with the fix this PR, those subclasses of |
Signed-off-by: Matthew Tang <[email protected]>
Just pushed a new version which checks if the |
That LGTM! |
It looks tests passing now. @matthew29tang Can you update the PR description based on current state? |
Just edited the PR description. |
Merged, thanks! |
With the new `write` added (from ray-project#32015 and ray-project#32440), Ray Data intends to support both the `write` and `do_write` functions for now. The check currently uses the `hasattr()` function to ensure the datasource object has a `write` method before using it. However, this is insufficient for a custom datasource that inherits from `Datasource` since `Datasource` has the `write` method implemented. If the custom datasource only has `do_write` implemented, `hasattr(datasource, "write")` will return True since `hasattr()` will detect methods via inheritance. The solution is to check if the `write` method was overwritten from `Datasource.write`. Any class that has not implemented `write` will have the equality check return True
With the new `write` added (from ray-project#32015 and ray-project#32440), Ray Data intends to support both the `write` and `do_write` functions for now. The check currently uses the `hasattr()` function to ensure the datasource object has a `write` method before using it. However, this is insufficient for a custom datasource that inherits from `Datasource` since `Datasource` has the `write` method implemented. If the custom datasource only has `do_write` implemented, `hasattr(datasource, "write")` will return True since `hasattr()` will detect methods via inheritance. The solution is to check if the `write` method was overwritten from `Datasource.write`. Any class that has not implemented `write` will have the equality check return True Signed-off-by: Edward Oakes <[email protected]>
With the new `write` added (from ray-project#32015 and ray-project#32440), Ray Data intends to support both the `write` and `do_write` functions for now. The check currently uses the `hasattr()` function to ensure the datasource object has a `write` method before using it. However, this is insufficient for a custom datasource that inherits from `Datasource` since `Datasource` has the `write` method implemented. If the custom datasource only has `do_write` implemented, `hasattr(datasource, "write")` will return True since `hasattr()` will detect methods via inheritance. The solution is to check if the `write` method was overwritten from `Datasource.write`. Any class that has not implemented `write` will have the equality check return True
With the new `write` added (from ray-project#32015 and ray-project#32440), Ray Data intends to support both the `write` and `do_write` functions for now. The check currently uses the `hasattr()` function to ensure the datasource object has a `write` method before using it. However, this is insufficient for a custom datasource that inherits from `Datasource` since `Datasource` has the `write` method implemented. If the custom datasource only has `do_write` implemented, `hasattr(datasource, "write")` will return True since `hasattr()` will detect methods via inheritance. The solution is to check if the `write` method was overwritten from `Datasource.write`. Any class that has not implemented `write` will have the equality check return True
With the new `write` added (from ray-project#32015 and ray-project#32440), Ray Data intends to support both the `write` and `do_write` functions for now. The check currently uses the `hasattr()` function to ensure the datasource object has a `write` method before using it. However, this is insufficient for a custom datasource that inherits from `Datasource` since `Datasource` has the `write` method implemented. If the custom datasource only has `do_write` implemented, `hasattr(datasource, "write")` will return True since `hasattr()` will detect methods via inheritance. The solution is to check if the `write` method was overwritten from `Datasource.write`. Any class that has not implemented `write` will have the equality check return True
With the new `write` added (from ray-project#32015 and ray-project#32440), Ray Data intends to support both the `write` and `do_write` functions for now. The check currently uses the `hasattr()` function to ensure the datasource object has a `write` method before using it. However, this is insufficient for a custom datasource that inherits from `Datasource` since `Datasource` has the `write` method implemented. If the custom datasource only has `do_write` implemented, `hasattr(datasource, "write")` will return True since `hasattr()` will detect methods via inheritance. The solution is to check if the `write` method was overwritten from `Datasource.write`. Any class that has not implemented `write` will have the equality check return True Signed-off-by: elliottower <[email protected]>
Why are these changes needed?
With the new
write
added (from #32015 and #32440), Ray Data intends to support both thewrite
anddo_write
functions for now. The check currently uses thehasattr()
function to ensure the datasource object has awrite
method before using it.However, this is insufficient for a custom datasource that inherits from
Datasource
sinceDatasource
has thewrite
method implemented. If the custom datasource only hasdo_write
implemented,hasattr(datasource, "write")
will return True sincehasattr()
will detect methods via inheritance.The solution is to check if the
write
method was overwritten fromDatasource.write
. Any class that has not implementedwrite
will have the equality check return TrueChecks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.