-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
upgrade pyarrow from 12 to 14 #2089
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. I think you forgot to update it in some of the components: api, rows,...
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2089 +/- ##
==========================================
- Coverage 90.49% 87.05% -3.45%
==========================================
Files 248 132 -116
Lines 15202 5245 -9957
==========================================
- Hits 13757 4566 -9191
+ Misses 1445 679 -766
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
in fact, they were requiring pyarrow 11, which is why it was stuck to a lower version. But... it seems like the upgrade will not be that simple: https://github.com/huggingface/datasets-server/actions/runs/6823050840/job/18556219132 Looking at it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But... it seems like the upgrade will not be that simple
I see... I'm having a look at datasets
side...
EDIT:
I wrongly understood that it may be caused by datasets
, but that is not the case: in datasets
we already support pyarrow-14 (I did the PR 😅 ):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, the issue is in our viewer_utils
...
yes, to reproduce:
same if we create the content with
The trace is:
|
Should I open an issue in datasets? Or am I misusing it? It's a bit too cryptic for me |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I open an issue in datasets?
Yes, I understand now: they have disabled their PyExtensionType
and we use it in datasets
for arrays...
OK, I opened huggingface/datasets#6396 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reactivity !
I'm fine with disabling ArrayND types for now using 14.0.1
what does it mean? letting the error propagate, and remove it from the tests? |
Let's comment the failing tests for now. I guess |
(or maybe they won't, but in this case |
5b94feb
to
d298732
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we are no longer supporting datasets with arrays features...
Yes until |
Exactly. I think that it's better to remove support for array features than to let the vulnerability. |
Thanks for the comments and reviews! |
We temporarily stop supporting array features until
datasets
fixes the issue on their side.