Why/when does np.something remove the mask of a np.ma array ? #18675
Comments
I would not consider it either a feature or a bug... But there is a fundamental problem and unfortunately there is no solution for "fixing" masked arrays. Masked arrays work well for many things, but the limitations are of course very clear to you. Maybe the point is that There is currently consensus that these issues cannot be fixed in NumPy. The first thought might be to fix The good news is that we are in a pretty good position right now for a "better" There are NumPy core-devs interested in such a project, for example @ahaldane. I am not sure how far they are along or if you are interested in contributing/looking at it. As to what to do in NumPy more concretely and explaining the "state". Many NumPy functions will call One set of functions that always works I think, are the NumPy ufuncs/math functions The only thing I could think of to actually improve the situation would be to tag on a warning when something calls |
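The remark about ufuncs can be checked with a short sketch. The array below is made up for illustration; `np.sin` and `np.add` are called from the plain `np` namespace, and the result is expected to still be a masked array carrying the input mask.

```python
import numpy as np

# Made-up data: the middle element is masked.
a = np.ma.masked_array([1.0, 4.0, 9.0], mask=[False, True, False])

# Ufuncs taken straight from np, not from np.ma.
sine = np.sin(a)
shifted = np.add(a, 1)

# Both results remain MaskedArray instances with the original mask.
print(type(sine).__name__, np.ma.getmaskarray(sine))
print(type(shifted).__name__, np.ma.getmaskarray(shifted))
```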
Thank you for the clarification! I thought masked arrays were more tightly integrated in numpy than that. And we are probably all convinced that it is better to handle explicitly missing values than relying on Seen from outside (my user point of view), it seems that we are almost there. We should be working seamlessly with np and np.ma data, which is the case most of the time, but the side effects we get when masked data is suddenly cast back to non-masked data can be both surprising and dangerous (if they go unnoticed). As a regular user of np and np.ma, I can now spot errors when they seem to come from a lost mask, but it's not always easy. I hope something gets done about this at some point. In the meantime, mentioning something (without having it seem like a bug) in all the appropriate places in the documentation may help, if people read it. Or something that can be found with Google. Clearly mentioning masked arrays in the NumPy user guide could help. Also, adding a warning when |
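Until something like this exists in NumPy itself, a user-side check is one way to make a lost mask visible. The sketch below is only an illustration: `warn_if_mask_lost` is a hypothetical helper, not a NumPy function, and it assumes that a result with no masked values at all means the mask was dropped.

```python
import warnings

import numpy as np


def warn_if_mask_lost(inputs, result):
    """Hypothetical helper (not part of NumPy): warn when the inputs
    contain masked values but the result contains none."""
    n_in = sum(int(np.ma.getmaskarray(x).sum()) for x in inputs)
    n_out = int(np.ma.getmaskarray(result).sum())
    if n_in > 0 and n_out == 0:
        warnings.warn(
            "masked values in the inputs are missing from the result",
            stacklevel=2,
        )
    return result


# Made-up data: the middle element is masked.
a = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])

# np.asarray returns a plain ndarray, so the helper warns here.
plain = warn_if_mask_lost([a], np.asarray(a))

# np.ma.hstack keeps the mask, so no warning is expected here.
stacked = warn_if_mask_lost([a, a], np.ma.hstack((a, a)))
```

The check is deliberately coarse: it does not catch a mask that is only partly lost, and it has to be called by hand around each suspicious operation.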
@jypeter, for some suggested action items here, it would be great if you could share one of your use cases with the explanation included above. Your experience would help many new and seasoned NumPy users. Would you be willing to add either:
A longer endeavor would be the suggested warnings when |
Thanks for the explanations. From a user perspective I also think that it would be much safer to have a warning whenever I ended up getting burned by using |
This is actually quite a footgun, so maybe we should just deprecate and then remove implicit conversion? When a user knows what they are doing, they can use the |
If only users knew what they were doing and read the documentation! We have to make sure that things work implicitly as expected (in our case, masks are used and carried on, if present), or that there are built-in safeguards (i.e. warning messages that will help the users get what they want). Things should be bulletproof (idiot-proof, for some of our lazy users...). But I'm not the person who can do the required coding. As I said, masks (and masked arrays) are a great feature (I hate |
This is obviously more a feature than a bug, otherwise it would have been corrected (I'm using numpy 1.20.1). But it has been bothering me for a very long while, and has been the indirect source of several bugs in my (and other colleagues') scripts.
There must be some logic behind it, but I have not found it in the documentation. The closest issue I have found is #8881 (an open issue from 4 years ago).
I have a masked array. If I work on it with np.ma functions, things will be fine, but the equivalent function straight from np will silently ignore and remove the mask!
The example below is with hstack, but I get the same problem with vstack, repeat, and probably many other numpy functions.
On the other hand, some functions fortunately use and keep the mask, regardless of being taken from np or np.ma.
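A minimal sketch of the hstack behaviour described above, using made-up data rather than the original example. It compares the mask of the result from `np.ma.hstack` with the mask of the result from `np.hstack`; the second is expected to come back with nothing masked.

```python
import numpy as np

# Made-up data: the middle element is masked.
a = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])

kept = np.ma.hstack((a, a))  # np.ma version: the mask is carried over
lost = np.hstack((a, a))     # np version: the mask is expected to be dropped

# getmaskarray always returns a full boolean array, whatever the result type.
print(np.ma.getmaskarray(kept))
print(np.ma.getmaskarray(lost))
```

Comparing `np.ma.getmaskarray` of both results is a quick way to see whether a given function honours the mask, since it works on plain and masked arrays alike.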
So, is this a bug or a feature? What is the logic (so that I can tell our students), and where is it clearly (for beginners) explained?
Even if this is not a bug, I think it would be much safer if numpy functions working on masked arrays always used the mask and returned a masked array.