-
Notifications
You must be signed in to change notification settings - Fork 283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace dask 'compute()' usage with a common realisation call. (#2) #2447
Conversation
import numpy as np | ||
import numpy.ma as ma | ||
|
||
from iris.analysis import VARIANCE | ||
import iris.cube | ||
from iris.coords import DimCoord | ||
from iris._lazy_data import as_lazy_data, as_concrete_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do need to agree on an order for importing hidden modules, atm we're inconsistent with it.
Unfortunately, a quick google didn't unearth any standard practice
I do like this approach of putting them in alphabetical, regardless of whether or not they are private though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to go with alphabetical (which we are) then _
sorts before all lower case letters, so IMHO private modules should go first.
lib/iris/_lazy_data.py
Outdated
If the data is lazy, return the realised result. | ||
|
||
Where lazy data contains NaNs these are translated by filling or conversion | ||
to masked data, using the :func:`convert_nans_array` function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:func:
convert_nans_array
Should this be
:func:`iris._lazy_data.convert_nans_array`
or is okay to be relative if it's in the same module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could always make it
:func:`~iris._lazy_data.convert_nans_array`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember to add some minimal test coverage
lib/iris/_lazy_data.py
Outdated
""" | ||
Return the actual content of a lazy array, as a numpy array. | ||
|
||
If the data is a NumPy array, return it unchanged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer This also applies to ~numpy.ma.core.MaskedArray
.
lib/iris/_merge.py
Outdated
@@ -33,7 +33,8 @@ | |||
import numpy as np | |||
import numpy.ma as ma | |||
|
|||
from iris._lazy_data import as_lazy_data, is_lazy_data, multidim_lazy_stack | |||
from iris._lazy_data import (as_lazy_data, is_lazy_data, multidim_lazy_stack, | |||
as_concrete_data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer The sort order needs changes here ... make as_concrete_data
first ...
import numpy as np | ||
import numpy.ma as ma | ||
|
||
from iris.analysis import VARIANCE | ||
import iris.cube | ||
from iris.coords import DimCoord | ||
from iris._lazy_data import as_lazy_data, as_concrete_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to go with alphabetical (which we are) then _
sorts before all lower case letters, so IMHO private modules should go first.
lib/iris/_lazy_data.py
Outdated
If the data is lazy, return the realised result. | ||
|
||
Where lazy data contains NaNs these are translated by filling or conversion | ||
to masked data, using the :func:`convert_nans_array` function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could always make it
:func:`~iris._lazy_data.convert_nans_array`
lib/iris/coords.py
Outdated
@@ -39,6 +39,7 @@ | |||
|
|||
import iris.aux_factory | |||
import iris.exceptions | |||
from iris._lazy_data import as_concrete_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import order ...
nans_replacement=np.ma.masked) | ||
# NOTE: we probably don't have full support for masked aux-coords. | ||
# We certainly *don't* handle a _FillValue attribute (and possibly | ||
# the loader will throw one away ?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer We should raise a ticket to consider how we deal with masked integral data on coordinates and cell measures. At the moment we don't keep the result dtype ... this is lost in translation, which is bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer Did you create an issue to cover this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but I'll do that now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lib/iris/cube.py
Outdated
@@ -43,7 +43,8 @@ | |||
import iris._constraints | |||
from iris._deprecation import warn_deprecated | |||
from iris._lazy_data import (array_masked_to_nans, as_lazy_data, | |||
convert_nans_array, is_lazy_data) | |||
convert_nans_array, is_lazy_data, | |||
as_concrete_data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import sort order ...
lib/iris/fileformats/pp.py
Outdated
@@ -43,7 +43,8 @@ | |||
import iris.fileformats.rules | |||
import iris.fileformats.pp_rules | |||
import iris.coord_systems | |||
from iris._lazy_data import array_masked_to_nans, as_lazy_data, is_lazy_data | |||
from iris._lazy_data import (array_masked_to_nans, as_lazy_data, is_lazy_data, | |||
as_concrete_data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import sort order ...
@@ -23,7 +23,7 @@ | |||
# importing anything else. | |||
import iris.tests as tests | |||
|
|||
from iris._lazy_data import as_lazy_data | |||
from iris._lazy_data import as_lazy_data, as_concrete_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import sort order
@@ -26,7 +26,7 @@ | |||
import numpy as np | |||
import dask.array as da | |||
|
|||
from iris._lazy_data import as_lazy_data, multidim_lazy_stack | |||
from iris._lazy_data import as_lazy_data, multidim_lazy_stack, as_concrete_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import sort order
@bjlittle Hopefully you are happy with the changes? |
lib/iris/_lazy_data.py
Outdated
Where lazy data contains NaNs these are translated by filling or conversion | ||
to masked data, using the :func:`~iris._lazy_data.convert_nans_array` | ||
function. | ||
See there for usage of the 'nans_replacement' and 'result_dtype' keys. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: "there" --> "their".
Alternatively you could change the signature to
def as_concrete_data(data, **kwargs)
as the kwargs
are not used at all by this function and are just passed straight to convert_nans_array
. At that point you could replace this description of the kwargs
with something like
"Kwargs are passed straight to :func:~iris._lazy_data.convert_nans_array
."
It is a more typical way of doing such things in Python but also less readable...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "See there for usage" was meant as in "See over there for usage".
I like your idea of **kwargs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "See there for usage" was meant as in "See over there for usage".
Ah yes, on a re-reading you're quite right!
if (ma.isMaskedArray(merged_data) and | ||
ma.count_masked(merged_data) == 0): | ||
not ma.is_masked(merged_data)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice spot 😉
self.assertEqual(sentinel, result) | ||
|
||
# Check call to convert_nans_array | ||
conv_nans.assert_called_once() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reckon you can do assert_called_once_with
, which may be useful here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dkillick I don't think that assert_called_once_with
will cope with comparing numpy arrays, so checking the call count and unpacking the call args seems okay ... but you could give @dkillick's suggestion a try.
@lbdreyer Although, you could combine this test and the above test_lazy_data
test into one and also use sentinels for the nans_replacement
and result_dtype
... that would make it sufficiently generic.
arg, = args | ||
self.assertFalse(is_lazy_data(arg)) | ||
self.assertArrayEqual(arg, data) | ||
self.assertEqual(kwargs, {'result_dtype': None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert_called_once_with
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't work when one of the args is a numpy array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It probably would if you also mocked the array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, in this case, neither of them are NumPy arrays...
|
||
import numpy as np | ||
import numpy.ma as ma | ||
import dask.array as da |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer Sorry, but could you fix this import order ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah man! I didn't see this one!
👍 Hoo-rah! Awesome, thanks @lbdreyer |
Thanks @bjlittle !! |
…ciTools#2447) Replace dask 'compute()' usage with a common realisation call.
Replaces #2422