Replace dask 'compute()' usage with a common realisation call. (#2) #2447

lbdreyer · 2017-03-20T14:44:48Z

Replaces #2422

lbdreyer · 2017-03-20T14:57:31Z

lib/iris/tests/unit/analysis/test_VARIANCE.py

 import numpy as np
 import numpy.ma as ma

 from iris.analysis import VARIANCE
 import iris.cube
 from iris.coords import DimCoord
+from iris._lazy_data import as_lazy_data, as_concrete_data


We do need to agree on an order for importing hidden modules; at the moment we're inconsistent about it.
Unfortunately, a quick Google search didn't unearth any standard practice.
I do like this approach of putting them in alphabetical order, regardless of whether or not they are private, though.

If we're going to go with alphabetical (which we are) then _ sorts before all lower case letters, so IMHO private modules should go first.
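
As a quick check of the sorting claim above, here is a small sketch using only Python's built-in `sorted`, which orders strings by code point. The module and function names come from the imports in this diff; treating each import as its dotted module path is an assumption about how the ordering would be applied, not an agreed project rule.

```python
# Dotted module paths taken from the imports under review.
modules = [
    'iris.analysis',
    'iris.cube',
    'iris.coords',
    'iris._lazy_data',
]

# '_' (0x5F) has a lower code point than every lower-case letter (0x61-0x7A),
# so a plain sort puts the private module first.
ordered = sorted(modules)
for name in ordered:
    print(name)

# The same rule orders the names inside a single import statement.
names = sorted(['as_lazy_data', 'is_lazy_data', 'multidim_lazy_stack',
                'as_concrete_data'])
print(', '.join(names))
```

Note that upper-case letters have lower code points than `_`, so this only settles the order against lower-case names.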

lbdreyer · 2017-03-20T15:00:11Z

lib/iris/_lazy_data.py

+    If the data is lazy, return the realised result.
+
+    Where lazy data contains NaNs these are translated by filling or conversion
+    to masked data, using the :func:`convert_nans_array` function.


:func:`convert_nans_array`

Should this be

:func:`iris._lazy_data.convert_nans_array`

or is it okay to be relative if it's in the same module?

You could always make it

:func:`~iris._lazy_data.convert_nans_array`
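
For reference, the three spellings discussed in this thread look like this inside a docstring. This is only a sketch: `as_concrete_data_doc_example` is a hypothetical name, not part of the diff. Sphinx's Python domain tries a bare name first without qualification and then with the current module name prepended, so whether the bare form resolves depends on how the page is built. The leading `~` keeps the full target but shows only the last component as the link text.

```python
def as_concrete_data_doc_example(data):
    """
    Hypothetical docstring showing three ways to write the cross-reference.

    Bare name, looked up relative to the current module:
    :func:`convert_nans_array`

    Fully qualified, link text is the whole dotted path:
    :func:`iris._lazy_data.convert_nans_array`

    Fully qualified with a leading tilde, link text is only the last part:
    :func:`~iris._lazy_data.convert_nans_array`
    """
    return data
```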

bjlittle

Remember to add some minimal test coverage

bjlittle · 2017-03-20T15:35:22Z

lib/iris/_lazy_data.py

+    """
+    Return the actual content of a lazy array, as a numpy array.
+
+    If the data is a NumPy array, return it unchanged.


@lbdreyer This also applies to ~numpy.ma.core.MaskedArray.
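
To make the docstring under review concrete, here is a minimal sketch of the behaviour it describes, including the masked-array case raised above. It is not the code in this PR: `FakeLazyArray`, `convert_nans_sketch` and `as_concrete_sketch` are hypothetical names, the lazy array is a dask-free stand-in, and the NaN handling is a simplified guess that ignores `result_dtype`.

```python
import numpy as np
import numpy.ma as ma


class FakeLazyArray(object):
    """Hypothetical stand-in for a dask array, to keep the sketch dask-free."""

    def __init__(self, array):
        self._array = array

    def compute(self):
        # A real dask array would evaluate its task graph here.
        return self._array


def convert_nans_sketch(array, nans_replacement=None):
    # Simplified, assumed behaviour: NaNs become masked points or a fill value.
    nans = np.isnan(array)
    if nans.any():
        if nans_replacement is ma.masked:
            array = ma.masked_array(array, mask=nans)
        elif nans_replacement is not None:
            array = np.where(nans, nans_replacement, array)
    return array


def as_concrete_sketch(data, nans_replacement=None):
    # Real data, whether a plain ndarray or a MaskedArray, is returned as-is.
    if hasattr(data, 'compute'):
        data = convert_nans_sketch(data.compute(), nans_replacement)
    return data


plain = np.array([1.0, 2.0])
masked = ma.masked_array([1.0, 2.0], mask=[False, True])
lazy = FakeLazyArray(np.array([1.0, np.nan, 3.0]))

realised = as_concrete_sketch(lazy, nans_replacement=ma.masked)
print(realised)
```

Both `plain` and `masked` come back as the very same objects, which is the point of the comment above: the "return it unchanged" sentence covers masked arrays too.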

bjlittle · 2017-03-20T15:36:34Z

lib/iris/_merge.py

@@ -33,7 +33,8 @@
 import numpy as np
 import numpy.ma as ma

-from iris._lazy_data import as_lazy_data, is_lazy_data, multidim_lazy_stack
+from iris._lazy_data import (as_lazy_data, is_lazy_data, multidim_lazy_stack,
+                             as_concrete_data)


@lbdreyer The sort order needs changing here ... put as_concrete_data first ...

bjlittle · 2017-03-20T15:38:40Z

lib/iris/tests/unit/analysis/test_VARIANCE.py

 import numpy as np
 import numpy.ma as ma

 from iris.analysis import VARIANCE
 import iris.cube
 from iris.coords import DimCoord
+from iris._lazy_data import as_lazy_data, as_concrete_data


If we're going to go with alphabetical (which we are) then _ sorts before all lower case letters, so IMHO private modules should go first.

bjlittle · 2017-03-20T15:41:13Z

lib/iris/_lazy_data.py

+    If the data is lazy, return the realised result.
+
+    Where lazy data contains NaNs these are translated by filling or conversion
+    to masked data, using the :func:`convert_nans_array` function.


You could always make it

:func:`~iris._lazy_data.convert_nans_array`

bjlittle · 2017-03-20T15:44:15Z

lib/iris/coords.py

@@ -39,6 +39,7 @@

 import iris.aux_factory
 import iris.exceptions
+from iris._lazy_data import as_concrete_data


Import order ...

bjlittle · 2017-03-20T15:47:22Z

lib/iris/coords.py

+                                            nans_replacement=np.ma.masked)
+            # NOTE: we probably don't have full support for masked aux-coords.
+            # We certainly *don't* handle a _FillValue attribute (and possibly
+            # the loader will throw one away ?)


@lbdreyer We should raise a ticket to consider how we deal with masked integral data on coordinates and cell measures. At the moment we don't keep the result dtype ... this is lost in translation, which is bad.

@lbdreyer Did you create an issue to cover this?

No, but I'll do that now

@bjlittle I've just raised #2449.
I'm not sure it's quite what you were after, so feel free to edit the issue.
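
One way to see the dtype concern, sketched with plain NumPy rather than the Iris helpers. It assumes masked points are carried as NaNs while the data is lazy, which is what the function names in this diff suggest but is not confirmed here; the variable names are illustrative only.

```python
import numpy as np
import numpy.ma as ma

# Masked integer points, as might appear on a coordinate.
points = ma.masked_array([1, 2, 3], mask=[False, True, False], dtype=np.int32)

# NaN only exists for floating point, so the integers must be promoted first.
as_nans = points.astype(np.float64).filled(np.nan)

# Turning the NaNs back into a mask recovers the mask but not the dtype.
round_trip = ma.masked_invalid(as_nans)
print(points.dtype, round_trip.dtype)

# The original dtype only comes back if it was recorded and is reapplied.
restored = ma.masked_array(round_trip.filled(0).astype(points.dtype),
                           mask=round_trip.mask)
print(restored.dtype)
```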

bjlittle · 2017-03-20T15:47:41Z

lib/iris/cube.py

@@ -43,7 +43,8 @@
 import iris._constraints
 from iris._deprecation import warn_deprecated
 from iris._lazy_data import (array_masked_to_nans, as_lazy_data,
-                             convert_nans_array, is_lazy_data)
+                             convert_nans_array, is_lazy_data,
+                             as_concrete_data)


Import sort order ...

bjlittle · 2017-03-20T15:48:16Z

lib/iris/fileformats/pp.py

@@ -43,7 +43,8 @@
 import iris.fileformats.rules
 import iris.fileformats.pp_rules
 import iris.coord_systems
-from iris._lazy_data import array_masked_to_nans, as_lazy_data, is_lazy_data
+from iris._lazy_data import (array_masked_to_nans, as_lazy_data, is_lazy_data,
+                             as_concrete_data)


Import sort order ...

bjlittle · 2017-03-20T15:49:35Z

lib/iris/tests/unit/analysis/test_STD_DEV.py

@@ -23,7 +23,7 @@
 # importing anything else.
 import iris.tests as tests

-from iris._lazy_data import as_lazy_data
+from iris._lazy_data import as_lazy_data, as_concrete_data


Import sort order

bjlittle · 2017-03-20T15:50:16Z

lib/iris/tests/unit/lazy_data/test_multidim_lazy_stack.py

@@ -26,7 +26,7 @@
 import numpy as np
 import dask.array as da

-from iris._lazy_data import as_lazy_data, multidim_lazy_stack
+from iris._lazy_data import as_lazy_data, multidim_lazy_stack, as_concrete_data


Import sort order

lbdreyer · 2017-03-20T17:57:22Z

@bjlittle Hopefully you are happy with the changes?
(I am happy for you to squash and merge)

DPeterK · 2017-03-21T09:29:17Z

lib/iris/_lazy_data.py

+    Where lazy data contains NaNs these are translated by filling or conversion
+    to masked data, using the :func:`~iris._lazy_data.convert_nans_array`
+    function.
+    See there for usage of the 'nans_replacement' and 'result_dtype' keys.


Typo: "there" --> "their".

Alternatively you could change the signature to

def as_concrete_data(data, **kwargs)

as the kwargs are not used at all by this function and are just passed straight to convert_nans_array. At that point you could replace this description of the kwargs with something like

"Kwargs are passed straight to :func:`~iris._lazy_data.convert_nans_array`."

It is a more typical way of doing such things in Python but also less readable...

I think "See there for usage" was meant as in "See over there for usage".

I like your idea of **kwargs

> I think "See there for usage" was meant as in "See over there for usage".

Ah yes, on a re-reading you're quite right!
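
The trade-off behind the `**kwargs` suggestion can be sketched as follows. All three functions are hypothetical stand-ins, not the code in this PR, and `inspect.signature` assumes Python 3.

```python
import inspect


def convert(array, nans_replacement=None, result_dtype=None):
    # Hypothetical stand-in for the function that really consumes the keywords.
    return array, nans_replacement, result_dtype


def explicit(data, nans_replacement=None, result_dtype=None):
    # Every keyword is spelled out, so the signature documents itself.
    return convert(data, nans_replacement=nans_replacement,
                   result_dtype=result_dtype)


def forwarding(data, **kwargs):
    # Keywords pass straight through, so the wrapper never needs updating.
    return convert(data, **kwargs)


print(inspect.signature(explicit))
print(inspect.signature(forwarding))

# A misspelt keyword is only rejected once it reaches the inner function.
try:
    forwarding(1, result_dtpe=int)
except TypeError as error:
    print(error)
```

The forwarding version is shorter and cannot drift out of step with the inner function, at the cost of a less informative signature, which is the readability concern raised above.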

DPeterK · 2017-03-21T09:30:09Z

lib/iris/_merge.py

                if (ma.isMaskedArray(merged_data) and
-                        ma.count_masked(merged_data) == 0):
+                        not ma.is_masked(merged_data)):


Nice spot 😉
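
For anyone comparing the two spellings in this hunk, a small sketch with `numpy.ma` shows that they agree on whether any points are masked. The array names are illustrative only.

```python
import numpy.ma as ma

no_masked_points = ma.masked_array([1.0, 2.0, 3.0],
                                   mask=[False, False, False])
some_masked_points = ma.masked_array([1.0, 2.0, 3.0],
                                     mask=[False, True, False])

results = []
for array in (no_masked_points, some_masked_points):
    old_check = bool(ma.count_masked(array) == 0)  # count, then compare
    new_check = not ma.is_masked(array)            # ask the question directly
    results.append((old_check, new_check))

print(results)
```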

DPeterK · 2017-03-21T09:39:36Z

lib/iris/tests/unit/lazy_data/test_as_concrete_data.py

+        self.assertEqual(sentinel, result)
+
+        # Check call to convert_nans_array
+        conv_nans.assert_called_once()


I reckon you can do assert_called_once_with, which may be useful here.

@dkillick I don't think that assert_called_once_with will cope with comparing numpy arrays, so checking the call count and unpacking the call args seems okay ... but you could give @dkillick's suggestion a try.

@lbdreyer Although, you could combine this test and the above test_lazy_data test into one and also use sentinels for the nans_replacement and result_dtype ... that would make it sufficiently generic.
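
A minimal sketch of the sentinel approach suggested above. `realise` is a hypothetical stand-in for the function under test, and the converter is injected as an argument here instead of being patched, purely to keep the example self-contained. It uses `unittest.mock` from the Python 3 standard library.

```python
from unittest import mock


def realise(data, converter, **kwargs):
    # Hypothetical function under test: realise lazy data, then convert it.
    if hasattr(data, 'compute'):
        data = converter(data.compute(), **kwargs)
    return data


# Sentinels are unique objects that compare equal only to themselves, so no
# array comparison is ever needed.
lazy = mock.Mock()
lazy.compute.return_value = mock.sentinel.realised
converter = mock.Mock(return_value=mock.sentinel.converted)

result = realise(lazy, converter,
                 nans_replacement=mock.sentinel.nans_replacement,
                 result_dtype=mock.sentinel.result_dtype)

assert result is mock.sentinel.converted
converter.assert_called_once_with(
    mock.sentinel.realised,
    nans_replacement=mock.sentinel.nans_replacement,
    result_dtype=mock.sentinel.result_dtype)
```

Because every argument is a sentinel, one `assert_called_once_with` covers the call count, the positional argument and the keywords in a single check.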

DPeterK · 2017-03-21T09:39:56Z

lib/iris/tests/unit/lazy_data/test_as_concrete_data.py

+        arg, = args
+        self.assertFalse(is_lazy_data(arg))
+        self.assertArrayEqual(arg, data)
+        self.assertEqual(kwargs, {'result_dtype': None,


assert_called_once_with?

Doesn't work when one of the args is a numpy array

It probably would if you also mocked the array.

Also, in this case, neither of them is a NumPy array...

bjlittle · 2017-03-21T09:43:04Z

lib/iris/tests/unit/lazy_data/test_as_concrete_data.py

+
+import numpy as np
+import numpy.ma as ma
+import dask.array as da


@lbdreyer Sorry, but could you fix this import order ...

Ah man! I didn't see this one!

bjlittle · 2017-03-21T11:27:20Z

👍 Hoo-rah! Awesome, thanks @lbdreyer

lbdreyer · 2017-03-21T11:39:32Z

Thanks @bjlittle !!

…ciTools#2447) Replace dask 'compute()' usage with a common realisation call.

Replace dask 'compute()' usage with a common realisation call.

498180e

lbdreyer added the dask label Mar 20, 2017

lbdreyer added this to the dask milestone Mar 20, 2017

lbdreyer self-assigned this Mar 20, 2017

lbdreyer requested a review from bjlittle March 20, 2017 14:44

QuLogic added the Status: Work in Progress label Mar 20, 2017

lbdreyer mentioned this pull request Mar 20, 2017

Replace dask "computes" with common call #2422

Closed

lbdreyer commented Mar 20, 2017

bjlittle requested changes Mar 20, 2017

Add tests for as_concrete_data; reorder imports.

871bbb4

DPeterK requested changes Mar 21, 2017

bjlittle reviewed Mar 21, 2017

lbdreyer added 2 commits March 21, 2017 10:44

Use **kwargs; fix more import ordersss.

dc996a2

Edit to docstring.

a7c7042

DPeterK approved these changes Mar 21, 2017

bjlittle approved these changes Mar 21, 2017

bjlittle merged commit 4805e3b into SciTools:dask Mar 21, 2017

bjlittle removed the Status: Work in Progress label Mar 21, 2017

bjlittle pushed a commit to bjlittle/iris that referenced this pull request May 31, 2017

Replace dask 'compute()' usage with a common realisation call. (#2) (S…

e1ff306

…ciTools#2447) Replace dask 'compute()' usage with a common realisation call.

QuLogic modified the milestones: dask, v2.0 Aug 2, 2017

lbdreyer deleted the patrick_dask_concrete branch July 23, 2018 10:48
