Minimise the cube API #3429

pp-mo · 2019-10-01T10:35:36Z

In my opinion, there is far too much in cube.py, and particularly the Cube class.

I would like to see this minimised by removing purely "convenience" instance methods : By this, I mean things which don't rely (?much) on private class implementation details, so they could just as well be cast as independent operations, which would take the cube as an argument.
As 'utility' functions, these would then exist at the 'cube' package level or even in the generic iris.utils package.

In my mind, this means making a distinction between core operations which define what a cube "is" and are heavily concerned with the implementation, like add_dim_coord and __getitem__, and those which are secondary operations that just "use" cubes.
Certainly, if an operation can be rewritten to work entirely via the public Cube API, then it does not belong in the class. Surprisingly, even functions like 'transpose' may fall into this category.

Specifically, I think all of the following could potentially be moved outside of the Cube class:

aggregated_by
collapsed
interpolate
intersection
is_compatible
regrid
replace_coord
rolling_window
slices
slices_over
subset

There is another set which could also be addressed, but which are currently in-place operations (see also #3430). If they were rewritten as functions which always return a new cube, then these could also be "taken out" :

transpose :
rename
convert_units

There are some interesting parallels here with the numpy API.
For instance you could remove all those array instance methods, like array.astype, array.reshape, array.transpose etc, when you can instead just use np.astype(array, X) etc.
Most of those do simply redirect to static methods.
I have also seen a comment on the numpy project that they "wish" they had reduced the proliferation of unnecessary instance methods, but it is now too late !

@njsmith If we could go back in time then I'd probably argue for not having these methods on ndarray at all, and using functions exclusively ... Too bad we don't have a time machine :-)

The text was updated successfully, but these errors were encountered:

pp-mo · 2019-10-01T10:37:25Z

NOTE: I've labelled this for Iris 3.0, because it suggests a serious breaking change.
Of course, we won't be doing this in Iris 3. Maybe I should start up an Iris 4.0 milestone ??

rcomer · 2019-10-01T14:32:29Z

😨

I think I can see where you're coming from, but from a pure user perspective this might just look like "making us change our code for no good reason". I note that the comment you link specifically says that removing the numpy.ndarray methods is "impossible because it would break everyone's existing code that uses methods".

One big difference between the numpy and iris APIs is that all the numpy function are available in the top level package, so I am only ever 3 keystrokes (np.) away from the function I want to use. Iris has a proliferation of modules, making it less obvious where to find things. Having methods attached to the cube (and cubelist) makes them easy to find!

rcomer · 2019-10-01T16:00:31Z

Note that is has also previously been suggested that there is too much in iris.util.

pp-mo · 2019-10-01T18:18:12Z

I think I can see where you're coming from

TBH this is all about making the code tidier, which is basically for developers, not users !

A counter-example though is how hard it is to navigate the iris.cube docspage
That is pretty bloated, is it not ?!?

pp-mo · 2019-10-01T18:20:30Z

Though, re:

proliferation of modules

I do enjoy a good multilevel hierachy, don't you ? 😉

re:

there is too much in iris.util.

But that is really the same problem + I think supports my case, because that is just what's wrong with iris.util : it has no structure + is just a random collection of stuff.

rcomer · 2019-10-02T07:47:03Z

I do enjoy a good multilevel hierachy, don't you ? 😉

I couldn’t possibly comment 😆

I think it’s fair to say that a lot of functionality is not where you would put it if you were starting from scratch. But if we’re going to move things around, we need to consider that an awful lot of code has been written on top of Iris, so it should be as easy as possible to update said code to work with new versions. Perhaps consider creating tools to automate this? Like how 2to3 automatically changes next methods to functions.

rcomer · 2019-10-02T09:16:31Z

Or we could borrow the time machine from the numpy folks....

rcomer · 2019-10-02T09:45:30Z

NOTE: I've labelled this for Iris 3.0, because it suggests a serious breaking change.

BTW, are we still doing the deprecation thing? https://scitools.org.uk/iris/docs/latest/developers_guide/deprecations.html

pp-mo · 2019-10-02T22:16:53Z

are we still doing the deprecation thing?

Well, I still think so.
But I also fear that we have often been over-cautious in the past 😨 ,
when rigid rules have made it unreasonably hard to make simple changes for the better

For specific instance : did this issue today introduce a breaking change,
and will that actually matter to anyone ? It's a judgement call.

rcomer · 2019-10-03T08:45:20Z

did this issue today introduce a breaking change

that link isn't working for me 😕

pp-mo · 2019-10-03T11:53:24Z

link isn't working

Sorry, I fixed it, meant to link #3431.

The form [link text](#issue-num) does not work !

pp-mo · 2019-10-03T13:49:38Z

borrow the time machine from the numpy folks

😁
🐔 ⏲️ ⚙️ ⏳ ⚙️ 🐣

vsherratt · 2021-01-21T12:13:29Z

Can I just raise that this is a perfectly valid way to define a method:

def collapsed(self, coords, aggregator, **kwargs):
    pass  # actual contents of Cube.collapsed here...

class Cube:
    collapsed = collapsed

In fact, so is this (unless you've done something particularly egregious with metaclasses):

class Cube:
    pass

Cube.collapsed = collapsed

Basically there's nothing special about a def block inside a class block, or about naming a function argument self; the "magic" behind methods, ie automatically passing the caller instance in as the first argument, is inherent to the actual function type.

I think moving functions away from the class definition and into dedicated modules is a good idea regardless, 4000 lines is far too much for one file. So really I'd say the choices are:

Completely drop these functions from the Cube API, instead exposing them as functions across various modules - probably mostly iris.cube.* submodules, but I suppose some functions may fit better in other existing modules. As a breaking change, this would force a major version, and ideally deprecation warnings well beforehand.
Attach the functions as Cube methods, without putting the functions themselves in the public API. The change would purely be for developer convenience in dealing with more manageable files.
Expose both function and method options, similar to numpy. This is contentious. Certainly I wouldn't support writing a lot of one-line wrapper methods - that would be terrible for maintainability - but reusing the same function object in both situations is a strong option.

On one hand, it does break the "one obvious way to do something" guideline, and muddies the Cube namespace. Documentation could be more confusing, because sphinx will find and render exactly the same docstring in both cases; it will generate an appropriate function signature, but the hand-written argument lists and descriptions would either mention a cube argument that isn't there, or not mention one that is. Potentially addressable with a decorator that cleverly alters the docstring on a copy of the function, or just being more careful when writing them.

On the other, it doesn't force the user into either "always use methods" or "always use functions", nor force them to check all the time which to use because the package itself is inconsistent - they can choose whichever they prefer or fits in best with the rest of their project. Ambivalence on this choice of styles is arguably a good thing and a reasonable exception to the guideline, especially for numpy as such a core package, but just as applicable to iris.

A sub-option is to only do this temporarily, to provide an easier transition into the first option.

pp-mo added this to the v3.0.0 milestone Oct 1, 2019

pp-mo mentioned this issue Oct 1, 2019

Reduce in-place cube operations. #3430

Closed

pp-mo added Status: Decision Required Type: Question labels Oct 2, 2019

rcomer mentioned this issue Nov 6, 2020

New cube.grouped_by method? #3218

Closed

bjlittle modified the milestones: v3.0.0, v3.1.0 Feb 4, 2021

trexfeathers mentioned this issue Oct 5, 2021

Proposal: move iris.cube.Cube to iris.Cube #891

Open

bjlittle modified the milestones: v3.1.0, v3.3.0 Nov 1, 2021

trexfeathers mentioned this issue Jun 14, 2022

Review the Iris user experience #4799

Open

stephenworsley modified the milestones: v3.3.0, Candidate for next release Jun 29, 2022

SciTools locked and limited conversation to collaborators Sep 30, 2022

bjlittle converted this issue into discussion #5006 Sep 30, 2022

bjlittle removed this from the Candidate for next release milestone Sep 30, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

This issue was moved to a discussion.

Minimise the cube API #3429

Minimise the cube API #3429

pp-mo commented Oct 1, 2019 •

edited by rcomer

Loading

pp-mo commented Oct 1, 2019

rcomer commented Oct 1, 2019

rcomer commented Oct 1, 2019

pp-mo commented Oct 1, 2019

pp-mo commented Oct 1, 2019 •

edited

Loading

rcomer commented Oct 2, 2019

rcomer commented Oct 2, 2019

rcomer commented Oct 2, 2019

pp-mo commented Oct 2, 2019

rcomer commented Oct 3, 2019

pp-mo commented Oct 3, 2019

pp-mo commented Oct 3, 2019 •

edited

Loading

vsherratt commented Jan 21, 2021 •

edited

Loading

This issue was moved to a discussion.

This issue was moved to a discussion.

Minimise the cube API #3429

Minimise the cube API #3429

Comments

pp-mo commented Oct 1, 2019 • edited by rcomer Loading

pp-mo commented Oct 1, 2019

rcomer commented Oct 1, 2019

rcomer commented Oct 1, 2019

pp-mo commented Oct 1, 2019

pp-mo commented Oct 1, 2019 • edited Loading

rcomer commented Oct 2, 2019

rcomer commented Oct 2, 2019

rcomer commented Oct 2, 2019

pp-mo commented Oct 2, 2019

rcomer commented Oct 3, 2019

pp-mo commented Oct 3, 2019

pp-mo commented Oct 3, 2019 • edited Loading

vsherratt commented Jan 21, 2021 • edited Loading

This issue was moved to a discussion.

pp-mo commented Oct 1, 2019 •

edited by rcomer

Loading

pp-mo commented Oct 1, 2019 •

edited

Loading

pp-mo commented Oct 3, 2019 •

edited

Loading

vsherratt commented Jan 21, 2021 •

edited

Loading