Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI lockfiles #4108

Merged
merged 30 commits into from
May 4, 2021
Merged

CI lockfiles #4108

merged 30 commits into from
May 4, 2021

Conversation

jamesp
Copy link
Member

@jamesp jamesp commented Apr 27, 2021

🚀 Pull Request

Description

Here's a proof of concept for adding locked dependencies and using them in the nox testing framework.

  • Lock files are found in requirements/ci/nox.lock. These are explicit conda environment files which means they do not require resolving, they are the resolved manifestation of the equivalent requirements/ci/py3X.yml. This means that they can be installed using conda create -n my_new_env --file requirements/ci/nox.lock/py3X-linux-64.lock.
  • noxfile.py updated to use these resolved environments and avoid rebuilding / updating until the lockfile changes. This should dramatically reduce the second time the nox test suite runs. It should probably also greatly reduce the time the first run takes too, as there is no need for conda to resolve or update the environment.
  • A github action workflow to generate new lockfiles every Saturday morning, and create a PR with the changed files. This can then be included/ignored/deleted at our leisure.

There's quite a lot in this, so whoever feels like reviewing, lets go through it together 👍

This should resolve #4105


Consult Iris pull request check list

@jamesp jamesp requested a review from bjlittle April 27, 2021 18:44
@jamesp jamesp changed the title Ci lock CI lockfiles Apr 27, 2021
@jamesp
Copy link
Member Author

jamesp commented Apr 27, 2021

You can see an example of the GH Actions run here: https://github.com/jamesp/iris/actions/runs/790009248

And the PR it produces here: jamesp#3

@bjlittle bjlittle self-assigned this Apr 28, 2021
@bjlittle
Copy link
Member

@jamesp Did you try using conda-lock with mamba in miniforge to revolve in realtime?

Or would that also struggle in cirrus-ci?

@bjlittle
Copy link
Member

We could also rationalise the requirements themselves, as conda-lock can resolve multiple yml files at once, which is super handy. i.e., the testing jobs don't need the documentation requirements, and I suspect that the minimal testing jobs don't need the optional requirements.

There are options to make this less bloaty, package wise... but maybe that can come later.

@jamesp
Copy link
Member Author

jamesp commented Apr 28, 2021

No I didn't but I'm sure that would work too and would be a viable alternative to this setup. But then you are still doing a resolve on every run of the CI.

We first need to decide if lockfiles belong in version control. If yes, this scheme provides a mechanism for keeping them up to date. If no, this scheme should be discarded. we could instead keep the lockfiles cached in CI to reduce the number of resolves

@trexfeathers
Copy link
Contributor

We first need to decide if lockfiles belong in version control. If yes, this scheme provides a mechanism for keeping them up to date. If no, this scheme should be discarded. we could instead keep the lockfiles cached in CI to reduce the number of resolves

I'm 100% in favour of the former and against the latter. As I said before, this can solve issues beyond CI - benchmarking being the example I'm aware of currently.

@jamesp
Copy link
Member Author

jamesp commented Apr 28, 2021

The pro for this approach is that we get a known, reproducible, test environment for every CI run until we explicitly update the lockfiles committed to git.

As a workflow, I was thinking somethign along the lines of:

  1. GH Action updates the lockfiles once a week and creates a PR. If all tests pass, we merge the PR and now tests run against latest dependencies.
  2. If tests fail on this PR (most likely being matplotlib image hash discrepancies) we CO the PR, fix the code & tests, commit and merge. Changes required due to dependency change are therefore isolated from other development changes.

@bjlittle
Copy link
Member

@jamesp How will this work for feature branches, that pull in additional or different requirements. Any thoughts?

@bjlittle
Copy link
Member

bjlittle commented Apr 28, 2021

@jamesp Interesting that there are graphical phash failures... what's changed?

Do you have scope and capacity to investigate this further?

@rcomer
Copy link
Member

rcomer commented Apr 29, 2021

there still seems to be an imagehash failure

That particular gallery example is missing bits (see #4110). I have no idea if the failure is related to that though.

Edit: now thinking it quite likely is that: #4110 (comment)

@jamesp
Copy link
Member Author

jamesp commented Apr 29, 2021

What do you want to do about the link-check failures, given we know there is a fix in #4104?

Given the output of the test above suggests the code is working, it's the same test that's failing,I would suggest, if everyone is comfortable, we ignore that here, and rebase #4104.

I'm not sure about the gallery failure, I'll take a look at the imagehash on 3.8 and see what the issue is. the hamming distance score suggests it's not far off.

@rcomer
Copy link
Member

rcomer commented Apr 29, 2021

@jamesp I think a change at matplotlib v3.4 has reinstated correct behaviour for that gallery example. See #4110

@jamesp
Copy link
Member Author

jamesp commented Apr 29, 2021

We should address that image hash in a separate PR, I think this one is about complete now.

@jamesp
Copy link
Member Author

jamesp commented May 4, 2021

What more do we need to do to get comfortable with merging this with the failing doc tests? We seem to be approaching a circular dependency across a number of PRs!

@trexfeathers
Copy link
Contributor

trexfeathers commented May 4, 2021

What more do we need to do to get comfortable with merging this with the failing doc tests? We seem to be approaching a circular dependency across a number of PRs!

@pp-mo also suggested we should briefly discuss this as a team. It's not just a matter of "does it work?" - it's a change in how we do things.

@trexfeathers
Copy link
Contributor

@jamesp don't get me wrong, I'm also very keen we get this over the line before we get an even bigger spider web of blocking problems.

@jamesp
Copy link
Member Author

jamesp commented May 4, 2021

Thanks @trexfeathers. Pushed commits to address those points above.

I agree this is a change to how things are done that should be discussed. Could @trexfeathers or @pp-mo setup a quick call for all those interested?

@rcomer
Copy link
Member

rcomer commented May 4, 2021

That anomaly log gallery failure also came up in my local testing but I ignored it because it wasn't previously failing here. The idiff output looked similar to the change shown at SciTools/test-iris-imagehash#40, except the colour bar was also affected.

For the square root failure, see #3993.

😔

@rcomer rcomer linked an issue May 4, 2021 that may be closed by this pull request
@jamesp
Copy link
Member Author

jamesp commented May 4, 2021

I've reverted the changes to imagerepo.json as they should be in a separate PR. Let's keep this one entirely focussed on the CI changes so that it can be easily reverted if needed.

Once this is merged as a priority we should fix:

  1. square root failure Failing test_square_root test #3993
  2. image hashes for galleries and test
  3. linkcheck failures.

Copy link
Contributor

@trexfeathers trexfeathers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can confirm this only includes the expected CI failures. Merging as agreed.

@trexfeathers trexfeathers merged commit f5ceb80 into SciTools:master May 4, 2021
@jamesp jamesp mentioned this pull request May 12, 2021
tkknight added a commit to tkknight/iris that referenced this pull request May 17, 2021
* master: (49 commits)
  Update CI environment lockfiles + Cartopy 0.19 changes (SciTools#4125)
  separate arg types from descriptions (SciTools#4100)
  Use assertArrayAllClose for sqrt test (SciTools#4118)
  Removed branch suffix (SciTools#4117)
  Corrected plot_anomaly_log_colouring for new Matplotlib linscale rules. (SciTools#4115)
  replace most recent hashes (SciTools#4112)
  Linkcheck update (SciTools#4104)
  CI lockfiles (SciTools#4108)
  Fix bug in gallery_tests runner (SciTools#4111)
  add logo to conda and pypi badges (SciTools#4088)
  pre-commit-ci update (SciTools#4092)
  Fix cirrus ci and mpl (SciTools#4087)
  Update readme docs stable (SciTools#4089)
  linux kernel version fix for conda 4.10+ (SciTools#4084)
  update docs pypi instructions (SciTools#4077)
  update flake8 pre-commit (SciTools#4067)
  Add GitHub discussions badge (SciTools#4070)
  conda requirements add pip (SciTools#4062)
  add pre-commit.ci badge and doc support (SciTools#4061)
  cirrus-ci credits for non-master (SciTools#4060)
  ...
tkknight added a commit to tkknight/iris that referenced this pull request May 26, 2021
* master: (87 commits)
  Contourf Antialias Fix (SciTools#4150)
  minor tidy of cirrus and nox (SciTools#4152)
  update bug and feature templates (SciTools#4147)
  Update CI environment lockfiles + Cartopy 0.19 changes (SciTools#4125)
  separate arg types from descriptions (SciTools#4100)
  Use assertArrayAllClose for sqrt test (SciTools#4118)
  Removed branch suffix (SciTools#4117)
  Corrected plot_anomaly_log_colouring for new Matplotlib linscale rules. (SciTools#4115)
  replace most recent hashes (SciTools#4112)
  Linkcheck update (SciTools#4104)
  CI lockfiles (SciTools#4108)
  Fix bug in gallery_tests runner (SciTools#4111)
  add logo to conda and pypi badges (SciTools#4088)
  pre-commit-ci update (SciTools#4092)
  Fix cirrus ci and mpl (SciTools#4087)
  Update readme docs stable (SciTools#4089)
  linux kernel version fix for conda 4.10+ (SciTools#4084)
  update docs pypi instructions (SciTools#4077)
  update flake8 pre-commit (SciTools#4067)
  Add GitHub discussions badge (SciTools#4070)
  ...
@rcomer rcomer mentioned this pull request Jul 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gallery: wind plot missing lake features Replace conda update / conda cache in cirrus with conda-lock
5 participants