Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up daemon handling in the tests #5753

Merged
merged 2 commits into from
Feb 23, 2023

Conversation

Ericson2314
Copy link
Member

@Ericson2314 Ericson2314 commented Dec 9, 2021

Motivation

init.sh is tested on its own. We used to do that. I deleted it in 4720853 but I am not sure why. Better to just restore it; at one point working on this every other test passed, so seems good to check whether init.sh can be run twice.

We don't need to run init.sh twice, but I want to try to make our tests as robust as possible so that manual debugging (where tests for better or worse might be run ways that we didn't expect) is less fragile.


Split common.sh into the vars and functions definitions vs starting
the daemon (and possibly other initialization logic). This way,
init.sh can just source the former. Trying to start the daemon
before nix.conf is written will fail because nix daemon requires
--experimental-features 'nix-command'.

killDaemon is idempotent, so it's safe to call when no daemon is
running.

startDaemon and killDaemon use the PID (which is now exported to
subshells) to decide whether there is work to be done, rather than
NIX_REMOTE, which might conceivably be set differently even if a
daemon is running.

startDaemon and killDaemon can save/restore the old NIX_REMOTE as
NIX_REMOTE_OLD.

init.sh kills daemon before deleting everything (including the daemon
socket).

Context

Checklist for maintainers

Maintainers: tick if completed or explain if not relevant

  • agreed on idea
  • agreed on implementation strategy
  • tests, as appropriate
    • functional tests - tests/**.sh
    • unit tests - src/*/tests
    • integration tests
  • documentation in the manual
  • code and comments are self-explanatory
  • commit message explains why the change was made
  • new feature or bug fix: updated release notes

@edolstra
Copy link
Member

edolstra commented Dec 9, 2021

What was the issue here? I re-run init.sh all the time without any problems.

@Ericson2314
Copy link
Member Author

@edolstra At one point everything was passing but init.sh. This has now grown a lot into fixing various issues. (I am about to push again.)

@Ericson2314 Ericson2314 changed the title Make init.sh safe to run twice Clean up daemon handling Dec 9, 2021
@Ericson2314 Ericson2314 marked this pull request as ready for review December 9, 2021 19:31
@Ericson2314
Copy link
Member Author

OK updated with lots more stuff. Curious if CI will agree with me, but seems good locally.

@Ericson2314
Copy link
Member Author

macOS build timed it out?

@abathur
Copy link
Member

abathur commented Dec 11, 2021

Probably the flaky build issue in #3605

@Ericson2314
Copy link
Member Author

Thanks

@Ericson2314
Copy link
Member Author

Rebased, hoping won't get that spurious error this time.

@Ericson2314
Copy link
Member Author

Yay it did!

Ping @edolstra and @regnat

@Ericson2314
Copy link
Member Author

@edolstra Does this look good to you? I would like to start pushing on this and other testing improvements again prior to getting back to fun feature work like RFC 92, so we have a firmer foundation before any more (potentially destabilizing) new feature.

Copy link
Contributor

@tomberek tomberek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test failures (repair.sh.test) are unrelated. Re-based onto master and functions as expected.

Copy link
Member

@thufschmitt thufschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I very much like the idea behind this.

As an alternative (maybe simpler, but also maybe stupid because I didn’t give it a lot of thoughs) scheme, how about

  • Making common.sh pure again (remove the startDaemon call from it)
  • Start the daemon in init.sh (and source it rather than execute it in mk/run_tests.sh so that the needed variables are properly propagated)
    ?

@Ericson2314
Copy link
Member Author

@thufschmitt I had thought about that, but I decided you (probably) had an intent of init.sh just preparing files and not running processes, and I decided that was a good thing :). I forget the exact reasons why.

@Ericson2314 Ericson2314 force-pushed the init-twice-works branch 2 times, most recently from bec7803 to 7a6a87e Compare February 28, 2022 17:08
@Ericson2314
Copy link
Member Author

@thufschmitt rebased!

@dpulls
Copy link

dpulls bot commented Feb 28, 2022

🎉 All dependencies have been resolved !

@Ericson2314 Ericson2314 changed the base branch from master to 0.5-release February 28, 2022 18:32
@Ericson2314 Ericson2314 changed the base branch from 0.5-release to master February 28, 2022 18:32
set -eu -o pipefail

# Don't start the daemon
source common/vars-and-functions.sh
Copy link
Member Author

@Ericson2314 Ericson2314 Feb 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thufschmitt to answer the question you asked on IRC a bit better, I didn't put the starting the daemon in init for these reasons:

  1. Running tests without recalling init should still work. if one wants to examine state: the test itself starts and stops the daemon, rather than something else starting and it stopping.
  2. init.sh merely does cleanup and setup of files. Besides removing stuff, it just creates new blank filesystem state. Stopping the daemon before removing the files is just for any rogue daemon, before it becomes harder to stop because the socket file was deleted.

The file is split because init.sh still needs the env vars.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes of course, it’s nice to be able to manually run init.sh once and then the test (I actually do that all the time)

@Ericson2314
Copy link
Member Author

@edolstra @thufschmitt approved. Is this OK with you?

@Ericson2314
Copy link
Member Author

@edolstra pinging again. It really would be good to merge this. It doesn't commit us to doing any more testing, it just makes the existing testing more robust.

@stale stale bot added the stale label Sep 21, 2022
@stale stale bot removed stale labels Jan 16, 2023
@Ericson2314 Ericson2314 changed the title Clean up daemon handling Make init.sh safe to run twice Jan 16, 2023
@Ericson2314
Copy link
Member Author

Ericson2314 commented Jan 16, 2023

I was inspired to return to this because #7600 has a devilishly hard to debug issue because it only cropped up in the daemon tests, and I was reminded how flaky that stuff is today.

In the process I found that the history had become quite messy, and so I am cleaning it up. This is not just the first commit in what was a few cleanups. So the previous reply history may talk about changes that are no longer in here. Ah I see that the original change probably didn't work on its own.

tests/init.sh Outdated Show resolved Hide resolved
@Ericson2314 Ericson2314 added the with-tests Issues related to testing. PRs with tests have some priority label Jan 16, 2023
@Ericson2314 Ericson2314 changed the title Make init.sh safe to run twice Clean up daemon handling Jan 17, 2023
@Ericson2314 Ericson2314 changed the title Clean up daemon handling Clean up daemon handling in the tests Jan 20, 2023
@thufschmitt thufschmitt self-assigned this Jan 23, 2023
Ericson2314 referenced this pull request in hercules-ci/nix Feb 7, 2023
@Ericson2314
Copy link
Member Author

@thufschmitt Small reminder on reviewing this :). @roberth and I last night discussed some good ideas to clean stuff up further, and avoid the profusion of support files (to which this PR sadly adds one more). But I think it is good to land this in the meantime; having the daemon start and stop logic be more robust will make further cleanups easier.

@roberth roberth mentioned this pull request Feb 10, 2023
14 tasks
Copy link
Member

@thufschmitt thufschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been stalled for too long, indeed. Let's merge!

@thufschmitt
Copy link
Member

Well, now there are some conflicts 🙃

`init.sh` is tested on its own. We used to do that. I deleted it in
4720853 but I am not sure why. Better
to just restore it; at one point working on this every other test
passed, so seems good to check whether `init.sh` can be run twice.

We don't *need* to run `init.sh` twice, but I want to try to make our
tests as robust as possible so that manual debugging (where tests for
better or worse might be run ways that we didn't expect) is less
fragile.
Split `common.sh` into the vars and functions definitions vs starting
the daemon (and possibly other initialization logic). This way,
`init.sh` can just `source` the former. Trying to start the daemon
before `nix.conf` is written will fail because `nix daemon` requires
`--experimental-features 'nix-command'`.

`killDaemon` is idempotent, so it's safe to call when no daemon is
running.

`startDaemon` and `killDaemon` use the PID (which is now exported to
subshells) to decide whether there is work to be done, rather than
`NIX_REMOTE`, which might conceivably be set differently even if a
daemon is running.

`startDaemon` and `killDaemon` can save/restore the old `NIX_REMOTE` as
`NIX_REMOTE_OLD`.

`init.sh` kills daemon before deleting everything (including the daemon
socket).
@Ericson2314
Copy link
Member Author

Thanks @thufschmitt. For some reason when I rebased --- no conflicts!

@roberth roberth merged commit b5bbf14 into NixOS:master Feb 23, 2023
@Ericson2314 Ericson2314 deleted the init-twice-works branch February 23, 2023 17:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
with-tests Issues related to testing. PRs with tests have some priority
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

6 participants