Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reuse input lock files #7730

Open
roberth opened this issue Feb 1, 2023 · 21 comments
Open

Reuse input lock files #7730

roberth opened this issue Feb 1, 2023 · 21 comments
Assignees
Labels
feature Feature request or proposal flakes idea approved The given proposal has been discussed and approved by the Nix team. An implementation is welcome.

Comments

@roberth
Copy link
Member

roberth commented Feb 1, 2023

(was: Solve dev(/test) dependencies and bloated, diverging lock files, by reconsidering the flattened lock file)

Is your feature request related to a problem? Please describe.

We've previously identified issues with the way the lock file currently works.

  • Irrelevant development dependencies are added to the lock files of all dependent flakes
  • Mututally dependent flakes cause diverging lock files as each lock file update incorporates a new version with a chain of irrelevant dependencies.
    • issue?
  • Some users just complain about the size of their lock file. Partly pedantic, but they're actually kind of right:
  • Ideally, lock file changes are reviewable, so that nothing weird creeps in through contributions that have such updates.

Describe the solution you'd like

We may consider two scenarios; one with follows and one without. Let's start without follows.

Observing that we do not exploit the apparent benefit of knowing all inputs upfront, we may consider loading the required parts of dependency lockfiles on demand.

Example scenario

  • Dependencies . -> foo -> nixpkgs
  • No follows.

Evaluation order, current situation:

  • need packages.default
  • need inputs.foo.packages.foo
  • read foo details from ./flake.lock
  • fetch foo
  • need foo.inputs.nixpkgs
  • read nixpkgs_1 details from local lock file
  • fetch nixpkgs_1

Same evaluation, partial lock file

  • need packages.default
  • need inputs.foo.packages.foo
  • read foo details from local lock file
  • fetch foo
  • need foo.inputs.nixpkgs
  • read nixpkgs details from foo/flake.lock
  • fetch nixpkgs

As you can see, the fetching characteristics remain the same.

Let's reintroduce follows, and first consider the example:

inputs.foo.nixpkgs.follows = "nixpkgs";

In other words, this injects the local nixpkgs dependency into foo. The lazy fetching behavior remains the same. This injection should be handled by call-flake.nix and does not require changes to the lock file.

Now consider the opposite follows:

inputs.nixpkgs.follows = "foo.nixpkgs";

In this case, if we were not to include a copy of foo.nixpkgs details in the local lock file, we'd have a slightly worse lazy fetching behavior, as nixpkgs would only be fetchable after foo's lock has been fetched. For this case, we do want "flat lock file" behavior, but limited to this dependency. The rule for which transitive dependencies to include in the lock file might be as simple as direct inputs + inputs referenced in follows.

This suggests the following implementation strategy:

  • When generating a lock, apply the above rule to determine which flakes to traverse and include in the lock file
  • When calling a flake, call it such that its inputs are those from its own lock // those set in the current lock
    • Lazy trees #6530 already adds an overrides parameter in call-flake.nix that appears to enable this. The missing bit is that some/many nodes would be created by reinvoking call-flake.nix rather than sharing the node identifiers. Node identifiers are local to each call-flake.nix call. i.e. overrides is a prerequisite for calling dependency locks.

Describe alternatives you've considered

Mark development dependencies explicitly.

  • This is more configuration which is generally not good.
  • This adds more corner cases and error messages.
  • This does not solve the problem of having mutually recursive development dependencies. Note that we can barely manage to make builds not mutually recursive, but flakes encompass more than builds, so expecting flakes not to be mutually recursive is not realistic.

Additional context

Priorities

Add 👍 to issues you find important.

@roberth
Copy link
Member Author

roberth commented Jun 12, 2023

I quite like the syntax @mkenigs used in #4004 to talk about overrides

A (override B.C.D.E) -> B (override C.D)-> C -> D -> E

It is also a useful vehicle for talking about the locking algorithm, which seems that it would recurse into dependencies until the overrides are (), and would only record the inputs that are not already locked or overridden somewhere.

Similarly, it would probably reflect the overrides parameter in call-flake.nix. This parameter does not exist in master yet, and lazy-trees defines it as overrides on sources. It would have to become a structure representing overrides containing invoked flakes rather than source inputs, and it perhaps need a somewhat richer structure to represent deep overrides.

@thufschmitt thufschmitt added the idea approved The given proposal has been discussed and approved by the Nix team. An implementation is welcome. label Jul 31, 2023
@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-07-31-nix-team-meeting-minutes-76/31486/1

@edolstra
Copy link
Member

@tomberek My thought on the new lock file format: It should store mappings from unlocked input paths to locked refs, and nothing more, and it shouldn't store input paths that are already locked by the lock files of input flakes.

I.e. something like

{
  "version": 8,
  "locks": {
    "nixpkgs": {
      "original": { ... original ref from flake.nix ... }
      "locked": { ... the immutable lock ... }
    },
    "nix": {
      "original": { ... original ref from flake.nix ... }
      "locked": { ... the immutable lock ... }
   },
   # note: since the "nix" input has its own lock file, we don't store the locks of the "nix" input here, unless there are overrides
   # e.g. if there is a (non-follows) override "nix.nixpkgs", or if "nix" does not have a lock file:
   "nix.nixpkgs": {
     "original": ...,
     "locked": ...
   };
  }
} 

Thus, if we encounter an unlocked input a.b.c in flake.nix, we look up locks."a.b.c" in flake.lock. If it doesn't exist (or if its "original" ref doesn't match), we resolve the unlocked ref and add/replace the lock file entry a.b.c.

We don't create lock file entries for "follows" inputs (e.g. if "a.b" follows "c.d", we don't create a lock file entry for "a.b"). In general, we don't store any information in the lock file that is already in flake.nix (except for the "original" refs, which are needed to decide whether we need to update a lock file entry if the corresponding unlocked entry has changed).

Note that the above replaces the graph structure of the current lock file format with a simple tree.

@roberth
Copy link
Member Author

roberth commented Aug 24, 2023

Note that the above replaces the graph structure of the current lock file format with a simple tree.

  • current graph: everything, ready to for a mapAttrs and fix without doing a recursive call-flake.nix - only one set of nodes
  • proposed tree: just the impure flakerefs in the current flake's inputs. call-flake.nix does something similar to the current call-flake.nix, but does so for individual flakes with their inputs, without the unnecessary concept of nodes.

To make follows possible, call-flake.nix needs to accept input overrides as a parameter. Without such a parameter, an input A would only see its own inputs, because call-flake.nix reinvoked itself with A's lock file instead of the user's lock file.

@roberth roberth added this to the Flakes milestone Aug 30, 2023
@tomberek
Copy link
Contributor

tomberek commented Sep 19, 2023

From NixCon, in discussion with @lheckemann , @infinisil , @roberth

  • established an example where behavior differs. When an immediate dependency moves or is no longer available (eg: authorization), flakes it depends on are not accessible. This means something like inputs.A.inputs.B.inputs.C.packages.SYSTEM.thing would fail, where with a full-graph would allow for accessing C's outputs even if B was unavailable. So in this particular sense we lose this style of laziness. This is an edge-case and likely not worth basing decision-making on.
  • Simpler implementation is highly preferable.
  • Need a strategy to still support older lockfiles? (for how long? until stabilization?)
  • Need for a cache of resolved lockfile information
    • can we use existing git cache?
    • reuse eval cache sqlite?
    • another new cache?
  • include a strict-lock option?
  • follows and overrides information now only resides in the flake.nix (split source of truth to define the graph, but it ensures that there is only one place the information resides)

@mkenigs
Copy link
Contributor

mkenigs commented Sep 19, 2023

One potential consideration for the new format:

After @lheckemann's talk I was thinking about a locking algorithm that would handle something like this:
flake A's flake.nix:

inputs.B.url = "github...";
inputs.B.useLock = false;

inputs.C.url = "github...";
inputs.C.useLock = false;

Suppose B and C both depend on nixpkgs. Then locking flake A would store a single revision for nixpkgs and store it in flake A's flake.lock. This is very similar to how language dependency managers like cargo make a distinction between lib and binary dependencies.

It might be possible to express the same thing in the current format, by doing e.g.:

inputs.nixpkgs = "github...";
inputs.B.inputs.nixpkgs.follows = "nixpkgs";
inputs.C.inputs.nixpkgs.follows = "nixpkgs";

But I think that's more difficult, and in practice it seems people often don't add all of the follows they could. It seems like useLock might be a more likely way to cut down on lockfile bloat.

Implementing useLock would require storing the entirety of the dependency graph of the child in the parent's flake.lock. That's possible in the current format. It's not possible with the format currently proposed in this issue, although I think it could be a natural extension (i.e. useLock could imply the strict-lock option @tomberek mentioned)

@lheckemann
Copy link
Member

It should in fact be possible with the proposed format, and would be represented on the lockfile side in much the same way as the follows approach.

Note also that ignoring downstream locks would only result in the same nixpkgs being used if the downstream inputs use the same nixpkgs input spec. I hope we can also find an approach (though it need not be implemented inside Nix) that works for replacing references that use different branches, or even different repositories or input types.

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-09-18-nix-team-meeting-minutes-87/33194/7

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-09-18-nix-team-meeting-minutes-87/33194/8

@roberth
Copy link
Member Author

roberth commented Sep 20, 2023

would be represented on the lockfile side in much the same way as the follows approach.

Yes, except follows doesn't have to be in the lockfile at all. We expect that the inputs declarations in flake.nix can be the single source of truth. Same for features that behave like follows.

@mkenigs
Copy link
Contributor

mkenigs commented Sep 20, 2023

This would have to be in the lockfile I think, because it's not present in flake.nix as either a follows or an override (it's a bit more like an implicit follows). Adapting what @edolstra has above, it could be expressed as:

  "locks": {
   "B.nixpkgs": {
     "original": ...,
     "locked": ...
   };
   "C.nixpkgs": {
     "original": ...,
     "locked": ...
   };
  };

That would result in more duplicated information than the current format.

@roberth
Copy link
Member Author

roberth commented Sep 20, 2023

(it's a bit more like an implicit follows).

An implicit follows is an override just as well.


Also responding to your previous comment:

Implementing useLock would require storing the entirety of the dependency graph of the child in the parent's flake.lock. That's possible in the current format. It's not possible with the format currently proposed in this issue

It is actually possible, by storing the whole graph as overrides. useLock would cause call-flake.nix to look for, and apply those overrides.

So such a feature does not conflict with the proposal to reuse input lock files without copying.

although I think it could be a natural extension (i.e. useLock could imply the strict-lock option @tomberek mentioned)

Yes, except applied to part of the dependency tree. It's even more similar to what we do for inputs that have no lock file.

For useLock to work well, it would be nice to have inputs that are "fixed" or "dev only" or "private" - ones that aren't affected by useLock because that wouldn't achieve anything.

This can all be added later though.

@mkenigs
Copy link
Contributor

mkenigs commented Sep 20, 2023

It is actually possible, by storing the whole graph as overrides. useLock would cause call-flake.nix to look for, and apply those overrides.

I think I may be misunderstanding:

proposed tree: just the impure flakerefs in the current flake's inputs

Which I took to mean inputs/overrides/follows mentioned in flake.nix. But if storing the whole graph as overrides, including inputs that aren't in any way present in flake.nix, is possible, then that's great.

This can all be added later though.

👍 for sure! If I do take a crack at implementing useLock, should I do that using the existing format, or do the changes mentioned in this issue need to happen first?

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/sane-stable-stateless-nixos-setup/35568/3

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/flake-lock-inadequate/37753/12

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/flake-lock-inadequate/37753/13

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nix-flake-inputs-not-lazy/25463/11

@szlend
Copy link
Member

szlend commented Feb 27, 2024

Another issue with the current lock file is that it causes nix to needlessly re-evaluates equivalent inputs.

For example if we have two inputs A and B that both depend on the same revision of nixpkgs (equivalent narHash), then nixpkgs will be evaluated twice. This can be worked around by using B.inputs.nixpkgs.follows = "A/nixpkgs".

However if you want to take advantage of binary caching, then using follows is not an option. If A.nixpkgs and B.nixpkgs at some point diverge because A and B aren't always updated at the same time, then using follows on B.nixpkgs will cause it to miss the binary cache.

So if you want to take advantage of binary caching, there's no practical way to avoid re-evaluating inputs.

@szlend
Copy link
Member

szlend commented Mar 14, 2024

I looked into fixing the performance implications of this (without touching the lock file format) in #10218.

infinisil added a commit to NixOS/nixfmt that referenced this issue Mar 14, 2024
- Simplifies the whole thing due to not having to mess with `system`, see NixOS/nix#3843
- Makes the lockfile _way_ smaller, see NixOS/nix#7730
infinisil added a commit to NixOS/nixfmt that referenced this issue Mar 14, 2024
- Simplifies the whole thing due to not having to mess with `system`, see NixOS/nix#3843
- Makes the lockfile _way_ smaller, see NixOS/nix#7730
@roberth
Copy link
Member Author

roberth commented Mar 21, 2024

@szlend Thank you for looking into this.

I think the solution in #10218 and in this proposal are somewhat incompatible, because #10218 would require fetching all lock files all the time. It seems that laziness could be recovered with #8025 though, so I'm not too worried.

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/monorepos-dont-map-to-our-social-structure/44162/18

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Feature request or proposal flakes idea approved The given proposal has been discussed and approved by the Nix team. An implementation is welcome.
Projects
Status: Defined work
Status: 🏁 Review
Development

No branches or pull requests

8 participants