Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nixpkgs eval time is increasing too fast #320528

Open
superherointj opened this issue Jun 17, 2024 · 16 comments
Open

Nixpkgs eval time is increasing too fast #320528

superherointj opened this issue Jun 17, 2024 · 16 comments
Labels
0.kind: bug 0.kind: regression Something that worked before working no longer architecture Relating to code and API architecture of Nixpkgs infrastructure significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc.

Comments

@superherointj
Copy link
Contributor

superherointj commented Jun 17, 2024

At Mar 25, 2023, eval time was at 1m5s:
#221716 (comment)

Now, Jun 17, 2024, it is: 1m53,458s, same host, my host

CI is also getting slower.

@superherointj superherointj added 0.kind: bug infrastructure 0.kind: regression Something that worked before working no longer significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc. architecture Relating to code and API architecture of Nixpkgs labels Jun 17, 2024
@superherointj
Copy link
Contributor Author

superherointj commented Jun 17, 2024

Bisect

On bisect's 1st commit (18100dc) (50%)

image

Assuming it is bad:

Bisects 2 commit (83383ab) (25%)

image

Bisects 3 commit (9441fc2) (12.5%)

image

Bisects 4 commit (c8a59e2) (6.25%)

image

Bisects 5 commit (32ff037) (3.125%)

image

Bisect 6 commit (fd6717f) (1.562%)

image

...

Seems linear growth?

@superherointj superherointj changed the title Nixpkgs eval times are growing too fast Nixpkgs eval times is increasing too fast Jun 17, 2024
@superherointj superherointj changed the title Nixpkgs eval times is increasing too fast Nixpkgs eval time is increasing too fast Jun 17, 2024
@superherointj
Copy link
Contributor Author

superherointj commented Jun 18, 2024

Eval History

Command measured:

time nix-env --query --available --out-path --file ./. > /dev/null

release-24.05 (ff91973)

user 1m50,262s
user 1m48,587s
user 1m50,375s

release-23.11(f0d1115)

user 1m26,884s
user 1m26,785s

release-23.05 (9a333ea)

user 1m26,971s
user 1m26,688s

release-22.11 (ea4c80b)

user 0m59,795s
user 0m59,473s

release-22.05 (380be19)

user 0m43,354s
user 0m43,409s

release-21.11 (2766f77)

user 0m36,231s
user 0m35,882s

@Mic92
Copy link
Member

Mic92 commented Jun 18, 2024

Did the number of derivations that we build also grow linear over this time frame?
We also not just added new packages but also increased derivation count on existing ones i.e. importing Cargo.lock in buildRustPackage. I also heard that the use of finalAttrs can increase eval time, which is inline with other experiences / benchmarks that I made, fixed points computation like the modules and overlays are slow and we should avoid them when possible.

@superherointj
Copy link
Contributor Author

superherointj commented Jun 18, 2024

Did the number of derivations that we build also grow linear over this time frame?

Packages increase ~8k per release.

We also not just added new packages but also increased derivation count on existing ones i.e. importing Cargo.lock in buildRustPackage.

Also packages 'by-name' is dynamic.

I wonder the possibility of having an eval multi-core.

Or some other hack, 'by-name' could be generated and static, also optimized/cached. Would that help? Needs to be confirmed.

Lockfiles and it's bundled dependencies in nixpkgs being bad, yes.

It's hard to point to a single problem here.

This is only getting worse unless we do something. Question is what?

@Mic92
Copy link
Member

Mic92 commented Jun 18, 2024

This is only getting worse unless we do something. Question is what?

Reduce the size of nixpkgs

  • Removing packages marked as broken
  • Removing leaf packages without a maintainer
  • Removing inactive maintainer (already happens every once in a while)
  • Maybe we should start becoming more picky about packages that we accept in nixpkgs? -> RFC: Relevance criteria for packages accepted in nixpkgs

Make nixpkgs evaluation faster or at least don't regress

  • Make nix-eval-jobs not write out derivations so it's closer in speed per derivation like nix-env but also can use multiple threads. However don't expect linear speed-up with increasing threads since a lot of packages will be evaluated several times as we split of eval over several cores.
  • Port over remaining performance fixes from lix to nix (8% speed-up for "hello" in my micro-benchmark)
  • Better performance measurement tools for nixpkgs i.e. Tracy-based expr evaluation profiler nix#9967 to make our nix code faster and give better feedback to nix core developers to find out what is slow
  • More emphasize on performance when reviewing changes. I believe ofborg reports some performance metrics?
  • Add "evaluation time" to nixpkgs-review reports if evaluation was done locally.

@alyssais
Copy link
Member

We also not just added new packages but also increased derivation count on existing ones i.e. importing Cargo.lock in buildRustPackage.

cc @Ericson2314 since I know you're interested in improving this

@SuperSandro2000
Copy link
Member

  • More emphasize on performance when reviewing changes. I believe ofborg reports some performance metrics?

From what I've got told, those are usually a bit wonky unless and only round about.

@AndersonTorres
Copy link
Member

AndersonTorres commented Jun 18, 2024

Removing inactive maintainer (already happens every once in a while)

Self-promotion :)

#290642
#310759

Maybe we should start becoming more picky about packages that we accept in nixpkgs? -> RFC: Relevance criteria for packages accepted in nixpkgs

my two cents:

https://discourse.nixos.org/t/monorepos-dont-map-to-our-social-structure/44162/39

@SuperSandro2000
Copy link
Member

SuperSandro2000 commented Jun 18, 2024

I think we must do some profiling to know which lines are the worst hitters and where we need to optimize the most. Also it would be cool to compare, which parts got the biggest increase over time.

I think right now we are just guessing, it could be nix, it could be lib, it could be finalAttrs or meta or splicing or it could be anything else.

@superherointj
Copy link
Contributor Author

Parallel Nix evaluation
https://determinate.systems/posts/parallel-nix-eval/

@andrewhamon
Copy link
Contributor

Reduce the size of nixpkgs

I just want to point out that, for darwin at least, there has seen substantial eval regressions even for a single package. I brought this up in matrix and was told that its probably because the bootstrapping chain getting longer.

I'm sure that evaluation performance of all packages is very important for NixOS maintainers, but just want to add that even little people like me are feeling eval performance pain in narrower context, and that won't be fixed by removing unmaintained packages :)

@Mic92
Copy link
Member

Mic92 commented Jul 2, 2024

I also noticed that macOS takes significantly longer to evaluate the same nix-shell as on my Linux machine.

@Ericson2314
Copy link
Member

Yeah we need to do some profiling, the fact is no one has any idea why it is so slow. 10,000s of packages should not be this slow to evaluate!

@Atemu
Copy link
Member

Atemu commented Jul 14, 2024

I had a hunch Cargo.lock files might be causing problems. To test this theory, I stubbed their parsing by replacing

with "" and making all Cargo.lock files empty (fd Cargo.lock -X sh -c 'echo | tee {}'). I then tested before and after.

We spend about 30% of our eval time and 15% of our heap size on parsing/handling Cargo.lock files:

$ nix --version
nix (Lix, like Nix) 2.90.0-rc1

a62a011:

{
  "cpuTime": 76.9637680053711,
  "envs": {
    "bytes": 3409693504,
    "elements": 253308386,
    "number": 172903302
  },
  "gc": {
    "heapSize": 13862367232,
    "totalBytes": 26579119472
  },
  "list": {
    "bytes": 710862960,
    "concats": 16410139,
    "elements": 88857870
  },
  "nrAvoided": 204104994,
  "nrFunctionCalls": 156648818,
  "nrLookups": 79297894,
  "nrOpUpdateValuesCopied": 442157128,
  "nrOpUpdates": 21092997,
  "nrPrimOpCalls": 83293818,
  "nrThunks": 237313579,
  "sets": {
    "bytes": 9959639024,
    "elements": 586641136,
    "number": 35836303
  },
  "sizes": {
    "Attr": 16,
    "Bindings": 16,
    "Env": 8,
    "Value": 24
  },
  "symbols": {
    "bytes": 2368176,
    "number": 171078
  },
  "values": {
    "bytes": 7275923184,
    "number": 303163466
  }
}

real	1m22.266s
user	1m17.120s
sys	0m4.726s

a62a011 (stubbed):

{
  "cpuTime": 59.946327209472656,
  "envs": {
    "bytes": 2568382656,
    "elements": 190128834,
    "number": 130918998
  },
  "gc": {
    "heapSize": 12067205120,
    "totalBytes": 21380865600
  },
  "list": {
    "bytes": 635086560,
    "concats": 12757511,
    "elements": 79385820
  },
  "nrAvoided": 149374377,
  "nrFunctionCalls": 118551974,
  "nrLookups": 62084350,
  "nrOpUpdateValuesCopied": 367865584,
  "nrOpUpdates": 15322106,
  "nrPrimOpCalls": 60336171,
  "nrThunks": 176102963,
  "sets": {
    "bytes": 8241294736,
    "elements": 487038438,
    "number": 28042483
  },
  "sizes": {
    "Attr": 16,
    "Bindings": 16,
    "Env": 8,
    "Value": 24
  },
  "symbols": {
    "bytes": 2313257,
    "number": 169549
  },
  "values": {
    "bytes": 5733585480,
    "number": 238899395
  }
}


real	0m59.247s
user	0m55.028s
sys	0m3.957s
Delta side-by-side
Δ tmp/normal.json ⟶   tmp/stubbed.json
──────────────────────────────────────────────────────────────────────────────────────────────────

─────┐
• 1: │
─────┘
│  1 │{                                          │  1 │{
│  2 │    "cpuTime": 76.9637680053711,           │  2 │    "cpuTime": 59.946327209472656,
│  3 │    "envs": {                              │  3 │    "envs": {
│  4 │        "bytes": 3409693504,               │  4 │        "bytes": 2568382656,
│  5 │        "elements": 253308386,             │  5 │        "elements": 190128834,
│  6 │        "number": 172903302                │  6 │        "number": 130918998
│  7 │    },                                     │  7 │    },
│  8 │    "gc": {                                │  8 │    "gc": {
│  9 │        "heapSize": 13862367232,           │  9 │        "heapSize": 12067205120,
│ 10 │        "totalBytes": 26579119472          │ 10 │        "totalBytes": 21380865600
│ 11 │    },                                     │ 11 │    },
│ 12 │    "list": {                              │ 12 │    "list": {
│ 13 │        "bytes": 710862960,                │ 13 │        "bytes": 635086560,
│ 14 │        "concats": 16410139,               │ 14 │        "concats": 12757511,
│ 15 │        "elements": 88857870               │ 15 │        "elements": 79385820
│ 16 │    },                                     │ 16 │    },
│ 17 │    "nrAvoided": 204104994,                │ 17 │    "nrAvoided": 149374377,
│ 18 │    "nrFunctionCalls": 156648818,          │ 18 │    "nrFunctionCalls": 118551974,
│ 19 │    "nrLookups": 79297894,                 │ 19 │    "nrLookups": 62084350,
│ 20 │    "nrOpUpdateValuesCopied": 442157128,   │ 20 │    "nrOpUpdateValuesCopied": 367865584,
│ 21 │    "nrOpUpdates": 21092997,               │ 21 │    "nrOpUpdates": 15322106,
│ 22 │    "nrPrimOpCalls": 83293818,             │ 22 │    "nrPrimOpCalls": 60336171,
│ 23 │    "nrThunks": 237313579,                 │ 23 │    "nrThunks": 176102963,
│ 24 │    "sets": {                              │ 24 │    "sets": {
│ 25 │        "bytes": 9959639024,               │ 25 │        "bytes": 8241294736,
│ 26 │        "elements": 586641136,             │ 26 │        "elements": 487038438,
│ 27 │        "number": 35836303                 │ 27 │        "number": 28042483
│ 28 │    },                                     │ 28 │    },
│ 29 │    "sizes": {                             │ 29 │    "sizes": {
│ 30 │        "Attr": 16,                        │ 30 │        "Attr": 16,

──────┐
• 33: │
──────┘
│ 33 │        "Value": 24                        │ 33 │        "Value": 24
│ 34 │    },                                     │ 34 │    },
│ 35 │    "symbols": {                           │ 35 │    "symbols": {
│ 36 │        "bytes": 2368176,                  │ 36 │        "bytes": 2313257,
│ 37 │        "number": 171078                   │ 37 │        "number": 169549
│ 38 │    },                                     │ 38 │    },
│ 39 │    "values": {                            │ 39 │    "values": {
│ 40 │        "bytes": 7275923184,               │ 40 │        "bytes": 5733585480,
│ 41 │        "number": 303163466                │ 41 │        "number": 238899395
│ 42 │    }                                      │ 42 │    }
│ 43 │}                                          │ 43 │}

@Mic92 Mic92 mentioned this issue Jul 14, 2024
13 tasks
@Mic92
Copy link
Member

Mic92 commented Jul 14, 2024

Anyone up for a spring clean? NixOS/rfcs#180

@Atemu
Copy link
Member

Atemu commented Jul 14, 2024

So I just found out that the 20s delta we see here is caused by just ~300 packages. This means an average of 76.67ms for each Cargo.lock in Nixpkgs. Quite a lot if you ask me.

More details about the problem with Cargo.lock in Nixpkgs at #327063

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.kind: bug 0.kind: regression Something that worked before working no longer architecture Relating to code and API architecture of Nixpkgs infrastructure significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc.
Projects
None yet
Development

No branches or pull requests

10 participants