Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buffrs library pulls in a lot of dependencies that are not needed. #143

Open
xfbs opened this issue Oct 20, 2023 · 4 comments
Open

Buffrs library pulls in a lot of dependencies that are not needed. #143

xfbs opened this issue Oct 20, 2023 · 4 comments
Assignees
Labels
complexity::high Issues or ideas that are highly complex. require discussion and may affect backwards compatibility component::cli Everything related to the buffrs cli type::idea Rough idea or proposal for buffrs type::refactoring Changing the inner-workings of buffrs

Comments

@xfbs
Copy link
Contributor

xfbs commented Oct 20, 2023

buffrs is two things at once

  • a library which is used to build things, using buffrs::include!()
  • a binary which is used to setup and manage buffrs projects

The unfortunate side effect of that is that buffrs comes with a lot of dependencies, some of which are not needed when using it as a library.

A common pattern here is to do one of two things:

  • Split buffrs into multiple crates. For example, there could be a buffrs or buffrs_core crate which implements the library side of things, and a buffrs-cli crate which implements the CLI. We can still name the binary it produces buffrs if the did call this one buffrs-cli.
  • Use features. For example, we could add a cli feature to buffrs, mark the binary as being conditional on the presence of this feature, and only enable certain things (such as miette's fancy feature) when this feature is present.

Doing this would also help us in some way by having an explicit error boundary, where we could use structured errors on the library side and unstructured ones on the binary side, as currently the two are a bit mixed.

This is not really high-priority, but something to think about.

@xfbs xfbs self-assigned this Oct 20, 2023
@mara-schulke
Copy link
Contributor

@xfbs we already make use of cargo features to only inline the bare minimum of features for each target. Maybe we can improve them, but i dont think we need to split crates here?

@mara-schulke mara-schulke added component::cli Everything related to the buffrs cli complexity::high Issues or ideas that are highly complex. require discussion and may affect backwards compatibility type::refactoring Changing the inner-workings of buffrs type::idea Rough idea or proposal for buffrs labels Oct 21, 2023
@j-baker
Copy link
Collaborator

j-baker commented Nov 30, 2023

I'm with @xfbs on this. I'm trying to use buffrs, and I'm finding it's pulling in (in my cargo build):

libz-sys
libssh-sys
libgit2-sys

Each of these adds a few seconds to my compile, are duplicated due to buffrs being a build and normal dependency, and worse, the requirements for rebuilding are highly frequent on environment changes. Essentially, running 'cargo clippy' from a vscode shell and then running it from iterm is enough to cause these deps to require a full rebuild (20 seconds overall) whereas without them, the build would only require compiling my crates.

My understanding is that it's not easy to have a dependency be default included with a binary target, but optional if depended on as a lib.

At present, I think I can work around this by depending on buffrs as:

buffrs = { version = "0.7.2", default_features = false, features = ["build"]}

or similar. But it feels unfortunate that using buffrs as documented needlessly slows down your development, and my experience is that crate boundaries are the correct way to ensure appropriate dependency choice. In particular, I think the thing that is being traded off here is:

  • for a cli, you basically can use any dependency you want, because it's compile once, use many.
  • for a library, you want to use the absolute minimum set of dependencies, because it can add requirements for usage which aren't desirable. For some examples, libgit2 and libssh add dependencies on a native libz being present, clap adds a very recent MSRV, etc.

It's unsurprising to me that buffrs is accidentally slowing down cargo builds given that most folks developing on it are devving on the cli.

@xfbs
Copy link
Contributor Author

xfbs commented Nov 30, 2023

@j-baker, I think I had identified the similar issues as you. Even thought we already do some feature-gating as @mara-schulke suggested, we might have a fundamental issue right now.

Problem definition

Under the same buffrs namespace, we want to house two relatively different things:

  • the buffrs library, which should be as bare-bones as possible (only compiles protobuf; very minimal, carefully-chosen dependencies because this is rebuilt frequently),
  • the buffrs CLI, which should be nice (using clap; having a lot of features out-of-the-box, err on the side of having more features and dependencies because they only need to be built once).

In other words: the default feature (and thus dependency) sets of these two use-cases are different.

Ideally, you want to be able to add this to your project's dependencies and be sure to get buffrs with the minimal feature set you need to use it productively (aka, it has everything it needs to compile protocol buffer files, and nothing more):

[dependencies]
buffrs = "1.2.3"
We can actually kind of split the library use-case into two

We can actually even split the library use-case into two separate use-cases:

  • Using the library in a builds.rs script, where we actually build protocol buffer stuff
  • Using the library in the dependencies to use the buffrs::include! macro

So, a usage might actually look like this:

[dependencies]
# literally only the `include` macro?
buffrs = "1.2.3"

[build-dependencies]
# enable protobuf dependencies?
buffrs = { version = "1.2.3", features = ["build"] }

At the same time, we want to have a CLI which is fairly comprehensive. It might depend on a lot of nice crates, such as clap, miette with the fancy feature (which the library does not need), validation support, and some other niceties out-of-the-box.

With the current setup, we cannot easily achieve this. Currently, even using buffrs as a library pulls in dependencies that are only needed by the CLI. These are both the same crate, dependencies are on a crate-level and not a target level in Rust.

As far as I understand, we have two approaches to make this work, I will list both with their respective implications:

Solution space

Feature-gating the CLI

The idea here:

  1. make all the stuff that the CLI needs to build optional
  2. introduce a feature cli which turns on all of those dependencies
  3. tell Cargo to only build the CLI when this feature is enabled.

This is how we would tell Cargo:

[features]
# ...
cli = ["clap", "git2", "..."]

[[bin]]
name = "buffrs"
required-features = ["cli"]

Now, when a project uses buffrs as dependency it is very light-weight (no git2, no clap, etc). At the same time, we are free to add nice dependencies gated by the cli target, and the CLI is only built when we enable this target.

The downside of this approach is that it makes installing the CLI counter-intuitive: instead of being simply able to cargo install buffrs, you need to manually enable the CLI feature, because otherwise (from Cargo's perspective) the buffrs CLI does not exist:

cargo install buffrs --features cli

This might be a bit annoying, because when you forget to add the cli feature, it does not actually install anything.

And we probably (?) also don't want to add cli to the default-features set, because that makes it very easy for users for buffrs (as a library) to forget to turn that off (and potentially frustrate them, without them understanding why it builds so many dependencies).

Splitting crates

The other option is to have a separate CLI crate.

  1. Make everything in buffrs that is non-essential as optional.
  2. Create a buffrs-cli crate in the repo, which depends on buffrs and enables all optional features by default, plus has additional features only it needs (clap).
  3. Set this buffrs-cli crate to produce a buffrs binary

So we basically add a folder called cli/ in the repo with the following manifest:

[package]
name = "buffrs-cli"
# ...

[[main]]
name = "buffrs" # override name of binary
path = "src/main.rs"

This package will then still export a buffrs binary. It can have all of the nice dependencies we need (and maybe enable some extra dependencies on the buffrs library, such as support for git2).

This would mean that installing the CLI would look like this:

cargo install buffrs-cli

This is a small change, but I feel that this might be the easiest and "future-proof".

Conclusion

Let's maybe get @mara-schulke's input on this, to see if any of the ideas here make sense. This is not urgent, but I think if we put a little thought into this we can definitely achieve some rather large improvements.

@qsantos
Copy link
Contributor

qsantos commented Dec 21, 2023

To add my two cents, I have the same observations. My own preference goes strongly to splitting Buffrs into separate crates, especially since there is already a separate crate for the registry.

My reasoning is that crates are much easier to discover than features within a crate. They also enable (more) parallel compilation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
complexity::high Issues or ideas that are highly complex. require discussion and may affect backwards compatibility component::cli Everything related to the buffrs cli type::idea Rough idea or proposal for buffrs type::refactoring Changing the inner-workings of buffrs
Projects
None yet
Development

No branches or pull requests

4 participants