Features are always resolved at workspace level #66

Open
Shnatsel opened this issue Aug 9, 2022 · 13 comments
Labels
bug Something isn't working third party Work item for a third-party dependency

Comments

@Shnatsel
Member

Shnatsel commented Aug 9, 2022

We use cargo metadata, so we are affected by this issue: rust-lang/cargo#7754
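To illustrate why workspace-level resolution can over-report features, here is a minimal sketch. It is a toy model, not Cargo's algorithm: the member names `cli` and `server`, the feature names, and both helper functions are hypothetical. It only contrasts "union over every workspace member" with "union over the members actually being built".

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Features each workspace member requests from one shared dependency.
/// Keys are hypothetical workspace member names.
type Requests = BTreeMap<&'static str, BTreeSet<&'static str>>;

/// Workspace-level view: the union of what every member asks for.
fn workspace_level(requests: &Requests) -> BTreeSet<&'static str> {
    requests.values().flatten().copied().collect()
}

/// Per-build view: only what the selected members ask for.
fn for_selected(requests: &Requests, selected: &[&'static str]) -> BTreeSet<&'static str> {
    selected
        .iter()
        .filter_map(|name| requests.get(name))
        .flatten()
        .copied()
        .collect()
}

fn main() {
    let mut requests = Requests::new();
    requests.insert("cli", BTreeSet::from(["derive"]));
    requests.insert("server", BTreeSet::from(["derive", "rc"]));

    // Resolving for the whole workspace reports both features...
    println!("workspace: {:?}", workspace_level(&requests));
    // ...even though a build of `cli` alone would only enable one of them.
    println!("cli only:  {:?}", for_selected(&requests, &["cli"]));
}
```

In this model, an SBOM derived from the workspace-level set for a `cli`-only build would list a feature that was never compiled in.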

@Shnatsel Shnatsel added third party Work item for a third-party dependency bug Something isn't working labels Aug 9, 2022
@sunshowers

sunshowers commented Dec 21, 2024

Hey!

I'm currently working through using cargo-auditable and/or cyclonedx to generate an SBOM, and I've run into this general class of issues.

To address them, I'm planning to hook up the Cargo build simulations I wrote within guppy. These build simulations are, to my knowledge, correct for both the v1 and v2 resolvers, as validated by extensive randomized testing against Cargo itself. (They're also the foundation of cargo-hakari.)

The idea is that guppy's build simulations will produce the exact set of packages and features that a particular cargo build command would run, even if the build is expected to be run on a host platform that differs from the current platform. (So, for example, you can see the set of packages and features that would be built if you did a native build on x86_64-unknown-linux-gnu, even if you're on Windows or macOS.)

I'm planning to maintain a fork for now. Would there be interest in upstreaming it?

@sunshowers

sunshowers commented Dec 21, 2024

I'm also curious what your thoughts are about cargo-auditable vs cargo-cyclonedx. We're currently evaluating which approach to go with (data stored in the binary vs a file on the side) -- it seems like both have benefits and costs.

@Shnatsel
Member Author

Thank you for reaching out! Yes, this is interesting. I have also been looking into the krates crate, which lets one write custom traversal functions over the crate graph and replicate resolver v2 fairly easily that way.

I am wary of introducing custom reimplementations of complex and loosely documented algorithms, but extensive randomized testing gives me enough confidence in its correctness.

My concern is that Cargo is about to roll out the MSRV-aware resolver v3, which is going to ship in Rust 1.85 in February 2025. This is the same release that will stabilize the 2024 edition and make the MSRV-aware resolver the default. So whatever algorithm we use to refine cargo metadata output would also have to replicate resolver v3.

TL;DR: I would be happy to switch to guppy if it also supports the upcoming resolver v3. And I guess we'd still have to maintain a raw cargo metadata fallback for future resolver versions.

@Shnatsel
Member Author

Regarding cargo-auditable vs cargo-cyclonedx: I've worked on both tools. "both have benefits and costs" just about sums it up.

@sunshowers

sunshowers commented Dec 21, 2024

Thanks for the response!

Resolver v3 only affects the part of the dependency resolver that finds which versions to resolve to. Feature resolution is the same as v2, so no code changes are required. I'll try pushing an update to guppy to add the v3 resolver as an option, which works the same as the v2 resolver.

I would definitely be careful about trying to reimplement the resolver yourself. While building guppy's Cargo resolution I found a number of edge cases. For example:

  • the build-dependencies table is only taken into account if a build.rs is present
  • proc macros are right at the edge of host and target dependencies, so need to be handled carefully. Guppy provides a couple of knobs to handle a few different scenarios around them. Most of them won't be relevant for SBOM generation though.
  • To handle weak features, guppy uses a custom spin on a depth-first search called a "buffered filtered DFS".
  • There's also all the usual bits folks are familiar with in Cargo, such as the fact that dependency cycles are permitted due to dev-dependencies.
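Two of these edge cases (build-dependencies only counting when a build script exists, and cycles introduced through dev-dependencies) can be sketched on a toy graph. This is not guppy's code or API: the `Kind`, `Package`, and `reachable` names and all package names are hypothetical. Following dev-dependency edges only from the root is a simplifying assumption of this sketch.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Clone, Copy)]
enum Kind {
    Normal,
    Build,
    Dev,
}

/// A toy package record; real metadata carries far more information.
struct Package {
    has_build_script: bool,
    deps: Vec<(Kind, &'static str)>,
}

type Graph = BTreeMap<&'static str, Package>;

/// Collect the packages reachable from `root`.
/// The `seen` set makes the walk terminate even when dev-dependencies form a cycle.
fn reachable(graph: &Graph, root: &'static str, include_dev: bool) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![root];
    while let Some(name) = stack.pop() {
        if !seen.insert(name) {
            continue; // already visited, e.g. via a dev-dependency cycle
        }
        let Some(pkg) = graph.get(name) else { continue };
        for &(kind, dep) in &pkg.deps {
            let follow = match kind {
                Kind::Normal => true,
                // Build-dependencies are ignored unless a build script exists.
                Kind::Build => pkg.has_build_script,
                // Assumption of this sketch: dev edges only matter for the root.
                Kind::Dev => include_dev && name == root,
            };
            if follow {
                stack.push(dep);
            }
        }
    }
    seen
}

fn main() {
    let mut graph = Graph::new();
    graph.insert("app", Package {
        has_build_script: false,
        deps: vec![(Kind::Normal, "lib"), (Kind::Build, "unused-build-dep"), (Kind::Dev, "test-utils")],
    });
    graph.insert("lib", Package { has_build_script: true, deps: vec![(Kind::Build, "codegen")] });
    // `test-utils` depends back on `app`, forming a cycle through the dev edge.
    graph.insert("test-utils", Package { has_build_script: false, deps: vec![(Kind::Normal, "app")] });

    println!("{:?}", reachable(&graph, "app", false));
    println!("{:?}", reachable(&graph, "app", true));
}
```

Weak features and the host/target split for proc macros are deliberately left out; those are the parts where a hand-rolled traversal is most likely to diverge from Cargo.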

The resolver lives in
https://github.com/guppy-rs/guppy/blob/main/guppy/src/graph/cargo/build.rs. It's relatively short, but that's only because it's built on a lot of the more generic scaffolding that guppy provides.

Interestingly, in comparison testing I've found that guppy is many times faster than Cargo at delivering the same results. The gap is large enough that guppy's CI runs the comparison tests in release mode: guppy itself is plenty fast in debug mode, but Cargo is too slow when you're running hundreds of simulations per run (even with opt-level 1).

@Shnatsel
Member Author

Curiously, Cargo itself is in the process of migrating to PubGrub to speed things up, and also make the dependency resolver a standalone crate: https://rust-lang.github.io/rust-project-goals/2024h2/pubgrub-in-cargo.html

The tracking issue is here, and they have also demonstrated a significant performance improvement - see the comment from 3 days ago.

If I add some dependency solver, I'd prefer it to be that one, because this will be identical to what Cargo itself uses. That would get us if not perfect correctness then at least perfect adherence to Cargo's quirks. Do you think it would be a good idea to just use that in place of guppy?

@sunshowers

Hmm, I thought PubGrub was just for version resolution, not feature resolution (which isn't a constraint-solving problem).

@Shnatsel
Member Author

I may be very wrong here! Let me ask the person working on PubGrub in Cargo.

@Shnatsel
Member Author

Indeed, the bits to be exposed by Cargo are not sufficient for the task.

Then guppy is clearly the best option, at least until Cargo adds native SBOM precursors, and it's not clear if that's ever going to happen.

I'll be happy to add guppy to cargo-auditable once resolver v3 is supported. We'll still need to keep the fallback cargo metadata codepath around for future/unrecognized resolver versions, but I'm excited to finally be able to handle resolvers v2 and v3 properly!

@sunshowers

sunshowers commented Dec 23, 2024

Resolver v3 is now supported as of guppy 0.17.11. See https://docs.rs/guppy/latest/guppy/graph/cargo/enum.CargoResolverVersion.html#variant.V3.

If you decide to try integrating guppy, let me know! Happy to sync up in Zulip. I have to admit that guppy was the first big Rust crate I designed and I'd definitely do some things differently today. If there are particular warts that stand out or you have ideas for improving the API, please feel welcome to suggest them.

@Shnatsel
Member Author

Shnatsel commented Dec 23, 2024

Thank you! Is there any example code I could use as a starting point for integrating guppy? Say, could you point me to how cargo-workspace-hack uses it?

I'm particularly interested in how the resolver version is determined.

@sunshowers

> Thank you! Is there any example code I could use as a starting point for integrating guppy? Say, could you point me to how cargo-workspace-hack uses it?

This code can help you get started: https://github.com/guppy-rs/guppy/blob/6769bd5748b5a0cfe243db2a20b95a41ccc14878/cargo-guppy/src/lib.rs#L213-L296

Hakari (the workspace-hack manager) does a few things as well. See https://github.com/guppy-rs/guppy/blob/6769bd5748b5a0cfe243db2a20b95a41ccc14878/tools/hakari/src/hakari.rs#L988-L1007, and in general search for CargoOptions in that file.

> I'm particularly interested in how the resolver version is determined.

The resolver version actually must be passed in -- it is not auto-determined. One of guppy's design principles is that it works purely in-memory with information passed into it (typically the output of cargo metadata), without accessing the environment in any way. (There were at some point plans to use guppy on the web targeting wasm, so it is built to enable that.)

Unfortunately, cargo metadata doesn't contain the resolver version, so it must be passed in separately via CargoResolverVersion. I wish cargo made it available in a machine-readable form, but until then I think the following algorithm would work with current versions of Cargo.

  1. Find the workspace root and read its Cargo.toml. Guppy can help with this: https://docs.rs/guppy/latest/guppy/graph/struct.Workspace.html#method.root returns the root directory.
  2. If it has workspace.resolver, use that.
  3. Otherwise, if it has package.edition, use the default resolver for that edition: v1 for 2015 and 2018, v2 for 2021, v3 for 2024 (taking inheritance via package.edition.workspace into account).
  4. Otherwise, use resolver v1.
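The steps above could be sketched roughly as follows. This is an illustration only: `DetectedResolver`, `parse_resolver`, `default_for_edition`, and `detect_resolver` are hypothetical names, not part of guppy or Cargo. The sketch assumes the caller has already read the root Cargo.toml with a TOML parser, extracted `workspace.resolver` and `package.edition` as strings, and replaced an inherited `edition.workspace = true` with the actual edition value. The edition-to-resolver mapping reflects Cargo's documented defaults.

```rust
/// Hypothetical stand-in for a resolver version; a real integration would
/// map this onto guppy's CargoResolverVersion.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DetectedResolver {
    V1,
    V2,
    V3,
}

/// Parse an explicit `resolver = "..."` value. `None` means "unrecognized".
fn parse_resolver(value: &str) -> Option<DetectedResolver> {
    match value {
        "1" => Some(DetectedResolver::V1),
        "2" => Some(DetectedResolver::V2),
        "3" => Some(DetectedResolver::V3),
        _ => None,
    }
}

/// Default resolver implied by an edition. `None` means "unrecognized edition".
fn default_for_edition(edition: &str) -> Option<DetectedResolver> {
    match edition {
        "2015" | "2018" => Some(DetectedResolver::V1),
        "2021" => Some(DetectedResolver::V2),
        "2024" => Some(DetectedResolver::V3),
        _ => None,
    }
}

/// Steps 2 to 4 of the algorithm: an explicit workspace.resolver wins, then
/// the edition default, then v1. Returns `None` for values this sketch does
/// not recognize, so the caller can fall back to raw `cargo metadata` output.
fn detect_resolver(
    workspace_resolver: Option<&str>,
    package_edition: Option<&str>,
) -> Option<DetectedResolver> {
    match (workspace_resolver, package_edition) {
        (Some(resolver), _) => parse_resolver(resolver),
        (None, Some(edition)) => default_for_edition(edition),
        (None, None) => Some(DetectedResolver::V1),
    }
}

fn main() {
    // Explicit setting overrides the edition default.
    println!("{:?}", detect_resolver(Some("2"), Some("2018")));
    // No explicit setting: the edition decides.
    println!("{:?}", detect_resolver(None, Some("2024")));
    // A future resolver version is reported as unrecognized.
    println!("{:?}", detect_resolver(Some("4"), None));
}
```

Returning `None` for unknown values, rather than guessing, matches the plan discussed earlier in the thread of keeping a plain cargo metadata codepath for future or unrecognized resolver versions.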

Maybe there's a crate already which does this discovery.

@sunshowers

sunshowers commented Dec 24, 2024

Oh, and the first example linked also shows how to set a different host platform. This feature allows hakari to work in a completely platform-independent manner, such that the same workspace-hack/Cargo.toml file is produced no matter what platform you're on -- an essential feature for practical use, similar to platform-independent lockfiles. This is something that Cargo's current APIs don't enable.
