-
Notifications
You must be signed in to change notification settings - Fork 524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Supply chain security with embedded protoc binaries #575
Comments
This commit makes it possible to opt out of downloading the bundled pre-compiled protoc binaries. This is useful for downstream users with strict supply chain security requirements (see tokio-rs#575 for details). The approach taken is to move the pre-compiled binaries to a new crate called `protoc-bin`. prost-build depends on this crate only if the `protoc-vendored-bin` feature is enabled (which it is by default). While it was always possible to avoid the vendored binaries by specifying the `PROTOC` environment variable, the feature is a bit more foolproof; security-concious users can audit their Cargo.lock to ensure that the protc-bin crate is never even downloaded. This commit also adds a new feature called `protoc-vendored-src`, which uses the new `protobuf-src` crate to compile a copy of protoc from source on the fly. This feature makes `cargo build` seamless on all platforms that Protobuf supports, not just those that protoc-bin distributed pre-compiled binaries for, but has the downside of requiring a C++ toolchain and increasing compile times. If neither of the new vendored features are enabled, `prost-build` fails unless `PROTOC` and `PROTOC_INCLUDE` are explicitly specified.
It's time to switch from rust-protobuf to Prost. There are a few factors in this decsion: * The maintainer of rust-protobuf has announced that the project is essentially unmaintained for the time being [0]. * Prost is the more popular Protobuf library across the ecosystem. In particular, the Tonic GRPC library uses Prost, and the Tokio tracing console uses Prost. These are both extremely important for Materialize Platform, which we expect to be built on top of Tonic and making heavy use of distributed tracing. There have been been two major blockers to adopting Prost: 1. Prost lacked dynamic message support. That was miraculously fixed last month with the release of a fantastic prost-reflect crate [1]. 2. Prost embeds `protoc` in its Git repository (!). See tokio-rs/prost#575. That is a disaster for supply chain security. This commit patches in a Materialize-specific fork of Prost that compiles `protoc` from source rather than using upstream's embedded binary blobs. [0]: stepancheg/rust-protobuf#583 (comment) [1]: https://www.reddit.com/r/rust/comments/rsg6gx/announcing_prostreflect_a_crate_for_protobuf/
It's time to switch from rust-protobuf to Prost. There are a few factors in this decsion: * The maintainer of rust-protobuf has announced that the project is essentially unmaintained for the time being [0]. * Prost is the more popular Protobuf library across the ecosystem. In particular, the Tonic GRPC library uses Prost, and the Tokio tracing console uses Prost. These are both extremely important for Materialize Platform, which we expect to be built on top of Tonic and making heavy use of distributed tracing. There have been been two major blockers to adopting Prost: 1. Prost lacked dynamic message support. That was miraculously fixed last month with the release of a fantastic prost-reflect crate [1]. 2. Prost embeds `protoc` in its Git repository (!). See tokio-rs/prost#575. That is a disaster for supply chain security. This commit patches in a Materialize-specific fork of Prost that compiles `protoc` from source rather than using upstream's embedded binary blobs. This commit excludes one crucial piece: converting the dynamic Protobuf decoding in the `interchange` and `testdrive` crates from rust-protobuf to prost-reflect. I'll do that migration in a future commit to avoid inflating the size of this PR unnecessarily. [0]: stepancheg/rust-protobuf#583 (comment) [1]: https://www.reddit.com/r/rust/comments/rsg6gx/announcing_prostreflect_a_crate_for_protobuf/
It's time to switch from rust-protobuf to Prost. There are a few factors in this decsion: * The maintainer of rust-protobuf has announced that the project is essentially unmaintained for the time being [0]. * Prost is the more popular Protobuf library across the ecosystem. In particular, the Tonic GRPC library uses Prost, and the Tokio tracing console uses Prost. These are both extremely important for Materialize Platform, which we expect to be built on top of Tonic and making heavy use of distributed tracing. There have been been two major blockers to adopting Prost: 1. Prost lacked dynamic message support. That was miraculously fixed last month with the release of a fantastic prost-reflect crate [1]. 2. Prost embeds `protoc` in its Git repository (!). See tokio-rs/prost#575. That is a disaster for supply chain security. This commit patches in a Materialize-specific fork of Prost that compiles `protoc` from source rather than using upstream's embedded binary blobs. This commit excludes one crucial piece: converting the dynamic Protobuf decoding in the `interchange` and `testdrive` crates from rust-protobuf to prost-reflect. I'll do that migration in a future commit to avoid inflating the size of this PR unnecessarily. [0]: stepancheg/rust-protobuf#583 (comment) [1]: https://www.reddit.com/r/rust/comments/rsg6gx/announcing_prostreflect_a_crate_for_protobuf/
It's time to switch from rust-protobuf to Prost. There are a few factors in this decsion: * The maintainer of rust-protobuf has announced that the project is essentially unmaintained for the time being [0]. * Prost is the more popular Protobuf library across the ecosystem. In particular, the Tonic GRPC library uses Prost, and the Tokio tracing console uses Prost. These are both extremely important for Materialize Platform, which we expect to be built on top of Tonic and making heavy use of distributed tracing. There have been been two major blockers to adopting Prost: 1. Prost lacked dynamic message support. That was miraculously fixed last month with the release of a fantastic prost-reflect crate [1]. 2. Prost embeds `protoc` in its Git repository (!). See tokio-rs/prost#575. That is a disaster for supply chain security. This commit patches in a Materialize-specific fork of Prost that compiles `protoc` from source rather than using upstream's embedded binary blobs. This commit excludes one crucial piece: converting the dynamic Protobuf decoding in the `interchange` and `testdrive` crates from rust-protobuf to prost-reflect. I'll do that migration in a future commit to avoid inflating the size of this PR unnecessarily. [0]: stepancheg/rust-protobuf#583 (comment) [1]: https://www.reddit.com/r/rust/comments/rsg6gx/announcing_prostreflect_a_crate_for_protobuf/
It's time to switch from rust-protobuf to Prost. There are a few factors in this decsion: * The maintainer of rust-protobuf has announced that the project is essentially unmaintained for the time being [0]. * Prost is the more popular Protobuf library across the ecosystem. In particular, the Tonic GRPC library uses Prost, and the Tokio tracing console uses Prost. These are both extremely important for Materialize Platform, which we expect to be built on top of Tonic and making heavy use of distributed tracing. There have been been two major blockers to adopting Prost: 1. Prost lacked dynamic message support. That was miraculously fixed last month with the release of a fantastic prost-reflect crate [1]. 2. Prost embeds `protoc` in its Git repository (!). See tokio-rs/prost#575. That is a disaster for supply chain security. This commit patches in a Materialize-specific fork of Prost that compiles `protoc` from source rather than using upstream's embedded binary blobs. This commit excludes one crucial piece: converting the dynamic Protobuf decoding in the `interchange` and `testdrive` crates from rust-protobuf to prost-reflect. I'll do that migration in a future commit to avoid inflating the size of this PR unnecessarily. [0]: stepancheg/rust-protobuf#583 (comment) [1]: https://www.reddit.com/r/rust/comments/rsg6gx/announcing_prostreflect_a_crate_for_protobuf/
It's time to switch from rust-protobuf to Prost. There are a few factors in this decsion: * The maintainer of rust-protobuf has announced that the project is essentially unmaintained for the time being [0]. * Prost is the more popular Protobuf library across the ecosystem. In particular, the Tonic GRPC library uses Prost, and the Tokio tracing console uses Prost. These are both extremely important for Materialize Platform, which we expect to be built on top of Tonic and making heavy use of distributed tracing. There have been been two major blockers to adopting Prost: 1. Prost lacked dynamic message support. That was miraculously fixed last month with the release of a fantastic prost-reflect crate [1]. 2. Prost embeds `protoc` in its Git repository (!). See tokio-rs/prost#575. That is a disaster for supply chain security. This commit patches in a Materialize-specific fork of Prost that compiles `protoc` from source rather than using upstream's embedded binary blobs. [0]: stepancheg/rust-protobuf#583 (comment) [1]: https://www.reddit.com/r/rust/comments/rsg6gx/announcing_prostreflect_a_crate_for_protobuf/
It's time to switch from rust-protobuf to Prost. There are a few factors in this decsion: * The maintainer of rust-protobuf has announced that the project is essentially unmaintained for the time being [0]. * Prost is the more popular Protobuf library across the ecosystem. In particular, the Tonic GRPC library uses Prost, and the Tokio tracing console uses Prost. These are both extremely important for Materialize Platform, which we expect to be built on top of Tonic and making heavy use of distributed tracing. There have been been two major blockers to adopting Prost: 1. Prost lacked dynamic message support. That was miraculously fixed last month with the release of a fantastic prost-reflect crate [1]. 2. Prost embeds `protoc` in its Git repository (!). See tokio-rs/prost#575. That is a disaster for supply chain security. This commit patches in a Materialize-specific fork of Prost that compiles `protoc` from source rather than using upstream's embedded binary blobs. This commit excludes one crucial piece: converting the dynamic Protobuf decoding in the `interchange` and `testdrive` crates from rust-protobuf to prost-reflect. I'll do that migration in a future commit to avoid inflating the size of this PR unnecessarily. [0]: stepancheg/rust-protobuf#583 (comment) [1]: https://www.reddit.com/r/rust/comments/rsg6gx/announcing_prostreflect_a_crate_for_protobuf/
It's time to switch from rust-protobuf to Prost. There are a few factors in this decision: * The maintainer of rust-protobuf has announced that the project is essentially unmaintained for the time being [0]. * Prost is the more popular Protobuf library across the ecosystem. In particular, the Tonic GRPC library uses Prost, and the Tokio tracing console uses Prost. These are both extremely important for Materialize Platform, which we expect to be built on top of Tonic and making heavy use of distributed tracing. There have been been two major blockers to adopting Prost: 1. Prost lacked dynamic message support. That was miraculously fixed last month with the release of a fantastic prost-reflect crate [1]. 2. Prost embeds `protoc` in its Git repository (!). See tokio-rs/prost#575. That is a disaster for supply chain security. This commit patches in a Materialize-specific fork of Prost that compiles `protoc` from source rather than using upstream's embedded binary blobs. This commit excludes one crucial piece: converting the dynamic Protobuf decoding in the `interchange` and `testdrive` crates from rust-protobuf to prost-reflect. I'll do that migration in a future commit to avoid inflating the size of this PR unnecessarily. [0]: stepancheg/rust-protobuf#583 (comment) [1]: https://www.reddit.com/r/rust/comments/rsg6gx/announcing_prostreflect_a_crate_for_protobuf/
It's time to switch from rust-protobuf to Prost. There are a few factors in this decision: * The maintainer of rust-protobuf has announced that the project is essentially unmaintained for the time being [0]. * Prost is the more popular Protobuf library across the ecosystem. In particular, the Tonic GRPC library uses Prost, and the Tokio tracing console uses Prost. These are both extremely important for Materialize Platform, which we expect to be built on top of Tonic and making heavy use of distributed tracing. There have been been two major blockers to adopting Prost: 1. Prost lacked dynamic message support. That was miraculously fixed last month with the release of a fantastic prost-reflect crate [1]. 2. Prost embeds `protoc` in its Git repository (!). See tokio-rs/prost#575. That is a disaster for supply chain security. This commit patches in a Materialize-specific fork of Prost that compiles `protoc` from source rather than using upstream's embedded binary blobs. This commit excludes one crucial piece: converting the dynamic Protobuf decoding in the `interchange` and `testdrive` crates from rust-protobuf to prost-reflect. I'll do that migration in a future commit to avoid inflating the size of this PR unnecessarily. [0]: stepancheg/rust-protobuf#583 (comment) [1]: https://www.reddit.com/r/rust/comments/rsg6gx/announcing_prostreflect_a_crate_for_protobuf/
It's time to switch from rust-protobuf to Prost. There are a few factors in this decision: * The maintainer of rust-protobuf has announced that the project is essentially unmaintained for the time being [0]. * Prost is the more popular Protobuf library across the ecosystem. In particular, the Tonic GRPC library uses Prost, and the Tokio tracing console uses Prost. These are both extremely important for Materialize Platform, which we expect to be built on top of Tonic and making heavy use of distributed tracing. There have been been two major blockers to adopting Prost: 1. Prost lacked dynamic message support. That was miraculously fixed last month with the release of a fantastic prost-reflect crate [1]. 2. Prost embeds `protoc` in its Git repository (!). See tokio-rs/prost#575. That is a disaster for supply chain security. This commit patches in a Materialize-specific fork of Prost that compiles `protoc` from source rather than using upstream's embedded binary blobs. This commit excludes one crucial piece: converting the dynamic Protobuf decoding in the `interchange` and `testdrive` crates from rust-protobuf to prost-reflect. I'll do that migration in a future commit to avoid inflating the size of this PR unnecessarily. [0]: stepancheg/rust-protobuf#583 (comment) [1]: https://www.reddit.com/r/rust/comments/rsg6gx/announcing_prostreflect_a_crate_for_protobuf/
It's time to switch from rust-protobuf to Prost. There are a few factors in this decision: * The maintainer of rust-protobuf has announced that the project is essentially unmaintained for the time being [0]. * Prost is the more popular Protobuf library across the ecosystem. In particular, the Tonic GRPC library uses Prost, and the Tokio tracing console uses Prost. These are both extremely important for Materialize Platform, which we expect to be built on top of Tonic and making heavy use of distributed tracing. There have been been two major blockers to adopting Prost: 1. Prost lacked dynamic message support. That was miraculously fixed last month with the release of a fantastic prost-reflect crate [1]. 2. Prost embeds `protoc` in its Git repository (!). See tokio-rs/prost#575. That is a disaster for supply chain security. This commit patches in a Materialize-specific fork of Prost that compiles `protoc` from source rather than using upstream's embedded binary blobs. This commit excludes one crucial piece: converting the dynamic Protobuf decoding in the `interchange` and `testdrive` crates from rust-protobuf to prost-reflect. I'll do that migration in a future commit to avoid inflating the size of this PR unnecessarily. [0]: stepancheg/rust-protobuf#583 (comment) [1]: https://www.reddit.com/r/rust/comments/rsg6gx/announcing_prostreflect_a_crate_for_protobuf/
It's time to switch from rust-protobuf to Prost. There are a few factors in this decision: * The maintainer of rust-protobuf has announced that the project is essentially unmaintained for the time being [0]. * Prost is the more popular Protobuf library across the ecosystem. In particular, the Tonic GRPC library uses Prost, and the Tokio tracing console uses Prost. These are both extremely important for Materialize Platform, which we expect to be built on top of Tonic and making heavy use of distributed tracing. There have been been two major blockers to adopting Prost: 1. Prost lacked dynamic message support. That was miraculously fixed last month with the release of a fantastic prost-reflect crate [1]. 2. Prost embeds `protoc` in its Git repository (!). See tokio-rs/prost#575. That is a disaster for supply chain security. This commit patches in a Materialize-specific fork of Prost that compiles `protoc` from source rather than using upstream's embedded binary blobs. This commit excludes one crucial piece: converting the dynamic Protobuf decoding in the `interchange` and `testdrive` crates from rust-protobuf to prost-reflect. I'll do that migration in a future commit to avoid inflating the size of this PR unnecessarily. [0]: stepancheg/rust-protobuf#583 (comment) [1]: https://www.reddit.com/r/rust/comments/rsg6gx/announcing_prostreflect_a_crate_for_protobuf/
It's time to switch from rust-protobuf to Prost. There are a few factors in this decision: * The maintainer of rust-protobuf has announced that the project is essentially unmaintained for the time being [0]. * Prost is the more popular Protobuf library across the ecosystem. In particular, the Tonic GRPC library uses Prost, and the Tokio tracing console uses Prost. These are both extremely important for Materialize Platform, which we expect to be built on top of Tonic and making heavy use of distributed tracing. There have been been two major blockers to adopting Prost: 1. Prost lacked dynamic message support. That was miraculously fixed last month with the release of a fantastic prost-reflect crate [1]. 2. Prost embeds `protoc` in its Git repository (!). See tokio-rs/prost#575. That is a disaster for supply chain security. This commit patches in a Materialize-specific fork of Prost that compiles `protoc` from source rather than using upstream's embedded binary blobs. This commit excludes one crucial piece: converting the dynamic Protobuf decoding in the `interchange` and `testdrive` crates from rust-protobuf to prost-reflect. I'll do that migration in a future commit to avoid inflating the size of this PR unnecessarily. [0]: stepancheg/rust-protobuf#583 (comment) [1]: https://www.reddit.com/r/rust/comments/rsg6gx/announcing_prostreflect_a_crate_for_protobuf/
I think doing something along what's proposed here is a great idea! |
No problem, @LucioFranco! Thanks for taking a look now. I actually whipped up a proposal in #576, but we need you or another maintainer to weigh in on how to expose the toggle of which I’m also maintaining a fork of prost that forces compiling protoc from source: https://github.com/materializeinc/prost. You could upstream that approach in prost, too! But I figure you probably don’t want to force a C++ dependency on all users of prost. |
By the way, if it’s easiest to sync on Discord or something, happy to find the time! |
Yeah, let me try to catch up on everything so may be a bit slow then we can sync. I'd love for yall to not have to maintain a fork so I'd like to get this solved :) |
Ok I spent sometime today talking to @jonhoo about this and how we also want to not ship compiled binaries internally here. I think the proposed solution in #576 is a step in the right direction but I think we will want to avoid having two mutually exclusive feature flags. I think the direction we should head should actually be to not ship any pre-compiled binaries and only allow users to either provide a So I think we should follow this sort of flow for obtaining a
I believe this solves most of our issues, while this may increase build times for users that don't have a |
This sounds even better than my PR! |
@benesch I am working on a PR for this, ill post it in here and tag you on it once its ready. Thanks again for getting this rolling! |
Sounds wonderful! You are taking an even stronger stance on this than I had hoped, so I am delighted! |
We're thinking about using prost over at @MaterializeInc, but the supply chain security of the embedded
protoc
binaries has me very nervous.Specifically: what sort of validation are you doing before you merge PRs to check that the binaries supplied by third parties are identical to the binaries published by google/protobuf? I did some spot checks and they look identical, but I didn't see anything in your CI pipeline that would assert this.
We'd honestly probably want to opt out of the bundled protoc deps entirely. I know that's already possible if you specify
PROTOC
, but I'd feel better if there were a way to not even bring the binaries into our dependency tree. Otherwise it seems too easy for accidents to happen; e.g., if that env var somehow gets unset.How would y'all feel about adopting an approach similar to rust-protobuf, where the bundled binaries are kept in a separate crate (e.g.,
prost-build-vendored
)? Using that crate could be enabled by default, but we could keep it off @MaterializeInc and lint for its absence usingcargo deny
, and I'd sleep a good bit easier at night.I've also been toying with the idea of building something like
protobuf-src
for Rust (a laopenssl-src
) that builds libprotobuf from source and makes its protoc available to downstream libraries. That could be a nice alternative forprost-build
to expose.The text was updated successfully, but these errors were encountered: