Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repo Consolidation: Round 3 #1556

Closed
5 tasks done
marten-seemann opened this issue May 26, 2022 · 16 comments
Closed
5 tasks done

Repo Consolidation: Round 3 #1556

marten-seemann opened this issue May 26, 2022 · 16 comments
Labels
effort/days Estimated to take multiple days, but less than a week kind/enhancement A net-new feature or improvement to an existing feature P0 Critical: Tackled by core team ASAP

Comments

@marten-seemann
Copy link
Contributor

marten-seemann commented May 26, 2022

The transport consolidation (#1187) has dramatically reduced repo sprawl. We now only have a few go-libp2p-* repos left in the dependency tree of go-lipbp2p. This is a proposal to also consolidate them into go-lipbp2:

  • go-libp2p-resource-manager: /p2p/host/resource-manager
  • go-peerstore: /p2p/host/peerstore
  • go-libp2p-core: /core
  • go-eventbus: /p2p/host/eventbus

Not sure what to do about go-libp2p-asn-util. This repo should probably have been named go-asn-util, it's not libp2p related in any way. I also don't feel like moving several MB of ASN here.

Anything I missed? Thoughts, @MarcoPolo @vyzo @Stebalien @raulk?

@marten-seemann marten-seemann added kind/enhancement A net-new feature or improvement to an existing feature P0 Critical: Tackled by core team ASAP effort/days Estimated to take multiple days, but less than a week labels May 26, 2022
@vyzo
Copy link
Contributor

vyzo commented May 26, 2022

asn no, the rest is reasonable proposition.

@Stebalien
Copy link
Member

Agree with vyzo. Otherwise, lgtm.

@vyzo
Copy link
Contributor

vyzo commented May 28, 2022

I think we should keep core separate, it's the interface after all.

@marten-seemann
Copy link
Contributor Author

I think we should keep core separate, it's the interface after all.

I think I disagree. core being separate is causing us a lot of trouble:

  • When we want to change one of the interfaces, we first have to cut a core release before we can use them in go-libp2p. It would be nicer to make that change atomically.
  • core and go-libp2p being out of sync (core already having released an update, but go-libp2p not yet) causes errors for downstream users who run go get -u, and then we need to tell them which versions of core go with which versions of go-libp2p

We probably should keep the GitHub Action that posts the gocompat output every time we make a change to /core, so we don't accidentally break things.

@vyzo
Copy link
Contributor

vyzo commented May 28, 2022 via email

@marten-seemann
Copy link
Contributor Author

Should we be worried about this? I'd assume that if you pull in libp2p types, you're also using go-libp2p. You could imagine a project suffering from repo sprawl where a library only needs the type, and go-libp2p is pulled in by their main repo, but even then the only effect is that go get will download slightly more data.

@vyzo
Copy link
Contributor

vyzo commented May 28, 2022

It is semantically different; i think a separate interface is good practice for something as complex as libp2p.
After all, apart from instantiation and tests, user code only cares about the interface.

@vyzo
Copy link
Contributor

vyzo commented May 28, 2022

think of it as the header file.

@marten-seemann
Copy link
Contributor Author

marten-seemann commented May 28, 2022

Any suggestion how to fix the two problems I listed above then? For me, this is mostly about development velocity, and I want to get away from having to release one module (in a release that we don't want users to use yet!) in order to make progress in another.

think of it as the header file.

Which is usually kept in the same directory as the source file 🤪

@vyzo
Copy link
Contributor

vyzo commented May 28, 2022

Any changes in interface should be carefully considered and too much velocity there is indicative of instability.
We dont have to release before creating the implementation pr, and only release when the implementation in go-libp2p is ready to merge.
The window of a broken go get -u is insignificantly small then.

@marten-seemann
Copy link
Contributor Author

Not convinced of the cleanliness argument, but let's look at some real-world examples. I looked at libraries building on top of libp2p, and determined their libp2p dependencies after deleting all test files (rm **/*_test.go && go mod edit -go 1.17 && go mod tidy).

go-libp2p-pubsub

github.com/libp2p/go-libp2p-core v0.15.1
github.com/libp2p/go-libp2p-discovery v0.6.0

go-libp2p-discovery was moved into go-libp2p, so this would still need to import go-libp2p.

go-libp2p-kad-dht

github.com/libp2p/go-eventbus v0.2.1
github.com/libp2p/go-libp2p-core v0.15.1
github.com/libp2p/go-libp2p-kbucket v0.4.7
github.com/libp2p/go-libp2p-peerstore v0.6.0
github.com/libp2p/go-libp2p-record v0.1.3
github.com/libp2p/go-libp2p-routing-helpers v0.2.3
github.com/libp2p/go-libp2p-swarm v0.10.2
github.com/libp2p/go-libp2p-xor v0.1.0
github.com/libp2p/go-msgio v0.2.0
github.com/libp2p/go-netroute v0.2.0

go-libp2p-swarm (moved into go-libp2p in round 2) is imported for an error assertion, and go-libp2p-peerstore (will be moved in round 3) for the slightly weird function PeerInfos.

go-bitswap

github.com/libp2p/go-buffer-pool v0.0.2
github.com/libp2p/go-libp2p v0.14.3
github.com/libp2p/go-libp2p-core v0.8.5
github.com/libp2p/go-libp2p-loggables v0.1.0

go-libp2p is import to access the ping package.

go-graphsync

github.com/libp2p/go-buffer-pool v0.0.2
github.com/libp2p/go-libp2p-core v0.8.5
github.com/libp2p/go-msgio v0.0.6

This actually wouldn't need to import go-libp2p.

@MarcoPolo
Copy link
Collaborator

I think we can still be very deliberate with changes to the core interfaces even if it's in a single repo. I'm not aware of many libraries that have their interfaces defined in a separate repo, but please correct me if I'm wrong.

From a user's POV a single repo would be better since go-libp2p + interfaces are well defined for any given commit.

@vyzo
Copy link
Contributor

vyzo commented Jun 1, 2022

My main rationale for the interface being a separate repo is that libp2p is a very complex system, and that applications and libraries only ever care about the core interface (except for instantiation and perhaps tests).

I don't want our users to get in the habbit of using internal libp2p components; the libp2p contract is that the interface is a stable facade, and in internals anything goes.

The development velocity effect is now gone, given rhat we are essentially a monorepo.
When we want to make a change in core, we open pr, develop using that pr in the monorepo, and when its ready we merge both prs.

No release is necessary in either repo, it is totally fine for master to point to core master.
When we are ready for release, we bump both and off we go. The probablity of a user doing a go get -u and breaking while we are tagging releases is vanishingly small.

@Stebalien
Copy link
Member

I think the right way to think of this is as follows: go-libp2p is now a "subproject" within the libp2p org, comprised of many separate packages that happen to be versioned together for convenience.

Minimal Dependencies

This was, IIRC, the main reason for the split.

However, this is no longer an issue, from what I can see, as long as we keep "core" from depending on anything else in go-libp2p. As of go 1.17,:

  • Anyone can pull in github.com/libp2p/go-libp2p/core without pulling in anything else. Yes, they'll download the rest of the repo, but that's really not a huge issue.
  • When vendoring code, only required packages are vendored. So I don't expect this to impact auditing.

External Transports

By including core, there's no way to:

  1. Develop an external transport that depends on core.
  2. Make go-libp2p depend on this transport.

Any time we make a breaking change to the transport, we'd need two releases to pull through an update.

On the other hand, if/when we get to the point where we want go-libp2p to depend on some transport, I expect we'd want to just pull it into the tree itself?

External Development

In theory, multiple repos allows third-parties to "own" separate codebases (other repos). In practice, this is wishful thinking.

  • It's much easier for code to rot in separate repos.
  • It's much harder to keep track of third-party work in separate repos.
  • It's pretty easy to setup a "code owners" system to allow contributions within a single repo.
  • We can now finally setup restricted tags, so it's possible to securely grant "code owner" access to parts of the repo.
  • If we don't trust some third party with this level of restricted write access, we shouldn't depend on their code...
  • We should explore git subtrees. This is a great way to:
    • Allow external developement.
    • Pull in "stable" releases into a repo that can be better managed.

Furthermore, this change won't prevent third-parties from developing transports. It just means that, if we ever want to ship the transport by default, it'll need to live in this repo (or we'll have to pull it in with git subtrees). But, IMO, that's fine.

Development Velocity

The current split makes breaking changes very painful. This isn't all bad, but it has made some refactors take longer than they should.

Additionally, It's very easy to get the system into an "unreleasable" state without realizing it.

No release is necessary in either repo, it is totally fine for master to point to core master.

This isn't fine and it has cost us (me) a ton of time. It has also delayed critical fixes, sometimes for months when we thought some tiny change to the core interfaces would be easy, but it actually caused a cascade of random stuff we needed to deal with.

Separate Versioning

One of the drawbacks to the new approach is that it makes it impossible to cut a release of a single package (e.g., for a bug fix). However, IMO, that's a good thing. It's really easy to forget that some bug was fixed in some random release of a random repo.

@BigLep
Copy link
Contributor

BigLep commented Aug 11, 2022

A couple of thoughts:

  1. ResourceManager readme docs in Add dos and monitoring docs docs#160 will need to be updated with new links when this is done.
  2. If it isn't there already, can we please capture the repo/package rationale in the README (likely taking content from Repo Consolidation: Round 3 #1556 (comment) )?

@p-shahi
Copy link
Member

p-shahi commented Sep 1, 2022

Closing as "Update links in dos-mitigation.md" was addressed in libp2p/docs#165

@p-shahi p-shahi closed this as completed Sep 1, 2022
Repository owner moved this from 🏃‍♀️ In Progress to 🎉 Done in go-libp2p Sep 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
effort/days Estimated to take multiple days, but less than a week kind/enhancement A net-new feature or improvement to an existing feature P0 Critical: Tackled by core team ASAP
Projects
Archived in project
Development

No branches or pull requests

6 participants