Replace resolved field by hash #64
text/0000-lock-without-registry.md (Outdated)

# Detailed design

Replace the `resolved` by a `hash` field.
I think hash would not be enough.
resolved "https://registry.yarnpkg.com/abbrev/-/abbrev-1.1.0.tgz#d0554c2256636e2f56e7c2e5ad183f859428d81f"
should be converted into
resolved "/abbrev/-/abbrev-1.1.0.tgz#d0554c2256636e2f56e7c2e5ad183f859428d81f",
where https://registry.yarnpkg.com will be substituted by default and can be overridden with a setting in .yarnrc.
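
To make the suggestion concrete, here is a minimal Python sketch of the conversion in both directions. The function names `strip_registry` and `expand_resolved` are hypothetical and are not part of Yarn; the only inputs taken from the thread are the abbrev URL and the default registry.

```python
from urllib.parse import urlsplit

# Default registry named in the comment above.
DEFAULT_REGISTRY = "https://registry.yarnpkg.com"


def strip_registry(resolved: str) -> str:
    """Drop scheme and host from a `resolved` value, keeping path and #hash."""
    parts = urlsplit(resolved)
    relative = parts.path
    if parts.fragment:
        relative += "#" + parts.fragment
    return relative


def expand_resolved(resolved: str, registry: str = DEFAULT_REGISTRY) -> str:
    """Turn a registry-relative `resolved` value back into an absolute URL."""
    if urlsplit(resolved).scheme:
        # Old-style entry that already carries a registry: leave it alone.
        return resolved
    return registry.rstrip("/") + "/" + resolved.lstrip("/")


full = ("https://registry.yarnpkg.com/abbrev/-/abbrev-1.1.0.tgz"
        "#d0554c2256636e2f56e7c2e5ad183f859428d81f")
relative = strip_registry(full)
print(relative)
print(expand_resolved(relative, "https://registry.npm.taobao.org"))
```

Leaving absolute entries untouched is a design choice in this sketch so that lockfiles written in the current format would keep working.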
Actually, it would require everyone to update their lockfiles to have the partial tarball URLs, which may limit the usefulness of this feature.
How about having a config that would swap registries, e.g.
registry-replace "https://registry.yarnpkg.com=https://registry.npm.taobao.com;https://registry....=...."
It would allow people in remote areas to seamlessly switch registries for existing projects.
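
A rough Python sketch of how such a setting could be parsed and applied at fetch time. `registry-replace` is only the name proposed in this comment, not an existing Yarn option, and the `source=target;source=target` syntax is assumed from the example above.

```python
from typing import Dict


def parse_registry_replace(setting: str) -> Dict[str, str]:
    """Parse 'source=target;source=target' into a mapping (assumed syntax)."""
    mapping: Dict[str, str] = {}
    for pair in setting.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        source, sep, target = pair.partition("=")
        if not sep or not source or not target:
            raise ValueError(f"malformed registry-replace entry: {pair!r}")
        mapping[source.rstrip("/")] = target.rstrip("/")
    return mapping


def rewrite_url(url: str, mapping: Dict[str, str]) -> str:
    """Swap the registry prefix of a tarball URL, leaving path and #hash intact."""
    # Try longer prefixes first so a more specific entry wins.
    for source in sorted(mapping, key=len, reverse=True):
        if url == source or url.startswith(source + "/"):
            return mapping[source] + url[len(source):]
    return url


mapping = parse_registry_replace(
    "https://registry.yarnpkg.com=https://registry.npm.taobao.com"
)
print(rewrite_url("https://registry.yarnpkg.com/abbrev/-/abbrev-1.1.0.tgz", mapping))
```

Splitting on the first `=` means a source URL containing `=` would not parse correctly; a real setting would probably need a less ambiguous syntax.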
@bestander as noted in my motivation comment, retaining the registry URL within a yarn.lock file could leak company repositories, so entirely removing them from a yarn.lock file would be nice.

> Actually it would require everyone to update their lockfiles to have the partial tarballs URLs, that may limit the usefulness of this feature.

I can see that it would require intervention by each project to upgrade their yarn.lock files to follow the new approach.
If you still release your yarn.lock to open source then you want the https://registry.yarnpkg.com domain to be present.
A registry-replace "https://registry.yarnpkg.com...." setting could be used to replace the URLs at fetch time, and your project could be configured not to leak the Artifactory domain.
Why isn't the hash enough, @bestander? Isn't the URL completely inferred from repository+name+version?
From package.json, yarn can already find the correct URL (based on the registry in the configuration), so it should be the same in this case?
Here the hash would be present only to ensure the artefact is the right one, not to locate it at all.
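
As an illustration of the hash acting purely as an integrity check, here is a small Python sketch. It assumes the value after `#` in `resolved` is the hex SHA-1 of the tarball bytes; the function names are hypothetical.

```python
import hashlib
import hmac


def hash_from_resolved(resolved: str) -> str:
    """Return the part after '#' in a `resolved` value ('' if there is none)."""
    return resolved.partition("#")[2]


def verify_tarball(data: bytes, expected_sha1_hex: str) -> bool:
    """Check downloaded bytes against the expected hex SHA-1 (assumed format)."""
    actual = hashlib.sha1(data).hexdigest()
    return hmac.compare_digest(actual, expected_sha1_hex.lower())


# Stand-in bytes; a real check would use the downloaded .tgz contents.
payload = b"pretend this is a tarball"
expected = hashlib.sha1(payload).hexdigest()
print(verify_tarball(payload, expected))
print(verify_tarball(b"tampered", expected))
```

With a check like this, where the bytes came from stops mattering for correctness; only the question of how to find the tarball remains.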
The tarball URL is returned by the npm backend (https://github.com/yarnpkg/yarn/blob/master/src/resolvers/registries/npm-resolver.js#L189); Yarn is not constructing it.
I don't think URL reconstruction is justified.
In a way, the whole point of this RFC is to discuss doing exactly that: constructing the URL on the fly :)
When you say it is the npm backend constructing the URL, do you mean that IF we wanted to reconstruct it, we would need to store in the lockfile that it is an npm dependency, so that we know we need the npm backend to reconstruct the full URL?
True :)
I mean that npm is responsible for sending us the URL; for compatibility reasons we probably don't want to move this logic to Yarn.
Well, you don't need to move it into yarn; you simply need to ask npm again every time :)
I understand this is a tricky place to make changes, but personally, I feel like yarn is broken by design by the choice of storing URLs directly in the lock file (and I'm curious to see how npm is going to tackle this in their upcoming version).
As it currently is, it cannot scale, you can't use proxy caches easily, you potentially expose private information publicly, and so on.
The alternative of using a registry-rewrite is interesting too though, so maybe it will be simpler (in terms of backward compatibility) to use that and keep the registry URL in the lockfile.
Having the URL in the lockfile also has the merit of easing the use of multiple repositories (not proxies of existing ones) side by side (even though I'm not sure how you can currently add a dependency from a different registry easily with yarn).
In that case the repository base URL (e.g., https://registry.npmjs.org) would act as a kind of identifier of a given repository, and users could configure mirrors (e.g., https://registry.npm.taobao.org) for these repositories (i.e., the rewrite you talked about) in their configuration.
> I mean that npm is responsible for sending us the URL, for compatibility reasons we probably don't want to move this logic to Yarn.

Okay, so now I see what you mean, and I agree that entirely removing the resolution URL from yarn.lock would mean that yarn would need to generate its own URLs internally, which has all the downsides of additional code for a spec yarn doesn't control.
If the URL must follow a specific spec, could the logic used to generate the value of dist.tarball be refactored out of npm and yarn, and shared?

> The alternative of using a registry-rewrite is interesting too though, so maybe it will be simpler

Another thought that came to mind is what happens when there are multiple registries that need to be overridden. I've been assuming a developer only needed to override a single registry, but what if a yarn.lock has references to multiple artifact registries? A developer would need to grep the yarn.lock file, enumerate all registries used, and write registry-rewrite configs for each.
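
The grep-and-enumerate step could look roughly like the Python sketch below. It assumes lockfile entries of the form `resolved "<url>#<hash>"` as in the abbrev example earlier in the thread; the `@acme/tool` entry and its hash are made up for illustration.

```python
import re
from typing import List
from urllib.parse import urlsplit

# Matches lines such as:   resolved "https://host/path.tgz#hash"
RESOLVED_RE = re.compile(r'^\s*resolved\s+"?([^"\s]+)"?\s*$')


def registries_in_lockfile(text: str) -> List[str]:
    """List the distinct scheme://host origins found in `resolved` lines."""
    origins = set()
    for line in text.splitlines():
        match = RESOLVED_RE.match(line)
        if not match:
            continue
        parts = urlsplit(match.group(1))
        if parts.scheme and parts.netloc:
            origins.add(f"{parts.scheme}://{parts.netloc}")
    return sorted(origins)


SAMPLE = '''
abbrev@1:
  version "1.1.0"
  resolved "https://registry.yarnpkg.com/abbrev/-/abbrev-1.1.0.tgz#d0554c2256636e2f56e7c2e5ad183f859428d81f"

"@acme/tool@^1.0.0":
  version "1.0.0"
  resolved "http://artifactory.example.com/artifactory/api/npm/npm/@acme/tool/-/tool-1.0.0.tgz#0000000000000000000000000000000000000000"
'''

for origin in registries_in_lockfile(SAMPLE):
    print(origin)
```

This only reports the origin; a registry mounted under a path prefix (as Artifactory setups often are) would still need its prefix worked out by hand.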
text/0000-lock-without-registry.md (Outdated)

In yarn.lock, the `resolved` field includes registry such as `https://registry.npmjs.org`.
In China, most developers will set it to `https://registry.npm.taobao.org` for speed; but it seems slow for travis-ci and circleci.
Could we also add as a motivation that the current approach leads to developers leaking their internal artifact repository sites to the public internet via yarn.lock if they have their company's artifact repository configured in a .npmrc or .yarnrc file?
text/0000-lock-without-registry.md (Outdated)

# Detailed design

Replace the `resolved` by a `hash` field.
The `url` in `resolved` is unnecessary; keeping `hash` is enough.
Could we get an example of what the resolved field would look like following this standard? (Kind of like your examples in yarnpkg/yarn#3330)
text/0000-lock-without-registry.md (Outdated)

# Unresolved questions

No questions
As an open question, I would like to ask how this will be rolled out to all projects that use Yarn. Will Yarn replace the entire yarn.lock file? Will Yarn only use the new format for changed resolutions in the yarn.lock file?
text/0000-lock-without-registry.md (Outdated)

Just set the registry before `yarn install` if you do not want to use `https://registry.npmjs.org`.
Or use `yarn install --registry=https://registry.npm.taobao.org`.
We'll probably need to emphasize on the documentation website that install will use the .yarnrc configured registry, the command line configured registry, or will fall back to the default.
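
A tiny Python sketch of that lookup, for the documentation discussion. The comment lists the three sources without ranking them, so the precedence used here (command line over .yarnrc over the default) is an assumption, and `effective_registry` is a hypothetical name.

```python
from typing import Optional

DEFAULT_REGISTRY = "https://registry.yarnpkg.com"


def effective_registry(cli_registry: Optional[str],
                       yarnrc_registry: Optional[str]) -> str:
    """Pick the registry to install from.

    Assumed precedence: --registry flag, then .yarnrc, then the default.
    """
    for candidate in (cli_registry, yarnrc_registry):
        if candidate:
            return candidate.rstrip("/")
    return DEFAULT_REGISTRY


print(effective_registry(None, None))
print(effective_registry(None, "https://registry.npm.taobao.org"))
print(effective_registry("https://registry.npmjs.org", "https://registry.npm.taobao.org"))
```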
text/0000-lock-without-registry.md (Outdated)

# How We Teach This

Just set the registry before `yarn install` if you do not want to use `https://registry.npmjs.org`.
The reference to the npm registry needs to be replaced with a reference to the Yarn registry - https://registry.yarnpkg.com
We may want to split the discussion into several RFCs.
In my company we had a slightly different issue. It's a bit complicated ;) We've got our own registry that stores our own private scoped packages and also acts as a proxy to the public npm/yarn registry for public packages. The problem is that developers could configure yarn in 2 ways:
Then we noticed that this causes a mess in
I think storing the hash instead of the whole URL is good for keeping yarn's logic simple and robust. Caching the URL may bring some performance benefit, but it forces yarn to deal with different registries existing in one project, because that can happen. Supporting multiple registries in one project is a huge challenge, considering the different use cases in different teams. I don't think yarn is ready to face it when yarn includes the field. IMO, we should abandon this feature (storing the URL, including the registry, for each package) first, then reconsider the use cases and support them in a more reasonable way. Of course, keeping compatibility will be the hardest part. Maybe there has to be a breaking change. Just for more discussion :)
@szimek we have a similar setup at our company. We have two internal registries: one that is purely a mirror of the public npm registry, and a second registry that contains just our company's scope. We then have developers set up the following in their
We decided against a virtual registry built on top of the other two (We are not sure whether we will ever create the single virtual registry).
This happens at our company as well. We don't force developers to use the registries I listed above (except for the scoped registry, but that's the only place those are published).
Eventually we plan on taking the Facebook approach and isolating our CI environment from the internet, which means that any external references will fail those CI jobs.
In terms of complexity I would not sign up for this drastic change yet. I suggest focusing on specific use cases and finding a way to address them with the existing structure instead of starting a significant revamp.
Nay. Changing
@doxiaodong As to your issue - registry mirror config may be overridden by
@OpenGG are you saying that we should manually replace registry URL paths in
@destroyerofbuilds No. That's a temporary hack with current yarn implementation. NPM@5 introduces interesting changes related to this issue, I suggest we all give it a try before jumping to any conclusion.
Yep, that is what we need to do
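Should that be pulled out into a separate RFC? Lastly, how does the npm approach address item 2 in this RFC - the current approach leads to developers leaking their internal artifact repository sites to the public internet via yarn.lock if they have their company's artifact repository configured in a .npmrc or .yarnrc file.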
1. I would go ahead and start a new RFC.
2. How about still using public URLs in yarn.lock but if you have a private repository in the config the actual requests will have a URL replaced during runtime?
…On 2 June 2017 at 18:12, Hutson Betts ***@***.***> wrote:
If you generated your package lock against registry A, and you switch to
registry B, npm will now try to install the packages from registry B,
instead of A.
Should that be pulled out into a separate RFC?
Lastly, how does the npm approach address item 2 in the RFC - the current
approach leads to developers leaking their internal artifact repository
sites to the public internet via yarn.lock if they have their company's
artifact repository configured in a .npmrc or .yarnrc file.
👍 I like the idea of a small, targeted, RFC that helps to bring Yarn in-line with npm behavior that, honestly, seems quite reasonable.
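Please bear with me as I verbalize this a little to see if we're on the same page.
I have registry=http://artifactory.example.com/artifactory/api/npm/npm in my .yarnrc configuration file that points to my company's super secret internal artifact repository.
However, when Yarn generates the yarn.lock file, Yarn uses the default yarn registry URL?
Later, when I run yarn, Yarn extracts the registry configuration option from .yarnrc, and re-writes the registry URL using my configuration override?
So technically the fetches happen against the registry URL I've specified, even though the yarn.lock specifies other URLs?
That behavior, if I'm understanding it correctly, lines up with the behavior of npm mentioned by @OpenGG.
That seems intuitive, since npm@5 does most of that already, and the aforementioned new RFC would bring Yarn in line with that behavior already. The only divergence is that npm@5 writes out the override from .npmrc to the package-lock.json file.
Therefore, the only change would be that the override in .yarnrc would never be applied to the default when writing out unscoped packages to yarn.lock. I assume, too, that would apply to scopes?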
Yes, that seems correct, thanks a lot for helping out!
…On 2 June 2017 at 18:29, Hutson Betts ***@***.***> wrote:
:1. I would go ahead and start a new RFC.
👍 I like the idea of a small, targeted, RFC that helps to bring Yarn
in-line with npm behavior that, honestly, seems quite reasonable.
How about still using public URLs in yarn.lock but if you have a private
repository in the config the actual requests will have a URL replaced
during runtime?
Please bare with me as I verbalize this a little to see if we're on the
same page.
I have registry=http://artifactory.example.com/artifactory/api/npm/npm in
my .yarnrc configuration file that points to my company's super secret
internal artifact repository.
However, when Yarn generates the yarn.lock file, Yarn uses the default
yarn registry URL?
Later, when I run yarn, Yarn extracts the registry configuration option
from .yarnrc, and re-writes the registry URL using my configuration
override?
So technically the fetches happen against the registry URL I've specified,
even though the yarn.lock specifies other URLs?
That behavior, if I'm understanding it correctly, lines up with the
behavior of npm mentioned by @OpenGG <https://github.com/opengg>.
That seems intuitive, since npm@5 does most of that already, and the
aforementioned new RFC would bring Yarn inline with that behavior already.
The only divergence is that npm@5 writes out the override from .npmrc to
the package-lock.json file.
Therefore, the only change would be that the override in .yarnrc would
never be applied to the default when writing out unscoped packages to
yarn.lock. I assume, too, that would apply to scopes?
And thus how do we support the use of multiple registries then? For example, for packages deployed on a tertiary public repository? One of the important points of the RFC was that we want the
So how do you differentiate them? I see some ideas:
For the record, this is how maven tackles this situation (and maven sees a lot of this kind of use, with multiple repositories containing different packages):
If we were to draw a parallel to the present discussion:
This is just some food for thought for whoever writes another RFC on the matter, not a fully thought-out solution :)
great points, @victornoel!
This discussion is a bit fractured (here, here, here, and here), so I've attempted to summarize my interpretation of it here.
Purpose of the
After further consideration, adopting the behavior of npm v5 might be a breaking change after all (as mentioned here). What I had in mind was for Yarn to do as npm v5 does, which is ignore the registry specified in the

Yarn already supports custom registries and even scoped registries via

Unfortunately, it is possible that people are relying upon the current behavior. The most common situation in which you'd want to use an alternate registry for a select group of packages is for internal packages, which should be scoped under an organization name. Scoped registries address that use case quite well. But if anybody out there is relying upon alternate registries for non-scoped packages, that behavior would break if we were to make

That is, Yarn would have to drop support for alternate registries for non-scoped packages, which are currently supported. This situation is a bit muddy, because I don't know that support for alternate registries for non-scoped packages was intended, or whether it works that way incidentally. I also don't know if anybody would care if that behavior was lost, whereas it is abundantly clear that a resolution to this problem would be appreciated.
Yes! That would be great. It's actually already planned for 2.0, but no one has started working on it yet: yarnpkg/yarn#5892
For scoped packages, I think the expectation is that those users would just have to share the same configuration. They can easily do this by checking in a yarnrc in the repository that would only contain the scope registries. For non-scoped packages, it was never supported in the first place (after all, the lockfile has a big "DO NOT EDIT" notice at the very top) and I don't think we should try to support it, especially since this feature is scheduled to be released with a major version bump.
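
To illustrate the shared-configuration idea, here is a minimal Python sketch of choosing a registry per package from a checked-in config. It assumes the config has already been read into a dict using the `@scope:registry` key convention; the `@acme` scope, its URL, and the function name are made up.

```python
from typing import Dict

DEFAULT_REGISTRY = "https://registry.yarnpkg.com"


def registry_for(package_name: str, config: Dict[str, str]) -> str:
    """Return the registry a package should be fetched from.

    Scoped packages use '<scope>:registry' when present; everything else
    uses the single 'registry' entry or the default.
    """
    if package_name.startswith("@") and "/" in package_name:
        scope = package_name.split("/", 1)[0]
        scoped = config.get(f"{scope}:registry")
        if scoped:
            return scoped.rstrip("/")
    return config.get("registry", DEFAULT_REGISTRY).rstrip("/")


# Hypothetical checked-in configuration containing only the scope registry.
shared_config = {"@acme:registry": "http://artifactory.example.com/artifactory/api/npm/npm"}

print(registry_for("@acme/tool", shared_config))
print(registry_for("abbrev", shared_config))
```

With a lookup like this, the lockfile would not need to record a registry for either kind of package, since it can be derived from the name and the shared config.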
Fantastic
Edit: Though... this could be accomplished without editing the lockfile. Installing a package, changing the registry, then installing another package could result in a lockfile with alternate registries for non-scoped packages. That seems like a bit of an edge case though.
Removing the registry from the lockfile is scheduled for then. I was hoping we could let the registry in
We already have kinda-breaking changes in master (the new integrity field which, without being breaking per se, will cause a bunch of changes in most lockfiles), so I think it would be better for the next release to be the 2.0, ideally.
Well, that's exciting! That leaves us with little time to update these RFCs. I've just put up a PR to rewrite the existing PR to more closely match

As for this PR... I did find a downside to removing the registry from the They are using the It looks like What is the impact to Yarn if we stop using
Hmm, I'm not entirely sure I follow - we currently have this information: registry configuration + resolved field (registry + URL). Assuming the registries from the resolved fields should always match the registry configuration (or be updated to match it), then this information is strictly redundant, and even if we were to use it as a cache key (I don't think we do; iirc the cache key is based on the package name + package version) we would be able to reconstruct it at runtime. Does that make sense?
After tracing the install process taken by Yarn, it looks like it does use that field as a cache. I can't see how it would construct it without asking the registry for metadata. Here is an example of the URL we're discussing: In the The

That is the round-trip that I am concerned we might be adding to each installation if we remove the field altogether, without caching that information elsewhere.

Edit: I should add that conceptually I still agree that the field should be removed. But assuming I'm right about there being a performance impact, we would have to accept that cost. We could cache that information somewhere else instead; either as part of this change, or at a later date.
I've done a bit more investigation on the option of generating the tarball URL using the standard format ( It looks like this approach was taken by The solution taken by the

Ultimately, I don't see any straightforward way to remove the tarball URL from the lockfile without hurting performance. It serves to make the installation process significantly faster if your registry matches the registry used in the cached (which it does for most users, most of the time). At best, we could omit the tarball URL in certain known cases (i.e. known registries like
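
A hedged Python sketch of the "omit it only in known cases" idea. The URL shape used here, `<registry>/<name>/-/<basename>-<version>.tgz`, is an assumption inferred from the abbrev URL quoted earlier in this thread, not a format confirmed by this comment, and both function names are hypothetical.

```python
def conventional_tarball_url(registry: str, name: str, version: str) -> str:
    """Build the tarball URL in the assumed conventional shape."""
    # For a scoped name such as '@scope/pkg', the file name uses only 'pkg'.
    basename = name.rsplit("/", 1)[-1]
    return f"{registry.rstrip('/')}/{name}/-/{basename}-{version}.tgz"


def can_omit_resolved(resolved: str, registry: str, name: str, version: str) -> bool:
    """True when the stored URL (minus '#hash') is exactly the conventional one."""
    url = resolved.partition("#")[0]
    return url == conventional_tarball_url(registry, name, version)


resolved = ("https://registry.yarnpkg.com/abbrev/-/abbrev-1.1.0.tgz"
            "#d0554c2256636e2f56e7c2e5ad183f859428d81f")
print(can_omit_resolved(resolved, "https://registry.yarnpkg.com", "abbrev", "1.1.0"))
```

Entries that fail the check would keep their full URL, so registries that serve tarballs from a different path would lose nothing; only the predictable cases would be shortened.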
yarnpkg/yarn#3330