url: parsing should not serialize windows drive letter twice #15490
Have been talking with @bmeck I think we've actually found an inconsistency in the URL parsing state machine:
parses differently in Node vs. a browser like Chrome, resulting in the path:
which is an invalid Windows path. The actual fix is going to be in the URL parsing state machine; dropping the drive letter is not to spec.
Will need to double check that this conforms with the URL standard before signing off. Ping @TimothyGu (should be fine, but just need to verify).
@jasnell it looks like a bug in url standard,
@TimothyGu, @jasnell, @bmeck and I were talking a bit before this pull; this seems to pull us closer to the current behavior of Chrome (and brings us closer to @bmeck's interpretation of the spec). Please loop me into any tracking issues I should be following ... this was fun to debug 😛
I think the central issue is:
which really shouldn't be the case if the dedicated function for converting a file URL to a path is used: Lines 166 to 168 in 75606c4
@TimothyGu this turned out to be a red herring, the root issue was that the parse method incorrectly uses the Windows drive portion of both the |
@TimothyGu no, the behavior is in url, see:
Ha! So this bug has just been fixed in the spec two days ago: whatwg/url#343. We haven't had a chance to implement the change yet, but this PR actually unwittingly implements a variant of that PR.
You might also want to import the Web Platform Tests associated with that change to test/fixtures/url-tests.js, available at web-platform-tests/wpt#7326.
src/node_url.cc
Outdated
@@ -1698,7 +1699,8 @@ void URL::Parse(const char* input,
 } else {
   if (has_base &&
       base->scheme == "file:") {
-    if (IsNormalizedWindowsDriveLetter(base->path[0])) {
+    if (IsNormalizedWindowsDriveLetter(base->path[0]) &&
+        !(remaining > 0 && IsWindowsDriveLetter(ch, p[1]))) {
The spec uses the "starts with a Windows drive letter" algorithm which checks a few other things. You might want to use that instead by factoring out
Lines 1670 to 1676 in 75606c4
if ((remaining == 0 ||
     !IsWindowsDriveLetter(ch, p[1]) ||
     (remaining >= 2 &&
      p[2] != '/' &&
      p[2] != '\\' &&
      p[2] != '?' &&
      p[2] != '#'))) {
@TimothyGu 👍 something like?
static inline bool StartsWithWindowsDriveLetter(char ch, const char* p, int remaining) {
  bool starts_with_windows_drive_letter = false;
  if (!IsWindowsDriveLetter(ch, p[1]) ||
      (remaining >= 2 &&
       p[2] != '/' &&
       p[2] != '\\' &&
       p[2] != '?' &&
       p[2] != '#')) {
    starts_with_windows_drive_letter = true;
  }
  return starts_with_windows_drive_letter;
}
whoops logic is reversed in that snippet, but you get the idea; we'd switch the code you shared to !StartsWithDriveLetter().
You might want to check remaining before accessing p[1]?
see fix in: whatwg/url#343, discussion in whatwg/url#345
@bcoe Hey uh, have you seen #15490 (review) yet?
@TimothyGu whoops missed that 👍 lol, all converging on the problem at the same time. I've got a work day at npm, Inc today, but would love to see this over the finish line tonight if you don't mind. It unblocks some other pulls I have open around getting es-module tests passing in CI. Thanks for your help.
@TimothyGu DRYd up as requested, switched tests over to test the same scenarios as whatwg.
I added some suggestions on how to make this implementation closer to the URL spec.
src/node_url.cc
Outdated
// https://url.spec.whatwg.org/#start-with-a-windows-drive-letter
static inline bool StartsWithWindowsDriveLetter(char ch,
                                                const char* p,
                                                int remaining) {
The URL spec does not use remaining here, because it isn't (well) defined when c is EOF.
So I think this function can be rewritten in a slightly simpler and more universal form:
static inline bool StartsWithWindowsDriveLetter(const char* p,
                                                const char* end) {
  const size_t length = end - p;
  return length >= 2 &&
         IsWindowsDriveLetter(p[0], p[1]) &&
         (length == 2 ||
          p[2] == '/' ||
          p[2] == '\\' ||
          p[2] == '?' ||
          p[2] == '#');
}
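For readers who want to try the check outside of node_url.cc, here is a minimal standalone sketch of the same idea. The helpers IsAsciiAlpha and IsWindowsDriveLetter below are assumptions written for this example (an ASCII letter followed by ':' or '|', as the URL standard defines a Windows drive letter); they are not copied from Node's source.

```cpp
#include <cassert>
#include <cstddef>

// Assumed helper for this sketch: true for ASCII letters only.
static inline bool IsAsciiAlpha(char c) {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Assumed helper: a Windows drive letter is an ASCII letter
// followed by ':' or '|'.
static inline bool IsWindowsDriveLetter(char first, char second) {
  return IsAsciiAlpha(first) && (second == ':' || second == '|');
}

// True when [p, end) begins with a drive letter that is either the
// whole remaining input or is followed by '/', '\\', '?' or '#'.
static inline bool StartsWithWindowsDriveLetter(const char* p,
                                                const char* end) {
  assert(p <= end);  // the range must not be inverted
  const std::size_t length = static_cast<std::size_t>(end - p);
  return length >= 2 &&
         IsWindowsDriveLetter(p[0], p[1]) &&
         (length == 2 ||
          p[2] == '/' ||
          p[2] == '\\' ||
          p[2] == '?' ||
          p[2] == '#');
}
```

Because the length check comes first, p[0], p[1] and p[2] are only read when they lie inside the range, which is the concern raised earlier about reading p[1] without checking remaining.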
src/node_url.cc
Outdated
-      p[2] != '?' &&
-      p[2] != '#'))) {
+ if (remaining == 0 ||
+     !StartsWithWindowsDriveLetter(ch, p, remaining)) {
The remaining == 0 is unnecessary, because StartsWithWindowsDriveLetter tests for an empty or too short substring, so:
- if (remaining == 0 ||
- !StartsWithWindowsDriveLetter(ch, p, remaining)) {
+ if (!StartsWithWindowsDriveLetter(p, end)) {
src/node_url.cc
Outdated
@@ -1698,7 +1710,8 @@ void URL::Parse(const char* input,
 } else {
   if (has_base &&
       base->scheme == "file:") {
-    if (IsNormalizedWindowsDriveLetter(base->path[0])) {
+    if (IsNormalizedWindowsDriveLetter(base->path[0]) &&
+        !StartsWithWindowsDriveLetter(ch, p, remaining)) {
I see your implementation is a bit different from the spec. I thought about such an implementation too, but it has a performance problem. Try the test: new URL("/c:/foo/bar", "file://host/path"). It sets URL's host to base's host here and later empties URL's host in the path state 1.4.1.1.
So I think it is better to follow the spec and avoid unnecessary steps:
 if (has_base &&
-    base->scheme == "file:") {
+    base->scheme == "file:" &&
+    !StartsWithWindowsDriveLetter(p, end)) {
-  if (IsNormalizedWindowsDriveLetter(base->path[0]) &&
-      !StartsWithWindowsDriveLetter(ch, p, remaining)) {
+  if (IsNormalizedWindowsDriveLetter(base->path[0])) {
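The effect of moving the check into the outer condition can be modelled in isolation. The sketch below is a simplified stand-in, not Node's parser: InitialFilePath, IsNormalizedDriveSegment and InputStartsWithDriveLetter are hypothetical names, the base path is passed as already-split segments, and host handling is left out entirely. It only shows when the base's drive letter is inherited.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: a normalized drive letter segment is exactly
// one ASCII letter followed by ':'.
static bool IsNormalizedDriveSegment(const std::string& s) {
  return s.size() == 2 &&
         ((s[0] >= 'a' && s[0] <= 'z') || (s[0] >= 'A' && s[0] <= 'Z')) &&
         s[1] == ':';
}

// Hypothetical helper: does the input after the leading '/' start
// with a Windows drive letter (letter, then ':' or '|', then end of
// input or one of '/', '\\', '?', '#')?
static bool InputStartsWithDriveLetter(const std::string& rest) {
  if (rest.size() < 2) return false;
  const char c = rest[0];
  const bool alpha = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
  if (!alpha || (rest[1] != ':' && rest[1] != '|')) return false;
  return rest.size() == 2 || rest[2] == '/' || rest[2] == '\\' ||
         rest[2] == '?' || rest[2] == '#';
}

// Returns the path segments the new URL starts with. `rest` is the
// input following the single leading '/'. The base's drive letter is
// inherited only when the input does not bring its own.
static std::vector<std::string> InitialFilePath(
    const std::vector<std::string>& base_path, const std::string& rest) {
  std::vector<std::string> path;
  if (!base_path.empty() &&
      !InputStartsWithDriveLetter(rest) &&
      IsNormalizedDriveSegment(base_path[0])) {
    path.push_back(base_path[0]);
  }
  return path;
}
```

Under these assumptions, an input of "D:/a/b/c.mjs" against a base path beginning with "C:" yields an empty initial path, so the "D:" segment parsed afterwards is the only drive letter, while "a/b/c.mjs" against the same base inherits "C:".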
@rmisev thank you for the thorough review, I've implemented your suggestions. I think it might be good to eventually pull in all of url/urltestdata.json to our test suite; I might do that as a follow-up pull request (thoughts, @TimothyGu?).
@bcoe We already have the entirety of urltestdata.json. It's called test/fixtures/url-test.js. See #15490 (review).
LGTM after url-test.js is updated (and the nits are addressed).
test/cctest/test_url.cc
Outdated
@@ -104,3 +104,15 @@ TEST_F(URLTest, ToFilePath) {

 #undef T
 }
+
+// https://github.com/w3c/web-platform-tests/pull/7326/files
+TEST_F(URLTest, PathDriveLetter) {
I don't think this test will be necessary once url-test.js is updated.
@@ -552,6 +552,19 @@ static inline bool IsSpecial(std::string scheme) {
   return false;
 }
+
+// https://url.spec.whatwg.org/#start-with-a-windows-drive-letter
nit: add a https://
I think I'm missing something:
// https://url.spec.whatwg.org/#start-with-a-windows-drive-letter
Does have https, perhaps misread?
Oh I'm sorry. It's probably a Chrome extension I'm using.
Would you mind updating the commit hash in the file header for url-tests.js as well? node/test/fixtures/url-tests.js Line 5 in cd1b55a
Address issue with Windows drive letter handling that was causing es-module test suite to fail, see: whatwg/url#343
@TimothyGu I think I've addressed your comments.
Landed in 456d8e2
Address issue with Windows drive letter handling that was causing es-module test suite to fail. PR-URL: #15490 Ref: whatwg/url#343 Reviewed-By: Timothy Gu <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Ruben Bridgewater <[email protected]>
es-modules do not currently work on Windows (we missed this when landing them initially because we'd missed adding a variable to vcbuild.bat/Makefile).
The underlying issue was that node::url::URL.path() was serializing Windows-style paths incorrectly if there was a drive letter in both the path and the base: resolve("/D:/a/b/c.mjs", "file:///C:/a/b/c") would result in an incorrect parse of /C:/D:/a/b/c.mjs. The same parse in Chrome resolves to /D:/a/b/c.mjs.
Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes
Affected core subsystem(s)
url, es-module
reviewer: @bmeck, @jasnell