src: support domains with empty labels in i18n #12707
cc @nodejs/url
Since long domain name labels and long domain names are invalid already, it might not need to ignore `UIDNA_ERROR_LABEL_TOO_LONG` and `UIDNA_ERROR_DOMAIN_NAME_TOO_LONG`.
Can you elaborate on this? What do you mean by "invalid already"?
src/node_i18n.cc
Outdated
@@ -461,6 +461,9 @@ int32_t ToUnicode(MaybeStackBuffer<char>* buf,
                           &status);
  }

  if (info.errors != 0)
    info.errors &= ~UIDNA_ERROR_EMPTY_LABEL;
I'd leave off the `if`, as the `&` will do nothing if the error is not present, and I'd add a reference to https://url.spec.whatwg.org/#concept-domain-to-ascii (especially the VerifyDnsLength bit) for why we are ignoring this error.
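To illustrate the point about dropping the `if`, here is a minimal sketch. The constants `kEmptyLabel` and `kOtherError` are hypothetical stand-ins rather than ICU's real definitions, and `ClearEmptyLabel` is a name made up for this example. It shows that clearing a bit with `& ~mask` leaves the value unchanged when the bit was never set, so the guard adds nothing.

```cpp
#include <cstdint>

// Hypothetical stand-ins for bits in an error bit set such as info.errors.
// The values are illustrative and are not copied from the ICU headers.
constexpr std::uint32_t kEmptyLabel = 1u << 0;
constexpr std::uint32_t kOtherError = 1u << 3;

// Clears the empty-label bit unconditionally. If the bit is not set,
// the result equals the input, so no `if (errors != 0)` guard is needed.
constexpr std::uint32_t ClearEmptyLabel(std::uint32_t errors) {
  return errors & ~kEmptyLabel;
}
```

Any other error bits pass through untouched, so a later "are there errors?" check still sees them.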
@TimothyGu Sorry for my unclear description. I meant we cannot ignore --
dd3b613 to f06dd84
src/node_i18n.cc
Outdated
@@ -500,6 +503,11 @@ int32_t ToASCII(MaybeStackBuffer<char>* buf,
                         &status);
  }

  // https://url.spec.whatwg.org/#concept-domain-to-ascii
I'd add an explicit reference to the VerifyDnsLength option.
// The WHATWG URL "domain to ASCII" algorithm explicitly sets the
// VerifyDnsLength flag to false, which disables the domain name length
// verification step in ToASCII (as specified by UTS #46). Unfortunately,
// ICU4C's IDNA module does not support disabling this flag through `options`,
// so just filter out the errors that may be caused by the verification step
// afterwards.
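A rough sketch of the "filter out the errors afterwards" idea, assuming the verification step can raise the three length-related errors named in this thread. The constants and the `FilterDnsLengthErrors` / `ConversionSucceeded` helpers are hypothetical stand-ins written for illustration; they are not ICU's definitions or the code in this PR.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the ICU error bits discussed in this thread.
constexpr std::uint32_t kErrEmptyLabel = 1u << 0;
constexpr std::uint32_t kErrLabelTooLong = 1u << 1;
constexpr std::uint32_t kErrDomainNameTooLong = 1u << 2;
constexpr std::uint32_t kErrDisallowed = 1u << 7;

// Errors assumed to come only from the DNS length verification step.
constexpr std::uint32_t kDnsLengthErrors =
    kErrEmptyLabel | kErrLabelTooLong | kErrDomainNameTooLong;

// Emulates VerifyDnsLength=false by dropping the length-related errors
// after the conversion has already run.
constexpr std::uint32_t FilterDnsLengthErrors(std::uint32_t errors) {
  return errors & ~kDnsLengthErrors;
}

// The conversion counts as successful only if no other error remains.
constexpr bool ConversionSucceeded(std::uint32_t errors) {
  return FilterDnsLengthErrors(errors) == 0;
}
```

With this shape, an input that is only too long or has an empty label is accepted, while one that also contains a disallowed code point is still rejected.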
Thank you for the actual comment example! I put it into the commit :)
src/node_i18n.cc
Outdated
@@ -461,6 +461,9 @@ int32_t ToUnicode(MaybeStackBuffer<char>* buf,
                           &status);
  }

  // https://url.spec.whatwg.org/#concept-domain-to-unicode
  info.errors &= ~UIDNA_ERROR_EMPTY_LABEL;
Hmm, this is a bit tricky. A more detailed comment like the following is warranted:
// UTS #46's ToUnicode operation applies no validation of domain name length
// (nor a flag requesting it to do so, like VerifyDnsLength for ToASCII). For
// that reason, unlike ToASCII below, ICU4C correctly accepts long domain
// names. However, ICU4C still sets the EMPTY_LABEL error, contrary to UTS
// #46. Therefore, explicitly filter out that error here.
Follow the spec of domainToASCII/domainToUnicode in whatwg, and synchronise WPT url test data. Refs: web-platform-tests/wpt#5397
f06dd84 to d080abb
Many thanks for upstreaming the WPT changes as well!
Follow the spec of domainToASCII/domainToUnicode in whatwg, and synchronise WPT url test data.
Refs: web-platform-tests/wpt#5397
PR-URL: #12707
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Timothy Gu <[email protected]>
Landed in 0f58d3c
Looks like we should backport this to v7.x, no? /cc @watilde
Yes, we should backport it to v7. I'm not sure whether I need to make a separate patch for it; I'll check whether it causes conflicts.
Summary
`domainToUnicode` and `domainToASCII` could be used in the middle of parsing the origin, which means the input value could be a non-final domain name label. The converter should therefore ignore the `UIDNA_ERROR_EMPTY_LABEL` error. Since long domain name labels and long domain names are invalid already, it might not need to ignore `UIDNA_ERROR_LABEL_TOO_LONG` and `UIDNA_ERROR_DOMAIN_NAME_TOO_LONG`.
Refs:
node/test/fixtures/url-idna.js
Lines 192 to 201 in 6c21397
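For readers unfamiliar with what an "empty label" is, a small sketch under simplifying assumptions: it only splits on ASCII `.` and does none of the UTS #46 mapping or Punycode handling. `CountEmptyLabels` is a hypothetical helper written for this illustration and is not part of Node.js or ICU.

```cpp
#include <cstddef>
#include <string>

// Counts the zero-length labels obtained by splitting `domain` on '.'.
// A trailing dot (or an empty string) yields an empty final label.
inline std::size_t CountEmptyLabels(const std::string& domain) {
  std::size_t empty = 0;
  std::size_t label_length = 0;
  for (char c : domain) {
    if (c == '.') {
      if (label_length == 0) ++empty;  // two dots in a row, or a leading dot
      label_length = 0;
    } else {
      ++label_length;
    }
  }
  if (label_length == 0) ++empty;  // final label after the last dot
  return empty;
}
```

An input such as `a..b` contains one empty label in the middle. That is the kind of value a converter would flag with an empty-label error, and the kind this change accepts.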
Updates
Checklist
make -j4 test
Affected core subsystem(s)
src, test