Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dnsPromises Resolver with IDN certain subdomains #39139

Closed
7c opened this issue Jun 24, 2021 · 11 comments
Closed

dnsPromises Resolver with IDN certain subdomains #39139

7c opened this issue Jun 24, 2021 · 11 comments
Labels
dns Issues and PRs related to the dns subsystem. i18n-api Issues and PRs related to the i18n implementation. icu Issues and PRs related to the ICU dependency.

Comments

@7c
Copy link

7c commented Jun 24, 2021

Certain version does throw error:

TypeError: Cannot convert name to ASCII
    at Resolver.queryA [as resolve4] (dns.js:227:48)

with MacOS BigSur Darwin Kernel Version 20.5.0, node v14.15.4 but does work with node v10.15.3 (ubuntu) and v10.24.1 (ubuntu).

Code which does this unexpected behaviour:

var { Resolver } = require('dns');
var resolver = new Resolver()
resolver.setServers(['8.8.8.8'])

function nslookup(host) {
    return new Promise(async function (resolve,reject) {
        console.log(`nslookup(${host})`)
        resolver.resolve4(host,function(err,ips) {
            if (err) {
                console.log(err)
                return resolve(false)
            }
            console.log(ips)
            return resolve(true)
        })
    })
    
}

async function start() {
    await nslookup('a.xn--ngbq.com')        // success
    await nslookup('ab.xn--ngbq.com')       // success
    await nslookup('a0.xn--ngbq.com')       // success
    await nslookup('1.xn--ngbq.com')        // fails
    await nslookup('10000.xn--ngbq.com')    // fails

    
}

start()

for some reason, a valid IDN domain like xn--nqbg.com does with alphanumeric hosts but does not work with number subdomains at my bigsur mac with node v14.15.4, upgraded to v14.17.1 and it still fails

@Ayase-252 Ayase-252 added the dns Issues and PRs related to the dns subsystem. label Jun 24, 2021
@Ayase-252
Copy link
Member

Ayase-252 commented Jun 24, 2021

It might be introduced in #25679

@targos
Copy link
Member

targos commented Jun 26, 2021

It looks like a bug related to ICU:

> internalBinding('icu').toASCII('a10000.xn--ngbq.com')
'a10000.xn--ngbq.com'
> internalBinding('icu').toASCII('10000.xn--ngbq.com')
Uncaught [TypeError: Cannot convert name to ASCII] {
  code: 'ERR_INVALID_ARG_VALUE'
}

node/src/node_i18n.cc

Lines 598 to 671 in 3c09987

int32_t ToASCII(MaybeStackBuffer<char>* buf,
const char* input,
size_t length,
enum idna_mode mode) {
UErrorCode status = U_ZERO_ERROR;
uint32_t options = // CheckHyphens = false; handled later
UIDNA_CHECK_BIDI | // CheckBidi = true
UIDNA_CHECK_CONTEXTJ | // CheckJoiners = true
UIDNA_NONTRANSITIONAL_TO_ASCII; // Nontransitional_Processing
if (mode == IDNA_STRICT) {
options |= UIDNA_USE_STD3_RULES; // UseSTD3ASCIIRules = beStrict
// VerifyDnsLength = beStrict;
// handled later
}
UIDNA* uidna = uidna_openUTS46(options, &status);
if (U_FAILURE(status))
return -1;
UIDNAInfo info = UIDNA_INFO_INITIALIZER;
int32_t len = uidna_nameToASCII_UTF8(uidna,
input, length,
**buf, buf->capacity(),
&info,
&status);
if (status == U_BUFFER_OVERFLOW_ERROR) {
status = U_ZERO_ERROR;
buf->AllocateSufficientStorage(len);
len = uidna_nameToASCII_UTF8(uidna,
input, length,
**buf, buf->capacity(),
&info,
&status);
}
// In UTS #46 which specifies ToASCII, certain error conditions are
// configurable through options, and the WHATWG URL Standard promptly elects
// to disable some of them to accommodate for real-world use cases.
// Unfortunately, ICU4C's IDNA module does not support disabling some of
// these options through `options` above, and thus continues throwing
// unnecessary errors. To counter this situation, we just filter out the
// errors that may have happened afterwards, before deciding whether to
// return an error from this function.
// CheckHyphens = false
// (Specified in the current UTS #46 draft rev. 18.)
// Refs:
// - https://github.com/whatwg/url/issues/53
// - https://github.com/whatwg/url/pull/309
// - http://www.unicode.org/review/pri317/
// - http://www.unicode.org/reports/tr46/tr46-18.html
// - https://www.icann.org/news/announcement-2000-01-07-en
info.errors &= ~UIDNA_ERROR_HYPHEN_3_4;
info.errors &= ~UIDNA_ERROR_LEADING_HYPHEN;
info.errors &= ~UIDNA_ERROR_TRAILING_HYPHEN;
if (mode != IDNA_STRICT) {
// VerifyDnsLength = beStrict
info.errors &= ~UIDNA_ERROR_EMPTY_LABEL;
info.errors &= ~UIDNA_ERROR_LABEL_TOO_LONG;
info.errors &= ~UIDNA_ERROR_DOMAIN_NAME_TOO_LONG;
}
if (U_FAILURE(status) || (mode != IDNA_LENIENT && info.errors != 0)) {
len = -1;
buf->SetLength(0);
} else {
buf->SetLength(len);
}
uidna_close(uidna);
return len;
}

/cc @nodejs/i18n-api

@targos targos added i18n-api Issues and PRs related to the i18n implementation. icu Issues and PRs related to the ICU dependency. labels Jun 26, 2021
@srl295
Copy link
Member

srl295 commented Jun 26, 2021

It looks like a bug related to ICU:


> internalBinding('icu').toASCII('a10000.xn--ngbq.com')

'a10000.xn--ngbq.com'

> internalBinding('icu').toASCII('10000.xn--ngbq.com')

Uncaught [TypeError: Cannot convert name to ASCII] {

  code: 'ERR_INVALID_ARG_VALUE'

}

node/src/node_i18n.cc

Lines 598 to 671 in 3c09987

int32_t ToASCII(MaybeStackBuffer<char>* buf,
const char* input,
size_t length,
enum idna_mode mode) {
UErrorCode status = U_ZERO_ERROR;
uint32_t options = // CheckHyphens = false; handled later
UIDNA_CHECK_BIDI | // CheckBidi = true
UIDNA_CHECK_CONTEXTJ | // CheckJoiners = true
UIDNA_NONTRANSITIONAL_TO_ASCII; // Nontransitional_Processing
if (mode == IDNA_STRICT) {
options |= UIDNA_USE_STD3_RULES; // UseSTD3ASCIIRules = beStrict
// VerifyDnsLength = beStrict;
// handled later
}
UIDNA* uidna = uidna_openUTS46(options, &status);
if (U_FAILURE(status))
return -1;
UIDNAInfo info = UIDNA_INFO_INITIALIZER;
int32_t len = uidna_nameToASCII_UTF8(uidna,
input, length,
**buf, buf->capacity(),
&info,
&status);
if (status == U_BUFFER_OVERFLOW_ERROR) {
status = U_ZERO_ERROR;
buf->AllocateSufficientStorage(len);
len = uidna_nameToASCII_UTF8(uidna,
input, length,
**buf, buf->capacity(),
&info,
&status);
}
// In UTS #46 which specifies ToASCII, certain error conditions are
// configurable through options, and the WHATWG URL Standard promptly elects
// to disable some of them to accommodate for real-world use cases.
// Unfortunately, ICU4C's IDNA module does not support disabling some of
// these options through `options` above, and thus continues throwing
// unnecessary errors. To counter this situation, we just filter out the
// errors that may have happened afterwards, before deciding whether to
// return an error from this function.
// CheckHyphens = false
// (Specified in the current UTS #46 draft rev. 18.)
// Refs:
// - https://github.com/whatwg/url/issues/53
// - https://github.com/whatwg/url/pull/309
// - http://www.unicode.org/review/pri317/
// - http://www.unicode.org/reports/tr46/tr46-18.html
// - https://www.icann.org/news/announcement-2000-01-07-en
info.errors &= ~UIDNA_ERROR_HYPHEN_3_4;
info.errors &= ~UIDNA_ERROR_LEADING_HYPHEN;
info.errors &= ~UIDNA_ERROR_TRAILING_HYPHEN;
if (mode != IDNA_STRICT) {
// VerifyDnsLength = beStrict
info.errors &= ~UIDNA_ERROR_EMPTY_LABEL;
info.errors &= ~UIDNA_ERROR_LABEL_TOO_LONG;
info.errors &= ~UIDNA_ERROR_DOMAIN_NAME_TOO_LONG;
}
if (U_FAILURE(status) || (mode != IDNA_LENIENT && info.errors != 0)) {
len = -1;
buf->SetLength(0);
} else {
buf->SetLength(len);
}
uidna_close(uidna);
return len;
}

/cc @nodejs/i18n-api

Can you repro with raw icu code or maybe https://icu4c-demos.unicode.org/icu-bin/idnbrowser ? If so please file an icu bug

@targos
Copy link
Member

targos commented Jun 27, 2021

/cc @jasnell @TimothyGu since you both worked on this function.

@TimothyGu
Copy link
Member

10000.xn--ngbq.com is not a valid IDN, because it violates the IDNA 2008 Bidi rule (RFC 5893 § 2). This is why an exception is (correctly) thrown.

The label xn--ngbq, when decoded, gives the Arabic characters U+0628 ب and U+0630 ذ, and so would be considered a "RTL label". This renders the entire domain name a "Bidi domain name". The Bidi rule states that if the IDN is a Bidi domain name, then it must follow a set of criteria in order to be valid. The first of them says:

  1. The first character [of a label] must be a character with Bidi property L, R, or AL.

This criterion essentially means that each label must be unambiguously LTR or RTL. Here, a has property L, so that's fine. However, 1 has Bidi property EN, and thus fails the test, invalidating the entire URL.


The reason for this rule to exist is captured in RFC 5893 § 3. But here's a simple explanation. Try to paste a12.بذ.com to the address box of your favorite browser, and you'll see that it's rendered "properly":

Chrome:
image

Firefox:
image

But if you try 12.بذ.com, and you'll see that it's rendered in an odd (and worse, potentially ambiguous) order:

Chrome:
image

Firefox:
image

@7c
Copy link
Author

7c commented Jun 28, 2021

on my browser
image
i can indeed enter these hostnames, more than this, dns system seems to ignore this RFC, because it can be resolved too. So all dns system seems to ignore this rule then

@TimothyGu
Copy link
Member

You might be able to enter the xn-- encoded domains, but you can’t enter the actual Unicode domains, at least on Chrome. Indeed, it’s unfortunate that DNS providers don’t enforce these rules. But they really should.

@srl295
Copy link
Member

srl295 commented Jun 28, 2021

@TimothyGu 💯 for digging into this. Sounds right.

@7c I think (correct me if I'm wrong) the xn- domain is valid DNS, but not valid IDN.

@TimothyGu just to be pedantic, I think what you mean is that registrars should enforce this, and not allow registration of such a domain? Or that the DNS implementation should IDN-validate actual records? (At least with a warning)

Everyone: I think this should be closed as not a bug, unless a regression test should be added to verify that an error is correctly thrown.

@7c
Copy link
Author

7c commented Jun 28, 2021

@srl295 we have registered xn--ngbq.com as valid domain from a registrar, we could set it up at AWS route53 properly, it does indeed resolve properly through DNS system. How we detected this issue is; we do run tests with subdomains and we randomly take Date.now().$domain for test. We have seen that the tests fail on MACOS(node 14.latest) but do not fail on ubuntu (node 10.x).

@srl295
Copy link
Member

srl295 commented Jun 28, 2021

@7c

it does indeed resolve properly through DNS system
… 
dns system seems to ignore this RFC

Yes, 10000.xn--ngbq.com is valid DNS, but invalid IDN. I can resolve it as well. It shouldn't be used in URLs.
DNS isn't ignoring the RFC. DNS simply did not make 10000.xn--ngbq.com invalid.

To use another analogy, 111011111011111110111110 is valid binary, so you can create a character string with it, but it's invalid Unicode, so you can't use it to display a message in a string.

$ dig 10000.xn--ngbq.com
10000.xn--ngbq.com.	172800	IN	A … 

@srl295 we have registered xn--ngbq.com as valid domain from a registrar, we could set it up at AWS route53 properly, it does indeed resolve properly through DNS system. How we detected this issue is; we do run tests with subdomains and we randomly take Date.now().$domain for test. We have seen that the tests fail on MACOS(node 14.latest) but do not fail on ubuntu (node 10.x).

@TimothyGu is correct about the spec, and it's actually a bug in Node 10.x and you should have received the error previously.

I would recommend that in your test, you check:

require('url').domainToASCII('10000.xn--ngbq.com')

…and if it returns '' then do not use that domain in your test

@7c
Copy link
Author

7c commented Jun 28, 2021

roger; yes it sounds like a bug in 10.x. I have tested 12.x and it also correctly fails... So this seems to be a <=10.x bug. You guys have at least a valid case for a test case where this error should be thrown. It is probably hard to find such edge cases but required for a project like nodejs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dns Issues and PRs related to the dns subsystem. i18n-api Issues and PRs related to the i18n implementation. icu Issues and PRs related to the ICU dependency.
Projects
None yet
Development

No branches or pull requests

5 participants