-
Notifications
You must be signed in to change notification settings - Fork 386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLDR-16941 Improve coverage for non-latn default/native number symbols/patterns #3690
CLDR-16941 Improve coverage for non-latn default/native number symbols/patterns #3690
Conversation
Are we also tracking whether the equivalent number formats to those currently in basic where the locale uses native digits by default in a separate ticket for resolution in esub or a future release instead (since they are available at a higher coverage level if we do need to change them in v46)? |
@AEApple Not sure I understand the question. We had coverage in certain locales for the number symbols and patterns for default and native number systems, at either moderate or modern coverage depending on the item. What this ticket does is ensure that we do that across all locales for all of the number systems that are default, native, or finance systems in those locales. |
@@ -1278,7 +1278,7 @@ CLDR data files are interpreted according to the LDML specification (http://unic | |||
<!ATTLIST coverageLevel inLanguage CDATA #IMPLIED > | |||
<!--@MATCH:any--> | |||
<!ATTLIST coverageLevel inScript CDATA #IMPLIED > | |||
<!--@MATCH:validity/script--> | |||
<!--@MATCH:any--> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so we can use coverage variables for inScript, as we already do for inLanguage and inTerritory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ideally @MATCH:validity/script-with-variables
or something
Larger question is whether coverage level belongs in "supplemental data" when, like casing, it's primarily inward tool-focused data. Perhaps we should split some of these out -- and give them their own DTD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great. A lot of work, though! Longer term, I think we need to set aside some time to review the structure of the coverage levels, and some up with a more concise format to lower the maintenance burden. We already talked about expanding elements; we also might have more predefined variables (that we can construct at boot time based on what's in a particular locale). And other ideas.
<coverageVariable key="%guruNativeLanguages" value="(pa)"/> | ||
<coverageVariable key="%hanidecNativeLanguages" value="(yue|zh)"/> | ||
<!-- Number system coverage by language and/or script; default number system items at various levels, native & finance system items all at modern --> | ||
<coverageVariable key="%adlmDefaultScripts" value="(Adlm)"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't think any of this belongs in data at all. It should be read from the locale files themselves and cached. But.. it's an improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -265,29 +283,34 @@ For terms of use, see http://www.unicode.org/copyright.html | |||
<coverageLevel value="moderate" inLanguage="%traditionalCollationLanguages" match="localeDisplayNames/types/type[@key='collation'][@type='traditional']"/> | |||
<coverageLevel value="moderate" inLanguage="%CJK_Languages" match="localeDisplayNames/types/type[@key='collation'][@type='unihan']"/> | |||
<coverageLevel value="moderate" match="localeDisplayNames/types/type[@key='numbers'][@type='latn']"/> | |||
<coverageLevel value="moderate" inLanguage="zh" match="localeDisplayNames/types/type[@key='numbers'][@type='(han(s|t)(fin)?|hanidec)']"/> | |||
<coverageLevel value="moderate" inLanguage="%hanidecNativeLanguages" match="localeDisplayNames/types/type[@key='numbers'][@type='(han(s|t)(fin)?|hanidec)']"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<coverageLevel value="moderate" inLanguage="%hanidecNativeLanguages" match="localeDisplayNames/types/type[@key='numbers'][@type='(han(s|t)(fin)?|hanidec)']"/> | |
<coverageLevel value="moderate" inLanguage="%hanidecNativeLanguages" match="localeDisplayNames/types/type[@key='numbers'][@type='(hans|hant)(fin)?|hanidec)']"/> |
more readable.. or, spell all of them out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed this in the follow-on PR.
<coverageLevel value="moderate" inScript="%mteiDefaultScripts" match="numbers/symbols[@numberSystem='mtei']/group"/> | ||
<coverageLevel value="moderate" inScript="%mteiDefaultScripts" match="numbers/symbols[@numberSystem='mtei']/minusSign"/> | ||
<coverageLevel value="moderate" inScript="%mteiDefaultScripts" match="numbers/symbols[@numberSystem='mtei']/percentSign"/> | ||
<coverageLevel value="moderate" inScript="%mteiDefaultScripts" match="numbers/symbols[@numberSystem='mtei']/plusSign"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'll say again, I appreciate this being done.
drop the data from coverage*.xml entirely and derive it from the locale files |
Covered by https://unicode-org.atlassian.net/browse/CLDR-17618 |
CLDR-16941
latn
default, native, and finance numbering systems in all locales. In some cases we select coverage by script instead of or in addition to language, to disambiguate cases where languages are written in multiple scripts.ALLOW_MANY_COMMITS=true