Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICU-21947 Replace FixedDecimal with DecimalQuantity in PluralRule sample parsing #2007

Merged
merged 1 commit into from
Aug 11, 2022

Conversation

echeran
Copy link
Contributor

@echeran echeran commented Feb 25, 2022

Checklist
  • Required: Issue filed: https://unicode-org.atlassian.net/browse/ICU-21947
  • Required: The PR title must be prefixed with a JIRA Issue number.
  • Required: The PR description must include the link to the Jira Issue, for example by completing the URL in the first checklist item
  • Required: Each commit message must be prefixed with a JIRA Issue number.
  • Issue accepted (done by Technical Committee after discussion)
  • Tests included, if applicable
  • API docs and/or User Guide docs changed or added, if applicable

@echeran echeran changed the title ICU-21322 Replace FixedDecimal with FormattedNumber in PluralRule sample parsing ICU-21322 Replace FixedDecimal with DecimalQuantity in PluralRule sample parsing Mar 10, 2022
@echeran echeran marked this pull request as ready for review March 10, 2022 20:50
@echeran echeran requested a review from sffc March 10, 2022 20:50
Copy link
Member

@macchiati macchiati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you ensure that CLDR can build with these changes? (You can't delete FixedDecimal, etc. unless that is true.)

@echeran
Copy link
Contributor Author

echeran commented Mar 11, 2022

Yes, I haven't yet gotten to the CLDR part, but you're right, this shouldn't be merged without testing it together with a PR on the CLDR side.

FYI, FixedDecimal isn't being removed, this PR is only attempting to remove its parse function. The current implementation for FixedDecimal parsing is flawed for compact notation numbers -- the resulting FixedDecimals sometimes give incorrect values for certain plural operands (but somehow, so far, it happens in a way that hasn't burned us....yet). So I thought it made sense to remove parsing for FixedDecimals so that we don't reach for it again, now that we have the proper corresponding replacement DecimalQuantity via #2012 with reliable parse and format functions.

Another FYI, the file that this PR is attempting to remove is PluralSamples.java. I thought to do so because: 1) it is not being referenced anywhere, and isn't running a test, 2) it still uses FixedDecimal (as stated above, we should stop using), 3) relevant functionality seems duplicated in PluralRules.java, 4) it's marked as @deprecated with class-level Java doc saying "Refactor samples as first step to moving into CLDR". I think that is reasonable, but let me know if I'm missing something regarding PluralSamples.java.

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is too disruptive of a change to land right now, and C++ is not done. I suggest holding this until at least after RC.

Copy link
Contributor Author

@echeran echeran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know what you think of these changes to the ICU side of work based on the review feedback so far.

FYI, in parallel, I'm working on the CLDR side of work, over in PR #1800. I've managed to reverse-engineer the instructions for how to do this type of ICU-CLDR tandem development, I've gotten the tests to compile, and I've gotten test failures down to 6 cases across 2 tests.

@echeran echeran requested a review from sffc March 16, 2022 16:51
@echeran echeran changed the title ICU-21322 Replace FixedDecimal with DecimalQuantity in PluralRule sample parsing ICU-21947 Replace FixedDecimal with DecimalQuantity in PluralRule sample parsing Mar 16, 2022
Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done with PluralRules.java

@macchiati
Copy link
Member

macchiati commented Mar 18, 2022 via email

@echeran
Copy link
Contributor Author

echeran commented Mar 19, 2022

A good test would be that each of the samples in the plural and ordinal rules in CLDR round-trip; you get exactly the same results from ICU code. This doesn't fully test what CLDR does, because it also selects the samples using code that depends on FixedDecimal ...

@macchiati I believe that we already have tests in ICU that do what you were describing as occurring in CLDR. I just now added source comments on relevant tests to clarify what each test is actually doing because it is not always obvious. So for the round trip that goes plural rule -> keyword(s) -> sample(s) -> select -> keyword (round trip on the plural rule keyword), the relevant test is PluralRulesTest.TestDecimalQuantitySamples. This is performed over all locales available via API. If the round trip you were thinking of was the narrower concern of sample string -> parse -> format -> sample string, we did that in the previous PR #2012 .

And the CLDR tests are still passing when I run the corresponding CLDR changes unicode-org/cldr#1800 against this PR, via the instructions in the CLDR PR. So long as there are tests in CLDR that cover any extra functionality we are concerned about, then we have checked all the impacts that we know how to check.

@echeran echeran requested review from macchiati and sffc March 19, 2022 05:02
sffc
sffc previously approved these changes Mar 27, 2022
Comment on lines 295 to 303
if (!samplesString.matches(".*\\d+e[-]?\\d+.*")) {
assertEquals("samples; " + title, samplesString, samples.toString());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: do a global replacement for e with c in samplesString and then perform the equality check as usual

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

sffc
sffc previously approved these changes Aug 10, 2022
@sffc
Copy link
Member

sffc commented Aug 10, 2022

Looks like I approved these changes 6 months ago, and the changes since then look reasonable, so I will approve again :)

@echeran
Copy link
Contributor Author

echeran commented Aug 11, 2022

@macchiati You still have the "changes requested" bit set that blocks merging. How does this PR look, and how does the above discussion sound?

Copy link
Member

@macchiati macchiati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good, but I'm concerned about the staging. We use FixedDecimal heavily in CLDR. If that disappears, we would not be able to update to the latest version of ICU until CLDR is fixed to work around it. We're heads-to-the ground trying to get the freeze done by the 17th.

Would it be feasible to leave FixedDecimal around so that we have the chance to get rid of it when the timing is better (and having both of them work at the same time, to ease the migration)?

@markusicu markusicu requested a review from macchiati August 11, 2022 16:32
macchiati
macchiati previously approved these changes Aug 11, 2022
@echeran echeran dismissed stale reviews from macchiati and sffc via 6c1fea4 August 11, 2022 19:56
@echeran echeran force-pushed the fixed-dec-formatted-num branch from 273b9eb to 6c1fea4 Compare August 11, 2022 19:56
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

Copy link
Member

@markusicu markusicu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rslgtm

@echeran echeran merged commit 3ef03a4 into unicode-org:main Aug 11, 2022
@echeran echeran deleted the fixed-dec-formatted-num branch August 11, 2022 22:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants