ICU-22940 MF2 ICU4C: Error checking improvements in parser #3306

catamorphism · 2024-12-13T23:48:22Z

Improve checking for OOM errors when allocating UnicodeSets, per post-merge comments on #3236

Checklist

Required: Issue filed: ICU-22940
Required: The PR title must be prefixed with a JIRA Issue number. Example: "ICU-1234 Fix xyz"
Required: The PR description must include the link to the Jira Issue, for example by completing the URL in the first checklist item
Required: Each commit message must be prefixed with a JIRA Issue number. Example: "ICU-1234 Fix xyz"
Issue accepted (done by Technical Committee after discussion)
Tests included, if applicable
API docs and/or User Guide docs changed or added, if applicable

icu4c/source/i18n/messageformat2_parser.cpp

FrankYFTang

Thanks for working on this.

FrankYFTang · 2024-12-16T21:02:37Z

icu4c/source/i18n/messageformat2_parser.cpp

    }

-    UnicodeSet* result = new UnicodeSet(*unisets::getImpl(unisets::ALPHA));
+    unisets::gUnicodeSets[unisets::ALPHA] = initAlpha(status);


Why not just:

    UnicodeSet* isAlpha = unisets::gUnicodeSets[unisets::ALPHA] = initAlpha(status);
    if (U_FAILURE(status)) { return nullptr; }

Fixed in e2990d2

FrankYFTang · 2024-12-16T21:07:17Z

icu4c/source/i18n/messageformat2_parser.cpp

    }

+    unisets::gUnicodeSets[unisets::NAME_START] = initNameStartChars(status);


I think you can simplify this to:

    UnicodeSet* nameStart = unisets::gUnicodeSets[unisets::NAME_START] = initNameStartChars(status);
    UnicodeSet* digit = unisets::gUnicodeSets[unisets::DIGIT] = initDigits(status);
    if (U_FAILURE(status)) { return nullptr; }

Fixed in e2990d2
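
For readers following this thread, here is a minimal sketch of why a single status check after several chained assignments can be enough. It assumes each initializer returns early once the status already holds a failure. `Status`, `CharSet`, `gSets`, `allocSet`, `gSimulateOom` and the `*Sketch` functions are hypothetical stand-ins written for this illustration; they are not ICU's `UErrorCode`, `UnicodeSet` or `unisets::gUnicodeSets`, and this is not the code that landed in the PR.

```cpp
#include <cassert>
#include <new>
#include <set>

// Hypothetical stand-ins for ICU's UErrorCode and UnicodeSet (not the real types).
enum class Status { Ok, OutOfMemory };
using CharSet = std::set<char32_t>;

bool failed(Status s) { return s != Status::Ok; }

// Assumed shared table, playing the role of unisets::gUnicodeSets.
enum Key { NAME_START, DIGIT, KEY_COUNT };
CharSet* gSets[KEY_COUNT] = {};

// Sketch-only hook: when true, every allocation is treated as failed.
bool gSimulateOom = false;

// Allocates an empty set, or does nothing if an earlier step already failed.
CharSet* allocSet(Status& status) {
    if (failed(status)) { return nullptr; }
    CharSet* s = gSimulateOom ? nullptr : new (std::nothrow) CharSet();
    if (s == nullptr) { status = Status::OutOfMemory; }
    return s;
}

CharSet* initNameStartSketch(Status& status) {
    CharSet* s = allocSet(status);
    if (s != nullptr) {
        for (char32_t c = U'a'; c <= U'z'; ++c) { s->insert(c); }
        s->insert(U'_');
    }
    return s;
}

CharSet* initDigitsSketch(Status& status) {
    CharSet* s = allocSet(status);
    if (s != nullptr) {
        for (char32_t c = U'0'; c <= U'9'; ++c) { s->insert(c); }
    }
    return s;
}

// Chained assignment stores each set in the table and in a local.
// Because every initializer is a no-op after a failure, one check covers both calls.
CharSet* initNameCharsSketch(Status& status) {
    CharSet* nameStart = gSets[NAME_START] = initNameStartSketch(status);
    CharSet* digit = gSets[DIGIT] = initDigitsSketch(status);
    if (failed(status)) { return nullptr; }
    assert(nameStart != nullptr && digit != nullptr);
    CharSet* result = new (std::nothrow) CharSet(*nameStart);
    if (result == nullptr) { status = Status::OutOfMemory; return nullptr; }
    result->insert(digit->begin(), digit->end());
    return result;
}
```

Note that if the first initializer succeeds and the second fails, the first set stays in the table, so something still has to release it on the failure path.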

FrankYFTang · 2024-12-16T21:08:38Z

icu4c/source/i18n/messageformat2_parser.cpp

    }

+    unisets::gUnicodeSets[unisets::CONTENT] = initContentChars(status);


I think you can simplify this to:

    UnicodeSet* content = unisets::gUnicodeSets[unisets::CONTENT] = initContentChars(status);
    UnicodeSet* whitespace = unisets::gUnicodeSets[unisets::WHITESPACE] = initWhitespace(status);

Fixed in e2990d2

FrankYFTang

Thanks

FrankYFTang · 2024-12-17T00:06:42Z

icu4c/source/i18n/messageformat2_parser.cpp

+         initTextChars depends on
+            initContentChars
+            initWhitespace
+     */


I think that at the end of this function we should do the following:

    if (U_FAILURE(status)) { cleanupMF2ParseUniSets(); }

That way, if initialization fails because of memory pressure, every UnicodeSet already allocated in gUnicodeSets is deleted and gMF2ParseUniSetsInitOnce is then reset, so a later call has a second chance to succeed.

Made the change in 7f19865 -- can you re-approve? Thanks!
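
A minimal sketch of the retry behavior described above, assuming a mutex-guarded flag in place of ICU's init-once object. Every name here (`Status`, `CharSet`, `gSets`, `gInitDone`, `gAllocBudget`, `allocSet`, `cleanupSets`, `initSets`, `getSet`) is hypothetical and exists only for this illustration; the real change uses `cleanupMF2ParseUniSets()` and `gMF2ParseUniSetsInitOnce`, whose internals are not shown in this thread.

```cpp
#include <mutex>
#include <new>
#include <set>

// Hypothetical stand-ins for ICU's UErrorCode and UnicodeSet (not the real types).
enum class Status { Ok, OutOfMemory };
using CharSet = std::set<char32_t>;

bool failed(Status s) { return s != Status::Ok; }

enum Key { CONTENT, WHITESPACE, TEXT, KEY_COUNT };
CharSet* gSets[KEY_COUNT] = {};   // plays the role of gUnicodeSets
std::mutex gInitMutex;
bool gInitDone = false;           // plays the role of the init-once flag

// Sketch-only hook: allocations allowed before a simulated OOM (negative = unlimited).
int gAllocBudget = -1;

CharSet* allocSet(Status& status) {
    if (failed(status)) { return nullptr; }
    if (gAllocBudget == 0) { status = Status::OutOfMemory; return nullptr; }
    if (gAllocBudget > 0) { --gAllocBudget; }
    CharSet* s = new (std::nothrow) CharSet();
    if (s == nullptr) { status = Status::OutOfMemory; }
    return s;
}

// Deletes every set and clears the flag. Caller must hold gInitMutex.
void cleanupSets() {
    for (CharSet*& s : gSets) { delete s; s = nullptr; }
    gInitDone = false;
}

void initSets(Status& status) {
    for (int k = 0; k < KEY_COUNT; ++k) { gSets[k] = allocSet(status); }
    // On failure, free the partial results so that a later call can start over.
    if (failed(status)) { cleanupSets(); }
}

const CharSet* getSet(Key key, Status& status) {
    if (failed(status)) { return nullptr; }
    std::lock_guard<std::mutex> lock(gInitMutex);
    if (!gInitDone) {
        initSets(status);
        if (failed(status)) { return nullptr; }
        gInitDone = true;
    }
    return gSets[key];
}
```

With `gAllocBudget = 2`, the third allocation fails, the two sets already created are deleted, and the flag stays cleared. A later call made after the budget is lifted runs the initialization again instead of returning a stale failure.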

jira-pull-request-webhook · 2025-01-10T01:30:19Z

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

catamorphism requested review from FrankYFTang and srl295 December 13, 2024 23:48

catamorphism mentioned this pull request Dec 13, 2024

ICU-22940 MF2 ICU4C: Update for bidi support #3236

Merged

FrankYFTang reviewed Dec 14, 2024

FrankYFTang requested changes Dec 16, 2024

FrankYFTang previously approved these changes Jan 9, 2025

catamorphism dismissed FrankYFTang’s stale review via 7f19865 January 9, 2025 23:40

srl295 approved these changes Jan 9, 2025

ICU-22940 MF2 ICU4C: Error checking improvements in parser

cb61cf1

Improve checking for OOM errors when allocating UnicodeSets, per post-merge comments on unicode-org#3236

catamorphism force-pushed the bidi-followup branch from 7f19865 to cb61cf1 Compare January 10, 2025 01:30

catamorphism merged commit f8aa68b into unicode-org:main Jan 10, 2025
94 checks passed
