Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<regex>: Always reject character ranges with set limits #5158

Merged

Conversation

muellerj2
Copy link
Contributor

Fixes #4995.

Unblocks libcxx test std/re/re.regex/re.regex.construct/bad_range.pass.cpp.

Note that I did not add test coverage for regular expressions like [[.a.]-e]: After this change, this regex is rejected as well (rather than parsed incorrectly). However, such regular expressions should actually be accepted (!), because [.xxx.] is not a set of characters, but (kind of) a single character if xxx is the name of a valid collating element. But support of collating symbols is currently so terribly broken that it is basically guaranteed to produce wrong results (see #994 (comment)), so I think it would be a major disservice to users if we actually accepted this regex without a major overhaul of collation support in the implementation.

@muellerj2 muellerj2 requested a review from a team as a code owner November 29, 2024 19:25
@CaseyCarter CaseyCarter added the bug Something isn't working label Nov 30, 2024
@StephanTLavavej StephanTLavavej self-assigned this Dec 1, 2024
tests/std/tests/VSO_0000000_regex_use/test.cpp Outdated Show resolved Hide resolved
stl/inc/regex Outdated Show resolved Hide resolved
tests/std/include/test_regex_support.hpp Outdated Show resolved Hide resolved
tests/std/tests/VSO_0000000_regex_use/test.cpp Outdated Show resolved Hide resolved
@StephanTLavavej
Copy link
Member

Thanks for fixing this complicated control flow! 😻 I pushed minor stylistic changes and test improvements after debugging through the code and convincing myself that your changes were correct.

@StephanTLavavej StephanTLavavej removed their assignment Dec 2, 2024
@StephanTLavavej StephanTLavavej self-assigned this Dec 3, 2024
@StephanTLavavej
Copy link
Member

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej StephanTLavavej merged commit e3e65be into microsoft:main Dec 5, 2024
39 checks passed
@StephanTLavavej
Copy link
Member

Thanks for figuring out how this code was broken and how to fix it! 😻 🧶 🛠️

@muellerj2 muellerj2 deleted the reject-char-ranges-with-set-limits branch December 5, 2024 16:56
@StephanTLavavej StephanTLavavej added the regex Everyone's favorite header label Jan 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working regex Everyone's favorite header
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

<regex>: R"([\d-e])" should be rejected
3 participants