<regex>: Regex erroneously returns a match #994

Open
AlexGuteniev opened this issue Jul 5, 2020 · 2 comments
Labels
bug Something isn't working

Comments


AlexGuteniev commented Jul 5, 2020

Describe the bug
std::regex_match returns a match where no match is expected.

Note that the pattern has mismatched [ and ].
It needs to be clarified whether the constructor should actually throw or whether the behavior is undefined.

Command-line test case

d:\Temp2>type repro.cpp
#include <iostream>
#include <regex>
#include <string>

int main()
{
        try
        {
                std::string Regex = "[[.(.]a[a]";
                std::string MatchString = "v";
                std::regex std_regex(Regex);
                bool Result = std::regex_match(MatchString, std_regex);

                std::cout << "Is matched:" << Result;
        }
        catch (std::regex_error& err)
        {
                std::cout << err.code() << '\n' << err.what() << '\n';
        }
}
d:\Temp2>cl /EHsc /W4 /WX .\repro.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29009.1 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

repro.cpp
Microsoft (R) Incremental Linker Version 14.27.29009.1
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:repro.exe
repro.obj

d:\Temp2>.\repro.exe
Is matched:1

Expected behavior
The DevCom-306176 reporter expects:

Is matched:0

I expect an exception.

STL version

Microsoft Visual Studio Professional 2019 Preview
Version 16.7.0 Preview 3.1

Additional context
This item is also tracked on Developer Community as DevCom-306176 and by Microsoft-internal VSO-660624 / AB#660624.


Amaroker commented Jul 5, 2020

If I understand correctly, std::regex_match would never be invoked here because the constructor of std::regex has to throw if the regular expression is not valid. It should throw a std::regex_error with code std::regex_constants::error_brack.

StephanTLavavej added the bug label Jul 6, 2020

achabense commented Jun 23, 2023

The pattern is valid: '[' is treated as an ordinary character inside []:

#include <iostream>
#include <regex>
#include <string>

int main() {
    std::regex regex("[a[a]*");
    std::string str("[[[[[[aaaa");
    std::cout << std::regex_match(str, regex); // 1
}

As for "[[.(.]a[a]", I find that the library treats [.(.] as a valid [.Name.]. This is where "[.(.]" gets handled:

STL/stl/inc/regex

Lines 4020 to 4025 in 40640c6

} else if (_End_arg == _Meta_dot) { // process .
    if (_Beg == _Pat) {
        _Error(regex_constants::error_collate);
    } else {
        _Nfa._Add_coll(_Beg, _Pat, _Diff);
    }

The standard requires "The name is valid only if std::regex_traits::lookup_collatename is not an empty string". However, in the library, lookup_collatename is simply implemented as a pass-through that returns its input:

STL/stl/inc/regex

Lines 377 to 380 in 40640c6

template <class _FwdIt>
string_type lookup_collatename(_FwdIt _First, _FwdIt _Last) const { // map [_First, _Last) to collation element
    return string_type{_First, _Last};
}

And _Add_coll for [.Name.] is just an insertion:

STL/stl/inc/regex

Lines 3018 to 3024 in 40640c6

template <class _FwdIt, class _Elem, class _RxTraits>
void _Builder<_FwdIt, _Elem, _RxTraits>::_Add_coll(_FwdIt _First, _FwdIt _Last, _Difft _Diff) {
    // add collation element to bracket expression
    _Node_class<_Elem, _RxTraits>* _Node = static_cast<_Node_class<_Elem, _RxTraits>*>(_Current);
    _Sequence<_Elem>** _Cur = _STD addressof(_Node->_Coll);
    _Char_to_elts(_First, _Last, _Diff, _Cur);
}

I'm afraid [.Name.] is broken...

#include<regex>
#include<iostream>

int main() {
    std::regex regex("[[.(.]]*");
    std::cout << std::regex_match("(((", regex);//0
    std::cout << std::regex_match("(((v", regex);//0
    std::cout << std::regex_match("v", regex);//1
    std::cout << std::regex_match("w", regex);//1
    std::cout << std::regex_match("vv", regex);//0
    std::cout << "\n";
    regex = "[[.(.]x]*";
    std::cout << std::regex_match("xxx", regex);//1
    std::cout << std::regex_match("xxxv", regex);//1
    std::cout << std::regex_match("xxxw", regex);//1
    std::cout << std::regex_match("xxxvv", regex);//0
    std::cout << std::regex_match("vxxx", regex);//0
    std::cout << std::regex_match("v", regex);//1
    std::cout << std::regex_match("w", regex);//1
    std::cout << std::regex_match("vv", regex);//0
}

[=Name=] is worth some investigation too, as there is no validity check for Name either.
