Partition vector algorithms test: move out lex compare family #5063

AlexGuteniev · 2024-11-03T14:04:52Z

VSO_0000000_vector_algorithms takes longer to run as more algorithms are vectorized.

I think we need to partition it.

I'm starting with the lex compare family. These tests take about 30% of the total run time.
Also, the lex compare work looks complete and is unlikely to overlap with other algorithms.

StephanTLavavej · 2024-11-03T15:01:29Z

Would it be better to keep everything in a single file, but controlled via macros?

STL/tests/std/tests/P2321R2_views_zip/env.lst

Lines 9 to 13 in a1bc126

    
    RUNALL_CROSSLIST
    *	PM_CL="/DTEST_INPUT"
    *	PM_CL="/DTEST_FORWARD"
    *	PM_CL="/DTEST_BIDIRECTIONAL"
    *	PM_CL="/DTEST_RANDOM"

With the variant_msvc tests, I permanently split them into a separate file because they're developed totally separately from the LLVM-derived tests. With the vector algorithms tests, I think keeping them in a single file might be better, because that way we can easily re-group how much we run each time.
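
As a rough illustration of the single-file, macro-controlled approach, here is a minimal sketch. The macro names TEST_LEX_COMPARE and TEST_OTHER_ALGORITHMS, and all function names, are hypothetical and are not taken from the STL test suite; the assumption is that an env.lst crosslist line would define one of the macros via /D, as in the views_zip example above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical group macros: a crosslist line such as PM_CL="/DTEST_LEX_COMPARE"
// would define exactly one of them. If none is defined, compile every group.
#if !defined(TEST_LEX_COMPARE) && !defined(TEST_OTHER_ALGORITHMS)
#define TEST_LEX_COMPARE
#define TEST_OTHER_ALGORITHMS
#endif

#ifdef TEST_LEX_COMPARE
// Stand-in for the lex compare family tests.
bool test_lex_compare_group() {
    const std::vector<int> a{1, 2, 3};
    const std::vector<int> b{1, 2, 4};
    return std::lexicographical_compare(a.begin(), a.end(), b.begin(), b.end())
        && !std::lexicographical_compare(b.begin(), b.end(), a.begin(), a.end());
}
#endif

#ifdef TEST_OTHER_ALGORITHMS
// Stand-in for the remaining vectorized algorithm tests.
bool test_other_algorithms_group() {
    const std::vector<int> v{3, 1, 2};
    return *std::min_element(v.begin(), v.end()) == 1 && std::count(v.begin(), v.end(), 2) == 1;
}
#endif

// Runs whichever groups were compiled in and returns how many groups ran.
int run_selected_groups() {
    int groups_run = 0;
#ifdef TEST_LEX_COMPARE
    assert(test_lex_compare_group());
    ++groups_run;
#endif
#ifdef TEST_OTHER_ALGORITHMS
    assert(test_other_algorithms_group());
    ++groups_run;
#endif
    return groups_run;
}
```

With this layout, regrouping only means editing the list of macro definitions in the configuration file; the trade-off is that every group still shares one translation unit and one set of includes.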

AlexGuteniev · 2024-11-03T15:11:49Z

Would it be better to keep everything in a single file, but controlled via macros?

Is there an easy way to select a group by passing some extra parameter to the python tests\utils\stl-lit\stl-lit.py ..\..\tests\std\tests\VSO_0000000_vector_algorithms -v command?

AlexGuteniev · 2024-11-03T15:17:09Z

I'm not yet concerned about the run time of a single configuration. It's the total run time that seems too long.

AlexGuteniev · 2024-11-03T15:36:03Z

Also, partitioning algorithms by algorithm type seems more natural than partitioning views by iterator type.
The algorithm implementations are unrelated (I'm not going to split tests where they are related).

StephanTLavavej · 2024-11-03T18:44:48Z

We could use tags (like with ASAN), but since the algorithms are unrelated and there's no commonality between the test support machinery, having a separate test does make sense. Let's keep this as-is, thanks!

AlexGuteniev · 2024-11-03T18:57:08Z

there's no commonality between the test support machinery

No new commonality is extracted during this separation. All of the commonality, which is mostly the randomness initialization, was already extracted in #4734 into /tests/std/include/test_vector_algorithms_support.hpp to support separate floating-point minmax testing.

I don't think this specific commonality indicates that these tests should be together.
Instead, it seems potentially reusable in more separate tests.
The <charconv> test mentioned in #933 is a candidate:

STL/tests/std/tests/P0067R5_charconv/test.cpp

Lines 66 to 122 in 1e312b3

    
    void initialize_randomness(mt19937_64& mt64, const int argc, char** const argv) {
        constexpr size_t n = mt19937_64::state_size;
        constexpr size_t w = mt19937_64::word_size;
        static_assert(w % 32 == 0);
        constexpr size_t k = w / 32;

        vector<uint32_t> vec(n * k);

        puts("USAGE:");
        puts("test.exe              : generates seed data from random_device.");
        puts("test.exe filename.txt : loads seed data from a given text file.");

        if (argc == 1) {
            random_device rd;
            generate(vec.begin(), vec.end(), ref(rd));
            puts("Generated seed data.");
        } else if (argc == 2) {
            const char* const filename = argv[1];

            ifstream file(filename);

            if (!file) {
                printf("ERROR: Can't open %s.\n", filename);
                abort();
            }

            for (auto& elem : vec) {
                file >> elem;

                if (!file) {
                    printf("ERROR: Can't read seed data from %s.\n", filename);
                    abort();
                }
            }

            printf("Loaded seed data from %s.\n", filename);
        } else {
            puts("ERROR: Too many command-line arguments.");
            abort();
        }

        puts("SEED DATA:");
        for (const auto& elem : vec) {
            printf("%u ", elem);
        }
        printf("\n");

        seed_seq seq(vec.cbegin(), vec.cend());

        mt64.seed(seq);

        puts("Successfully seeded mt64. First three values:");
        for (int i = 0; i < 3; ++i) {
            // libc++ uses long for 64-bit values.
            printf("0x%016llX\n", static_cast<unsigned long long>(mt64()));
        }
    }

AlexGuteniev · 2024-11-03T19:07:39Z

Actually I'm floating the separation idea with this PR.

When thinking about search_n, I concluded that a sufficiently comprehensive test would have a cubic run time: O(H*N*F), where H is the haystack length, N is the needle length, and F is the frequency of a match. The test would gradually increment all three. It also has no relationship to anything else, so I think it deserves a whole separate .cpp file.
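
To make the cubic growth concrete, here is a minimal sketch of such a test, not the test that was eventually written. It assumes that "frequency of a match" can be modeled by breaking the run at every f-th element, and the helper names naive_search_n and test_search_n_all are hypothetical. Each case checks std::search_n against a simple reference implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Naive reference: index of the first run of n consecutive elements equal to value,
// or hay.size() if there is no such run. A run of length 0 matches at index 0,
// mirroring std::search_n returning `first` for a non-positive count.
std::size_t naive_search_n(const std::vector<int>& hay, const std::size_t n, const int value) {
    if (n == 0) {
        return 0;
    }
    std::size_t run = 0;
    for (std::size_t i = 0; i < hay.size(); ++i) {
        run = hay[i] == value ? run + 1 : 0;
        if (run == n) {
            return i + 1 - n;
        }
    }
    return hay.size();
}

// Iterates haystack length h, break frequency f, and run length n together,
// so the number of cases grows cubically with max_h. Returns the case count.
std::size_t test_search_n_all(const std::size_t max_h) {
    std::size_t cases = 0;
    for (std::size_t h = 0; h <= max_h; ++h) {
        for (std::size_t f = 1; f <= h + 1; ++f) {
            std::vector<int> hay(h, 1);
            for (std::size_t i = f - 1; i < h; i += f) {
                hay[i] = 0; // every f-th element interrupts the run of ones
            }
            for (std::size_t n = 0; n <= h + 1; ++n) {
                const auto it = std::search_n(hay.begin(), hay.end(), n, 1);
                assert(static_cast<std::size_t>(it - hay.begin()) == naive_search_n(hay, n, 1));
                ++cases;
            }
        }
    }
    return cases;
}
```

Even with max_h as small as 3 this visits 40 cases, and each case does work proportional to the haystack length, which is why the total cost climbs quickly as the limits grow.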

StephanTLavavej · 2024-11-07T21:13:16Z

When thinking of search_n I concluded that comprehensive enough test would have a cubic run time

If a comprehensive test would have a long run time, then I'd recommend having an off-by-default "EXHAUSTIVE" mode that allows the whole space to be manually run, but having automated runs select a randomized subset of cases.

Expensive tests are problematic both from a machine resourcing and timeout perspective, so we need to be careful.
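
One way this could look, as a hedged sketch only: the EXHAUSTIVE macro name comes from the suggestion above, while search_case, all_cases, and select_cases are hypothetical names invented for illustration. Automated runs shuffle the case space with the already-seeded generator and keep a fixed-size sample; defining EXHAUSTIVE keeps everything.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical description of one test case in the (haystack, needle, gap) space.
struct search_case {
    std::size_t haystack_len;
    std::size_t needle_len;
    std::size_t gap;
};

// Enumerates the whole case space up to max_len in each dimension.
std::vector<search_case> all_cases(const std::size_t max_len) {
    std::vector<search_case> cases;
    for (std::size_t h = 0; h <= max_len; ++h) {
        for (std::size_t n = 0; n <= max_len; ++n) {
            for (std::size_t g = 1; g <= max_len; ++g) {
                cases.push_back({h, n, g});
            }
        }
    }
    return cases;
}

// With EXHAUSTIVE defined (off by default), keeps every case for a manual run.
// Otherwise keeps a random subset of at most sample_size cases.
std::vector<search_case> select_cases(
    std::vector<search_case> cases, std::mt19937_64& gen, const std::size_t sample_size) {
#ifdef EXHAUSTIVE
    (void) gen;
    (void) sample_size;
    return cases;
#else
    if (cases.size() > sample_size) {
        std::shuffle(cases.begin(), cases.end(), gen);
        cases.resize(sample_size);
    }
    return cases;
#endif
}
```

Because the subset depends on the generator's seed, logging the seed data (as the randomness initialization quoted earlier does) would keep a failing automated run reproducible.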

StephanTLavavej · 2024-11-07T22:39:56Z

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

StephanTLavavej · 2024-11-08T17:24:58Z

Thanks for keeping an eye on this ever-growing test! 🐣 🐤 🐔

Partition vector algorithms test: move out lex compare family

2f25a3b

AlexGuteniev requested a review from a team as a code owner November 3, 2024 14:04

StephanTLavavej self-assigned this Nov 3, 2024

StephanTLavavej added the test Related to test code label Nov 3, 2024

StephanTLavavej approved these changes Nov 7, 2024

StephanTLavavej removed their assignment Nov 7, 2024

StephanTLavavej mentioned this pull request Nov 7, 2024

Maintainer priorities #4700

Open

StephanTLavavej self-assigned this Nov 7, 2024

StephanTLavavej merged commit 2e5c251 into microsoft:main Nov 8, 2024
39 checks passed

AlexGuteniev deleted the partition branch November 8, 2024 17:29
