[BUG] cli default args for num_threads should be based on process thread affinity. #30

Closed
cwharris opened this issue Apr 25, 2022 · 0 comments · Fixed by #1866
Labels
bug Something isn't working

Comments

Contributor

cwharris commented Apr 25, 2022

We are using os.cpu_count() (or psutil.cpu_count()) as the default argument for num_threads in both the CLI and the example code. That value is the number of logical CPUs on the whole machine, not the number available to the current process, and it can be None when the count cannot be determined.

Instead we should probably use len(os.sched_getaffinity(0)), which looks to be the appropriate default because it counts only the CPUs the current process is allowed to run on.

https://github.com/NVIDIA/Morpheus/blob/02bfbfbeb6e9e62d5fd8fe47a2bf2a92a85d1adc/examples/gnn_fraud_detection_pipeline/run.py#L36-L41
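
A minimal sketch of the proposed default, assuming a helper named `default_num_threads` (a hypothetical name, not taken from the Morpheus code base). `os.sched_getaffinity` is only available on some Unix platforms, so the sketch falls back to `os.cpu_count()` elsewhere:

```python
import os


def default_num_threads() -> int:
    """Return a default thread count based on the CPUs this process may use.

    Hypothetical helper for illustration only.
    """
    # os.sched_getaffinity exists only on some Unix platforms (e.g. Linux).
    # Passing 0 queries the calling process.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback: os.cpu_count() may return None if the count is undetermined,
    # so default to a single thread in that case.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(default_num_threads())
```

With this approach, a process restricted by something like `taskset` or a container CPU set would likely get a default that matches its restricted CPU set rather than the machine-wide count.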

@cwharris cwharris added the bug Something isn't working label Apr 25, 2022
rapids-bot bot pushed a commit that referenced this issue Aug 2, 2023
I have a few changes here. Some of them might not make it depending on how others feel.

- Move test utilities into a library to share across tests.
- Move tests into individual executables so that individual tests do not share code.
- Use `gtest_discover_tests` to let `ctest` run and show all tests individually.
- Use `gtest::gtest_main` and remove explicit main entry file.
- ~Build and run tests by default.~
- Ability to filter tests by `ctest --test-dir build -R "TestCuda.*"`

```
+ echo 'Running CTest...'
Running CTest...
+ ctest --test-dir build
Internal ctest changing into directory: /workspaces/morpheus/build
Test project /workspaces/morpheus/build
      Start  1: TestCuda.LargeShape
 1/35 Test  #1: TestCuda.LargeShape .......................................   Passed    0.17 sec
      Start  2: TestDataLoader.DataLoaderInitializationTest
 2/35 Test  #2: TestDataLoader.DataLoaderInitializationTest ...............   Passed    0.07 sec
      Start  3: TestDataLoader.DataLoaderRegisterLoaderTest
 3/35 Test  #3: TestDataLoader.DataLoaderRegisterLoaderTest ...............   Passed    0.07 sec
      Start  4: TestDataLoader.DataLoaderRemoveLoaderTest
 4/35 Test  #4: TestDataLoader.DataLoaderRemoveLoaderTest .................   Passed    0.07 sec
      Start  5: TestDataLoader.PayloadLoaderTest
 5/35 Test  #5: TestDataLoader.PayloadLoaderTest ..........................   Passed    1.42 sec
      Start  6: TestDataLoader.FileLoaderTest
 6/35 Test  #6: TestDataLoader.FileLoaderTest .............................   Passed    2.65 sec
      Start  7: TestControlMessage.InitializationTest
 7/35 Test  #7: TestControlMessage.InitializationTest .....................   Passed    0.07 sec
      Start  8: TestControlMessage.SetMessageTest
 8/35 Test  #8: TestControlMessage.SetMessageTest .........................   Passed    0.07 sec
      Start  9: TestControlMessage.TaskTest
 9/35 Test  #9: TestControlMessage.TaskTest ...............................   Passed    0.07 sec
      Start 10: TestControlMessage.PayloadTest
10/35 Test #10: TestControlMessage.PayloadTest ............................   Passed    1.45 sec
      Start 11: TestDataLoaderModule.DataLoaderModuleInitializationTest
11/35 Test #11: TestDataLoaderModule.DataLoaderModuleInitializationTest ...   Passed    0.07 sec
      Start 12: TestDataLoaderModule.EndToEndPayloadDataLoaderTest
12/35 Test #12: TestDataLoaderModule.EndToEndPayloadDataLoaderTest ........   Passed    0.12 sec
      Start 13: TestDeserializers.GetIndexColCountNoIdxFromFile
13/35 Test #13: TestDeserializers.GetIndexColCountNoIdxFromFile ...........   Passed    1.39 sec
      Start 14: TestDeserializers.GetIndexColCountWithIdxFromFile
14/35 Test #14: TestDeserializers.GetIndexColCountWithIdxFromFile .........   Passed    1.37 sec
      Start 15: TestDeserializers.GetIndexColCountNoIdxSimilarName
15/35 Test #15: TestDeserializers.GetIndexColCountNoIdxSimilarName ........   Passed    0.19 sec
      Start 16: TestDeserializers.GetIndexColCountIdx
16/35 Test #16: TestDeserializers.GetIndexColCountIdx .....................   Passed    0.18 sec
      Start 17: TestDeserializers.GetIndexColCountValidNameInvalidType
17/35 Test #17: TestDeserializers.GetIndexColCountValidNameInvalidType ....   Passed    0.16 sec
      Start 18: TestDevMemInfo.RmmBufferConstructor
18/35 Test #18: TestDevMemInfo.RmmBufferConstructor .......................   Passed    0.16 sec
      Start 19: TestDevMemInfo.VoidPtrConstructor
19/35 Test #19: TestDevMemInfo.VoidPtrConstructor .........................   Passed    0.25 sec
      Start 20: TestDevMemInfo.MakeNewBuffer
20/35 Test #20: TestDevMemInfo.MakeNewBuffer ..............................   Passed    0.25 sec
      Start 21: TestFileInOut.RoundTripCSV
21/35 Test #21: TestFileInOut.RoundTripCSV ................................   Passed    2.67 sec
      Start 22: TestFileInOut.RoundTripJSON
22/35 Test #22: TestFileInOut.RoundTripJSON ...............................   Passed    2.69 sec
      Start 23: TestMatxUtil.ReduceMax1d
23/35 Test #23: TestMatxUtil.ReduceMax1d ..................................   Passed    0.17 sec
      Start 24: TestMatxUtil.ReduceMax2dRowMajor
24/35 Test #24: TestMatxUtil.ReduceMax2dRowMajor ..........................   Passed    0.17 sec
      Start 25: TestMatxUtil.ReduceMax2dColMajor
25/35 Test #25: TestMatxUtil.ReduceMax2dColMajor ..........................   Passed    1.40 sec
      Start 26: TestMatxUtil.Cast
26/35 Test #26: TestMatxUtil.Cast .........................................   Passed    0.16 sec
      Start 27: TestMatxUtil.Threshold
27/35 Test #27: TestMatxUtil.Threshold ....................................   Passed    0.16 sec
      Start 28: TestMatxUtil.ThresholdByRow
28/35 Test #28: TestMatxUtil.ThresholdByRow ...............................   Passed    0.19 sec
      Start 29: TestMultiSlices.Ranges
29/35 Test #29: TestMultiSlices.Ranges ....................................   Passed    1.37 sec
      Start 30: TestTensor.UtilsShapeString
30/35 Test #30: TestTensor.UtilsShapeString ...............................   Passed    0.06 sec
      Start 31: TestTensor.GetElementStride
31/35 Test #31: TestTensor.GetElementStride ...............................   Passed    0.06 sec
      Start 32: TestTensor.AsType
32/35 Test #32: TestTensor.AsType .........................................   Passed    0.16 sec
      Start 33: TestTensor.Create
33/35 Test #33: TestTensor.Create .........................................   Passed    0.16 sec
      Start 34: TestTensor.UtilsValidateShapeAndStride
34/35 Test #34: TestTensor.UtilsValidateShapeAndStride ....................   Passed    0.16 sec
      Start 35: TestTypeUtils.DTypeCopy
35/35 Test #35: TestTypeUtils.DTypeCopy ...................................   Passed    0.06 sec

100% tests passed, 0 tests failed out of 35

Total Test time (real) =  19.91 sec
```

Authors:
  - Christopher Harris (https://github.com/cwharris)

Approvers:
  - David Gardner (https://github.com/dagardner-nv)

URL: #1095
@mdemoret-nv mdemoret-nv moved this to Todo in Morpheus Boards Aug 16, 2024
@mdemoret-nv mdemoret-nv added this to the 24.10 - Release milestone Aug 16, 2024
@morpheus-bot-test morpheus-bot-test bot moved this from Todo to Review - Ready for Review in Morpheus Boards Aug 26, 2024
@rapids-bot rapids-bot bot closed this as completed in 583149c Aug 31, 2024
@github-project-automation github-project-automation bot moved this from Review - Ready for Review to Done in Morpheus Boards Aug 31, 2024