[BUG]: Conda installs a redundant CPU build of torch #1943

dagardner-nv · 2024-10-14T21:08:11Z

Version

24.10

Which installation method(s) does this occur on?

Source

Describe the bug.

We install torch via pip, which is how we get the 2.4.0+cu124 version.
I believe the sentence-transformers package is pulling in torchvision from conda-forge, which in turn pulls in the CPU build of libtorch.

$ conda list | grep torch
libtorch                  2.4.0           cpu_generic_h4a3044c_1    conda-forge
torch                     2.4.0+cu124              pypi_0    pypi
torchdata                 0.8.0                    pypi_0    pypi
torchvision               0.19.1          cpu_py310hd9679db_0    conda-forge
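
A quick way to spot this kind of mix in an environment is to scan the `conda list` output for torch-related packages whose build string starts with `cpu`. The following is only a minimal sketch: the helper names `parse_conda_list` and `find_cpu_torch_builds` are made up for illustration, and it assumes the usual four-column layout (name, version, build, channel) shown above.

```python
def parse_conda_list(text):
    """Parse `conda list` output into (name, version, build, channel) tuples."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' header lines conda prints.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 3:
            continue
        channel = parts[3] if len(parts) > 3 else ""
        rows.append((parts[0], parts[1], parts[2], channel))
    return rows


def find_cpu_torch_builds(rows):
    """Return conda-installed torch-related packages whose build string starts with 'cpu'."""
    return [
        row for row in rows
        if "torch" in row[0] and row[2].startswith("cpu") and row[3] != "pypi"
    ]


# Sample taken from the output in this report.
SAMPLE = """\
libtorch                  2.4.0           cpu_generic_h4a3044c_1    conda-forge
torch                     2.4.0+cu124              pypi_0    pypi
torchdata                 0.8.0                    pypi_0    pypi
torchvision               0.19.1          cpu_py310hd9679db_0    conda-forge
"""

if __name__ == "__main__":
    for name, version, build, channel in find_cpu_torch_builds(parse_conda_list(SAMPLE)):
        print(f"CPU build from {channel}: {name} {version} ({build})")
```

In practice the text would come from running `conda list` in the target environment rather than from the hard-coded sample.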

Minimum reproducible example

CONDA_ALWAYS_YES=true conda env create --solver=libmamba -n morpheus -y --file conda/environments/all_cuda-125_arch-x86_64.yaml


Code of Conduct

I agree to follow Morpheus' Code of Conduct
I have searched the open bugs and have found no duplicates for this bug report

efajardo-nv · 2024-10-22T15:58:12Z

@dagardner-nv Installing sentence-transformers via conda (from examples_cuda-125_arch-x86_64.yaml) in the release container leaves the following PyTorch packages in the environment:

# packages in environment at /opt/conda/envs/morpheus:
#
# Name                    Version                   Build  Channel
libtorch                  2.4.1           cpu_generic_hb3b73e9_0    conda-forge
pytorch                   2.4.1           cpu_generic_py310hcbfaffa_0    conda-forge
torch                     2.4.0+cu124              pypi_0    pypi
torchvision               0.19.1          cpu_py310h0339c84_1    conda-forge

The VDB embedding stage then chooses the CPU version. Replacing with the pip package switches the stage back to the GPU, but it's not quite as fast (~3 min vs. ~2 min in our example).
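
To confirm which build Python actually imports in a given environment, a runtime check along these lines may help. This is a sketch, not part of Morpheus: `classify_torch_build` and `report` are hypothetical helper names, and it assumes that a CPU-only build reports `torch.version.cuda` as `None` (ROCm builds are not handled).

```python
from typing import Optional


def classify_torch_build(cuda_version: Optional[str], cuda_available: bool) -> str:
    """Classify a torch install from its compiled CUDA version and runtime GPU availability."""
    if cuda_version is None:
        # Assumption: no compiled CUDA version means a CPU-only build.
        return "CPU-only build"
    if not cuda_available:
        return f"CUDA {cuda_version} build, but no usable GPU at runtime"
    return f"CUDA {cuda_version} build, GPU usable"


def report() -> str:
    """Describe the torch module that is importable in the current environment."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    kind = classify_torch_build(torch.version.cuda, torch.cuda.is_available())
    return f"torch {torch.__version__} loaded from {torch.__file__}: {kind}"


if __name__ == "__main__":
    print(report())
```

Running this inside the affected environment shows whether the imported module is the `+cu124` pip wheel or a CPU build, regardless of what `conda list` reports.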

…encies (#1974)

- Update `dependencies.yaml` and re-generate the environment YAML files
- Avoids installing the `pytorch` CPU packages, which examples like DFP would otherwise try to use

Closes #1943

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md).
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Authors:
- Eli Fajardo (https://github.com/efajardo-nv)
- David Gardner (https://github.com/dagardner-nv)

Approvers:
- Michael Demoret (https://github.com/mdemoret-nv)

URL: #1974

dagardner-nv added the bug Something isn't working label Oct 14, 2024

github-project-automation bot added this to Morpheus Boards Oct 14, 2024

github-project-automation bot moved this to Todo in Morpheus Boards Oct 14, 2024

efajardo-nv mentioned this issue Oct 18, 2024

Benchmark updates/fixes #1958

Merged

efajardo-nv mentioned this issue Oct 23, 2024

Install sentence-transformers via pip to avoid CPU-torch conda dependencies #1974

Merged

dagardner-nv assigned efajardo-nv Oct 24, 2024

morpheus-bot-test bot moved this from Todo to Review - Ready for Review in Morpheus Boards Oct 24, 2024

dagardner-nv closed this as completed Oct 28, 2024

github-project-automation bot moved this from Review - Ready for Review to Done in Morpheus Boards Oct 28, 2024

dagardner-nv added this to the 24.10 - Release milestone Oct 28, 2024
