Fix issues with transpositions on GPUs #65

Merged
merged 3 commits into from
Jun 21, 2022

jipolanco (Owner) commented Jun 21, 2022

Fixes the following issues when transposing distributed CuArrays:

  • sometimes the wrong permutation was applied to the output of a transpose
  • MPI buffers cannot be created directly out of CuPtrs
  • gather sometimes gave wrong results on distributed CuArrays

codecov-commenter commented Jun 21, 2022

Codecov Report

Merging #65 (072554e) into master (f89299b) will decrease coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #65      +/-   ##
==========================================
- Coverage   97.01%   96.97%   -0.04%     
==========================================
  Files          22       22              
  Lines        1208     1192      -16     
==========================================
- Hits         1172     1156      -16     
  Misses         36       36              

| Impacted Files | Coverage Δ |
|---|---|
| src/Transpositions/Transpositions.jl | 98.06% <100.00%> (-0.03%) ⬇️ |
| src/PencilIO/hdf5.jl | 93.33% <0.00%> (-0.11%) ⬇️ |
| src/PencilIO/mpi_io.jl | 96.12% <0.00%> (-0.08%) ⬇️ |
| src/Pencils/MPITopologies.jl | 97.40% <0.00%> (-0.07%) ⬇️ |
| src/Pencils/Pencils.jl | 99.13% <0.00%> (-0.02%) ⬇️ |
| src/gather.jl | 100.00% <0.00%> (ø) |
| src/Pencils/data_ranges.jl | 100.00% <0.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

They don't play well with CuArrays. Moreover, the conversion of a CuPtr
to an MPIPtr is not exposed by MPI.jl (unlike the case of CuArray views).

Previously, when calling `gather` on GPU arrays, MPI was sending GPU
data and receiving CPU data. We now convert to CPU arrays before passing
the data to MPI, so that all communications are done on the CPU.
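
The `gather` change described above can be pictured with a minimal sketch. This is not the code from the PR: `to_host` is a hypothetical helper invented here for illustration, and a plain array view stands in for a `CuArray` so the snippet runs without a GPU or MPI. It relies only on the fact that `Array(x)` copies any `AbstractArray` (including a `CuArray` when CUDA.jl is loaded) into CPU memory.

```julia
# Hypothetical helper (an assumption for this sketch, not part of PencilArrays.jl):
# return a CPU-resident array that can be handed to an MPI call.
to_host(x::Array) = x                  # already on the CPU, no copy needed
to_host(x::AbstractArray) = Array(x)   # e.g. a CuArray would be copied to host memory

# Stand-in for the local block owned by one MPI rank. With CUDA.jl this could
# be a CuArray; here a view plays the role of a generic non-`Array` type.
local_data = view(reshape(collect(1.0:24.0), 4, 6), :, 1:3)

# Convert before communicating, so that send and receive buffers both live on the CPU.
send_buf = to_host(local_data)

@assert send_buf isa Array         # plain CPU memory
@assert send_buf == local_data     # values are unchanged by the conversion
```

The trade-off in this approach is an extra device-to-host copy per call, in exchange for never mixing GPU send buffers with CPU receive buffers.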
jipolanco changed the title from "Fix permuted transpositions on GPUs" to "Fix issues with transpositions on GPUs" Jun 21, 2022
jipolanco merged commit c88e7cb into master Jun 21, 2022
jipolanco deleted the gpu-transpose branch June 21, 2022 11:27