[Feature] Support DeformRoiPool with cambricon MLU backend #2137

Merged
merged 7 commits into from
Aug 28, 2022

Contributor

@defei-coder defei-coder commented Jul 21, 2022

Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily get feedback. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.

Motivation

This PR adds support for running DeformRoiPool on the Cambricon MLU backend.

It includes three parts:

1. Add deform_roi_pool_mlu_kernel.mlu, the source code of the MLU kernel.
2. Add deform_roi_pool_mlu.cpp to launch the kernel from PyTorch.
3. Refactor test_deform_roi_pool.py so that DeformRoiPool is also tested on the MLU backend.

Modification

  • MLU src code
    Add the MLU source code of DeformRoiPool in mmcv/ops/csrc/common/mlu/deform_roi_pool_mlu_kernel.mlu.
  • PyTorch adaptation
    Adapt DeformRoiPool for PyTorch in mmcv/ops/csrc/pytorch/mlu/deform_roi_pool_mlu.cpp.
  • Unit test
    Support testing DeformRoiPool with various backends in tests/test_ops/test_deform_roi_pool.py.
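
Testing with various backends usually means running the same check once per device that is present. A minimal sketch of that pattern, assuming the availability flags are supplied by the caller; `select_test_devices` is a hypothetical helper for illustration, not a function from mmcv or from this PR:

```python
from typing import List


def select_test_devices(cuda_available: bool, mlu_available: bool) -> List[str]:
    """Return the device strings a backend-agnostic test should run on.

    Hypothetical helper: the flags would come from whatever availability
    checks the test suite already uses.
    """
    devices = []
    if cuda_available:
        devices.append("cuda")
    if mlu_available:
        devices.append("mlu")
    return devices


# Each selected device would be passed to tensor.to(device) in the test body,
# so one test function covers every backend present on the machine.
for device in select_test_devices(cuda_available=False, mlu_available=True):
    print(f"would run the DeformRoiPool test on: {device}")
```

Keeping the device as a plain string lets the test body stay identical across backends; only the list of devices changes.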

BC-breaking (Optional)

Does the modification introduce changes that break the backward-compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

Checklist

Before PR:

  • I have read and followed the workflow indicated in the CONTRIBUTING.md to create this PR.
  • Pre-commit or linting tools indicated in CONTRIBUTING.md are used to fix the potential lint issues.
  • Bug fixes are covered by unit tests, the case that causes the bug should be added in the unit tests.
  • New functionalities are covered by complete unit tests. If not, please add more unit test to ensure the correctness.
  • The documentation has been modified accordingly, including docstring or example tutorials.

After PR:

  • If the modification has potential influence on downstream or other related projects, this PR should be tested with some of those projects, like MMDet or MMCls.
  • CLA has been signed and all committers have signed the CLA in this PR.

@defei-coder defei-coder force-pushed the deform_roi_pool_mlu branch 3 times, most recently from e992e78 to 2d03582 Compare July 22, 2022 03:32
@zhouzaida zhouzaida requested a review from grimoire July 22, 2022 07:43
@defei-coder defei-coder force-pushed the deform_roi_pool_mlu branch from 2d03582 to 0a4cd90 Compare July 25, 2022 11:09
Member

@grimoire grimoire left a comment

Build failed with:

./mmcv/ops/csrc/common/mlu/deform_roi_pool_mlu_kernel.mlu:132:26: error: no matching function for call to 'max'
          static_cast<T>(std::max(roi_bin_grid_height * roi_bin_grid_width, 1));

@zhouzaida zhouzaida added the MLU label Aug 19, 2022
@defei-coder
Contributor Author

@grimoire Hi, mmcv builds successfully for me. How did you build it, and what is your build environment? For example, the PyTorch version and the neuware_home version (see neuware_home/version.txt in the pytorch folder of the virtual environment).

@defei-coder
Contributor Author

defei-coder commented Aug 22, 2022

@grimoire Hi, I removed the use of the std library, so you can rebuild mmcv. I would still like to know your neuware_home, pytorch, and torch_mlu versions.

@grimoire
Member

Compile passed. Unit test failed with log:

RuntimeError: offset should be 4d tensor, got 4D.
E       Exception raised from DeformRoIPoolForwardMLUKernelLauncher at ./mmcv/ops/csrc/pytorch/mlu/deform_roi_pool_mlu.cpp:71 (most recent call first):
  • Neuware Version 2.8.5
  • pytorch==1.9.0
  • torch-mlu==1.3.2-torch1.9

@defei-coder
Contributor Author

@grimoire Hi, the error message in my code was wrong and I have just fixed it. You can rebuild mmcv and test again. Judging from your error log, I think the dimension of the offset tensor is wrong.
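
A contradictory message such as "should be 4d tensor, got 4D" can arise when the condition and the message text are built from different values. A minimal Python sketch of a check that derives both from the same value; `check_ndim` is a hypothetical name used for illustration, and the real check lives in the C++ launcher:

```python
from typing import Sequence


def check_ndim(name: str, shape: Sequence[int], expected: int) -> None:
    """Raise RuntimeError if `shape` does not have `expected` dimensions."""
    actual = len(shape)  # computed once, used for both the test and the text
    if actual != expected:
        raise RuntimeError(
            f"{name} should be a {expected}D tensor, got {actual}D."
        )


check_ndim("offset", (2, 2, 7, 7), expected=4)  # passes silently

try:
    check_ndim("offset", (0,), expected=4)  # a 1D, zero-element shape
except RuntimeError as err:
    message = str(err)
    print(message)
```

Because the reported dimension and the tested dimension are the same variable, the message cannot disagree with the condition that triggered it.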

@grimoire
Member

Failed with new logs:

RuntimeError: offset should be 4d tensor, got 1D.

@defei-coder
Contributor Author

@grimoire Is the test you ran test_deform_roi_pool_allclose from test_deform_roi_pool.py? The offset is created inside DeformRoIPoolPack from the input tensor. The test passes for me locally. Could you print the input tensor information?

@grimoire
Member

The test passed after 1e998ac. Do .mlu() and .to(device) behave differently?

@defei-coder
Contributor Author

Hi @grimoire, I found the cause of the failure. I called offset.data_ptr() even when offset = input.new_zeros(0), and the result of that call on an empty tensor may be indeterminate. It is not related to .mlu() or .to(device). You can rebuild mmcv and test again. Sorry for my mistake!
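
The cause described above suggests treating a zero-element offset as "no offset" before reading its storage or checking its rank. A rough sketch of such a guard, using plain shape tuples instead of real tensors; `resolve_offset` is a hypothetical helper, and the 4D requirement is taken from the error message earlier in this thread:

```python
from math import prod
from typing import Optional, Sequence, Tuple


def resolve_offset(shape: Sequence[int]) -> Optional[Tuple[int, ...]]:
    """Return None for an empty placeholder, else the validated 4D shape."""
    if prod(shape) == 0:
        # Placeholder such as input.new_zeros(0): there is no data to read,
        # so a caller could pass a null pointer instead of using data_ptr().
        return None
    if len(shape) != 4:
        raise RuntimeError(f"offset should be a 4D tensor, got {len(shape)}D.")
    return tuple(shape)


print(resolve_offset((0,)))          # empty placeholder, no offset
print(resolve_offset((2, 2, 7, 7)))  # real offset, shape validated
```

Checking the element count first means the rank check only applies to offsets that actually carry data.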

Member

@grimoire grimoire left a comment

Works now! LGTM

@zhouzaida zhouzaida merged commit e843d73 into open-mmlab:master Aug 28, 2022
zhouzaida pushed a commit to zhouzaida/mmcv that referenced this pull request Oct 20, 2022
…b#2137)

* [Feature] Support DeformRoiPool with cambricon MLU backend

* [Fix] Remove use of std library

* [Fix] Correct the error information

* [Refactor] Refactor test deform_roi_pool code

* [Fix] Fix judgment error

* [Fix] Modify the large tensor check

Co-authored-by: budefei <[email protected]>
zhouzaida pushed a commit to zhouzaida/mmcv that referenced this pull request Oct 22, 2022
zhouzaida pushed a commit that referenced this pull request Oct 22, 2022
4 participants