This repository has been archived by the owner on Aug 11, 2020. It is now read-only.
Add round-to-nearest-even rounding to float2half(). #368
When mshadow creates an instance of its 'half_t' type from a value, it converts that value to a 32-bit float, then calls a 'constructor' routine to convert that float to a 16-bit half. On the CPU under Linux, the f16c library call cvtss_sh() is used (depending on CPU gen?), but in the absence of this library (e.g. on the Windows systems used by MXNet CI testing) mshadow's own float2half() routine is called. While cvtss_sh() and the GPU intrinsics available on CUDA_VERSION >= 7.5 do a round-to-nearest-even conversion, the float2half() routine rounds toward zero (i.e. it truncates the extra significand bits). This PR corrects this platform-specific difference in behavior by adding round-to-nearest-even rounding to float2half(). This should improve the robustness of the MXNet CI testing and make MXNet behavior more consistent across systems. @piiswrong @eric-haibin-lin @KellenSunderland
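To illustrate the difference between truncation and round-to-nearest-even, here is a minimal standalone sketch of a float-to-half conversion. It is not the float2half() code in this PR: the function name `float_to_half_rne` is hypothetical, and the sketch assumes IEEE 754 binary32 and binary16 layouts.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper for illustration only; not mshadow's float2half().
// Converts an IEEE 754 binary32 value to binary16 bits, rounding to
// nearest with ties going to the even (LSB == 0) result.
inline std::uint16_t float_to_half_rne(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));  // well-defined type pun

  const std::uint32_t sign = (bits >> 16) & 0x8000u;
  const std::uint32_t abs = bits & 0x7FFFFFFFu;

  // NaN maps to a quiet NaN; Inf and anything >= 65536 maps to Inf.
  if (abs > 0x7F800000u) return static_cast<std::uint16_t>(sign | 0x7E00u);
  if (abs >= 0x47800000u) return static_cast<std::uint16_t>(sign | 0x7C00u);

  const std::uint32_t exp = abs >> 23;  // biased float exponent
  std::uint32_t half;                   // result magnitude (exponent + mantissa)
  std::uint32_t rem;                    // discarded low bits
  std::uint32_t halfway;                // value of rem that is an exact tie

  if (exp >= 113u) {
    // Result is a normal half: rebias the exponent (127 - 15 = 112)
    // and drop the 13 low mantissa bits.
    half = (abs - (112u << 23)) >> 13;
    rem = abs & 0x1FFFu;
    halfway = 0x1000u;
  } else if (exp >= 102u) {
    // Result is a subnormal half: shift the 24-bit significand
    // (with its implicit leading 1) right by 14..24 bits.
    const std::uint32_t shift = 126u - exp;
    const std::uint32_t mant = (abs & 0x7FFFFFu) | 0x800000u;
    half = mant >> shift;
    rem = mant & ((1u << shift) - 1u);
    halfway = 1u << (shift - 1u);
  } else {
    // Magnitude is below half of the smallest subnormal: rounds to zero.
    return static_cast<std::uint16_t>(sign);
  }

  // Round up when above the tie, or on an exact tie with an odd result.
  // A carry out of the mantissa correctly bumps the exponent, so
  // inputs in [65520, 65536) become Inf (0x7C00).
  if (rem > halfway || (rem == halfway && (half & 1u))) ++half;

  return static_cast<std::uint16_t>(sign | half);
}
```

For example, the input 1 + 3/2048 lies exactly halfway between the half values 0x3C01 and 0x3C02. Truncation would return 0x3C01, while the sketch above returns the even neighbor 0x3C02.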
This change should only affect Windows users whose VC++ compiler cannot offer the f16c library, or perhaps GPU users still on CUDA 7.0 or earlier. For users who wish to compare behaviors, the legacy (pre-PR) truncation behavior remains available by building with -DMSHADOW_HALF_ROUND_TO_NEAREST=0. This PR is tested by an MXNet PR with tests/python/unittest/test_operator.py:test_cast_float32_to_float16. See apache/mxnet#13857. In the process of developing this test, a numpy rounding bug was discovered, but a simple work-around was put in place.
Experiments with the new float2half() routine show a speed-up of 50% when measured over the [0,+inf] range of possible 32-bit inputs. The new float2half() routine can compile for the GPU, although this should be rarely if ever needed, and passes the new test.