-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Saturating casts between integers and floats #45205
Conversation
r? @eddyb (rust_highfive has picked a reviewer for you, use r? to override) |
190b73b
to
1cb1073
Compare
test!(2147483648., f32 -> i32, 2147483647); | ||
// With 24 significand bits, floats in [2^30 + 1, 2^31] are rounded to multiples of 2^7. | ||
// Therefore, the next smallest f32 is 2^31 - 128: | ||
test!(2147483520., f32 -> i32, 2147483520); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you have tests on both sides of the boundary for positive values, the other interesting value to test here is -2147483904.
, which is the other side of the boundary for negative values.
Also, if it's interesting to test these boundaries for signed conversions, the unsigned boundaries may also be interesting: 4294967040.
and 4294967296.0
on the upper boundary, and -0.99999994
and -1.
on the lower.
test!(-0., $fty -> $ity, 0); | ||
test!(-$fty::MIN_POSITIVE, $fty -> $ity, 0); | ||
test!(-1., $fty -> $ity, 0); | ||
test!(-100., $fty -> $ity, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd also be interesting to test a number in (-1,-0.5), such as -0.9, to make sure it's properly rounded up to 0.
src/librustc_trans/mir/constant.rs
Outdated
let infinity_bits = match float_ty.float_width() { | ||
32 => C_u32(ccx, ieee::Single::INFINITY.to_bits() as u32), | ||
64 => C_u64(ccx, ieee::Double::INFINITY.to_bits() as u64), | ||
n => bug!("unsupported float width {}", n), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only u128 → f32 is the problematic cast here, so the 64 =>
branch should never be entered (i.e. it should be a bug!
). There's no problem representing u128::MAX
in f64
as a finite number.
src/librustc_trans/mir/rvalue.rs
Outdated
let infinity_bits = match float_ty.float_width() { | ||
32 => C_u32(bcx.ccx, ieee::Single::INFINITY.to_bits() as u32), | ||
64 => C_u64(bcx.ccx, ieee::Double::INFINITY.to_bits() as u64), | ||
n => bug!("unsupported float width {}", n), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it should be, and that would have caught a bug - I applied the f32 overflow check even when casting to f64. Will add tests for that.
1cb1073
to
dd25630
Compare
Incorportated the suggestions by @sunfishcode and @kennytm. |
src/librustc_trans/mir/constant.rs
Outdated
// If this is an u128 cast and the value is > f32::MAX + 0.5 ULP, round up to infinity. | ||
if signed { | ||
llvm::LLVMConstSIToFP(llval, float_ty.to_ref()) | ||
} else if value >= 0xffffff80000000000000000000000000_u128 && float_ty.float_width() == 32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really need to hardcode this value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't need to, I just didn't see a way to calculate it from existing constants that is simpler to understand and verify, at least to me. However, I didn't put too much thought into it and suggestions are welcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does the 8 come from? The float mantissa has 23 bits. Add the implicit bit, you get 24, which is six times UPDATE: see comment belowf
. println!("{:x}", std::f32::MAX as u128);
gives me six times f
as well, and its the same according to the internet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it would be best to have a constant somewhere (maybe in the file?) and use it in both places, e.g. representing it as 0xffffff_u128 << (128 - 24)
, which is IMO the clearest way to display the number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OH I get it, the 8 is needed due to rounding. Disregard my last 2 comments! I still think a constant somewhere with a nice explanation would be cool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it seems to be a number of 1
s at the top of u128
, how about computing it from the number of significand bits?
EDIT: Oh wow, github didn't show me @est31's comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regardless of whether it's hardcoded or computed, I'm not sure where it should live if it's a shared definition. I guess I could chuck it in common
or something, but ehhhh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd put it in rustc_const_math::float
.
src/librustc_trans/mir/rvalue.rs
Outdated
if is_u128_to_f32 && bcx.sess().opts.debugging_opts.saturating_float_casts { | ||
// f32::MAX + 0.5 ULP as u128. All inputs greater or equal to this should be | ||
// rounded to infinity, for everything else LLVM's uitofp works just fine. | ||
let max = C_big_integral(int_ty, 0xffffff80000000000000000000000000_u128); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
cc @est31 @nagisa @alexcrichton Who's better qualified to review floating-point code? |
src/librustc_trans/mir/constant.rs
Outdated
@@ -955,6 +952,55 @@ pub fn const_scalar_checked_binop<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, | |||
} | |||
} | |||
|
|||
unsafe fn const_cast_from_float(operand: &Const, signed: bool, int_ty: Type) -> ValueRef { | |||
let llval = operand.llval; | |||
// Note: this breaks if addresses can be turned into integers (is that possible?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is valid rust code, isn't it?
println!("{}", {let w: * const u32 = {let v = 0u32; &v}; w} as usize);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but this is constant evaluation and pointer->integer casts in constant expression do not compile:
static Y: i32 = 0;
static Z: usize = &Y as usize;
src/librustc_trans/mir/rvalue.rs
Outdated
// rounded to infinity, for everything else LLVM's uitofp works just fine. | ||
let max = C_big_integral(int_ty, 0xffffff80000000000000000000000000_u128); | ||
let overflow = bcx.icmp(llvm::IntUGE, x, max); | ||
let infinity_bits = match float_ty.float_width() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need the match here, the width is checked above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that's true now. It wasn't before @kennytm pointed out that I needed the width check up there. (Same in constant evaluation.)
What's up with travis? The run-pass test fails with an error that looks plausible and platform-independent, but I can't reproduce it (on x86_64-windows-msvc).
|
src/librustc_trans/mir/constant.rs
Outdated
if signed { | ||
llvm::LLVMConstSIToFP(llval, float_ty.to_ref()) | ||
} else if value >= 0xffffff80000000000000000000000000_u128 && float_ty.float_width() == 32 { | ||
let infinity_bits = match float_ty.float_width() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, the match is not needed.
src/librustc_trans/mir/constant.rs
Outdated
// If this is an u128 cast and the value is > f32::MAX + 0.5 ULP, round up to infinity. | ||
if signed { | ||
llvm::LLVMConstSIToFP(llval, float_ty.to_ref()) | ||
} else if value >= 0xffffff80000000000000000000000000_u128 && float_ty.float_width() == 32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does the 8 come from? The float mantissa has 23 bits. Add the implicit bit, you get 24, which is six times UPDATE: see comment belowf
. println!("{:x}", std::f32::MAX as u128);
gives me six times f
as well, and its the same according to the internet.
dd25630
to
1fbfeca
Compare
@est31 This constant isn't f32::MAX, it's f32::MAX + 0.5 ULP. Evidently the constant is not as clear as I'd hoped, so I'll figure out a way to compute it that hopefully makes it clearer what's happening. How about
The -1 is pretty ugly as well ¯\_(ツ)_/¯ |
src/librustc_trans/mir/rvalue.rs
Outdated
let max = C_big_integral(int_ty, 0xffffff80000000000000000000000000_u128); | ||
let overflow = bcx.icmp(llvm::IntUGE, x, max); | ||
let infinity_bits = C_u32(bcx.ccx, ieee::Single::INFINITY.to_bits() as u32); | ||
let infinity = consts::bitcast(, float_ty); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤣
[00:13:30] error: expected expression, found `,`
[00:13:30] --> /checkout/src/librustc_trans/mir/rvalue.rs:835:40
[00:13:30] |
[00:13:30] 835 | let infinity = consts::bitcast(, float_ty);
[00:13:30] | ^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😱
1fbfeca
to
d96d318
Compare
src/librustc_trans/mir/rvalue.rs
Outdated
bcx.select(bcx.fcmp(llvm::RealOGT, x, f_max), int_max, cast_result) | ||
} else { | ||
cast_result | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if you're interested in micro-optimizing at this point, but if you are: it looks like this code does 4 selects for f32->i32; it would be a little shorter to take advantage of the fact that fptoui/fptosi in LLVM return undef rather than having UB, so you could do the fptosi/fptoui first, and then use selects to pick between the fptosi/fptui result value and the integer min/max/0 values in the various exceptional cases. That way, you'd always need at most 3 selects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played around with this option a bit but I didn't quickly find a way to write the first two selects do the clipping with two selects such that ity::MIN
comes out if x
is NaN. If that's not possible, i.e., a separate NaN check is always necessary, the current code is shorter (two selects) for casts such as f32 -> u16 or f64 -> f32. But perhaps I'm missing the forest for the trees? If someone can come up with a way to do three selects for signed types and two for unsigned casts, I'd gladly implement that (after I find time to address the other nits).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a way to do three selects for signed and two for unsigned, which should be generalizable to other bitwidths:
define i32 @fptosi(float %f) {
%t = fptosi float %f to i32
%c0 = fcmp olt float %f, 0xC1E0000000000000 ; -0x1p31
%c1 = fcmp oge float %f, 0x41E0000000000000 ; 0x1p31
%c2 = fcmp uno float %f, 0.000000e+00
%s0 = select i1 %c0, i32 -2147483648, i32 %t
%s1 = select i1 %c1, i32 2147483647, i32 %s0
%s2 = select i1 %c2, i32 0, i32 %s1
ret i32 %s2
}
define i32 @fptoui(float %f) {
%t = fptoui float %f to i32
%c0 = fcmp ule float %f, 0.000000e+00
%c1 = fcmp oge float %f, 0x41F0000000000000 ; 0x1p32
%s0 = select i1 %c0, i32 0, i32 %t
%s1 = select i1 %c1, i32 -1, i32 %s0
ret i32 %s1
}
src/librustc_apfloat/lib.rs
Outdated
@@ -380,7 +380,9 @@ pub trait Float | |||
fn from_bits(input: u128) -> Self; | |||
fn from_i128_r(input: i128, round: Round) -> StatusAnd<Self> { | |||
if input < 0 { | |||
Self::from_u128_r(-input as u128, -round).map(|r| -r) | |||
// This is equivalent to `(-input) as u128` except it doesn't overflow on i128::MIN | |||
let abs_input = !(input as u128) + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ehm, just use wrapping_neg
? Somehow I forgot to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right, that's a thing.
763bf51
to
7a894a8
Compare
@eddyb IIRC you wanted to run crater with the variant that errors on overflow/NaN in trans const eval? That's implemented now. (That it's implemented now also means this shouldn't be merged in the current state until we've had a crater run.) |
@bors try |
⌛ Trying commit 3c3082fe0763462f05c8cee7c9abe14092fa171d with merge cc787eba2ca6e3cf90a5e6b04f4fedd924787b68... |
3cf487c
to
21a89f9
Compare
☔ The latest upstream changes (presumably #45666) made this pull request unmergeable. Please resolve the merge conflicts. |
This affects regular code generation as well as constant evaluation in trans, but not the HIR constant evaluator because that one returns an error for overflowing casts and NaN-to-int casts. That error is conservatively correct and we should be careful to not accept more code in constant expressions. The changes to code generation are guarded by a new -Z flag, to be able to evaluate the performance impact. The trans constant evaluation changes are unconditional because they have no run time impact and don't affect type checking either.
21a89f9
to
ce46649
Compare
Rebased. Travis is green. |
@bors r+ |
📌 Commit ce46649 has been approved by |
⌛ Testing commit ce46649 with merge 3da399728e98a72205b3563482de647c49972154... |
💔 Test failed - status-travis |
cc #45676.
|
@bors r+ |
📌 Commit ef0b999 has been approved by |
Saturating casts between integers and floats Introduces a new flag, `-Z saturating-float-casts`, which makes code generation for int->float and float->int casts safe (`undef`-free), implementing [the saturating semantics laid out by](#10184 (comment)) @jorendorff for float->int casts and overflowing to infinity for `u128::MAX` -> `f32`. Constant evaluation in trans was changed to behave like HIR const eval already did, i.e., saturate for u128->f32 and report an error for problematic float->int casts. Many thanks to @eddyb, whose APFloat port simplified many parts of this patch, and made HIR constant evaluation recognize dangerous float casts as mentioned above. Also thanks to @ActuallyaDeviloper whose branchless implementation served as inspiration for this implementation. cc #10184 #41799 fixes #45134
☀️ Test successful - status-appveyor, status-travis |
} | ||
|
||
#[no_mangle] | ||
pub fn f64_to_u8(x: f32) -> u16 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arg types does not match function name. Should be either f32_to_u16(x: f32) -> u16
or f64_to_u8(x: f64) -> u8
.
Introduces a new flag,
-Z saturating-float-casts
, which makes code generation for int->float and float->int casts safe (undef
-free), implementing the saturating semantics laid out by @jorendorff for float->int casts and overflowing to infinity foru128::MAX
->f32
.Constant evaluation in trans was changed to behave like HIR const eval already did, i.e., saturate for u128->f32 and report an error for problematic float->int casts.
Many thanks to @eddyb, whose APFloat port simplified many parts of this patch, and made HIR constant evaluation recognize dangerous float casts as mentioned above.
Also thanks to @ActuallyaDeviloper whose branchless implementation served as inspiration for this implementation.
cc #10184 #41799
fixes #45134