forked from rust-lang/rust
-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update from origin 2020-06-18 #4
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
We have been seeing some very inefficient code that went away when using `-Cforce-frame-pointers=no`. For instance `core::ptr::drop_in_place` at `-Oz` was compiled into a function which consisted entirely of saving registers to the stack, then using the frame pointer to restore the same registers (without any instructions between the prolog and epilog). The RISC-V LLVM backend supports frame pointer elimination, so it makes sense to allow this to happen when using Rust. It's not clear to me that frame pointers have ever been required in the general case. In #61675 it was pointed out that this made reassembling stack traces easier, which is true, but there is a code generation option for forcing frame pointers, and I feel the default should not be to require frame pointers, given it demonstrably makes code size worse (around 10% in some embedded applications). The kinds of targets mentioned in #61675 are popular, but should not dictate that code generation should be worse for all RISC-V targets, especially as there is a way to use CFI information to reconstruct the stack when the frame pointer is eliminated. It is also a misconception that `fp` is always used for the frame pointer. `fp` is an ABI name for `x8` (aka `s0`), and if no frame pointer is required, `x8` may be used for other callee-saved values. This commit does ensure that the standard library is built with unwind tables, so that users do not need to rebuild the standard library in order to get a backtrace that includes standard library calls (which is the original reason for forcing frame pointers).
* Check for overflow when calculating the slice start & end position. * Align the pointer obtained from the allocator, ensuring that it satisfies user requested alignment (the allocator is only asked for layout compatible with u8 slice). * Remove an incorrect assertion from DroplessArena::align.
Return a pointer from `alloc_raw` instead of a slice. There is no practical use for slice as a return type and changing it to a pointer avoids forming references to an uninitialized memory.
When we're running with dry_run enabled (i.e. all builds do this initially), we're guaranteed to save of a toolstate of TestFail for tools that aren't tested. In practice, we do test tools as well, so for those tools we would initially record them as being TestPass, and then later on re-record the correct state after actually testing them. However, this would not work well if the build failed for whatever reason (e.g. panicking in bootstrap, or as was the case in 73097, clippy failing to test successfully), we would just go on believing that things passed when they in practice did not. This commit also adjusts saving toolstate to never record clippy explicitly (otherwise, it would be recorded when building it); eventually that'll likely move to other tools as well but not yet. This is deemed simpler than checking everywhere we generically save toolstate. We also move clippy out of the "toolstate" no-fail-fast build into a separate x.py invocation; this should no longer be technically required but provides the nice state of letting us check toolstate for all tools and only then check clippy (giving full results on every build).
store `ObligationCause` on the heap Stores `ObligationCause` on the heap using an `Rc`. This PR trades off some transient memory allocations to reduce the size of–and thus the number of instructions required to memcpy–a few widely used data structures in trait solving.
Avoid prematurely recording toolstates When we're running with dry_run enabled (i.e. all builds do this initially), we're guaranteed to save of a toolstate of TestFail for tools that aren't tested. In practice, we do test tools as well, so for those tools we would initially record them as being TestPass, and then later on re-record the correct state after actually testing them. However, this would not work well if the build failed for whatever reason (e.g. panicking in bootstrap, or as was the case in #73097, clippy failing to test successfully), we would just go on believing that things passed when they in practice did not. This commit also adjusts saving toolstate to never record clippy explicitly (otherwise, it would be recorded when building it); eventually that'll likely move to other tools as well but not yet. This is deemed simpler than checking everywhere we generically save toolstate. We also move clippy out of the "toolstate" no-fail-fast build into a separate x.py invocation; this should no longer be technically required but provides the nice state of letting us check toolstate for all tools and only then check clippy (giving full results on every build). r? @oli-obk Supercedes #73275, also fixes #73274
Apparently it got changed.
Check for overflow in DroplessArena and align returned memory * Check for overflow when calculating the slice start & end position. * Align the pointer obtained from the allocator, ensuring that it satisfies user requested alignment (the allocator is only asked for layout compatible with u8 slice). * Remove an incorrect assertion from DroplessArena::align. * Avoid forming references to an uninitialized memory in DroplessArena. Helps with #73007, #72624.
Don't run generator transform when there's a TyErr Not sure if this might cause any problems later on, but we shouldn't be hitting codegen or const eval for the produced MIR anyways, so it should be fine. cc #72685 (comment)
…kinnison Re-order correctly the sections in the sidebar Before that, "trait implementations" and "implementors" titles in the sidebar were before "methods" for example. Which wasn't logical considering that the two sections come after in the "content". r? @kinnison
Use track caller for bug! macro
… r=Mark-Simulacrum Add more info to `x.py build --help` on default value for `-j JOBS`.
Fix typo in docs of std::mem
Use `Ipv4Addr::from<[u8; 4]>` when possible Resolve this comment: #73331 (comment)
Fix forge-platform-support URL Apparently it got changed.
Rollup of 8 pull requests Successful merges: - #73237 (Check for overflow in DroplessArena and align returned memory) - #73339 (Don't run generator transform when there's a TyErr) - #73372 (Re-order correctly the sections in the sidebar) - #73373 (Use track caller for bug! macro) - #73380 (Add more info to `x.py build --help` on default value for `-j JOBS`.) - #73381 (Fix typo in docs of std::mem) - #73389 (Use `Ipv4Addr::from<[u8; 4]>` when possible) - #73400 (Fix forge-platform-support URL) Failed merges: r? @ghost
Update LLVM submodule Only includes one commit: - [D80759](https://reviews.llvm.org/D80759): Fix FastISel dropping srcloc metadata from InlineAsm Fixes #40555
…uppe,Mark-Simulacrum [RISC-V] Do not force frame pointers We have been seeing some very inefficient code that went away when using `-Cforce-frame-pointers=no`. For instance `core::ptr::drop_in_place` at `-Oz` was compiled into a function which consisted entirely of saving registers to the stack, then using the frame pointer to restore the same registers (without any instructions between the prolog and epilog). The RISC-V LLVM backend supports frame pointer elimination, so it makes sense to allow this to happen when using Rust. It's not clear to me that frame pointers have ever been required in the general case. In #61675 it was pointed out that this made reassembling stack traces easier, which is true, but there is a code generation option for forcing frame pointers, and I feel the default should not be to require frame pointers, given it demonstrably makes code size worse (around 10% in some embedded applications). The kinds of targets mentioned in #61675 are popular, but should not dictate that code generation should be worse for all RISC-V targets, especially as there is a way to use CFI information to reconstruct the stack when the frame pointer is eliminated. It is also a misconception that `fp` is always used for the frame pointer. `fp` is an ABI name for `x8` (aka `s0`), and if no frame pointer is required, `x8` may be used for other callee-saved values. --- I am partly posting this to get feedback from @fintelia who introduced the change to require frame pointers, and @hanna-kruppe who had issues with the original PR. I would understand if we wanted to remove this setting on only a subset of RISC-V targets, but my preference would be to remove this setting everywhere. There are more details on the code size savings seen in Tock here: tock/tock#1660
linker: Never pass `-no-pie` to non-gnu linkers Fixes #73370
richkadel
added a commit
that referenced
this pull request
Oct 5, 2020
This is a combination of 18 commits. Commit #2: Additional examples and some small improvements. Commit #3: fixed mir-opt non-mir extensions and spanview title elements Corrected a fairly recent assumption in runtest.rs that all MIR dump files end in .mir. (It was appending .mir to the graphviz .dot and spanview .html file names when generating blessed output files. That also left outdated files in the baseline alongside the files with the incorrect names, which I've now removed.) Updated spanview HTML title elements to match their content, replacing a hardcoded and incorrect name that was left in accidentally when originally submitted. Commit #4: added more test examples also improved Makefiles with support for non-zero exit status and to force validation of tests unless a specific test overrides it with a specific comment. Commit #5: Fixed rare issues after testing on real-world crate Commit #6: Addressed PR feedback, and removed temporary -Zexperimental-coverage -Zinstrument-coverage once again supports the latest capabilities of LLVM instrprof coverage instrumentation. Also fixed a bug in spanview. Commit #7: Fix closure handling, add tests for closures and inner items And cleaned up other tests for consistency, and to make it more clear where spans start/end by breaking up lines. Commit #8: renamed "typical" test results "expected" Now that the `llvm-cov show` tests are improved to normally expect matching actuals, and to allow individual tests to override that expectation. Commit #9: test coverage of inline generic struct function Commit #10: Addressed review feedback * Removed unnecessary Unreachable filter. * Replaced a match wildcard with remining variants. * Added more comments to help clarify the role of successors() in the CFG traversal Commit #11: refactoring based on feedback * refactored `fn coverage_spans()`. * changed the way I expand an empty coverage span to improve performance * fixed a typo that I had accidently left in, in visit.rs Commit #12: Optimized use of SourceMap and SourceFile Commit #13: Fixed a regression, and synched with upstream Some generated test file names changed due to some new change upstream. Commit #14: Stripping out crate disambiguators from demangled names These can vary depending on the test platform. Commit #15: Ignore llvm-cov show diff on test with generics, expand IO error message Tests with generics produce llvm-cov show results with demangled names that can include an unstable "crate disambiguator" (hex value). The value changes when run in the Rust CI Windows environment. I added a sed filter to strip them out (in a prior commit), but sed also appears to fail in the same environment. Until I can figure out a workaround, I'm just going to ignore this specific test result. I added a FIXME to follow up later, but it's not that critical. I also saw an error with Windows GNU, but the IO error did not specify a path for the directory or file that triggered the error. I updated the error messages to provide more info for next, time but also noticed some other tests with similar steps did not fail. Looks spurious. Commit #16: Modify rust-demangler to strip disambiguators by default Commit #17: Remove std::process::exit from coverage tests Due to Issue rust-lang#77553, programs that call std::process::exit() do not generate coverage results on Windows MSVC. Commit #18: fix: test file paths exceeding Windows max path len
richkadel
pushed a commit
that referenced
this pull request
Nov 29, 2020
Don't run `resolve_vars_if_possible` in `normalize_erasing_regions` Neither `@eddyb` nor I could figure out what this was for. I changed it to `assert_eq!(normalized_value, infcx.resolve_vars_if_possible(&normalized_value));` and it passed the UI test suite. <details><summary> Outdated, I figured out the issue - `needs_infer()` needs to come _after_ erasing the lifetimes </summary> Strangely, if I change it to `assert!(!normalized_value.needs_infer())` it panics almost immediately: ``` query stack during panic: #0 [normalize_generic_arg_after_erasing_regions] normalizing `<str::IsWhitespace as str::pattern::Pattern>::Searcher` #1 [needs_drop_raw] computing whether `str::iter::Split<str::IsWhitespace>` needs drop #2 [mir_built] building MIR for `str::<impl str>::split_whitespace` #3 [unsafety_check_result] unsafety-checking `str::<impl str>::split_whitespace` #4 [mir_const] processing MIR for `str::<impl str>::split_whitespace` #5 [mir_promoted] processing `str::<impl str>::split_whitespace` #6 [mir_borrowck] borrow-checking `str::<impl str>::split_whitespace` #7 [analysis] running analysis passes on this crate end of query stack ``` I'm not entirely sure what's going on - maybe the two disagree? </details> For context, this came up while reviewing rust-lang#77467 (cc `@lcnr).` Possibly this needs a crater run? r? `@nikomatsakis` cc `@matthewjasper`
richkadel
pushed a commit
that referenced
this pull request
Mar 19, 2021
HWAddressSanitizer support # Motivation Compared to regular ASan, HWASan has a [smaller overhead](https://source.android.com/devices/tech/debug/hwasan). The difference in practice is that HWASan'ed code is more usable, e.g. Android device compiled with HWASan can be used as a daily driver. # Example ``` fn main() { let xs = vec![0, 1, 2, 3]; let _y = unsafe { *xs.as_ptr().offset(4) }; } ``` ``` ==223==ERROR: HWAddressSanitizer: tag-mismatch on address 0xefdeffff0050 at pc 0xaaaad00b3468 READ of size 4 at 0xefdeffff0050 tags: e5/00 (ptr/mem) in thread T0 #0 0xaaaad00b3464 (/root/main+0x53464) #1 0xaaaad00b39b4 (/root/main+0x539b4) #2 0xaaaad00b3dd0 (/root/main+0x53dd0) #3 0xaaaad00b61dc (/root/main+0x561dc) #4 0xaaaad00c0574 (/root/main+0x60574) #5 0xaaaad00b6290 (/root/main+0x56290) #6 0xaaaad00b6170 (/root/main+0x56170) #7 0xaaaad00b3578 (/root/main+0x53578) #8 0xffff81345e70 (/lib64/libc.so.6+0x20e70) #9 0xaaaad0096310 (/root/main+0x36310) [0xefdeffff0040,0xefdeffff0060) is a small allocated heap chunk; size: 32 offset: 16 0xefdeffff0050 is located 0 bytes to the right of 16-byte region [0xefdeffff0040,0xefdeffff0050) allocated here: #0 0xaaaad009bcdc (/root/main+0x3bcdc) #1 0xaaaad00b1eb0 (/root/main+0x51eb0) #2 0xaaaad00b20d4 (/root/main+0x520d4) #3 0xaaaad00b2800 (/root/main+0x52800) #4 0xaaaad00b1cf4 (/root/main+0x51cf4) #5 0xaaaad00b33d4 (/root/main+0x533d4) #6 0xaaaad00b39b4 (/root/main+0x539b4) #7 0xaaaad00b61dc (/root/main+0x561dc) #8 0xaaaad00b3578 (/root/main+0x53578) #9 0xaaaad0096310 (/root/main+0x36310) Thread: T0 0xeffe00002000 stack: [0xffffc0590000,0xffffc0d90000) sz: 8388608 tls: [0xffff81521020,0xffff815217d0) Memory tags around the buggy address (one tag corresponds to 16 bytes): 0xfefcefffef80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffef90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffefa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffefb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffefc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffefd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffefe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffeff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>0xfefceffff000: a2 a2 05 00 e5 [00] 00 00 00 00 00 00 00 00 00 00 0xfefceffff010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Tags for short granules around the buggy address (one tag corresponds to 16 bytes): 0xfefcefffeff0: .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. =>0xfefceffff000: .. .. c5 .. .. [..] .. .. .. .. .. .. .. .. .. .. 0xfefceffff010: .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. See https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html#short-granules for a description of short granule tags Registers where the failure occurred (pc 0xaaaad00b3468): x0 e500efdeffff0050 x1 0000000000000004 x2 0000ffffc0d8f5a0 x3 0200efff00000000 x4 0000ffffc0d8f4c0 x5 000000000000004f x6 00000ffffc0d8f36 x7 0000efff00000000 x8 e500efdeffff0050 x9 0200efff00000000 x10 0000000000000000 x11 0200efff00000000 x12 0200effe000006b0 x13 0200effe000006b0 x14 0000000000000008 x15 00000000c00000cf x16 0000aaaad00a0afc x17 0000000000000003 x18 0000000000000001 x19 0000ffffc0d8f718 x20 ba00ffffc0d8f7a0 x21 0000aaaad00962e0 x22 0000000000000000 x23 0000000000000000 x24 0000000000000000 x25 0000000000000000 x26 0000000000000000 x27 0000000000000000 x28 0000000000000000 x29 0000ffffc0d8f650 x30 0000aaaad00b3468 ``` # Comments/Caveats * HWASan is only supported on arm64. * I'm not sure if I should add a feature gate or piggyback on the existing one for sanitizers. * HWASan requires `-C target-feature=+tagged-globals`. That flag should probably be set transparently to the user. Not sure how to go about that. # TODO * Need more tests. * Update documentation. * Fix symbolization. * Integrate with CI
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.