#[deriving(PartialOrd)] is O(N^2) code size for N enum variants #15375
cc me
Updated with a more accurate description.
As in, it appears that …
Yes, …
It could potentially be a regression from the partial_cmp implementation, since I touched the PartialOrd deriving logic.
cc @cmr
Pretty sure this is my fault, sorry. :( It might be worth having deriving detect C-like enums like this and just cast to an integer, rather than matching.
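For concreteness, here is a minimal sketch (in modern Rust syntax, using a hypothetical `Color` enum) of what such a "cast to an integer" expansion could look like for a C-like enum; the real deriving output would be generated mechanically and may differ in detail:

```rust
use std::cmp::Ordering;

// Hypothetical C-like enum; name and variants are illustrative.
#[derive(Clone, Copy)]
enum Color { Red, Green, Blue }

// Instead of an N x N match, cast both sides to an integer and compare those.
impl PartialEq for Color {
    fn eq(&self, other: &Color) -> bool {
        *self as u32 == *other as u32
    }
}

impl PartialOrd for Color {
    fn partial_cmp(&self, other: &Color) -> Option<Ordering> {
        (*self as u32).partial_cmp(&(*other as u32))
    }
}
```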
@huonw any idea how long it's been like this? It could explain a blowup in the size of the rust-postgres binary that I've noticed but haven't bothered looking into.
It has been creating a huge match ever since I wrote the generalised …
@huonw yes, on the way home I was also thinking that special-case handling of C-like enums in the various deriving implementations could be warranted. It's not as nice as figuring out the source of this blowup (but whoever said we can't do both? :) )
Have we actually proved that this is a new thing? Maybe @cmr could test some older compilers?
@huonw even if it's not new (i.e. not an injection), ... I guess I was just assuming that there would still be some way to implement a generic …
We could (should) also rewrite the implementations to avoid expanding into a huge nested stack of matches, in favour of something like this:

```rust
match self.a.partial_cmp(&other.a) {
    Some(Equal) => {}
    r => return r,
}
match self.b.partial_cmp(&other.b) {
    Some(Equal) => {}
    r => return r,
}
...
```

Even if it's not smaller, it'd at least be more readable.
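As a hedged illustration of that shape, a complete impl for a hypothetical two-field struct might look like this (modern syntax; `Pair` is not from the issue):

```rust
use std::cmp::Ordering::{self, Equal};

// Illustrative struct with two fields.
#[derive(PartialEq)]
struct Pair { a: u32, b: u32 }

// Flat, field-by-field comparison: return early on the first non-Equal result.
impl PartialOrd for Pair {
    fn partial_cmp(&self, other: &Pair) -> Option<Ordering> {
        match self.a.partial_cmp(&other.a) {
            Some(Equal) => {}
            r => return r,
        }
        match self.b.partial_cmp(&other.b) {
            Some(Equal) => {}
            r => return r,
        }
        Some(Equal)
    }
}
```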
Is there any reason to not just generate only …
@huonw As you are probably already aware: the current implementation of every method is inherently quadratic (O(n^2)) in code size. (This is the case for all of the implementations, including ….)

The idea of just comparing numbers for the C-like enum case is a good one, and I believe it generalizes: expand into a match that handles all of the cases where both sides are the same variant (and recur for each case); then, in a fallback, compare the discriminants directly (which were presumably extracted via unsafe code).

So for this enum:

```rust
pub enum Key<T> { A(T), B(T), C(T), D(T), ... }
```

instead of this:

```rust
match *self {
    // each of these is ...
    A(ref __self_0) => match *__arg_0 { A(ref __arg_1_0) => /* recur */, B(_) => Less, C(_) => Less, D(_) => Less, ... },
    // ... N elements long
    B(ref __self_0) => match *__arg_0 { A(_) => Greater, B(ref __arg_1_0) => /* recur */, C(_) => Less, D(_) => Less, ... },
    // .. and there are N variants, yielding O(N^2) code.
    ...
}
```

expand into something like this:

```rust
let discrim_self = ...;
let discrim_arg_0 = ...;
match (*self, *__arg_0) {
    (A(ref __self_0), A(ref __arg_1_0)) => /* recur */,
    (B(ref __self_0), B(ref __arg_1_0)) => /* recur */,
    (C(ref __self_0), C(ref __arg_1_0)) => /* recur */,
    // N variants, but each has code that is bounded in complexity by the particular variant,
    // thus we should be O(N) overall.
    ...
    (_, _) => discrim_self.cmp(discrim_arg_0)
}
```
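The same shape can be shown for the equality case. The sketch below is a hand-written illustration in modern Rust, with `Key` trimmed to three variants and the stable `std::mem::discriminant` (which did not exist at the time) standing in for the unsafe discriminant extraction mentioned above; it has one arm per same-variant pair plus a single fall-through arm, so it grows linearly with the number of variants:

```rust
use std::mem::discriminant;

// Trimmed-down, illustrative version of the enum above.
enum Key<T> { A(T), B(T), C(T) }

impl<T: PartialEq> PartialEq for Key<T> {
    fn eq(&self, other: &Key<T>) -> bool {
        match (self, other) {
            // One arm per variant: recur on the contents when both sides match.
            (Key::A(a), Key::A(b)) => a == b,
            (Key::B(a), Key::B(b)) => a == b,
            (Key::C(a), Key::C(b)) => a == b,
            // Every cross-variant pair collapses into one arm; different
            // variants have different discriminants, so this is always false.
            _ => discriminant(self) == discriminant(other),
        }
    }
}
```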
@pnkfelix That's a good idea!
Previously there was only …
Due to the undefinedness of representation, I would think we need an intrinsic for this to avoid the …
@huonw you are right, a compiler intrinsic is the right way to query for a …
Oh, also, I think that the undefinedness of representation may cause the ordering to change, e.g. currently:

```rust
enum Foo { A, B } // 0, 1

Some(A)      // a u8 with value 0
None::<Foo>  // a u8 with value 2
```

i.e. …
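To make that concern concrete, here is a small self-contained check (modern Rust; the exact byte values are a compiler implementation detail, not a guarantee) showing that the packed representation of `Option<Foo>` puts `None` after `Some(_)`, while the derived ordering puts `None` first:

```rust
#[derive(PartialEq, PartialOrd, Clone, Copy)]
enum Foo { A, B }

fn main() {
    // Option<Foo> is packed into a single byte using Foo's unused bit patterns.
    assert_eq!(std::mem::size_of::<Option<Foo>>(), 1);

    // Peek at the raw bytes (sizes match, so the transmutes are sound).
    let some_a: u8 = unsafe { std::mem::transmute(Some(Foo::A)) }; // typically 0
    let none: u8 = unsafe { std::mem::transmute(None::<Foo>) };    // typically 2
    println!("Some(A) = {}, None = {}", some_a, none);

    // The derived ordering disagrees with a raw comparison of those bytes.
    assert!(None::<Foo> < Some(Foo::A));
}
```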
@huonw Good point. So, okay, if extracting the discriminant via an intrinsic and mapping that to an appropriate value becomes unworkable (or not cost-effective), then: worst case scenario, we don't bother attempting to base the comparison on the internal discriminant. Instead we compute the appropriate number within the …:

```rust
let discrim_self = match *self { A(..) => 0u, B(..) => 1u, C(..) => 2u, ... }; // N variants
let discrim_arg_0 = match *__arg_0 { A(..) => 0u, B(..) => 1u, C(..) => 2u, ... }; // N variants
match (*self, *__arg_0) {
    (A(ref __self_0), A(ref __arg_1_0)) => /* recur */,
    (B(ref __self_0), B(ref __arg_1_0)) => /* recur */,
    (C(ref __self_0), C(ref __arg_1_0)) => /* recur */,
    // as above, N variants, each bounded by the complexity of the tuple of its variant.
    ...
    (_, _) => discrim_self.cmp(discrim_arg_0)
}
```

I mean, it's not great having code size here be proportional to 3*N, but it's still O(N), versus what we are dealing with today.
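Pulling the two sketches together, a hand-written version of this expansion for a trimmed three-variant `Key` could look like the following (modern syntax, illustrative only; the `0`/`1`/`2` indices and the per-variant arms are roughly the shape of what the deriving code would emit mechanically):

```rust
use std::cmp::Ordering;

// Trimmed-down, illustrative enum.
#[derive(PartialEq)]
enum Key<T> { A(T), B(T), C(T) }

impl<T: PartialOrd> PartialOrd for Key<T> {
    fn partial_cmp(&self, other: &Key<T>) -> Option<Ordering> {
        // Two O(N) helper matches computing a per-variant index.
        let discrim_self = match *self { Key::A(..) => 0u32, Key::B(..) => 1, Key::C(..) => 2 };
        let discrim_other = match *other { Key::A(..) => 0u32, Key::B(..) => 1, Key::C(..) => 2 };
        match (self, other) {
            // One arm per variant, recurring on the contents.
            (Key::A(a), Key::A(b)) => a.partial_cmp(b),
            (Key::B(a), Key::B(b)) => a.partial_cmp(b),
            (Key::C(a), Key::C(b)) => a.partial_cmp(b),
            // A single catch-all for every cross-variant pair: order by index.
            _ => discrim_self.partial_cmp(&discrim_other),
        }
    }
}
```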
I wonder if the reason we had not noticed this problem until now is that we rarely use …

Here is an informal survey:
…

(I guess a better survey would use ….)
Over the weekend I whipped up a small commit series that implements the idea I outlined above; it is almost ready. I will hopefully have a PR for it put together sometime tomorrow.
…r=huonw

Instead of generating a separate case (albeit trivial) for each of the N*N cases when comparing two instances of an enum with N variants, this `deriving` uses the strategy outlined here: #15375 (comment)

In particular, it generates code that looks more like this:

```rust
match (this, that, ...) {
    (Variant1, Variant1, Variant1) => ... // delegate Matching on Variant1
    (Variant2, Variant2, Variant2) => ... // delegate Matching on Variant2
    ...
    _ => {
        let index_tup = {
            let idx_this = match this { Variant1 => 0u, Variant2 => 1u, ... };
            let idx_that = match that { Variant1 => 0u, Variant2 => 1u, ... };
            ...
            (idx_this, idx_that, ...)
        };
        ... // delegate to catch-all; it can inspect `index_tup` for its logic
    }
}
```

While adding a new variant to the `const_nonmatching` flag (and renaming it to `on_nonmatching`) to allow expressing the above (while still allowing one to opt back into the old `O(N^2)`, and in general `O(N^K)` (where `K` is the number of self arguments), code generation behavior), I had two realizations:

1. Observation: Nothing except for the comparison derivings (`PartialOrd`, `Ord`, `PartialEq`, `Eq`) was even using the old `O(N^K)` code generator. So after this hypothetically lands, *nothing* needs to use it, and thus that code generation strategy could be removed, under the assumption that it is very unlikely that any `deriving` mode will actually need that level of generality.

2. Observation: The new code generator I am adding can actually be unified with all of the other code generators that just dispatch on the variant tag (they all assume that there is only one self argument).

These two observations mean that one can get rid of the `const_nonmatching` (aka `on_nonmatching`) flag entirely. So I did that too in this PR.

The question is: do we actually want to follow through on both of the above observations? I'm pretty sure the second observation is a pure win. But there *might* be someone out there with an example that invalidates the reasoning in the first observation. That is, there might be a client out there with an example of a hypothetical deriving mode that wants to opt into the `O(N^K)` behavior. So, if that is true, then I can revise this PR to resurrect the `on_nonmatching` flag and provide a way to access the `O(N^K)` behavior.

The manner in which I choose to squash these commits during a post-review rebase depends on the answer to the above question.

Fix #15375.
The program below generates a 259MB rlib, taking about two minutes to compile. Something pathological is going on here which shouldn't be going on.
cc #15346