Hierarchy of Sized traits #3729

Open
wants to merge 64 commits into
base: master
from

Conversation

davidtwco
Member

@davidtwco davidtwco commented Nov 15, 2024

All of Rust's types are either sized, which implement the Sized trait and have a statically known size during compilation, or unsized, which do not implement the Sized trait and are assumed to have a size which can be computed at runtime. However, this dichotomy misses two categories of type - types whose size is unknown during compilation but is a runtime constant, and types whose size can never be known. Supporting the former is a prerequisite to stable scalable vector types and supporting the latter is a prerequisite to unblocking extern types. This RFC proposes a hierarchy of Sized traits in order to be able to support these use cases.
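As a small illustration of the existing dichotomy described above, the sketch below uses only today's stable `std::mem::size_of` and `std::mem::size_of_val`. The names `U32_SIZE` and `runtime_size` are made up for this example and do not appear in the RFC.

```rust
use std::mem::{size_of, size_of_val};

/// Size of a `Sized` type: known during compilation, no value required.
pub const U32_SIZE: usize = size_of::<u32>();

/// Size of a value behind a reference, computed at runtime.
/// `?Sized` opts out of the default `Sized` bound, so `T` may be
/// `str`, `[u16]`, a trait object, and so on.
pub fn runtime_size<T: ?Sized>(value: &T) -> usize {
    size_of_val(value)
}
```

Calling `runtime_size("hello")` instantiates `T = str`, which has no statically known size. Neither of the two categories the RFC wants to add (runtime-constant sizes and sizes that can never be known) can be expressed with these two functions.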

This RFC relies on experimental, yet-to-be-RFC'd const traits, so this is blocked on that. I haven't squashed any of the previous revisions but can do so if/when this is approved. Already discussed in the 2024-11-13 t-lang design meeting with feedback incorporated.

See this comment for the most recent summary of changes to this RFC since it was opened.

Rendered

@davidtwco davidtwco added the T-lang Relevant to the language team, which will review and decide on the RFC. label Nov 15, 2024
Co-authored-by: León Orell Valerian Liehr <[email protected]>
@tmandry tmandry added the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label Nov 15, 2024
@davidtwco
Member Author

I would appreciate if the RFC could list the syntax as an unresolved question: should there be some clear syntactic marker that the default bound is removed, or are we okay with that being an entirely implicit side-effect of adding another bound? I am not sure if it is necessary to commit to a particular syntax for this already, and seeing this used in practice will help determine how confusing it really is.

I've added this as an unresolved question.

As one point for why this is confusing, imagine I have

trait MyTrait: const ValueSized {}

and now I write T: MyTrait. I presume that under your RFC, this is not equivalent to T: MyTrait + const ValueSized, since the latter opts-out of the default bound, but the former does not! That's non-compositional and odd, and IMO the RFC should clearly acknowledge this and make sure we get back to this question before stabilization (by listing it as an unresolved question).

That makes sense. You're right that T: MyTrait would not imply removal of the default bound, it would be T: const Sized + MyTrait unless you wrote T: MyTrait + const ValueSized (or T: MyTrait + ?Sized if we go that way). In Niko's blog post, he suggested it could be that T: MyTrait implies removal of the default bound. I mention that what I have proposed is compatible with that but don't propose it at the moment.
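The non-compositionality being discussed has a close analogue in today's Rust, sketched below with the existing `?Sized` syntax standing in for the proposed `const ValueSized` bound. This is an analogy only, not the RFC's syntax; `MyTrait` here has no supertrait, and `sized_only` and `maybe_unsized` are hypothetical names.

```rust
trait MyTrait {
    fn describe(&self) -> String;
}

impl MyTrait for u8 {
    fn describe(&self) -> String {
        format!("u8 with value {}", self)
    }
}

// Traits do not require `Self: Sized` by default, so `str` can implement it.
impl MyTrait for str {
    fn describe(&self) -> String {
        format!("str of {} bytes", self.len())
    }
}

// `T: MyTrait` alone keeps the implicit default bound: `T: Sized + MyTrait`.
fn sized_only<T: MyTrait>(value: &T) -> String {
    value.describe()
}

// Only an explicit relaxation written on the bound list removes the default.
fn maybe_unsized<T: MyTrait + ?Sized>(value: &T) -> String {
    value.describe()
}
```

`maybe_unsized("abc")` is accepted, while `sized_only("abc")` is rejected because `str` is not `Sized`, even though `str` implements `MyTrait`. The trait bound by itself never opts out of the default.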

👍 , given that all the types we currently have determine their size based on the metadata, not the value, this seems prudent. The name of size_of_val can be considered unfortunate in that context, but the function does take a value, so it makes some sense.
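To make the "metadata, not the value" point concrete, here is a minimal sketch using stable APIs only. The helper names `slice_sizes` and `dyn_size` are assumptions for illustration.

```rust
use std::fmt::Debug;
use std::mem::{size_of, size_of_val};

/// Returns (size of the reference itself, size of the pointee) for a slice.
pub fn slice_sizes(slice: &[u32]) -> (usize, usize) {
    // `&[u32]` is a wide pointer: a data address plus a length (the metadata).
    let pointer_size = size_of::<&[u32]>();
    // The pointee size follows from the length metadata and the element
    // size; the elements themselves do not need to be inspected.
    let pointee_size = size_of_val(slice);
    (pointer_size, pointee_size)
}

/// For a trait object the metadata is a vtable pointer, and the size of
/// the concrete type is looked up through it.
pub fn dyn_size(value: &dyn Debug) -> usize {
    size_of_val(value)
}
```

In both cases `size_of_val` takes a reference to a value, but the answer comes from the pointer's metadata, which is the property the name `MetaSized` is meant to capture.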

I've added this as an unresolved question too.

?const ValueSized is a bit odd though, it has to be interpreted as ?(const ValueSized) to make sense with this model, even though it looks like (?const) ValueSized. (?const) ValueSized would be like adding a ValueSized bound and being in a world where that trait is by default const, and then we opt-out of that default constness. If we go with this syntax, we better never have "traits that are const by default", i.e. we better make it so that (?const) ValueSized is never a valid interpretation of this bound.

I've added explicit parentheses to make this clearer for now until a const Traits RFC clarifies this.

@davidtwco
Member Author

davidtwco commented Dec 4, 2024

Another summary comment!

Here are the previous update summaries copied below so you don't need to uncollapse GitHub comments to find it:

2024-11-25

Summary comment: #3729 (comment)

For those following along or catching up, these are the notable changes to the RFC since this was posted:

  • Clarify proposed behaviour for ?Trait syntax for non-Sized, which is currently accepted
  • Stop re-using std::ptr::Pointee and make Pointee its own new marker trait to avoid backwards incompatibility
  • Clarify backwards compatibility implications of ?Sized syntax and add alternatives to removing the default bound using positive bounds which continue to use ?Sized
  • Add that relaxing existing bounds in trait methods would be backwards incompatible
  • Elaborate on necessity of implicit const ValueSized bound on Self type of traits
  • Add MetaSized alternative to ValueSized which would resolve interactions with mutexes
  • Clarified that bounds on return types can never be relaxed.

And these are all the other smaller changes that don't materially impact what is being proposed:

  • Fixed some minor wording errors where supertrait/subtrait were used backwards
  • Removed HackMD's rust= syntax from codeblocks
  • Fixed wording that referred to introducing a const Sized trait; the proposal instead adds a const modifier to the existing Sized trait
  • Added some background/context on dynamic stack allocation
  • Use current experimental const trait syntax
  • Corrected incorrect syntax for traits
  • Listed all alternate bounds (adding ~const ValueSized to a list of bounds that it was missing from)
  • Fixed bound in description of size_of_val changes
  • Corrected description of current size_of_val and align_of_val behaviour
  • Corrected description of extern type usage in structs
  • Mention Rust for Linux's interest in extern type
  • Weaken language in the externref future possibility to make it clear this proposal would not be sufficient on its own to support these
  • Re-write Aligned future possibility so that it is clear Aligned couldn't be added to the proposed hierarchy

At the moment, I prefer the following alternatives to the primary proposal of the RFC, and may re-write to incorporate these as the primary proposal:


These are links to compare with previous versions of the document (switch to the "Files changed" tab and then the rich diff):

These are the major changes since the last summary:

  • Change from ValueSized to MetaSized
  • Add unresolved question about whether re-using std::ptr::Pointee is important.

These are the minor changes since the last summary:

  • Clarify that bounds on parameters used as return types can never be relaxed (return types still need to implement Sized)
  • Clarify that non-const Sized is not nameable in the current edition due to the backwards compatibility migration, only in the next edition
  • Clarify which Pointee (marker trait Pointee or std::ptr::Pointee) is referenced in the Custom DSTs future possibility
  • Improve clarity of the table describing alternative syntax proposals for relaxing bounds
  • Strengthen wording describing use of runtime-sized types in a const context as unsound
  • Expand on motivation for MetaSized
  • Specify that methods of the new traits would be backwards-incompatible too, not just associated types
  • Use the same indentation in code blocks throughout
  • Fix typo in list of when each trait gets implemented
  • Add a concrete example of code that would break if a supertrait of Clone were relaxed
  • Added note about potential reasons why delaying relaxation of size_of's bound might be desirable
  • Fixed typo, replacing "alignment" with "size"
  • Rewrote "Why have a Pointee trait?" section to clarify that the trait is only necessary due to the proposed removal of ?Sized syntax
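For readers wondering why relaxing a supertrait of Clone would break code, the sketch below shows one shape such code can take in today's Rust. It is not necessarily the example used in the RFC, and `clone_size` is a hypothetical name.

```rust
use std::mem::size_of;

/// Compiles today even though `T` opts out of the default `Sized` bound:
/// `Clone` has `Sized` as a supertrait, so `T: Clone` implies `T: Sized`,
/// which is what `size_of::<T>()` requires.
pub fn clone_size<T: Clone + ?Sized>() -> usize {
    size_of::<T>()
}
```

If the `Sized` supertrait of `Clone` were ever relaxed, `T: Sized` would no longer follow from `T: Clone`, and this function would stop compiling.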

These are the alternatives described in the RFC that I think are worth consideration as the primary proposal:

@davidtwco
Member Author

Since there seems to be consensus on it, I've adopted the alternative for replacing ValueSized with MetaSized and included this in the document. It just makes sense as it matches the current ?Sized semantics.

@safinaskar

safinaskar commented Dec 22, 2024

Edit: oops, I see that thin CStr is mentioned.

What about thin CStr? Please mention it in the RFC. It is somewhat unique because it has a size, but determining that size requires reading the actual data, and thus size_of_val_raw is always unsafe. See https://rust-lang.zulipchat.com/#narrow/channel/219381-t-libs/topic/CStr.20as.20thin.20pointer/near/405436807 , https://rust-lang.zulipchat.com/#narrow/channel/219381-t-libs/topic/CStr.20as.20thin.20pointer/near/405473727 , https://rust-lang.zulipchat.com/#narrow/channel/219381-t-libs/topic/CStr.20as.20thin.20pointer/near/405518769 .
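A rough sketch of why a thin C string differs from the metadata-sized types: with only an address available, the size can be found solely by reading memory up to the NUL terminator. `thin_c_str_size` and `wide_c_str_size` are hypothetical helpers written for this illustration, not proposed APIs.

```rust
use std::ffi::{c_char, CStr};

/// Size in bytes, including the NUL terminator, of a C string behind a
/// thin pointer.
///
/// # Safety
/// `ptr` must point to a readable, NUL-terminated sequence of bytes.
pub unsafe fn thin_c_str_size(ptr: *const c_char) -> usize {
    let mut len = 0;
    // No metadata travels with the pointer, so the pointee has to be read.
    while unsafe { *ptr.add(len) } != 0 {
        len += 1;
    }
    len + 1
}

/// Today's `&CStr` already knows its length, so this needs no `unsafe`.
pub fn wide_c_str_size(s: &CStr) -> usize {
    s.to_bytes_with_nul().len()
}
```

The first function can only be sound if the caller guarantees a valid terminator, which matches the observation that computing the size from a raw thin pointer is inherently unsafe.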

And I absolutely want to be able to use thin CStr as a last field of a struct, because this will finally solve long-standing dirent problem on Linux, which currently requires this horrible hack in standard library: https://github.com/rust-lang/rust/blob/c1132470a6986b12503e8000e322e9164c1f03ac/library/std/src/sys/pal/unix/fs.rs#L730-L768

This was referenced Jan 2, 2025
@workingjubilee workingjubilee added the A-dst Proposals re. DSTs label Jan 2, 2025
@Ericson2314
Contributor

Ericson2314 commented Jan 2, 2025

Wow, a funny (but good!) feeling to think that this might finally get resolved after a decade. :)

Instead of introducing a new marker trait, std::ptr::Pointee could be re-used if there were some mechanism to indicate that associated types or methods could only be referred to with fully-qualified syntax. Alternatively, it would be possible to introduce forward-compatibility lints in the current edition, introduce the new traits in the next edition, and perform the previously described edition migration in the edition after that.

I think std::ptr::Pointee should definitely be reused. I am not sure which alternative this is, but it seems to me like the right thing to do is:

  • Use std::ptr::Pointee
  • Do the hack to avoid new ambiguity in the current edition
  • Remove the hack in the next edition

Also, I agree with @RalfJung that the ? avoidance is quite confusing, and we are mixing up syntax and semantics. The bottom line is that semantically, the trait hierarchy is the free lattice over the trait partial order. Decoding the complex math speak: that means the minimum element (or maximum, whichever way you want the arrows to point) is not a trait but the empty set of traits. So we need to specify what that means.

After we do that, we can debate syntax.

In particular, the "Why have Pointee?" is a bit misleading in that semantically there currently is indeed no reason, it is just there for syntactic purposes.

However, if we reuse std::ptr::Pointee then there is a semantic reason, namely the associated type. ({} / no traits has no associated types or any other trait members, because it has no traits!) So here is an extra incentive to reuse std::ptr::Pointee --- one can side-step what @RalfJung and I are saying :).

At this point I still think it is good education to talk about the empty trait set as:

  • You can't use it behind a pointer
  • You can't use it as a value
  • You could use it in PhantomData, but probably this is left as future work
  • Maybe you can't quantify over it at all (as opposed to you can, but then you can't use the variable anywhere, and then it's an unused type parameter error)

Oh and a final thing on semantics, I might also present the back compat / editions story this way:

  • Old semantic hierarchy
  • New semantic hierarchy
  • mapping from old to new semantics, (old {} (no traits) is mapped not to new {} but to const MetaSized) --- in particular, do not speak of ?Sized when describing this semantic mapping.
  • mapping from new syntax in the next edition to new semantics.

In particular, the existing old syntax maps to the new semantics via the old semantics; we can cleanly factor that into those two steps, and I think this brings conceptual clarity.

@programmerjake
Member

  • Use std::ptr::Pointee
  • Do the hack to avoid new ambiguity in the current edition
  • Remove the hack in the next edition

imo having an associated type on practically every type will be annoying, so I think it might be useful to require writing <T as Pointee>::Metadata instead of T::Metadata on all editions.
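The ambiguity concern can be reproduced on stable Rust with a stand-in trait. In this sketch `PointeeLike`, `Packet`, `Ping`, and `read_metadata` are all hypothetical; `PointeeLike` only mimics the shape of a trait with a `Metadata` associated type that every type implements, and is not `std::ptr::Pointee`.

```rust
/// Stand-in for a `Pointee`-like trait that every type implements.
pub trait PointeeLike {
    type Metadata;
}

impl<T: ?Sized> PointeeLike for T {
    type Metadata = ();
}

/// A user trait that happens to use the same associated type name.
pub trait Packet {
    type Metadata;
    fn metadata(&self) -> Self::Metadata;
}

pub struct Ping;

impl Packet for Ping {
    type Metadata = u8;
    fn metadata(&self) -> u8 {
        7
    }
}

// With both bounds in scope, writing `T::Metadata` is rejected as ambiguous,
// so the fully-qualified form is needed to pick the intended trait.
pub fn read_metadata<T: Packet + PointeeLike>(packet: &T) -> <T as Packet>::Metadata {
    packet.metadata()
}
```

Requiring `<T as Pointee>::Metadata` everywhere, as suggested above, would mean user code like `Packet` never has to compete with the built-in trait for the short `T::Metadata` spelling.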

@workingjubilee
Member

the trait's unstable, if it's just a matter of bikeshedding we can rename the type PtrMetadata or sth

davidtwco referenced this pull request in oli-obk/rfcs Jan 8, 2025
@davidtwco
Member Author

I think std::ptr::Pointee should definitely be reused. I am not sure which alternative this is, but it seems to me like the right thing to do is:

  • Use std::ptr::Pointee
  • Do the hack to avoid new ambiguity in the current edition
  • Remove the hack in the next edition

I've added this as one of the unresolved questions.

In particular, the "Why have Pointee?" is a bit misleading in that semantically there currently is indeed no reason, it is just there for syntactic purposes.

I've rewritten this section and clarified that it's just a syntactic difference. It should have been updated earlier, when the alternatives which keep the ?Sized syntax were added.

At this point I still think it is good education to talk about the empty trait set as:

  • You can't use it behind a pointer
  • You can't use it as a value
  • You could use it in PhantomData, but probably this is left as future work
  • Maybe you can't quantify over it at all (as opposed to you can, but then you can't use the variable anywhere, and then it's an unused type parameter error)

Oh and a final thing on semantics, I might also present the back compat / editions story this way:

  • Old semantic hierarchy
  • New semantic hierarchy
  • mapping from old to new semantics, (old {} (no traits) is mapped not to new {} but to const MetaSized) --- in particular, do not speak of ?Sized when describing this semantic mapping.
  • mapping from new syntax in the next edition to new semantics.

In particular, the existing old syntax maps to the new semantics via the old semantics; we can cleanly factor that into those two steps, and I think this brings conceptual clarity.

I haven't made these changes at the moment - I've had good feedback about the explanations in the RFC and would prefer not to make major changes to that until the proposal itself needs changing.

@davidtwco
Member Author

I've updated the previous summary comment so it is still accurate.

Labels
A-dst Proposals re. DSTs I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. T-lang Relevant to the language team, which will review and decide on the RFC.