Conversation
This also reworks some of the other sections to further improve the clarity of the message.
| This proposal introduces a new unsafe trait `Place`: | ||
| ```rust | ||
| unsafe trait Place: DerefMut { | ||
| fn place(&mut self) -> *mut Self::Target |
There was a problem hiding this comment.
So, it's a little strange why this method is necessary when it appears that any implementation would just put a call to deref_mut here: the mutable reference would coerce to a pointer, and casting to a pointer obviously removes all reference to lifetimes and lets the compiler do whatever it wants with it.
There was a problem hiding this comment.
Best I understand the semantics of mutable references, it would be unsound to have one to a value that is uninitialized. And the semantics of moving in and out of the Place would involve calling this function in cases where *mut Self::Target would then have to point to something that is uninitialized.
There was a problem hiding this comment.
Hmm, but wouldn't that mean that you have to take *mut self, not &mut self? What you're saying makes sense, but it's unclear how the pointer could be initialized when this function is called. Effectively, the mutable borrow ends after the function returns, so, it's totally valid for something that was originally &mut Self::Target to become initialized as long as the original lifetime has ended and it's only used as a pointer at that point.
There was a problem hiding this comment.
Ok, I answered more completely in the other thread, see #3921 (comment). It basically comes down to that the argument to place is the Box-Like, which is (and should be, otherwise use of this trait is going to be a nightmare) still initialized, even though its Contents might not be. The non-initialized status of the Contents rules out the use of references for that, hence the extra function and the pointer.
| - Safe code shall not modify the initialization status of the contents. | ||
| - Unsafe code shall preserve the initialization status of the contents between two derefences of teh type's values. | ||
| - Values of the place type for which the content is uninitialized shall not be able to be created in safe code. |
There was a problem hiding this comment.
This is… a very confusing set of safety requirements considering how they're effectively already guaranteed by the intrinsic safety requirements of the language: you can't de-initialize the contents of something with a mutable reference, and unsafe code is expected to uphold this regardless of whether the reference is converted into a pointer or not.
There was a problem hiding this comment.
I agree that the formulation here is less than ideal, however there is something real that is asked here, that is unfortunately not guaranteed by the borrow checker alone. For example, if somebody defines the type InplaceBox as follows:
struct InplaceBox<T>(MaybeUninit<T>);
impl<T> Deref for InplaceBox<T> {
type Target = T;
fn deref(&self) -> &Self::Target {
unsafe { self.0.assume_init_ref() }
}
}
impl<T> DerefMut for InplaceBox<T> {
fn deref_mut(&mut self) -> &mut Self::Target {
unsafe { self.0.assume_init_mut() }
}
}
unsafe impl<T> Place for InplaceBox<T> {
type NewArg = ();
fn place(&mut self) -> *mut Self::Target {
self.0.as_mut_ptr()
}
}Then as part of the unsafe contract for place they promise to for example not to do stuff like
#[derive(Debug, Clone, Copy)]
enum NonZero { One = 1, Two = 2 }
pub fn foo() {
let ipb = InplaceBox(MaybeUninit::init(NonZero::One))
println!("{:?}", *ipb); // Still OK, ipb contains something
ipb.0 = MaybeUninit:zeroed(); // Should be forbidden, because borrow checker should still think following is ok.
println!("{:?}", *ipb); // UB happens here now, since even though the borrow checker thinks this is fine, it is not, because of the line above.
}Forbidding these sorts of shenanigans is what I am trying to capture with these requirements. If you have suggestions for how to better formulate that I'd love those.
There was a problem hiding this comment.
I guess that, as I also pointed out in the other comment thread, it's not really clear how this UB is possible based upon what the API is trying to achieve. Yeah, that example is obviously UB, but it's unclear how the Place implementation needs these guarantees in order to work.
Like, no matter what, safe code should not be allowed to create an invalid value, and if you're specifically moving a value out of a place, you kind of necessitate that the container be dropped in some way as part of this process, so, further references will not be possible. For Box, this happens via deallocating the pointer, but it also requires running the drop glue for the field beforehand, and it needs to know whether that should have to occur or not.
There was a problem hiding this comment.
Possibly the trait could remain unstable itself while having a pointer-field based compiler generated implementation path, similar to CoerceUnsize and its CoercePointee's relationship. Consider that we may not need to name the trait explicitly in many use cases, only the compiler must be aware of the trait implementation for variables (that it then has a borrow tree for). This would solve two other issues:
- The macro implemented in the compiler can initially verify a very narrow subset of types to qualify, for which the semantics we want are quite clear. The overlap seems quite large, too. For smart pointers with one pointer-like field from which to derives its behavior seems reasonably well-defined. The contentious method / MIR would be compiler generated, too, which means the meat of the unresolved question is punted to future relaxations of the macro or arbitrary user-defined derives.
- We could define that
SmartPointer<MaybeUninit<T>>is allowed to initializedSmartPointer<T>by filling the place (through compiler defined init-sequence support) and transmutation. This would be based on very similar layout requirements than placement but again the macro could annotate the type to guarantee them.
There was a problem hiding this comment.
We could define that SmartPointer<MaybeUninit> is allowed to initialized SmartPointer by filling the place (through compiler defined init-sequence support) and transmutation. This would be based on very similar layout requirements than placement but again the macro could annotate the type to guarantee them.
You can currently move out of a Box<T> and then back into it again without moving the Box<T> itself. Transmuting requires moving it.
fn main() {
let mut a = Box::new(Box::new(vec![0]));
drop(**a); // Deinitialize the inner box without moving it.
**a = vec![]; // Reinitialize the inner box without moving it.
}There was a problem hiding this comment.
I guess that, as I also pointed out in the other comment thread, it's not really clear how this UB is possible based upon what the API is trying to achieve. Yeah, that example is obviously UB, but it's unclear how the
Placeimplementation needs these guarantees in order to work.Like, no matter what, safe code should not be allowed to create an invalid value, and if you're specifically moving a value out of a place, you kind of necessitate that the container be dropped in some way as part of this process, so, further references will not be possible. For
Box, this happens via deallocating the pointer, but it also requires running the drop glue for the field beforehand, and it needs to know whether that should have to occur or not.
Ok, so where I think my explanation is not clear enough is that perhaps I'm not making a distinction that maybe we should make more explicit. On the one hand we have the implementer of the Place trait, lets call it the Box-Like. Then separately there is the thing the Box-Like can be derefferenced into, lets call that the Contents.
The crux around the undefined behavior is that the initialization status of the Box-Like and its Contents are to an extent independent. Primarily, the Box-Like should always be initialized, in the sense that we are allowed to have references to the Box-Like without having UB. However, the Contents can at various times be uninitialized, such as when the previous value has been moved out of them but the next hasn't yet been moved back in. Using the Deref traits on those Box-Likes is UB, because the resulting reference is a reference to an uninitialized value, which is UB. Really the only reason you are allowed to dereference them is to move some new value in.
This matches what is currently already the case for Box. It is completely valid to have references to a Box whose Contents are uninitialized, that yields no UB (otherwise the Drop implementation for Box would exhibit undefined behavior). However, then using the Deref traits those boxes in any way DOES yield UB, as that process again produces a reference to uninitialized memory.
What the compiler, via the borrow checker, guarantees for Box, and what I propose it will also guarantee for Box-Likes, is that all this complexity with the Contents potentially being uninitialized, can remain hidden from the end user, so long as the implementation of the Box or Box-Like is "reasonable". In other words, that end users can just trust the borrow checker to prevent them from trying to read from an uninitialized Box or Box-Like.
That means however that we need to somehow put into words what it means for the implementation of such a type to be "reasonable", which is what these requirements try to capture. They basically come down to "don't mess with the Contents too much, the compiler will manage that for you", but making concrete rules for that is hard and I'd love input on how to clarify them.
There was a problem hiding this comment.
Possibly the trait could remain unstable itself while having a pointer-field based compiler generated implementation path, similar to
CoerceUnsizeand itsCoercePointee's relationship. Consider that we may not need to name the trait explicitly in many use cases, only the compiler must be aware of the trait implementation for variables (that it then has a borrow tree for). This would solve two other issues:* The macro implemented in the compiler can initially verify a very narrow subset of types to qualify, for which the semantics we want are quite clear. The overlap seems quite large, too. For smart pointers with _one_ pointer-like field from which to derives its behavior seems reasonably well-defined. The contentious method / MIR would be compiler generated, too, which means the meat of the unresolved question is punted to future relaxations of the macro or arbitrary user-defined derives. * We could define that `SmartPointer<MaybeUninit<T>>` is allowed to initialized `SmartPointer<T>` by filling the place (through compiler defined init-sequence support) and transmutation. This would be based on very similar layout requirements than placement but again the macro could annotate the type to guarantee them.
Okay, I have no idea what you are trying to get at here, and how it would solve/circumvent the issue of unclear safety requirements. Could you perhaps make a bit more concrete what your suggestion would be for this particular case?
There was a problem hiding this comment.
Transmuting requires moving it.
A literal transmute at the point of initialization of the value would indeed require moving, hadn't actually thought of it that way. I meant reinterpreting the type in-situ—hence requiring layout compatibility–as a clear semantics of the pointer while the place it is pointing to is (partially-)not-initialized (e.g. for the purposes of drop-glue in partly initialized state we could actually move but probably still never want to to avoid unsound Pin interactions). Maybe we could have Ptr<MaybeUninit<T>> be a value for creating an initially uninit place through a transmute if that solves part of the init-expression interactions.
Could you perhaps make a bit more concrete what your suggestion would be for this particular case?
#[derive(Place)]
struct MyKindOfBox<T> {
alloc: NonNull<T>,
// other fields not mentioning T
}
// Compiler generated:
unsafe impl<T> Place for MyKindOfBox<T> {
// unstable details
unsafe fn deref_ptr(*mut Self) -> *mut T {
// Really probably purely synthesized MIR.
unsafe { PlaceOrBase::deref_ptr(addr_of!(*self.alloc)) }
}
}
// Private trait to define what fields can be used to derive
unsafe trait PlaceOrBase { … }
impl<T: ?Sized+Place> PlaceOrBase for T { … }
impl<T> PlaceOrBase for NonNull<T> { … }The semantics of Place would only need to be explained for the types supported by derive for now which seems much simpler since we can choose that restrictively. All the ambiguity of panicking MIR, the exact safety predicate for trait and deref, taking pointer or reference, can be figured out one by one and separately. Meanwhile we would still get to stably use UniqueArc-places etc. as users.
As for the InplaceBox construction above, it would depend on having a field recognized by the compiler to implement the trait from; MaybeUninit does not work for this purpose (as observed by example). I want to note that the example does not follow the usual pointer initialization (e.g. Box::assume_init) where the pointee has the MaybeUninit. ManuallyDrop is closer to the semantics that Box has for its place. The corresponding sequence of initialization is that we first construct a StackBox<MaybeUninit<T>> on top of an allocation (that allocation has type MaybeUninit<T>), then fill it with the value, and that turns it into StackBox<T>. The latter can only be overwritten with T so it can not be de-initialized as in the example.
It would solve the example in that we would never have to write a bogus &MaybeUninit<T> -> &T conversion guarantee. We have to discard the idea of initializing InplaceBox in a single line of code though. That seems a reasonable price to pay. As for how this could be treated more ergonomically refer to the paragraph above but macros would paper over ergonomics for a lot of cases I believe.
This introduces a
Placetrait, intended to make the special move behavior ofBoxpossible for other types.Important
When responding to RFCs, try to use inline review comments (it is possible to leave an inline review comment for the entire file at the top) instead of direct comments for normal comments and keep normal comments for procedural matters like starting FCPs.
This keeps the discussion more organized.
Rendered