Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support reallocating to a different alignment? #5

Closed
SimonSapin opened this issue May 3, 2019 · 14 comments · Fixed by rust-lang/rust#75687
Closed

Support reallocating to a different alignment? #5

SimonSapin opened this issue May 3, 2019 · 14 comments · Fixed by rust-lang/rust#75687

Comments

@SimonSapin
Copy link
Contributor

The GlobalAlloc trait is stable with this method:

unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8

(And similarly the std::alloc::realloc function.)

Note new_size: usize instead of new_layout: Layout. This is because this method does not support changing the alignment.


Do we want to lift this restriction in the Alloc trait?

The two parameters of realloc would be old_size: usize, new_layout: Layout. For allocators that do not support this, this would shift the responsibility to trait impls to have a fallback based on alloc + copy_nonoverlapping + dealloc.

@glandium
Copy link

glandium commented May 3, 2019

What are the use cases for changing the alignment? One I can think of is if you have an allocation for, say an array of u32s, that you got from somewhere, and you want to ensure they are aligned to e.g. 16 bytes to do SIMD stuff to them. But I think most algorithms dealing with things like this just deal with the unaligned parts separately, instead of reallocating.

@SimonSapin
Copy link
Contributor Author

IIRC this lack of known use case might have contributed to the choice of not supporting this in GlobalAlloc.

@TimDiekmann

This comment has been minimized.

@TimDiekmann TimDiekmann added A-Compatibility Issues regarding semver compatibility Discussion Issues containing a specific idea, which needs to be discussed in the WG A-Alloc Trait Issues regarding the Alloc trait and removed A-Compatibility Issues regarding semver compatibility labels May 3, 2019
@gnzlbg
Copy link

gnzlbg commented May 4, 2019

@glandium If you have an allocation for storing Ts, and you are done with it, but want to reuse it, e.g., for storing Us, then you might want to reallocate to fit your Us, which might require an allocation that's bigger or smaller, and has a higher or a lower alignment.

So a general purpose reallocation function might be useful.

Some allocators do support this, for example, jemalloc rallocx takes a pointer, the old size, the new size, and a flags parameter that allows specifying the alignment required for the resulting allocation: https://github.com/jemalloc/jemalloc/blob/dev/test/integration/rallocx.c#L150

Some allocators do not support this, e.g., those using a standard C API, but I don't think that using two Layouts complicates the implementation for these significantly. Right now, they can't just always call C's realloc(ptr, size) because that does not necessarily preserve the alignment (e.g. if ptr was allocated with posix_memalign to a higher alignment), so they already need to have quite a bit of logic to check the old layout alignment, and figure out what to do. Sure, adding more logic for comparing that with the new layouts alignment adds another branch in the hot path, but they are already branchy:

https://github.com/rust-lang/rust/blob/9ebf47851a357faa4cd97f4b1dc7835f6376e639/src/libstd/sys/unix/alloc.rs#L41

https://github.com/rust-lang/rust/blob/9ebf47851a357faa4cd97f4b1dc7835f6376e639/src/libstd/sys_common/alloc.rs#L24

So its basically just another check to decide whether to go to the fallback or not.

@TimDiekmann
Copy link
Member

TimDiekmann commented May 6, 2019

If we decide to lift the alignment restrictions, wouldn't it make sense to also lift the requirements for grow_in_place and shrink_in_place?

adding more logic for comparing that with the new layouts alignment adds another branch in the hot path

Branches in the hot path a never good IMO. It'd be possible to add methods for it though so we end up with these signatures (all fns are unsafe and return a reasonable Result<_, _>):

fn realloc(&mut self, ptr: NonNull<u8>, old_layout: Layout, new_layout: Layout);
fn realloc_aligned(&mut self, ptr: NonNull<u8>, layout: Layout, new_size: usize);
fn grow_in_place(&mut self, ptr: NonNull<u8>, old_layout: Layout, new_layout: Layout);
fn grow_in_place_aligned(&mut self, ptr: NonNull<u8>, layout: Layout, new_size: usize);
fn shrink_in_place(&mut self, ptr: NonNull<u8>, old_layout: Layout, new_layout: Layout);
fn shrink_in_place_aligned(&mut self, ptr: NonNull<u8>, layout: Layout, new_size: usize);

or

fn realloc_unaligned(&mut self, ptr: NonNull<u8>, old_layout: Layout, new_layout: Layout);
fn realloc(&mut self, ptr: NonNull<u8>, layout: Layout, new_size: usize);
fn grow_in_place_unaligned(&mut self, ptr: NonNull<u8>, old_layout: Layout, new_layout: Layout);
fn grow_in_place(&mut self, ptr: NonNull<u8>, layout: Layout, new_size: usize);
fn shrink_in_place_unaligned(&mut self, ptr: NonNull<u8>, old_layout: Layout, new_layout: Layout);
fn shrink_in_place(&mut self, ptr: NonNull<u8>, layout: Layout, new_size: usize);

The former has the better names, the latter is analogue to GlobalAlloc.

An example implementation for the former case:

#[inline]
fn is_aligned(ptr: NonNull<u8>, align: usize) -> bool {
    ptr.as_ptr() as usize % align == 0
}

#[inline]
unsafe fn naive_realloc<A: ?Sized + Alloc>(
    alloc: &mut A,
    ptr: NonNull<u8>,
    old_layout: Layout,
    new_layout: Layout,
) -> Result<NonNull<u8>, AllocErr> {
    let new_ptr = alloc.alloc(new_layout)?;
    ptr::copy_nonoverlapping(
        ptr.as_ptr(),
        new_ptr.as_ptr(),
        cmp::min(old_layout.size(), new_layout.size()),
    );
    alloc.dealloc(ptr, old_layout);
    Ok(new_ptr)
}

#[inline]
unsafe fn realloc_in_place_aligned<A: ?Sized + Alloc>(
    alloc: &mut A,
    ptr: NonNull<u8>,
    layout: Layout,
    new_size: usize,
) -> Result<(), CannotReallocInPlace> {
    let old_size = layout.size();

    if new_size == old_size {
        Ok(())
    } else if new_size > old_size {
        alloc.grow_in_place_aligned(ptr, layout, new_size)
    } else {
        alloc.shrink_in_place_aligned(ptr, layout, new_size)
    }
}

pub unsafe trait Alloc {
    // ...
    unsafe fn realloc(
        &mut self,
        ptr: NonNull<u8>,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<NonNull<u8>, AllocErr> {
        if is_aligned(ptr, new_layout.align())
            && realloc_in_place_aligned(self, ptr, old_layout, new_layout.size()).is_ok()
        {
            return Ok(ptr);
        }
        naive_realloc(self, ptr, old_layout, new_layout)
    }

    unsafe fn realloc_aligned(
        &mut self,
        ptr: NonNull<u8>,
        layout: Layout,
        new_size: usize,
    ) -> Result<NonNull<u8>, AllocErr> {
        if realloc_in_place_aligned(self, ptr, layout, new_size).is_ok() {
            return Ok(ptr);
        }

        naive_realloc(
            self,
            ptr,
            layout,
            Layout::from_size_align_unchecked(new_size, layout.align()),
        )
    }

    #[inline]
    unsafe fn grow_in_place(
        &mut self,
        ptr: NonNull<u8>,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<(), CannotReallocInPlace> {
        debug_assert!(new_layout.size() >= old_layout.size());
        if is_aligned(ptr, new_layout.align()) {
            self.grow_in_place_aligned(ptr, old_layout, new_layout.size())
        } else {
            Err(CannotReallocInPlace)
        }
    }

    unsafe fn grow_in_place_aligned(
        &mut self,
        ptr: NonNull<u8>,
        layout: Layout,
        new_size: usize,
    ) -> Result<(), CannotReallocInPlace> {
        // same as old `grow_in_place` 
    }

    #[inline]
    unsafe fn shrink_in_place(
        &mut self,
        ptr: NonNull<u8>,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<(), CannotReallocInPlace> {
        debug_assert!(new_layout.size() <= old_layout.size());
        if is_aligned(ptr, new_layout.align()) {
            self.shrink_in_place_aligned(ptr, old_layout, new_layout.size())
        } else {
            Err(CannotReallocInPlace)
        }
    }

    unsafe fn shrink_in_place_aligned(
        &mut self,
        ptr: NonNull<u8>,
        layout: Layout,
        new_size: usize,
    ) -> Result<(), CannotReallocInPlace> {
        // same as old `shrink_in_place` 
    }
}

@gnzlbg
Copy link

gnzlbg commented May 6, 2019

Branches in the hot path a never good IMO.

Looking at the code, I think it would still be one branch, but the branch condition would be slightly more complicated, e.g., it would contain an extra && old_layout.align() == new_layout.align().

I don't know what the perf cost of that would be, and without benchmarking that it might be hard to tell (both layouts could be inlined), but I can't imagine that the perf cost is worth the API cost of adding 3 trait methods to work around that, particularly without having more concrete use cases of code that actually needs to do this, and that we could benchmark to make an informed decision.

TimDiekmann added a commit to TimDiekmann/alloc-wg that referenced this issue Oct 25, 2019
@TimDiekmann TimDiekmann added Proposal and removed Discussion Issues containing a specific idea, which needs to be discussed in the WG labels Mar 9, 2020
@TimDiekmann
Copy link
Member

This proposal is around for quite a while now. Do you think the extra branch (or the more complicated check) is worth?

@Lokathor
Copy link

Realloc is ultimately a shorthand and you can live without it.

If it doesn't support an alignment change that's fine, and a particular allocator can choose to have a non-trait method to do that if it has such an ability.

@TimDiekmann
Copy link
Member

I have to come back to this again. While implementing an AffixAlloc, which reserves storage before and after the allocation, I must implement grow and shrink, otherwise the suffix is not copied to the new memory. Changing the alignment here is complicated as user, as you cannot simply alloc-copy-dealloc, but have to copy the suffix, store it in a temporary variable, grow/shrink the buffer, and if the grow/shrink succeeded, copy the content back to the new suffix.

With this in mind (and an affix-allocator is rather simple) we probably should allow changing the alignment.

@TimDiekmann TimDiekmann reopened this Apr 27, 2020
@Amanieu
Copy link
Member

Amanieu commented Apr 27, 2020

I don't understand. If you already have a suffix, doesn't that mean that the original block was already suitably aligned? Why would you need to change the alignment while growing/shrinking?

@TimDiekmann
Copy link
Member

Yes, the block is properly aligned after alloc. Actually, the prefix and the suffix are aligned to the same alignment as well. For growing and shrinking for most allocators you can just use the classic realloc fallback (alloc-copy-dealloc), but in this case, this is not possible because copying will also copy the suffix, but the suffix has to be copied to the end of the new block. This issue was closed because of the assumption, that grow/shrink is just a shorthand, but that's not true for all allocators (if there is one, there will be many). I reopened this, so we can at least take this point into account before eventually closing this again.

@TimDiekmann
Copy link
Member

TimDiekmann commented Aug 12, 2020

Since this issue was opened the AllocRef trait was changed quite a lot. I propose to support reallocating to a different alignment like described in the OP (adjusted to the current AllocRef trait).

The issue was closed once because of the assumption, that realloc is ultimatively a shorthand, which was proved to not be true in all cases (an affix-allocator requires to copy the suffix). Additionally, there are allocators around (jemalloc), which have a function which is able to change the alignment (rallocx), which might be more performant than alloc-copy-dealloc. Please also see #5 (comment).

There are two usecases listed in this issue:

  • Support reallocating to a different alignment? #5 (comment):

    One I can think of is if you have an allocation for, say an array of u32s, that you got from somewhere, and you want to ensure they are aligned to e.g. 16 bytes to do SIMD stuff to them. But I think most algorithms dealing with things like this just deal with the unaligned parts separately, instead of reallocating.

  • Support reallocating to a different alignment? #5 (comment):

    If you have an allocation for storing Ts, and you are done with it, but want to reuse it, e.g., for storing Us, then you might want to reallocate to fit your Us, which might require an allocation that's bigger or smaller, and has a higher or a lower alignment.

Especially the latter use case seems reasonable enough to support reallocating to a different alignment, as reallocation may be faster than alloc-copy-dealloc.

With this proposal and #69 combined, we wouldn't have an issue, that two layouts have to be passed to grow or shrink, which might would be confusing.

Please react to this post with 👍 or 👎 . Please note, that 👎 without any comment isn't very useful at all 😉

cc @Amanieu @glandium @gnzlbg@Lokathor @scottjmaddox @SimonSapin @Wodann

@Wodann
Copy link

Wodann commented Aug 12, 2020

In most cases, I expect the implementation to be something like:

unsafe fn grow(
  &mut self,
  memory: NonNull<[u8]>,
  old_align: usize,
  new_layout: Layout
) -> Result<NonNull<[u8]>, AllocErr> {
  if align == layout.align() {
    MyAlloc::realloc(memory, new_layout);
  } else {
    MyAlloc::realloc_align(memory, old_align, new_layout);
  }
}

As new_layout and old_align are probably going to be compile-time constants most of the time, I don't expect any performance impact.

If that assumption is correct, I agree that this is a more generic API that can cover the range of use cases better.

@TimDiekmann
Copy link
Member

TimDiekmann commented Aug 12, 2020

As new_layout and old_align are probably going to be compile-time constants most of the time, I don't expect any performance impact.

There will be a very small compile-time regression, but runtime-wise it should be exactly the same (it get's optimized out, similar to rust-lang/rust#75264, the static case show the optimization very well)

unsafe fn grow(&mut self, memory: NonNull<[u8]>, old_align: usize, new_layout: Layout)

This is excactly the signature I like to come up with 👍

Edit: Regarding the compile-time regression: We already had similar compile-time regressions with #44, PR was rust-lang/rust#70362

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants