Is no-copy message write possible with no alloc ? #535

Open
kaidokert opened this issue Dec 22, 2024 · 3 comments

Comments

@kaidokert

Hey, here's the basic test code I have to create a message in a slice:

pub fn create_serialized_message<const N: usize>(buffer: &mut [u8]) {
    let mut tmp_buffer = [0; N];
    let mut allocator = capnp::message::SingleSegmentAllocator::new(tmp_buffer.as_mut_slice());
    let mut message = capnp::message::Builder::new(&mut allocator);
    let foo_builder = message.init_root::<foo::foo::Builder>();
    let mut numbers = foo_builder.init_numbers(3);
    numbers.set(0, 1);
    numbers.set(1, 2);
    capnp::serialize::write_message(buffer, &message).expect("yay?");
}

The question is: is it possible, even in theory, to write a message without going through a copy in an intermediate buffer? I couldn't figure it out.

From reading the code, it doesn't currently seem possible, because write_segment_table needs to write 8 bytes to the head of the buffer after the message has been constructed. And write_segment_table is not public, so I can't manually partition the slice for writing the single-segment header and the message.
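For reference, here is a minimal sketch of what those 8 bytes contain for a single-segment message, based on the Cap'n Proto stream framing (a little-endian u32 holding the segment count minus one, followed by a little-endian u32 holding the segment size in words). The helper name single_segment_table is hypothetical and is not part of the capnp crate.

```rust
/// Size in bytes of one Cap'n Proto word.
const BYTES_PER_WORD: usize = 8;

/// Hypothetical helper (not a capnp API): builds the stream-framing segment
/// table for a message that has exactly one segment.
///
/// Layout, all little-endian:
///   bytes 0..4: number of segments minus one (0 for a single segment)
///   bytes 4..8: size of that segment, in 8-byte words
pub fn single_segment_table(segment_words: u32) -> [u8; BYTES_PER_WORD] {
    let mut table = [0u8; BYTES_PER_WORD];
    // Bytes 0..4 stay zero because (1 segment) - 1 == 0.
    table[4..8].copy_from_slice(&segment_words.to_le_bytes());
    table
}
```

Because the second field is the final segment size, the table can only be filled in once the message is complete, which is why it gets written after construction.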

@dwrensha
Member

dwrensha commented Dec 22, 2024

Hm...
It would almost work to use message::Builder::into_allocator() to tell the message::Builder to release ownership of the SingleSegmentAllocator:

/// Retrieves the underlying `Allocator`, deallocating all currently-allocated
/// segments.
pub fn into_allocator(self) -> A {
    self.arena.into_allocator()
}

However, that method calls deallocate_all():
/// Retrieves the underlying `Allocator`, deallocating all currently-allocated
/// segments.
pub fn into_allocator(mut self) -> A {
    self.inner.deallocate_all();
    self.inner.allocator.take().unwrap()
}

... which causes the buffer to be zeroed:
unsafe fn deallocate_segment(&mut self, ptr: *mut u8, _word_size: u32, words_used: u32) {
    let seg_ptr = self.segment.as_mut_ptr();
    if ptr == seg_ptr {
        // Rezero the slice to allow reuse of the allocator. We only need to write
        // words that we know might contain nonzero values.
        unsafe {
            core::ptr::write_bytes(
                seg_ptr, // miri isn't happy if we use ptr instead
                0u8,
                (words_used as usize) * BYTES_PER_WORD,
            );
        }
        self.segment_allocated = false;
    }
}

Note that dropping the message::Builder also calls deallocate_all():

impl<A> Drop for BuilderArenaImplInner<A>
where
    A: Allocator,
{
    fn drop(&mut self) {
        self.deallocate_all()
    }
}

So perhaps we should add a version of SingleSegmentAllocator that does not do this zeroing. (The trade-off is that such an allocator would not work for re-using a buffer.)

The segment table for these messages will always have only one word. So your process could look like this:

  1. Obtain a buffer of N words.
  2. Reserve the first word for the segment table, and pass the rest into a new SingleSegmentAllocator.
  3. Construct a message::Builder using that allocator.
  4. When you are done constructing the message, use message::Builder::size_in_words() to see how big it is.
  5. Call into_allocator() to release ownership of the SingleSegmentAllocator. (Or just drop the message::Builder, to release the borrow.)
  6. Write the segment size into the header word.
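The buffer handling in these steps can be sketched with the standard library alone. In the sketch below, frame_in_place is a hypothetical helper, and its build closure stands in for steps 2 through 5 (it would wrap a non-zeroing allocator and a message::Builder in real code) and returns the number of words used. Alignment requirements of the allocator are not addressed here.

```rust
const BYTES_PER_WORD: usize = 8;

/// Hypothetical helper (not a capnp API): reserves the first word of `buffer`
/// for the segment table, lets `build` fill the remainder in place, then
/// patches the header. Returns the total length in bytes of the framed message.
///
/// `build` is a stand-in for constructing the message; it receives the
/// segment bytes and must return how many 8-byte words it used.
pub fn frame_in_place<F>(buffer: &mut [u8], build: F) -> usize
where
    F: FnOnce(&mut [u8]) -> u32,
{
    assert!(buffer.len() >= BYTES_PER_WORD, "no room for the segment table");

    // Steps 1-2: split off the header word; the rest becomes the segment.
    let (header, body) = buffer.split_at_mut(BYTES_PER_WORD);
    // Only hand out whole words.
    let usable = body.len() - body.len() % BYTES_PER_WORD;

    // Steps 3-5: build the message directly in the segment.
    let words_used = build(&mut body[..usable]);
    assert!(words_used as usize * BYTES_PER_WORD <= usable);

    // Step 6: segment count minus one (0), then the segment size in words.
    header[..4].copy_from_slice(&0u32.to_le_bytes());
    header[4..].copy_from_slice(&words_used.to_le_bytes());

    BYTES_PER_WORD + words_used as usize * BYTES_PER_WORD
}
```

The returned length tells the caller which prefix of the buffer to send, so no second buffer or copy is involved.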

You should be able to make this work by writing your own version of SingleSegmentAllocator, but longer-term it would be a nice addition to capnproto-rust to have this functionality in the library.

@kaidokert
Author

Thanks so much for the quick response! I hadn't even noticed the deallocate part yet.

And yep, with these pointers it looks like I should certainly be able to hack something together here.

@dwrensha
Member

dwrensha commented Dec 23, 2024

The deallocate_segment() method might even be a good place to put the logic that writes the length to the 1-word segment table.

You'll also want to make sure that allocate_segment() panics if called after deallocate_segment(), unlike with the existing SingleSegmentAllocator, which allows re-use in a new message. The issue is that if a message's segment is not all zeroes upon initialization, then out-of-bounds accesses are possible.
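A rough sketch of such an allocator follows. To keep it self-contained, it declares a local Allocator trait that is assumed to mirror the shape of capnp::message::Allocator; check the crate for the real definition before adapting this. OneShotAllocator is a hypothetical name, and the sketch assumes the caller passes in a zeroed buffer.

```rust
const BYTES_PER_WORD: usize = 8;

/// Local stand-in assumed to mirror `capnp::message::Allocator`
/// (an assumption for this sketch, not the crate's own definition).
pub unsafe trait Allocator {
    fn allocate_segment(&mut self, minimum_size: u32) -> (*mut u8, u32);
    unsafe fn deallocate_segment(&mut self, ptr: *mut u8, word_size: u32, words_used: u32);
}

/// Hypothetical one-shot allocator: word 0 of the buffer is reserved for the
/// segment table, the remaining words form the single segment.
pub struct OneShotAllocator<'a> {
    header: &'a mut [u8],
    segment: &'a mut [u8],
    used: bool,
}

impl<'a> OneShotAllocator<'a> {
    /// `buffer` is assumed to be all zeroes and at least two words long.
    pub fn new(buffer: &'a mut [u8]) -> Self {
        assert!(buffer.len() >= 2 * BYTES_PER_WORD);
        let (header, segment) = buffer.split_at_mut(BYTES_PER_WORD);
        Self { header, segment, used: false }
    }
}

unsafe impl<'a> Allocator for OneShotAllocator<'a> {
    fn allocate_segment(&mut self, minimum_size: u32) -> (*mut u8, u32) {
        let available = (self.segment.len() / BYTES_PER_WORD) as u32;
        // Refuse reuse: after the first message the segment is no longer
        // guaranteed to be zeroed, which a new message would rely on.
        assert!(!self.used, "OneShotAllocator cannot be reused");
        assert!(minimum_size <= available, "message does not fit in one segment");
        self.used = true;
        (self.segment.as_mut_ptr(), available)
    }

    unsafe fn deallocate_segment(&mut self, ptr: *mut u8, _word_size: u32, words_used: u32) {
        if ptr == self.segment.as_mut_ptr() {
            // Do not zero the segment. Instead, write the one-word segment
            // table: segment count minus one (0), then the size in words.
            self.header[..4].copy_from_slice(&0u32.to_le_bytes());
            self.header[4..].copy_from_slice(&words_used.to_le_bytes());
        }
    }
}
```

Once the allocator has been released, the first 8 + 8 * words_used bytes of the original buffer hold the framed message.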

You might also consider not bothering with segment tables, if all the messages you will be dealing with are single-segment and you have some other means of knowing their lengths.


2 participants