Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add callback support to FileDescription::read #4110

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

shamb0
Copy link
Contributor

@shamb0 shamb0 commented Dec 26, 2024

Key Changes

  • Introduces IoTransferOperation to manage atomic file transfers
  • Adds callback support to FileDescription::read

Context

This work completes stages 2 of the async file operations roadmap discussed here:

  1. ✅ Generalize callback system (PR Concurrency: Generalize UnblockCallback to MachineCallback #4106)
  2. ✅ Add callback support to FileDescription::read (this PR)
  3. 🔲 Implement readv() using callbacks (this PR)

@shamb0 shamb0 force-pushed the support-shim-fs-atomic-read branch from abfa97f to 88cbd55 Compare December 26, 2024 10:07
@shamb0
Copy link
Contributor Author

shamb0 commented Dec 26, 2024

Hi @RalfJung,

I wanted to share some updates and seek your guidance on a few points:

  1. Callback Integration
    I’ve implemented the callback using the Mutex synchronization object but would appreciate your insights on whether this is the most suitable concurrency synchronization approach for this case.

  2. Clarification on read Behavior
    Regarding your statement:
    "The problem with read is that when it returns, the read may not actually have completed yet."
    I believe this behavior might be specific to the Miri interpreter context. Could you elaborate on this to help me better understand its implications?

  3. Exploration of std::fs Library

    While reviewing the std::fs library on the Unix platform, I found the following relevant implementations:

    • FileDesc: Defines core operations like read (e.g., view code).

      #[derive(Debug)]
      pub struct FileDesc(OwnedFd);
      
      impl FileDesc {
          pub fn read(&self, buf: &mut [u8]) -> io::Result<usize> {
              let ret = cvt(unsafe {
                  libc::read(
                      self.as_raw_fd(),
                      buf.as_mut_ptr() as *mut libc::c_void,
                      cmp::min(buf.len(), READ_LIMIT),
                  )
              })?;
              Ok(ret as usize)
          }
      }
    • File (inner struct): Wraps FileDesc (view code).

      pub struct File(FileDesc);
    • File (outer struct): Provides higher-level abstractions (view code).

      use crate::sys::fs as fs_imp;
      
      #[stable(feature = "rust1", since = "1.0.0")]
      #[cfg_attr(not(test), rustc_diagnostic_item = "File")]
      pub struct File {
          inner: fs_imp::File,
      }
  4. Understanding Read Operation
    Based on my analysis, it appears that the read operation blocks and fills the destination buffer before returning:

    let read_result = (file_handle.file).read(op.borrow_mut().buffer_mut());

    Could you please confirm if this understanding is correct? If not, I’d be grateful if you could point me toward relevant documentation or files for clarification.

Thanks in advance for your help and guidance! Looking forward to your insights.

@shamb0 shamb0 marked this pull request as ready for review December 26, 2024 10:50
@shamb0
Copy link
Contributor Author

shamb0 commented Dec 26, 2024

@rustbot ready

@rustbot rustbot added the S-waiting-on-review Status: Waiting for a review to complete label Dec 26, 2024
@RalfJung
Copy link
Member

RalfJung commented Dec 26, 2024

So this is blocked on #4106, I assume.

✅ Add callback support to FileDescription::read (this PR)
✅ Implement readv() using callbacks (this PR)

I would prefer if you could split this into two separate PRs. Refactoring an existing interface should not be done in the same changeset as adding new functionality, if it can be avoided. Splitting changes in smaller PRs greatly helps with making them easier to review.

@RalfJung RalfJung added the S-blocked Status: blocked on something happening somewhere else label Dec 26, 2024
@shamb0 shamb0 force-pushed the support-shim-fs-atomic-read branch from 88cbd55 to fb2d16a Compare December 26, 2024 15:41
@shamb0
Copy link
Contributor Author

shamb0 commented Dec 26, 2024

Hi @RalfJung,

Thank you for your feedback and for highlighting the review challenges—I completely understand.

This PR focuses on two key changes:

  1. ✅ Generalizing the callback system (PR #4106)
  2. ✅ Adding callback support to FileDescription::read (this PR).

Let me know if there's anything else you'd like me to adjust.

Thanks! 🙂

@shamb0
Copy link
Contributor Author

shamb0 commented Dec 26, 2024

@rustbot ready

@shamb0 shamb0 force-pushed the support-shim-fs-atomic-read branch from fb2d16a to 59f70e8 Compare December 27, 2024 11:32
@rustbot

This comment has been minimized.

Comment on lines 414 to 450
/// Represents an atomic I/O operation that handles data transfer between memory regions.
/// Supports contiguous memory layouts for efficient I/O operations.
#[derive(Clone)]
pub struct IoTransferOperation<'tcx> {
Copy link
Member

@RalfJung RalfJung Jan 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the point of all the new things you added here.

What I would expect is simply a new argument to read that has a type like DynMachineCallback<Result<u64, IoError>>. This can replace the existing dest: &MPlaceTy<'tcx>.

Copy link
Contributor Author

@shamb0 shamb0 Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the point of all the new things you added here.

What I would expect is simply a new argument to read that has a type like DynMachineCallback<Result<u64, IoError>>. This can replace the existing dest: &MPlaceTy<'tcx>.

I need your guidance on the following:

  • I have not modified the signature of the FileDescription::read() function. Instead, I introduced a new function, FileDescription::read_with_callback(), and integrated DynMachineCallback<Result<u64, IoError>> as a replacement for dest: &MPlaceTy<'tcx>. Below is the function definition for reference:

    fn read_with_callback<'tcx>(
        self: FileDescriptionRef<Self>,
        _communicate_allowed: bool,
        _ptr: Pointer,
        _len: usize,
        _completion_callback: DynFileIOCallback<'tcx>,
        _ecx: &mut MiriInterpCx<'tcx>,
    ) -> InterpResult<'tcx> { }
  • Modifying the signature of FileDescription::read() will affect the following modules. I need your input on how to adapt the signature changes for these modules:

    FileDescription for io::Stdin  
    FileDescription for io::Stdout  
    FileDescription for io::Stderr  
    FileDescription for NullOutput  
    FileDescription for AnonSocket  
    FileDescription for Epoll  
    FileDescription for EventFd  

Let me know your thoughts and suggestions on how to proceed.

@rustbot ready

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modifying the signature of FileDescription::read() will affect the following modules.

Correct, that's why this is a non-trivial change. But we don't want to have two read functions for everything, so this work is necessary.

I have sketched the new signature above: replace dest: &MPlaceTy<'tcx> by finish: DynMachineCallback<Result<u64, IoError>>. The existing implementations should be adjusted to call the callback instead of writing to dest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I would like your input on the top-level integration changes and testing strategy.

  • The integration of the updated fn read() signature has been completed for the following modules. Recent updates are now included as part of this patch:

        impl FileDescription for io::Stdin
        impl FileDescription for FileHandle
        impl FileDescription for AnonSocket
        impl FileDescription for EventFd
  • I ran the following command to test the changes, and the results align with the upstream master test results:

$ MIRIFLAGS=-Zmiri-disable-isolation MIRI_BACKTRACE=1 ./miri test

FAILURES:
    tests/pass/alloc-access-tracking.rs
    tests/pass/getpid.rs (revision with_isolation)
    tests/pass/shims/time-with-isolation.rs
    tests/pass/shims/env/var.rs
    tests/pass/panic/thread_panic.rs
    tests/pass/panic/concurrent-panic.rs
    tests/pass/panic/nested_panic_caught.rs
    tests/pass/panic/catch_panic.rs

test result: FAIL. 8 failed; 350 passed; 6 ignored

Error:
   0: ui tests in tests/pass for x86_64-unknown-linux-gnu failed
   1: tests failed
  • Integration for FileHandle is complete, but I am currently blocked on integrating the updated fn read() signature for the other modules (io::Stdin, AnonSocket, and EventFd).

  • Do we need to make any updates to fn emulate_foreign_item_inner() in src/shims/unix/foreign_items.rs to accommodate the changes in the fn read() signature?

@RalfJung RalfJung added S-waiting-on-author Status: Waiting for the PR author to address review comments and removed S-waiting-on-review Status: Waiting for a review to complete S-blocked Status: blocked on something happening somewhere else labels Jan 2, 2025
@RalfJung
Copy link
Member

RalfJung commented Jan 2, 2025

Now that #4106 landed, please rebase this over the latest miri HEAD.

@shamb0 shamb0 force-pushed the support-shim-fs-atomic-read branch from 59f70e8 to a54505c Compare January 8, 2025 05:38
@@ -2,7 +2,7 @@ use std::collections::VecDeque;

use rustc_index::Idx;

use super::thread::DynUnblockCallback;
use super::thread::DyMachineCallback;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have changes in the first commit which are undone by the second commit -- please clea that up by squashing the two commits.

@shamb0 shamb0 force-pushed the support-shim-fs-atomic-read branch 2 times, most recently from d98f7a2 to dd7bea4 Compare January 9, 2025 03:15
@rustbot rustbot added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels Jan 9, 2025
   - Implementing atomic reads for contiguous buffers
   - Supports read operations with callback-based completion.

Signed-off-by: shamb0 <[email protected]>
@shamb0 shamb0 force-pushed the support-shim-fs-atomic-read branch from dd7bea4 to 2226854 Compare January 11, 2025 13:27
Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have again dome some large-scale changes, this time adding a Mutex, without asking any questions. There shouldn't be any large-scale changes needed, other than the direct consequences of changing the signature, just changing a few lines here and there. If you find yourself doing anything else, stop and reconsider. Maybe ask questions on Zulip. It's a waste of effort to implement a bunch of stuff that shouldn't exist in the first place.

I ran the following command to test the changes, and the results align with the upstream master test results:

The command to test Miri is ./miri test; do not set MIRIFLAGS. All tests should pass.

Do we need to make any updates to fn emulate_foreign_item_inner() in src/shims/unix/foreign_items.rs to accommodate the changes in the fn read() signature?

I don't know, you will notice that as you push the changes through the repo. But I expect no, the changes likely end here.

I would like your input on the top-level integration changes and testing strategy.

I don't have the time to prepare detailed mentoring instructions for you here, unfortunately. I described the intended design of the change and gave some feedback on your concrete implementation; if that is not sufficient for you to implement the PR (modulo issues with the Miri/rustc APIs) I suggest you pick something easier. Miri is a challenging codebase to contribute to. :)

@@ -121,6 +121,10 @@ impl<T: FileDescription + 'static> FileDescriptionExt for T {

pub type DynFileDescriptionRef = FileDescriptionRef<dyn FileDescription>;

/// Represents a dynamic callback for file I/O operations that is invoked upon completion.
/// The callback receives either the number of bytes successfully read (u64) or an IoError.
pub type DynFileDescriptionCallback<'tcx> = DynMachineCallback<'tcx, Result<u64, IoError>>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DynFileOpFinishCallback would be a better name IMO.

Comment on lines +228 to +236
// Write the successfully read bytes to the destination pointer
ecx.write_bytes_ptr(ptr, bytes[..actual_read_size].iter().copied())?;

let Ok(read_size) = u64::try_from(actual_read_size) else {
throw_unsup_format!(
"Read operation returned size {} which exceeds maximum allowed value",
actual_read_size
)
};
Copy link
Member

@RalfJung RalfJung Jan 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep using return_read_success; that function should now take the callback instead of dest.

The PR shouldn't change any of the logic nor add new logic; this kind of reshuffling of code just makes it unnecessarily hard to review. For instance, why did you move the let mut bytes around, or add a new comment in code that you didn't change?

@@ -203,7 +203,7 @@ pub trait EvalContextExt<'tcx>: crate::MiriInterpCxExt<'tcx> {
interp_ok(Scalar::from_i32(this.try_unwrap_io_result(result)?))
}

/// Read data from `fd` into buffer specified by `buf` and `count`.
/// Reads data from a file descriptor using callback-based completion.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change this comment?

Comment on lines +259 to +260
Err(_err_code) => {
this.set_last_error_and_return(LibcError("EIO"), &result_destination)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why EIO? And why are you ignoring err_code here?

// We want to read at most `count` bytes. We are sure that `count` is not negative
// because it was a target's `usize`. Also we are sure that its smaller than
// `usize::MAX` because it is bounded by the host's `isize`.

// Clone the result destination for use in the completion callback
let result_destination = dest.clone();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please just call this dest as we usually do.

@@ -23,6 +26,92 @@ use crate::*;
struct FileHandle {
file: File,
writable: bool,
/// Mutex for synchronizing file access across threads.
file_lock: MutexRef,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, what is this? No mutex should be added, please undo all that.

Comment on lines -298 to -300
} else {
// Synchronize with all prior `write` calls to this FD.
ecx.acquire_clock(&eventfd.clock.borrow());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are more unnecessary changes. Why did you remove the else and use an early return instead? Again, your PR should only change code that has to change for the callback to work. There are also a bunch of new comments and empty lines here for some reason.

But other than that, the changes in this file look exactly right!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Waiting for a review to complete
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants