Rust Raw Pointers

A raw pointer in Rust (such as *mut T or *const T) is a low-level pointer type similar to those found in C. Unlike references in Rust, raw pointers:

  • Do not have the same safety guarantees.
  • Do not carry any lifetime or borrowing information.
  • Can be null or dangling.
  • Must be dereferenced within an unsafe block because the compiler can’t check that they’re valid.
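
As a minimal sketch of the points above, the following shows that creating a raw pointer (even a null one) needs no unsafe block, while reading through it does. The helper name read_or_default is hypothetical and exists only for illustration; a null check alone does not prove a pointer is valid, so the function is still marked unsafe.

```rust
use std::ptr;

/// Hypothetical helper: reads through `p` unless it is null.
///
/// # Safety
/// A non-null `p` must point to a live, aligned, initialized `i32`.
unsafe fn read_or_default(p: *const i32, default: i32) -> i32 {
    if p.is_null() {
        default
    } else {
        // SAFETY: the caller guarantees that non-null pointers are valid for reads.
        unsafe { *p }
    }
}

fn main() {
    let x = 7;
    let valid: *const i32 = &x; // creating a raw pointer is safe
    let null: *const i32 = ptr::null(); // raw pointers are allowed to be null

    // Only the reads need `unsafe`.
    println!("{}", unsafe { read_or_default(valid, -1) });
    println!("{}", unsafe { read_or_default(null, -1) });
}
```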

Representation of a Raw Pointer to T

The Raw Pointer Itself ptr

Consider a raw pointer defined as:

let ptr: *mut T = /* some pointer value */;
  • Memory Address Representation: The raw pointer ptr is essentially just a memory address. For a sized type T, ptr is stored as a single machine word (e.g., 64 bits on a 64-bit architecture) representing the address in memory where a T is (or is supposed to be) located.
    • Example: If ptr holds the value 0x7ffdf000, it references the memory at that address.
  • Fat Pointers (for DSTs): If T is an unsized type (for example, a slice [T] or a trait object), the pointer is a fat pointer. This means it contains extra metadata in addition to the memory address. For a slice, the fat pointer includes both:
    • A pointer to the data.
    • The length of the slice.

However, for most simple cases with sized types, we only deal with a single memory address.
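
One way to observe the thin versus fat distinction is to compare pointer sizes. This is a rough sketch: the helper pointer_sizes is a hypothetical name, and the "two machine words" size of a fat pointer reflects how current compilers lay these pointers out rather than a documented layout promise.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Hypothetical helper: sizes in bytes of a thin pointer, a slice pointer,
/// and a trait-object pointer.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<*const u32>(),
        size_of::<*const [u32]>(),
        size_of::<*const dyn Debug>(),
    )
}

fn main() {
    let (thin, slice, trait_object) = pointer_sizes();
    println!("thin: {thin}, slice: {slice}, trait object: {trait_object}");

    // The length metadata travels with the slice pointer itself.
    let data = [10u32, 20, 30];
    let fat: *const [u32] = &data[..];
    // SAFETY: `fat` was just created from a live slice.
    let len = unsafe { (&*fat).len() };
    println!("length recovered through the fat pointer: {len}");
}
```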

Dereferencing the Raw Pointer *ptr

When we dereference the raw pointer with the * operator, like so:

let value: T = unsafe { *ptr };

  • What Happens: Dereferencing accesses the value stored at the memory address contained in ptr. The expression *ptr yields a value of type T.
  • Value Representation:
    • The value represented by *ptr is the sequence of bytes stored at that address, formatted according to the memory layout of T.
    • For a simple type like u32, *ptr might represent 4 bytes that encode an integer (e.g., 0x01 0x00 0x00 0x00 for the number 1 on a little-endian machine).
    • For a struct, *ptr would be the contiguous bytes for each field (along with any padding that the compiler introduces according to the struct’s layout).
  • Example: If we have:
let x: u32 = 42;
let ptr: *const u32 = &x as *const u32;

Here, ptr is the memory address where x is stored, and dereferencing it returns the value 42 (which might be represented in binary as 4 bytes, e.g., 0x2A 0x00 0x00 0x00 on a little-endian system).
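
To make the byte-level view concrete, here is a small sketch that reads the value through the pointer and also returns its little-endian encoding. The function read_with_bytes is a hypothetical helper; to_le_bytes is used so the byte order is explicit regardless of the host machine.

```rust
/// Hypothetical helper: reads a `u32` through a raw pointer and returns it
/// together with its little-endian byte encoding.
///
/// # Safety
/// `ptr` must be non-null, aligned, and point to an initialized `u32`.
unsafe fn read_with_bytes(ptr: *const u32) -> (u32, [u8; 4]) {
    // SAFETY: guaranteed by the caller.
    let value = unsafe { *ptr };
    (value, value.to_le_bytes())
}

fn main() {
    let x: u32 = 42;
    let ptr: *const u32 = &x as *const u32;

    // SAFETY: `ptr` comes from a reference to `x`, which is still alive.
    let (value, bytes) = unsafe { read_with_bytes(ptr) };
    println!("address = {ptr:p}, value = {value}, bytes = {bytes:02X?}");
}
```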

Summary of Representations

  • ptr (the raw pointer itself):
    • Type: *mut T or *const T.
    • Representation:
      • For sized types, a single machine word holding the memory address.
      • For unsized types (fat pointers), a tuple of the memory address and additional metadata (like length for slices or a vtable pointer for trait objects).
  • *ptr (dereferencing the pointer):
    • Type: T.
    • Representation:
      • The actual bytes stored at the memory location pointed to by ptr, formatted according to T’s memory layout.
      • This is the concrete data that lives at that address (for example, an integer, a struct, etc.).

Why Dereferencing Is Unsafe

Consider the DDoS example using Aya from my article. In the following snippet, we see a more complex instance of creating and dereferencing raw pointers:

let ip_hdr: *mut Ipv4Hdr = get_mut_ptr_at(&ctx, EthHdr::LEN)?; // Safe: pointer creation

match unsafe { (*ip_hdr).proto } { // Unsafe: dereferencing
    IpProto::Udp => {}
    _ => return Ok(xdp_action::XDP_PASS),
}
  • Creating a Raw Pointer: Creating a raw pointer (e.g., *mut Ipv4Hdr) is safe because it only involves storing a memory address. Rust has nothing to enforce at this stage since no memory is actually accessed.
  • Dereferencing a Raw Pointer: Dereferencing a raw pointer (e.g., *ip_hdr) is unsafe because it accesses the memory location the pointer refers to. The compiler cannot verify any of the following, so we must uphold them ourselves:
    • The pointer is non-null.
    • The memory is valid (i.e., it has not been freed or reallocated).
    • The pointer is properly aligned.
    • There are no aliasing violations (such as concurrent mutable access).
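
Because the compiler cannot check these conditions, a common pattern is to validate what we can before dereferencing. The sketch below is self-contained and is not Aya's API: ptr_at, proto_at, and TinyHdr are hypothetical names, and a plain byte slice stands in for the packet buffer.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical two-byte header; every byte pattern is a valid value.
#[repr(C)]
#[derive(Clone, Copy)]
struct TinyHdr {
    version: u8,
    proto: u8,
}

/// Hypothetical helper: a pointer to a `T` at `offset` inside `buf`, or `None`
/// if the read would run out of bounds or be misaligned.
fn ptr_at<T>(buf: &[u8], offset: usize) -> Option<*const T> {
    let end = offset.checked_add(size_of::<T>())?;
    if end > buf.len() {
        return None; // would read past the end of the buffer
    }
    let ptr = buf[offset..].as_ptr() as *const T;
    if (ptr as usize) % align_of::<T>() != 0 {
        return None; // misaligned for T
    }
    Some(ptr) // creating the pointer is safe; no memory has been read yet
}

fn proto_at(buf: &[u8], offset: usize) -> Option<u8> {
    let hdr: *const TinyHdr = ptr_at(buf, offset)?;
    // SAFETY: `ptr_at` checked bounds and alignment, `buf` stays borrowed for
    // the whole call, and any byte pattern is a valid `TinyHdr`.
    Some(unsafe { (*hdr).proto })
}

fn main() {
    let packet = [4u8, 17, 0];
    println!("{:?}", proto_at(&packet, 0));
    println!("{:?}", proto_at(&packet, 2)); // header would not fit
}
```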

Coercing a Reference to a Raw Pointer

At runtime, a raw pointer and a reference pointing to the same piece of data have an identical representation. In fact, an &T reference will implicitly coerce to a *const T raw pointer in safe code, and similarly for the mut variants (both coercions can be performed explicitly with, respectively, value as *const T and value as *mut T).

Ref https://web.mit.edu/rust-lang_v1.25/arch/amd64_ubuntu1404/share/doc/rust/html/book/first-edition/raw-pointers.html

Coercion from a reference to a raw pointer is straightforward and safe. The conversion itself does not require an unsafe block because references are already guaranteed to be valid (non-null and aligned).

From an Immutable Reference (&T):

Implicit Coercion: When we have a reference, we can directly assign it to a raw pointer of type *const T.

let value = 42;
let ref_value: &i32 = &value;
let raw_value: *const i32 = ref_value; // Implicitly coerced to a raw pointer

From a Mutable Reference (&mut T):

To a Mutable Raw Pointer: Similarly, a mutable reference can be coerced into a *mut T.

let mut mutable_value = 100;
let ref_mut_value: &mut i32 = &mut mutable_value;
let raw_mut_value: *mut i32 = ref_mut_value; // Implicit coercion

To an Immutable Raw Pointer: A mutable reference can also be coerced into a *const T (by first being viewed as an immutable reference).

let raw_const_value: *const i32 = ref_mut_value; // Coerced as &mut i32 -> &i32 -> *const i32

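Important Note: We cannot implicitly coerce an immutable reference (&T) into a mutable raw pointer (*mut T) because that would break Rust's guarantees about mutability and aliasing. An explicit cast chain (as *const T as *mut T) does compile, but writing through the resulting pointer is undefined behavior unless the data is actually allowed to be mutated.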
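
The coercion rules in this section can be sketched in one place. The function names same_address and bump_through_raw are hypothetical; the first only compares addresses, and the second writes through a pointer derived from a mutable reference, which is the case where writing is allowed.

```rust
/// Hypothetical helper: checks that a reference and the raw pointers derived
/// from it all hold the same address.
fn same_address(r: &i32) -> bool {
    let implicit: *const i32 = r; // implicit coercion
    let explicit = r as *const i32; // explicit cast
    // `&T` -> `*mut T` is not an implicit coercion; it needs a cast chain.
    // We never write through `forced`, we only compare addresses.
    let forced = r as *const i32 as *mut i32;
    implicit == explicit && explicit == forced as *const i32 && std::ptr::eq(r, implicit)
}

/// Hypothetical helper: increments the target through a `*mut i32`.
fn bump_through_raw(v: &mut i32) -> i32 {
    let p: *mut i32 = v; // implicit coercion `&mut T` -> `*mut T`
    // SAFETY: `p` comes from a live, exclusive reference, and `v` is not used
    // while the write happens.
    unsafe { *p += 1 };
    *v
}

fn main() {
    let value = 42;
    println!("same address: {}", same_address(&value));

    let mut counter = 100;
    println!("after bump: {}", bump_through_raw(&mut counter));
}
```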

Moving Out of a Value Behind a Raw Pointer

Consider the following code:

struct NotCopy(String);
let x = NotCopy(String::from("Hello"));
let ptr: *const NotCopy = &x;
// Error
let y: NotCopy = unsafe { *ptr }; // cannot move out of `*ptr` which is behind a raw pointer

Understanding the Error

  1. Non-Copy Type:
    • The struct NotCopy does not implement the Copy trait because it contains a String, and String is not Copy.
    • In Rust, moving a value (as opposed to copying it) transfers ownership, and after a move the original value cannot be used.
  2. Dereferencing a Raw Pointer:
    • When we write *ptr, we are dereferencing the raw pointer. For types that are not Copy, this operation attempts to move the value out of the memory location.
    • Even though the operation is in an unsafe block (which means we’re telling the compiler “I know what I’m doing”), Rust still enforces its move semantics at compile time.
  3. Moving from Behind a Pointer:
    • The compiler error is essentially saying: "we cannot move the value out of *ptr because that value is not Copy and it lives in a place that the pointer does not own."
    • Rust never allows moving out of a reference, and the same rule applies to raw pointers even inside an unsafe block, because the move would leave the original location in an invalid state while it's still accessible by its original owner (x in our example).

What’s Really Happening

  • Ownership & Move Semantics:
    • The value x owns its data. When we attempt to evaluate unsafe { *ptr }, we try to transfer that ownership (i.e., move the value) away from x via a raw pointer. Rust prevents this because the compiler would still treat x as valid after the move, leading to two owners of the same String, which violates Rust's safety guarantees.
  • Raw Pointers and Safety:
    • Raw pointers (*const T or *mut T) allow us to bypass some of Rust's safety checks, but they do not bypass ownership and move rules. The compiler ensures that dereferencing a pointer to a non-Copy type does not inadvertently move the value without proper precautions.
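
When ownership does not actually need to change hands, one option is to borrow through the pointer and clone instead of moving. The sketch below assumes we are free to add derive(Clone) to NotCopy, which the original struct does not have; clone_from_raw is a hypothetical helper name.

```rust
// Assumption: `Clone` (plus `Debug` and `PartialEq` for printing and comparing)
// is derived here; the struct in the text above derives nothing.
#[derive(Clone, Debug, PartialEq)]
struct NotCopy(String);

/// Hypothetical helper: produces an independent copy of the pointee.
///
/// # Safety
/// `ptr` must be non-null, aligned, and point to a live, initialized `NotCopy`.
unsafe fn clone_from_raw(ptr: *const NotCopy) -> NotCopy {
    // Reborrow as a shared reference; nothing is moved out of `*ptr`.
    let borrowed: &NotCopy = unsafe { &*ptr };
    borrowed.clone()
}

fn main() {
    let x = NotCopy(String::from("Hello"));
    let ptr: *const NotCopy = &x;

    // SAFETY: `ptr` was created from a reference to `x`, which is still alive.
    let y = unsafe { clone_from_raw(ptr) };

    // Both values remain usable because each owns its own String.
    println!("{x:?} {y:?}");
}
```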

Use std::ptr::read to Move the Value

If we truly want to move the value out of the pointer (transferring ownership from x to y), we can use std::ptr::read:

use std::ptr;

let x = NotCopy(String::from("Hello"));
let ptr: *const NotCopy = &x;

// Move the value out of the pointer.
let y: NotCopy = unsafe { ptr::read(ptr) };

// Prevent `x` from being dropped, since its value has been moved.
std::mem::forget(x);

Important Notes:

  • ptr::read makes a bitwise copy of the value and leaves the source memory untouched, so after the call both x and y refer to the same heap allocation.
  • Continuing to use x as if it still owned the String, or letting it be dropped, can lead to undefined behavior such as a double free.
  • Calling std::mem::forget(x) ensures that x's destructor is not run, preventing that double drop.
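
An alternative to calling std::mem::forget afterwards is to wrap the source in std::mem::ManuallyDrop before reading, so there is no window in which both owners could be dropped. This is a sketch of that variant, not the approach used above; move_out_via_raw is a hypothetical helper name.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

struct NotCopy(String);

/// Hypothetical helper: moves a value out through a raw pointer.
fn move_out_via_raw(x: NotCopy) -> NotCopy {
    // Wrapping `x` first means its destructor will never run automatically,
    // so no later `forget` call is needed.
    let x = ManuallyDrop::new(x);
    let ptr: *const NotCopy = &*x;

    // SAFETY: `ptr` points to a live, initialized `NotCopy`, and the original
    // is never used or dropped after this read.
    unsafe { ptr::read(ptr) }
}

fn main() {
    let y = move_out_via_raw(NotCopy(String::from("Hello")));
    println!("{}", y.0);
}
```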

So the error occurs because we are trying to move (transfer ownership of) a non-Copy value (NotCopy) out from behind a raw pointer. Moving a value out from behind a pointer (or reference) is disallowed because it would leave the original owner (x) in an invalid state.


© 2023 KungFuDev made with love / cd 💜
