Introduction

This book explores how different Rust constructs translate to ARM64/AArch64 assembly.

⚠🚧 The book is still under construction. New chapters will be added and the existing ones might be modified.

Since compilation involves multiple intermediate steps, we will trace through HIR, MIR and LLVM IR when it helps explain the final assembly output. We will only discuss the Rust frontend, not the LLVM backend. Still, for completeness, good documentation for the LLVM backend can be found here.

The Rust compiler overview can be found here.

Prerequisites

Rust compiler

$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

AArch64 musl libraries

$ rustup target install aarch64-unknown-linux-musl

Rust nightly toolchain

This is necessary because we will be using some nightly features.

$ rustup install nightly
$ rustup default nightly

Alternatively, just override the default toolchain in your working directory:

$ rustup override set nightly

Rust default config

$ export CARGO_BUILD_TARGET=aarch64-unknown-linux-musl
$ export CARGO_TARGET_AARCH64_UNKNOWN_LINUX_MUSL_LINKER=aarch64-linux-gnu-gcc
$ export CARGO_TARGET_AARCH64_UNKNOWN_LINUX_MUSL_RUNNER="qemu-aarch64"

Alternatively, add the following to .cargo/config.toml:

[build]
target = "aarch64-unknown-linux-musl"

[target.aarch64-unknown-linux-musl]
linker = "aarch64-linux-gnu-gcc"
runner = "qemu-aarch64"

AArch64 cross-compiler + binutils + sysroot

$ sudo dnf install gcc-aarch64-linux-gnu
$ sudo dnf install binutils-aarch64-linux-gnu
$ sudo dnf install sysroot-aarch64-fc41-glibc

LLVM

Provides the LLVM binary utilities we will use, e.g. llvm-objdump.

$ sudo dnf install llvm

rustfilt (optional)

If you need to manually demangle a symbol, rustfilt is very convenient:

$ cargo install rustfilt
$ echo _ZN8rust_lab4main17hf9a0ba7e2c977e69E | rustfilt
rust_lab::main

QEMU

$ sudo dnf install qemu-user

Ghidra

$ wget https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_11.3.2_build/ghidra_11.3.2_PUBLIC_20250415.zip

Test the cross-compilation setup

With a Cargo project:

$ cargo init
$ cargo build
$ cargo run --quiet
Hello, world!

Without a Cargo project:

$ echo 'fn main() { println!("Hello ARM64!"); }' > test.rs
$ rustc --target aarch64-unknown-linux-musl -C linker=aarch64-linux-gnu-gcc test.rs
$ qemu-aarch64 ./test
Hello ARM64!

Hello, world!

Source

Initialize a new workspace with cargo init.

fn main() {
    println!("Hello, world!");
}

println! is a macro that wraps the _print function.

$ cargo rustc --release --quiet -- -Z unpretty=expanded
#![feature(prelude_import)]
#[prelude_import]
use std::prelude::rust_2024::*;
#[macro_use]
extern crate std;
fn main() { { ::std::io::_print(format_args!("Hello, world!\n")); }; }

Alternatively, you can see all macro expansions (including built-in ones) in the HIR:

$ cargo rustc --release --quiet -- -Z unpretty=hir
#[prelude_import]
use std::prelude::rust_2024::*;
#[macro_use]
extern crate std;
fn main() {
    { ::std::io::_print(format_arguments::new_const(&["Hello, world!\n"])); };
}

More information about declarative macros can be found in chapter Declarative macros and chapter Built-in declarative macros.
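To make the expansion more tangible, here is a minimal sketch that hands the same format_args! invocation to a writer by hand. The helpers render and literal are illustrative names, not part of std. Arguments::as_str only returns the literal when nothing has to be formatted at runtime, which is the case for our string.

```rust
use std::fmt::{self, Write};

/// Hypothetical helper: formats `args` into a String, roughly what `_print`
/// does with stdout as the destination.
fn render(args: fmt::Arguments<'_>) -> String {
    let mut out = String::new();
    out.write_fmt(args).expect("writing to a String cannot fail");
    out
}

/// Hypothetical helper: returns the literal if there are no runtime arguments.
fn literal(args: fmt::Arguments<'_>) -> Option<&'static str> {
    args.as_str()
}

fn main() {
    // The same Arguments value that println! passes to _print.
    print!("{}", render(format_args!("Hello, world!\n")));
    assert_eq!(
        literal(format_args!("Hello, world!\n")),
        Some("Hello, world!\n")
    );
}
```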

Build

$ cargo rustc --release

Ghidra

Load the binary into Ghidra and auto-analyze it.

rust_lab::main

As we saw above, println! is expanded to a _print call, which accepts an Arguments struct.

While reconstructing the Arguments type, the type size information is very useful. Note that the compiler might reorder the struct fields.

$ cargo rustc --release --quiet -- -Z print-type-sizes 
print-type-size type: `core::fmt::rt::Placeholder`: 48 bytes, alignment: 8 bytes
print-type-size     field `.precision`: 16 bytes
print-type-size     field `.width`: 16 bytes
print-type-size     field `.position`: 8 bytes
print-type-size     field `.flags`: 4 bytes
print-type-size     end padding: 4 bytes
print-type-size type: `std::fmt::Arguments<'_>`: 48 bytes, alignment: 8 bytes
print-type-size     field `.pieces`: 16 bytes
print-type-size     field `.args`: 16 bytes
print-type-size     field `.fmt`: 16 bytes
...
print-type-size type: `core::fmt::rt::Argument<'_>`: 16 bytes, alignment: 8 bytes
print-type-size     field `.ty`: 16 bytes
print-type-size type: `core::fmt::rt::ArgumentType<'_>`: 16 bytes, alignment: 8 bytes
print-type-size     variant `Placeholder`: 16 bytes
print-type-size         field `.value`: 8 bytes
print-type-size         field `.formatter`: 8 bytes
print-type-size         field `._lifetime`: 0 bytes
print-type-size     variant `Count`: 10 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.0`: 2 bytes, alignment: 2 bytes
print-type-size type: `core::fmt::rt::Count`: 16 bytes, alignment: 8 bytes
print-type-size     discriminant: 2 bytes
print-type-size     variant `Param`: 14 bytes
print-type-size         padding: 6 bytes
print-type-size         field `.0`: 8 bytes, alignment: 8 bytes
print-type-size     variant `Is`: 2 bytes
print-type-size         field `.0`: 2 bytes
print-type-size     variant `Implied`: 0 bytes
print-type-size type: `std::option::Option<&[core::fmt::rt::Placeholder]>`: 16 bytes, alignment: 8 bytes
print-type-size     variant `Some`: 16 bytes
print-type-size         field `.0`: 16 bytes
print-type-size     variant `None`: 0 bytes
...

The simplified Arguments type can be represented like this (explained in detail later). This is not valid C syntax of course, as &, [] or <> cannot be used in C struct names.

struct &[&str] {
    ptr64 ptr;
    usize len;
};

struct &[Argument] {
    ptr64 ptr;
    usize len;
};

struct Option<&[Placeholder]> {
    ptr64 ptr;
    usize len;
};

struct Arguments {
    struct &[&str] pieces;
    struct &[Argument] args;
    struct Option<&[Placeholder]> fmt;
};

Listing:

                             **************************************************************
                             * rust_lab::main                                             *
                             **************************************************************
                             undefined __rustcall main()
             undefined         <UNASSIGNED>   <RETURN>
             undefined8        Stack[-0x10]:8 local_10                                XREF[2]:     00401af4(W), 
                                                                                                   00401b1c(R)  
             Arguments         Stack[-0x40]   arguments                               XREF[1,2]:   00401b04(W), 
                                                                                                   00401b14(W), 
                                                                                                   00401b10(W)  
                             _ZN8rust_lab4main17hf9a0ba7e2c977e69E           XREF[3]:     main:00401b38(*), 00453c6c, 
                             rust_lab::main                                               004648a8(*)  
        00401af0 ff 03 01 d1     sub        sp,sp,#0x40
        00401af4 fe 1b 00 f9     str        x30,[sp, #local_10]
        00401af8 68 03 00 90     adrp       x8,0x46d000
        00401afc 08 61 1c 91     add        x8,x8,#0x718
        00401b00 29 00 80 52     mov        w9,#0x1
                             store pieces.ptr and pieces.len
        00401b04 e8 27 00 a9     stp        x8=>PTR_s_Hello,_world!_0046d718,x9,[sp]=>argu   = 0044c1a0
        00401b08 08 01 80 52     mov        w8,#0x8
                             move the struct address to the first argument
        00401b0c e0 03 00 91     mov        x0,sp
                             zero out args.len and fmt.ptr
        00401b10 ff ff 01 a9     stp        xzr,xzr,[sp, #arguments+0x18]
                             store args.ptr
        00401b14 e8 0b 00 f9     str        x8,[sp, #arguments.args.ptr]
        00401b18 da 63 00 94     bl         std::io::stdio::_print                           undefined _print()
        00401b1c fe 1b 40 f9     ldr        x30,[sp, #local_10]
        00401b20 ff 03 01 91     add        sp,sp,#0x40
        00401b24 c0 03 5f d6     ret

The logic is simple: it constructs an Arguments struct on the stack and passes its address to the _print function in x0 (copied from sp).
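The stack offsets used in the listing can be cross-checked with a small model. The sketch below assumes the layout observed in this particular binary (three consecutive pointer/length pairs). Slice and ArgumentsModel are hypothetical stand-ins, not the real core::fmt types, whose layout is unspecified.

```rust
use std::mem::{offset_of, size_of};

/// Hypothetical C-like model of a fat pointer (data pointer + length).
#[allow(dead_code)]
#[repr(C)]
struct Slice {
    ptr: u64,
    len: u64,
}

/// Hypothetical model of the Arguments layout seen in this binary.
#[allow(dead_code)]
#[repr(C)]
struct ArgumentsModel {
    pieces: Slice,
    args: Slice,
    fmt: Slice,
}

/// Offset of args.ptr: the target of `str x8,[sp, #0x10]`.
fn args_ptr_offset() -> usize {
    offset_of!(ArgumentsModel, args) + offset_of!(Slice, ptr)
}

/// Offset of args.len: the first half of `stp xzr,xzr,[sp, #0x18]`.
fn args_len_offset() -> usize {
    offset_of!(ArgumentsModel, args) + offset_of!(Slice, len)
}

/// Offset of fmt.ptr: the second half of the same stp.
fn fmt_ptr_offset() -> usize {
    offset_of!(ArgumentsModel, fmt) + offset_of!(Slice, ptr)
}

fn main() {
    println!("size     = {:#x}", size_of::<ArgumentsModel>());
    println!("args.ptr = {:#x}", args_ptr_offset());
    println!("args.len = {:#x}", args_len_offset());
    println!("fmt.ptr  = {:#x}", fmt_ptr_offset());
}
```

With this model the struct occupies the first 0x30 bytes of the 0x40 byte frame, which leaves room for the saved x30 at sp+0x30.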

Decompiled code (after creating the Arguments type in the Structure Editor and applying it in the code):

/* WARNING: Unknown calling convention: __rustcall */
/* rust_lab::main */

void __rustcall rust_lab::main(void)

{
  Arguments arguments;
  
                    /* store pieces.ptr and pieces.len */
  arguments.pieces.ptr = (ptr64)&PTR_s_Hello,_world!_0046d718;
  arguments.pieces.len = 1;
                    /* move the struct address to the first argument */
                    /* zero out args.len and fmt.ptr */
  arguments.args.len = 0;
  arguments.fmt.ptr = (ptr64)0x0;
                    /* store args.ptr */
  arguments.args.ptr = (ptr64)0x8;
  std::io::stdio::_print(&arguments);
  return;
}

From the Rust reference:

Though you should not rely on this, all pointers to DSTs are currently twice the size of the size of usize and have the same alignment.

In practice, this means that the fields of the struct Arguments are 16 bytes in memory: an 8 byte pointer and an 8 byte length. This is confirmed by the output of -Z print-type-sizes above.
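The sizes can also be checked from Rust itself. This is a small sketch (ref_words is an illustrative helper) and, as the reference warns, the result describes current behavior rather than a guarantee.

```rust
use std::mem::{align_of, size_of};

/// Size of a reference to `T`, measured in machine words.
fn ref_words<T: ?Sized>() -> usize {
    size_of::<&T>() / size_of::<usize>()
}

fn main() {
    // A thin pointer is one word, a pointer to a DST is currently two.
    println!("&u8   : {} word(s)", ref_words::<u8>());
    println!("&[u8] : {} word(s)", ref_words::<[u8]>());
    println!("&str  : {} word(s)", ref_words::<str>());
    // Fat pointers currently share the alignment of usize.
    assert_eq!(align_of::<&[u8]>(), align_of::<usize>());
}
```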

pieces is a reference to a slice of str references (&str). In this case, pieces references only 1 &str which is also an 8 byte pointer and an 8 byte length.

                             PTR_s_Hello,_world!_0046d718                    XREF[1]:     main:00401b04(*)  
        0046d718 a0 c1 44        addr       s_Hello,_world!_0044c1a0                         = "Hello, world!\n"
                 00 00 00 
                 00 00
        0046d720 0e              ??         0Eh
        0046d721 00              ??         00h
        0046d722 00              ??         00h
        0046d723 00              ??         00h
        0046d724 00              ??         00h
        0046d725 00              ??         00h
        0046d726 00              ??         00h
        0046d727 00              ??         00h
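The 16 bytes in the dump above can be decoded by hand as two little-endian 64-bit values. The sketch below does exactly that. decode_str_ref and DUMP are illustrative names, and the bytes are copied from the dump.

```rust
/// Splits the 16 raw bytes of a `&str` into (data pointer, length),
/// interpreting both halves as little-endian 64-bit integers.
fn decode_str_ref(raw: &[u8; 16]) -> (u64, u64) {
    let mut ptr = [0u8; 8];
    let mut len = [0u8; 8];
    ptr.copy_from_slice(&raw[..8]);
    len.copy_from_slice(&raw[8..]);
    (u64::from_le_bytes(ptr), u64::from_le_bytes(len))
}

/// Bytes at 0x46d718..0x46d728, copied from the Ghidra dump.
const DUMP: [u8; 16] = [
    0xa0, 0xc1, 0x44, 0x00, 0x00, 0x00, 0x00, 0x00, // pointer
    0x0e, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // length
];

fn main() {
    let (ptr, len) = decode_str_ref(&DUMP);
    println!("ptr = {ptr:#x}, len = {len}");
    // The length matches the string literal including the trailing newline.
    assert_eq!(len as usize, "Hello, world!\n".len());
}
```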

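args is a reference to a slice of Argument items, and here it references an empty slice. An empty slice has a length of 0, but its pointer is not null: it is a dangling, non-null address that satisfies the alignment of the element type. In practice the address equals that alignment (8 bytes here), which is why args.ptr is set to 8 in the listing.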

print-type-size type: `core::fmt::rt::Argument<'_>`: 16 bytes, alignment: 8 bytes
print-type-size     field `.ty`: 16 bytes

fn main() {
    let empty_u8: &[u8] = &[];      // 1-byte aligned
    let empty_u32: &[u32] = &[];    // 4-byte aligned
    let empty_u64: &[u64] = &[];    // 8-byte aligned

    println!("u8 address: {}", empty_u8.as_ptr() as usize);
    println!("u32 address: {}", empty_u32.as_ptr() as usize);
    println!("u64 address: {}", empty_u64.as_ptr() as usize);
}

$ cargo run --release --quiet
u8 address: 1
u32 address: 4
u64 address: 8

fmt is an optional reference to a slice of Placeholder items. For Option<&[T]>, Rust in practice uses the null pointer optimization, where None is represented by a null data pointer. The length field is therefore irrelevant for None and is not populated in the current example.

print-type-size type: `std::option::Option<&[core::fmt::rt::Placeholder]>`: 16 bytes, alignment: 8 bytes
print-type-size     variant `Some`: 16 bytes
print-type-size         field `.0`: 16 bytes
print-type-size     variant `None`: 0 bytes
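The effect of this optimization is visible in the type sizes alone. A minimal sketch follows; option_is_free is an illustrative helper, and the result reflects current compiler behavior.

```rust
use std::mem::size_of;

/// Returns true if wrapping `T` in an Option does not increase its size,
/// i.e. the compiler found a niche (such as the null pointer) for None.
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // References are never null, so None can reuse the null pointer.
    println!("Option<&[u64]> is free: {}", option_is_free::<&[u64]>());
    // A plain integer has no spare bit pattern, so a discriminant is added.
    println!("Option<u64> is free:    {}", option_is_free::<u64>());
}
```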

rust-gdb

We can verify the results of our static analysis using rust-gdb (or rust-lldb) which supports Rust types.

First we need to create a debug build, where the function new_const that constructs the Arguments struct is neither optimized away nor inlined.

$ cargo rustc

Then we start a GDB server and connect to it with the rust-gdb client. We will examine the Arguments struct returned by new_const.

$ qemu-aarch64 -g 1234 target/aarch64-unknown-linux-musl/debug/rust-lab
$ rust-gdb -q -ex "target remote localhost:1234" target/aarch64-unknown-linux-musl/debug/rust-lab
Reading symbols from target/aarch64-unknown-linux-musl/debug/rust-lab...
Remote debugging using localhost:1234

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
0x00000000004019bc in _start ()
(gdb) b rust_lab::main
Breakpoint 1 at 0x401bd4: file src/main.rs, line 2.
(gdb) c
Continuing.

Breakpoint 1, rust_lab::main () at src/main.rs:2
2	    println!("Hello, world!");
(gdb) disas
Dump of assembler code for function _ZN8rust_lab4main17hb3ccde9ab543d852E:
   0x0000000000401bc0 <+0>:	sub	sp, sp, #0x50
   0x0000000000401bc4 <+4>:	stp	x29, x30, [sp, #64]
   0x0000000000401bc8 <+8>:	add	x29, sp, #0x40
   0x0000000000401bcc <+12>:	add	x8, sp, #0x10
   0x0000000000401bd0 <+16>:	str	x8, [sp, #8]
=> 0x0000000000401bd4 <+20>:	adrp	x0, 0x46d000
   0x0000000000401bd8 <+24>:	add	x0, x0, #0x710
   0x0000000000401bdc <+28>:	bl	0x401b54 <_ZN4core3fmt2rt38_$LT$impl$u20$core..fmt..Arguments$GT$9new_const17h2005e5bc47942c4fE>
   0x0000000000401be0 <+32>:	ldr	x0, [sp, #8]
   0x0000000000401be4 <+36>:	bl	0x41abf0 <_ZN3std2io5stdio6_print17h5a3b0843896b0124E>
   0x0000000000401be8 <+40>:	ldp	x29, x30, [sp, #64]
   0x0000000000401bec <+44>:	add	sp, sp, #0x50
   0x0000000000401bf0 <+48>:	ret
End of assembler dump.
(gdb) si 3
core::fmt::Arguments::new_const<1> (pieces=0x7f867d3fe830)
    at /home/gemesa/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/rt.rs:226
226	    pub const fn new_const<const N: usize>(pieces: &'a [&'static str; N]) -> Self {
(gdb) fin
Run till exit from #0  core::fmt::Arguments::new_const<1> (pieces=0x7f867d3fe830)
    at /home/gemesa/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/rt.rs:226
0x0000000000401be0 in rust_lab::main () at src/main.rs:2
2	    println!("Hello, world!");
Value returned is $1 = core::fmt::Arguments {pieces: &[&str](size=1) = {"Hello, world!\n"}, fmt: core::option::Option<&[core::fmt::rt::Placeholder]>::None, args: &[core::fmt::rt::Argument](size=0)}
(gdb)

The value returned matches the expected Arguments struct.

Box

Source

Initialize a new workspace with cargo init --lib.

#[unsafe(no_mangle)]
pub fn create_boxed_value(n: i32) -> Box<i32> {
    Box::new(n * 2)
}

#[unsafe(no_mangle)]
pub fn process_box(boxed: Box<i32>) -> i32 {
    *boxed + 10
}

The Box type is described in the official docs in detail. Further information can be found here. We are using no_mangle to keep the symbol names readable; the reason can be found here.

Build

$ cargo rustc --release -- --emit obj

llvm-objdump

For this example we will use llvm-objdump instead of Ghidra, as Ghidra does not support the relocation type R_AARCH64_ADR_GOT_PAGE.

Alternatively, the compiler can generate assembly output as well, although it is a bit less readable because of all the .cfi_* directives:

$ cargo rustc --release -- --emit asm
$ cat target/aarch64-unknown-linux-musl/release/deps/rust_lab-15e16dbcf37b825b.s
...
create_boxed_value:
	.cfi_startproc
	stp	x29, x30, [sp, #-32]!
	.cfi_def_cfa_offset 32
	str	x19, [sp, #16]
	mov	x29, sp
	.cfi_def_cfa w29, 32
	.cfi_offset w19, -16
	.cfi_offset w30, -24
	.cfi_offset w29, -32
	.cfi_remember_state
	adrp	x8, :got:__rust_no_alloc_shim_is_unstable
	mov	w19, w0
	mov	w0, #4
	ldr	x8, [x8, :got_lo12:__rust_no_alloc_shim_is_unstable]
	mov	w1, #4
	ldrb	wzr, [x8]
	bl	_RNvCshIQntqZdYTC_7___rustc12___rust_alloc
	cbz	x0, .LBB0_2
	lsl	w8, w19, #1
	str	w8, [x0]
	.cfi_def_cfa wsp, 32
	ldr	x19, [sp, #16]
	ldp	x29, x30, [sp], #32
	.cfi_def_cfa_offset 0
	.cfi_restore w19
	.cfi_restore w30
	.cfi_restore w29
	ret
.LBB0_2:
	.cfi_restore_state
	mov	w0, #4
	mov	w1, #4
	bl	_ZN5alloc5alloc18handle_alloc_error17h0e34a69f1cc3072eE
...

create_boxed_value

#[unsafe(no_mangle)]
pub fn create_boxed_value(n: i32) -> Box<i32> {
    Box::new(n * 2)
}

Disassembly (the output is piped through rustfilt to demangle symbols):

$ llvm-objdump -r --disassemble-symbols=create_boxed_value target/aarch64-unknown-linux-musl/release/deps/rust_lab-15e16dbcf37b825b.o | rustfilt 

target/aarch64-unknown-linux-musl/release/deps/rust_lab-15e16dbcf37b825b.o:	file format elf64-littleaarch64

Disassembly of section .text.create_boxed_value:

0000000000000000 <create_boxed_value>:
       0: a9be7bfd     	stp	x29, x30, [sp, #-0x20]!
       4: f9000bf3     	str	x19, [sp, #0x10]
       8: 910003fd     	mov	x29, sp
       c: 90000008     	adrp	x8, 0x0 <create_boxed_value>
		000000000000000c:  R_AARCH64_ADR_GOT_PAGE	__rust_no_alloc_shim_is_unstable
      10: 2a0003f3     	mov	w19, w0
      14: 52800080     	mov	w0, #0x4                // =4
      18: f9400108     	ldr	x8, [x8]
		0000000000000018:  R_AARCH64_LD64_GOT_LO12_NC	__rust_no_alloc_shim_is_unstable
      1c: 52800081     	mov	w1, #0x4                // =4
      20: 3940011f     	ldrb	wzr, [x8]
      24: 94000000     	bl	0x24 <create_boxed_value+0x24>
		0000000000000024:  R_AARCH64_CALL26	__rustc::__rust_alloc
      28: b40000c0     	cbz	x0, 0x40 <create_boxed_value+0x40>
      2c: 531f7a68     	lsl	w8, w19, #1
      30: b9000008     	str	w8, [x0]
      34: f9400bf3     	ldr	x19, [sp, #0x10]
      38: a8c27bfd     	ldp	x29, x30, [sp], #0x20
      3c: d65f03c0     	ret
      40: 52800080     	mov	w0, #0x4                // =4
      44: 52800081     	mov	w1, #0x4                // =4
      48: 94000000     	bl	0x48 <create_boxed_value+0x48>
		0000000000000048:  R_AARCH64_CALL26	alloc::alloc::handle_alloc_error

There is one piece of information we need to be familiar with to fully understand all of the instructions. __rust_no_alloc_shim_is_unstable is an internal variable. The instruction ldrb wzr,[x8] loads it into the zero register, so the value is discarded; the load is only there to force the linker to include the alloc shim. The tracking issue is here for anyone who might be interested.

Since our input is i32, __rust_alloc is called with the following arguments (size and alignment):

      14: 52800080     	mov	w0, #0x4                // =4
      1c: 52800081     	mov	w1, #0x4                // =4
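The two constants correspond to the layout of the boxed type. A short sketch using std::alloc::Layout shows where they come from; box_layout is an illustrative helper.

```rust
use std::alloc::Layout;

/// Returns the (size, alignment) pair describing a heap allocation for `T`.
fn box_layout<T>() -> (usize, usize) {
    let layout = Layout::new::<T>();
    (layout.size(), layout.align())
}

fn main() {
    // For i32 both values are 4, matching w0 and w1 in the disassembly.
    let (size, align) = box_layout::<i32>();
    println!("size = {size}, align = {align}");
}
```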

If the allocation fails and returns null, handle_alloc_error is called:

      28: b40000c0     	cbz	x0, 0x40 <create_boxed_value+0x40>

After a successful allocation, the input value is multiplied by 2 (implemented as a left shift by 1) and stored in the allocated memory:

      10: 2a0003f3     	mov	w19, w0
...
      2c: 531f7a68     	lsl	w8, w19, #1
      30: b9000008     	str	w8, [x0]
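The steps above can be mirrored in hand-written Rust. The following is only a sketch of what the optimized code does, not how Box::new is implemented in the standard library. create_boxed_value_manual is a hypothetical name.

```rust
use std::alloc::{alloc, handle_alloc_error, Layout};

/// Hand-written approximation of the optimized `create_boxed_value`.
fn create_boxed_value_manual(n: i32) -> Box<i32> {
    let layout = Layout::new::<i32>(); // size 4, alignment 4
    // SAFETY: the layout has a non-zero size.
    let ptr = unsafe { alloc(layout) } as *mut i32;
    if ptr.is_null() {
        // Corresponds to the branch taken by `cbz x0, ...`.
        handle_alloc_error(layout);
    }
    // SAFETY: `ptr` is non-null, aligned and valid for a 4-byte write. It was
    // allocated by the global allocator with the layout of i32, so handing
    // ownership to a Box is allowed.
    unsafe {
        ptr.write(n << 1); // lsl w8, w19, #1 followed by str w8, [x0]
        Box::from_raw(ptr)
    }
}

fn main() {
    let boxed = create_boxed_value_manual(21);
    println!("{}", *boxed);
}
```

The shift is used instead of n * 2 so that the sketch wraps on overflow in debug builds too, as the release build does.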

process_box

#[unsafe(no_mangle)]
pub fn process_box(boxed: Box<i32>) -> i32 {
    *boxed + 10
}

Disassembly:

$ llvm-objdump -r --disassemble-symbols=process_box target/aarch64-unknown-linux-musl/release/deps/rust_lab-15e16dbcf37b825b.o | rustfilt

target/aarch64-unknown-linux-musl/release/deps/rust_lab-15e16dbcf37b825b.o:	file format elf64-littleaarch64

Disassembly of section .text.process_box:

0000000000000000 <process_box>:
       0: a9be7bfd     	stp	x29, x30, [sp, #-0x20]!
       4: f9000bf3     	str	x19, [sp, #0x10]
       8: 910003fd     	mov	x29, sp
       c: b9400013     	ldr	w19, [x0]
      10: 52800081     	mov	w1, #0x4                // =4
      14: 52800082     	mov	w2, #0x4                // =4
      18: 94000000     	bl	0x18 <process_box+0x18>
		0000000000000018:  R_AARCH64_CALL26	__rustc::__rust_dealloc
      1c: 11002a60     	add	w0, w19, #0xa
      20: f9400bf3     	ldr	x19, [sp, #0x10]
      24: a8c27bfd     	ldp	x29, x30, [sp], #0x20
      28: d65f03c0     	ret

The function takes ownership of the Box and then boxed goes out of scope at the end of the function, so the memory is automatically deallocated. Since the data stored in the Box is i32, __rust_dealloc is called with the following arguments (pointer, size and alignment):

      10: 52800081     	mov	w1, #0x4                // =4
      14: 52800082     	mov	w2, #0x4                // =4

x0 already holds the pointer to the allocated memory, since the Box is passed to the function.

10 is added to the value and the result is returned:

       c: b9400013     	ldr	w19, [x0]
...
      1c: 11002a60     	add	w0, w19, #0xa
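The same sequence (load the value, free the allocation, add 10) can be written out explicitly. This is a sketch for illustration only; process_box_manual is a hypothetical name, and normal Rust code should simply let the Box drop.

```rust
use std::alloc::{dealloc, Layout};

/// Hand-written approximation of the optimized `process_box`.
fn process_box_manual(boxed: Box<i32>) -> i32 {
    let ptr = Box::into_raw(boxed);
    // SAFETY: `ptr` comes from a live Box, so it is valid for reads.
    let value = unsafe { ptr.read() }; // ldr w19, [x0]
    // SAFETY: the memory was allocated by the global allocator with the
    // layout of i32 and is not accessed again after this call.
    unsafe { dealloc(ptr as *mut u8, Layout::new::<i32>()) };
    value.wrapping_add(10) // add w0, w19, #0xa
}

fn main() {
    println!("{}", process_box_manual(Box::new(32)));
}
```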

Vec

Source

Initialize a new workspace with cargo init.

fn main() -> std::process::ExitCode {
    let mut vec = vec![1, 2, 3];
    vec.push(4);
    let ret = vec.pop().unwrap();
    std::process::ExitCode::from(ret)
}

vec! is a macro which generates different code based on the passed argument. In our test code we pass a list of elements so the following pattern matches:

($($x:expr),+ $(,)?) => (
    <[_]>::into_vec($crate::boxed::box_new([$($x),+]))
);

More information about declarative macros can be found in chapter Declarative macros.

$ cargo rustc --release --quiet -- -Z unpretty=hir
#[prelude_import]
use std::prelude::rust_2024::*;
#[macro_use]
extern crate std;
fn main()
    ->
        std::process::ExitCode {
    let mut vec = <[_]>::into_vec(::alloc::boxed::box_new([1, 2, 3]));
    vec.push(4);
    let ret = vec.pop().unwrap();
    std::process::ExitCode::from(ret)
}
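The expansion can be approximated outside the standard library. Since box_new is internal, the sketch below substitutes Box::new (an assumption made for illustration) and uses the stable into_vec method on boxed slices.

```rust
/// Builds the same vector as `vec![1, 2, 3]`, spelled out by hand.
fn build_like_vec_macro() -> Vec<u8> {
    // Box the array, then coerce Box<[u8; 3]> to the boxed slice Box<[u8]>.
    let boxed: Box<[u8]> = Box::new([1, 2, 3]);
    // Reuse the existing heap allocation as the buffer of the Vec.
    boxed.into_vec()
}

fn main() {
    let v = build_like_vec_macro();
    println!("{:?} (len = {}, capacity = {})", v, v.len(), v.capacity());
}
```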

Build

$ cargo rustc --release

Ghidra

Load the binary into Ghidra and auto-analyze it.

rust_lab::main

The official docs explain Vec in detail. The most important parts for us:

The capacity of a vector is the amount of space allocated for any future elements that will be added onto the vector. This is not to be confused with the length of a vector, which specifies the number of actual elements within the vector. If a vector’s length exceeds its capacity, its capacity will automatically be increased, but its elements will have to be reallocated.

Vec is and always will be a (pointer, capacity, length) triplet.

The order of these fields is completely unspecified

If a Vec has allocated memory, then the memory it points to is on the heap

We can verify this:

$ cargo rustc --release --quiet -- -Zprint-type-sizes
...
print-type-size type: `std::vec::Vec<u8>`: 24 bytes, alignment: 8 bytes
print-type-size     field `.buf`: 16 bytes
print-type-size     field `.len`: 8 bytes
print-type-size type: `alloc::raw_vec::RawVec<u8>`: 16 bytes, alignment: 8 bytes
print-type-size     field `.inner`: 16 bytes
print-type-size     field `._marker`: 0 bytes
print-type-size type: `alloc::raw_vec::RawVecInner`: 16 bytes, alignment: 8 bytes
print-type-size     field `.cap`: 8 bytes
print-type-size     field `.ptr`: 8 bytes
print-type-size     field `.alloc`: 0 bytes
...

There are some abstractions involved, but ultimately the memory layout looks like this:

struct Vec {
    usize cap;
    ptr64 ptr;
    usize len;
};
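The three components can be read back through the public API. A minimal sketch follows; vec_parts is an illustrative helper, and the field order in memory remains unspecified as quoted above.

```rust
use std::mem::size_of;

/// Returns the (pointer, capacity, length) triplet of a vector.
fn vec_parts(v: &Vec<u8>) -> (*const u8, usize, usize) {
    (v.as_ptr(), v.capacity(), v.len())
}

fn main() {
    let v = vec![1u8, 2, 3];
    let (ptr, cap, len) = vec_parts(&v);
    println!("ptr = {ptr:p}, cap = {cap}, len = {len}");
    // Three machine words: 24 bytes on a 64-bit target.
    assert_eq!(size_of::<Vec<u8>>(), 3 * size_of::<usize>());
}
```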

With this background information, we are ready to look at the decompiled code. Note that this is not the raw decompiled code: we first created the Vec<u8> type in the Structure Editor and applied it in the code, then made some further modifications, e.g. fixed the prototype of __rust_alloc.

/* WARNING: Unknown calling convention: __rustcall */
/* rust_lab::main */

u8 __rustcall rust_lab::main(void)

{
  Vec<u8> vec;
  u8 ret;
  
  vec.ptr = __rustc::__rust_alloc(3,1);
  if (vec.ptr != (u8 *)0x0) {
    vec.ptr[2] = '\x03';
    vec.ptr[0] = '\x01';
    vec.ptr[1] = '\x02';
    vec.cap = 3;
    vec.len = 3;
                    /* try { // try from 00401958 to 00401967 has its CatchHandler @ 004019ac */
    alloc::raw_vec::RawVec<T,A>::grow_one(&vec,&PTR_s_src/main.rs_0046d860);
    vec.ptr[3] = 4;
    vec.len = 3;
    ret = vec.ptr[3];
    __rustc::__rust_dealloc(vec.ptr,vec.cap,1);
    return ret;
  }
                    /* WARNING: Subroutine does not return */
  alloc::alloc::handle_alloc_error(1,3);
}

The code is fairly self-explanatory at this point. But, for completeness, the logic is the following:

  • allocate a 3 byte section on the heap
  • initialize the heap data, capacity and length
  • increase the capacity (normally, length is increased as well, but in this case it is optimized out since we immediately pop the last element)
  • initialize the new heap data
  • save the last element in the return variable
  • deallocate the heap data
  • return the return variable
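The steps above can be observed from the source side as well. The sketch below reruns the same operations and records the values the decompiled code manipulates. run is an illustrative wrapper, and the capacity after the push is printed rather than asserted because the growth policy is an implementation detail.

```rust
/// Returns (popped value, final length, capacity before push, capacity after push).
fn run() -> (u8, usize, usize, usize) {
    let mut vec: Vec<u8> = vec![1, 2, 3];
    let cap_before = vec.capacity(); // exactly 3 for vec![a, b, c]
    vec.push(4); // triggers grow_one because len == cap
    let cap_after = vec.capacity();
    let ret = vec.pop().unwrap();
    (ret, vec.len(), cap_before, cap_after)
}

fn main() {
    let (ret, len, cap_before, cap_after) = run();
    println!("ret = {ret}, len = {len}, cap: {cap_before} -> {cap_after}");
}
```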

The only thing we might need to discuss is grow_one. If we check the implementation, we can see that it calls grow_amortized.

    fn grow_amortized(
        &mut self,
        len: usize,
        additional: usize,
        elem_layout: Layout,
    ) -> Result<(), TryReserveError> {
...
        let cap = cmp::max(self.cap.as_inner() * 2, required_cap);
        let cap = cmp::max(min_non_zero_cap(elem_layout.size()), cap);
...
    }

By default, this function doubles the capacity. In our case, since the old capacity is 3, the new one would be 6, but since the Layout size is 1, it rounds up the capacity to 8.

// Tiny Vecs are dumb. Skip to:
// - 8 if the element size is 1, because any heap allocators is likely
//   to round up a request of less than 8 bytes to at least 8 bytes.
// ...
const fn min_non_zero_cap(size: usize) -> usize {
    if size == 1 {
        8
    } else if size <= 1024 {
        4
    } else {
        1
    }
}
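Putting the two excerpts together, the arithmetic for our case can be replayed in isolation. This is a simplified model: amortized_cap is a hypothetical function, overflow handling is left out, and it assumes the elided part computes the required capacity as len + additional.

```rust
use std::cmp;

/// Copied from the excerpt above.
const fn min_non_zero_cap(size: usize) -> usize {
    if size == 1 {
        8
    } else if size <= 1024 {
        4
    } else {
        1
    }
}

/// Simplified model of the capacity computation in grow_amortized.
fn amortized_cap(old_cap: usize, len: usize, additional: usize, elem_size: usize) -> usize {
    // Assumption: required capacity is len + additional (no overflow checks here).
    let required_cap = len + additional;
    let cap = cmp::max(old_cap * 2, required_cap);
    cmp::max(min_non_zero_cap(elem_size), cap)
}

fn main() {
    // Vec<u8> with cap 3 and len 3, pushing one element: max(8, max(6, 4)).
    println!("u8:  {}", amortized_cap(3, 3, 1, 1));
    // The same situation with 4-byte elements: max(4, max(6, 4)).
    println!("u32: {}", amortized_cap(3, 3, 1, 4));
}
```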

Call graph of grow_one:

grow_one
    finish_grow
        __rust_realloc
        __rust_alloc

Raw decompiled code for reference:

/* WARNING: Unknown calling convention: __rustcall */
/* rust_lab::main */

undefined1 __rustcall rust_lab::main(void)

{
  undefined1 uVar1;
  undefined2 *puVar2;
  undefined8 local_38;
  undefined2 *local_30;
  undefined8 local_28;
  
  puVar2 = (undefined2 *)0x3;
  __rustc::__rust_alloc(3,1);
  if (puVar2 != (undefined2 *)0x0) {
    *(undefined1 *)(puVar2 + 1) = 3;
    *puVar2 = 0x201;
    local_38 = 3;
    local_28 = 3;
                    /* try { // try from 00401958 to 00401967 has its CatchHandler @ 004019ac */
    local_30 = puVar2;
    alloc::raw_vec::RawVec<T,A>::grow_one(&local_38,&PTR_s_src/main.rs_0046d860);
    *(undefined1 *)((long)local_30 + 3) = 4;
    local_28 = 3;
    uVar1 = *(undefined1 *)((long)local_30 + 3);
    __rustc::__rust_dealloc(local_30,local_38,1);
    return uVar1;
  }
                    /* WARNING: Subroutine does not return */
  alloc::alloc::handle_alloc_error(1,3);
}

Now that we see the big picture, we are ready to go through the listing. To easily match the listing with the decompiled code, it is annotated with pre-comments.

There is one additional piece of information we need to be familiar with to fully understand all of the instructions. __rust_no_alloc_shim_is_unstable is an internal variable. The instruction ldrb wzr,[x8]=>__rust_no_alloc_shim_is_unstable loads it into the zero register, so the value is discarded; the load is only there to force the linker to include the alloc shim. The tracking issue is here for anyone who might be interested.

Listing:

                             **************************************************************
                             * rust_lab::main                                             *
                             **************************************************************
                             undefined __rustcall main()
             undefined         <UNASSIGNED>   <RETURN>
             undefined8        Stack[-0x10]:8 local_10                                XREF[2]:     0040191c(W), 
                                                                                                   00401994(R)  
             undefined8        Stack[-0x20]:8 local_20                                XREF[2]:     00401918(W), 
                                                                                                   00401990(*)  
             Vec<u8>           Stack[-0x38]   vec                                     XREF[2,3]:   00401950(W), 
                                                                                                   0040197c(R), 
                                                                                                   00401968(R), 
                                                                                                   00401954(W), 
                                                                                                   00401980(W)  
             u8                HASH:5f59380   ret
                             _ZN8rust_lab4main17h4c095f7be815a79eE           XREF[4]:     main:004019f8(*), 004528cc, 
                             rust_lab::main                                               00453a8f(*), 00464cc0(*)  
        00401914 ff 03 01 d1     sub        sp,sp,#0x40
        00401918 fd 7b 02 a9     stp        x29,x30,[sp, #local_20]
        0040191c f3 1b 00 f9     str        x19,[sp, #local_10]
        00401920 fd 83 00 91     add        x29,sp,#0x20
        00401924 68 03 00 d0     adrp       x8,0x46f000
                             arg0: 3
        00401928 60 00 80 52     mov        w0,#0x3
                             arg1: 1
        0040192c 21 00 80 52     mov        w1,#0x1
        00401930 08 a1 47 f9     ldr        x8,[x8, #0xf40]=>->__rust_no_alloc_shim_is_uns   = 00470ae4
        00401934 73 00 80 52     mov        w19,#0x3
                             force the linker to include the alloc shim
        00401938 1f 01 40 39     ldrb       wzr,[x8]=>__rust_no_alloc_shim_is_unstable       = ??
        0040193c 34 00 00 94     bl         __rustc::__rust_alloc                            u8 * __rust_alloc(usize size, us
        00401940 00 03 00 b4     cbz        x0,LAB_004019a0
        00401944 28 40 80 52     mov        w8,#0x201
                             heap_data[2] = 3
        00401948 13 08 00 39     strb       w19,[x0, #0x2]
                             heap_data[0] = 1
                             heap_data[1] = 2
        0040194c 08 00 00 79     strh       w8,[x0]
                             init vec.cap and vec.ptr
        00401950 f3 83 00 a9     stp        x19,x0,[sp, #vec.cap]
                             init vec.len
        00401954 f3 0f 00 f9     str        x19,[sp, #vec.len]
                             try { // try from 00401958 to 00401967 has its CatchHandler @
                             LAB_00401958                                    XREF[1]:     00453a88(*)  
        00401958 61 03 00 90     adrp       x1,0x46d000
                             arg1: source location for debugging
        0040195c 21 80 21 91     add        x1=>PTR_s_src/main.rs_0046d860,x1,#0x860         = 0044aea0
                             arg0: address of vec
        00401960 e0 23 00 91     add        x0,sp,#0x8
                             increase cap from 3 to 8
        00401964 fd 0d 01 94     bl         alloc::raw_vec::RawVec<T,A>::grow_one            undefined grow_one()
                             } // end try from 00401958 to 00401967
                             LAB_00401968                                    XREF[1]:     00453a8d(*)  
        00401968 e8 0b 40 f9     ldr        x8,[sp, #vec.ptr]
        0040196c 89 00 80 52     mov        w9,#0x4
                             arg2: 1
        00401970 22 00 80 52     mov        w2,#0x1
                             vec.ptr[3] = 4
        00401974 09 0d 00 39     strb       w9,[x8, #0x3]
        00401978 68 00 80 52     mov        w8,#0x3
                             arg0: vec.ptr
                             arg1: vec.cap
        0040197c e1 83 40 a9     ldp        x1,x0,[sp, #vec.cap]
                             vec.len = 3
        00401980 e8 0f 00 f9     str        x8,[sp, #vec.len]
                             save return value
        00401984 13 0c 40 39     ldrb       w19,[x0, #0x3]
        00401988 22 00 00 94     bl         __rustc::__rust_dealloc                          void __rust_dealloc(u8 * ptr, us
                             return value
        0040198c e0 03 13 2a     mov        w0,w19
        00401990 fd 7b 42 a9     ldp        x29=>local_20,x30,[sp, #0x20]
        00401994 f3 1b 40 f9     ldr        x19,[sp, #local_10]
        00401998 ff 03 01 91     add        sp,sp,#0x40
        0040199c c0 03 5f d6     ret
                             LAB_004019a0                                    XREF[1]:     00401940(j)  
        004019a0 20 00 80 52     mov        w0,#0x1
        004019a4 61 00 80 52     mov        w1,#0x3
        004019a8 0f fe ff 97     bl         alloc::alloc::handle_alloc_error                 undefined handle_alloc_error()
                             -- Flow Override: CALL_RETURN (CALL_TERMINATOR)

rust-lldb

We can verify the results of our static analysis using rust-lldb.

Run the binary under QEMU with its built-in GDB server listening on port 1234 (-g 1234), then connect to it with the rust-lldb client. We will examine our Vec struct just before deallocation.

$ qemu-aarch64 -g 1234 target/aarch64-unknown-linux-musl/release/rust-lab
$ rust-lldb --source-quietly -o "gdb-remote localhost:1234" target/aarch64-unknown-linux-musl/release/rust-lab
Current executable set to '/home/gemesa/git-repos/rust-lab/target/aarch64-unknown-linux-musl/release/rust-lab' (aarch64).
Process 1447629 stopped
* thread #1, stop reason = signal SIGTRAP
    frame #0: 0x00000000004017d4 rust-lab`_start
rust-lab`_start:
->  0x4017d4 <+0>:  mov    x29, #0x0 ; =0 
    0x4017d8 <+4>:  mov    x30, #0x0 ; =0 
    0x4017dc <+8>:  mov    x0, sp
    0x4017e0 <+12>: adrp   x1, 0
(lldb) b _ZN8rust_lab4main17h4c095f7be815a79eE
Breakpoint 2: where = rust-lab`rust_lab::main::h4c095f7be815a79e, address = 0x0000000000401914
(lldb) c
Process 1447629 resuming
Process 1447629 stopped
* thread #1, stop reason = breakpoint 2.1
    frame #0: 0x0000000000401914 rust-lab`rust_lab::main::h4c095f7be815a79e
rust-lab`rust_lab::main::h4c095f7be815a79e:
->  0x401914 <+0>:  sub    sp, sp, #0x40
    0x401918 <+4>:  stp    x29, x30, [sp, #0x20]
    0x40191c <+8>:  str    x19, [sp, #0x30]
    0x401920 <+12>: add    x29, sp, #0x20
(lldb) disas
rust-lab`rust_lab::main::h4c095f7be815a79e:
->  0x401914 <+0>:   sub    sp, sp, #0x40
    0x401918 <+4>:   stp    x29, x30, [sp, #0x20]
    0x40191c <+8>:   str    x19, [sp, #0x30]
    0x401920 <+12>:  add    x29, sp, #0x20
    0x401924 <+16>:  adrp   x8, 110
    0x401928 <+20>:  mov    w0, #0x3 ; =3 
    0x40192c <+24>:  mov    w1, #0x1 ; =1 
    0x401930 <+28>:  ldr    x8, [x8, #0xf40]
    0x401934 <+32>:  mov    w19, #0x3 ; =3 
    0x401938 <+36>:  ldrb   wzr, [x8]
    0x40193c <+40>:  bl     0x401a0c       ; __rustc::__rust_alloc
    0x401940 <+44>:  cbz    x0, 0x4019a0 ; <+140>
    0x401944 <+48>:  mov    w8, #0x201 ; =513 
    0x401948 <+52>:  strb   w19, [x0, #0x2]
    0x40194c <+56>:  strh   w8, [x0]
    0x401950 <+60>:  stp    x19, x0, [sp, #0x8]
    0x401954 <+64>:  str    x19, [sp, #0x18]
    0x401958 <+68>:  adrp   x1, 108
    0x40195c <+72>:  add    x1, x1, #0x860
    0x401960 <+76>:  add    x0, sp, #0x8
    0x401964 <+80>:  bl     0x445158       ; alloc::raw_vec::RawVec$LT$T$C$A$GT$::grow_one::h19885d150c1bd8f5
    0x401968 <+84>:  ldr    x8, [sp, #0x10]
    0x40196c <+88>:  mov    w9, #0x4 ; =4 
    0x401970 <+92>:  mov    w2, #0x1 ; =1 
    0x401974 <+96>:  strb   w9, [x8, #0x3]
    0x401978 <+100>: mov    w8, #0x3 ; =3 
    0x40197c <+104>: ldp    x1, x0, [sp, #0x8]
    0x401980 <+108>: str    x8, [sp, #0x18]
    0x401984 <+112>: ldrb   w19, [x0, #0x3]
    0x401988 <+116>: bl     0x401a10       ; __rustc::__rust_dealloc
    0x40198c <+120>: mov    w0, w19
    0x401990 <+124>: ldp    x29, x30, [sp, #0x20]
    0x401994 <+128>: ldr    x19, [sp, #0x30]
    0x401998 <+132>: add    sp, sp, #0x40
    0x40199c <+136>: ret    
    0x4019a0 <+140>: mov    w0, #0x1 ; =1 
    0x4019a4 <+144>: mov    w1, #0x3 ; =3 
    0x4019a8 <+148>: bl     0x4011e4       ; alloc::alloc::handle_alloc_error::h3005aad4027c4877
    0x4019ac <+152>: ldr    x1, [sp, #0x8]
    0x4019b0 <+156>: mov    x19, x0
    0x4019b4 <+160>: cbz    x1, 0x4019c4 ; <+176>
    0x4019b8 <+164>: ldr    x0, [sp, #0x10]
    0x4019bc <+168>: mov    w2, #0x1 ; =1 
    0x4019c0 <+172>: bl     0x401a10       ; __rustc::__rust_dealloc
    0x4019c4 <+176>: mov    x0, x19
    0x4019c8 <+180>: bl     0x433688       ; _Unwind_Resume
(lldb) b *0x401988
Breakpoint 3: where = rust-lab`rust_lab::main::h4c095f7be815a79e + 116, address = 0x0000000000401988
(lldb) c
Process 1447629 resuming
Process 1447629 stopped
* thread #1, stop reason = breakpoint 3.1
    frame #0: 0x0000000000401988 rust-lab`rust_lab::main::h4c095f7be815a79e + 116
rust-lab`rust_lab::main::h4c095f7be815a79e:
->  0x401988 <+116>: bl     0x401a10       ; __rustc::__rust_dealloc
    0x40198c <+120>: mov    w0, w19
    0x401990 <+124>: ldp    x29, x30, [sp, #0x20]
    0x401994 <+128>: ldr    x19, [sp, #0x30]
(lldb) x/g $sp+8
0x7f7d0dd8d8c8: 0x0000000000000008
(lldb) x/g $sp+16
0x7f7d0dd8d8d0: 0x00007f7d0fc0d040
(lldb) x/g 0x00007f7d0fc0d040
0x7f7d0fc0d040: 0x0000000004030201
(lldb) x/g $sp+24
0x7f7d0dd8d8d8: 0x0000000000000003

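The capacity is 8, the heap data is [1, 2, 3, 4] and the length is 3, as expected. The 64-bit word 0x0000000004030201 is displayed in little-endian order, so its lowest byte (0x01) is the first byte in memory. The fourth byte is still in the buffer even though the length is 3.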
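The little-endian reading of the heap word can be cross-checked in plain Rust. This is a minimal sketch: the body of main is an assumed reconstruction based on the annotated listing above (three initial bytes, one more byte written, length back at 3), and heap_word is a hypothetical helper that mimics what x/g prints. The exact capacity after growth is an implementation detail of the standard library, so it is printed rather than relied on.

```rust
/// Hypothetical helper: interpret up to 8 bytes the way `x/g` displays them
/// on a little-endian target (lowest address = least significant byte).
fn heap_word(bytes: &[u8]) -> u64 {
    let mut word = [0u8; 8];
    let n = bytes.len().min(8);
    word[..n].copy_from_slice(&bytes[..n]);
    u64::from_le_bytes(word)
}

fn main() {
    // Assumed reconstruction of the analyzed program, not the original source.
    let mut vec: Vec<u8> = vec![1, 2, 3];
    vec.push(4); // grows the buffer if the capacity is exhausted
    let last = vec.pop(); // only the length shrinks; the byte stays in the buffer

    println!("len = {}, cap = {}, popped = {:?}", vec.len(), vec.capacity(), last);
    println!("heap word = {:#018x}", heap_word(&[1, 2, 3, 4]));
}
```

Running it natively or under qemu-aarch64 should report the same length as the debugger session; the capacity may differ between standard library versions.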

enum

Source

Initialize a new workspace with cargo init --lib.

pub enum Color {
    Red,
    Green,
    Blue,
}

#[unsafe(no_mangle)]
pub fn u32_to_color(value: u32) -> Color {
    match value {
        0 => Color::Red,
        1 => Color::Green,
        2 => Color::Blue,
        _ => Color::Red,
    }
}

#[unsafe(no_mangle)]
pub fn simple_enum_match(color: Color) -> u8 {
    match color {
        Color::Red => 0,
        Color::Green => 1,
        Color::Blue => 2,
    }
}

pub enum BasicShape {
    Circle(i32),
    Point,
}

#[unsafe(no_mangle)]
pub fn basic_shape_match(shape: BasicShape) -> i32 {
    match shape {
        BasicShape::Circle(radius) => radius * radius,
        BasicShape::Point => 0,
    }
}

#[unsafe(no_mangle)]
pub fn make_basic_circle(radius: i32) -> BasicShape {
    BasicShape::Circle(radius)
}

#[unsafe(no_mangle)]
pub fn make_basic_point() -> BasicShape {
    BasicShape::Point
}

pub enum Shape {
    Circle(i32),
    Rectangle(i32, i32),
    Point,
}

#[unsafe(no_mangle)]
pub fn enum_with_data_match(shape: Shape) -> i32 {
    match shape {
        Shape::Circle(radius) => radius * radius,
        Shape::Rectangle(width, height) => width * height,
        Shape::Point => 0,
    }
}

#[unsafe(no_mangle)]
pub fn make_circle(radius: i32) -> Shape {
    Shape::Circle(radius)
}

#[unsafe(no_mangle)]
pub fn make_rectangle(w: i32, h: i32) -> Shape {
    Shape::Rectangle(w, h)
}

#[unsafe(no_mangle)]
pub fn make_point() -> Shape {
    Shape::Point
}

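The enum type is described in the official docs in detail. We use no_mangle to simplify things; the reason can be found here. The source code is split into 3 parts:

  • unit-only enum
  • data-carrying enum (largest variant: tuple with 1 field)
  • data-carrying enum (largest variant: tuple with 2 fields)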

Note: for this analysis the unit-only and data-carrying types have been chosen as they are the most common ones.

Build

$ cargo rustc --release -- --emit obj

Ghidra

Load the .o file (located at target/aarch64-unknown-linux-musl/release/deps/) into Ghidra and auto-analyze it.

Layout

This description provides a good overview of the enum types and their layouts (even the ones we will not discuss such as empty enum and enum with a single variant). enums are also called tagged unions and their layout is unspecified, unless you use #[repr(...)].

Still, we will see that in our examples an enum is either represented by a discriminant only or a discriminant plus the data/payload.

Unit-only enum

pub enum Color {
    Red,
    Green,
    Blue,
}

#[unsafe(no_mangle)]
pub fn u32_to_color(value: u32) -> Color {
    match value {
        0 => Color::Red,
        1 => Color::Green,
        2 => Color::Blue,
        _ => Color::Red,
    }
}

#[unsafe(no_mangle)]
pub fn simple_enum_match(color: Color) -> u8 {
    match color {
        Color::Red => 0,
        Color::Green => 1,
        Color::Blue => 2,
    }
}

Since the layout is unspecified, there is no fixed size for the enum, although the compiler will typically choose the smallest representation possible. In this case the discriminant fits into a single byte (u8 or i8).

$ cargo rustc --release --quiet -- -Z print-type-sizes       
print-type-size type: `Color`: 1 bytes, alignment: 1 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `Red`: 0 bytes
print-type-size     variant `Green`: 0 bytes
print-type-size     variant `Blue`: 0 bytes

If we check simple_enum_match(), we can see that it just forwards the input value (0/1/2) to the output. Both live in w0, so the function body is a single ret. This means the enum uses the values 0, 1 and 2 as the discriminants of Red, Green and Blue. u32_to_color() behaves the same, except that it also handles the default case (any other input maps to Red, i.e. 0), which makes the code a bit more complicated.

Listings:

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined simple_enum_match()
             undefined         <UNASSIGNED>   <RETURN>
                             simple_enum_match                               XREF[3]:     Entry Point(*), 001000c0(*), 
                                                                                          _elfSectionHeaders::00000110(*)  
        00100014 c0 03 5f d6     ret

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined u32_to_color()
             undefined         <UNASSIGNED>   <RETURN>
                             u32_to_color                                    XREF[4]:     Entry Point(*), 001000ac(*), 
                                                                                          _elfSectionHeaders::00000090(*), 
                                                                                          _elfSectionHeaders::000000d0(*)  
        00100000 1f 04 00 71     cmp        w0,#0x1
                             set w8=1 if input==1, else w8=0
        00100004 e8 17 9f 1a     cset       w8,eq
        00100008 1f 08 00 71     cmp        w0,#0x2
                             if input==2, keep input, else use w8
        0010000c 00 00 88 1a     csel       w0,w0,w8,eq
        00100010 c0 03 5f d6     ret
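To make the branchless cmp/cset/cmp/csel sequence easier to follow, it can be modeled in Rust and compared against the original match. This is only a sketch: u32_to_color_model is a hypothetical name, its variables are named after the registers in the listing, and the no_mangle attributes are omitted because they do not matter here.

```rust
pub enum Color {
    Red,
    Green,
    Blue,
}

pub fn u32_to_color(value: u32) -> Color {
    match value {
        0 => Color::Red,
        1 => Color::Green,
        2 => Color::Blue,
        _ => Color::Red,
    }
}

/// Hypothetical model of the generated code, one line per instruction pair.
fn u32_to_color_model(w0: u32) -> u32 {
    let w8 = (w0 == 1) as u32; // cmp w0,#0x1 ; cset w8,eq
    if w0 == 2 { w0 } else { w8 } // cmp w0,#0x2 ; csel w0,w0,w8,eq
}

fn main() {
    for value in [0u32, 1, 2, 3, u32::MAX] {
        // Casting a unit-only enum with `as` yields its discriminant.
        let expected = u32_to_color(value) as u32;
        assert_eq!(u32_to_color_model(value), expected);
    }
    println!("size_of::<Color>() = {}", std::mem::size_of::<Color>());
}
```

The model never branches on 0 explicitly: every input other than 1 and 2 falls through to w8 = 0, which is the discriminant of Red.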

Data-carrying enum (largest variant: tuple with 1 field)

pub enum BasicShape {
    Circle(i32),
    Point,
}

#[unsafe(no_mangle)]
pub fn basic_shape_match(shape: BasicShape) -> i32 {
    match shape {
        BasicShape::Circle(radius) => radius * radius,
        BasicShape::Point => 0,
    }
}

#[unsafe(no_mangle)]
pub fn make_basic_circle(radius: i32) -> BasicShape {
    BasicShape::Circle(radius)
}

#[unsafe(no_mangle)]
pub fn make_basic_point() -> BasicShape {
    BasicShape::Point
}

$ cargo rustc --release --quiet -- -Z print-type-sizes
print-type-size type: `BasicShape`: 8 bytes, alignment: 4 bytes
print-type-size     discriminant: 4 bytes
print-type-size     variant `Circle`: 4 bytes
print-type-size         field `.0`: 4 bytes
print-type-size     variant `Point`: 0 bytes

When the largest data-carrying variant is a tuple with 1 field, the compiler passes the discriminant and the data in 2 registers (w0: discriminant, w1: data). A discriminant of 0 means Circle while 1 means Point.

Listings:

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined basic_shape_match()
             undefined         <UNASSIGNED>   <RETURN>
                             basic_shape_match                               XREF[3]:     Entry Point(*), 00100124(*), 
                                                                                          _elfSectionHeaders::00000250(*)  
        0010006c 28 7c 01 1b     mul        w8,w1,w1
                             check discriminant
        00100070 1f 00 00 72     tst        w0,#0x1
                             if bit 0 of the discriminant is set (Point), return 0, else return w8 (Circle)
        00100074 e0 13 88 1a     csel       w0,wzr,w8,ne
        00100078 c0 03 5f d6     ret
                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined make_basic_circle()
             undefined         <UNASSIGNED>   <RETURN>
                             make_basic_circle                               XREF[3]:     Entry Point(*), 00100138(*), 
                                                                                          _elfSectionHeaders::00000290(*)  
        0010007c e1 03 00 2a     mov        w1,w0
        00100080 e0 03 1f 2a     mov        w0,wzr
        00100084 c0 03 5f d6     ret
                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined make_basic_point()
             undefined         <UNASSIGNED>   <RETURN>
                             make_basic_point                                XREF[3]:     Entry Point(*), 0010014c(*), 
                                                                                          _elfSectionHeaders::000002d0(*)  
        00100088 20 00 80 52     mov        w0,#0x1
        0010008c c0 03 5f d6     ret
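The register-pair convention observed above can be written down as a small Rust model. This is a sketch under assumptions: to_registers and basic_shape_match_model are hypothetical helpers, the (w0, w1) pair only mirrors what this particular build produced, and the no_mangle attributes are omitted.

```rust
pub enum BasicShape {
    Circle(i32),
    Point,
}

pub fn basic_shape_match(shape: BasicShape) -> i32 {
    match shape {
        BasicShape::Circle(radius) => radius * radius,
        BasicShape::Point => 0,
    }
}

/// Hypothetical helper: the (w0, w1) pair the listings use for each variant.
fn to_registers(shape: &BasicShape) -> (u32, i32) {
    match shape {
        BasicShape::Circle(radius) => (0, *radius),
        // make_basic_point leaves w1 untouched; 0 is just a placeholder here.
        BasicShape::Point => (1, 0),
    }
}

/// Hypothetical model of the generated basic_shape_match.
fn basic_shape_match_model(w0: u32, w1: i32) -> i32 {
    let w8 = w1.wrapping_mul(w1); // mul w8,w1,w1 (computed unconditionally)
    if w0 & 1 != 0 { 0 } else { w8 } // tst w0,#0x1 ; csel w0,wzr,w8,ne
}

fn main() {
    let circle = BasicShape::Circle(5);
    let (w0, w1) = to_registers(&circle);
    assert_eq!(basic_shape_match_model(w0, w1), basic_shape_match(circle));

    let point = BasicShape::Point;
    let (w0, w1) = to_registers(&point);
    assert_eq!(basic_shape_match_model(w0, w1), basic_shape_match(point));
}
```

Note that the multiplication happens before the discriminant is checked, so for Point the product of whatever happens to be in w1 is computed and then discarded.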

Data-carrying enum (largest variant: tuple with 2 fields)

pub enum Shape {
    Circle(i32),
    Rectangle(i32, i32),
    Point,
}

#[unsafe(no_mangle)]
pub fn enum_with_data_match(shape: Shape) -> i32 {
    match shape {
        Shape::Circle(radius) => radius * radius,
        Shape::Rectangle(width, height) => width * height,
        Shape::Point => 0,
    }
}

#[unsafe(no_mangle)]
pub fn make_circle(radius: i32) -> Shape {
    Shape::Circle(radius)
}

#[unsafe(no_mangle)]
pub fn make_rectangle(w: i32, h: i32) -> Shape {
    Shape::Rectangle(w, h)
}

#[unsafe(no_mangle)]
pub fn make_point() -> Shape {
    Shape::Point
}

$ cargo rustc --release --quiet -- -Z print-type-sizes
print-type-size type: `Shape`: 12 bytes, alignment: 4 bytes
print-type-size     discriminant: 4 bytes
print-type-size     variant `Rectangle`: 8 bytes
print-type-size         field `.0`: 4 bytes
print-type-size         field `.1`: 4 bytes
print-type-size     variant `Circle`: 4 bytes
print-type-size         field `.0`: 4 bytes
print-type-size     variant `Point`: 0 bytes

When the largest data-carrying variant is a tuple with 2 fields, the compiler passes the enum indirectly: a single register (x0) holds a pointer to the enum's memory location. The memory begins with the discriminant (here: offset 0), followed by the fields of the active variant (here: starting at offset 4). A discriminant of 0 means Circle, 1 means Rectangle and 2 means Point.

Note: x8 is the indirect result register according to the AAPCS64. It is also explained here:

XR (X8) is a pointer to the memory allocated by the caller for returning the struct.

Listings:

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined enum_with_data_match()
             undefined         <UNASSIGNED>   <RETURN>
                             enum_with_data_match                            XREF[3]:     Entry Point(*), 001000d4(*), 
                                                                                          _elfSectionHeaders::00000150(*)  
        00100018 08 00 40 b9     ldr        w8,[x0]
        0010001c c8 00 00 34     cbz        w8,LAB_00100034
        00100020 1f 05 00 71     cmp        w8,#0x1
        00100024 e1 00 00 54     b.ne       LAB_00100040
                             rectangle
        00100028 08 a4 40 29     ldp        w8,w9,[x0, #0x4]
        0010002c 20 7d 08 1b     mul        w0,w9,w8
        00100030 c0 03 5f d6     ret
                             circle
                             LAB_00100034                                    XREF[1]:     0010001c(j)  
        00100034 08 04 40 b9     ldr        w8,[x0, #0x4]
        00100038 00 7d 08 1b     mul        w0,w8,w8
        0010003c c0 03 5f d6     ret
                             point
                             LAB_00100040                                    XREF[1]:     00100024(j)  
        00100040 e0 03 1f 2a     mov        w0,wzr
        00100044 c0 03 5f d6     ret
                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined make_circle()
             undefined         <UNASSIGNED>   <RETURN>
                             make_circle                                     XREF[3]:     Entry Point(*), 001000e8(*), 
                                                                                          _elfSectionHeaders::00000190(*)  
        00100048 1f 01 00 29     stp        wzr,w0,[x8]
        0010004c c0 03 5f d6     ret
                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined make_rectangle()
             undefined         <UNASSIGNED>   <RETURN>
                             make_rectangle                                  XREF[3]:     Entry Point(*), 001000fc(*), 
                                                                                          _elfSectionHeaders::000001d0(*)  
        00100050 29 00 80 52     mov        w9,#0x1
        00100054 00 85 00 29     stp        w0,w1,[x8, #0x4]
        00100058 09 01 00 b9     str        w9,[x8]
        0010005c c0 03 5f d6     ret
                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined make_point()
             undefined         <UNASSIGNED>   <RETURN>
                             make_point                                      XREF[3]:     Entry Point(*), 00100110(*), 
                                                                                          _elfSectionHeaders::00000210(*)  
        00100060 49 00 80 52     mov        w9,#0x2
        00100064 09 01 00 b9     str        w9,[x8]
        00100068 c0 03 5f d6     ret
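The in-memory form used by these listings can be sketched with a plain 3-word buffer. Everything here is an illustrative assumption: ShapeMem and the *_model functions are hypothetical, and the word order (discriminant, field 0, field 1) is only what this build produced, since the layout itself is unspecified.

```rust
/// Hypothetical view of the 12-byte Shape: [discriminant, field 0, field 1].
type ShapeMem = [i32; 3];

fn make_circle_model(x8: &mut ShapeMem, w0: i32) {
    x8[0] = 0; // stp wzr,w0,[x8]
    x8[1] = w0;
}

fn make_rectangle_model(x8: &mut ShapeMem, w0: i32, w1: i32) {
    x8[1] = w0; // stp w0,w1,[x8, #0x4]
    x8[2] = w1;
    x8[0] = 1; // str w9,[x8]
}

fn make_point_model(x8: &mut ShapeMem) {
    x8[0] = 2; // str w9,[x8]; the payload words are left untouched
}

fn enum_with_data_match_model(x0: &ShapeMem) -> i32 {
    match x0[0] {
        0 => x0[1].wrapping_mul(x0[1]), // circle
        1 => x0[2].wrapping_mul(x0[1]), // rectangle
        _ => 0,                         // point (reached via b.ne)
    }
}

fn main() {
    // The caller allocates the result memory and passes its address in x8.
    let mut mem: ShapeMem = [0; 3];

    make_rectangle_model(&mut mem, 3, 4);
    println!("{:?} -> {}", mem, enum_with_data_match_model(&mem));

    make_circle_model(&mut mem, 5);
    println!("{:?} -> {}", mem, enum_with_data_match_model(&mem));

    make_point_model(&mut mem);
    println!("{:?} -> {}", mem, enum_with_data_match_model(&mem));
}
```

After make_point_model the buffer still contains stale payload words from the previous variant, which matches the listing: make_point only writes the discriminant.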

Option

Source

Initialize a new workspace with cargo init --lib.

#[unsafe(no_mangle)]
pub fn safe_divide(dividend: i32, divisor: i32) -> Option<i32> {
    if divisor == 0 {
        None
    } else {
        Some(dividend / divisor)
    }
}

#[unsafe(no_mangle)]
pub fn process_option(value: Option<i32>) -> i32 {
    match value {
        Some(x) => x * 2,
        None => 0,
    }
}

#[unsafe(no_mangle)]
pub fn process_str_option(value: Option<&str>) -> usize {
    match value {
        Some(s) => s.len(),
        None => 0,
    }
}

#[unsafe(no_mangle)]
pub fn process_box_option(value: Option<Box<i32>>) -> i32 {
    match value {
        Some(boxed) => *boxed,
        None => -1,
    }
}

The Option type is described in the official docs in detail.

no_mangle

In some cases, simple functions such as process_option might be inlined by the compiler. When that happens, they are not present in the .o file, only in .rmeta. The inlining is based on the information (e.g. function signatures, type information and encoded MIR) available in the .rmeta file. Information about the .rmeta file format can be found here. We want to see the generated code for our sample functions in the .o file, so this optimization is undesirable for us. A possible solution is to use #[unsafe(no_mangle)], which has 2 effects:

  • Do not mangle the symbol name.
  • Export this symbol. #[unsafe(no_mangle)] implies that the function is intended to be called from outside of the current compilation unit (e.g. from C code or another Rust crate with a different LTO context). For this reason, it will be present in the .o file.

Without #[unsafe(no_mangle)]:

$ llvm-objdump --syms target/aarch64-unknown-linux-musl/release/deps/*.o

target/aarch64-unknown-linux-musl/release/deps/rust_lab-35c360f17fe9ba7d.o:     file format elf64-littleaarch64

SYMBOL TABLE:
0000000000000000 l    df *ABS*  0000000000000000 rust_lab.e234cd7f6b439d7e-cgu.0
0000000000000000 l    d  .text._ZN8rust_lab11safe_divide17h459bac753259da22E    0000000000000000 .text._ZN8rust_lab11safe_divide17h459bac753259da22E
0000000000000000 l       .text._ZN8rust_lab11safe_divide17h459bac753259da22E    0000000000000000 $x
0000000000000000 l    d  .text._ZN8rust_lab18process_box_option17h2a7f2c1809e960d7E     0000000000000000 .text._ZN8rust_lab18process_box_option17h2a7f2c1809e960d7E
0000000000000000 l       .text._ZN8rust_lab18process_box_option17h2a7f2c1809e960d7E     0000000000000000 $x
0000000000000000 l    d  .rodata..Lalloc_f5ffd2fd1476bab43ad89fb40c72d0c5       0000000000000000 .rodata..Lalloc_f5ffd2fd1476bab43ad89fb40c72d0c5
0000000000000000 l       .rodata..Lalloc_f5ffd2fd1476bab43ad89fb40c72d0c5       0000000000000000 $d
0000000000000000 l    d  .data.rel.ro..Lalloc_0ea055d83440e297c58eb113a9bcb2e2  0000000000000000 .data.rel.ro..Lalloc_0ea055d83440e297c58eb113a9bcb2e2
0000000000000000 l       .data.rel.ro..Lalloc_0ea055d83440e297c58eb113a9bcb2e2  0000000000000000 $d
0000000000000000 l       .comment       0000000000000000 $d
0000000000000000 l       .eh_frame      0000000000000000 $d
0000000000000000 g     F .text._ZN8rust_lab11safe_divide17h459bac753259da22E    0000000000000040 _ZN8rust_lab11safe_divide17h459bac753259da22E
0000000000000000         *UND*  0000000000000000 _ZN4core9panicking11panic_const24panic_const_div_overflow17h2ce15414ba9ec1bdE
0000000000000000 g     F .text._ZN8rust_lab18process_box_option17h2a7f2c1809e960d7E     0000000000000044 _ZN8rust_lab18process_box_option17h2a7f2c1809e960d7E
0000000000000000         *UND*  0000000000000000 _RNvCsdk9DaPZnL1i_7___rustc14___rust_dealloc

With #[unsafe(no_mangle)]:

$ llvm-objdump --syms target/aarch64-unknown-linux-musl/release/deps/*.o

target/aarch64-unknown-linux-musl/release/deps/rust_lab-35c360f17fe9ba7d.o:     file format elf64-littleaarch64

SYMBOL TABLE:
0000000000000000 l    df *ABS*  0000000000000000 rust_lab.e234cd7f6b439d7e-cgu.0
0000000000000000 l    d  .text.safe_divide      0000000000000000 .text.safe_divide
0000000000000000 l       .text.safe_divide      0000000000000000 $x
0000000000000000 l    d  .text.process_option   0000000000000000 .text.process_option
0000000000000000 l       .text.process_option   0000000000000000 $x
0000000000000000 l    d  .text.process_str_option       0000000000000000 .text.process_str_option
0000000000000000 l       .text.process_str_option       0000000000000000 $x
0000000000000000 l    d  .text.process_box_option       0000000000000000 .text.process_box_option
0000000000000000 l       .text.process_box_option       0000000000000000 $x
0000000000000000 l    d  .rodata..Lalloc_f5ffd2fd1476bab43ad89fb40c72d0c5       0000000000000000 .rodata..Lalloc_f5ffd2fd1476bab43ad89fb40c72d0c5
0000000000000000 l       .rodata..Lalloc_f5ffd2fd1476bab43ad89fb40c72d0c5       0000000000000000 $d
0000000000000000 l    d  .data.rel.ro..Lalloc_0ea055d83440e297c58eb113a9bcb2e2  0000000000000000 .data.rel.ro..Lalloc_0ea055d83440e297c58eb113a9bcb2e2
0000000000000000 l       .data.rel.ro..Lalloc_0ea055d83440e297c58eb113a9bcb2e2  0000000000000000 $d
0000000000000000 l       .comment       0000000000000000 $d
0000000000000000 l       .eh_frame      0000000000000000 $d
0000000000000000 g     F .text.safe_divide      0000000000000040 safe_divide
0000000000000000         *UND*  0000000000000000 _ZN4core9panicking11panic_const24panic_const_div_overflow17h2ce15414ba9ec1bdE
0000000000000000 g     F .text.process_option   0000000000000010 process_option
0000000000000000 g     F .text.process_str_option       000000000000000c process_str_option
0000000000000000 g     F .text.process_box_option       0000000000000044 process_box_option
0000000000000000         *UND*  0000000000000000 _RNvCsdk9DaPZnL1i_7___rustc14___rust_dealloc

Build

$ cargo rustc --release -- --emit obj,mir,llvm-ir

Ghidra

Load the .o file (located at target/aarch64-unknown-linux-musl/release/deps/) into Ghidra and auto-analyze it.

Layout

Option is an enum type which is conceptually a tagged union with a discriminant and data (refer to chapter enum for more information about enums). Since the exact memory layout is unspecified without explicit #[repr] attributes, Rust often applies discriminant elision. For common types like references and Box<T>, None is represented using an invalid bit pattern (such as a null pointer, see section Null pointer optimization) rather than a separate discriminant field, making Option<T> the same size as T. When a discriminant is present and the variant is None, the data value is undefined. We will see this in the generated code, and it is documented as well.

safe_divide

#[unsafe(no_mangle)]
pub fn safe_divide(dividend: i32, divisor: i32) -> Option<i32> {
    if divisor == 0 {
        None
    } else {
        Some(dividend / divisor)
    }
}

The generated assembly is straightforward; there is only one piece of background information we need to fully understand it. The compiler automatically generates a check that panics if the result would overflow. For i32, there is only one such scenario: the dividend is i32::MIN and the divisor is -1. This can already be seen in the MIR:

fn safe_divide(_1: i32, _2: i32) -> Option<i32> {
    debug dividend => _1;
    debug divisor => _2;
    let mut _0: std::option::Option<i32>;
    let mut _3: bool;
    let mut _4: i32;
    let mut _5: bool;
    let mut _6: bool;
    let mut _7: bool;

    bb0: {
        _3 = Eq(copy _2, const 0_i32);
        switchInt(move _2) -> [0: bb1, otherwise: bb2];
    }

    bb1: {
        _0 = const Option::<i32>::None;
        goto -> bb5;
    }

    bb2: {
        StorageLive(_4);
        assert(!copy _3, "attempt to divide `{}` by zero", copy _1) -> [success: bb3, unwind continue];
    }

    bb3: {
        _5 = Eq(copy _2, const -1_i32);
        _6 = Eq(copy _1, const i32::MIN);
        _7 = BitAnd(move _5, move _6);
        assert(!move _7, "attempt to compute `{} / {}`, which would overflow", copy _1, copy _2) -> [success: bb4, unwind continue];
    }

    bb4: {
        _4 = Div(copy _1, copy _2);
        _0 = Option::<i32>::Some(move _4);
        StorageDead(_4);
        goto -> bb5;
    }

    bb5: {
        return;
    }
}

The result is returned using 2 registers: w0 stores the discriminant (None: 0, Some: 1) and w1 stores the data.

Listing:

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined safe_divide()
             undefined         <UNASSIGNED>   <RETURN>
             undefined8        Stack[-0x10]:8 local_10                                XREF[1]:     0010002c(W)  
                             check if divisor is 0
                             safe_divide                                     XREF[4]:     Entry Point(*), 001000e4(*), 
                                                                                          _elfSectionHeaders::00000090(*), 
                                                                                          _elfSectionHeaders::000000d0(*)  
        00100000 21 01 00 34     cbz        w1,LAB_00100024
                             i32::MIN
        00100004 08 00 b0 52     mov        w8,#0x80000000
                             check if dividend is i32::MIN
        00100008 1f 00 08 6b     cmp        w0,w8
        0010000c 61 00 00 54     b.ne       LAB_00100018
                             check if divisor is -1
        00100010 3f 04 00 31     cmn        w1,#0x1
        00100014 c0 00 00 54     b.eq       LAB_0010002c
                             LAB_00100018                                    XREF[1]:     0010000c(j)  
        00100018 01 0c c1 1a     sdiv       w1,w0,w1
                             Some
        0010001c 20 00 80 52     mov        w0,#0x1
        00100020 c0 03 5f d6     ret
                             None
                             LAB_00100024                                    XREF[1]:     00100000(j)  
        00100024 e0 03 1f 2a     mov        w0,wzr
        00100028 c0 03 5f d6     ret
                             LAB_0010002c                                    XREF[1]:     00100014(j)  
        0010002c fd 7b bf a9     stp        x29,x30,[sp, #local_10]!
        00100030 fd 03 00 91     mov        x29,sp
        00100034 00 00 00 90     adrp       x0,0x100000
        00100038 00 c0 02 91     add        x0=>PTR_DAT_001000b0,x0,#0xb0                    = 001000a0
        0010003c f1 03 00 94     bl         <EXTERNAL>::core::panicking::panic_const::pani   undefined panic_const_div_overfl
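The two special cases that the listing checks for (divisor 0 and i32::MIN / -1) can be reproduced with the standard checked_div method. This is a minimal sketch; checked_safe_divide is a hypothetical alternative added for comparison and is not part of the analyzed source.

```rust
pub fn safe_divide(dividend: i32, divisor: i32) -> Option<i32> {
    if divisor == 0 {
        None
    } else {
        Some(dividend / divisor)
    }
}

/// Hypothetical variant that also turns the overflow case into None
/// instead of panicking.
pub fn checked_safe_divide(dividend: i32, divisor: i32) -> Option<i32> {
    dividend.checked_div(divisor)
}

fn main() {
    // The only overflowing case for i32: the true quotient 2147483648 does not fit.
    assert_eq!(i32::MIN.checked_div(-1), None);
    // checked_div reports division by zero as None as well.
    assert_eq!(7i32.checked_div(0), None);

    println!("{:?}", safe_divide(7, 2));
    println!("{:?}", safe_divide(7, 0));
    println!("{:?}", checked_safe_divide(i32::MIN, -1));
    // Calling safe_divide(i32::MIN, -1) would panic; this is the path that
    // ends in the panic_const_div_overflow call in the listing.
}
```

With a function like checked_safe_divide the panic path may no longer be needed, which could be an interesting variation to compare in Ghidra.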

process_option

#[unsafe(no_mangle)]
pub fn process_option(value: Option<i32>) -> i32 {
    match value {
        Some(x) => x * 2,
        None => 0,
    }
}

We can see the same pattern (w0: discriminant, w1: data) when processing an Option passed to our function.

Listing:

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined process_option()
             undefined         <UNASSIGNED>   <RETURN>
                             multiply by 2
                             process_option                                  XREF[3]:     Entry Point(*), 00100100(*), 
                                                                                          _elfSectionHeaders::00000150(*)  
        00100040 28 78 1f 53     lsl        w8,w1,#0x1
                             check discriminant: Z flag = 1 if None, Z flag = 0 if Some
        00100044 1f 00 00 72     tst        w0,#0x1
                             if Z=0 (Some): return w8, if Z=1 (None): return wzr
        00100048 00 11 9f 1a     csel       w0,w8,wzr,ne
        0010004c c0 03 5f d6     ret
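As with the enum examples, the three instructions can be mirrored in Rust. This is a sketch only: process_option_model is a hypothetical name and its (w0, w1) parameters reflect the register assignment seen in this build, not a guaranteed ABI.

```rust
pub fn process_option(value: Option<i32>) -> i32 {
    match value {
        Some(x) => x * 2,
        None => 0,
    }
}

/// Hypothetical model of the generated code (w0: discriminant, w1: data).
fn process_option_model(w0: u32, w1: i32) -> i32 {
    let w8 = w1.wrapping_shl(1); // lsl w8,w1,#0x1
    if w0 & 1 != 0 { w8 } else { 0 } // tst w0,#0x1 ; csel w0,w8,wzr,ne
}

fn main() {
    assert_eq!(process_option_model(1, 21), process_option(Some(21)));
    // For None, w1 holds an undefined value; the model must ignore it.
    assert_eq!(process_option_model(0, 12345), process_option(None));
}
```

The multiplication by 2 is strength-reduced to a left shift and is computed before the discriminant is tested, so the undefined payload of None is read but never returned.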

Null pointer optimization

There are some cases where the discriminant is omitted due to optimizations. The general rule is that null pointer optimization can be used for types that can never be null. Examples include:

  • Option<&str>
  • Option<Box<i32>>

By the safety guarantees of safe Rust, a &str always points to a valid location and a Box<T> always points to a valid heap allocation. This enables the compiler to use further optimizations, for example dropping the discriminant field and using a null value to represent the None variant.

While tracing the different compilation steps, we can see that the discriminant is still read explicitly in the MIR but no longer appears in the LLVM IR. This means the null pointer optimization is applied when the MIR is lowered to LLVM IR.
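
The effect on the type size can be checked directly with std::mem::size_of. The following is a small sketch; the helper name option_sizes is an assumption used only for illustration.

```rust
/// Returns (size of T, size of Option<T>) for three representative types.
pub fn option_sizes() -> [(usize, usize); 3] {
    [
        // &str can never be null, so None reuses the null pointer value
        (std::mem::size_of::<&str>(), std::mem::size_of::<Option<&str>>()),
        // Box<i32> can never be null either
        (std::mem::size_of::<Box<i32>>(), std::mem::size_of::<Option<Box<i32>>>()),
        // every bit pattern of i32 is a valid value, so a separate discriminant is needed
        (std::mem::size_of::<i32>(), std::mem::size_of::<Option<i32>>()),
    ]
}
```

For the first two pairs both sizes are equal, while Option<i32> is larger than i32.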

process_str_option

#[unsafe(no_mangle)]
pub fn process_str_option(value: Option<&str>) -> usize {
    match value {
        Some(s) => s.len(),
        None => 0,
    }
}

Looking at the MIR, we can see that it extracts and checks the discriminant:

...
        _2 = discriminant(_1);
        switchInt(move _2) -> [0: bb2, 1: bb3, otherwise: bb1];
...

In the LLVM IR this has been simplified and replaced with a null check. If the pointer is null, 0 is returned. Otherwise, the length of the referenced string is returned. (An &str consists of 2 values: a pointer and a length.)

; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable
define noundef i64 @process_str_option(ptr noalias noundef readonly align 1 %0, i64 %1) unnamed_addr #1 {
start:
  %.not = icmp eq ptr %0, null
  %. = select i1 %.not, i64 0, i64 %1
  ret i64 %.
}
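
Written back as Rust, the LLVM IR above corresponds roughly to the following sketch. The function name process_str_option_model and the explicit pointer and length parameters are assumptions for illustration; they stand for %0 and %1.

```rust
/// Hypothetical Rust rendering of the LLVM IR: `ptr` is %0, `len` is %1.
pub fn process_str_option_model(ptr: *const u8, len: usize) -> usize {
    // icmp eq ptr %0, null followed by select: None is encoded as the null pointer
    if ptr.is_null() { 0 } else { len }
}
```

The string data itself is never touched; only the pointer is compared and the length is passed through.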

Listing:

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined process_str_option()
             undefined         <UNASSIGNED>   <RETURN>
                             process_str_option                              XREF[3]:     Entry Point(*), 00100114(*), 
                                                                                          _elfSectionHeaders::00000190(*)  
        00100050 1f 00 00 f1     cmp        x0,#0x0
        00100054 e0 03 81 9a     csel       x0,xzr,x1,eq
        00100058 c0 03 5f d6     ret

Full MIR for reference:

fn process_str_option(_1: Option<&str>) -> usize {
    debug value => _1;
    let mut _0: usize;
    let mut _2: isize;
    let _3: &str;
    scope 1 {
        debug s => _3;
        scope 2 (inlined core::str::<impl str>::len) {
            let _4: &[u8];
            scope 3 (inlined core::str::<impl str>::as_bytes) {
            }
        }
    }

    bb0: {
        _2 = discriminant(_1);
        switchInt(move _2) -> [0: bb2, 1: bb3, otherwise: bb1];
    }

    bb1: {
        unreachable;
    }

    bb2: {
        _0 = const 0_usize;
        goto -> bb4;
    }

    bb3: {
        _3 = copy ((_1 as Some).0: &str);
        StorageLive(_4);
        _4 = copy _3 as &[u8] (Transmute);
        _0 = PtrMetadata(copy _4);
        StorageDead(_4);
        goto -> bb4;
    }

    bb4: {
        return;
    }
}

process_box_option

#[unsafe(no_mangle)]
pub fn process_box_option(value: Option<Box<i32>>) -> i32 {
    match value {
        Some(boxed) => *boxed,
        None => -1,
    }
}

A Box<i32> is a single pointer to a heap-allocated block and can never be null in safe code. Therefore, the match can be compiled to a null check. If the pointer is null, -1 is returned. Otherwise, the pointer is dereferenced, the heap block is deallocated and the value is returned.
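
The same steps can be spelled out on a raw pointer. This is a sketch under the assumption that the argument is exactly the pointer stored inside the Option<Box<i32>>; the function name process_box_option_model is hypothetical.

```rust
/// Hypothetical model operating on the raw pointer behind Option<Box<i32>>.
///
/// # Safety
/// `ptr` must be null or must come from `Box::into_raw(Box::new(..))`.
pub unsafe fn process_box_option_model(ptr: *mut i32) -> i32 {
    // null check: None
    if ptr.is_null() {
        return -1;
    }
    // Some: take ownership back so that the allocation is freed on drop
    let boxed = unsafe { Box::from_raw(ptr) };
    // copy the i32 out; `boxed` is dropped (deallocated) when the function returns
    *boxed
}
```

A caller would obtain a suitable pointer with Box::into_raw(Box::new(21)) or pass std::ptr::null_mut() for the None case.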

Listing:

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined process_box_option()
             undefined         <UNASSIGNED>   <RETURN>
             undefined8        Stack[-0x10]:8 local_10                                XREF[3]:     00100060(W), 
                                                                                                   00100080(R), 
                                                                                                   00100094(R)  
             undefined8        Stack[-0x20]:8 local_20                                XREF[3]:     0010005c(W), 
                                                                                                   00100084(*), 
                                                                                                   00100098(*)  
                             process_box_option                              XREF[3]:     Entry Point(*), 00100128(*), 
                                                                                          _elfSectionHeaders::000001d0(*)  
        0010005c fd 7b be a9     stp        x29,x30,[sp, #local_20]!
        00100060 f3 0b 00 f9     str        x19,[sp, #local_10]
        00100064 fd 03 00 91     mov        x29,sp
                             null check
        00100068 20 01 00 b4     cbz        x0,LAB_0010008c
                             dereference
        0010006c 13 00 40 b9     ldr        w19,[x0]
        00100070 81 00 80 52     mov        w1,#0x4
        00100074 82 00 80 52     mov        w2,#0x4
        00100078 e4 03 00 94     bl         <EXTERNAL>::__rustc[a3537046f032bc96]::__rust_   undefined __rust_dealloc()
        0010007c e0 03 13 2a     mov        w0,w19
        00100080 f3 0b 40 f9     ldr        x19,[sp, #local_10]
        00100084 fd 7b c2 a8     ldp        x29=>local_20,x30,[sp], #0x20
        00100088 c0 03 5f d6     ret
                             LAB_0010008c                                    XREF[1]:     00100068(j)  
        0010008c 13 00 80 12     mov        w19,#0xffffffff
        00100090 e0 03 13 2a     mov        w0,w19
        00100094 f3 0b 40 f9     ldr        x19,[sp, #local_10]
        00100098 fd 7b c2 a8     ldp        x29=>local_20,x30,[sp], #0x20
        0010009c c0 03 5f d6     ret

Result

Source

Initialize a new workspace with cargo init --lib.

pub enum MathError {
    DivisionByZero,
    Overflow,
}

#[unsafe(no_mangle)]
pub fn divide_with_str_error(dividend: i32, divisor: i32) -> Result<i32, &'static str> {
    if divisor == 0 {
        Err("Division by zero")
    } else {
        Ok(dividend / divisor)
    }
}

#[unsafe(no_mangle)]
pub fn divide_with_enum_error(a: i32, b: i32) -> Result<i32, MathError> {
    if b == 0 {
        return Err(MathError::DivisionByZero);
    }

    if a == i32::MIN && b == -1 {
        return Err(MathError::Overflow);
    }

    Ok(a / b)
}

#[unsafe(no_mangle)]
pub fn process_result_str_error(value: Result<i32, &str>) -> i32 {
    match value {
        Ok(x) => x * 2,
        Err(_) => -1,
    }
}

#[unsafe(no_mangle)]
pub fn process_result_box_enum(value: Result<Box<i32>, MathError>) -> i32 {
    match value {
        Err(MathError::DivisionByZero) => -1,
        Err(MathError::Overflow) => -2,
        Ok(value) => *value,
    }
}

The Result type is described in the official docs in detail. We are using no_mangle to simplify things; the reason can be found here.

Build

$ cargo rustc --release -- --emit obj

Ghidra

Load the .o file (located at target/aarch64-unknown-linux-musl/release/deps/) into Ghidra and auto-analyze it.

Layout

Result is an enum type which is conceptually a tagged union with a discriminant and data (refer to chapter enum for more information about enums). The layout is unspecified, which, as with Option, enables certain niche optimizations.

divide_with_str_error

#[unsafe(no_mangle)]
pub fn divide_with_str_error(dividend: i32, divisor: i32) -> Result<i32, &'static str> {
    if divisor == 0 {
        Err("Division by zero")
    } else {
        Ok(dividend / divisor)
    }
}

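If we look at the layout, we see that there is no discriminant field. Since the compiler knows that references always point to valid locations, it can use the null value as a discriminant. &str is a fat reference (16 bytes: pointer + length), therefore variant Err is 16 bytes. In variant Ok, the 8 bytes reported as padding overlap the pointer slot, which is set to null to mark Ok, and the i32 value follows at offset 8.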

$ cargo rustc --release --quiet -- -Z print-type-sizes
...
print-type-size type: `std::result::Result<i32, &str>`: 16 bytes, alignment: 8 bytes
print-type-size     variant `Err`: 16 bytes
print-type-size         field `.0`: 16 bytes
print-type-size     variant `Ok`: 12 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.0`: 4 bytes, alignment: 4 bytes
...

Before looking at the assembly, we must mention that the compiler automatically generates a check that causes a panic if the result would overflow. For i32, there is only one such scenario: dividend is i32::MIN and divisor is -1.
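
The overflow case exists because i32::MIN is -2^31 while i32::MAX is 2^31 - 1, so the mathematical result 2^31 cannot be represented. A short sketch using the standard checked_div method shows both failure cases without panicking; the wrapper name checked_divide is an assumption for illustration.

```rust
/// Division that reports both failure cases as None instead of panicking.
pub fn checked_divide(dividend: i32, divisor: i32) -> Option<i32> {
    // checked_div returns None for a zero divisor and for i32::MIN / -1
    dividend.checked_div(divisor)
}
```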

Listing:

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined divide_with_str_error()
             undefined         <UNASSIGNED>   <RETURN>
             undefined8        Stack[-0x10]:8 local_10                                XREF[1]:     0010003c(W)  
                             check if divisor is 0
                             divide_with_str_error                           XREF[4]:     Entry Point(*), 0010015c(*), 
                                                                                          _elfSectionHeaders::00000090(*), 
                                                                                          _elfSectionHeaders::000000d0(*)  
        00100000 41 01 00 34     cbz        w1,LAB_00100028
                             i32::MIN
        00100004 09 00 b0 52     mov        w9,#0x80000000
                             check if dividend is i32::MIN
        00100008 1f 00 09 6b     cmp        w0,w9
        0010000c 61 00 00 54     b.ne       LAB_00100018
                             check if divisor is -1
        00100010 3f 04 00 31     cmn        w1,#0x1
        00100014 40 01 00 54     b.eq       LAB_0010003c
                             LAB_00100018                                    XREF[1]:     0010000c(j)  
        00100018 09 0c c1 1a     sdiv       w9,w0,w1
                             Ok
        0010001c 1f 01 00 f9     str        xzr,[x8]
        00100020 09 09 00 b9     str        w9,[x8, #0x8]
        00100024 c0 03 5f d6     ret
                             LAB_00100028                                    XREF[1]:     00100000(j)  
        00100028 09 00 00 90     adrp       x9,0x100000
        0010002c 29 21 04 91     add        x9,x9,#0x108
                             length of "Division by zero"
        00100030 0a 02 80 52     mov        w10,#0x10
                             Err
        00100034 09 29 00 a9     stp        x9=>s_Division_by_zero_00100108,x10,[x8]         = "Division by zero"
        00100038 c0 03 5f d6     ret
                             LAB_0010003c                                    XREF[1]:     00100014(j)  
        0010003c fd 7b bf a9     stp        x29,x30,[sp, #local_10]!
        00100040 fd 03 00 91     mov        x29,sp
        00100044 00 00 00 90     adrp       x0,0x100000
        00100048 00 a0 04 91     add        x0=>PTR_DAT_00100128,x0,#0x128                   = 00100118
        0010004c ed 03 00 94     bl         <EXTERNAL>::core::panicking::panic_const::pani   undefined panic_const_div_overfl

divide_with_enum_error

pub enum MathError {
    DivisionByZero,
    Overflow,
}

#[unsafe(no_mangle)]
pub fn divide_with_enum_error(a: i32, b: i32) -> Result<i32, MathError> {
    if b == 0 {
        return Err(MathError::DivisionByZero);
    }

    if a == i32::MIN && b == -1 {
        return Err(MathError::Overflow);
    }

    Ok(a / b)
}

Here we have a discriminant which is either followed by the value (Ok) or the error type (Err).

$ cargo rustc --release --quiet -- -Z print-type-sizes
...
print-type-size type: `std::result::Result<i32, MathError>`: 8 bytes, alignment: 4 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `Ok`: 7 bytes
print-type-size         padding: 3 bytes
print-type-size         field `.0`: 4 bytes, alignment: 4 bytes
print-type-size     variant `Err`: 1 bytes
print-type-size         field `.0`: 1 bytes
...

Listing:

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined divide_with_enum_error()
             undefined         <UNASSIGNED>   <RETURN>
                             check if divisor is 0
                             divide_with_enum_error                          XREF[3]:     Entry Point(*), 0010017c(*), 
                                                                                          _elfSectionHeaders::00000150(*)  
        00100050 41 01 00 34     cbz        w1,LAB_00100078
                             i32::MIN
        00100054 08 00 b0 52     mov        w8,#0x80000000
                             check if dividend is i32::MIN
        00100058 1f 00 08 6b     cmp        w0,w8
        0010005c 41 01 00 54     b.ne       LAB_00100084
                             check if divisor is -1
        00100060 3f 04 00 31     cmn        w1,#0x1
        00100064 01 01 00 54     b.ne       LAB_00100084
                             Overflow
        00100068 08 20 80 52     mov        w8,#0x100
                             Err
        0010006c 29 00 80 52     mov        w9,#0x1
                             Err(Overflow) encoded
        00100070 00 01 09 aa     orr        x0,x8,x9
        00100074 c0 03 5f d6     ret
                             Err
                             LAB_00100078                                    XREF[1]:     00100050(j)  
        00100078 29 00 80 52     mov        w9,#0x1
                             Err(DivisionByZero) encoded
        0010007c e0 03 09 aa     mov        x0,x9
        00100080 c0 03 5f d6     ret
                             LAB_00100084                                    XREF[2]:     0010005c(j), 00100064(j)  
        00100084 08 0c c1 1a     sdiv       w8,w0,w1
                             Ok (value is upper 4 bytes)
        00100088 08 7d 60 d3     lsl        x8,x8,#0x20
        0010008c 00 01 1f aa     orr        x0,x8,xzr
        00100090 c0 03 5f d6     ret
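
The 8-byte Result is returned in x0, and the encoding visible in the listing can be summarized with a small model. This is a sketch of the observed bit patterns only; the function name encode_in_x0 is hypothetical, and since the layout is unspecified, other compiler versions may choose a different encoding.

```rust
pub enum MathError {
    DivisionByZero,
    Overflow,
}

/// Hypothetical model of the value returned in x0 by divide_with_enum_error.
pub fn encode_in_x0(result: Result<i32, MathError>) -> u64 {
    match result {
        // byte 0 = 0 (Ok), bytes 4..8 = value (lsl x8, x8, #0x20)
        Ok(value) => (value as u32 as u64) << 32,
        // byte 0 = 1 (Err), byte 1 = 0 (DivisionByZero)
        Err(MathError::DivisionByZero) => 0x1,
        // byte 0 = 1 (Err), byte 1 = 1 (Overflow): 0x100 | 0x1
        Err(MathError::Overflow) => 0x100 | 0x1,
    }
}
```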

process_result_str_error

#[unsafe(no_mangle)]
pub fn process_result_str_error(value: Result<i32, &str>) -> i32 {
    match value {
        Ok(x) => x * 2,
        Err(_) => -1,
    }
}

The layout of the Result is the same as in divide_with_str_error. If the first 8 bytes are 0, the variant is Ok and the value is multiplied by 2 and returned. Otherwise, -1 is returned (wzr inverted: 0xffffffff).

Listing:

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined process_result_str_error()
             undefined         <UNASSIGNED>   <RETURN>
                             process_result_str_error                        XREF[3]:     Entry Point(*), 00100190(*), 
                                                                                          _elfSectionHeaders::00000190(*)  
        00100094 08 08 40 b9     ldr        w8,[x0, #0x8]
        00100098 09 00 40 f9     ldr        x9,[x0]
        0010009c 08 79 1f 53     lsl        w8,w8,#0x1
        001000a0 3f 01 00 f1     cmp        x9,#0x0
        001000a4 00 01 9f 5a     csinv      w0,w8,wzr,eq
        001000a8 c0 03 5f d6     ret
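
The csinv instruction selects its first source operand if the condition holds and the bitwise NOT of the second one otherwise. A sketch of that behavior and of the whole function follows; the names csinv and process_result_str_error_model and the parameter split are assumptions for illustration.

```rust
/// csinv Wd, Wn, Wm, cond: Wn if the condition holds, otherwise NOT Wm.
pub fn csinv(cond: bool, wn: u32, wm: u32) -> u32 {
    if cond { wn } else { !wm }
}

/// Hypothetical model of the listing: `first_word` is the pointer slot at
/// offset 0, `value` is the i32 stored at offset 8.
pub fn process_result_str_error_model(first_word: u64, value: i32) -> i32 {
    // lsl w8, w8, #1
    let doubled = value.wrapping_mul(2) as u32;
    // cmp x9, #0 followed by csinv w0, w8, wzr, eq
    // wzr reads as 0, so the inverted value is 0xffffffff, which is -1 as i32
    csinv(first_word == 0, doubled, 0) as i32
}
```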

process_result_box_enum

pub enum MathError {
    DivisionByZero,
    Overflow,
}

#[unsafe(no_mangle)]
pub fn process_result_box_enum(value: Result<Box<i32>, MathError>) -> i32 {
    match value {
        Err(MathError::DivisionByZero) => -1,
        Err(MathError::Overflow) => -2,
        Ok(value) => *value,
    }
}

Box<T> is a thin pointer to a heap location; the memory is automatically deallocated by __rust_dealloc. For more information, refer to chapter Box. The layout is similar to what we saw in the case of divide_with_enum_error. The difference is that the size of the Ok data is 8 bytes instead of 4.

$ cargo rustc --release --quiet -- -Z print-type-sizes
...
print-type-size type: `std::result::Result<std::boxed::Box<i32>, MathError>`: 16 bytes, alignment: 8 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `Ok`: 15 bytes
print-type-size         padding: 7 bytes
print-type-size         field `.0`: 8 bytes, alignment: 8 bytes
print-type-size     variant `Err`: 1 bytes
print-type-size         field `.0`: 1 bytes
...

Listing:

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined process_result_box_enum()
             undefined         <UNASSIGNED>   <RETURN>
             undefined8        Stack[-0x10]:8 local_10                                XREF[3]:     001000b0(W), 
                                                                                                   001000d8(R), 
                                                                                                   001000fc(R)  
             undefined8        Stack[-0x20]:8 local_20                                XREF[3]:     001000ac(W), 
                                                                                                   001000dc(*), 
                                                                                                   00100100(*)  
                             process_result_box_enum                         XREF[3]:     Entry Point(*), 001001a4(*), 
                                                                                          _elfSectionHeaders::000001d0(*)  
        001000ac fd 7b be a9     stp        x29,x30,[sp, #local_20]!
        001000b0 f3 0b 00 f9     str        x19,[sp, #local_10]
        001000b4 fd 03 00 91     mov        x29,sp
                             load discriminant
        001000b8 08 00 40 39     ldrb       w8,[x0]
                             Err
        001000bc 1f 05 00 71     cmp        w8,#0x1
        001000c0 21 01 00 54     b.ne       LAB_001000e4
        001000c4 08 04 40 39     ldrb       w8,[x0, #0x1]
                             DivisionByZero
        001000c8 1f 01 00 71     cmp        w8,#0x0
        001000cc 28 00 80 12     mov        w8,#0xfffffffe
                             if DivisionByZero: ret val is 0xfffffffe + 1 = -1,
                             otherwise (Overflow): 0xfffffffe = -2
        001000d0 13 15 88 1a     cinc       w19,w8,eq
        001000d4 e0 03 13 2a     mov        w0,w19
        001000d8 f3 0b 40 f9     ldr        x19,[sp, #local_10]
        001000dc fd 7b c2 a8     ldp        x29=>local_20,x30,[sp], #0x20
        001000e0 c0 03 5f d6     ret
                             arg0: ptr
                             LAB_001000e4                                    XREF[1]:     001000c0(j)  
        001000e4 00 04 40 f9     ldr        x0,[x0, #0x8]
                             arg1: size
        001000e8 81 00 80 52     mov        w1,#0x4
                             arg2: align
        001000ec 82 00 80 52     mov        w2,#0x4
                             save value
        001000f0 13 00 40 b9     ldr        w19,[x0]
        001000f4 c5 03 00 94     bl         <EXTERNAL>::__rustc[eb192786f4da5ea1]::__rust_   undefined __rust_dealloc()
        001000f8 e0 03 13 2a     mov        w0,w19
        001000fc f3 0b 40 f9     ldr        x19,[sp, #local_10]
        00100100 fd 7b c2 a8     ldp        x29=>local_20,x30,[sp], #0x20
        00100104 c0 03 5f d6     ret

Declarative macros

Source

Initialize a new workspace with cargo init --lib.

pub fn declarative_macro_vec_empty() -> Vec<i32> {
    let vec: Vec<i32> = vec![];
    vec
}

pub fn declarative_macro_vec_repeat() -> Vec<i32> {
    let vec: Vec<i32> = vec![1; 10];
    vec
}

pub fn declarative_macro_vec_list() -> Vec<i32> {
    let vec: Vec<i32> = vec![1, 2, 3];
    vec
}

Declarative macros are defined with the macro_rules! language construct and they work through pattern matching on the syntax tree. The implementation handling the macro compilation can be found here and here.

Since declarative macros can be expanded to arbitrary Rust code based on the implementation of the macro, we will focus on the expanded Rust code rather than the generated binary files in this chapter.

If we look at the vec! macro as an example, we see that it has 3 arms:

  • the first matches vec![]
  • the second matches vec![1; 10]
  • the third matches vec![1, 2, 3]

macro_rules! vec {
    () => (
        $crate::vec::Vec::new()
    );
    ($elem:expr; $n:expr) => (
        $crate::vec::from_elem($elem, $n)
    );
    ($($x:expr),+ $(,)?) => (
        <[_]>::into_vec(
            // Using the intrinsic produces a dramatic improvement in stack usage for
            // unoptimized programs using this code path to construct large Vecs.
            $crate::boxed::box_new([$($x),+])
        )
    );
}
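
To experiment with the arm matching outside the standard library, a macro with the same three-arm shape can be written using only stable APIs. The macro name my_vec and the helper functions are hypothetical, and the bodies are simplified stand-ins rather than the real vec! implementation.

```rust
// Hypothetical macro mirroring the three arms of vec! with stable APIs only.
macro_rules! my_vec {
    // my_vec![]
    () => {
        Vec::new()
    };
    // my_vec![elem; n]
    ($elem:expr; $n:expr) => {{
        let mut v = Vec::new();
        v.resize($n, $elem);
        v
    }};
    // my_vec![a, b, c], with an optional trailing comma
    ($($x:expr),+ $(,)?) => {
        <[_]>::into_vec(Box::new([$($x),+]))
    };
}

pub fn empty() -> Vec<i32> {
    my_vec![]
}

pub fn repeat() -> Vec<i32> {
    my_vec![1; 10]
}

pub fn list() -> Vec<i32> {
    my_vec![1, 2, 3]
}
```

The arms are tried in order, so my_vec![1; 10] is handled by the second arm, and my_vec![1, 2, 3] falls through to the third arm once the `;` separator fails to match.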

declarative_macro_vec_empty

pub fn declarative_macro_vec_empty() -> Vec<i32> {
    let vec: Vec<i32> = vec![];
    vec
}

$ cargo expand
...
pub fn declarative_macro_vec_empty() -> Vec<i32> {
    let vec: Vec<i32> = ::alloc::vec::Vec::new();
    vec
}

declarative_macro_vec_repeat

pub fn declarative_macro_vec_repeat() -> Vec<i32> {
    let vec: Vec<i32> = vec![1; 10];
    vec
}

$ cargo expand
...
pub fn declarative_macro_vec_repeat() -> Vec<i32> {
    let vec: Vec<i32> = ::alloc::vec::from_elem(1, 10);
    vec
}

declarative_macro_vec_list

pub fn declarative_macro_vec_list() -> Vec<i32> {
    let vec: Vec<i32> = vec![1, 2, 3];
    vec
}

$ cargo expand
...
pub fn declarative_macro_vec_list() -> Vec<i32> {
    let vec: Vec<i32> = <[_]>::into_vec(::alloc::boxed::box_new([1, 2, 3]));
    vec
}

Built-in declarative macros

Source

Initialize a new workspace with cargo init --lib.

pub fn format_args_built_in() {
    println!("Hello, world!");
}

Built-in declarative macros are a special kind of declarative macro. They are marked with #[rustc_builtin_macro] and expanded by an internal expander function. The list of the built-in macros can be found here.

If we look at the format_args! macro as an example, we see that it is marked with #[rustc_builtin_macro] and that the expander function is expand_format_args, which calls expand_format_args_impl.

#[stable(feature = "rust1", since = "1.0.0")]
#[rustc_diagnostic_item = "format_args_macro"]
#[allow_internal_unsafe]
#[allow_internal_unstable(fmt_internals)]
#[rustc_builtin_macro]
#[macro_export]
macro_rules! format_args {
    ($fmt:expr) => {{ /* compiler built-in */ }};
    ($fmt:expr, $($args:tt)*) => {{ /* compiler built-in */ }};
}

format_args_built_in

pub fn format_args_built_in() {
    println!("Hello, world!");
}

Built-in macros are not expanded to Rust code, which means cargo expand cannot show us their expanded form:

$ cargo expand
...
pub fn format_args_built_in() {
    {
        ::std::io::_print(format_args!("Hello, world!\n"));
    };
}

Instead, they are expanded in the HIR:

$ cargo rustc --release --quiet -- -Z unpretty=hir
...
fn format_args_built_in() {
    { ::std::io::_print(format_arguments::new_const(&["Hello, world!\n"])); };
}
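
The value produced by format_args! is a std::fmt::Arguments, which println! hands to _print as shown above. The same value can be consumed directly with std::fmt::format, as in the following sketch; the function name greet is an assumption for illustration.

```rust
use std::fmt;

/// Builds a String from the fmt::Arguments value produced by format_args!.
pub fn greet(name: &str) -> String {
    // format_args! yields a fmt::Arguments that borrows its inputs;
    // fmt::format consumes it and allocates the final String.
    fmt::format(format_args!("Hello, {name}!"))
}
```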

Procedural macros

Source

Initialize a new workspace with cargo init --lib and add tokio to the dependencies via cargo add tokio --features full.

#[tokio::main]
async fn proc_macro_main() {}

Procedural macros can manipulate the syntax tree directly. They work with TokenStreams which are sequences of tokens representing Rust code. A proc macro receives a TokenStream as input and returns a TokenStream as output.

Since proc macros can be expanded to arbitrary Rust code based on the implementation of the macro, we will focus on the expanded Rust code rather than the generated binary files in this chapter.

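If we look at the #[tokio::main] attribute proc macro as an example, we see that it expands to code like the following. The exact expansion depends on the options passed to the attribute: this variant builds a current-thread runtime with a custom panic policy, whereas the plain attribute used in proc_macro_main builds a multi-threaded runtime.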

fn main() {
    tokio::runtime::Builder::new_current_thread()
        .enable_all()
        .unhandled_panic(UnhandledPanic::ShutdownRuntime)
        .build()
        .unwrap()
        .block_on(async {
            let _ = tokio::spawn(async {
                panic!("This panic will shutdown the runtime.");
            }).await;
        })
}

proc_macro_main

#[tokio::main]
async fn proc_macro_main() {}

$ cargo expand
...
fn proc_macro_main() {
    let body = async {};
    #[allow(
        clippy::expect_used,
        clippy::diverging_sub_expression,
        clippy::needless_return
    )]
    {
        return tokio::runtime::Builder::new_multi_thread()
            .enable_all()
            .build()
            .expect("Failed building the Runtime")
            .block_on(body);
    }
}

Built-in attributes

Source

Initialize a new workspace with cargo init --lib.

#[derive(Clone)]
#[repr(C)]
pub struct Person {
    name: String,
    age: u32,
}

pub fn attribute_person() -> Person {
    let person = Person {
        name: "Rustacean".to_string(),
        age: 22,
    };
    person
}

Attributes are metadata attached either to the containing item (inner attribute) or to the following item (outer attribute). There are many types of built-in attributes, but from a code generation point of view, we only care about the ones that directly influence the generated code, such as derive, repr or inline: derive generates additional code, repr affects the memory layout and inline is passed as a hint to the LLVM backend.
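
The effect of repr on the memory layout can be observed with std::mem::size_of. The two structs below are hypothetical examples chosen so that field order matters; only the repr(C) size is fixed by the language, while the size of the default representation is unspecified.

```rust
/// With repr(C), fields are laid out in declaration order using C padding rules:
/// a at offset 0, b at offset 4, c at offset 8, padded to 12 bytes in total.
#[repr(C)]
pub struct OrderedC {
    pub a: u8,
    pub b: u32,
    pub c: u8,
}

/// Without repr(C), the compiler is free to reorder the fields,
/// which usually results in less padding.
pub struct Reorderable {
    pub a: u8,
    pub b: u32,
    pub c: u8,
}

/// Returns (size of OrderedC, size of Reorderable).
pub fn layout_sizes() -> (usize, usize) {
    (
        std::mem::size_of::<OrderedC>(),
        std::mem::size_of::<Reorderable>(),
    )
}
```

Running cargo rustc with -Z print-type-sizes on these structs shows the chosen field order and padding for the default representation.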

attribute_person

#[derive(Clone)]
#[repr(C)]
pub struct Person {
    name: String,
    age: u32,
}

pub fn attribute_person() -> Person {
    let person = Person {
        name: "Rustacean".to_string(),
        age: 22,
    };
    person
}

$ cargo expand
...
#[repr(C)]
pub struct Person {
    name: String,
    age: u32,
}
#[automatically_derived]
impl ::core::clone::Clone for Person {
    #[inline]
    fn clone(&self) -> Person {
        Person {
            name: ::core::clone::Clone::clone(&self.name),
            age: ::core::clone::Clone::clone(&self.age),
        }
    }
}
pub fn attribute_person() -> Person {
    let person = Person {
        name: "Rustacean".to_string(),
        age: 22,
    };
    person
}

Macro attributes

Attributes are metadata either attached to the containing item (inner attribute) or attached to the following item (outer attribute). To extend the built-in attributes, proc macros must be used. This enables implementation of both proc macro attributes and derive macro helper attributes.

Introduction

Source

Initialize a new workspace with cargo init, then run cargo add tokio --features full.

use tokio::time::{Duration, sleep};

#[tokio::main]
async fn main() {
    make_coffee().await;
    toast_bread().await;
}

async fn make_coffee() {
    sleep(Duration::from_secs(3)).await;
}

async fn toast_bread() {
    sleep(Duration::from_secs(2)).await;
}

The Tokio crate, the async/await keywords and the Future trait are described in many places online. Some of the best are the long and detailed explanations by Jon Gjengset. This book assumes you are already familiar with these async constructs.

Build

$ cargo rustc --release

Ghidra

Load the ELF file (located at target/aarch64-unknown-linux-musl/release/) into Ghidra and auto-analyze it.

Locating the async tasks

The first thing to do is locate the implemented async tasks and their relationships. To start, we can see in the expanded code how the runtime is constructed and our tasks are passed to block_on.

$ cargo expand
#![feature(prelude_import)]
#[prelude_import]
use std::prelude::rust_2024::*;
#[macro_use]
extern crate std;
use tokio::time::{Duration, sleep};
fn main() {
    let body = async {
        make_coffee().await;
        toast_bread().await;
    };
    #[allow(
        clippy::expect_used,
        clippy::diverging_sub_expression,
        clippy::needless_return
    )]
    {
        return tokio::runtime::Builder::new_multi_thread()
            .enable_all()
            .build()
            .expect("Failed building the Runtime")
            .block_on(body);
    }
}
async fn make_coffee() {
    sleep(Duration::from_secs(3)).await;
}
async fn toast_bread() {
    sleep(Duration::from_secs(2)).await;
}

Based on this output, we expect to see the runtime built by the new_multi_thread(), enable_all(), build() and expect() sequence and the tasks executed by block_on.

Decompiled code:


/* WARNING: Unknown calling convention: __rustcall */
/* rust_lab::main */

void __rustcall rust_lab::main(void)

{
...
  undefined1 auStack_248 [205];
  undefined2 local_17b;
...
                    /* try { // try from 0040a6b4 to 0040a6bb has its CatchHandler @ 0040a90c */
  tokio::runtime::builder::Builder::new_multi_thread(auStack_248);
  local_17b = 0x101;
                    /* try { // try from 0040a6cc to 0040a6d7 has its CatchHandler @ 0040a914 */
  tokio::runtime::builder::Builder::build(&local_d0,auStack_248);
  if (local_d0 == 2) {
    local_170[0] = lStack_c8;
                    /* try { // try from 0040a884 to 0040a8a7 has its CatchHandler @ 0040a91c */
                    /* WARNING: Subroutine does not return */
    core::result::unwrap_failed
              ("Failed building the Runtime",0x1b,local_170,
               &PTR_drop_in_place<std::io::error::Error>_004baf90,&PTR_s_src/main.rs_004bb040);
  }
...

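The enable_all function call is nowhere to be seen. The reason is that it has been inlined and replaced by local_17b = 0x101;, a single 2-byte store that writes 1 into both the enable_io and enable_time fields (at offsets 205 and 206). The offset follows from the field sizes below: the fields up to and including kind occupy 205 bytes.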

$ cargo rustc --release -- -Zprint-type-sizes
...
print-type-size type: `tokio::runtime::Builder`: 216 bytes, alignment: 8 bytes
print-type-size     field `.worker_threads`: 16 bytes
print-type-size     field `.thread_stack_size`: 16 bytes
print-type-size     field `.global_queue_interval`: 8 bytes
print-type-size     field `.keep_alive`: 16 bytes
print-type-size     field `.thread_name`: 16 bytes
print-type-size     field `.nevents`: 8 bytes
print-type-size     field `.max_blocking_threads`: 8 bytes
print-type-size     field `.after_start`: 16 bytes
print-type-size     field `.before_stop`: 16 bytes
print-type-size     field `.before_park`: 16 bytes
print-type-size     field `.after_unpark`: 16 bytes
print-type-size     field `.before_spawn`: 16 bytes
print-type-size     field `.after_termination`: 16 bytes
print-type-size     field `.seed_generator`: 16 bytes
print-type-size     field `.event_interval`: 4 bytes
print-type-size     field `.kind`: 1 bytes
print-type-size     field `.enable_io`: 1 bytes
print-type-size     field `.enable_time`: 1 bytes
print-type-size     field `.start_paused`: 1 bytes
print-type-size     field `.disable_lifo_slot`: 1 bytes
print-type-size     field `.metrics_poll_count_histogram_enable`: 1 bytes
print-type-size     field `.metrics_poll_count_histogram`: 0 bytes
print-type-size     end padding: 6 bytes
...
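The offsets can be cross-checked by summing the field sizes in the order they are printed. The sketch below assumes that the output lists the fields in layout order and that any padding would appear as its own line (none is shown before the end padding). It also assumes that Ghidra's stack variable names encode frame offsets, so that the distance between auStack_248 (the builder) and local_17b is the field offset. FIELDS and offset_of are names made up for this illustration.

```rust
// (field name, size in bytes), copied from the -Zprint-type-sizes output above.
const FIELDS: &[(&str, usize)] = &[
    ("worker_threads", 16),
    ("thread_stack_size", 16),
    ("global_queue_interval", 8),
    ("keep_alive", 16),
    ("thread_name", 16),
    ("nevents", 8),
    ("max_blocking_threads", 8),
    ("after_start", 16),
    ("before_stop", 16),
    ("before_park", 16),
    ("after_unpark", 16),
    ("before_spawn", 16),
    ("after_termination", 16),
    ("seed_generator", 16),
    ("event_interval", 4),
    ("kind", 1),
    ("enable_io", 1),
    ("enable_time", 1),
    ("start_paused", 1),
    ("disable_lifo_slot", 1),
    ("metrics_poll_count_histogram_enable", 1),
    ("metrics_poll_count_histogram", 0),
];

/// Offset of a field, assuming the fields are laid out back to back in printed order.
fn offset_of(name: &str) -> Option<usize> {
    let mut offset = 0;
    for (field, size) in FIELDS {
        if *field == name {
            return Some(offset);
        }
        offset += size;
    }
    None
}

fn main() {
    println!("enable_io   at offset {:?}", offset_of("enable_io"));
    println!("enable_time at offset {:?}", offset_of("enable_time"));
    // Distance between the builder (auStack_248) and the store target (local_17b).
    println!("0x248 - 0x17b = {}", 0x248 - 0x17b);
    // A little-endian two-byte store of 0x0101 writes 1 to both bytes.
    println!("bytes written: {:?}", 0x0101u16.to_le_bytes());
}
```

Both routes lead to the same byte: the summed sizes and the stack distance agree on where enable_io lives, and enable_time follows directly after it.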

Additionally, expect has been inlined: only its failure path remains, as a call to core::result::unwrap_failed.

Scrolling through the rest of main, we cannot find a direct call to block_on, because it is called via enter_runtime:

...
                    /* try { // try from 0040a6b4 to 0040a6bb has its CatchHandler @ 0040a90c */
  tokio::runtime::builder::Builder::new_multi_thread(auStack_248);
  local_17b = 0x101;
                    /* try { // try from 0040a6cc to 0040a6d7 has its CatchHandler @ 0040a914 */
  tokio::runtime::builder::Builder::build(&local_d0,auStack_248);
  if (local_d0 == 2) {
    local_170[0] = lStack_c8;
                    /* try { // try from 0040a884 to 0040a8a7 has its CatchHandler @ 0040a91c */
                    /* WARNING: Subroutine does not return */
    core::result::unwrap_failed
              ("Failed building the Runtime",0x1b,local_170,
               &PTR_drop_in_place<std::io::error::Error>_004baf90,&PTR_s_src/main.rs_004bb040);
  }
...
                    /* try { // try from 0040a71c to 0040a727 has its CatchHandler @ 0040a8ec */
  tokio::runtime::runtime::Runtime::enter(&local_e8,&local_2a0);
  if ((int)local_2a0 == 1) {
    local_d0 = (ulong)uStack_31f << 8;
                    /* try { // try from 0040a758 to 0040a76f has its CatchHandler @ 0040a8c8 */
    tokio::runtime::context::runtime::enter_runtime
              (&uStack_270,1,&local_d0,
               &PTR_anon.e686db471eac9d0c22db85cdbc9be48c.37.llvm.11646938216170472302_004bafb0);
  }
...

/* WARNING: Unknown calling convention: __rustcall */
/* tokio::runtime::context::runtime::enter_runtime */

void __rustcall
tokio::runtime::context::runtime::enter_runtime
          (int *param_1,undefined4 param_2,undefined8 *param_3,undefined8 param_4)

{
...
      park::CachedParkThread::block_on(pppuVar5,&local_e0);
...

async tasks

Now that we have located block_on, we can investigate how the async tasks are handled.

Digging deeper and looking at the HIR, we can see how the async and await keywords are resolved. Simply put, each awaited future is polled in a loop until it is ready. While it is pending, the enclosing task yields, which signals to the runtime that it can handle other tasks and check back later.

$ cargo rustc --release --quiet -- -Z unpretty=hir
#[prelude_import]
use std::prelude::rust_2024::*;
#[macro_use]
extern crate std;
use ::{};
use tokio::time::Duration;
use tokio::time::sleep;

fn main() {
    let body =
        |mut _task_context: ResumeTy|
            {
                match #[lang = "into_future"](make_coffee()) {
                    mut __awaitee =>
                        loop {
                            match unsafe {
                                    #[lang = "poll"](#[lang = "new_unchecked"](&mut __awaitee),
                                        #[lang = "get_context"](_task_context))
                                } {
                                #[lang = "Ready"] {  0: result } => break result,
                                #[lang = "Pending"] {} => { }
                            }
                            _task_context = (yield ());
                        },
                };
                match #[lang = "into_future"](toast_bread()) {
                    mut __awaitee =>
                        loop {
                            match unsafe {
                                    #[lang = "poll"](#[lang = "new_unchecked"](&mut __awaitee),
                                        #[lang = "get_context"](_task_context))
                                } {
                                #[lang = "Ready"] {  0: result } => break result,
                                #[lang = "Pending"] {} => { }
                            }
                            _task_context = (yield ());
                        },
                };
            };
    #[allow(clippy :: expect_used, clippy :: diverging_sub_expression, clippy
    :: needless_return)]
    {
        return tokio::runtime::Builder::new_multi_thread().enable_all().build().expect("Failed building the Runtime").block_on(body);
    }
}

async fn make_coffee()
    ->
        /*impl Trait*/ |mut _task_context: ResumeTy|
    {
        {
            let _t =
                {
                    match #[lang = "into_future"](sleep(Duration::from_secs(3)))
                        {
                        mut __awaitee =>
                            loop {
                                match unsafe {
                                        #[lang = "poll"](#[lang = "new_unchecked"](&mut __awaitee),
                                            #[lang = "get_context"](_task_context))
                                    } {
                                    #[lang = "Ready"] {  0: result } => break result,
                                    #[lang = "Pending"] {} => { }
                                }
                                _task_context = (yield ());
                            },
                    };
                };
            _t
        }
    }

async fn toast_bread()
    ->
        /*impl Trait*/ |mut _task_context: ResumeTy|
    {
        {
            let _t =
                {
                    match #[lang = "into_future"](sleep(Duration::from_secs(2)))
                        {
                        mut __awaitee =>
                            loop {
                                match unsafe {
                                        #[lang = "poll"](#[lang = "new_unchecked"](&mut __awaitee),
                                            #[lang = "get_context"](_task_context))
                                    } {
                                    #[lang = "Ready"] {  0: result } => break result,
                                    #[lang = "Pending"] {} => { }
                                }
                                _task_context = (yield ());
                            },
                    };
                };
            _t
        }
    }
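The HIR above still expresses each await as a loop around poll with a yield in the Pending arm. Later in compilation, such a body is lowered to a state machine that implements Future. A hand-written approximation of that idea could look like the sketch below. Countdown, Breakfast and count_polls are hypothetical names; Countdown merely stands in for sleep and becomes ready after a fixed number of polls instead of after a duration.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for `sleep`: pending for `remaining` polls, then ready.
struct Countdown {
    remaining: u32,
}

impl Future for Countdown {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.remaining == 0 {
            return Poll::Ready(());
        }
        self.remaining -= 1;
        cx.waker().wake_by_ref(); // request another poll
        Poll::Pending
    }
}

/// Hand-written counterpart of two sequential awaits.
/// The enum variant records which await point has been reached.
enum Breakfast {
    Coffee(Countdown),
    Toast(Countdown),
    Done,
}

impl Future for Breakfast {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let this = self.get_mut(); // allowed because Breakfast is Unpin
        loop {
            match &mut *this {
                Breakfast::Coffee(coffee) => match Pin::new(coffee).poll(cx) {
                    // First await finished: advance to the second one.
                    Poll::Ready(()) => *this = Breakfast::Toast(Countdown { remaining: 2 }),
                    Poll::Pending => return Poll::Pending,
                },
                Breakfast::Toast(toast) => match Pin::new(toast).poll(cx) {
                    Poll::Ready(()) => {
                        *this = Breakfast::Done;
                        return Poll::Ready(());
                    }
                    Poll::Pending => return Poll::Pending,
                },
                Breakfast::Done => panic!("polled after completion"),
            }
        }
    }
}

/// Waker that does nothing; enough for a driver that polls in a busy loop.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future until it is ready and returns the number of polls needed.
fn count_polls<F: Future + Unpin>(mut fut: F) -> u32 {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if Pin::new(&mut fut).poll(&mut cx).is_ready() {
            return polls;
        }
    }
}

fn main() {
    let polls = count_polls(Breakfast::Coffee(Countdown { remaining: 3 }));
    println!("ready after {polls} polls");
}
```

Each call to poll resumes from the variant stored by the previous call, which is the role the yield points play in the HIR.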

Decompiled code:


/* WARNING: Unknown calling convention: __rustcall */
/* tokio::runtime::park::CachedParkThread::block_on */

bool __rustcall tokio::runtime::park::CachedParkThread::block_on(long param_1,char *param_2)

{
...
    do {
      if ( ... ) {
...
        tokio::time::sleep::sleep(3,0,&PTR_s_src/main.rs_004bb080);
...
        uVar5 = _<>::poll();
        if ((uVar5 & 1) == 0) {
...
          tokio::time::sleep::sleep(2,0,&PTR_s_src/main.rs_004bb0b0);
...
          goto LAB_0040b2b0;
        }
...
joined_r0x0040b25c:
        bVar2 = true;
...
      }
      else { ... }
LAB_0040b2b0:
...
        uVar5 = _<>::poll();
        if ((uVar5 & 1) != 0) {
...
          goto joined_r0x0040b25c;
        }
...
      if (!bVar2) goto LAB_0040b324;
      park(param_1);
...
    } while( true );
...
LAB_0040b358:
  return lVar6 == 0;
LAB_0040b324:
...
  goto LAB_0040b358;
}

Explanation:

  • call sleep(3)
  • call poll to check the progress
    • if it returns 0 (ready), then call sleep(2)
      • call poll to check the progress
        • if it returns 1 (pending), set bVar2 to true which means the thread will be parked and resumed later
    • if it returns 1 (pending), set bVar2 to true which means the thread will be parked and resumed later

Additionally, the generated code keeps state variables that track the progress of the tasks, so it knows where to resume once the thread is unparked. For readability, these variables have been removed from the decompiled code.
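The poll-then-park pattern of the decompiled loop can be reproduced with the standard library alone. The following is a minimal sketch, not Tokio's implementation: ThreadWaker, YieldOnce and this block_on are hypothetical names, and the waker simply unparks the thread that is blocked.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical leaf future: pending on the first poll, ready on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref(); // ask to be polled again
        Poll::Pending
    }
}

/// Polls the future; parks the current thread whenever it is pending.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            // Sleep until a waker calls unpark, then poll again.
            Poll::Pending => thread::park(),
        }
    }
}

fn main() {
    let answer = block_on(async {
        YieldOnce(false).await;
        YieldOnce(false).await;
        42
    });
    println!("{answer}");
}
```

Because park may also return without a matching unpark, the loop polls again after every wake-up instead of assuming the future is ready.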

Startup

This section shows how to locate the user-defined main function and trace the call chain that leads to its execution.

Source

Initialize a new workspace with cargo init.

The source code of chapter Hello, world! is reused here.

Build

$ cargo rustc --release

Ghidra

Load the binary into Ghidra and auto-analyze it.

Locating main

In an std environment (as opposed to no_std), the user-defined main function (here rust_lab::main) is called by lang_start_internal.

Call graph:

_start
    _start_c
        __libc_start_main
            main
                lang_start_internal
                    rust_lab::main

Decompiled code:

void main(int param_1,undefined8 param_2)

{
  code *pcStack_8;
  
  pcStack_8 = rust_lab::main;
  std::rt::lang_start_internal(&pcStack_8,&DAT_0046d6e8,(long)param_1,param_2,0);
  return;
}

lang_start_internal can be easily recognized, even if symbols are stripped. The first parameter is the rust_lab::main function being passed.

Listing:

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined main()
             undefined         <UNASSIGNED>   <RETURN>
             undefined8        Stack[-0x10]:8 local_10                                XREF[1]:     00401b38(W)  
                             main                                            XREF[5]:     Entry Point(*), 
                                                                                          _start_c:004019f4(*), 00453c74, 
                                                                                          004648c4(*), 0046ff90(*)  
        00401b28 08 00 00 90     adrp       x8,0x401000
                             adrp + add loads the address of rust_lab::main
        00401b2c 08 c1 2b 91     add        x8,x8,#0xaf0
                             arg3: argv
        00401b30 e3 03 01 aa     mov        x3,x1
                             arg2: argc
        00401b34 02 7c 40 93     sxtw       x2,w0
                             store link register and address of rust_lab::main on the stack
        00401b38 fe 23 bf a9     stp        x30,x8=>rust_lab::main,[sp, #local_10]!
        00401b3c 61 03 00 90     adrp       x1,0x46d000
                             arg1: vtable pointer of the trait object
        00401b40 21 a0 1b 91     add        x1=>DAT_0046d6e8,x1,#0x6e8
                             arg0: data pointer of the trait object
        00401b44 e0 23 00 91     add        x0,sp,#0x8
                             arg4: 0
        00401b48 e4 03 1f 2a     mov        w4,wzr
        00401b4c ad 5b 00 94     bl         std::rt::lang_start_internal                     undefined lang_start_internal()
        00401b50 fe 07 41 f8     ldr        x30,[sp], #0x10
        00401b54 c0 03 5f d6     ret

Understanding the lang_start_internal arguments

If we look at the signature of lang_start_internal, we can see that it accepts 4 arguments, but the decompiled code above shows 5.

// To reduce the generated code of the new `lang_start`, this function is doing
// the real work.
#[cfg(not(test))]
fn lang_start_internal(
    main: &(dyn Fn() -> i32 + Sync + crate::panic::RefUnwindSafe),
    argc: isize,
    argv: *const *const u8,
    sigpipe: u8,
) -> isize {
...
}

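This is because the first argument is a closure that is converted to a trait object when lang_start calls lang_start_internal. A reference to a trait object is a fat pointer that is passed as two values (data pointer and vtable pointer), so the four Rust-level arguments become five at the machine level.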

#[cfg(not(any(test, doctest)))]
#[lang = "start"]
fn lang_start<T: crate::process::Termination + 'static>(
    main: fn() -> T,
    argc: isize,
    argv: *const *const u8,
    sigpipe: u8,
) -> isize {
    lang_start_internal(
        &move || crate::sys::backtrace::__rust_begin_short_backtrace(main).report().to_i32(),
        argc,
        argv,
        sigpipe,
    )
}

Within the closure body, __rust_begin_short_backtrace is called, which then calls rust_lab::main.

/// Fixed frame used to clean the backtrace with `RUST_BACKTRACE=1`. Note that
/// this is only inline(never) when backtraces in std are enabled, otherwise
/// it's fine to optimize away.
#[cfg_attr(feature = "backtrace", inline(never))]
pub fn __rust_begin_short_backtrace<F, T>(f: F) -> T
where
    F: FnOnce() -> T,
{
    let result = f();

    // prevent this frame from being tail-call optimised away
    crate::hint::black_box(());

    result
}

Trait objects are represented by a data pointer (here: the address of the stack slot that holds the closure, whose only captured value is the address of rust_lab::main) and a vtable pointer (here: DAT_0046d6e8).

                             DAT_0046d6e8                                    XREF[1]:     main:00401b40(*)  
        0046d6e8 00              undefined1 00h
        0046d6e9 00              ??         00h
        ...
        0046d700 d8 1a 40        addr       core::ops::function::FnOnce::call_once{{vtable
                 00 00 00 
                 00 00
        0046d708 b0 1a 40        addr       std::rt::lang_start::_{{closure}}
                 00 00 00 
                 00 00
        0046d710 b0 1a 40        addr       std::rt::lang_start::_{{closure}}
                 00 00 00 
                 00 00
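The two-pointer representation can be observed from safe Rust. The sketch below uses made-up names (user_main, run) and only mirrors the shape of the call in lang_start: a closure that captures a single function pointer is passed by reference as a trait object.

```rust
use std::mem::{size_of, size_of_val};

// Stand-in for the user-defined main function.
fn user_main() -> i32 {
    0
}

// Same shape as the first parameter of lang_start_internal:
// a reference to a trait object.
fn run(main: &(dyn Fn() -> i32 + Sync)) -> i32 {
    main()
}

fn main() {
    let main_ptr: fn() -> i32 = user_main;
    // Like the closure in lang_start, this one captures only a function pointer.
    let closure = move || main_ptr();

    println!("closure:   {} bytes", size_of_val(&closure));
    println!("thin ref:  {} bytes", size_of::<&i32>());
    println!("&dyn Fn:   {} bytes", size_of::<&dyn Fn() -> i32>());
    println!("exit code: {}", run(&closure));
}
```

The closure is exactly as large as the function pointer it captures, which matches the single 8-byte stack slot in the listing, while the reference to the trait object is twice the size of a thin reference.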

If we look at the decompiled code of lang_start_internal, we can see which vtable entry it uses to execute rust_lab::main:


/* WARNING: Globals starting with '_' overlap smaller symbols at the same address */
/* WARNING: Unknown calling convention: __rustcall */
/* std::rt::lang_start_internal */

long __rustcall
std::rt::lang_start_internal
          (undefined8 param_1,long param_2,undefined8 param_3,undefined8 param_4,byte param_5)

{
...
  (**(code **)(param_2 + 0x28))(param_1);
...

Using this offset and the vtable address, we can calculate the address of the vtable entry which contains the address of std::rt::lang_start::_{{closure}}:

0x0046d6e8 + 0x28 = 0x0046d710

Decompiled code:

/* WARNING: Unknown calling convention: __rustcall */
/* std::rt::lang_start::_{{closure}} */

undefined8 __rustcall std::rt::lang_start::_{{closure}}(undefined8 *param_1)

{
  sys::backtrace::__rust_begin_short_backtrace(*param_1);
  return 0;
}
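The offset 0x28 can also be read as a slot index. The sketch below assumes the vtable layout that rustc currently appears to emit: three header words (drop_in_place, size, alignment) followed by the trait methods. This is an implementation detail rather than a guarantee, and the slot names are labels chosen for this illustration. The assumption is consistent with the dump above, where the first function pointer after the header sits at 0x0046d700.

```rust
const VTABLE: u64 = 0x0046_d6e8; // DAT_0046d6e8 from the listing
const WORD: u64 = 8; // pointer size on AArch64

// Assumed slot order; not guaranteed by the language.
const SLOTS: [&str; 6] = [
    "drop_in_place",
    "size",
    "align",
    "FnOnce::call_once",
    "FnMut::call_mut",
    "Fn::call",
];

/// Converts a byte offset into the vtable to a slot index.
fn slot_index(offset: u64) -> usize {
    (offset / WORD) as usize
}

/// Address of the vtable slot with the given index.
fn slot_address(index: usize) -> u64 {
    VTABLE + index as u64 * WORD
}

fn main() {
    for (index, name) in SLOTS.iter().enumerate() {
        println!("{:#010x}  slot {index}: {name}", slot_address(index));
    }
    let called = slot_index(0x28);
    println!("offset 0x28 selects slot {called} ({})", SLOTS[called]);
}
```

Under this assumption, lang_start_internal invokes the closure through the Fn::call entry, which is what we would expect for a parameter of type &dyn Fn.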

As we also saw earlier, __rust_begin_short_backtrace calls rust_lab::main in the end.

/* WARNING: Unknown calling convention: __rustcall */
/* std::sys::backtrace::__rust_begin_short_backtrace */

void __rustcall std::sys::backtrace::__rust_begin_short_backtrace(code *param_1)

{
  (*param_1)();
  return;
}

Panic: unwind vs abort

When a Rust program panics, it can handle the failure in two ways: unwind or abort. Unwind mode cleans up resources as the panic travels up the call stack, while abort mode immediately terminates the program. The chosen mode affects the generated code.

Source

Initialize a new workspace with cargo init.

The source code of chapter Vec is reused here.

Build

$ cargo rustc --release

By default the option panic=unwind is used.

Ghidra

Load the binary into Ghidra and auto-analyze it.

rust_lab::main

While scrolling through the listing, we can see that Ghidra adds try-catch comments to the following parts, and we notice XREFs from addresses that are far away.

                             try { // try from 00401958 to 00401967 has its CatchHandler @
                             LAB_00401958                                    XREF[1]:     00453a88(*)  
        00401958 61 03 00 90     adrp       x1,0x46d000
                             arg1: source location for debugging
        0040195c 21 80 21 91     add        x1=>PTR_s_src/main.rs_0046d860,x1,#0x860         = 0044aea0
                             arg0: address of vec
        00401960 e0 23 00 91     add        x0,sp,#0x8
                             increase cap from 3 to 8
        00401964 fd 0d 01 94     bl         alloc::raw_vec::RawVec<T,A>::grow_one            undefined grow_one()
                             } // end try from 00401958 to 00401967
                             catch() { ... } // from try @ 00401958 with catch @ 004019ac
                             LAB_004019ac                                    XREF[1]:     00453a8a(*)  
        004019ac e1 07 40 f9     ldr        x1,[sp, #0x8]
        004019b0 f3 03 00 aa     mov        x19,x0
        004019b4 81 00 00 b4     cbz        x1,LAB_004019c4
        004019b8 e0 0b 40 f9     ldr        x0,[sp, #0x10]
        004019bc 22 00 80 52     mov        w2,#0x1
        004019c0 14 00 00 94     bl         __rustc::__rust_dealloc                          void __rust_dealloc(u8 * ptr, us
                             LAB_004019c4                                    XREF[1]:     004019b4(j)  
        004019c4 e0 03 13 aa     mov        x0,x19
        004019c8 30 c7 00 94     bl         _Unwind_Resume                                   undefined _Unwind_Resume()
                             -- Flow Override: CALL_RETURN (CALL_TERMINATOR)

If we follow the XREFs, we can see they are under the LSDA (Language-Specific Data Area) located in section .gcc_except_table.

                             //
                             // .gcc_except_table 
                             // SHT_PROGBITS  [0x453a84 - 0x454c67]
                             // ram:00453a84-ram:00454c67
                             //
                             **************************************************************
                             * Language-Specific Data Area                                *
                             **************************************************************
...
        00453a88 44              uleb128    LAB_00401958                                     (LSDA Call Site) IP Offset
        00453a89 10              uleb128    10h                                              (LSDA Call Site) IP Range Length
        00453a8a 98 01           uleb128    LAB_004019ac                                     (LSDA Call Site) Landing Pad Add
        00453a8c 00              uleb128    0h                                               (LSDA Call Site) Action Table Of
...
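The raw bytes of this call-site record are ULEB128 values, and Ghidra has already resolved them to labels. To see where the labels come from, the record can be decoded by hand. The sketch below assumes that the offsets are relative to the start of the function; the start address 0x00401914 is not shown in the listing and is derived here as 0x00401958 - 0x44. read_uleb128, CallSite and decode_call_site are hypothetical helper names.

```rust
/// Decodes one unsigned LEB128 value; returns the value and the number of bytes read.
fn read_uleb128(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    // Nine groups of 7 bits are plenty for the small offsets in this table.
    for (i, &byte) in bytes.iter().enumerate().take(9) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

#[derive(Debug, PartialEq)]
struct CallSite {
    start: u64,
    end: u64,
    landing_pad: u64,
    action: u64,
}

/// Decodes one call-site record whose four fields are all ULEB128 encoded.
fn decode_call_site(bytes: &[u8], func_start: u64) -> Option<CallSite> {
    let mut fields = [0u64; 4];
    let mut pos = 0;
    for field in &mut fields {
        let (value, len) = read_uleb128(&bytes[pos..])?;
        *field = value;
        pos += len;
    }
    let [offset, length, landing_pad, action] = fields;
    Some(CallSite {
        start: func_start + offset,
        end: func_start + offset + length,
        landing_pad: func_start + landing_pad,
        action,
    })
}

fn main() {
    // Assumed function start, derived from LAB_00401958 and the offset byte 0x44.
    let func_start = 0x0040_1958 - 0x44;
    // Bytes at 00453a88..=00453a8c from the listing above.
    let record = [0x44, 0x10, 0x98, 0x01, 0x00];
    match decode_call_site(&record, func_start) {
        Some(site) => println!(
            "try {:#x}..{:#x}, landing pad {:#x}, action {}",
            site.start, site.end, site.landing_pad, site.action
        ),
        None => println!("truncated record"),
    }
}
```

The two-byte sequence 98 01 decodes to 0x98, and adding it to the derived function start yields the landing pad address that Ghidra displays.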

This means that if a panic unwinds through a call made between 0x00401958 and 0x00401958 + 0x10 = 0x00401968, the unwinder transfers control to the landing pad at 0x004019ac. There, if the vector's capacity is non-zero, the code deallocates the vector's buffer and then resumes unwinding via _Unwind_Resume.
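The same behaviour can be observed at the source level. In the sketch below, a value with a Drop implementation plays the role of the vector, and catch_unwind stops the unwinding so that the result can be inspected. Guard, CLEANED_UP and the two functions are hypothetical names, and the sketch assumes the default panic=unwind; with panic=abort the process would terminate at the panic instead.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Set by the destructor so we can tell whether cleanup ran.
static CLEANED_UP: AtomicBool = AtomicBool::new(false);

struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        CLEANED_UP.store(true, Ordering::SeqCst);
    }
}

fn panics_while_holding_guard() {
    let _guard = Guard;
    // The compiler emits a landing pad that drops `_guard`
    // when this panic unwinds through the function.
    panic!("boom");
}

/// Returns true if the panic was caught and the destructor ran on the way up.
fn unwinds_and_cleans_up() -> bool {
    let result = panic::catch_unwind(panics_while_holding_guard);
    result.is_err() && CLEANED_UP.load(Ordering::SeqCst)
}

fn main() {
    println!("cleanup ran during unwinding: {}", unwinds_and_cleans_up());
}
```

The destructor call here corresponds to the __rust_dealloc call in the landing pad: it exists only on the unwinding path and is reached through the LSDA, not through normal control flow.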