Introduction
This book explores how different Rust constructs translate to ARM64/AArch64 assembly.
⚠🚧 The book is still under construction. New chapters will be added and the existing ones might be modified.
Since compilation involves multiple intermediate steps, we will trace through HIR, MIR and LLVM IR when it helps explain the final assembly output. We will only discuss the Rust frontend, not the LLVM backend. Still, for completeness, good documentation for the LLVM backend can be found here.
The Rust compiler overview can be found here.
Prerequisites
Rust compiler
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
AArch64 musl libraries
$ rustup target install aarch64-unknown-linux-musl
Rust nightly toolchain
This is necessary because we will be using some nightly features.
$ rustup install nightly
$ rustup default nightly
Alternatively, just override the default toolchain in your working directory:
$ rustup override set nightly
Rust default config
$ export CARGO_BUILD_TARGET=aarch64-unknown-linux-musl
$ export CARGO_TARGET_AARCH64_UNKNOWN_LINUX_MUSL_LINKER=aarch64-linux-gnu-gcc
$ export CARGO_TARGET_AARCH64_UNKNOWN_LINUX_MUSL_RUNNER="qemu-aarch64"
Alternatively, add the following to .cargo/config.toml:
[build]
target = "aarch64-unknown-linux-musl"
[target.aarch64-unknown-linux-musl]
linker = "aarch64-linux-gnu-gcc"
runner = "qemu-aarch64"
AArch64 cross-compiler + binutils + sysroot
$ sudo dnf install gcc-aarch64-linux-gnu
$ sudo dnf install binutils-aarch64-linux-gnu
$ sudo dnf install sysroot-aarch64-fc41-glibc
LLVM
Needed for the LLVM binary utilities, e.g. llvm-objdump.
$ sudo dnf install llvm
rustfilt (optional)
If you need to demangle a symbol manually, rustfilt is very convenient:
$ cargo install rustfilt
$ echo _ZN8rust_lab4main17hf9a0ba7e2c977e69E | rustfilt
rust_lab::main
QEMU
$ sudo dnf install qemu-user
Ghidra
$ wget https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_11.3.2_build/ghidra_11.3.2_PUBLIC_20250415.zip
Test the cross-compilation setup
With a Cargo project:
$ cargo init
$ cargo build
$ cargo run --quiet
Hello, world!
Without a Cargo project:
$ echo 'fn main() { println!("Hello ARM64!"); }' > test.rs
$ rustc --target aarch64-unknown-linux-musl -C linker=aarch64-linux-gnu-gcc test.rs
$ qemu-aarch64 ./test
Hello ARM64!
References
Rust
ARM64
- A64 Instruction Set Architecture Guide
- Arm Architecture Reference Manual
- ABI for the Arm Architecture (Base Standard)
- Blue Fox: Arm Assembly Internals and Reverse Engineering
Hello, world!
Source
Initialize a new workspace with cargo init.
fn main() { println!("Hello, world!"); }
println! is a macro that wraps the _print function.
$ cargo rustc --release --quiet -- -Z unpretty=expanded
#![feature(prelude_import)]
#[prelude_import]
use std::prelude::rust_2024::*;
#[macro_use]
extern crate std;
fn main() { { ::std::io::_print(format_args!("Hello, world!\n")); }; }
Alternatively, you can see all macro expansions (including built-in ones) in the HIR:
$ cargo rustc --release --quiet -- -Z unpretty=hir
#[prelude_import]
use std::prelude::rust_2024::*;
#[macro_use]
extern crate std;
fn main() {
{ ::std::io::_print(format_arguments::new_const(&["Hello, world!\n"])); };
}
More information about declarative macros can be found in chapter Declarative macros and chapter Built-in declarative macros.
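The expansion above shows that println! only builds a fmt::Arguments value and hands it to the hidden _print function. As a minimal sketch of the same idea using stable APIs only, the snippet below builds the same Arguments value with format_args! and writes it to stdout through Write::write_fmt. The helper name literal_piece is hypothetical and exists only for illustration; _print itself is not called because it is not part of the public API.

```rust
use std::fmt;
use std::io::Write;

/// Hypothetical helper: returns the literal text of `args` when there is
/// nothing left to format at run time.
fn literal_piece(args: fmt::Arguments<'_>) -> Option<&'static str> {
    args.as_str()
}

fn main() {
    // A format string without placeholders consists of a single static piece.
    assert_eq!(
        literal_piece(format_args!("Hello, world!\n")),
        Some("Hello, world!\n")
    );

    // Stable stand-in for the hidden _print function: pass the Arguments
    // value to the standard output writer.
    std::io::stdout()
        .write_fmt(format_args!("Hello, world!\n"))
        .expect("writing to stdout failed");
}
```

The fact that as_str() returns the whole string matches the HIR, where the only piece passed to new_const is "Hello, world!\n".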
Build
$ cargo rustc --release
Ghidra
Load the binary into Ghidra and auto-analyze it.
rust_lab::main
As we saw above, println! is expanded to a _print call, which accepts an Arguments struct.
While reconstructing the Arguments type, the type size information is very useful. Note that the compiler might reorder the struct fields.
$ cargo rustc --release --quiet -- -Z print-type-sizes
print-type-size type: `core::fmt::rt::Placeholder`: 48 bytes, alignment: 8 bytes
print-type-size field `.precision`: 16 bytes
print-type-size field `.width`: 16 bytes
print-type-size field `.position`: 8 bytes
print-type-size field `.flags`: 4 bytes
print-type-size end padding: 4 bytes
print-type-size type: `std::fmt::Arguments<'_>`: 48 bytes, alignment: 8 bytes
print-type-size field `.pieces`: 16 bytes
print-type-size field `.args`: 16 bytes
print-type-size field `.fmt`: 16 bytes
...
print-type-size type: `core::fmt::rt::Argument<'_>`: 16 bytes, alignment: 8 bytes
print-type-size field `.ty`: 16 bytes
print-type-size type: `core::fmt::rt::ArgumentType<'_>`: 16 bytes, alignment: 8 bytes
print-type-size variant `Placeholder`: 16 bytes
print-type-size field `.value`: 8 bytes
print-type-size field `.formatter`: 8 bytes
print-type-size field `._lifetime`: 0 bytes
print-type-size variant `Count`: 10 bytes
print-type-size padding: 8 bytes
print-type-size field `.0`: 2 bytes, alignment: 2 bytes
print-type-size type: `core::fmt::rt::Count`: 16 bytes, alignment: 8 bytes
print-type-size discriminant: 2 bytes
print-type-size variant `Param`: 14 bytes
print-type-size padding: 6 bytes
print-type-size field `.0`: 8 bytes, alignment: 8 bytes
print-type-size variant `Is`: 2 bytes
print-type-size field `.0`: 2 bytes
print-type-size variant `Implied`: 0 bytes
print-type-size type: `std::option::Option<&[core::fmt::rt::Placeholder]>`: 16 bytes, alignment: 8 bytes
print-type-size variant `Some`: 16 bytes
print-type-size field `.0`: 16 bytes
print-type-size variant `None`: 0 bytes
...
The simplified Arguments type can be represented like this (explained in detail later). This is not valid C syntax, of course, as &, [] and <> cannot be used in C struct names.
struct &[&str] {
ptr64 ptr;
usize len;
};
struct &[Argument] {
ptr64 ptr;
usize len;
};
struct Option<&[Placeholder]> {
ptr64 ptr;
usize len;
};
struct Arguments {
struct &[&str] pieces;
struct &[Argument] args;
struct Option<&[Placeholder]> fmt;
};
Listing:
**************************************************************
* rust_lab::main *
**************************************************************
undefined __rustcall main()
undefined <UNASSIGNED> <RETURN>
undefined8 Stack[-0x10]:8 local_10 XREF[2]: 00401af4(W),
00401b1c(R)
Arguments Stack[-0x40] arguments XREF[1,2]: 00401b04(W),
00401b14(W),
00401b10(W)
_ZN8rust_lab4main17hf9a0ba7e2c977e69E XREF[3]: main:00401b38(*), 00453c6c,
rust_lab::main 004648a8(*)
00401af0 ff 03 01 d1 sub sp,sp,#0x40
00401af4 fe 1b 00 f9 str x30,[sp, #local_10]
00401af8 68 03 00 90 adrp x8,0x46d000
00401afc 08 61 1c 91 add x8,x8,#0x718
00401b00 29 00 80 52 mov w9,#0x1
store pieces.ptr and pieces.len
00401b04 e8 27 00 a9 stp x8=>PTR_s_Hello,_world!_0046d718,x9,[sp]=>argu = 0044c1a0
00401b08 08 01 80 52 mov w8,#0x8
move the struct address to the first argument
00401b0c e0 03 00 91 mov x0,sp
zero out args.len and fmt.ptr
00401b10 ff ff 01 a9 stp xzr,xzr,[sp, #arguments+0x18]
store args.ptr
00401b14 e8 0b 00 f9 str x8,[sp, #arguments.args.ptr]
00401b18 da 63 00 94 bl std::io::stdio::_print undefined _print()
00401b1c fe 1b 40 f9 ldr x30,[sp, #local_10]
00401b20 ff 03 01 91 add sp,sp,#0x40
00401b24 c0 03 5f d6 ret
The logic is simple: it constructs an Arguments struct on the stack and passes its address (copied from sp into x0, the first argument register) to the _print function.
Decompiled code (after creating the Arguments type in the Structure Editor and applying it in the code):
/* WARNING: Unknown calling convention: __rustcall */
/* rust_lab::main */
void __rustcall rust_lab::main(void)
{
Arguments arguments;
/* store pieces.ptr and pieces.len */
arguments.pieces.ptr = (ptr64)&PTR_s_Hello,_world!_0046d718;
arguments.pieces.len = 1;
/* move the struct address to the first argument */
/* zero out args.len and fmt.ptr */
arguments.args.len = 0;
arguments.fmt.ptr = (ptr64)0x0;
/* store args.ptr */
arguments.args.ptr = (ptr64)0x8;
std::io::stdio::_print(&arguments);
return;
}
From the Rust reference:
Though you should not rely on this, all pointers to DSTs are currently twice the size of the size of usize and have the same alignment.
In practice, this means that each field of the Arguments struct occupies 16 bytes in memory: an 8-byte pointer and an 8-byte length. This is confirmed by the output of -Z print-type-sizes above.
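The "pointer plus length" shape of slice references can also be observed from Rust itself. The following is a minimal sketch, assuming a current compiler where (as the quoted reference text says, without guaranteeing it) a pointer to a DST is two machine words wide. The helper slice_parts is a hypothetical name introduced only for this illustration.

```rust
use std::mem::size_of;

/// Hypothetical helper: splits a slice reference into the two machine
/// words it carries (data address and element count).
fn slice_parts<T>(slice: &[T]) -> (usize, usize) {
    (slice.as_ptr() as usize, slice.len())
}

fn main() {
    let word = size_of::<usize>();

    // References to DSTs are currently two words wide, thin references one.
    assert_eq!(size_of::<&[u8]>(), 2 * word);
    assert_eq!(size_of::<&str>(), 2 * word);
    assert_eq!(size_of::<&u8>(), word);

    // The same shape as the `pieces` field: one &str inside a slice.
    let pieces: &[&str] = &["Hello, world!\n"];
    let (ptr, len) = slice_parts(pieces);
    println!("pieces.ptr = {ptr:#x}, pieces.len = {len}");
    assert_eq!(len, 1);
    // 14 bytes, i.e. the 0x0e length that follows the string pointer in memory.
    assert_eq!(pieces[0].len(), 14);
}
```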
pieces is a reference to a slice of str references (&str). In this case, pieces references only one &str, which is itself an 8-byte pointer and an 8-byte length.
PTR_s_Hello,_world!_0046d718 XREF[1]: main:00401b04(*)
0046d718 a0 c1 44 addr s_Hello,_world!_0044c1a0 = "Hello, world!\n"
00 00 00
00 00
0046d720 0e ?? 0Eh
0046d721 00 ?? 00h
0046d722 00 ?? 00h
0046d723 00 ?? 00h
0046d724 00 ?? 00h
0046d725 00 ?? 00h
0046d726 00 ?? 00h
0046d727 00 ?? 00h
args is a reference to a slice of Argument items, and here it references an empty slice. An empty slice has length 0, but its pointer is not null: it is a non-null, dangling address that is merely well aligned for the element type (8 bytes here).
print-type-size type: `core::fmt::rt::Argument<'_>`: 16 bytes, alignment: 8 bytes
print-type-size field `.ty`: 16 bytes
fn main() {
    let empty_u8: &[u8] = &[]; // 1-byte aligned
    let empty_u32: &[u32] = &[]; // 4-byte aligned
    let empty_u64: &[u64] = &[]; // 8-byte aligned
    println!("u8 address: {}", empty_u8.as_ptr() as usize);
    println!("u32 address: {}", empty_u32.as_ptr() as usize);
    println!("u64 address: {}", empty_u64.as_ptr() as usize);
}
$ cargo run --release --quiet
u8 address: 1
u32 address: 4
u64 address: 8
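The exact addresses printed above (1, 4, 8) are what the current compiler happens to produce. What can be relied on is only that the pointer of an empty slice is non-null and aligned for the element type. A minimal sketch of that check follows; the helper name is_non_null_and_aligned is hypothetical.

```rust
use std::mem::align_of;

/// Hypothetical helper: true if the data pointer of `slice` is non-null
/// and aligned for `T`.
fn is_non_null_and_aligned<T>(slice: &[T]) -> bool {
    let addr = slice.as_ptr() as usize;
    addr != 0 && addr % align_of::<T>() == 0
}

fn main() {
    let empty_u8: &[u8] = &[];
    let empty_u64: &[u64] = &[];

    // Both slices are empty, yet neither pointer is null.
    assert!(empty_u8.is_empty() && empty_u64.is_empty());
    assert!(is_non_null_and_aligned(empty_u8));
    assert!(is_non_null_and_aligned(empty_u64));
}
```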
fmt is an optional reference to a slice of Placeholder items. For Option<&[T]>, Rust often uses the null pointer optimization, where None is represented by a null pointer. In that case the length field is irrelevant, which is why it is not populated in the current example.
print-type-size type: `std::option::Option<&[core::fmt::rt::Placeholder]>`: 16 bytes, alignment: 8 bytes
print-type-size variant `Some`: 16 bytes
print-type-size field `.0`: 16 bytes
print-type-size variant `None`: 0 bytes
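The size output above already hints at the optimization: wrapping the slice reference in an Option does not make it any larger. A minimal sketch of the same observation in Rust, assuming a current compiler; the helper data_ptr_or_zero is a hypothetical name that mimics the in-memory encoding rather than reading it directly.

```rust
use std::mem::size_of;

/// Hypothetical helper: the pointer word of an optional slice, with 0
/// standing in for None (mirroring the null pointer encoding).
fn data_ptr_or_zero(fmt: Option<&[u64]>) -> usize {
    fmt.map_or(0, |slice| slice.as_ptr() as usize)
}

fn main() {
    // No extra discriminant is needed: both types have the same size.
    assert_eq!(size_of::<Option<&[u64]>>(), size_of::<&[u64]>());

    // Some(..) always carries a non-null pointer, even for an empty slice,
    // so the null value is free to represent None.
    let empty: &[u64] = &[];
    assert_ne!(data_ptr_or_zero(Some(empty)), 0);
    assert_eq!(data_ptr_or_zero(None), 0);
}
```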
rust-gdb
We can verify the results of our static analysis using rust-gdb (or rust-lldb), which supports Rust types.
First we need to create a debug build, where the function new_const constructing the Arguments struct is neither optimized away nor inlined.
$ cargo rustc
Then we start a GDB server and connect to it with the rust-gdb client. We will examine the Arguments struct returned by new_const.
$ qemu-aarch64 -g 1234 target/aarch64-unknown-linux-musl/debug/rust-lab
$ rust-gdb -q -ex "target remote localhost:1234" target/aarch64-unknown-linux-musl/debug/rust-lab
Reading symbols from target/aarch64-unknown-linux-musl/debug/rust-lab...
Remote debugging using localhost:1234
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
0x00000000004019bc in _start ()
(gdb) b rust_lab::main
Breakpoint 1 at 0x401bd4: file src/main.rs, line 2.
(gdb) c
Continuing.
Breakpoint 1, rust_lab::main () at src/main.rs:2
2 println!("Hello, world!");
(gdb) disas
Dump of assembler code for function _ZN8rust_lab4main17hb3ccde9ab543d852E:
0x0000000000401bc0 <+0>: sub sp, sp, #0x50
0x0000000000401bc4 <+4>: stp x29, x30, [sp, #64]
0x0000000000401bc8 <+8>: add x29, sp, #0x40
0x0000000000401bcc <+12>: add x8, sp, #0x10
0x0000000000401bd0 <+16>: str x8, [sp, #8]
=> 0x0000000000401bd4 <+20>: adrp x0, 0x46d000
0x0000000000401bd8 <+24>: add x0, x0, #0x710
0x0000000000401bdc <+28>: bl 0x401b54 <_ZN4core3fmt2rt38_$LT$impl$u20$core..fmt..Arguments$GT$9new_const17h2005e5bc47942c4fE>
0x0000000000401be0 <+32>: ldr x0, [sp, #8]
0x0000000000401be4 <+36>: bl 0x41abf0 <_ZN3std2io5stdio6_print17h5a3b0843896b0124E>
0x0000000000401be8 <+40>: ldp x29, x30, [sp, #64]
0x0000000000401bec <+44>: add sp, sp, #0x50
0x0000000000401bf0 <+48>: ret
End of assembler dump.
(gdb) si 3
core::fmt::Arguments::new_const<1> (pieces=0x7f867d3fe830)
at /home/gemesa/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/rt.rs:226
226 pub const fn new_const<const N: usize>(pieces: &'a [&'static str; N]) -> Self {
(gdb) fin
Run till exit from #0 core::fmt::Arguments::new_const<1> (pieces=0x7f867d3fe830)
at /home/gemesa/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/rt.rs:226
0x0000000000401be0 in rust_lab::main () at src/main.rs:2
2 println!("Hello, world!");
Value returned is $1 = core::fmt::Arguments {pieces: &[&str](size=1) = {"Hello, world!\n"}, fmt: core::option::Option<&[core::fmt::rt::Placeholder]>::None, args: &[core::fmt::rt::Argument](size=0)}
(gdb)
The value returned matches the expected Arguments struct.
Box
Source
Initialize a new workspace with cargo init --lib.
#[unsafe(no_mangle)]
pub fn create_boxed_value(n: i32) -> Box<i32> {
    Box::new(n * 2)
}

#[unsafe(no_mangle)]
pub fn process_box(boxed: Box<i32>) -> i32 {
    *boxed + 10
}
The Box type is described in the official docs in detail. Further information can be found here. We are using no_mangle to simplify things; the reason can be found here.
Build
$ cargo rustc --release -- --emit obj
llvm-objdump
For this example we will use llvm-objdump instead of Ghidra, as Ghidra does not support the relocation type R_AARCH64_ADR_GOT_PAGE.
Alternatively, the compiler can generate assembly output as well, although it is a bit less readable because of all the .cfi_* directives:
$ cargo rustc --release -- --emit asm
$ cat target/aarch64-unknown-linux-musl/release/deps/rust_lab-15e16dbcf37b825b.s
...
create_boxed_value:
.cfi_startproc
stp x29, x30, [sp, #-32]!
.cfi_def_cfa_offset 32
str x19, [sp, #16]
mov x29, sp
.cfi_def_cfa w29, 32
.cfi_offset w19, -16
.cfi_offset w30, -24
.cfi_offset w29, -32
.cfi_remember_state
adrp x8, :got:__rust_no_alloc_shim_is_unstable
mov w19, w0
mov w0, #4
ldr x8, [x8, :got_lo12:__rust_no_alloc_shim_is_unstable]
mov w1, #4
ldrb wzr, [x8]
bl _RNvCshIQntqZdYTC_7___rustc12___rust_alloc
cbz x0, .LBB0_2
lsl w8, w19, #1
str w8, [x0]
.cfi_def_cfa wsp, 32
ldr x19, [sp, #16]
ldp x29, x30, [sp], #32
.cfi_def_cfa_offset 0
.cfi_restore w19
.cfi_restore w30
.cfi_restore w29
ret
.LBB0_2:
.cfi_restore_state
mov w0, #4
mov w1, #4
bl _ZN5alloc5alloc18handle_alloc_error17h0e34a69f1cc3072eE
...
create_boxed_value
#[unsafe(no_mangle)]
pub fn create_boxed_value(n: i32) -> Box<i32> {
    Box::new(n * 2)
}
Disassembly (the output is piped through rustfilt to demangle symbols):
$ llvm-objdump -r --disassemble-symbols=create_boxed_value target/aarch64-unknown-linux-musl/release/deps/rust_lab-15e16dbcf37b825b.o | rustfilt
target/aarch64-unknown-linux-musl/release/deps/rust_lab-15e16dbcf37b825b.o: file format elf64-littleaarch64
Disassembly of section .text.create_boxed_value:
0000000000000000 <create_boxed_value>:
0: a9be7bfd stp x29, x30, [sp, #-0x20]!
4: f9000bf3 str x19, [sp, #0x10]
8: 910003fd mov x29, sp
c: 90000008 adrp x8, 0x0 <create_boxed_value>
000000000000000c: R_AARCH64_ADR_GOT_PAGE __rust_no_alloc_shim_is_unstable
10: 2a0003f3 mov w19, w0
14: 52800080 mov w0, #0x4 // =4
18: f9400108 ldr x8, [x8]
0000000000000018: R_AARCH64_LD64_GOT_LO12_NC __rust_no_alloc_shim_is_unstable
1c: 52800081 mov w1, #0x4 // =4
20: 3940011f ldrb wzr, [x8]
24: 94000000 bl 0x24 <create_boxed_value+0x24>
0000000000000024: R_AARCH64_CALL26 __rustc::__rust_alloc
28: b40000c0 cbz x0, 0x40 <create_boxed_value+0x40>
2c: 531f7a68 lsl w8, w19, #1
30: b9000008 str w8, [x0]
34: f9400bf3 ldr x19, [sp, #0x10]
38: a8c27bfd ldp x29, x30, [sp], #0x20
3c: d65f03c0 ret
40: 52800080 mov w0, #0x4 // =4
44: 52800081 mov w1, #0x4 // =4
48: 94000000 bl 0x48 <create_boxed_value+0x48>
0000000000000048: R_AARCH64_CALL26 alloc::alloc::handle_alloc_error
There is one piece of information we need to be familiar with to fully understand all of the instructions. __rust_no_alloc_shim_is_unstable is an internal variable; the ldrb wzr,[x8] instruction reads it and discards the result, which forces the linker to include the alloc shim. The tracking issue is here for anyone who might be interested.
Since the boxed value is an i32, __rust_alloc is called with the following arguments (size and alignment, both 4):
14: 52800080 mov w0, #0x4 // =4
1c: 52800081 mov w1, #0x4 // =4
If the allocation fails and returns null, handle_alloc_error is called:
28: b40000c0 cbz x0, 0x40 <create_boxed_value+0x40>
After a successful allocation, the input value is multiplied by 2 (implemented as a left shift by 1) and stored in the allocated memory:
10: 2a0003f3 mov w19, w0
...
2c: 531f7a68 lsl w8, w19, #1
30: b9000008 str w8, [x0]
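Both observations, the (4, 4) arguments and the shift standing in for the multiplication, can be reproduced in plain Rust. The snippet below is a minimal sketch, assuming overflow checks are disabled as in a default release build; the helper names box_layout and double_via_shift are hypothetical.

```rust
use std::alloc::Layout;

/// Hypothetical helper: size and alignment requested for a Box<T> allocation.
fn box_layout<T>() -> (usize, usize) {
    let layout = Layout::new::<T>();
    (layout.size(), layout.align())
}

/// Hypothetical helper: `n * 2` computed the way the `lsl w8, w19, #1`
/// instruction does it (wrapping on overflow).
fn double_via_shift(n: i32) -> i32 {
    n.wrapping_shl(1)
}

fn main() {
    // Matches `mov w0, #0x4` (size) and `mov w1, #0x4` (alignment).
    assert_eq!(box_layout::<i32>(), (4, 4));

    // The shift and the wrapping multiplication agree, even on overflow.
    for n in [0, 21, -3, i32::MAX, i32::MIN] {
        assert_eq!(double_via_shift(n), n.wrapping_mul(2));
    }
    println!("2 * 21 = {}", double_via_shift(21));
}
```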
process_box
#[unsafe(no_mangle)]
pub fn process_box(boxed: Box<i32>) -> i32 {
    *boxed + 10
}
Disassembly:
$ llvm-objdump -r --disassemble-symbols=process_box target/aarch64-unknown-linux-musl/release/deps/rust_lab-15e16dbcf37b825b.o | rustfilt
target/aarch64-unknown-linux-musl/release/deps/rust_lab-15e16dbcf37b825b.o: file format elf64-littleaarch64
Disassembly of section .text.process_box:
0000000000000000 <process_box>:
0: a9be7bfd stp x29, x30, [sp, #-0x20]!
4: f9000bf3 str x19, [sp, #0x10]
8: 910003fd mov x29, sp
c: b9400013 ldr w19, [x0]
10: 52800081 mov w1, #0x4 // =4
14: 52800082 mov w2, #0x4 // =4
18: 94000000 bl 0x18 <process_box+0x18>
0000000000000018: R_AARCH64_CALL26 __rustc::__rust_dealloc
1c: 11002a60 add w0, w19, #0xa
20: f9400bf3 ldr x19, [sp, #0x10]
24: a8c27bfd ldp x29, x30, [sp], #0x20
28: d65f03c0 ret
The function takes ownership of the Box, and boxed goes out of scope at the end of the function, so the memory is automatically deallocated. Since the data stored in the Box is an i32, __rust_dealloc is called with the following arguments (pointer, size and alignment):
10: 52800081 mov w1, #0x4 // =4
14: 52800082 mov w2, #0x4 // =4
x0 already holds the pointer to the allocated memory, since the Box is passed to the function.
The value is loaded before the deallocation, then 10 is added to it and the result is returned:
c: b9400013 ldr w19, [x0]
...
1c: 11002a60 add w0, w19, #0xa
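The instruction order (load, deallocate, add) can be spelled out in Rust by making the implicit drop explicit. This is only an illustrative sketch of what the compiled code does, not how process_box has to be written; the function name process_box_explicit is hypothetical.

```rust
/// Hypothetical, more explicit variant of process_box that mirrors the
/// instruction order seen in the disassembly.
pub fn process_box_explicit(boxed: Box<i32>) -> i32 {
    // ldr w19, [x0]: copy the value out while the allocation is still alive.
    let value = *boxed;
    // bl __rust_dealloc: free the 4-byte, 4-byte-aligned heap allocation.
    drop(boxed);
    // add w0, w19, #0xa: compute the result from the saved copy.
    value + 10
}

fn main() {
    assert_eq!(process_box_explicit(Box::new(32)), 42);
    println!("{}", process_box_explicit(Box::new(5)));
}
```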
Vec
Source
Initialize a new workspace with cargo init.
fn main() -> std::process::ExitCode {
    let mut vec = vec![1, 2, 3];
    vec.push(4);
    let ret = vec.pop().unwrap();
    std::process::ExitCode::from(ret)
}
vec! is a macro which generates different code based on the passed arguments. In our test code we pass a list of elements, so the following pattern matches:
($($x:expr),+ $(,)?) => (
    <[_]>::into_vec($crate::boxed::box_new([$($x),+]))
);
More information about declarative macros can be found in chapter Declarative macros.
$ cargo rustc --release --quiet -- -Z unpretty=hir
#[prelude_import]
use std::prelude::rust_2024::*;
#[macro_use]
extern crate std;
fn main()
->
std::process::ExitCode {
let mut vec = <[_]>::into_vec(::alloc::boxed::box_new([1, 2, 3]));
vec.push(4);
let ret = vec.pop().unwrap();
std::process::ExitCode::from(ret)
}
Build
$ cargo rustc --release
Ghidra
Load the binary into Ghidra and auto-analyze it.
rust_lab::main
The official docs explain Vec in detail. The most important parts for us:
The capacity of a vector is the amount of space allocated for any future elements that will be added onto the vector. This is not to be confused with the length of a vector, which specifies the number of actual elements within the vector. If a vector’s length exceeds its capacity, its capacity will automatically be increased, but its elements will have to be reallocated.
Vec is and always will be a (pointer, capacity, length) triplet.
The order of these fields is completely unspecified
If a Vec has allocated memory, then the memory it points to is on the heap
We can verify this:
$ cargo rustc --release --quiet -- -Zprint-type-sizes
...
print-type-size type: `std::vec::Vec<u8>`: 24 bytes, alignment: 8 bytes
print-type-size field `.buf`: 16 bytes
print-type-size field `.len`: 8 bytes
print-type-size type: `alloc::raw_vec::RawVec<u8>`: 16 bytes, alignment: 8 bytes
print-type-size field `.inner`: 16 bytes
print-type-size field `._marker`: 0 bytes
print-type-size type: `alloc::raw_vec::RawVecInner`: 16 bytes, alignment: 8 bytes
print-type-size field `.cap`: 8 bytes
print-type-size field `.ptr`: 8 bytes
print-type-size field `.alloc`: 0 bytes
...
There are some abstractions involved, but ultimately the memory layout looks like this:
struct Vec {
usize cap;
ptr64 ptr;
usize len;
};
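The three fields are private, but the same triplet is reachable through the public accessors. A minimal sketch follows; the helper vec_parts is a hypothetical name, and the tuple order simply follows the layout shown above, which the documentation explicitly leaves unspecified.

```rust
use std::mem::size_of;

/// Hypothetical helper: returns (capacity, data address, length) of a Vec.
fn vec_parts<T>(v: &Vec<T>) -> (usize, usize, usize) {
    (v.capacity(), v.as_ptr() as usize, v.len())
}

fn main() {
    // A Vec is a (pointer, capacity, length) triplet: three machine words.
    assert_eq!(size_of::<Vec<u8>>(), 3 * size_of::<usize>());

    let v: Vec<u8> = vec![1, 2, 3];
    let (cap, ptr, len) = vec_parts(&v);
    println!("cap = {cap}, ptr = {ptr:#x}, len = {len}");
    assert_eq!(len, 3);
    assert!(cap >= len);
    assert_ne!(ptr, 0);
}
```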
Relevant types:
With this background information, we are ready to look at the decompiled code. Note that this is not the raw decompiled code. First we needed to create the Vec<u8> type in the Structure Editor and apply it in the code, then make some further modifications, e.g., fix the prototype of __rust_alloc.
/* WARNING: Unknown calling convention: __rustcall */
/* rust_lab::main */
u8 __rustcall rust_lab::main(void)
{
Vec<u8> vec;
u8 ret;
vec.ptr = __rustc::__rust_alloc(3,1);
if (vec.ptr != (u8 *)0x0) {
vec.ptr[2] = '\x03';
vec.ptr[0] = '\x01';
vec.ptr[1] = '\x02';
vec.cap = 3;
vec.len = 3;
/* try { // try from 00401958 to 00401967 has its CatchHandler @ 004019ac */
alloc::raw_vec::RawVec<T,A>::grow_one(&vec,&PTR_s_src/main.rs_0046d860);
vec.ptr[3] = 4;
vec.len = 3;
ret = vec.ptr[3];
__rustc::__rust_dealloc(vec.ptr,vec.cap,1);
return ret;
}
/* WARNING: Subroutine does not return */
alloc::alloc::handle_alloc_error(1,3);
}
The code is fairly self-explanatory at this point. But, for completeness, the logic is the following:
- allocate a 3 byte section on the heap
- initialize the heap data, capacity and length
- increase the capacity (normally, length is increased as well, but in this case it is optimized out since we immediately pop the last element)
- initialize the new heap data
- save the last element in the return variable
- deallocate the heap data
- return the return variable
The only thing we might need to discuss is grow_one. If we check the implementation, we can see that it calls grow_amortized.
fn grow_amortized(
    &mut self,
    len: usize,
    additional: usize,
    elem_layout: Layout,
) -> Result<(), TryReserveError> {
    ...
    let cap = cmp::max(self.cap.as_inner() * 2, required_cap);
    let cap = cmp::max(min_non_zero_cap(elem_layout.size()), cap);
    ...
}
By default, this function doubles the capacity. In our case the old capacity is 3, so doubling gives 6, which already covers the required capacity of 4. However, since the element size is 1, min_non_zero_cap returns 8 and the capacity is raised to 8.
// Tiny Vecs are dumb. Skip to:
// - 8 if the element size is 1, because any heap allocators is likely
//   to round up a request of less than 8 bytes to at least 8 bytes.
// ...
const fn min_non_zero_cap(size: usize) -> usize {
    if size == 1 {
        8
    } else if size <= 1024 {
        4
    } else {
        1
    }
}
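To make the arithmetic concrete, the two cmp::max steps can be replayed in a standalone function. This is a simplified sketch based only on the excerpts above, not the actual standard library implementation: the function amortized_cap and its parameters are hypothetical, and layout and allocation error handling are left out.

```rust
/// Copy of the minimum capacity rule quoted above.
const fn min_non_zero_cap(size: usize) -> usize {
    if size == 1 {
        8
    } else if size <= 1024 {
        4
    } else {
        1
    }
}

/// Hypothetical standalone version of the capacity computation:
/// max(2 * old_cap, len + additional), then raised to the minimum capacity.
fn amortized_cap(old_cap: usize, len: usize, additional: usize, elem_size: usize) -> Option<usize> {
    // Assumption: an overflowing request is reported as None here.
    let required_cap = len.checked_add(additional)?;
    let cap = required_cap.max(old_cap.saturating_mul(2));
    Some(cap.max(min_non_zero_cap(elem_size)))
}

fn main() {
    // Our example: Vec<u8> with cap 3 and len 3, pushing one more element.
    // max(6, 4) = 6, then max(8, 6) = 8.
    assert_eq!(amortized_cap(3, 3, 1, 1), Some(8));
    // Once the minimum is exceeded, plain doubling takes over.
    assert_eq!(amortized_cap(8, 8, 1, 1), Some(16));
    // 4-byte elements start at a capacity of 4.
    assert_eq!(amortized_cap(0, 0, 1, 4), Some(4));
}
```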
Call graph of grow_one:
grow_one
finish_grow
__rust_realloc
__rust_alloc
Raw decompiled code for reference:
/* WARNING: Unknown calling convention: __rustcall */
/* rust_lab::main */
undefined1 __rustcall rust_lab::main(void)
{
undefined1 uVar1;
undefined2 *puVar2;
undefined8 local_38;
undefined2 *local_30;
undefined8 local_28;
puVar2 = (undefined2 *)0x3;
__rustc::__rust_alloc(3,1);
if (puVar2 != (undefined2 *)0x0) {
*(undefined1 *)(puVar2 + 1) = 3;
*puVar2 = 0x201;
local_38 = 3;
local_28 = 3;
/* try { // try from 00401958 to 00401967 has its CatchHandler @ 004019ac */
local_30 = puVar2;
alloc::raw_vec::RawVec<T,A>::grow_one(&local_38,&PTR_s_src/main.rs_0046d860);
*(undefined1 *)((long)local_30 + 3) = 4;
local_28 = 3;
uVar1 = *(undefined1 *)((long)local_30 + 3);
__rustc::__rust_dealloc(local_30,local_38,1);
return uVar1;
}
/* WARNING: Subroutine does not return */
alloc::alloc::handle_alloc_error(1,3);
}
Now that we see the big picture, we are ready to go through the listing. To easily match the listing with the decompiled code, it is annotated with pre-comments.
There is one additional piece of information we need to be familiar with to fully understand all of the instructions. __rust_no_alloc_shim_is_unstable is an internal variable; the ldrb wzr,[x8]=>__rust_no_alloc_shim_is_unstable instruction reads it and discards the result, which forces the linker to include the alloc shim. The tracking issue is here for anyone who might be interested.
Listing:
**************************************************************
* rust_lab::main *
**************************************************************
undefined __rustcall main()
undefined <UNASSIGNED> <RETURN>
undefined8 Stack[-0x10]:8 local_10 XREF[2]: 0040191c(W),
00401994(R)
undefined8 Stack[-0x20]:8 local_20 XREF[2]: 00401918(W),
00401990(*)
Vec<u8> Stack[-0x38] vec XREF[2,3]: 00401950(W),
0040197c(R),
00401968(R),
00401954(W),
00401980(W)
u8 HASH:5f59380 ret
_ZN8rust_lab4main17h4c095f7be815a79eE XREF[4]: main:004019f8(*), 004528cc,
rust_lab::main 00453a8f(*), 00464cc0(*)
00401914 ff 03 01 d1 sub sp,sp,#0x40
00401918 fd 7b 02 a9 stp x29,x30,[sp, #local_20]
0040191c f3 1b 00 f9 str x19,[sp, #local_10]
00401920 fd 83 00 91 add x29,sp,#0x20
00401924 68 03 00 d0 adrp x8,0x46f000
arg0: 3
00401928 60 00 80 52 mov w0,#0x3
arg1: 1
0040192c 21 00 80 52 mov w1,#0x1
00401930 08 a1 47 f9 ldr x8,[x8, #0xf40]=>->__rust_no_alloc_shim_is_uns = 00470ae4
00401934 73 00 80 52 mov w19,#0x3
force the linker to include the alloc shim
00401938 1f 01 40 39 ldrb wzr,[x8]=>__rust_no_alloc_shim_is_unstable = ??
0040193c 34 00 00 94 bl __rustc::__rust_alloc u8 * __rust_alloc(usize size, us
00401940 00 03 00 b4 cbz x0,LAB_004019a0
00401944 28 40 80 52 mov w8,#0x201
heap_data[2] = 3
00401948 13 08 00 39 strb w19,[x0, #0x2]
heap_data[0] = 1
heap_data[1] = 2
0040194c 08 00 00 79 strh w8,[x0]
init vec.cap and vec.ptr
00401950 f3 83 00 a9 stp x19,x0,[sp, #vec.cap]
init vec.len
00401954 f3 0f 00 f9 str x19,[sp, #vec.len]
try { // try from 00401958 to 00401967 has its CatchHandler @
LAB_00401958 XREF[1]: 00453a88(*)
00401958 61 03 00 90 adrp x1,0x46d000
arg1: source location for debugging
0040195c 21 80 21 91 add x1=>PTR_s_src/main.rs_0046d860,x1,#0x860 = 0044aea0
arg0: address of vec
00401960 e0 23 00 91 add x0,sp,#0x8
increase cap from 3 to 8
00401964 fd 0d 01 94 bl alloc::raw_vec::RawVec<T,A>::grow_one undefined grow_one()
} // end try from 00401958 to 00401967
LAB_00401968 XREF[1]: 00453a8d(*)
00401968 e8 0b 40 f9 ldr x8,[sp, #vec.ptr]
0040196c 89 00 80 52 mov w9,#0x4
arg2: 1
00401970 22 00 80 52 mov w2,#0x1
vec.ptr[3] = 4
00401974 09 0d 00 39 strb w9,[x8, #0x3]
00401978 68 00 80 52 mov w8,#0x3
arg0: vec.ptr
arg1: vec.cap
0040197c e1 83 40 a9 ldp x1,x0,[sp, #vec.cap]
vec.len = 3
00401980 e8 0f 00 f9 str x8,[sp, #vec.len]
save return value
00401984 13 0c 40 39 ldrb w19,[x0, #0x3]
00401988 22 00 00 94 bl __rustc::__rust_dealloc void __rust_dealloc(u8 * ptr, us
return value
0040198c e0 03 13 2a mov w0,w19
00401990 fd 7b 42 a9 ldp x29=>local_20,x30,[sp, #0x20]
00401994 f3 1b 40 f9 ldr x19,[sp, #local_10]
00401998 ff 03 01 91 add sp,sp,#0x40
0040199c c0 03 5f d6 ret
LAB_004019a0 XREF[1]: 00401940(j)
004019a0 20 00 80 52 mov w0,#0x1
004019a4 61 00 80 52 mov w1,#0x3
004019a8 0f fe ff 97 bl alloc::alloc::handle_alloc_error undefined handle_alloc_error()
-- Flow Override: CALL_RETURN (CALL_TERMINATOR)
rust-lldb
We can verify the results of our static analysis using rust-lldb.
Start a GDB server and connect to it with the rust-lldb client. We will examine our Vec struct just before deallocation.
$ qemu-aarch64 -g 1234 target/aarch64-unknown-linux-musl/release/rust-lab
$ rust-lldb --source-quietly -o "gdb-remote localhost:1234" target/aarch64-unknown-linux-musl/release/rust-lab
Current executable set to '/home/gemesa/git-repos/rust-lab/target/aarch64-unknown-linux-musl/release/rust-lab' (aarch64).
Process 1447629 stopped
* thread #1, stop reason = signal SIGTRAP
frame #0: 0x00000000004017d4 rust-lab`_start
rust-lab`_start:
-> 0x4017d4 <+0>: mov x29, #0x0 ; =0
0x4017d8 <+4>: mov x30, #0x0 ; =0
0x4017dc <+8>: mov x0, sp
0x4017e0 <+12>: adrp x1, 0
(lldb) b _ZN8rust_lab4main17h4c095f7be815a79eE
Breakpoint 2: where = rust-lab`rust_lab::main::h4c095f7be815a79e, address = 0x0000000000401914
(lldb) c
Process 1447629 resuming
Process 1447629 stopped
* thread #1, stop reason = breakpoint 2.1
frame #0: 0x0000000000401914 rust-lab`rust_lab::main::h4c095f7be815a79e
rust-lab`rust_lab::main::h4c095f7be815a79e:
-> 0x401914 <+0>: sub sp, sp, #0x40
0x401918 <+4>: stp x29, x30, [sp, #0x20]
0x40191c <+8>: str x19, [sp, #0x30]
0x401920 <+12>: add x29, sp, #0x20
(lldb) disas
rust-lab`rust_lab::main::h4c095f7be815a79e:
-> 0x401914 <+0>: sub sp, sp, #0x40
0x401918 <+4>: stp x29, x30, [sp, #0x20]
0x40191c <+8>: str x19, [sp, #0x30]
0x401920 <+12>: add x29, sp, #0x20
0x401924 <+16>: adrp x8, 110
0x401928 <+20>: mov w0, #0x3 ; =3
0x40192c <+24>: mov w1, #0x1 ; =1
0x401930 <+28>: ldr x8, [x8, #0xf40]
0x401934 <+32>: mov w19, #0x3 ; =3
0x401938 <+36>: ldrb wzr, [x8]
0x40193c <+40>: bl 0x401a0c ; __rustc::__rust_alloc
0x401940 <+44>: cbz x0, 0x4019a0 ; <+140>
0x401944 <+48>: mov w8, #0x201 ; =513
0x401948 <+52>: strb w19, [x0, #0x2]
0x40194c <+56>: strh w8, [x0]
0x401950 <+60>: stp x19, x0, [sp, #0x8]
0x401954 <+64>: str x19, [sp, #0x18]
0x401958 <+68>: adrp x1, 108
0x40195c <+72>: add x1, x1, #0x860
0x401960 <+76>: add x0, sp, #0x8
0x401964 <+80>: bl 0x445158 ; alloc::raw_vec::RawVec$LT$T$C$A$GT$::grow_one::h19885d150c1bd8f5
0x401968 <+84>: ldr x8, [sp, #0x10]
0x40196c <+88>: mov w9, #0x4 ; =4
0x401970 <+92>: mov w2, #0x1 ; =1
0x401974 <+96>: strb w9, [x8, #0x3]
0x401978 <+100>: mov w8, #0x3 ; =3
0x40197c <+104>: ldp x1, x0, [sp, #0x8]
0x401980 <+108>: str x8, [sp, #0x18]
0x401984 <+112>: ldrb w19, [x0, #0x3]
0x401988 <+116>: bl 0x401a10 ; __rustc::__rust_dealloc
0x40198c <+120>: mov w0, w19
0x401990 <+124>: ldp x29, x30, [sp, #0x20]
0x401994 <+128>: ldr x19, [sp, #0x30]
0x401998 <+132>: add sp, sp, #0x40
0x40199c <+136>: ret
0x4019a0 <+140>: mov w0, #0x1 ; =1
0x4019a4 <+144>: mov w1, #0x3 ; =3
0x4019a8 <+148>: bl 0x4011e4 ; alloc::alloc::handle_alloc_error::h3005aad4027c4877
0x4019ac <+152>: ldr x1, [sp, #0x8]
0x4019b0 <+156>: mov x19, x0
0x4019b4 <+160>: cbz x1, 0x4019c4 ; <+176>
0x4019b8 <+164>: ldr x0, [sp, #0x10]
0x4019bc <+168>: mov w2, #0x1 ; =1
0x4019c0 <+172>: bl 0x401a10 ; __rustc::__rust_dealloc
0x4019c4 <+176>: mov x0, x19
0x4019c8 <+180>: bl 0x433688 ; _Unwind_Resume
(lldb) b *0x401988
Breakpoint 3: where = rust-lab`rust_lab::main::h4c095f7be815a79e + 116, address = 0x0000000000401988
(lldb) c
Process 1447629 resuming
Process 1447629 stopped
* thread #1, stop reason = breakpoint 3.1
frame #0: 0x0000000000401988 rust-lab`rust_lab::main::h4c095f7be815a79e + 116
rust-lab`rust_lab::main::h4c095f7be815a79e:
-> 0x401988 <+116>: bl 0x401a10 ; __rustc::__rust_dealloc
0x40198c <+120>: mov w0, w19
0x401990 <+124>: ldp x29, x30, [sp, #0x20]
0x401994 <+128>: ldr x19, [sp, #0x30]
(lldb) x/g $sp+8
0x7f7d0dd8d8c8: 0x0000000000000008
(lldb) x/g $sp+16
0x7f7d0dd8d8d0: 0x00007f7d0fc0d040
(lldb) x/g 0x00007f7d0fc0d040
0x7f7d0fc0d040: 0x0000000004030201
(lldb) x/g $sp+24
0x7f7d0dd8d8d8: 0x0000000000000003
The capacity is 8, the heap data is [1, 2, 3, 4] and the length is 3, as expected.
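The same state can be cross-checked without a debugger by querying the Vec after the push and pop. A minimal sketch follows; the function push_then_pop is a hypothetical wrapper around the statements from main. The exact capacity is an implementation detail, so only a lower bound is asserted.

```rust
/// Hypothetical wrapper: runs the push/pop sequence from main and reports
/// (popped value, length, capacity) afterwards.
fn push_then_pop() -> (u8, usize, usize) {
    let mut vec: Vec<u8> = vec![1, 2, 3];
    vec.push(4);
    let ret = vec.pop().unwrap();
    (ret, vec.len(), vec.capacity())
}

fn main() {
    let (ret, len, cap) = push_then_pop();
    println!("ret = {ret}, len = {len}, cap = {cap}");
    assert_eq!(ret, 4);
    assert_eq!(len, 3);
    // The capacity grew to hold the fourth element and pop does not shrink it.
    assert!(cap >= 4);
}
```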
enum
Source
Initialize a new workspace with cargo init --lib.
pub enum Color {
    Red,
    Green,
    Blue,
}

#[unsafe(no_mangle)]
pub fn u32_to_color(value: u32) -> Color {
    match value {
        0 => Color::Red,
        1 => Color::Green,
        2 => Color::Blue,
        _ => Color::Red,
    }
}

#[unsafe(no_mangle)]
pub fn simple_enum_match(color: Color) -> u8 {
    match color {
        Color::Red => 0,
        Color::Green => 1,
        Color::Blue => 2,
    }
}

pub enum BasicShape {
    Circle(i32),
    Point,
}

#[unsafe(no_mangle)]
pub fn basic_shape_match(shape: BasicShape) -> i32 {
    match shape {
        BasicShape::Circle(radius) => radius * radius,
        BasicShape::Point => 0,
    }
}

#[unsafe(no_mangle)]
pub fn make_basic_circle(radius: i32) -> BasicShape {
    BasicShape::Circle(radius)
}

#[unsafe(no_mangle)]
pub fn make_basic_point() -> BasicShape {
    BasicShape::Point
}

pub enum Shape {
    Circle(i32),
    Rectangle(i32, i32),
    Point,
}

#[unsafe(no_mangle)]
pub fn enum_with_data_match(shape: Shape) -> i32 {
    match shape {
        Shape::Circle(radius) => radius * radius,
        Shape::Rectangle(width, height) => width * height,
        Shape::Point => 0,
    }
}

#[unsafe(no_mangle)]
pub fn make_circle(radius: i32) -> Shape {
    Shape::Circle(radius)
}

#[unsafe(no_mangle)]
pub fn make_rectangle(w: i32, h: i32) -> Shape {
    Shape::Rectangle(w, h)
}

#[unsafe(no_mangle)]
pub fn make_point() -> Shape {
    Shape::Point
}
The enum type is described in the official docs in detail. We are using no_mangle to simplify things; the reason can be found here. The source code is split into 3 parts:
- Unit-only enum
- Data-carrying enum (largest variant: tuple with 1 field)
- Data-carrying enum (largest variant: tuple with 2 fields)
Note: the unit-only and data-carrying types were chosen for this analysis because they are the most common ones.
Build
$ cargo rustc --release -- --emit obj
Ghidra
Load the .o file (located at target/aarch64-unknown-linux-musl/release/deps/) into Ghidra and auto-analyze it.
Layout
This description provides a good overview of the enum types and their layouts (even the ones we will not discuss, such as the empty enum and the enum with a single variant). enums are also called tagged unions and their layout is unspecified unless you use #[repr(...)].
Still, we will see that in our examples an enum is represented either by a discriminant only or by a discriminant plus the data/payload.
Unit-only enum
pub enum Color {
    Red,
    Green,
    Blue,
}

#[unsafe(no_mangle)]
pub fn u32_to_color(value: u32) -> Color {
    match value {
        0 => Color::Red,
        1 => Color::Green,
        2 => Color::Blue,
        _ => Color::Red,
    }
}

#[unsafe(no_mangle)]
pub fn simple_enum_match(color: Color) -> u8 {
    match color {
        Color::Red => 0,
        Color::Green => 1,
        Color::Blue => 2,
    }
}
Since the layout is unspecified, there is no fixed size for the enum, although the compiler will typically choose the smallest representation possible. In this case the discriminant fits into a single byte (u8 or i8).
$ cargo rustc --release --quiet -- -Z print-type-sizes
print-type-size type: `Color`: 1 bytes, alignment: 1 bytes
print-type-size discriminant: 1 bytes
print-type-size variant `Red`: 0 bytes
print-type-size variant `Green`: 0 bytes
print-type-size variant `Blue`: 0 bytes
If we check simple_enum_match(), we can see that it consists of a single ret: the input value (0/1/2) arrives in w0 and the result is also returned in w0, so no instruction is needed. This means that the enum uses the same values (0, 1 and 2) for its discriminants. u32_to_color() behaves the same, with the difference that it also handles the default case (every other input is mapped to Red, i.e. 0), which makes the code a bit more complicated.
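The discriminant values can also be cross-checked from Rust itself. The following is a minimal sketch that repeats Color and casts each variant to u8; the helper name color_facts is hypothetical, and the reported size is what current compilers choose, not a layout guarantee.

```rust
use std::mem::size_of;

// Same unit-only enum as in the chapter source.
pub enum Color {
    Red,
    Green,
    Blue,
}

// Hypothetical helper: returns the size of `Color` in bytes and the
// discriminants of its variants, obtained with an `as` cast.
pub fn color_facts() -> (usize, [u8; 3]) {
    (
        size_of::<Color>(),
        [Color::Red as u8, Color::Green as u8, Color::Blue as u8],
    )
}
```

Calling color_facts() is expected to report the same 1-byte size that -Z print-type-sizes printed, together with the discriminants in declaration order.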
Listings:
**************************************************************
* FUNCTION *
**************************************************************
undefined simple_enum_match()
undefined <UNASSIGNED> <RETURN>
simple_enum_match XREF[3]: Entry Point(*), 001000c0(*),
_elfSectionHeaders::00000110(*)
00100014 c0 03 5f d6 ret
**************************************************************
* FUNCTION *
**************************************************************
undefined u32_to_color()
undefined <UNASSIGNED> <RETURN>
u32_to_color XREF[4]: Entry Point(*), 001000ac(*),
_elfSectionHeaders::00000090(*),
_elfSectionHeaders::000000d0(*)
00100000 1f 04 00 71 cmp w0,#0x1
set w8=1 if input==1, else w8=0
00100004 e8 17 9f 1a cset w8,eq
00100008 1f 08 00 71 cmp w0,#0x2
if input==2, keep input, else use w8
0010000c 00 00 88 1a csel w0,w0,w8,eq
00100010 c0 03 5f d6 ret
Data-carrying enum
(largest variant: tuple with 1 field)
pub enum BasicShape {
    Circle(i32),
    Point,
}

#[unsafe(no_mangle)]
pub fn basic_shape_match(shape: BasicShape) -> i32 {
    match shape {
        BasicShape::Circle(radius) => radius * radius,
        BasicShape::Point => 0,
    }
}

#[unsafe(no_mangle)]
pub fn make_basic_circle(radius: i32) -> BasicShape {
    BasicShape::Circle(radius)
}

#[unsafe(no_mangle)]
pub fn make_basic_point() -> BasicShape {
    BasicShape::Point
}
$ cargo rustc --release --quiet -- -Z print-type-sizes
print-type-size type: `BasicShape`: 8 bytes, alignment: 4 bytes
print-type-size discriminant: 4 bytes
print-type-size variant `Circle`: 4 bytes
print-type-size field `.0`: 4 bytes
print-type-size variant `Point`: 0 bytes
When the largest data-carrying variant is a tuple with 1 field, the compiler chooses to pass the discriminant and the data via 2 registers (w0: discriminant, w1: data). A discriminant of 0 means Circle while 1 means Point.
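To make the two-register convention more tangible, here is a small sketch. basic_shape_layout reports the size and alignment that the current compiler picks (not a guarantee), and basic_shape_match_model is a hypothetical, hand-written model of the branchless sequence in the basic_shape_match listing: the square is computed unconditionally and then selected based on bit 0 of the discriminant.

```rust
use std::mem::{align_of, size_of};

// Same enum as in the chapter source.
pub enum BasicShape {
    Circle(i32),
    Point,
}

// Hypothetical helper: (size, alignment) of `BasicShape` in bytes.
pub fn basic_shape_layout() -> (usize, usize) {
    (size_of::<BasicShape>(), align_of::<BasicShape>())
}

// Hypothetical model of the generated code: `discriminant` plays the role
// of w0 and `data` the role of w1.
pub fn basic_shape_match_model(discriminant: u32, data: i32) -> i32 {
    // mul w8, w1, w1 (wraps on overflow, like the release build)
    let squared = data.wrapping_mul(data);
    // tst w0, #0x1 followed by csel w0, wzr, w8, ne
    if discriminant & 1 != 0 { 0 } else { squared }
}
```

For example, basic_shape_match_model(0, 5) corresponds to Circle(5), while any call with discriminant 1 corresponds to Point and ignores the data register.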
Listings:
**************************************************************
* FUNCTION *
**************************************************************
undefined basic_shape_match()
undefined <UNASSIGNED> <RETURN>
basic_shape_match XREF[3]: Entry Point(*), 00100124(*),
_elfSectionHeaders::00000250(*)
0010006c 28 7c 01 1b mul w8,w1,w1
check discriminant
00100070 1f 00 00 72 tst w0,#0x1
if discriminant != 0, return 0, else return w8
00100074 e0 13 88 1a csel w0,wzr,w8,ne
00100078 c0 03 5f d6 ret
**************************************************************
* FUNCTION *
**************************************************************
undefined make_basic_circle()
undefined <UNASSIGNED> <RETURN>
make_basic_circle XREF[3]: Entry Point(*), 00100138(*),
_elfSectionHeaders::00000290(*)
0010007c e1 03 00 2a mov w1,w0
00100080 e0 03 1f 2a mov w0,wzr
00100084 c0 03 5f d6 ret
**************************************************************
* FUNCTION *
**************************************************************
undefined make_basic_point()
undefined <UNASSIGNED> <RETURN>
make_basic_point XREF[3]: Entry Point(*), 0010014c(*),
_elfSectionHeaders::000002d0(*)
00100088 20 00 80 52 mov w0,#0x1
0010008c c0 03 5f d6 ret
Data-carrying enum
(largest variant: tuple with 2 fields)
pub enum Shape {
    Circle(i32),
    Rectangle(i32, i32),
    Point,
}

#[unsafe(no_mangle)]
pub fn enum_with_data_match(shape: Shape) -> i32 {
    match shape {
        Shape::Circle(radius) => radius * radius,
        Shape::Rectangle(width, height) => width * height,
        Shape::Point => 0,
    }
}

#[unsafe(no_mangle)]
pub fn make_circle(radius: i32) -> Shape {
    Shape::Circle(radius)
}

#[unsafe(no_mangle)]
pub fn make_rectangle(w: i32, h: i32) -> Shape {
    Shape::Rectangle(w, h)
}

#[unsafe(no_mangle)]
pub fn make_point() -> Shape {
    Shape::Point
}
$ cargo rustc --release --quiet -- -Z print-type-sizes
print-type-size type: `Shape`: 12 bytes, alignment: 4 bytes
print-type-size discriminant: 4 bytes
print-type-size variant `Rectangle`: 8 bytes
print-type-size field `.0`: 4 bytes
print-type-size field `.1`: 4 bytes
print-type-size variant `Circle`: 4 bytes
print-type-size field `.0`: 4 bytes
print-type-size variant `Point`: 0 bytes
When the largest data-carrying variant is a tuple with 2 fields, the compiler chooses to pass the discriminant and the data via 1 register (x0), which holds a pointer to the enum's memory location. The memory begins with the discriminant (here: offset 0), followed by the fields associated with the active variant (here: starting at offset 4). A discriminant of 0 means Circle, 1 means Rectangle and 2 means Point.
Note: when the enum is returned, the make_* functions write it through x8, which is the indirect result register according to the AAPCS64. It is also explained here:
XR (X8) is a pointer to the memory allocated by the caller for returning the struct.
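The in-memory layout described above can be sketched as follows. This is only a mental model, not how the compiler represents the type internally: the 12 bytes of Shape are viewed as three 32-bit words, and the hypothetical function enum_with_data_match_model mirrors the control flow of the enum_with_data_match listing.

```rust
// Hypothetical model: the 12 bytes of `Shape` viewed as three 32-bit words.
// words[0] is the discriminant (offset 0), words[1] and words[2] are the
// payload (offsets 4 and 8).
pub fn enum_with_data_match_model(words: &[i32; 3]) -> i32 {
    match words[0] {
        // Circle(radius): ldr w8, [x0, #0x4]; mul w0, w8, w8
        0 => words[1].wrapping_mul(words[1]),
        // Rectangle(width, height): ldp w8, w9, [x0, #0x4]; mul w0, w9, w8
        1 => words[1].wrapping_mul(words[2]),
        // Point: mov w0, wzr
        _ => 0,
    }
}
```

With this model, [0, 5, 0] stands for Circle(5), [1, 3, 4] for Rectangle(3, 4), and any array starting with 2 for Point; the payload words of Point are never read.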
Listings:
**************************************************************
* FUNCTION *
**************************************************************
undefined enum_with_data_match()
undefined <UNASSIGNED> <RETURN>
enum_with_data_match XREF[3]: Entry Point(*), 001000d4(*),
_elfSectionHeaders::00000150(*)
00100018 08 00 40 b9 ldr w8,[x0]
0010001c c8 00 00 34 cbz w8,LAB_00100034
00100020 1f 05 00 71 cmp w8,#0x1
00100024 e1 00 00 54 b.ne LAB_00100040
rectangle
00100028 08 a4 40 29 ldp w8,w9,[x0, #0x4]
0010002c 20 7d 08 1b mul w0,w9,w8
00100030 c0 03 5f d6 ret
circle
LAB_00100034 XREF[1]: 0010001c(j)
00100034 08 04 40 b9 ldr w8,[x0, #0x4]
00100038 00 7d 08 1b mul w0,w8,w8
0010003c c0 03 5f d6 ret
point
LAB_00100040 XREF[1]: 00100024(j)
00100040 e0 03 1f 2a mov w0,wzr
00100044 c0 03 5f d6 ret
**************************************************************
* FUNCTION *
**************************************************************
undefined make_circle()
undefined <UNASSIGNED> <RETURN>
make_circle XREF[3]: Entry Point(*), 001000e8(*),
_elfSectionHeaders::00000190(*)
00100048 1f 01 00 29 stp wzr,w0,[x8]
0010004c c0 03 5f d6 ret
**************************************************************
* FUNCTION *
**************************************************************
undefined make_rectangle()
undefined <UNASSIGNED> <RETURN>
make_rectangle XREF[3]: Entry Point(*), 001000fc(*),
_elfSectionHeaders::000001d0(*)
00100050 29 00 80 52 mov w9,#0x1
00100054 00 85 00 29 stp w0,w1,[x8, #0x4]
00100058 09 01 00 b9 str w9,[x8]
0010005c c0 03 5f d6 ret
**************************************************************
* FUNCTION *
**************************************************************
undefined make_point()
undefined <UNASSIGNED> <RETURN>
make_point XREF[3]: Entry Point(*), 00100110(*),
_elfSectionHeaders::00000210(*)
00100060 49 00 80 52 mov w9,#0x2
00100064 09 01 00 b9 str w9,[x8]
00100068 c0 03 5f d6 ret
Option
Source
Initialize a new workspace with cargo init --lib.
#[unsafe(no_mangle)]
pub fn safe_divide(dividend: i32, divisor: i32) -> Option<i32> {
    if divisor == 0 {
        None
    } else {
        Some(dividend / divisor)
    }
}

#[unsafe(no_mangle)]
pub fn process_option(value: Option<i32>) -> i32 {
    match value {
        Some(x) => x * 2,
        None => 0,
    }
}

#[unsafe(no_mangle)]
pub fn process_str_option(value: Option<&str>) -> usize {
    match value {
        Some(s) => s.len(),
        None => 0,
    }
}

#[unsafe(no_mangle)]
pub fn process_box_option(value: Option<Box<i32>>) -> i32 {
    match value {
        Some(boxed) => *boxed,
        None => -1,
    }
}
The Option type is described in the official docs in detail.
no_mangle
In some cases, simple functions such as process_option might be inlined by the compiler. Such functions are not present in the .o file, only in the .rmeta file. The inlining is done based on the information (e.g. function signatures, type information and encoded MIR) available in the .rmeta file. Information about the .rmeta file format can be found here. We want to see the generated code for our sample functions in the .o file, so this optimization is undesirable for us. A possible solution is to use #[unsafe(no_mangle)], which has 2 effects:
- Do not mangle the symbol name.
- Export this symbol.
#[unsafe(no_mangle)] implies that the function is intended to be called from outside of the current compilation unit (e.g. from C code or another Rust crate with a different LTO context). For this reason, it will be present in the .o file.
Without #[unsafe(no_mangle)]:
$ llvm-objdump --syms target/aarch64-unknown-linux-musl/release/deps/*.o
target/aarch64-unknown-linux-musl/release/deps/rust_lab-35c360f17fe9ba7d.o: file format elf64-littleaarch64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 rust_lab.e234cd7f6b439d7e-cgu.0
0000000000000000 l d .text._ZN8rust_lab11safe_divide17h459bac753259da22E 0000000000000000 .text._ZN8rust_lab11safe_divide17h459bac753259da22E
0000000000000000 l .text._ZN8rust_lab11safe_divide17h459bac753259da22E 0000000000000000 $x
0000000000000000 l d .text._ZN8rust_lab18process_box_option17h2a7f2c1809e960d7E 0000000000000000 .text._ZN8rust_lab18process_box_option17h2a7f2c1809e960d7E
0000000000000000 l .text._ZN8rust_lab18process_box_option17h2a7f2c1809e960d7E 0000000000000000 $x
0000000000000000 l d .rodata..Lalloc_f5ffd2fd1476bab43ad89fb40c72d0c5 0000000000000000 .rodata..Lalloc_f5ffd2fd1476bab43ad89fb40c72d0c5
0000000000000000 l .rodata..Lalloc_f5ffd2fd1476bab43ad89fb40c72d0c5 0000000000000000 $d
0000000000000000 l d .data.rel.ro..Lalloc_0ea055d83440e297c58eb113a9bcb2e2 0000000000000000 .data.rel.ro..Lalloc_0ea055d83440e297c58eb113a9bcb2e2
0000000000000000 l .data.rel.ro..Lalloc_0ea055d83440e297c58eb113a9bcb2e2 0000000000000000 $d
0000000000000000 l .comment 0000000000000000 $d
0000000000000000 l .eh_frame 0000000000000000 $d
0000000000000000 g F .text._ZN8rust_lab11safe_divide17h459bac753259da22E 0000000000000040 _ZN8rust_lab11safe_divide17h459bac753259da22E
0000000000000000 *UND* 0000000000000000 _ZN4core9panicking11panic_const24panic_const_div_overflow17h2ce15414ba9ec1bdE
0000000000000000 g F .text._ZN8rust_lab18process_box_option17h2a7f2c1809e960d7E 0000000000000044 _ZN8rust_lab18process_box_option17h2a7f2c1809e960d7E
0000000000000000 *UND* 0000000000000000 _RNvCsdk9DaPZnL1i_7___rustc14___rust_dealloc
With #[unsafe(no_mangle)]:
$ llvm-objdump --syms target/aarch64-unknown-linux-musl/release/deps/*.o
target/aarch64-unknown-linux-musl/release/deps/rust_lab-35c360f17fe9ba7d.o: file format elf64-littleaarch64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 rust_lab.e234cd7f6b439d7e-cgu.0
0000000000000000 l d .text.safe_divide 0000000000000000 .text.safe_divide
0000000000000000 l .text.safe_divide 0000000000000000 $x
0000000000000000 l d .text.process_option 0000000000000000 .text.process_option
0000000000000000 l .text.process_option 0000000000000000 $x
0000000000000000 l d .text.process_str_option 0000000000000000 .text.process_str_option
0000000000000000 l .text.process_str_option 0000000000000000 $x
0000000000000000 l d .text.process_box_option 0000000000000000 .text.process_box_option
0000000000000000 l .text.process_box_option 0000000000000000 $x
0000000000000000 l d .rodata..Lalloc_f5ffd2fd1476bab43ad89fb40c72d0c5 0000000000000000 .rodata..Lalloc_f5ffd2fd1476bab43ad89fb40c72d0c5
0000000000000000 l .rodata..Lalloc_f5ffd2fd1476bab43ad89fb40c72d0c5 0000000000000000 $d
0000000000000000 l d .data.rel.ro..Lalloc_0ea055d83440e297c58eb113a9bcb2e2 0000000000000000 .data.rel.ro..Lalloc_0ea055d83440e297c58eb113a9bcb2e2
0000000000000000 l .data.rel.ro..Lalloc_0ea055d83440e297c58eb113a9bcb2e2 0000000000000000 $d
0000000000000000 l .comment 0000000000000000 $d
0000000000000000 l .eh_frame 0000000000000000 $d
0000000000000000 g F .text.safe_divide 0000000000000040 safe_divide
0000000000000000 *UND* 0000000000000000 _ZN4core9panicking11panic_const24panic_const_div_overflow17h2ce15414ba9ec1bdE
0000000000000000 g F .text.process_option 0000000000000010 process_option
0000000000000000 g F .text.process_str_option 000000000000000c process_str_option
0000000000000000 g F .text.process_box_option 0000000000000044 process_box_option
0000000000000000 *UND* 0000000000000000 _RNvCsdk9DaPZnL1i_7___rustc14___rust_dealloc
Build
$ cargo rustc --release -- --emit obj,mir,llvm-ir
Ghidra
Load the .o file (located at target/aarch64-unknown-linux-musl/release/deps/) into Ghidra and auto-analyze it.
Layout
Option is an enum type which is conceptually a tagged union with a discriminant and data (refer to chapter enum for more information about enums). Without explicit #[repr] attributes the exact memory layout is unspecified, which allows Rust to apply discriminant elision. For common types like references and Box<T>, None is represented using invalid bit patterns (like null pointers, see chapter Null pointer optimization) rather than a separate discriminant field, making Option<T> the same size as T. When a separate discriminant is present and the variant is None, the data value is undefined. We will see this in the generated code, but it is documented as well.
safe_divide
#[unsafe(no_mangle)]
pub fn safe_divide(dividend: i32, divisor: i32) -> Option<i32> {
    if divisor == 0 {
        None
    } else {
        Some(dividend / divisor)
    }
}
The generated assembly is straightforward; there is only one piece of background information we need to fully understand it. The compiler automatically generates a check that panics if the division would overflow. In case of i32, there is only one such scenario: the dividend is i32::MIN and the divisor is -1. This can already be seen in the MIR:
fn safe_divide(_1: i32, _2: i32) -> Option<i32> {
debug dividend => _1;
debug divisor => _2;
let mut _0: std::option::Option<i32>;
let mut _3: bool;
let mut _4: i32;
let mut _5: bool;
let mut _6: bool;
let mut _7: bool;
bb0: {
_3 = Eq(copy _2, const 0_i32);
switchInt(move _2) -> [0: bb1, otherwise: bb2];
}
bb1: {
_0 = const Option::<i32>::None;
goto -> bb5;
}
bb2: {
StorageLive(_4);
assert(!copy _3, "attempt to divide `{}` by zero", copy _1) -> [success: bb3, unwind continue];
}
bb3: {
_5 = Eq(copy _2, const -1_i32);
_6 = Eq(copy _1, const i32::MIN);
_7 = BitAnd(move _5, move _6);
assert(!move _7, "attempt to compute `{} / {}`, which would overflow", copy _1, copy _2) -> [success: bb4, unwind continue];
}
bb4: {
_4 = Div(copy _1, copy _2);
_0 = Option::<i32>::Some(move _4);
StorageDead(_4);
goto -> bb5;
}
bb5: {
return;
}
}
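The overflow case checked in bb3 exists because the mathematical result of i32::MIN / -1 is 2^31, which is one more than i32::MAX. As a minimal sketch, the standard library method checked_div reports both special cases (division by zero and overflow) as None instead of panicking; the wrapper name checked_divide is hypothetical and is not part of the chapter source.

```rust
// Hypothetical variant of `safe_divide`: `checked_div` returns `None` both
// for a zero divisor and for the overflowing case `i32::MIN / -1`.
pub fn checked_divide(dividend: i32, divisor: i32) -> Option<i32> {
    dividend.checked_div(divisor)
}

// Hypothetical helper: the wrapping result of the overflowing division.
pub fn wrapping_min_div_minus_one() -> i32 {
    i32::MIN.wrapping_div(-1)
}
```

Unlike safe_divide, this variant never reaches the panic path, so the call to panic_const_div_overflow seen in the listing of safe_divide would not be expected for it.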
The result is returned using 2 registers: w0 stores the discriminant (None: 0, Some: 1) and w1 stores the data.
Listing:
**************************************************************
* FUNCTION *
**************************************************************
undefined safe_divide()
undefined <UNASSIGNED> <RETURN>
undefined8 Stack[-0x10]:8 local_10 XREF[1]: 0010002c(W)
check if divisor is 0
safe_divide XREF[4]: Entry Point(*), 001000e4(*),
_elfSectionHeaders::00000090(*),
_elfSectionHeaders::000000d0(*)
00100000 21 01 00 34 cbz w1,LAB_00100024
i32::MIN
00100004 08 00 b0 52 mov w8,#0x80000000
check if dividend is i32::MIN
00100008 1f 00 08 6b cmp w0,w8
0010000c 61 00 00 54 b.ne LAB_00100018
check if divisor is -1
00100010 3f 04 00 31 cmn w1,#0x1
00100014 c0 00 00 54 b.eq LAB_0010002c
LAB_00100018 XREF[1]: 0010000c(j)
00100018 01 0c c1 1a sdiv w1,w0,w1
Some
0010001c 20 00 80 52 mov w0,#0x1
00100020 c0 03 5f d6 ret
None
LAB_00100024 XREF[1]: 00100000(j)
00100024 e0 03 1f 2a mov w0,wzr
00100028 c0 03 5f d6 ret
LAB_0010002c XREF[1]: 00100014(j)
0010002c fd 7b bf a9 stp x29,x30,[sp, #local_10]!
00100030 fd 03 00 91 mov x29,sp
00100034 00 00 00 90 adrp x0,0x100000
00100038 00 c0 02 91 add x0=>PTR_DAT_001000b0,x0,#0xb0 = 001000a0
0010003c f1 03 00 94 bl <EXTERNAL>::core::panicking::panic_const::pani undefined panic_const_div_overfl
process_option
#[unsafe(no_mangle)]
pub fn process_option(value: Option<i32>) -> i32 {
    match value {
        Some(x) => x * 2,
        None => 0,
    }
}
We can see the same pattern (w0: discriminant, w1: data) when processing an Option passed to our function.
Listing:
**************************************************************
* FUNCTION *
**************************************************************
undefined process_option()
undefined <UNASSIGNED> <RETURN>
multiply by 2
process_option XREF[3]: Entry Point(*), 00100100(*),
_elfSectionHeaders::00000150(*)
00100040 28 78 1f 53 lsl w8,w1,#0x1
check discriminant: Z flag = 1 if None, Z flag = 0 if Some
00100044 1f 00 00 72 tst w0,#0x1
if Z=0 (Some): return w8, if Z=1 (None): return wzr
00100048 00 11 9f 1a csel w0,w8,wzr,ne
0010004c c0 03 5f d6 ret
Null pointer optimization
There are some cases where the discriminant is omitted due to optimizations. The general rule is that the null pointer optimization can be used for an Option<T> whose T can never be null. Examples include:
- Option<&str>
- Option<Box<i32>>
By the safety guarantees of safe Rust, a &str always points to a valid location and a Box<T> always points to a valid heap allocation. This enables the compiler to use further optimizations, for example dropping the discriminant field and using a null value to represent the None variant.
While tracing the different compilation steps, we can see that the discriminant is present in the MIR but not in the LLVM IR. This means the null pointer optimization happens while lowering MIR to LLVM IR.
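The effect of the optimization on type sizes can be observed directly. The sketch below (helper names are hypothetical) compares the size of a few Option types with the size of the wrapped type, and reads the bit pattern of a None of type Option<Box<i32>>. It relies on the representation guarantees documented for Option<Box<T>>; the size of Option<i32> is what current compilers produce, not a guarantee.

```rust
use std::mem::{size_of, transmute};

// Sizes (in bytes) of `Option<i32>`, `Option<&str>` and `Option<Box<i32>>`,
// each paired with the size of the wrapped type.
pub fn option_sizes() -> [(usize, usize); 3] {
    [
        (size_of::<Option<i32>>(), size_of::<i32>()),
        (size_of::<Option<&str>>(), size_of::<&str>()),
        (size_of::<Option<Box<i32>>>(), size_of::<Box<i32>>()),
    ]
}

// Bit pattern of `None` for `Option<Box<i32>>`, read as an integer.
pub fn none_box_bits() -> usize {
    let none: Option<Box<i32>> = None;
    // The standard library documents `Option<Box<T>>` as having the same
    // size as `Box<T>` and `None` as being represented by all-zero bytes.
    unsafe { transmute::<Option<Box<i32>>, usize>(none) }
}
```

Only Option<i32> needs extra space for a discriminant, because every bit pattern of i32 is a valid value; the two pointer-based types reuse the null pattern for None.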
process_str_option
#[unsafe(no_mangle)]
pub fn process_str_option(value: Option<&str>) -> usize {
    match value {
        Some(s) => s.len(),
        None => 0,
    }
}
Looking at the MIR, we can see that it extracts and checks the discriminant:
...
_2 = discriminant(_1);
switchInt(move _2) -> [0: bb2, 1: bb3, otherwise: bb1];
...
In the LLVM IR this has been simplified and replaced with a null check. If the pointer is null, 0 is returned; otherwise the length of the referenced string is returned. (A &str consists of 2 values: a pointer and a length.)
; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable
define noundef i64 @process_str_option(ptr noalias noundef readonly align 1 %0, i64 %1) unnamed_addr #1 {
start:
%.not = icmp eq ptr %0, null
%. = select i1 %.not, i64 0, i64 %1
ret i64 %.
}
Listing:
**************************************************************
* FUNCTION *
**************************************************************
undefined process_str_option()
undefined <UNASSIGNED> <RETURN>
process_str_option XREF[3]: Entry Point(*), 00100114(*),
_elfSectionHeaders::00000190(*)
00100050 1f 00 00 f1 cmp x0,#0x0
00100054 e0 03 81 9a csel x0,xzr,x1,eq
00100058 c0 03 5f d6 ret
Full MIR for reference:
fn process_str_option(_1: Option<&str>) -> usize {
debug value => _1;
let mut _0: usize;
let mut _2: isize;
let _3: &str;
scope 1 {
debug s => _3;
scope 2 (inlined core::str::<impl str>::len) {
let _4: &[u8];
scope 3 (inlined core::str::<impl str>::as_bytes) {
}
}
}
bb0: {
_2 = discriminant(_1);
switchInt(move _2) -> [0: bb2, 1: bb3, otherwise: bb1];
}
bb1: {
unreachable;
}
bb2: {
_0 = const 0_usize;
goto -> bb4;
}
bb3: {
_3 = copy ((_1 as Some).0: &str);
StorageLive(_4);
_4 = copy _3 as &[u8] (Transmute);
_0 = PtrMetadata(copy _4);
StorageDead(_4);
goto -> bb4;
}
bb4: {
return;
}
}
process_box_option
#[unsafe(no_mangle)]
pub fn process_box_option(value: Option<Box<i32>>) -> i32 {
    match value {
        Some(boxed) => *boxed,
        None => -1,
    }
}
A Box<i32> consists of a single pointer to a heap-allocated block and can never be null in safe code. Therefore, the discriminant check can be replaced with a null check. If the pointer is null, -1 is returned. Otherwise, the pointer is dereferenced, the heap block is deallocated and the value is returned.
Listing:
**************************************************************
* FUNCTION *
**************************************************************
undefined process_box_option()
undefined <UNASSIGNED> <RETURN>
undefined8 Stack[-0x10]:8 local_10 XREF[3]: 00100060(W),
00100080(R),
00100094(R)
undefined8 Stack[-0x20]:8 local_20 XREF[3]: 0010005c(W),
00100084(*),
00100098(*)
process_box_option XREF[3]: Entry Point(*), 00100128(*),
_elfSectionHeaders::000001d0(*)
0010005c fd 7b be a9 stp x29,x30,[sp, #local_20]!
00100060 f3 0b 00 f9 str x19,[sp, #local_10]
00100064 fd 03 00 91 mov x29,sp
null check
00100068 20 01 00 b4 cbz x0,LAB_0010008c
dereference
0010006c 13 00 40 b9 ldr w19,[x0]
00100070 81 00 80 52 mov w1,#0x4
00100074 82 00 80 52 mov w2,#0x4
00100078 e4 03 00 94 bl <EXTERNAL>::__rustc[a3537046f032bc96]::__rust_ undefined __rust_dealloc()
0010007c e0 03 13 2a mov w0,w19
00100080 f3 0b 40 f9 ldr x19,[sp, #local_10]
00100084 fd 7b c2 a8 ldp x29=>local_20,x30,[sp], #0x20
00100088 c0 03 5f d6 ret
LAB_0010008c XREF[1]: 00100068(j)
0010008c 13 00 80 12 mov w19,#0xffffffff
00100090 e0 03 13 2a mov w0,w19
00100094 f3 0b 40 f9 ldr x19,[sp, #local_10]
00100098 fd 7b c2 a8 ldp x29=>local_20,x30,[sp], #0x20
0010009c c0 03 5f d6 ret
Result
Source
Initialize a new workspace with cargo init --lib.
pub enum MathError {
    DivisionByZero,
    Overflow,
}

#[unsafe(no_mangle)]
pub fn divide_with_str_error(dividend: i32, divisor: i32) -> Result<i32, &'static str> {
    if divisor == 0 {
        Err("Division by zero")
    } else {
        Ok(dividend / divisor)
    }
}

#[unsafe(no_mangle)]
pub fn divide_with_enum_error(a: i32, b: i32) -> Result<i32, MathError> {
    if b == 0 {
        return Err(MathError::DivisionByZero);
    }
    if a == i32::MIN && b == -1 {
        return Err(MathError::Overflow);
    }
    Ok(a / b)
}

#[unsafe(no_mangle)]
pub fn process_result_str_error(value: Result<i32, &str>) -> i32 {
    match value {
        Ok(x) => x * 2,
        Err(_) => -1,
    }
}

#[unsafe(no_mangle)]
pub fn process_result_box_enum(value: Result<Box<i32>, MathError>) -> i32 {
    match value {
        Err(MathError::DivisionByZero) => -1,
        Err(MathError::Overflow) => -2,
        Ok(value) => *value,
    }
}
The Result type is described in the official docs in detail. We are using no_mangle to simplify things; the reason can be found here.
Build
$ cargo rustc --release -- --emit obj
Ghidra
Load the .o file (located at target/aarch64-unknown-linux-musl/release/deps/) into Ghidra and auto-analyze it.
Layout
Result is an enum type which is conceptually a tagged union with a discriminant and data (refer to chapter enum for more information about enums). The layout is unspecified, which, similarly to Option, enables certain niche optimizations.
divide_with_str_error
#[unsafe(no_mangle)]
pub fn divide_with_str_error(dividend: i32, divisor: i32) -> Result<i32, &'static str> {
    if divisor == 0 {
        Err("Division by zero")
    } else {
        Ok(dividend / divisor)
    }
}
If we look at the layout, we see that there is no discriminant field. Since the compiler knows that references are never null, it can use a null value in the pointer slot (the first 8 bytes) as the discriminant: null means Ok, anything else means Err. &str is a fat reference (16 bytes: pointer + length), therefore variant Err is 16 bytes. The Ok value is stored after those first 8 bytes, which print-type-sizes reports as padding.
$ cargo rustc --release --quiet -- -Z print-type-sizes
...
print-type-size type: `std::result::Result<i32, &str>`: 16 bytes, alignment: 8 bytes
print-type-size variant `Err`: 16 bytes
print-type-size field `.0`: 16 bytes
print-type-size variant `Ok`: 12 bytes
print-type-size padding: 8 bytes
print-type-size field `.0`: 4 bytes, alignment: 4 bytes
...
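The consequence of reusing the null pointer as a discriminant is that the Result is no larger than its biggest variant. A minimal sketch to confirm this on the current compiler (the helper name result_str_sizes is hypothetical, and since the layout is unspecified the outcome is an observation rather than a guarantee):

```rust
use std::mem::size_of;

// Hypothetical helper: sizes of `Result<i32, &str>` and `&str` in bytes.
pub fn result_str_sizes() -> (usize, usize) {
    (
        size_of::<Result<i32, &'static str>>(),
        size_of::<&'static str>(),
    )
}
```

On a 64-bit target such as aarch64-unknown-linux-musl, both numbers are expected to be 16, matching the print-type-sizes output above.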
Before looking at the assembly, we must mention that the compiler automatically generates a check that causes a panic if the result would overflow. For i32, there is only one such scenario: the dividend is i32::MIN and the divisor is -1.
Listing:
**************************************************************
* FUNCTION *
**************************************************************
undefined divide_with_str_error()
undefined <UNASSIGNED> <RETURN>
undefined8 Stack[-0x10]:8 local_10 XREF[1]: 0010003c(W)
check if divisor is 0
divide_with_str_error XREF[4]: Entry Point(*), 0010015c(*),
_elfSectionHeaders::00000090(*),
_elfSectionHeaders::000000d0(*)
00100000 41 01 00 34 cbz w1,LAB_00100028
i32::MIN
00100004 09 00 b0 52 mov w9,#0x80000000
check if dividend is i32::MIN
00100008 1f 00 09 6b cmp w0,w9
0010000c 61 00 00 54 b.ne LAB_00100018
check if divisor is -1
00100010 3f 04 00 31 cmn w1,#0x1
00100014 40 01 00 54 b.eq LAB_0010003c
LAB_00100018 XREF[1]: 0010000c(j)
00100018 09 0c c1 1a sdiv w9,w0,w1
Ok
0010001c 1f 01 00 f9 str xzr,[x8]
00100020 09 09 00 b9 str w9,[x8, #0x8]
00100024 c0 03 5f d6 ret
LAB_00100028 XREF[1]: 00100000(j)
00100028 09 00 00 90 adrp x9,0x100000
0010002c 29 21 04 91 add x9,x9,#0x108
length of "Division by zero"
00100030 0a 02 80 52 mov w10,#0x10
Err
00100034 09 29 00 a9 stp x9=>s_Division_by_zero_00100108,x10,[x8] = "Division by zero"
00100038 c0 03 5f d6 ret
LAB_0010003c XREF[1]: 00100014(j)
0010003c fd 7b bf a9 stp x29,x30,[sp, #local_10]!
00100040 fd 03 00 91 mov x29,sp
00100044 00 00 00 90 adrp x0,0x100000
00100048 00 a0 04 91 add x0=>PTR_DAT_00100128,x0,#0x128 = 00100118
0010004c ed 03 00 94 bl <EXTERNAL>::core::panicking::panic_const::pani undefined panic_const_div_overfl
divide_with_enum_error
pub enum MathError {
    DivisionByZero,
    Overflow,
}

#[unsafe(no_mangle)]
pub fn divide_with_enum_error(a: i32, b: i32) -> Result<i32, MathError> {
    if b == 0 {
        return Err(MathError::DivisionByZero);
    }
    if a == i32::MIN && b == -1 {
        return Err(MathError::Overflow);
    }
    Ok(a / b)
}
Here we have a 1-byte discriminant which is followed either by the value (Ok, after 3 bytes of padding) or by the error type (Err, in the next byte).
$ cargo rustc --release --quiet -- -Z print-type-sizes
...
print-type-size type: `std::result::Result<i32, MathError>`: 8 bytes, alignment: 4 bytes
print-type-size discriminant: 1 bytes
print-type-size variant `Ok`: 7 bytes
print-type-size padding: 3 bytes
print-type-size field `.0`: 4 bytes, alignment: 4 bytes
print-type-size variant `Err`: 1 bytes
print-type-size field `.0`: 1 bytes
...
Listing:
**************************************************************
* FUNCTION *
**************************************************************
undefined divide_with_enum_error()
undefined <UNASSIGNED> <RETURN>
check if divisor is 0
divide_with_enum_error XREF[3]: Entry Point(*), 0010017c(*),
_elfSectionHeaders::00000150(*)
00100050 41 01 00 34 cbz w1,LAB_00100078
i32::MIN
00100054 08 00 b0 52 mov w8,#0x80000000
check if dividend is i32::MIN
00100058 1f 00 08 6b cmp w0,w8
0010005c 41 01 00 54 b.ne LAB_00100084
check if divisor is -1
00100060 3f 04 00 31 cmn w1,#0x1
00100064 01 01 00 54 b.ne LAB_00100084
Overflow
00100068 08 20 80 52 mov w8,#0x100
Err
0010006c 29 00 80 52 mov w9,#0x1
Err(Overflow) encoded
00100070 00 01 09 aa orr x0,x8,x9
00100074 c0 03 5f d6 ret
Err
LAB_00100078 XREF[1]: 00100050(j)
00100078 29 00 80 52 mov w9,#0x1
Err(DivisionByZero) encoded
0010007c e0 03 09 aa mov x0,x9
00100080 c0 03 5f d6 ret
LAB_00100084 XREF[2]: 0010005c(j), 00100064(j)
00100084 08 0c c1 1a sdiv w8,w0,w1
Ok (value is upper 4 bytes)
00100088 08 7d 60 d3 lsl x8,x8,#0x20
0010008c 00 01 1f aa orr x0,x8,xzr
00100090 c0 03 5f d6 ret
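The whole 8-byte Result fits into a single register, so the function builds the return value directly in x0. The following sketch is a hypothetical model of that encoding (pack_result is not part of the chapter source); it reproduces the constants seen in the listing above.

```rust
// Same error type as in the chapter source.
pub enum MathError {
    DivisionByZero,
    Overflow,
}

// Hypothetical model of the value `divide_with_enum_error` leaves in x0:
// byte 0 holds the `Result` discriminant (0 = Ok, 1 = Err), byte 1 holds
// the `MathError` discriminant and bits 32..64 hold the `Ok` payload.
pub fn pack_result(result: Result<i32, MathError>) -> u64 {
    match result {
        // sdiv w8, w0, w1; lsl x8, x8, #0x20; orr x0, x8, xzr
        Ok(value) => (value as u32 as u64) << 32,
        // mov w9, #0x1; mov x0, x9
        Err(MathError::DivisionByZero) => 0x001,
        // mov w8, #0x100; mov w9, #0x1; orr x0, x8, x9
        Err(MathError::Overflow) => 0x101,
    }
}
```

The intermediate cast to u32 mirrors the fact that writing a w register zero-extends into the corresponding x register, so a negative quotient does not spill into the discriminant bytes.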
process_result_str_error
#[unsafe(no_mangle)]
pub fn process_result_str_error(value: Result<i32, &str>) -> i32 {
    match value {
        Ok(x) => x * 2,
        Err(_) => -1,
    }
}
The layout of the Result is the same as in divide_with_str_error. If the first 8 bytes are 0, the variant is Ok and the value is multiplied by 2 and returned; otherwise -1 is returned (wzr inverted: 0xffffffff).
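The csinv instruction used for this may be less familiar than csel, so here is a hypothetical Rust model of it together with a model of the whole function. Both function names are made up for illustration; first_word stands for the first 8 bytes of the Result and value for the i32 stored at offset 8.

```rust
// Hypothetical model of `csinv wd, wn, wm, cond`: pick `a` when the
// condition holds, otherwise the bitwise inverse of `b`.
pub fn csinv(condition: bool, a: u32, b: u32) -> u32 {
    if condition { a } else { !b }
}

// Hypothetical model of `process_result_str_error`.
pub fn process_result_str_error_model(first_word: u64, value: i32) -> i32 {
    // lsl w8, w8, #0x1
    let doubled = (value as u32) << 1;
    // cmp x9, #0x0; csinv w0, w8, wzr, eq
    csinv(first_word == 0, doubled, 0) as i32
}
```

Inverting the zero register yields all ones in 32 bits, which is -1 when interpreted as an i32, so no extra instruction is needed to materialize the constant.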
Listing:
**************************************************************
* FUNCTION *
**************************************************************
undefined process_result_str_error()
undefined <UNASSIGNED> <RETURN>
process_result_str_error XREF[3]: Entry Point(*), 00100190(*),
_elfSectionHeaders::00000190(*)
00100094 08 08 40 b9 ldr w8,[x0, #0x8]
00100098 09 00 40 f9 ldr x9,[x0]
0010009c 08 79 1f 53 lsl w8,w8,#0x1
001000a0 3f 01 00 f1 cmp x9,#0x0
001000a4 00 01 9f 5a csinv w0,w8,wzr,eq
001000a8 c0 03 5f d6 ret
process_result_box_enum
pub enum MathError {
    DivisionByZero,
    Overflow,
}

#[unsafe(no_mangle)]
pub fn process_result_box_enum(value: Result<Box<i32>, MathError>) -> i32 {
    match value {
        Err(MathError::DivisionByZero) => -1,
        Err(MathError::Overflow) => -2,
        Ok(value) => *value,
    }
}
Box<T> is a thin pointer to a heap location; the memory is automatically deallocated via __rust_dealloc when the Box is dropped. For more information refer to chapter Box. The layout is similar to what we saw in the case of divide_with_enum_error. The difference is that the size of the Ok data is 8 bytes instead of 4.
$ cargo rustc --release --quiet -- -Z print-type-sizes
...
print-type-size type: `std::result::Result<std::boxed::Box<i32>, MathError>`: 16 bytes, alignment: 8 bytes
print-type-size discriminant: 1 bytes
print-type-size variant `Ok`: 15 bytes
print-type-size padding: 7 bytes
print-type-size field `.0`: 8 bytes, alignment: 8 bytes
print-type-size variant `Err`: 1 bytes
print-type-size field `.0`: 1 bytes
...
Listing:
**************************************************************
* FUNCTION *
**************************************************************
undefined process_result_box_enum()
undefined <UNASSIGNED> <RETURN>
undefined8 Stack[-0x10]:8 local_10 XREF[3]: 001000b0(W),
001000d8(R),
001000fc(R)
undefined8 Stack[-0x20]:8 local_20 XREF[3]: 001000ac(W),
001000dc(*),
00100100(*)
process_result_box_enum XREF[3]: Entry Point(*), 001001a4(*),
_elfSectionHeaders::000001d0(*)
001000ac fd 7b be a9 stp x29,x30,[sp, #local_20]!
001000b0 f3 0b 00 f9 str x19,[sp, #local_10]
001000b4 fd 03 00 91 mov x29,sp
load discriminant
001000b8 08 00 40 39 ldrb w8,[x0]
Err
001000bc 1f 05 00 71 cmp w8,#0x1
001000c0 21 01 00 54 b.ne LAB_001000e4
001000c4 08 04 40 39 ldrb w8,[x0, #0x1]
DivisionByZero
001000c8 1f 01 00 71 cmp w8,#0x0
001000cc 28 00 80 12 mov w8,#0xfffffffe
if DivisionByZero: ret val is 0xfffffffe + 1 = -1,
otherwise (Overflow): 0xfffffffe = -2
001000d0 13 15 88 1a cinc w19,w8,eq
001000d4 e0 03 13 2a mov w0,w19
001000d8 f3 0b 40 f9 ldr x19,[sp, #local_10]
001000dc fd 7b c2 a8 ldp x29=>local_20,x30,[sp], #0x20
001000e0 c0 03 5f d6 ret
arg0: ptr
LAB_001000e4 XREF[1]: 001000c0(j)
001000e4 00 04 40 f9 ldr x0,[x0, #0x8]
arg1: size
001000e8 81 00 80 52 mov w1,#0x4
arg2: align
001000ec 82 00 80 52 mov w2,#0x4
save value
001000f0 13 00 40 b9 ldr w19,[x0]
001000f4 c5 03 00 94 bl <EXTERNAL>::__rustc[eb192786f4da5ea1]::__rust_ undefined __rust_dealloc()
001000f8 e0 03 13 2a mov w0,w19
001000fc f3 0b 40 f9 ldr x19,[sp, #local_10]
00100100 fd 7b c2 a8 ldp x29=>local_20,x30,[sp], #0x20
00100104 c0 03 5f d6 ret
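The branchless Err path in the listing above can be modelled in Rust. This is only a sketch: cinc and err_return_value are hypothetical names, and it assumes (as the listing comments suggest) that DivisionByZero is encoded as 0 and Overflow as 1.

```rust
/// Models `cinc wd, wn, cond`: returns `base + 1` if the condition holds,
/// otherwise `base` unchanged.
fn cinc(base: u32, condition: bool) -> u32 {
    if condition { base.wrapping_add(1) } else { base }
}

/// Hypothetical model of the Err path. `err_discriminant` stands for the
/// byte loaded from offset 1 of the Result.
fn err_return_value(err_discriminant: u8) -> i32 {
    let base: u32 = 0xffff_fffe; // mov w8, #0xfffffffe
    // cmp w8, #0x0 followed by cinc w19, w8, eq
    cinc(base, err_discriminant == 0) as i32
}

fn main() {
    println!("DivisionByZero -> {}", err_return_value(0));
    println!("Overflow       -> {}", err_return_value(1));
}
```

Selecting between two adjacent constants this way avoids a second branch in the generated code.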
Declarative macros
Source
Initialize a new workspace with cargo init --lib
.
pub fn declarative_macro_vec_empty() -> Vec<i32> {
    let vec: Vec<i32> = vec![];
    vec
}

pub fn declarative_macro_vec_repeat() -> Vec<i32> {
    let vec: Vec<i32> = vec![1; 10];
    vec
}

pub fn declarative_macro_vec_list() -> Vec<i32> {
    let vec: Vec<i32> = vec![1, 2, 3];
    vec
}
Declarative macros are defined with the macro_rules! language construct and work through pattern matching on the syntax tree. The implementation handling the macro compilation can be found here and here.
Since declarative macros can be expanded to arbitrary Rust code based on the implementation of the macro, we will focus on the expanded Rust code rather than the generated binary files in this chapter.
If we look at the vec! macro as an example, we see that it has three arms:
- the first arm matches vec![]
- the second arm matches vec![1; 10]
- the third arm matches vec![1, 2, 3]
macro_rules! vec {
    () => (
        $crate::vec::Vec::new()
    );
    ($elem:expr; $n:expr) => (
        $crate::vec::from_elem($elem, $n)
    );
    ($($x:expr),+ $(,)?) => (
        <[_]>::into_vec(
            // Using the intrinsic produces a dramatic improvement in stack usage for
            // unoptimized programs using this code path to construct large Vecs.
            $crate::boxed::box_new([$($x),+])
        )
    );
}
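To see how the three arms are selected, here is a minimal sketch of a macro with the same matchers. my_vec! is a hypothetical macro written for illustration; it uses only stable APIs (Vec::resize and Box::new) in place of the internal from_elem and box_new, so it is not the standard library's implementation.

```rust
// `my_vec!` is a hypothetical stand-in for `vec!`, not the std implementation.
macro_rules! my_vec {
    // Arm 1: matches an empty invocation.
    () => {
        Vec::new()
    };
    // Arm 2: matches `element; count`.
    ($elem:expr; $n:expr) => {{
        let mut v = Vec::new();
        v.resize($n, $elem);
        v
    }};
    // Arm 3: matches a comma-separated list with an optional trailing comma.
    ($($x:expr),+ $(,)?) => {
        <[_]>::into_vec(Box::new([$($x),+]))
    };
}

fn main() {
    let empty: Vec<i32> = my_vec![];
    let repeat: Vec<i32> = my_vec![1; 10];
    let list: Vec<i32> = my_vec![1, 2, 3];
    println!("{:?} {:?} {:?}", empty, repeat, list);
}
```

The arms are tried in order, so an invocation such as my_vec![1, 2, 3] first fails to match the second arm (a comma follows the first expression instead of a semicolon) before the third arm accepts it.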
declarative_macro_vec_empty
pub fn declarative_macro_vec_empty() -> Vec<i32> {
    let vec: Vec<i32> = vec![];
    vec
}
$ cargo expand
...
pub fn declarative_macro_vec_empty() -> Vec<i32> {
let vec: Vec<i32> = ::alloc::vec::Vec::new();
vec
}
declarative_macro_vec_repeat
pub fn declarative_macro_vec_repeat() -> Vec<i32> {
    let vec: Vec<i32> = vec![1; 10];
    vec
}
$ cargo expand
...
pub fn declarative_macro_vec_repeat() -> Vec<i32> {
let vec: Vec<i32> = ::alloc::vec::from_elem(1, 10);
vec
}
declarative_macro_vec_list
pub fn declarative_macro_vec_list() -> Vec<i32> {
    let vec: Vec<i32> = vec![1, 2, 3];
    vec
}
$ cargo expand
...
pub fn declarative_macro_vec_list() -> Vec<i32> {
let vec: Vec<i32> = <[_]>::into_vec(::alloc::boxed::box_new([1, 2, 3]));
vec
}
Built-in declarative macros
Source
Initialize a new workspace with cargo init --lib
.
pub fn format_args_built_in() {
    println!("Hello, world!");
}
Built-in declarative macros are a special kind of declarative macro. They are marked with #[rustc_builtin_macro] and expanded by an internal expander function. The list of the built-in macros can be found here.
If we look at the format_args! macro as an example, we see that it is marked with #[rustc_builtin_macro] and that its expander function is expand_format_args, which calls expand_format_args_impl.
#[stable(feature = "rust1", since = "1.0.0")]
#[rustc_diagnostic_item = "format_args_macro"]
#[allow_internal_unsafe]
#[allow_internal_unstable(fmt_internals)]
#[rustc_builtin_macro]
#[macro_export]
macro_rules! format_args {
    ($fmt:expr) => {{ /* compiler built-in */ }};
    ($fmt:expr, $($args:tt)*) => {{ /* compiler built-in */ }};
}
format_args_built_in
pub fn format_args_built_in() {
    println!("Hello, world!");
}
Built-in macros are not expanded to Rust code, which means cargo expand cannot show us their expanded form:
$ cargo expand
...
pub fn format_args_built_in() {
{
::std::io::_print(format_args!("Hello, world!\n"));
};
}
Instead, they are expanded in the HIR:
$ cargo rustc --release --quiet -- -Z unpretty=hir
...
fn format_args_built_in() {
{ ::std::io::_print(format_arguments::new_const(&["Hello, world!\n"])); };
}
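The new_const call in the HIR reflects that the format string has no runtime arguments. A related property is observable from stable Rust through fmt::Arguments::as_str. The sketch below is illustrative only; static_message and greeting are hypothetical helper names.

```rust
use std::fmt;

/// Returns the message if `format_args!` produced only a static string
/// (no runtime arguments to format).
fn static_message() -> Option<&'static str> {
    format_args!("Hello, world!\n").as_str()
}

/// Formats a message that does need a runtime argument.
fn greeting(name: &str) -> String {
    fmt::format(format_args!("Hello, {}!\n", name))
}

fn main() {
    println!("{:?}", static_message());
    print!("{}", greeting("Rust"));
}
```

When as_str returns Some, a caller can use the static string directly without running the formatting machinery.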
Procedural macros
Source
Initialize a new workspace with cargo init --lib
and add tokio
to the dependencies via cargo add tokio --features full
.
#[tokio::main]
async fn proc_macro_main() {}
Procedural macros can manipulate the syntax tree directly. They work with TokenStreams, which are sequences of tokens representing Rust code. A proc macro receives a TokenStream as input and returns a TokenStream as output.
Since proc macros can be expanded to arbitrary Rust code based on the implementation of the macro, we will focus on the expanded Rust code rather than the generated binary files in this chapter.
If we look at the #[tokio::main] attribute proc macro as an example, we see that it expands to code similar to the following:
fn main() {
    tokio::runtime::Builder::new_current_thread()
        .enable_all()
        .unhandled_panic(UnhandledPanic::ShutdownRuntime)
        .build()
        .unwrap()
        .block_on(async {
            let _ = tokio::spawn(async {
                panic!("This panic will shutdown the runtime.");
            }).await;
        })
}
proc_macro_main
#[tokio::main]
async fn proc_macro_main() {}
$ cargo expand
...
fn proc_macro_main() {
let body = async {};
#[allow(
clippy::expect_used,
clippy::diverging_sub_expression,
clippy::needless_return
)]
{
return tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()
.expect("Failed building the Runtime")
.block_on(body);
}
}
Built-in attributes
Source
Initialize a new workspace with cargo init --lib
.
#[derive(Clone)]
#[repr(C)]
pub struct Person {
    name: String,
    age: u32,
}

pub fn attribute_person() -> Person {
    let person = Person {
        name: "Rustacean".to_string(),
        age: 22,
    };
    person
}
Attributes are metadata attached either to the containing item (inner attribute) or to the following item (outer attribute). There are many types of built-in attributes, but from a code generation point of view we only care about the ones that directly influence the generated code, such as derive, repr or inline: derive generates additional code, repr affects the memory layout and inline is passed as a hint to the LLVM backend.
attribute_person
#[derive(Clone)]
#[repr(C)]
pub struct Person {
    name: String,
    age: u32,
}

pub fn attribute_person() -> Person {
    let person = Person {
        name: "Rustacean".to_string(),
        age: 22,
    };
    person
}
$ cargo expand
...
#[repr(C)]
pub struct Person {
name: String,
age: u32,
}
#[automatically_derived]
impl ::core::clone::Clone for Person {
#[inline]
fn clone(&self) -> Person {
Person {
name: ::core::clone::Clone::clone(&self.name),
age: ::core::clone::Clone::clone(&self.age),
}
}
}
pub fn attribute_person() -> Person {
let person = Person {
name: "Rustacean".to_string(),
age: 22,
};
person
}
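The effect of the two attributes can be observed directly. The following sketch is illustrative: person_offsets and clone_keeps_fields are hypothetical helpers, and the offsets rely on repr(C) laying out fields in declaration order.

```rust
use std::mem::offset_of;

#[derive(Clone)]
#[repr(C)]
pub struct Person {
    name: String,
    age: u32,
}

/// Hypothetical helper: byte offsets of `name` and `age` inside `Person`.
pub fn person_offsets() -> (usize, usize) {
    (offset_of!(Person, name), offset_of!(Person, age))
}

/// Hypothetical helper: exercises the `clone` method generated by `derive`.
pub fn clone_keeps_fields() -> bool {
    let person = Person {
        name: "Rustacean".to_string(),
        age: 22,
    };
    let copy = person.clone();
    copy.name == person.name && copy.age == person.age
}

fn main() {
    println!("{:?}", person_offsets());
    println!("{}", clone_keeps_fields());
}
```

With repr(C), name is placed first and age follows immediately after the String; without it, the compiler would be free to reorder the fields.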
Macro attributes
Attributes are metadata either attached to the containing item (inner attribute) or attached to the following item (outer attribute). To extend the built-in attributes, proc macros must be used. This enables implementation of both proc macro attributes and derive macro helper attributes.
Introduction
Source
Initialize a new workspace with cargo init
then run cargo add tokio --features full
.
use tokio::time::{Duration, sleep};

#[tokio::main]
async fn main() {
    make_coffee().await;
    toast_bread().await;
}

async fn make_coffee() {
    sleep(Duration::from_secs(3)).await;
}

async fn toast_bread() {
    sleep(Duration::from_secs(2)).await;
}
The Tokio crate, the async
/await
keywords and the Future
trait are described in many places online. Some of the best are the long and detailed explanations by Jon Gjengset. This book assumes you are already familiar with these async
constructs.
Build
$ cargo rustc --release
Ghidra
Load the ELF file (located at target/aarch64-unknown-linux-musl/release/
) into Ghidra and auto-analyze it.
Locating the async
tasks
The first thing to do is locate the implemented async
tasks and their relationships. To start, we can see in the expanded code how the runtime is constructed and our tasks are passed to block_on
.
$ cargo expand
#![feature(prelude_import)]
#[prelude_import]
use std::prelude::rust_2024::*;
#[macro_use]
extern crate std;
use tokio::time::{Duration, sleep};
fn main() {
    let body = async {
        make_coffee().await;
        toast_bread().await;
    };
    #[allow(
        clippy::expect_used,
        clippy::diverging_sub_expression,
        clippy::needless_return
    )]
    {
        return tokio::runtime::Builder::new_multi_thread()
            .enable_all()
            .build()
            .expect("Failed building the Runtime")
            .block_on(body);
    }
}
async fn make_coffee() {
    sleep(Duration::from_secs(3)).await;
}
async fn toast_bread() {
    sleep(Duration::from_secs(2)).await;
}
Based on this output, we expect to see the runtime built by the new_multi_thread()
, enable_all()
and expect
sequence and the tasks executed by block_on
.
Decompiled code:
/* WARNING: Unknown calling convention: __rustcall */
/* rust_lab::main */
void __rustcall rust_lab::main(void)
{
...
undefined1 auStack_248 [205];
undefined2 local_17b;
...
/* try { // try from 0040a6b4 to 0040a6bb has its CatchHandler @ 0040a90c */
tokio::runtime::builder::Builder::new_multi_thread(auStack_248);
local_17b = 0x101;
/* try { // try from 0040a6cc to 0040a6d7 has its CatchHandler @ 0040a914 */
tokio::runtime::builder::Builder::build(&local_d0,auStack_248);
if (local_d0 == 2) {
local_170[0] = lStack_c8;
/* try { // try from 0040a884 to 0040a8a7 has its CatchHandler @ 0040a91c */
/* WARNING: Subroutine does not return */
core::result::unwrap_failed
("Failed building the Runtime",0x1b,local_170,
&PTR_drop_in_place<std::io::error::Error>_004baf90,&PTR_s_src/main.rs_004bb040);
}
...
The enable_all function call is nowhere to be seen. The reason is that it has been inlined and replaced by local_17b = 0x101;, a single 16-bit store that sets both the enable_io and enable_time fields (starting at offset 205) to true.
$ cargo rustc --release -- -Zprint-type-sizes
...
print-type-size type: `tokio::runtime::Builder`: 216 bytes, alignment: 8 bytes
print-type-size field `.worker_threads`: 16 bytes
print-type-size field `.thread_stack_size`: 16 bytes
print-type-size field `.global_queue_interval`: 8 bytes
print-type-size field `.keep_alive`: 16 bytes
print-type-size field `.thread_name`: 16 bytes
print-type-size field `.nevents`: 8 bytes
print-type-size field `.max_blocking_threads`: 8 bytes
print-type-size field `.after_start`: 16 bytes
print-type-size field `.before_stop`: 16 bytes
print-type-size field `.before_park`: 16 bytes
print-type-size field `.after_unpark`: 16 bytes
print-type-size field `.before_spawn`: 16 bytes
print-type-size field `.after_termination`: 16 bytes
print-type-size field `.seed_generator`: 16 bytes
print-type-size field `.event_interval`: 4 bytes
print-type-size field `.kind`: 1 bytes
print-type-size field `.enable_io`: 1 bytes
print-type-size field `.enable_time`: 1 bytes
print-type-size field `.start_paused`: 1 bytes
print-type-size field `.disable_lifo_slot`: 1 bytes
print-type-size field `.metrics_poll_count_histogram_enable`: 1 bytes
print-type-size field `.metrics_poll_count_histogram`: 0 bytes
print-type-size end padding: 6 bytes
...
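The offset and the effect of the 0x101 store can be reproduced with a little arithmetic. This sketch assumes that -Zprint-type-sizes lists the fields in memory order without interior padding, and that the target is little-endian (as AArch64 Linux is); the names below are hypothetical.

```rust
/// Field sizes in the order printed above, from `worker_threads` up to and
/// including `kind` (the field immediately before `enable_io`).
const SIZES_BEFORE_ENABLE_IO: [usize; 16] =
    [16, 16, 8, 16, 16, 8, 8, 16, 16, 16, 16, 16, 16, 16, 4, 1];

/// Offset of `enable_io` under the assumptions stated above.
fn enable_io_offset() -> usize {
    SIZES_BEFORE_ENABLE_IO.iter().sum()
}

/// Splits a little-endian 16-bit store into the two adjacent bool fields
/// (`enable_io`, `enable_time`).
fn flags_from_store(value: u16) -> (bool, bool) {
    let [enable_io, enable_time] = value.to_le_bytes();
    (enable_io != 0, enable_time != 0)
}

fn main() {
    println!("enable_io offset: {}", enable_io_offset());
    println!("flags: {:?}", flags_from_store(0x101));
}
```

Merging two neighbouring one-byte stores into one wider store is a common optimization, which is why neither field assignment appears on its own in the decompiled code.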
Additionally, expect has been inlined: only its failure path, the call to unwrap_failed, remains.
Scrolling through the remaining code in main, we cannot locate block_on, because it is called via enter_runtime:
...
/* try { // try from 0040a6b4 to 0040a6bb has its CatchHandler @ 0040a90c */
tokio::runtime::builder::Builder::new_multi_thread(auStack_248);
local_17b = 0x101;
/* try { // try from 0040a6cc to 0040a6d7 has its CatchHandler @ 0040a914 */
tokio::runtime::builder::Builder::build(&local_d0,auStack_248);
if (local_d0 == 2) {
local_170[0] = lStack_c8;
/* try { // try from 0040a884 to 0040a8a7 has its CatchHandler @ 0040a91c */
/* WARNING: Subroutine does not return */
core::result::unwrap_failed
("Failed building the Runtime",0x1b,local_170,
&PTR_drop_in_place<std::io::error::Error>_004baf90,&PTR_s_src/main.rs_004bb040);
}
...
/* try { // try from 0040a71c to 0040a727 has its CatchHandler @ 0040a8ec */
tokio::runtime::runtime::Runtime::enter(&local_e8,&local_2a0);
if ((int)local_2a0 == 1) {
local_d0 = (ulong)uStack_31f << 8;
/* try { // try from 0040a758 to 0040a76f has its CatchHandler @ 0040a8c8 */
tokio::runtime::context::runtime::enter_runtime
(&uStack_270,1,&local_d0,
&PTR_anon.e686db471eac9d0c22db85cdbc9be48c.37.llvm.11646938216170472302_004bafb0);
}
...
/* WARNING: Unknown calling convention: __rustcall */
/* tokio::runtime::context::runtime::enter_runtime */
void __rustcall
tokio::runtime::context::runtime::enter_runtime
(int *param_1,undefined4 param_2,undefined8 *param_3,undefined8 param_4)
{
...
park::CachedParkThread::block_on(pppuVar5,&local_e0);
...
async
tasks
Now that we have located block_on
, we can investigate how the async
tasks are handled.
Digging deeper and looking at the HIR, we can see how the async and await keywords are resolved. Simply put, the tasks are polled until they are ready. When a task is pending, it yields, which signals to the runtime that it can handle other tasks and check back later.
$ cargo rustc --release --quiet -- -Z unpretty=hir
#[prelude_import]
use std::prelude::rust_2024::*;
#[macro_use]
extern crate std;
use ::{};
use tokio::time::Duration;
use tokio::time::sleep;
fn main() {
let body =
|mut _task_context: ResumeTy|
{
match #[lang = "into_future"](make_coffee()) {
mut __awaitee =>
loop {
match unsafe {
#[lang = "poll"](#[lang = "new_unchecked"](&mut __awaitee),
#[lang = "get_context"](_task_context))
} {
#[lang = "Ready"] { 0: result } => break result,
#[lang = "Pending"] {} => { }
}
_task_context = (yield ());
},
};
match #[lang = "into_future"](toast_bread()) {
mut __awaitee =>
loop {
match unsafe {
#[lang = "poll"](#[lang = "new_unchecked"](&mut __awaitee),
#[lang = "get_context"](_task_context))
} {
#[lang = "Ready"] { 0: result } => break result,
#[lang = "Pending"] {} => { }
}
_task_context = (yield ());
},
};
};
#[allow(clippy :: expect_used, clippy :: diverging_sub_expression, clippy
:: needless_return)]
{
return tokio::runtime::Builder::new_multi_thread().enable_all().build().expect("Failed building the Runtime").block_on(body);
}
}
async fn make_coffee()
->
/*impl Trait*/ |mut _task_context: ResumeTy|
{
{
let _t =
{
match #[lang = "into_future"](sleep(Duration::from_secs(3)))
{
mut __awaitee =>
loop {
match unsafe {
#[lang = "poll"](#[lang = "new_unchecked"](&mut __awaitee),
#[lang = "get_context"](_task_context))
} {
#[lang = "Ready"] { 0: result } => break result,
#[lang = "Pending"] {} => { }
}
_task_context = (yield ());
},
};
};
_t
}
}
async fn toast_bread()
->
/*impl Trait*/ |mut _task_context: ResumeTy|
{
{
let _t =
{
match #[lang = "into_future"](sleep(Duration::from_secs(2)))
{
mut __awaitee =>
loop {
match unsafe {
#[lang = "poll"](#[lang = "new_unchecked"](&mut __awaitee),
#[lang = "get_context"](_task_context))
} {
#[lang = "Ready"] { 0: result } => break result,
#[lang = "Pending"] {} => { }
}
_task_context = (yield ());
},
};
};
_t
}
}
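The loop that the HIR shows for each await can be imitated by hand using only the standard library. The following is a minimal sketch, not how Tokio is implemented: CountDown, NoopWaker and poll_until_ready are hypothetical names, and where this loop simply polls again, a real runtime would park the thread or run other tasks.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future that reports `Pending` a fixed number of times
/// before it becomes ready.
struct CountDown(u32);

impl Future for CountDown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.0 == 0 {
            Poll::Ready("done")
        } else {
            self.0 -= 1;
            Poll::Pending
        }
    }
}

/// A waker that does nothing; sufficient because the loop below polls again
/// unconditionally.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` until it is ready and returns the result together with the
/// number of times it reported `Pending`.
fn poll_until_ready<F: Future>(fut: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut pending = 0;
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(result) => break (result, pending),
            // This is the point where the desugared code yields.
            Poll::Pending => pending += 1,
        }
    }
}

fn main() {
    let (result, pending) = poll_until_ready(CountDown(3));
    println!("{} after {} pending polls", result, pending);
}
```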
Decompiled code:
/* WARNING: Unknown calling convention: __rustcall */
/* tokio::runtime::park::CachedParkThread::block_on */
bool __rustcall tokio::runtime::park::CachedParkThread::block_on(long param_1,char *param_2)
{
...
do {
if ( ... ) {
...
tokio::time::sleep::sleep(3,0,&PTR_s_src/main.rs_004bb080);
...
uVar5 = _<>::poll();
if ((uVar5 & 1) == 0) {
...
tokio::time::sleep::sleep(2,0,&PTR_s_src/main.rs_004bb0b0);
...
goto LAB_0040b2b0;
}
...
joined_r0x0040b25c:
bVar2 = true;
...
}
else { ... }
LAB_0040b2b0:
...
uVar5 = _<>::poll();
if ((uVar5 & 1) != 0) {
...
goto joined_r0x0040b25c;
}
...
if (!bVar2) goto LAB_0040b324;
park(param_1);
...
} while( true );
...
LAB_0040b358:
return lVar6 == 0;
LAB_0040b324:
...
goto LAB_0040b358;
}
Explanation:
- call sleep(3)
- call poll to check the progress
  - if it returns 0 (ready), then call sleep(2)
    - call poll to check the progress
      - if it returns 1 (pending), set bVar2 to true, which means the thread will be parked and resumed later
  - if it returns 1 (pending), set bVar2 to true, which means the thread will be parked and resumed later
Additionally, state variables track the progress of the tasks, so that execution can continue at the right place once the thread is resumed. For readability, these variables have been removed from the decompiled code.
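To illustrate what such progress tracking looks like, here is a hand-written state machine for a two-step task. It is a sketch under stated assumptions: State, Breakfast and resume are hypothetical names, and the counters merely simulate how often each sleep would report pending; the real state layout is generated by the compiler and looks different.

```rust
/// Hypothetical progress marker for a task with two await points.
#[derive(Debug, PartialEq, Clone, Copy)]
enum State {
    Start,
    WaitingForCoffee,
    WaitingForToast,
    Done,
}

struct Breakfast {
    state: State,
    coffee_polls_left: u32, // simulated: pending polls before coffee is ready
    toast_polls_left: u32,  // simulated: pending polls before toast is ready
}

impl Breakfast {
    /// Continues from the stored state. Returns `false` when the task has to
    /// wait (pending) and `true` once it has finished (ready).
    fn resume(&mut self) -> bool {
        loop {
            match self.state {
                State::Start => self.state = State::WaitingForCoffee,
                State::WaitingForCoffee => {
                    if self.coffee_polls_left > 0 {
                        self.coffee_polls_left -= 1;
                        return false;
                    }
                    self.state = State::WaitingForToast;
                }
                State::WaitingForToast => {
                    if self.toast_polls_left > 0 {
                        self.toast_polls_left -= 1;
                        return false;
                    }
                    self.state = State::Done;
                }
                State::Done => return true,
            }
        }
    }
}

fn main() {
    let mut task = Breakfast {
        state: State::Start,
        coffee_polls_left: 1,
        toast_polls_left: 1,
    };
    while !task.resume() {
        println!("pending in state {:?}", task.state);
    }
    println!("finished in state {:?}", task.state);
}
```

Because the state is stored in the task itself, each call to resume skips the steps that have already completed.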
Startup
This section shows how to locate the user-defined main
function and trace the call chain that leads to its execution.
Source
Initialize a new workspace with cargo init
.
The source code of chapter Hello, world!
is reused here.
Build
$ cargo rustc --release
Ghidra
Load the binary into Ghidra and auto-analyze it.
Locating main
In an std
environment (as opposed to no_std
), the user-defined main
function (here rust_lab::main
) is called by lang_start_internal
.
Call graph:
_start
_start_c
__libc_start_main
main
lang_start_internal
rust_lab::main
Decompiled code:
void main(int param_1,undefined8 param_2)
{
code *pcStack_8;
pcStack_8 = rust_lab::main;
std::rt::lang_start_internal(&pcStack_8,&DAT_0046d6e8,(long)param_1,param_2,0);
return;
}
lang_start_internal can be easily recognized, even if symbols are stripped. Its first parameter points to a stack slot that holds the address of the user-defined main function (here rust_lab::main).
Listing:
**************************************************************
* FUNCTION *
**************************************************************
undefined main()
undefined <UNASSIGNED> <RETURN>
undefined8 Stack[-0x10]:8 local_10 XREF[1]: 00401b38(W)
main XREF[5]: Entry Point(*),
_start_c:004019f4(*), 00453c74,
004648c4(*), 0046ff90(*)
00401b28 08 00 00 90 adrp x8,0x401000
adrp + add loads the address of rust_lab::main
00401b2c 08 c1 2b 91 add x8,x8,#0xaf0
arg3: argv
00401b30 e3 03 01 aa mov x3,x1
arg2: argc
00401b34 02 7c 40 93 sxtw x2,w0
store link register and address of rust_lab::main on the stack
00401b38 fe 23 bf a9 stp x30,x8=>rust_lab::main,[sp, #local_10]!
00401b3c 61 03 00 90 adrp x1,0x46d000
arg1: vtable pointer of the trait object
00401b40 21 a0 1b 91 add x1=>DAT_0046d6e8,x1,#0x6e8
arg0: data pointer of the trait object
00401b44 e0 23 00 91 add x0,sp,#0x8
arg4: 0
00401b48 e4 03 1f 2a mov w4,wzr
00401b4c ad 5b 00 94 bl std::rt::lang_start_internal undefined lang_start_internal()
00401b50 fe 07 41 f8 ldr x30,[sp], #0x10
00401b54 c0 03 5f d6 ret
Understanding the lang_start_internal
arguments
If we look at the signature of lang_start_internal
, we can see that it accepts 4 arguments, but the decompiled code above shows 5.
// To reduce the generated code of the new `lang_start`, this function is doing
// the real work.
#[cfg(not(test))]
fn lang_start_internal(
    main: &(dyn Fn() -> i32 + Sync + crate::panic::RefUnwindSafe),
    argc: isize,
    argv: *const *const u8,
    sigpipe: u8,
) -> isize {
    ...
}
This is because the first argument is a reference to a closure that is converted to a trait object when lang_start calls lang_start_internal, and a reference to a trait object is passed as two pointer-sized values.
#[cfg(not(any(test, doctest)))]
#[lang = "start"]
fn lang_start<T: crate::process::Termination + 'static>(
    main: fn() -> T,
    argc: isize,
    argv: *const *const u8,
    sigpipe: u8,
) -> isize {
    lang_start_internal(
        &move || crate::sys::backtrace::__rust_begin_short_backtrace(main).report().to_i32(),
        argc,
        argv,
        sigpipe,
    )
}
Within the closure body, __rust_begin_short_backtrace
is called, which then calls rust_lab::main
.
/// Fixed frame used to clean the backtrace with `RUST_BACKTRACE=1`. Note that
/// this is only inline(never) when backtraces in std are enabled, otherwise
/// it's fine to optimize away.
#[cfg_attr(feature = "backtrace", inline(never))]
pub fn __rust_begin_short_backtrace<F, T>(f: F) -> T
where
    F: FnOnce() -> T,
{
    let result = f();

    // prevent this frame from being tail-call optimised away
    crate::hint::black_box(());

    result
}
Trait objects are represented by a data pointer (here: the address of the stack slot at sp + 0x8, which holds the closure and therefore the captured address of rust_lab::main) and a vtable pointer (here: DAT_0046d6e8).
DAT_0046d6e8 XREF[1]: main:00401b40(*)
0046d6e8 00 undefined1 00h
0046d6e9 00 ?? 00h
...
0046d700 d8 1a 40 addr core::ops::function::FnOnce::call_once{{vtable
00 00 00
00 00
0046d708 b0 1a 40 addr std::rt::lang_start::_{{closure}}
00 00 00
00 00
0046d710 b0 1a 40 addr std::rt::lang_start::_{{closure}}
00 00 00
00 00
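The same calling pattern can be reproduced in a few lines. The sketch below is a simplified model, not the std source: start, start_internal and user_main are hypothetical stand-ins for lang_start, lang_start_internal and rust_lab::main, and the RefUnwindSafe bound is omitted.

```rust
use std::mem::size_of;

/// Stand-in for the user-defined `main`.
fn user_main() -> i32 {
    42
}

/// Simplified model of `lang_start_internal`: it only sees a trait object.
fn start_internal(main: &(dyn Fn() -> i32 + Sync)) -> isize {
    main() as isize
}

/// Simplified model of `lang_start`: wraps the function pointer in a closure
/// and passes a reference to it as a trait object.
fn start(main: fn() -> i32) -> isize {
    start_internal(&move || main())
}

/// Number of pointer-sized words in a reference to the trait object.
fn trait_object_ref_words() -> usize {
    size_of::<&(dyn Fn() -> i32 + Sync)>() / size_of::<usize>()
}

fn main() {
    println!("words: {}", trait_object_ref_words());
    println!("result: {}", start(user_main));
}
```

The reference is two words wide (data pointer and vtable pointer), which matches the extra argument seen in the decompiled call.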
If we look at the decompiled code of lang_start_internal, we can see which vtable entry it uses to execute rust_lab::main:
/* WARNING: Globals starting with '_' overlap smaller symbols at the same address */
/* WARNING: Unknown calling convention: __rustcall */
/* std::rt::lang_start_internal */
long __rustcall
std::rt::lang_start_internal
(undefined8 param_1,long param_2,undefined8 param_3,undefined8 param_4,byte param_5)
{
...
(**(code **)(param_2 + 0x28))(param_1);
...
Using this offset and the vtable address, we can calculate the address of the vtable entry which contains the address of std::rt::lang_start::_{{closure}}
:
0x0046d6e8 + 0x28 = 0x0046d710
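The calculation can be expressed as a small helper. The sketch assumes that the vtable starts with three pointer-sized header words (drop function, size, alignment) followed by the method pointers; this layout is an implementation detail of the compiler, not a stable guarantee, and the function names are hypothetical.

```rust
/// Pointer size on AArch64 in bytes.
const WORD: u64 = 8;
/// Assumed vtable header: drop function, size and alignment.
const HEADER_WORDS: u64 = 3;

/// Address of the vtable entry at the given byte offset.
fn vtable_entry_address(vtable: u64, offset: u64) -> u64 {
    vtable + offset
}

/// Zero-based index of the method slot addressed by `offset`
/// (only meaningful for offsets past the header).
fn method_slot(offset: u64) -> u64 {
    offset / WORD - HEADER_WORDS
}

fn main() {
    let entry = vtable_entry_address(0x0046d6e8, 0x28);
    println!("entry at {:#010x}, method slot {}", entry, method_slot(0x28));
}
```

Under this assumption, offset 0x28 addresses the third method slot, which is consistent with the three function pointers visible at 0x0046d700, 0x0046d708 and 0x0046d710 in the dump above.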
/* WARNING: Unknown calling convention: __rustcall */
/* std::rt::lang_start::_{{closure}} */
undefined8 __rustcall std::rt::lang_start::_{{closure}}(undefined8 *param_1)
{
sys::backtrace::__rust_begin_short_backtrace(*param_1);
return 0;
}
As we also saw earlier, __rust_begin_short_backtrace
calls rust_lab::main
in the end.
/* WARNING: Unknown calling convention: __rustcall */
/* std::sys::backtrace::__rust_begin_short_backtrace */
void __rustcall std::sys::backtrace::__rust_begin_short_backtrace(code *param_1)
{
(*param_1)();
return;
}
Panic: unwind vs abort
When a Rust program panics, it can handle the failure in two ways: unwind or abort. Unwind mode cleans up resources as the panic travels up the call stack, while abort mode immediately terminates the program. The chosen mode affects the generated code.
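The cleanup performed in unwind mode can be observed from safe Rust. This is a minimal sketch with hypothetical names (Guard, panics_with_guard, unwinding_runs_drop); it only behaves as described when compiled with panic=unwind, because with panic=abort the process terminates at the panic.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

/// Records whether the destructor of `Guard` has run.
static DROPPED: AtomicBool = AtomicBool::new(false);

struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        DROPPED.store(true, Ordering::SeqCst);
    }
}

fn panics_with_guard() {
    let _guard = Guard;
    // During unwinding, the cleanup code for this frame drops `_guard`.
    panic!("boom");
}

/// Returns true if the panic was caught and the destructor ran on the way.
fn unwinding_runs_drop() -> bool {
    let result = panic::catch_unwind(panics_with_guard);
    result.is_err() && DROPPED.load(Ordering::SeqCst)
}

fn main() {
    println!("destructor ran during unwinding: {}", unwinding_runs_drop());
}
```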
Source
Initialize a new workspace with cargo init
.
The source code of chapter Vec
is reused here.
Build
$ cargo rustc --release
By default the option panic=unwind
is used.
Ghidra
Load the binary into Ghidra and auto-analyze it.
rust_lab::main
While scrolling through the listing, we can see that Ghidra adds try-catch comments to the following parts, and we notice XREFs from addresses that are far away.
try { // try from 00401958 to 00401967 has its CatchHandler @
LAB_00401958 XREF[1]: 00453a88(*)
00401958 61 03 00 90 adrp x1,0x46d000
arg1: source location for debugging
0040195c 21 80 21 91 add x1=>PTR_s_src/main.rs_0046d860,x1,#0x860 = 0044aea0
arg0: address of vec
00401960 e0 23 00 91 add x0,sp,#0x8
increase cap from 3 to 8
00401964 fd 0d 01 94 bl alloc::raw_vec::RawVec<T,A>::grow_one undefined grow_one()
} // end try from 00401958 to 00401967
catch() { ... } // from try @ 00401958 with catch @ 004019ac
LAB_004019ac XREF[1]: 00453a8a(*)
004019ac e1 07 40 f9 ldr x1,[sp, #0x8]
004019b0 f3 03 00 aa mov x19,x0
004019b4 81 00 00 b4 cbz x1,LAB_004019c4
004019b8 e0 0b 40 f9 ldr x0,[sp, #0x10]
004019bc 22 00 80 52 mov w2,#0x1
004019c0 14 00 00 94 bl __rustc::__rust_dealloc void __rust_dealloc(u8 * ptr, us
LAB_004019c4 XREF[1]: 004019b4(j)
004019c4 e0 03 13 aa mov x0,x19
004019c8 30 c7 00 94 bl _Unwind_Resume undefined _Unwind_Resume()
-- Flow Override: CALL_RETURN (CALL_TERMINATOR)
If we follow the XREFs, we can see they are under the LSDA (Language-Specific Data Area) located in section .gcc_except_table
.
//
// .gcc_except_table
// SHT_PROGBITS [0x453a84 - 0x454c67]
// ram:00453a84-ram:00454c67
//
**************************************************************
* Language-Specific Data Area *
**************************************************************
...
00453a88 44 uleb128 LAB_00401958 (LSDA Call Site) IP Offset
00453a89 10 uleb128 10h (LSDA Call Site) IP Range Length
00453a8a 98 01 uleb128 LAB_004019ac (LSDA Call Site) Landing Pad Add
00453a8c 00 uleb128 0h (LSDA Call Site) Action Table Of
...
This means that if a panic occurs while execution is between 0x00401958 and 0x00401958 + 0x10 = 0x00401968, the code located at 0x004019ac is executed during unwinding. In this case, if necessary (the vector capacity is not zero), it deallocates the allocated block and then continues the unwinding process.
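The call-site record shown in the LSDA can be decoded by hand. The sketch below assumes that all four fields are ULEB128-encoded and that the offsets are relative to the start of the function; the start address 0x00401914 is not shown in the listing but derived here as 0x00401958 - 0x44. All names are hypothetical.

```rust
/// Decodes one unsigned LEB128 value; returns (value, bytes consumed).
fn decode_uleb128(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    // At most 10 bytes are needed for a 64-bit value.
    for (i, &byte) in bytes.iter().take(10).enumerate() {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

#[derive(Debug, PartialEq)]
struct CallSite {
    start: u64,
    end: u64,
    landing_pad: u64,
    action: u64,
}

/// Decodes one call-site record relative to `function_start`.
fn decode_call_site(function_start: u64, raw: &[u8]) -> Option<CallSite> {
    let mut pos = 0;
    let mut next = || {
        let (value, len) = decode_uleb128(&raw[pos..])?;
        pos += len;
        Some(value)
    };
    let offset = next()?;
    let length = next()?;
    let landing_pad = next()?;
    let action = next()?;
    Some(CallSite {
        start: function_start + offset,
        end: function_start + offset + length,
        landing_pad: function_start + landing_pad,
        action,
    })
}

fn main() {
    // Bytes at 00453a88..00453a8c from the listing above.
    let raw = [0x44, 0x10, 0x98, 0x01, 0x00];
    println!("{:#x?}", decode_call_site(0x0040_1914, &raw));
}
```

The two-byte sequence 98 01 decodes to 0x98, which is how the landing pad at 0x004019ac is reached from the derived function start.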