Startup

This section shows how to locate the user-defined main function and trace the call chain that leads to its execution.

Source

Initialize a new workspace with cargo init.

The source code of chapter Hello, world! is reused here.

Build

$ cargo rustc --release

Ghidra

Load the binary into Ghidra and auto-analyze it.

Locating main

In a std environment (as opposed to no_std), the C-level main symbol calls lang_start_internal, which in turn ends up calling the user-defined main function (here rust_lab::main).

Call graph:

_start
    _start_c
        __libc_start_main
            main
                lang_start_internal
                    rust_lab::main

Decompiled code:

void main(int param_1,undefined8 param_2)

{
  code *pcStack_8;
  
  pcStack_8 = rust_lab::main;
  std::rt::lang_start_internal(&pcStack_8,&DAT_0046d6e8,(long)param_1,param_2,0);
  return;
}

lang_start_internal is easy to recognize even when symbols are stripped: its first argument points to a stack slot that holds the address of rust_lab::main.

Listing:

                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined main()
             undefined         <UNASSIGNED>   <RETURN>
             undefined8        Stack[-0x10]:8 local_10                                XREF[1]:     00401b38(W)  
                             main                                            XREF[5]:     Entry Point(*), 
                                                                                          _start_c:004019f4(*), 00453c74, 
                                                                                          004648c4(*), 0046ff90(*)  
        00401b28 08 00 00 90     adrp       x8,0x401000
                             adrp + add loads the address of rust_lab::main
        00401b2c 08 c1 2b 91     add        x8,x8,#0xaf0
                             arg3: argv
        00401b30 e3 03 01 aa     mov        x3,x1
                             arg2: argc
        00401b34 02 7c 40 93     sxtw       x2,w0
                             store link register and address of rust_lab::main on the stack
        00401b38 fe 23 bf a9     stp        x30,x8=>rust_lab::main,[sp, #local_10]!
        00401b3c 61 03 00 90     adrp       x1,0x46d000
                             arg1: vtable pointer of the trait object
        00401b40 21 a0 1b 91     add        x1=>DAT_0046d6e8,x1,#0x6e8
                             arg0: data pointer of the trait object
        00401b44 e0 23 00 91     add        x0,sp,#0x8
                             arg4: sigpipe (0)
        00401b48 e4 03 1f 2a     mov        w4,wzr
        00401b4c ad 5b 00 94     bl         std::rt::lang_start_internal                     undefined lang_start_internal()
        00401b50 fe 07 41 f8     ldr        x30,[sp], #0x10
        00401b54 c0 03 5f d6     ret
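
The adrp + add pairs in the listing can be checked by hand. The following is a minimal sketch, not part of the original analysis, that decodes the two instruction pairs from their raw bytes and recomputes the addresses Ghidra annotates. The helper names (decode_adrp, decode_add_imm, resolve_adrp_add) are hypothetical, and the decoder only handles the two encodings that appear in this listing.

```rust
/// Decodes an AArch64 ADRP instruction located at `pc`.
/// Returns (destination register, page address), or None if `insn` is not ADRP.
fn decode_adrp(insn: u32, pc: u64) -> Option<(u8, u64)> {
    // ADRP: bit 31 set, bits 28..24 = 0b10000.
    if (insn & 0x9f00_0000) != 0x9000_0000 {
        return None;
    }
    let immlo = u64::from((insn >> 29) & 0x3);
    let immhi = u64::from((insn >> 5) & 0x7_ffff);
    let imm21 = (immhi << 2) | immlo;
    // Sign-extend the 21-bit page count.
    let pages = ((imm21 << 43) as i64) >> 43;
    let page = ((pc & !0xfff) as i64).wrapping_add(pages << 12);
    Some(((insn & 0x1f) as u8, page as u64))
}

/// Decodes a 64-bit ADD (immediate) with an unshifted 12-bit immediate.
/// Returns (destination register, source register, immediate).
fn decode_add_imm(insn: u32) -> Option<(u8, u8, u64)> {
    if (insn & 0xffc0_0000) != 0x9100_0000 {
        return None;
    }
    let rd = (insn & 0x1f) as u8;
    let rn = ((insn >> 5) & 0x1f) as u8;
    let imm12 = u64::from((insn >> 10) & 0xfff);
    Some((rd, rn, imm12))
}

/// Combines an ADRP and the ADD that consumes its result into one address.
fn resolve_adrp_add(adrp: [u8; 4], adrp_pc: u64, add: [u8; 4]) -> Option<u64> {
    let (adrp_rd, page) = decode_adrp(u32::from_le_bytes(adrp), adrp_pc)?;
    let (_add_rd, add_rn, imm12) = decode_add_imm(u32::from_le_bytes(add))?;
    if add_rn != adrp_rd {
        return None; // the ADD does not read the register written by ADRP
    }
    Some(page + imm12)
}

fn main() {
    // 00401b28 adrp x8,0x401000 / 00401b2c add x8,x8,#0xaf0
    let main_addr = resolve_adrp_add([0x08, 0x00, 0x00, 0x90], 0x0040_1b28, [0x08, 0xc1, 0x2b, 0x91]);
    // 00401b3c adrp x1,0x46d000 / 00401b40 add x1,x1,#0x6e8
    let vtable_addr = resolve_adrp_add([0x61, 0x03, 0x00, 0x90], 0x0040_1b3c, [0x21, 0xa0, 0x1b, 0x91]);

    assert_eq!(main_addr, Some(0x0040_1af0));
    assert_eq!(vtable_addr, Some(0x0046_d6e8));
    println!("rust_lab::main = {:#x}, vtable = {:#x}", main_addr.unwrap(), vtable_addr.unwrap());
}
```

Read this way, the first pair yields the address stored on the stack (rust_lab::main) and the second pair yields DAT_0046d6e8, matching the annotations in the listing.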

Understanding the lang_start_internal arguments

The signature of lang_start_internal declares four parameters, yet the decompiled call above passes five arguments.

// To reduce the generated code of the new `lang_start`, this function is doing
// the real work.
#[cfg(not(test))]
fn lang_start_internal(
    main: &(dyn Fn() -> i32 + Sync + crate::panic::RefUnwindSafe),
    argc: isize,
    argv: *const *const u8,
    sigpipe: u8,
) -> isize {
...
}

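This is because the first parameter is a reference to a trait object. lang_start passes a closure, which is coerced to a trait object reference when it calls lang_start_internal, and such a reference is a fat pointer that occupies two argument registers.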

#[cfg(not(any(test, doctest)))]
#[lang = "start"]
fn lang_start<T: crate::process::Termination + 'static>(
    main: fn() -> T,
    argc: isize,
    argv: *const *const u8,
    sigpipe: u8,
) -> isize {
    lang_start_internal(
        &move || crate::sys::backtrace::__rust_begin_short_backtrace(main).report().to_i32(),
        argc,
        argv,
        sigpipe,
    )
}

Within the closure body, __rust_begin_short_backtrace is called, which then calls rust_lab::main.

/// Fixed frame used to clean the backtrace with `RUST_BACKTRACE=1`. Note that
/// this is only inline(never) when backtraces in std are enabled, otherwise
/// it's fine to optimize away.
#[cfg_attr(feature = "backtrace", inline(never))]
pub fn __rust_begin_short_backtrace<F, T>(f: F) -> T
where
    F: FnOnce() -> T,
{
    let result = f();

    // prevent this frame from being tail-call optimised away
    crate::hint::black_box(());

    result
}
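
The chain from lang_start through the closure to the user-defined main can be mimicked in a standalone program. This is only a sketch under simplifying assumptions: the names start_sketch, start_internal_sketch, short_backtrace_sketch and user_main are hypothetical stand-ins, the closure always reports exit code 0 instead of going through Termination, and none of the runtime setup that the real lang_start_internal performs is reproduced.

```rust
use std::hint::black_box;
use std::sync::atomic::{AtomicBool, Ordering};

// Records that the stand-in for the user-defined main was reached.
static USER_MAIN_RAN: AtomicBool = AtomicBool::new(false);

// Hypothetical stand-in for rust_lab::main.
fn user_main() {
    USER_MAIN_RAN.store(true, Ordering::SeqCst);
}

// Stand-in for __rust_begin_short_backtrace: calls `f` and keeps its own frame.
#[inline(never)]
fn short_backtrace_sketch<F, T>(f: F) -> T
where
    F: FnOnce() -> T,
{
    let result = f();
    black_box(());
    result
}

// Stand-in for lang_start_internal: receives the closure as a trait object
// reference (data pointer + vtable pointer) and calls it through the vtable.
fn start_internal_sketch(
    main: &(dyn Fn() -> i32 + Sync),
    _argc: isize,
    _argv: *const *const u8,
    _sigpipe: u8,
) -> isize {
    main() as isize
}

// Stand-in for lang_start: wraps the function pointer in a closure.
// Assumption: the exit code is hard-wired to 0 here.
fn start_sketch(main: fn(), argc: isize, argv: *const *const u8, sigpipe: u8) -> isize {
    start_internal_sketch(
        &move || {
            short_backtrace_sketch(main);
            0
        },
        argc,
        argv,
        sigpipe,
    )
}

fn main() {
    let code = start_sketch(user_main, 0, std::ptr::null(), 0);
    assert_eq!(code, 0);
    assert!(USER_MAIN_RAN.load(Ordering::SeqCst));
}
```

The only state the closure captures is the fn pointer, which mirrors the single stack slot holding rust_lab::main in the listing.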

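A reference to a trait object is a fat pointer made of a data pointer (here: the address of the stack slot that holds the address of rust_lab::main, which is the closure's captured state) and a vtable pointer (here: DAT_0046d6e8).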

                             DAT_0046d6e8                                    XREF[1]:     main:00401b40(*)  
        0046d6e8 00              undefined1 00h
        0046d6e9 00              ??         00h
        ...
        0046d700 d8 1a 40        addr       core::ops::function::FnOnce::call_once{{vtable
                 00 00 00 
                 00 00
        0046d708 b0 1a 40        addr       std::rt::lang_start::_{{closure}}
                 00 00 00 
                 00 00
        0046d710 b0 1a 40        addr       std::rt::lang_start::_{{closure}}
                 00 00 00 
                 00 00
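
The two-pointer representation can be observed from safe Rust as well. The sketch below is an illustration under assumptions, not something taken from the binary: user_main and fat_pointer_facts are hypothetical names, and the closure size check reflects what current compilers produce in practice, not a documented layout guarantee.

```rust
use std::mem::{size_of, size_of_val};

// Hypothetical stand-in for rust_lab::main.
fn user_main() {}

/// Drops any pointer metadata and returns only the data pointer.
fn thin<T: ?Sized>(r: &T) -> *const () {
    (r as *const T).cast::<()>()
}

/// Returns (size of the trait object reference, size of the closure state,
/// whether the data pointer equals the closure's address).
fn fat_pointer_facts() -> (usize, usize, bool) {
    let main_fn: fn() = user_main;
    // Same shape as the closure built in lang_start: it captures one fn pointer.
    let closure = move || {
        main_fn();
        0i32
    };
    let obj: &(dyn Fn() -> i32 + Sync) = &closure;

    (
        size_of::<&(dyn Fn() -> i32 + Sync)>(),
        size_of_val(&closure),
        thin(obj) == thin(&closure),
    )
}

fn main() {
    let (ref_size, env_size, same_address) = fat_pointer_facts();
    // A trait object reference is two machine words: data pointer + vtable pointer.
    assert_eq!(ref_size, 2 * size_of::<usize>());
    // Observed, not guaranteed: the closure state is just the captured fn pointer.
    assert_eq!(env_size, size_of::<fn()>());
    // The data pointer is the address of the closure value itself.
    assert!(same_address);
}
```

This matches the listing: x0 receives sp + 0x8, the address of the slot that stores rust_lab::main, and x1 receives the vtable address.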

If we look at the decompiled code of lang_start_internal, we can see which vtable entry it uses to execute rust_lab::main:


/* WARNING: Globals starting with '_' overlap smaller symbols at the same address */
/* WARNING: Unknown calling convention: __rustcall */
/* std::rt::lang_start_internal */

long __rustcall
std::rt::lang_start_internal
          (undefined8 param_1,long param_2,undefined8 param_3,undefined8 param_4,byte param_5)

{
...
  (**(code **)(param_2 + 0x28))(param_1);
...

Using this offset and the vtable address, we can calculate the address of the vtable entry which contains the address of std::rt::lang_start::_{{closure}}:

0x0046d6e8 + 0x28 = 0x0046d710

Decompiled code of the closure:

/* WARNING: Unknown calling convention: __rustcall */
/* std::rt::lang_start::_{{closure}} */

undefined8 __rustcall std::rt::lang_start::_{{closure}}(undefined8 *param_1)

{
  sys::backtrace::__rust_begin_short_backtrace(*param_1);
  return 0;
}
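
The offset arithmetic used to find this vtable entry can be written down as a small check. This is a sketch with hypothetical names (vtable_slot, SLOT_SIZE), and it assumes 8-byte slots as on this 64-bit target. The first three slots are elided in the listing; in vtables emitted by current rustc they typically hold the drop function, size and alignment, but that layout is an implementation detail and is treated here as an assumption.

```rust
/// Size of one vtable slot on a 64-bit target.
const SLOT_SIZE: u64 = 8;

/// Returns (address of the slot, slot index) for a byte offset into a vtable.
fn vtable_slot(vtable: u64, offset: u64) -> (u64, u64) {
    (vtable + offset, offset / SLOT_SIZE)
}

fn main() {
    let vtable = 0x0046_d6e8; // DAT_0046d6e8 from the listing

    // Offset used by lang_start_internal: (**(code **)(param_2 + 0x28))(param_1)
    let (addr, index) = vtable_slot(vtable, 0x28);
    assert_eq!(addr, 0x0046_d710);
    assert_eq!(index, 5);

    // The two preceding entries visible in the listing.
    assert_eq!(vtable_slot(vtable, 0x18).0, 0x0046_d700);
    assert_eq!(vtable_slot(vtable, 0x20).0, 0x0046_d708);

    println!("slot {} at {:#010x}", index, addr);
}
```

The slot at 0x0046d710 is one of the two entries in the listing that point to std::rt::lang_start::_{{closure}}.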

As the source above already showed, __rust_begin_short_backtrace finally calls the function pointer it receives, which is rust_lab::main.

/* WARNING: Unknown calling convention: __rustcall */
/* std::sys::backtrace::__rust_begin_short_backtrace */

void __rustcall std::sys::backtrace::__rust_begin_short_backtrace(code *param_1)

{
  (*param_1)();
  return;
}