Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Introduction

This book teaches modern C++ algorithms, data structures and design patterns (C++11, C++14, C++17, C++20, C++23) through minimal examples written for reverse engineers.

The source code is available here.

🚧 The book is still under construction. New chapters will be added and the existing ones might be modified.

References

Time complexity overview

Time complexity describes how an algorithm’s runtime grows relative to input size (n). These are worst-case bounds (you might get lucky and find an element on the first try, but we plan for the worst).

NotationNameMeaningExample
O(1)ConstantSame time regardless of input size.Array index access.
O(log n)LogarithmicHalves problem each step.Binary search.
O(n)LinearTouches each element once.Finding max in unsorted list.
O(n log n)LinearithmicDivide in half repeatedly, touch all elements each level.Merge sort.
O(n^2)QuadraticNested loops.Bubble sort.

For 1,000,000 elements (worst case):

  • O(1): 1 operation
  • O(log n): log2 1,000,000 ~= 19.93 -> 20 operations
  • O(n): 1,000,000 operations
  • O(n log n): 1,000,000 * log2 1,000,000 ~= 1,000,000 * 19.93 -> 20,000,000 operations
  • O(n^2): 1,000,000 * 1,000,000 operations

Merge sort

Steps:

  • Split the array in half repeatedly, until we have single elements.
  • Merge them back together in sorted order.

Example:

[2,3,1,4]
1. split
[2,3] [1,4]
2. split
[2][3] [1][4] // single elements, log n splits to get here (log 4 = 2)
1. merge
[2,3] [1,4] // 2 comparisons
2. merge
[1,2,3,4] // n-1 (3) comparisons

Total: <n * log2 n -> n * log2 n

Big O is about scaling so we can ignore the < sign.

Bubble sort

Repeatedly walk through, swap adjacent pairs if out of order. Largest element “bubbles” to the end each pass.

[4,3,2,1]
1. pass // n - 1 comparisons
[3,2,1,4]
2. pass // n - 2 comparisons
[2,1,3,4]
3. pass // n - 3 comparisons
[1,2,3,4]

The number of comparisons is decreasing after each pass, because we know that the largest elements are being moved to the end.

To summarize:

pass 1: n - 1 comparisons
pass 2: n - 2 comparisons
pass n-1: 1 comparison

Total: (n-1) + (n-2) + … + 1.

Or written the other way: 1 + 2 + … + (n-1).

Doubling trick so we can simplify it:

    1   +   2   + ... + (n-1)
+ (n-1) + (n-2) + ... +   1
-----------------------------
    n   +   n   + ... +   n

Each column sums to n. How many columns? There are (n-1) numbers.

So we get n*(n-1) but we need to divide it by 2 (to undo the doubling): n*(n-1)/2 = (n^2-n)/2 -> O(n^2).

Big O is about scaling so it ignores 2 and -n.

std::find

Use case

Find a specific value in a sequence.

Explanation

Returns iterator to first element equal to value or end() if not found. Must check every element in worst case (element not present or at end).

Time complexity: O(n). Scans elements one by one until found or end reached. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
//	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
constexpr uint32_t ARM64_NOP = 0xd503201f;

int main() {
  std::vector<uint32_t> instructions = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
      // sub	sp, sp, #32                     // encoding: [0xff,0x83,0x00,0xd1]
      0xd10083ff,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f};

  auto it = std::find(instructions.begin(), instructions.end(), ARM64_NOP);

  if (it != instructions.end()) {
    auto index = std::distance(instructions.begin(), it);
    std::println("Found NOP at index: {}", index);
  } else {
    std::println("No NOP found.");
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-find
Found NOP at index: 2

std::find_if

Use case

Find first element matching a predicate.

Explanation

More flexible than std::find (accepts a unary predicate that takes one argument). Essential for finding instructions by property (rather than exact value). Stops at first match.

Time complexity: O(n). Evaluates predicate for each element until match found. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/B--Branch-?lang=en
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/BL--Branch-with-link-?lang=en
constexpr bool isBranch(uint32_t opcode) {
  uint32_t op = opcode >> 26;
  return op == 0x05 /* B */ || op == 0x25 /* BL */;
}

int main() {
  std::vector<uint32_t> instructions = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
      // sub	sp, sp, #32                     // encoding: [0xff,0x83,0x00,0xd1]
      0xd10083ff,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f};

  auto it = std::find_if(instructions.begin(), instructions.end(), isBranch);

  if (it != instructions.end()) {
    auto index = std::distance(instructions.begin(), it);
    std::println("Found branch at index: {}", index);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-find_if
Found branch at index: 0

std::count/std::count_if

Use case

Count occurences of a value or elements matching a predicate.

Explanation

count for exact matches, count_if for predicate-based counting. Always travels entire range.

Time complexity: O(n). Must examine every element to get accurate count. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
//	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
constexpr uint32_t ARM64_NOP = 0xd503201f;

// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/B--Branch-?lang=en
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/BL--Branch-with-link-?lang=en
constexpr bool isBranch(uint32_t opcode) {
  uint32_t op = opcode >> 26;
  return op == 0x05 /* B */ || op == 0x25 /* BL */;
}

int main() {
  std::vector<uint32_t> instructions = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
      // sub	sp, sp, #32                     // encoding: [0xff,0x83,0x00,0xd1]
      0xd10083ff,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f};

  auto nopCount =
      std::count(instructions.begin(), instructions.end(), ARM64_NOP);

  auto branchCount =
      std::count_if(instructions.begin(), instructions.end(), isBranch);

  std::println("NOP count: {}", nopCount);
  std::println("Branch count: {}", branchCount);

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-count
NOP count: 1
Branch count: 1

std::all_of/std::any_of/std:none_of

Use case

Check if predicate holds for all/any/none of the elements.

Explanation

Useful for validating instruction sequences.

Time complexity: O(n) worst-case but short-circuits:

  • all_of: stops at first false
  • any_of: stops at first true
  • none_of: stops at first true

Best case O(1) if first element determines result.

See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/B--Branch-?lang=en
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/BL--Branch-with-link-?lang=en
constexpr bool isBranch(uint32_t opcode) {
  uint32_t op = opcode >> 26;
  return op == 0x05 /* B */ || op == 0x25 /* BL */;
}

int main() {
  std::vector<uint32_t> instructions = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
      // sub	sp, sp, #32                     // encoding: [0xff,0x83,0x00,0xd1]
      0xd10083ff,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f};

  bool allBranch =
      std::all_of(instructions.begin(), instructions.end(), isBranch);

  bool anyBranch =
      std::any_of(instructions.begin(), instructions.end(), isBranch);

  bool noneBranch =
      std::none_of(instructions.begin(), instructions.end(), isBranch);

  std::println("All branch instructions: {}", allBranch);
  std::println("Any branch instructions: {}", anyBranch);
  std::println("No branch instructions: {}", noneBranch);

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-all_of-std-any_of-std-none_of
All branch instructions: false
Any branch instructions: true
No branch instructions: false

std::for_each

Use case

Apply a function to each element.

Explanation

Time complexity: O(n). Applies function exactly once per element. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

int main() {
  std::vector<uint32_t> instructions = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
      // sub	sp, sp, #32                     // encoding: [0xff,0x83,0x00,0xd1]
      0xd10083ff,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f};

  std::for_each(instructions.begin(), instructions.end(),
                [](uint32_t insn) { std::println("{:#x}", insn); });

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-for_each
0x94000010
0xd10083ff
0xd503201f

std::equal

Use case

Compare two ranges for equality.

Explanation

Return true if all corresponding elements are equal. Essential for verifying code integrity and detecting patches.

Time complexity: O(n). Compare elements pairwise until mismatch or end. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

int main() {
  std::vector<uint32_t> original = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
      // sub	sp, sp, #32                     // encoding: [0xff,0x83,0x00,0xd1]
      0xd10083ff,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f};

  std::vector<uint32_t> modified = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f};

  bool tampered = std::equal(original.begin(), original.end(), modified.begin(),
                             modified.end());

  std::println("Original == modified: {}", tampered);

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-equal
Original == modified: false

std::mismatch

Use case

Find first position where two ranges differ.

Explanation

Returns pair of iterators to first mismatching elements. More useful than equal when we need to know where ranges differ. Essential for binary diffing.

Time complexity: O(n). Compares until first difference found. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

int main() {
  std::vector<uint32_t> original = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
      // sub	sp, sp, #32                     // encoding: [0xff,0x83,0x00,0xd1]
      0xd10083ff,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f};

  std::vector<uint32_t> modified = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f};

  auto [origIt, modIt] = std::mismatch(original.begin(), original.end(),
                                       modified.begin(), modified.end());

  if (origIt != original.end()) {
    size_t offset = std::distance(original.begin(), origIt);
    std::println("First difference at index {}:", offset);
    std::println("  Original: {:#x}", *origIt);
    std::println("  Modified: {:#x}", *modIt);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-mismatch
First difference at index 1:
  Original: 0xd10083ff
  Modified: 0xd503201f

std::search

Use case

Find a subsequence within a sequence.

Explanation

Returns iterator to first occurence of pattern or end() if not found. Essential for gadget finding, pattern matching in code and detecting instruction sequences.

Time complexity: O(n*m) worst case, where n is haystack size and m is needle size. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

int main() {
  std::vector<uint32_t> code = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
      // sub	sp, sp, #32                     // encoding: [0xff,0x83,0x00,0xd1]
      0xd10083ff,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f,
      // $ echo "mov x0, 0" | llvm-mc -triple=aarch64 -show-encoding
      //	mov	x0, #0                          // encoding: [0x00,0x00,0x80,0xd2]
      0xd2800000,
      // $ echo "mov x1, 1" | llvm-mc -triple=aarch64 -show-encoding
      //	mov	x1, #1                          // encoding: [0x21,0x00,0x80,0xd2]
      0xd2800021};

  std::vector<uint32_t> gadget = {
      // $ echo "mov x0, 0" | llvm-mc -triple=aarch64 -show-encoding
      //	mov	x0, #0                          // encoding: [0x00,0x00,0x80,0xd2]
      0xd2800000,
      // $ echo "mov x1, 1" | llvm-mc -triple=aarch64 -show-encoding
      //	mov	x1, #1                          // encoding: [0x21,0x00,0x80,0xd2]
      0xd2800021};

  auto it = std::search(code.begin(), code.end(), gadget.begin(), gadget.end());

  if (it != code.end()) {
    size_t offset = std::distance(code.begin(), it);
    std::println("Found gadget at index {}", offset);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-search
Found gadget at index 4

std::adjacent_find

std::binary_search

Use case

Check if value exists in sorted range.

Explanation

Requires sorted range. Returns bool only. Much faster than find for large sorted datasets.

Time complexity: O(log n). Halves earch space each step. For 1M elements, only ~20 comparisons needed. See possible implementation.

Code

#include <algorithm>
#include <print>
#include <string>
#include <vector>

int main() {
  std::vector<std::string> symbols = {"_helper", "_init", "_main", "_start"};

  bool hasMain = std::binary_search(symbols.begin(), symbols.end(), "_main");

  std::println("Symbol '_main' exists: {}", hasMain);

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-binary_search
Symbol '_main' exists: true

std::sort

Use case

Sort elements in ascending order (default) or by custom comparator (in case of complex types).

Explanation

Time complexity: O(n log n). Pick pivot + partitioning: n, recursing on each half: log n. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <string>
#include <vector>

struct Symbol {
  uint64_t address;
  std::string name;
};

int main() {
  std::vector<Symbol> symbols = {
      {0x10001000, "_main"}, {0x10003000, "_fini"}, {0x10002000, "_helper"}};

  std::sort(
      symbols.begin(), symbols.end(),
      [](const Symbol &a, const Symbol &b) { return a.address < b.address; });

  for (const auto &sym : symbols) {
    std::println("{:#x} {}", sym.address, sym.name);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-sort
0x10001000 _main
0x10002000 _helper
0x10003000 _fini

std::stable_sort

Use case

Sort while preserving relative order of equivalent elements.

Explanation

Slightly slower than std::sort due to stability guarantee.

Time complexity: O(n log n) with extra memory, worse than O(n log n) without. The extra memory is necessary for merge sort. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <string>
#include <vector>

struct Symbol {
  uint64_t address;
  std::string name;
  int order;
};

int main() {
  std::vector<Symbol> symbols = {{0x1000, "_init", 0},
                                 {0x1080, "_helper", 1},
                                 {0x1040, "_main", 2},
                                 {0x1040, "main", 3},
                                 {0x1040, "_start", 4}};

  std::stable_sort(
      symbols.begin(), symbols.end(),
      [](const Symbol &a, const Symbol &b) { return a.address < b.address; });

  for (const auto &sym : symbols) {
    std::println("{:#x} {} (was #{})", sym.address, sym.name, sym.order);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-stable_sort
0x1000 _init (was #0)
0x1040 _main (was #2)
0x1040 main (was #3)
0x1040 _start (was #4)
0x1080 _helper (was #1)

std::partial_sort

Use case

Sort only the first k elements.

Explanation

Only sorts the first k elements. Rest are in unspecified order (but all >= sorted portion). Much faster than full sort when k << n. Perfect for “top N” queries.

Time complexity: O(n log k) where k is the number of elements to sort. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <string>
#include <vector>

struct Function {
  std::string name;
  uint64_t size;
};

int main() {
  std::vector<Function> functions = {{"_main", 256},    {"_helper", 56},
                                     {"_init", 32},     {"_process", 1024},
                                     {"_cleanup", 128}, {"_validate", 512},
                                     {"_parse", 2048}};

  std::partial_sort(
      functions.begin(), functions.begin() + 3, functions.end(),
      [](const Function &a, const Function &b) { return a.size > b.size; });

  std::println("Top 3 largest functions:");
  for (size_t i = 0; i < 3; ++i) {
    std::println("{}: {}", functions[i].name, functions[i].size);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-partial_sort
Top 3 largest functions:
_parse: 2048
_process: 1024
_validate: 512

std::is_sorted

Use case

Check if range is sorted.

Explanation

std::is_sorted_until returns iterator to first out-of-order element. Useful for validation before using algorithms that require sorted input (e.g. std::binary_search).

Time complexity: O(n) worst case but short-circuits at first out-of-order element. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

int main() {
  std::vector<uint64_t> good = {0x1000, 0x2000, 0x3000, 0x4000};
  std::vector<uint64_t> bad = {0x1000, 0x3000, 0x2000, 0x4000};

  std::println("good sorted: {}", std::is_sorted(good.begin(), good.end()));
  std::println("bad sorted: {}", std::is_sorted(bad.begin(), bad.end()));

  auto it = std::is_sorted_until(bad.begin(), bad.end());

  if (it != bad.end()) {
    size_t offset = std::distance(bad.begin(), it);
    std::println("Sorting breaks at {:#x} (index {})", *it, offset);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-is_sorted
good sorted: true
bad sorted: false
Sorting breaks at 0x2000 (index 2)

std::transform

Use case

Apply a function to each element and store results.

Explanation

Can transform in-place or to different container.

Time complexity: O(n). Applies transformation exactly once per element. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

int main() {
  std::vector<uint32_t> offsets = {0x10, 0x20, 0x30, 0x40};
  std::vector<uint64_t> addresses(offsets.size());

  uint64_t base = 0x10000000;

  std::transform(offsets.begin(), offsets.end(), addresses.begin(),
                 [base](uint32_t offs) { return base + offs; });

  for (const auto &addr : addresses) {
    std::println("{:#x}", addr);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-transform
0x10000010
0x10000020
0x10000030
0x10000040

std::copy/std::copy_if

Use case

Copy elements to another range, optionally filtering.

Explanation

copy_if is the filtered version (only copies elements matching predicate). Use with back_inserter for growing containers.

Time complexity: O(n). Visits each element once. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <iterator>
#include <print>
#include <string>
#include <vector>

struct Segment {
  std::string segname;
  uint64_t vmaddr;
  uint64_t vmsize;
  // https://www.gnu.org/software/hurd/gnumach-doc/Memory-Attributes.html
  // https://docs.rs/mach2/latest/src/mach2/vm_prot.rs.html#7
  uint32_t initprot;
};

constexpr uint32_t VM_PROT_EXECUTE = 4;

int main() {
  /// https://securelist.com/mac-os-x/36155/
  std::vector<Segment> segments = {{"__PAGEZERO", 0x0, 0x100000000, 0},
                                   {"__TEXT", 0x100000000, 0x4000, 5},
                                   {"__DATA", 0x100004000, 0x4000, 3},
                                   {"_LINKEDIT", 0x100008000, 0x1000, 1}};

  std::vector<Segment> executableSegments;

  std::copy_if(
      segments.begin(), segments.end(), std::back_inserter(executableSegments),
      [](const Segment &seg) { return (seg.initprot & VM_PROT_EXECUTE) != 0; });

  for (const auto &seg : executableSegments) {
    std::println("{} (vmaddr: {:#x}, vmsize: {:#x})", seg.segname, seg.vmaddr,
                 seg.vmsize);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-copy
__TEXT (vmaddr: 0x100000000, vmsize: 0x4000)

std::fill/std::fill_n

Use case

Fill range with a value.

Explanation

fill takes iterator range, fill_n takes a start iterator and count. Common for initializing buffers or creating NOP sleds.

Time complexity: O(n). Assigns value to each element. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

int main() {
  std::vector<uint32_t> code(5);
  // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
  // nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
  constexpr uint32_t ARM64_NOP = 0xd503201f;

  std::fill(code.begin(), code.end(), ARM64_NOP);

  std::println("NOP sled:");
  for (const auto &insn : code) {
    std::println("{:#x}", insn);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-fill
NOP sled:
0xd503201f
0xd503201f
0xd503201f
0xd503201f
0xd503201f

std::replace/std::replace_if

Use case

Replace values in-place.

Explanation

replace for exact value match, replace_if for prediacate-based replacement. Does not change container size.

Time complexity: O(n). Checks and potentially replaces each element. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
//	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
constexpr uint32_t ARM64_NOP = 0xd503201f;
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/BRK--Breakpoint-instruction-?lang=en
// $ echo "brk #0" | llvm-mc -triple=aarch64 -show-encoding
//	brk	#0                              // encoding: [0x00,0x00,0x20,0xd4]
constexpr uint32_t ARM64_BRK_0 = 0xd4200000;

int main() {
  std::vector<uint32_t> instructions = {
      // $ echo "sub sp, sp, #0x20" | llvm-mc -triple=aarch64 -show-encoding
      //sub	sp, sp, #32                     // encoding: [0xff,0x83,0x00,0xd1]
      0xd10083ff,
      // $ echo "brk #0" | llvm-mc -triple=aarch64 -show-encoding
      //	brk	#0                              // encoding: [0x00,0x00,0x20,0xd4]
      0xd4200000,
      // $ echo "add x29, sp, #0x20" | llvm-mc -triple=aarch64 -show-encoding
      // add	x29, sp, #32                    // encoding: [0xfd,0x83,0x00,0x91]
      0x910083fd,
      // $ echo "ret" | llvm-mc -triple=aarch64 -show-encoding
      // ret                                     // encoding: [0xc0,0x03,0x5f,0xd6]
      0xd65f03c0};

  std::replace(instructions.begin(), instructions.end(), ARM64_BRK_0,
               ARM64_NOP);

  std::println("{:#x}", instructions[1]);

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-replace
0xd503201f

std::remove_if

Use case

Remove elements matching predicate.

Explanation

remove_if moves unwanted elements to end, returns iterator to new logical end. Must call erase to actually remove (remove-erase idiom). C++20 adds std::erase_if which combines both steps.

Time complexity: O(n). Single pass through elements. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
//	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
constexpr uint32_t ARM64_NOP = 0xd503201f;

int main() {
  std::vector<uint32_t> code = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
      // sub	sp, sp, #32                     // encoding: [0xff,0x83,0x00,0xd1]
      0xd10083ff,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f};

  auto newEnd = std::remove_if(code.begin(), code.end(),
                               [](uint32_t insn) { return insn == ARM64_NOP; });

  code.erase(newEnd, code.end());

  std::println("{}", code.size());

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-remove_if
2

std::unique

Use case

Remove consecutive duplicates.

Explanation

Only removes adjacent duplicates (sort first for true uniqueness). Returns iterator to new end. Like remove, requires erase to actually shrink container.

Time complexity: O(n). Single pass comparing adjacent elements. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

int main() {
  std::vector<uint64_t> addresses = {0x1000, 0x1000, 0x1000,
                                     0x1040, 0x1080, 0x1080};

  auto newEnd = std::unique(addresses.begin(), addresses.end());

  addresses.erase(newEnd, addresses.end());

  for (const auto &addr : addresses) {
    std::println("{:#x}", addr);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-unique
0x1000
0x1040
0x1080

std::reverse

Use case

Reverse element order.

Explanation

In-place reversal. reverse_copy for non-modifying version. Useful for endianness conversion or reversing instruction sequences.

Time complexity: O(n). Swaps n/2 pairs of elements. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

int main() {
  std::vector<uint32_t> code = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
      // sub	sp, sp, #32                     // encoding: [0xff,0x83,0x00,0xd1]
      0xd10083ff,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f};

  std::reverse(code.begin(), code.end());

  for (const auto &insn : code) {
    std::println("{:#x}", insn);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-reverse
0xd503201f
0xd10083ff
0x94000010

std::accumulate

Use case

Compute sum or fold operation (combining all elements into a single value) over range.

Explanation

Takes initial value and optional binary operation. Strictly left-to-right evaluation (not parallelizable). Use std::reduce for parallel-friendly version.

Time complexity: O(n). Process each element exactly once. See possible implementation.

Code

#include <algorithm>
#include <numeric>
#include <print>
#include <vector>

int main() {
  std::vector<uint64_t> sectionSizes = {1024, 2048, 1024, 4096};

  auto total = std::accumulate(sectionSizes.begin(), sectionSizes.end(), 0);

  std::println("Total size: {}", total);

  std::vector<uint8_t> bytes = {0x01, 0x02, 0x03, 0x04};

  uint32_t checksum =
      std::accumulate(bytes.begin(), bytes.end(), 0, std::bit_xor{});

  std::println("Checksum: {:#x}", checksum);

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-accumulate
Total size: 8192
Checksum: 0x4

std::iota

Use case

Fill range with sequentially increasing values.

Explanation

Fill with value, value+1, … .

Time complexity: O(n). Increments and assigns once per element. See possible implementation.

Code

#include <algorithm>
#include <numeric>
#include <print>
#include <vector>

int main() {
  std::vector<uint64_t> addresses(5);

  std::iota(addresses.begin(), addresses.end(), 0x10000000);

  for (const auto &addr : addresses) {
    std::println("{:#x}", addr);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-iota
0x10000000
0x10000001
0x10000002
0x10000003
0x10000004

std::reduce

Use case

Parallel-friendly version of accumulate.

Explanation

Unlike accumulate, operation must be associative (grouping does not matter) and commutative (order does not matter) for correct parallel execution.

Time complexity: O(n), sequential. See std::reduce.

Code

#include <cstdint>
#include <numeric>
#include <print>
#include <vector>

int main() {
  std::vector<uint64_t> segmentSizes = {4096, 4096, 8192, 16384};

  uint64_t total = std::reduce(segmentSizes.begin(), segmentSizes.end());

  std::println("Total size: {}", total);

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-reduce
Total size: 32768

std::min_element/std::max_element/std::minmax_element

Use case

Find smallest/largest element in range.

Explanation

Returns iterator, not value. minmax_element is more efficient than calling min_element and max_element separately.

Time complexity: O(n). See possible implementation: std::min_element, std::max_element, std::minmax_element.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>

int main() {
  std::vector<uint64_t> addresses = {0x10000000, 0x10001000, 0x10002000,
                                     0x10003000};

  auto [minIt, maxIt] = std::minmax_element(addresses.begin(), addresses.end());

  std::println("Address range: {:#x}-{:#x}", *minIt, *maxIt);

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-minmax
Address range: 0x10000000-0x10003000

std::clamp

Use case

Constrain value to a range.

Explanation

Returns lo if v < lo, hi if v > hi, otherwise v. Useful for bounds checking instruction operands like branch offsets.

Time complexity: O(1). Just two comparisons. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>

constexpr int32_t operator""_MB(unsigned long long mb) {
  return mb * 1024 * 1024;
}

int main() {
  // https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/BL--Branch-with-link-?lang=en
  constexpr int32_t BL_MAX = 128_MB;
  constexpr int32_t BL_MIN = -128_MB;

  int32_t requested = 200_MB;
  int32_t clamped = std::clamp(requested, BL_MIN, BL_MAX);

  std::println("Clamped: {:} MB", clamped / 1024 / 1024);

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-clamp
Clamped: 128 MB

std::partition

Use case

Reorder elements so those satisfying predicate come first.

Explanation

Not stable. Use stable_partition to preserve relative order. Returns iterator to first element not satisfying the predicate.

Time complexity: O(n). Single pass. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <string>
#include <vector>

struct Section {
  std::string name;
  uint32_t flags;
};

// https://developer.apple.com/library/archive/documentation/DeveloperTools/Reference/Assembler/040-Assembler_Directives/asm_directives.html#//apple_ref/doc/uid/TP30000823-SW1
// https://llvm.org/doxygen/namespacellvm_1_1MachO.html
constexpr uint32_t S_ATTR_PURE_INSTRUCTIONS = 0x80000000;
constexpr uint32_t S_ATTR_SOME_INSTRUCTIONS = 0x00000400;

bool isExecutable(const Section &sec) {
  return (sec.flags & S_ATTR_PURE_INSTRUCTIONS) ||
         (sec.flags & S_ATTR_SOME_INSTRUCTIONS);
}

int main() {
  std::vector<Section> sections = {{"__text", S_ATTR_PURE_INSTRUCTIONS},
                                   {"__cstring", 0},
                                   {"__const", 0},
                                   {"__data", 0}};

  auto partPoint =
      std::partition(sections.begin(), sections.end(), isExecutable);

  std::println("Executable:");
  for (auto it = sections.begin(); it != partPoint; ++it) {
    std::print("{} ", it->name);
  }
  std::println("");

  std::println("Non-executable:");
  for (auto it = partPoint; it != sections.end(); ++it) {
    std::print("{} ", it->name);
  }
  std::println("");

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-partition
Executable:
__text 
Non-executable:
__cstring __const __data

std::set_intersection/std::set_union/std::set_difference

Use case

Perform set operations on sorted ranges.

Explanation

Input ranges must be sorted. Output is sorted. Single pass through both ranges. Useful for comparing symbol tables or finding common/unique symbols.

Time complexity: O(n + m) where n and m are the sizes of the two input ranges. See possible implementations: std::set_intersection, std::set_union, std::set_difference.

Code

#include <algorithm>
#include <iterator>
#include <print>
#include <string>
#include <vector>

int main() {
  std::vector<std::string> binaryA = {"_main", "_helper", "_cleanup",
                                      "_old_func"};
  std::vector<std::string> binaryB = {"_main", "_helper", "_cleanup",
                                      "_new_func"};

  std::vector<std::string> common, onlyA, onlyB;

  std::set_intersection(binaryA.begin(), binaryA.end(), binaryB.begin(),
                        binaryB.end(), std::back_inserter(common));

  std::set_difference(binaryA.begin(), binaryA.end(), binaryB.begin(),
                      binaryB.end(), std::back_inserter(onlyA));

  std::set_difference(binaryB.begin(), binaryB.end(), binaryA.begin(),
                      binaryA.end(), std::back_inserter(onlyB));

  std::println("Common:");
  for (const auto &sym : common) {
    std::print("{} ", sym);
  }
  std::println("");

  std::println("Only A:");
  for (const auto &sym : onlyA) {
    std::print("{} ", sym);
  }
  std::println("");

  std::println("Only B:");
  for (const auto &sym : onlyB) {
    std::print("{} ", sym);
  }
  std::println("");

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-set_intersection-set_union-set_difference
Common:
_main _helper _cleanup 
Only A:
_old_func 
Only B:
_new_func

std::merge

Use case

Merge two sorted ranges into one sorted output.

Explanation

Both inputs must be sorted. Useful for combining symbol tables.

Time complexity: O(n + m). Single pass through both ranges. See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <string>
#include <vector>

struct Symbol {
  uint64_t address;
  std::string name;
};

int main() {
  std::vector<Symbol> obj1 = {{0x1000, "_init"}, {0x1100, "_helper"}};

  std::vector<Symbol> obj2 = {{0x1200, "_main"}, {0x1300, "_cleanup"}};

  std::vector<Symbol> merged(obj1.size() + obj2.size());

  std::merge(
      obj1.begin(), obj1.end(), obj2.begin(), obj2.end(), merged.begin(),
      [](const Symbol &a, const Symbol &b) { return a.address < b.address; });

  for (const auto &sym : merged) {
    std::println("{:#x} {}", sym.address, sym.name);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-merge
0x1000 _init
0x1100 _helper
0x1200 _main
0x1300 _cleanup

std::generator

std::rotate

Use case

Rotate elements so that a given element becomes the first.

Explanation

Left rotation: elements before middle move to end. Use rotate_copy for non-modifying version. Returns iterator to original first element’s new position.

Time complexity: O(n). Each element is moved once. See possible implementation.

Code

#include <algorithm>
#include <print>
#include <string>
#include <vector>

int main() {
  // https://100daysofredteam.medium.com/mach-o-file-format-for-red-team-professionals-fb89ea6c311e
  std::vector<std::string> loadCommands = {"LC_SEGMENT_64 (__TEXT)",
                                           "LC_SEGMENT_64 (__DATA)", "LC_MAIN",
                                           "LC_SEGMENT_64 (__LINKEDIT)"};

  auto mainIt = std::find(loadCommands.begin(), loadCommands.end(), "LC_MAIN");

  if (mainIt != loadCommands.end()) {
    std::rotate(loadCommands.begin(), mainIt, loadCommands.end());
  }

  std::println("Rotated load commands:");
  for (const auto &lc : loadCommands) {
    std::println("{}", lc);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-rotate
Rotated load commands:
LC_MAIN
LC_SEGMENT_64 (__LINKEDIT)
LC_SEGMENT_64 (__TEXT)
LC_SEGMENT_64 (__DATA)

std::shuffle

Use case

Randomly reorder elements using a random number generator.

Explanation

Requires a uniform random generator (like std::mt19937). Useful for obfuscation and fuzzing.

Time complexity: O(n) (exactly n - 1 swaps). See possible implementation.

Code

#include <algorithm>
#include <cstdint>
#include <print>
#include <random>
#include <vector>

int main() {
  std::vector<uint32_t> instructions = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
      // sub	sp, sp, #32                     // encoding: [0xff,0x83,0x00,0xd1]
      0xd10083ff,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f};

  // Fix seed.
  std::mt19937 rng{0xFF};

  std::shuffle(instructions.begin(), instructions.end(), rng);

  for (const auto &insn : instructions) {
    std::println("{:#x}", insn);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-shuffle
0xd10083ff
0xd503201f
0x94000010

std::next_permutation

Use case

Generate next permutation in dictionary order.

Explanation

“Lexicographically greater” means dictionary order (how words are sorted alphabetically). The permutations are generated in sorted order. Returns false when wrapper back to the smallest permutation (sorted order).

Time complexity: O(n) per call but n! total permutations exist. See possible implementation.

Code

#include <algorithm>
#include <print>
#include <string>
#include <vector>

int main() {
  std::vector<std::string> regs = {"x0", "x1", "x2"};

  std::println("Possible register orderings:");
  std::sort(regs.begin(), regs.end());

  do {
    std::println("{}, {}, {}", regs[0], regs[1], regs[2]);
  } while (std::next_permutation(regs.begin(), regs.end()));

  return 0;
}

View on GitHub.

Output

$ ./src/algorithms/build/std-next_permutation
x0, x1, x2
x0, x2, x1
x1, x0, x2
x1, x2, x0
x2, x0, x1
x2, x1, x0

std::array

Explanation

Fixed-size array. Size must be known at compile-time.

See std::array.

Memory: elements stored inline (stack if local variable, heap if newed).

Time complexity:

OperationComplexity
[], at()O(1)
front(), back()O(1)
size(),empty()O(1)

Use case

  • Size known at compile-time.
  • Small collections.
  • Embedded systems avoiding heap allocation.

Code

#include <array>
#include <print>

int main() {
  std::array<uint32_t, 3> opcodes = {
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      // nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f,
      // $ echo "mov x0, #1" | llvm-mc -triple=aarch64 -show-encoding
      // mov	x0, #1                          // encoding: [0x20,0x00,0x80,0xd2]
      0xd2800020,
      // $ echo "bl #0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010};

  std::println("Size: {}", opcodes.size());
  std::println("First: {:#x}", opcodes.front());
  std::println("Last: {:#x}", opcodes.back());
  std::println("At 2: {:#x}", opcodes.at(2));

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-array
Size: 3
First: 0xd503201f
Last: 0x94000010
At 2: 0x94000010

std::vector

Explanation

Dynamic array. The default container, use this unless we have a specific reason not to.

See std::vector.

Memory: control block on stack, elements in heap (contiguous).

Time complexity:

OperationComplexity
[], at()O(1)
front(), back()O(1)
size(),empty()O(1)
push_back(), pop_back()O(1) amortized
insert()/erase()O(n) (must shift elements)
reserve()/resize()O(n)

Amortized means it is not truly constant. When the array is full, it needs to allocate a new, larger block of memory and copy everything over. Some implementations might also shrink the array, therefore pop_back() is marked as amortized as well.

Use case

  • Default choice for sequences.
  • Size not known at compile time.
  • Need fast random access.

Code

#include <print>
#include <vector>

int main() {
  std::vector<uint32_t> instructions;

  // Reserve to avoid reallocations.
  instructions.reserve(100);

  // $ echo "bl #0x40" | llvm-mc -triple=aarch64 -show-encoding
  // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
  instructions.push_back(0x94000010);
  instructions.push_back(0x94000010);
  instructions.push_back(0x94000010);

  std::println("Size: {}, capacity: {}", instructions.size(),
               instructions.capacity());

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-vector
Size: 3, capacity: 100

std::deque

Explanation

Double-ended queue. Fast insertion/removal at both front and back.

See std::deque.

Memory: array of pointers to fixed-size chunks (all on heap).

Time complexity:

OperationComplexity
[], at()O(1)
front(), back()O(1)
size(),empty()O(1)
push_back(), pop_back()O(1)
push_front(), pop_front()O(1)
insert()/erase()O(n)

Use case

  • Need fast push_front/pop_front.
  • Need stable references (push/insert do not invalidate).
  • Do not need contiguous memory.

Code

#include <deque>
#include <print>

int main() {
  std::deque<uint64_t> addresses;

  addresses.push_back(0x10001000);
  addresses.push_front(0x10002000);

  std::println("Front: {:#x}", addresses.front());
  std::println("Back: {:#x}", addresses.back());

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-deque
Front: 0x10002000
Back: 0x10001000

std::list

Explanation

Doubly-linked list. Each element in its own heap allocation.

See std::list.

Memory: each node contains element + two pointers (prev/next). High memory overhead.

Time complexity:

OperationComplexity
front(), back()O(1)
size(),empty()O(1)
push_back(), pop_back()O(1)
insert()/erase()O(1) (with iterator)
splice()O(1)
sort()O(n log n)

“With iterator” means we already have an iterator to the position where we want to insert/erase.

If sort() uses merge sort (like libstdc++ does), then the time complexity is O(n log n).

Use case

  • Frequent insertion/removal in the middle (with iterator).
  • Need stable iterator after modification.
  • std::list is rather rarely used. std::vector is usually faster due to cache locality.

Code

#include <list>
#include <print>

int main() {
  std::list<uint64_t> breakpoints = {0x1000, 0x2000, 0x3000};

  auto it = std::next(breakpoints.begin());
  breakpoints.insert(it, 0x1500);

  for (const auto &br : breakpoints) {
    std::println("{:#x}", br);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-list
0x1000
0x1500
0x2000
0x3000

std::forward_list

Explanation

Singly-linked list. Minimal memory overhead compared to std::list.

See std::forward_list.

Memory: each node contains element + one pointer (next).

Time complexity:

OperationComplexity
front()O(1)
push_front(), pop_front()O(1)
insert_after()/erase_after()O(1) (with iterator)

“With iterator” means we already have an iterator to the position where we want to insert/erase.

Use case

  • Rarely used. Only if memory is extremely tight and we only traverse forward.

std::map

Explanation

Sorted key-value pairs. Implemented as red-black tree (self-balancing BST).

See std::map.

Memory: each entry in its own tree node (key + value + 3 pointers + color bit). The 3 pointers are: left child pointer, right child pointer, parent pointer. High overhead.

Time complexity:

OperationComplexity
find(), count(), contains()O(log n)
[], atO(log n)
insert()/emplace()O(log n)
erase()O(log n) (by key)
erase()O(1) (by iterator, amortized)

“By iterator” means we already have an iterator to the position where we want to erase.

Amortized means it is not truly constant. Rebalancing might be triggered.

Use case

  • Need sorted iteration by key.
  • Moderate number of elements.

Code

#include <map>
#include <print>
#include <string>

int main() {
  std::map<uint64_t, std::string> symbols;

  symbols.emplace(0x10001000, "_main");
  symbols.emplace(0x10003000, "_init");
  symbols.emplace(0x10002000, "_helper");

  std::println("Symbols (sorted by address):");
  for (const auto &[addr, name] : symbols) {
    std::println("{:#x}: {}", addr, name);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-map
Symbols (sorted by address):
0x10001000: _main
0x10002000: _helper
0x10003000: _init

std::unordered_map

Explanation

Hash table.

See std::unordered_map.

Memory: array of buckets (each bucket is a linked list of entries). Can use more memory than std::map due to bucket overhead.

Time complexity:

OperationComplexity
find(), count(), contains()O(1)
[], atO(1)
insert()/emplace()O(1)
erase()O(1) (by key)

Worst case is O(n) for every operation. It occurs with many hash collisions (all keys hash to same bucket).

Use case

  • Fast lookup is priority.
  • Do not need sorted iteration.
  • Large number of elements.
  • std::unordered_map can be slower than std::map for small sizes due to hash function overhead.

Code

#include <print>
#include <string>
#include <unordered_map>

int main() {
  std::unordered_map<std::string, uint64_t> symbolLookup;

  symbolLookup.emplace("_main", 0x10001000);
  symbolLookup.emplace("_helper", 0x10002000);
  symbolLookup.emplace("_init", 0x10003000);

  std::println("_main is at: {:#x}", symbolLookup["_main"]);

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-unordered_map
_main is at: 0x10001000

std::flat_map

Explanation

Sorted key-value pairs stored in contiguous memory (two vectors internally).

See std::flat_map.

Memory: two std::vectors, one for keys, one for value. Cache-friendly.

Time complexity:

OperationComplexity
find(), count(), contains()O(log n)
[], atO(log n)
insert()/emplace()O(n) (must shift elements)
erase()O(n) (must shift elements)

Use case

  • Read-heavy workloads.
  • Need sorted iteration with better cache performance than std::map.

Code

// std::flat_map is not fully implemented everywhere yet.
// https://en.cppreference.com/w/cpp/compiler_support/23.html
#ifdef __APPLE__
#include <flat_map>
#include <print>
#include <string>

int main() {
  std::flat_map<uint64_t, std::string> symbols;

  symbols.emplace(0x10001000, "_main");
  symbols.emplace(0x10003000, "_init");
  symbols.emplace(0x10002000, "_helper");

  for (const auto &[addr, name] : symbols) {
    std::println("{:#x}: {}", addr, name);
  }

  return 0;
}
#else
int main() { return 0; }
#endif

View on GitHub.

Output

$ ./src/data-structures/build/std-flat_map
0x10001000: _main
0x10002000: _helper
0x10003000: _init

std::set/std::unordered_set

Explanation

Like map/unordered_map but stores only keys.

See std::set and std::unordered_set.

Time complexity:

Operationstd::setstd::unordered_set
find(), count(), contains()O(log n)O(1) avg, O(n) worst
insert()/emplace()O(log n)O(1) avg, O(n) worst
erase()O(log n)O(1) avg, O(n) worst

Use case

  • Deduplication.

Code

#include <print>
#include <set>
#include <unordered_set>

int main() {
  std::set<uint64_t> visitedAddrs;

  visitedAddrs.insert(0x1000);
  visitedAddrs.insert(0x1100);
  visitedAddrs.insert(0x1000);

  std::println("Visited {} unique addresses.", visitedAddrs.size());

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-set-std-unordered_set
Visited 2 unique addresses.

std::multimap/std::multiset

Explanation

Like map/set but allows duplicate keys.

See std::multimap and std::multiset.

Time complexity: same as std::map/std::set.

OperationComplexity
find(), count(), contains()O(log n)
insert()/emplace()O(log n)
erase() by keyO(log n + k) where k = num of matching keys

Use case

  • Multiple values per key, grouping related items.

Code

#include <map>
#include <print>
#include <string>

int main() {
  // Multiple symbols can have the same address (aliases)-
  std::multimap<uint64_t, std::string> symbols;

  symbols.emplace(0x10001000, "_main");
  symbols.emplace(0x10001000, "main");
  symbols.emplace(0x10002000, "_helper");

  auto range = symbols.equal_range(0x10001000);

  std::println("Symbols at 0x10001000:");
  for (auto it = range.first; it != range.second; ++it) {
    // it->second is the value (std::string name)
    std::println("{}", it->second);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-multimap-std-multiset
Symbols at 0x10001000:
_main
main

std::pair

Explanation

Holds exactly two values of potentially different types.

See std::pair.

Memory: inline (stack if local). Size = sizeof(T1) + sizeof(T2) + padding.

Code

#include <print>
#include <utility>

int main() {
  std::pair<uint64_t, std::string> symbol{0x10001000, "_main"};

  auto [addr, name] = symbol;
  std::println("{} @ {:#x}", name, addr);

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-pair
_main @ 0x10001000

std::tuple

Explanation

Holds any number of values of potentially different types.

See std::tuple.

Memory: inline. Size = sum of all element sizes + padding.

Code

#include <print>
#include <string>
#include <tuple>

int main() {
  std::tuple<uint64_t, std::string, size_t> symbol{0x10001000, "_main", 0x100};

  auto [addr, name, size] = symbol;

  std::println("{} @ {:#x} (size: {:#x})", name, addr, size);

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-tuple
_main @ 0x10001000 (size: 0x100)

std::string

Explanation

Dynamic character array. Similar to std::vector<char> but with string-specific operations.

See std::string.

Memory: usually heap-allocated, but small strings might be stored inline (SSO).

Time complexity:

OperationComplexity
[], atO(1)
length()/size()/empty()O(1)
append()O(k) where k = appended length
substr()O(k) where k = substring length
compare()O(n)

Code

#include <print>
#include <string>

int main() {
  std::string symbolName = "_main";
  symbolName.append("_v2");

  std::println("Contains 'main': {}", symbolName.contains("main"));

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-string
Contains 'main': true

std::span

Explanation

Non-owning view over contiguous memory. Does not allocate.

See std::span.

Time complexity:

OperationComplexity
[], front(), back()O(1)
size()/empty()O(1)
data()O(1)

Use case

  • Function parameters that accept any contiguous container (vector, array, C array).

Code

#include <array>
#include <print>
#include <span>
#include <vector>

void processInstruction(std::span<const uint32_t> instructions) {
  for (auto insn : instructions) {
    std::println("{:#x}", insn);
  }
}

int main() {
  std::vector<uint32_t> vec = {
      // $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
      // bl	#64                             // encoding: [0x10,0x00,0x00,0x94]
      0x94000010,
      // $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
      // sub	sp, sp, #32                     // encoding: [0xff,0x83,0x00,0xd1]
      0xd10083ff,
      // $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
      //	nop                                     // encoding: [0x1f,0x20,0x03,0xd5]
      0xd503201f};

  std::array<uint32_t, 2> arr = {0x94000010, 0xd10083ff};

  uint32_t cArr[] = {0x94000010, 0xd10083ff};

  processInstruction(vec);
  processInstruction(arr);
  processInstruction(cArr);

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-span
0x94000010
0xd10083ff
0xd503201f
0x94000010
0xd10083ff
0x94000010
0xd10083ff

std::string_view

Explanation

Non-owning view over character data. Avoids copying strings.

See std::string_view.

Time complexity:

OperationComplexity
[], front(), back()O(1)
size()/empty()O(1)
compare()O(n)

Code

#include <print>
#include <string_view>

void printSymbol(std::string_view name) { std::println("Symbol: {}", name); }

int main() {
  std::string str = "_main";
  const char *cstr = "_helper";

  printSymbol(str);
  printSymbol(cstr);

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-string_view
Symbol: _main
Symbol: _helper

std::stack

Explanation

LIFO (Last In, First Out). Uses std::deque internally.

See std::stack.

Time complexity:

OperationComplexity
push()O(1)
pop()O(1)
empty(), size()O(1)

Code

#include <print>
#include <stack>

int main() {
  std::stack<uint64_t> callStack;

  callStack.push(0x1000);
  callStack.push(0x2000);
  callStack.push(0x3000);

  while (!callStack.empty()) {
    std::println("Return to: {:#x}", callStack.top());
    callStack.pop();
  }

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-stack
Return to: 0x3000
Return to: 0x2000
Return to: 0x1000

std::queue

Explanation

FIFO (First In, First Out). Uses std::deque internally.

See std::queue.

Time complexity:

OperationComplexity
push()O(1)
pop()O(1)
empty(), size()O(1)

Code

#include <print>
#include <queue>

int main() {
  std::queue<uint64_t> workQueue;

  workQueue.push(0x1000);
  workQueue.push(0x2000);
  workQueue.push(0x3000);

  while (!workQueue.empty()) {
    std::println("Process: {:#x}", workQueue.front());
    workQueue.pop();
  }

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-queue
Process: 0x1000
Process: 0x2000
Process: 0x3000

std::priority_queue

Explanation

Largest element always on top. Uses std::vector internally.

See std::priority_queue.

Time complexity:

OperationComplexity
push()O(log n)
pop()O(log n)
empty(), size()O(1)

std::priority_queue is implemented as a binary heap.

Code

#include <print>
#include <queue>

int main() {
  std::priority_queue<int, std::vector<int>, std::less<int>> maxHeap;

  maxHeap.push(0x2000);
  maxHeap.push(0x3000);
  maxHeap.push(0x1000);

  while (!maxHeap.empty()) {
    std::println("Process: {:#x}", maxHeap.top());
    maxHeap.pop();
  }

  return 0;
}

View on GitHub.

Output

$ ./src/data-structures/build/std-priority_queue
Process: 0x3000
Process: 0x2000
Process: 0x1000

Visitor pattern

Use case

Process different instruction types in a disassembler without modifying each instruction class.

Explanation

The visitor pattern separates algorithms from the objects they operate on. Modern C++ uses std::variant + std::visit.

Code

#include <cstdint>
#include <print>
#include <string>
#include <variant>
#include <vector>

// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/MOV--register---Move-register-value--an-alias-of-ORR--shifted-register--?lang=en
struct MovInstr {
  std::string dest;
  std::string src;
};

// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/BL--Branch-with-link-?lang=en
struct BlInstr {
  uint64_t target;
  std::string name;
};

// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/RET--Return-from-subroutine-?lang=en
struct RetInstr {};

using Instruction = std::variant<MovInstr, BlInstr, RetInstr>;

struct DisasmVisitor {
  void operator()(const MovInstr &m) const {
    std::println("mov {}, {}", m.dest, m.src);
  }
  void operator()(const BlInstr &b) const {
    std::println("bl {:#x} ; {}", b.target, b.name);
  }
  void operator()(const RetInstr &) const { std::println("ret"); }
};

int main() {
  std::vector<Instruction> code = {MovInstr{"x0", "x1"},
                                   BlInstr{0x10001000, "_printf"}, RetInstr{}};

  std::println("Disassembly:");
  for (const auto &instr : code) {
    std::visit(DisasmVisitor{}, instr);
  }

  return 0;
}

View on GitHub.

Output

$ ./src/design-patterns-and-idioms/build/visitor-pattern
Disassembly:
mov x0, x1
bl 0x10001000 ; _printf
ret

Factory pattern

Use case

Create architecture-specific disassembler without hardcoding which class to instantiate.

Explanation

Factory encapsulates object creation logic. Client code works with abstract interface while concrete types are selected at runtime. Modern C++ uses std::unique_ptr for ownership.

Code

#include <cstdint>
#include <memory>
#include <print>
#include <string>

class Disassembler {
public:
  virtual ~Disassembler() = default;
  virtual std::string name() const = 0;
};

class X64Disassembler : public Disassembler {
public:
  std::string name() const override { return "x86-64"; }
};

class ARM64Disassembler : public Disassembler {
public:
  std::string name() const override { return "arm64"; }
};

std::unique_ptr<Disassembler> create_disassembler(const std::string &arch) {
  if (arch == "x86-64") {
    return std::make_unique<X64Disassembler>();
  }
  if (arch == "arm64") {
    return std::make_unique<ARM64Disassembler>();
  }
  return nullptr;
}

int main() {
  for (auto arch : {"x86-64", "arm64", "mips"}) {
    if (auto disasm = create_disassembler(arch)) {
      std::println("{}", disasm->name());
    } else {
      std::println("Not supported: {}", arch);
    }
  }
  return 0;
}

View on GitHub.

Output

$ ./src/design-patterns-and-idioms/build/factory-pattern
x86-64
arm64
Not supported: mips

CRTP (Curiously Recurring Template Pattern)

Use case

Implement common functionality for memory region types without virtual function overhead.

Explanation

CRTP provides static polymorphism: a base class template takes the derived class as a parameter, enabling compile-time method dispatch (via static_cast). Zero runtime overhead compared to virtual functions (at the cost of increased binary size).

Code

#include <cstdint>
#include <print>

template <typename Derived> class MemoryRegion {
public:
  void dump() const {
    // Dereference this, then cast the base reference to derived reference.
    const auto &self = static_cast<const Derived &>(*this);
    std::println("{}: {:#x}", self.name(), self.start());
  }
};

class CodeSection : public MemoryRegion<CodeSection> {
  uint64_t base_;

public:
  CodeSection(uint64_t base) : base_(base) {}
  uint64_t start() const { return base_; }
  const char *name() const { return ".text"; }
};

class DataSection : public MemoryRegion<DataSection> {
  uint64_t base_;

public:
  DataSection(uint64_t base) : base_(base) {}
  uint64_t start() const { return base_; }
  const char *name() const { return ".data"; }
};

int main() {
  CodeSection text{0x10001000};
  DataSection data{0x10002000};

  text.dump();
  data.dump();

  return 0;
}

View on GitHub.

Output

$ ./src/design-patterns-and-idioms/build/crtp
.text: 0x10001000
.data: 0x10002000

RAII (Resource Acquisition Is Initialization)

Use case

Automatically release a lock on shared disassembler cache when leaving scope.

Explanation

RAII ties resource lifetime to object lifetime. Constructor acquires, destructor releases. Guarantees cleanup even on exceptions.

Code

#include <cstdint>
#include <mutex>
#include <print>

std::mutex g_cache_mutex;

class CacheLock {
public:
  CacheLock(std::mutex &m) : mutex_(m) {
    mutex_.lock();
    std::println("Cache locked.");
  }

  ~CacheLock() {
    mutex_.unlock();
    std::println("Cache unlocked.");
  }

private:
  std::mutex &mutex_;
};

void disassemble(uint64_t addr) {
  CacheLock lock(g_cache_mutex);
  std::println("Disassembling {:#x}...", addr);
}

int main() {
  disassemble(0x10001000);
  disassemble(0x10002000);
  return 0;
}

View on GitHub.

Output

$ ./src/design-patterns-and-idioms/build/raii
Cache locked.
Disassembling 0x10001000...
Cache unlocked.
Cache locked.
Disassembling 0x10002000...
Cache unlocked.

Singleton

Use case

Global symbol table that must be initialized once and accessed from multiple analysis passes.

Explanation

Meyer’s singleton uses a function-local static variable.

Code

#include <cstdint>
#include <print>
#include <string>
#include <unordered_map>

class SymbolTable {
public:
  static SymbolTable &instance() {
    static SymbolTable table;
    return table;
  }

  void add(uint64_t addr, const std::string &name) {
    symbols_.emplace(addr, name);
  }

  std::string lookup(uint64_t addr) const {
    auto it = symbols_.find(addr);
    if (it != symbols_.end()) {
      return it->second;
    } else {
      return std::format("sub_{:#x}", addr);
    }
  }

private:
  SymbolTable() { std::println("Symbol table created."); }
  std::unordered_map<uint64_t, std::string> symbols_;
};

int main() {
  SymbolTable::instance().add(0x10001000, "_main");
  SymbolTable::instance().add(0x10002000, "_decrypt");

  std::println("call {}", SymbolTable::instance().lookup(0x10001000));
  std::println("call {}", SymbolTable::instance().lookup(0x10002000));
  std::println("call {}", SymbolTable::instance().lookup(0x10003000));

  return 0;
}

View on GitHub.

Output

$ ./src/design-patterns-and-idioms/build/singleton
Symbol table created.
call _main
call _decrypt
call sub_0x10003000