Introduction
This book teaches modern C++ algorithms, data structures and design patterns (C++11, C++14, C++17, C++20, C++23) through minimal examples written for reverse engineers.
The source code is available here.
🚧 The book is still under construction. New chapters will be added and the existing ones might be modified.
References
Time complexity overview
Time complexity describes how an algorithm’s runtime grows relative to input size (n). These are worst-case bounds (you might get lucky and find an element on the first try, but we plan for the worst).
| Notation | Name | Meaning | Example |
|---|---|---|---|
| O(1) | Constant | Same time regardless of input size. | Array index access. |
| O(log n) | Logarithmic | Halves problem each step. | Binary search. |
| O(n) | Linear | Touches each element once. | Finding max in unsorted list. |
| O(n log n) | Linearithmic | Divide in half repeatedly, touch all elements each level. | Merge sort. |
| O(n^2) | Quadratic | Nested loops. | Bubble sort. |
For 1,000,000 elements (worst case):
- O(1): 1 operation
- O(log n): log2 1,000,000 ~= 19.93 -> 20 operations
- O(n): 1,000,000 operations
- O(n log n): 1,000,000 * log2 1,000,000 ~= 1,000,000 * 19.93 -> 20,000,000 operations
- O(n^2): 1,000,000 * 1,000,000 operations
Merge sort
Steps:
- Split the array in half repeatedly, until we have single elements.
- Merge them back together in sorted order.
Example:
[2,3,1,4]
1. split
[2,3] [1,4]
2. split
[2][3] [1][4] // single elements, log n splits to get here (log 4 = 2)
1. merge
[2,3] [1,4] // 2 comparisons
2. merge
[1,2,3,4] // n-1 (3) comparisons
Total: <n * log2 n -> n * log2 n
Big O is about scaling so we can ignore the < sign.
Bubble sort
Repeatedly walk through, swap adjacent pairs if out of order. Largest element “bubbles” to the end each pass.
[4,3,2,1]
1. pass // n - 1 comparisons
[3,2,1,4]
2. pass // n - 2 comparisons
[2,1,3,4]
3. pass // n - 3 comparisons
[1,2,3,4]
The number of comparisons is decreasing after each pass, because we know that the largest elements are being moved to the end.
To summarize:
pass 1: n - 1 comparisons
pass 2: n - 2 comparisons
pass n-1: 1 comparison
Total: (n-1) + (n-2) + … + 1.
Or written the other way: 1 + 2 + … + (n-1).
Doubling trick so we can simplify it:
1 + 2 + ... + (n-1)
+ (n-1) + (n-2) + ... + 1
-----------------------------
n + n + ... + n
Each column sums to n. How many columns? There are (n-1) numbers.
So we get n*(n-1) but we need to divide it by 2 (to undo the doubling): n*(n-1)/2 = (n^2-n)/2 -> O(n^2).
Big O is about scaling so it ignores 2 and -n.
std::find
Use case
Find a specific value in a sequence.
Explanation
Returns iterator to first element equal to value or end() if not found. Must check every element in worst case (element not present or at end).
Time complexity: O(n). Scans elements one by one until found or end reached. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
constexpr uint32_t ARM64_NOP = 0xd503201f;
int main() {
std::vector<uint32_t> instructions = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
// sub sp, sp, #32 // encoding: [0xff,0x83,0x00,0xd1]
0xd10083ff,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f};
auto it = std::find(instructions.begin(), instructions.end(), ARM64_NOP);
if (it != instructions.end()) {
auto index = std::distance(instructions.begin(), it);
std::println("Found NOP at index: {}", index);
} else {
std::println("No NOP found.");
}
return 0;
}
Output
$ ./src/algorithms/build/std-find
Found NOP at index: 2
std::find_if
Use case
Find first element matching a predicate.
Explanation
More flexible than std::find (accepts a unary predicate that takes one argument). Essential for finding instructions by property (rather than exact value). Stops at first match.
Time complexity: O(n). Evaluates predicate for each element until match found. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/B--Branch-?lang=en
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/BL--Branch-with-link-?lang=en
constexpr bool isBranch(uint32_t opcode) {
uint32_t op = opcode >> 26;
return op == 0x05 /* B */ || op == 0x25 /* BL */;
}
int main() {
std::vector<uint32_t> instructions = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
// sub sp, sp, #32 // encoding: [0xff,0x83,0x00,0xd1]
0xd10083ff,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f};
auto it = std::find_if(instructions.begin(), instructions.end(), isBranch);
if (it != instructions.end()) {
auto index = std::distance(instructions.begin(), it);
std::println("Found branch at index: {}", index);
}
return 0;
}
Output
$ ./src/algorithms/build/std-find_if
Found branch at index: 0
std::count/std::count_if
Use case
Count occurences of a value or elements matching a predicate.
Explanation
count for exact matches, count_if for predicate-based counting. Always travels entire range.
Time complexity: O(n). Must examine every element to get accurate count. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
constexpr uint32_t ARM64_NOP = 0xd503201f;
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/B--Branch-?lang=en
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/BL--Branch-with-link-?lang=en
constexpr bool isBranch(uint32_t opcode) {
uint32_t op = opcode >> 26;
return op == 0x05 /* B */ || op == 0x25 /* BL */;
}
int main() {
std::vector<uint32_t> instructions = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
// sub sp, sp, #32 // encoding: [0xff,0x83,0x00,0xd1]
0xd10083ff,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f};
auto nopCount =
std::count(instructions.begin(), instructions.end(), ARM64_NOP);
auto branchCount =
std::count_if(instructions.begin(), instructions.end(), isBranch);
std::println("NOP count: {}", nopCount);
std::println("Branch count: {}", branchCount);
return 0;
}
Output
$ ./src/algorithms/build/std-count
NOP count: 1
Branch count: 1
std::all_of/std::any_of/std:none_of
Use case
Check if predicate holds for all/any/none of the elements.
Explanation
Useful for validating instruction sequences.
Time complexity: O(n) worst-case but short-circuits:
all_of: stops at firstfalseany_of: stops at firsttruenone_of: stops at firsttrue
Best case O(1) if first element determines result.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/B--Branch-?lang=en
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/BL--Branch-with-link-?lang=en
constexpr bool isBranch(uint32_t opcode) {
uint32_t op = opcode >> 26;
return op == 0x05 /* B */ || op == 0x25 /* BL */;
}
int main() {
std::vector<uint32_t> instructions = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
// sub sp, sp, #32 // encoding: [0xff,0x83,0x00,0xd1]
0xd10083ff,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f};
bool allBranch =
std::all_of(instructions.begin(), instructions.end(), isBranch);
bool anyBranch =
std::any_of(instructions.begin(), instructions.end(), isBranch);
bool noneBranch =
std::none_of(instructions.begin(), instructions.end(), isBranch);
std::println("All branch instructions: {}", allBranch);
std::println("Any branch instructions: {}", anyBranch);
std::println("No branch instructions: {}", noneBranch);
return 0;
}
Output
$ ./src/algorithms/build/std-all_of-std-any_of-std-none_of
All branch instructions: false
Any branch instructions: true
No branch instructions: false
std::for_each
Use case
Apply a function to each element.
Explanation
Time complexity: O(n). Applies function exactly once per element. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
int main() {
std::vector<uint32_t> instructions = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
// sub sp, sp, #32 // encoding: [0xff,0x83,0x00,0xd1]
0xd10083ff,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f};
std::for_each(instructions.begin(), instructions.end(),
[](uint32_t insn) { std::println("{:#x}", insn); });
return 0;
}
Output
$ ./src/algorithms/build/std-for_each
0x94000010
0xd10083ff
0xd503201f
std::equal
Use case
Compare two ranges for equality.
Explanation
Return true if all corresponding elements are equal. Essential for verifying code integrity and detecting patches.
Time complexity: O(n). Compare elements pairwise until mismatch or end. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
int main() {
std::vector<uint32_t> original = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
// sub sp, sp, #32 // encoding: [0xff,0x83,0x00,0xd1]
0xd10083ff,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f};
std::vector<uint32_t> modified = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f};
bool tampered = std::equal(original.begin(), original.end(), modified.begin(),
modified.end());
std::println("Original == modified: {}", tampered);
return 0;
}
Output
$ ./src/algorithms/build/std-equal
Original == modified: false
std::mismatch
Use case
Find first position where two ranges differ.
Explanation
Returns pair of iterators to first mismatching elements. More useful than equal when we need to know where ranges differ. Essential for binary diffing.
Time complexity: O(n). Compares until first difference found. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
int main() {
std::vector<uint32_t> original = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
// sub sp, sp, #32 // encoding: [0xff,0x83,0x00,0xd1]
0xd10083ff,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f};
std::vector<uint32_t> modified = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f};
auto [origIt, modIt] = std::mismatch(original.begin(), original.end(),
modified.begin(), modified.end());
if (origIt != original.end()) {
size_t offset = std::distance(original.begin(), origIt);
std::println("First difference at index {}:", offset);
std::println(" Original: {:#x}", *origIt);
std::println(" Modified: {:#x}", *modIt);
}
return 0;
}
Output
$ ./src/algorithms/build/std-mismatch
First difference at index 1:
Original: 0xd10083ff
Modified: 0xd503201f
std::search
Use case
Find a subsequence within a sequence.
Explanation
Returns iterator to first occurence of pattern or end() if not found. Essential for gadget finding, pattern matching in code and detecting instruction sequences.
Time complexity: O(n*m) worst case, where n is haystack size and m is needle size. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
int main() {
std::vector<uint32_t> code = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
// sub sp, sp, #32 // encoding: [0xff,0x83,0x00,0xd1]
0xd10083ff,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f,
// $ echo "mov x0, 0" | llvm-mc -triple=aarch64 -show-encoding
// mov x0, #0 // encoding: [0x00,0x00,0x80,0xd2]
0xd2800000,
// $ echo "mov x1, 1" | llvm-mc -triple=aarch64 -show-encoding
// mov x1, #1 // encoding: [0x21,0x00,0x80,0xd2]
0xd2800021};
std::vector<uint32_t> gadget = {
// $ echo "mov x0, 0" | llvm-mc -triple=aarch64 -show-encoding
// mov x0, #0 // encoding: [0x00,0x00,0x80,0xd2]
0xd2800000,
// $ echo "mov x1, 1" | llvm-mc -triple=aarch64 -show-encoding
// mov x1, #1 // encoding: [0x21,0x00,0x80,0xd2]
0xd2800021};
auto it = std::search(code.begin(), code.end(), gadget.begin(), gadget.end());
if (it != code.end()) {
size_t offset = std::distance(code.begin(), it);
std::println("Found gadget at index {}", offset);
}
return 0;
}
Output
$ ./src/algorithms/build/std-search
Found gadget at index 4
std::adjacent_find
std::binary_search
Use case
Check if value exists in sorted range.
Explanation
Requires sorted range. Returns bool only. Much faster than find for large sorted datasets.
Time complexity: O(log n). Halves earch space each step. For 1M elements, only ~20 comparisons needed. See possible implementation.
Code
#include <algorithm>
#include <print>
#include <string>
#include <vector>
int main() {
std::vector<std::string> symbols = {"_helper", "_init", "_main", "_start"};
bool hasMain = std::binary_search(symbols.begin(), symbols.end(), "_main");
std::println("Symbol '_main' exists: {}", hasMain);
return 0;
}
Output
$ ./src/algorithms/build/std-binary_search
Symbol '_main' exists: true
std::sort
Use case
Sort elements in ascending order (default) or by custom comparator (in case of complex types).
Explanation
Time complexity: O(n log n). Pick pivot + partitioning: n, recursing on each half: log n. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <string>
#include <vector>
struct Symbol {
uint64_t address;
std::string name;
};
int main() {
std::vector<Symbol> symbols = {
{0x10001000, "_main"}, {0x10003000, "_fini"}, {0x10002000, "_helper"}};
std::sort(
symbols.begin(), symbols.end(),
[](const Symbol &a, const Symbol &b) { return a.address < b.address; });
for (const auto &sym : symbols) {
std::println("{:#x} {}", sym.address, sym.name);
}
return 0;
}
Output
$ ./src/algorithms/build/std-sort
0x10001000 _main
0x10002000 _helper
0x10003000 _fini
std::stable_sort
Use case
Sort while preserving relative order of equivalent elements.
Explanation
Slightly slower than std::sort due to stability guarantee.
Time complexity: O(n log n) with extra memory, worse than O(n log n) without. The extra memory is necessary for merge sort. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <string>
#include <vector>
struct Symbol {
uint64_t address;
std::string name;
int order;
};
int main() {
std::vector<Symbol> symbols = {{0x1000, "_init", 0},
{0x1080, "_helper", 1},
{0x1040, "_main", 2},
{0x1040, "main", 3},
{0x1040, "_start", 4}};
std::stable_sort(
symbols.begin(), symbols.end(),
[](const Symbol &a, const Symbol &b) { return a.address < b.address; });
for (const auto &sym : symbols) {
std::println("{:#x} {} (was #{})", sym.address, sym.name, sym.order);
}
return 0;
}
Output
$ ./src/algorithms/build/std-stable_sort
0x1000 _init (was #0)
0x1040 _main (was #2)
0x1040 main (was #3)
0x1040 _start (was #4)
0x1080 _helper (was #1)
std::partial_sort
Use case
Sort only the first k elements.
Explanation
Only sorts the first k elements. Rest are in unspecified order (but all >= sorted portion). Much faster than full sort when k << n. Perfect for “top N” queries.
Time complexity: O(n log k) where k is the number of elements to sort. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <string>
#include <vector>
struct Function {
std::string name;
uint64_t size;
};
int main() {
std::vector<Function> functions = {{"_main", 256}, {"_helper", 56},
{"_init", 32}, {"_process", 1024},
{"_cleanup", 128}, {"_validate", 512},
{"_parse", 2048}};
std::partial_sort(
functions.begin(), functions.begin() + 3, functions.end(),
[](const Function &a, const Function &b) { return a.size > b.size; });
std::println("Top 3 largest functions:");
for (size_t i = 0; i < 3; ++i) {
std::println("{}: {}", functions[i].name, functions[i].size);
}
return 0;
}
Output
$ ./src/algorithms/build/std-partial_sort
Top 3 largest functions:
_parse: 2048
_process: 1024
_validate: 512
std::is_sorted
Use case
Check if range is sorted.
Explanation
std::is_sorted_until returns iterator to first out-of-order element. Useful for validation before using algorithms that require sorted input (e.g. std::binary_search).
Time complexity: O(n) worst case but short-circuits at first out-of-order element. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
int main() {
std::vector<uint64_t> good = {0x1000, 0x2000, 0x3000, 0x4000};
std::vector<uint64_t> bad = {0x1000, 0x3000, 0x2000, 0x4000};
std::println("good sorted: {}", std::is_sorted(good.begin(), good.end()));
std::println("bad sorted: {}", std::is_sorted(bad.begin(), bad.end()));
auto it = std::is_sorted_until(bad.begin(), bad.end());
if (it != bad.end()) {
size_t offset = std::distance(bad.begin(), it);
std::println("Sorting breaks at {:#x} (index {})", *it, offset);
}
return 0;
}
Output
$ ./src/algorithms/build/std-is_sorted
good sorted: true
bad sorted: false
Sorting breaks at 0x2000 (index 2)
std::transform
Use case
Apply a function to each element and store results.
Explanation
Can transform in-place or to different container.
Time complexity: O(n). Applies transformation exactly once per element. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
int main() {
std::vector<uint32_t> offsets = {0x10, 0x20, 0x30, 0x40};
std::vector<uint64_t> addresses(offsets.size());
uint64_t base = 0x10000000;
std::transform(offsets.begin(), offsets.end(), addresses.begin(),
[base](uint32_t offs) { return base + offs; });
for (const auto &addr : addresses) {
std::println("{:#x}", addr);
}
return 0;
}
Output
$ ./src/algorithms/build/std-transform
0x10000010
0x10000020
0x10000030
0x10000040
std::copy/std::copy_if
Use case
Copy elements to another range, optionally filtering.
Explanation
copy_if is the filtered version (only copies elements matching predicate). Use with back_inserter for growing containers.
Time complexity: O(n). Visits each element once. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <print>
#include <string>
#include <vector>
struct Segment {
std::string segname;
uint64_t vmaddr;
uint64_t vmsize;
// https://www.gnu.org/software/hurd/gnumach-doc/Memory-Attributes.html
// https://docs.rs/mach2/latest/src/mach2/vm_prot.rs.html#7
uint32_t initprot;
};
constexpr uint32_t VM_PROT_EXECUTE = 4;
int main() {
/// https://securelist.com/mac-os-x/36155/
std::vector<Segment> segments = {{"__PAGEZERO", 0x0, 0x100000000, 0},
{"__TEXT", 0x100000000, 0x4000, 5},
{"__DATA", 0x100004000, 0x4000, 3},
{"_LINKEDIT", 0x100008000, 0x1000, 1}};
std::vector<Segment> executableSegments;
std::copy_if(
segments.begin(), segments.end(), std::back_inserter(executableSegments),
[](const Segment &seg) { return (seg.initprot & VM_PROT_EXECUTE) != 0; });
for (const auto &seg : executableSegments) {
std::println("{} (vmaddr: {:#x}, vmsize: {:#x})", seg.segname, seg.vmaddr,
seg.vmsize);
}
return 0;
}
Output
$ ./src/algorithms/build/std-copy
__TEXT (vmaddr: 0x100000000, vmsize: 0x4000)
std::fill/std::fill_n
Use case
Fill range with a value.
Explanation
fill takes iterator range, fill_n takes a start iterator and count. Common for initializing buffers or creating NOP sleds.
Time complexity: O(n). Assigns value to each element. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
int main() {
std::vector<uint32_t> code(5);
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
constexpr uint32_t ARM64_NOP = 0xd503201f;
std::fill(code.begin(), code.end(), ARM64_NOP);
std::println("NOP sled:");
for (const auto &insn : code) {
std::println("{:#x}", insn);
}
return 0;
}
Output
$ ./src/algorithms/build/std-fill
NOP sled:
0xd503201f
0xd503201f
0xd503201f
0xd503201f
0xd503201f
std::replace/std::replace_if
Use case
Replace values in-place.
Explanation
replace for exact value match, replace_if for prediacate-based replacement. Does not change container size.
Time complexity: O(n). Checks and potentially replaces each element. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
constexpr uint32_t ARM64_NOP = 0xd503201f;
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/BRK--Breakpoint-instruction-?lang=en
// $ echo "brk #0" | llvm-mc -triple=aarch64 -show-encoding
// brk #0 // encoding: [0x00,0x00,0x20,0xd4]
constexpr uint32_t ARM64_BRK_0 = 0xd4200000;
int main() {
std::vector<uint32_t> instructions = {
// $ echo "sub sp, sp, #0x20" | llvm-mc -triple=aarch64 -show-encoding
//sub sp, sp, #32 // encoding: [0xff,0x83,0x00,0xd1]
0xd10083ff,
// $ echo "brk #0" | llvm-mc -triple=aarch64 -show-encoding
// brk #0 // encoding: [0x00,0x00,0x20,0xd4]
0xd4200000,
// $ echo "add x29, sp, #0x20" | llvm-mc -triple=aarch64 -show-encoding
// add x29, sp, #32 // encoding: [0xfd,0x83,0x00,0x91]
0x910083fd,
// $ echo "ret" | llvm-mc -triple=aarch64 -show-encoding
// ret // encoding: [0xc0,0x03,0x5f,0xd6]
0xd65f03c0};
std::replace(instructions.begin(), instructions.end(), ARM64_BRK_0,
ARM64_NOP);
std::println("{:#x}", instructions[1]);
return 0;
}
Output
$ ./src/algorithms/build/std-replace
0xd503201f
std::remove_if
Use case
Remove elements matching predicate.
Explanation
remove_if moves unwanted elements to end, returns iterator to new logical end. Must call erase to actually remove (remove-erase idiom). C++20 adds std::erase_if which combines both steps.
Time complexity: O(n). Single pass through elements. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
constexpr uint32_t ARM64_NOP = 0xd503201f;
int main() {
std::vector<uint32_t> code = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
// sub sp, sp, #32 // encoding: [0xff,0x83,0x00,0xd1]
0xd10083ff,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f};
auto newEnd = std::remove_if(code.begin(), code.end(),
[](uint32_t insn) { return insn == ARM64_NOP; });
code.erase(newEnd, code.end());
std::println("{}", code.size());
return 0;
}
Output
$ ./src/algorithms/build/std-remove_if
2
std::unique
Use case
Remove consecutive duplicates.
Explanation
Only removes adjacent duplicates (sort first for true uniqueness). Returns iterator to new end. Like remove, requires erase to actually shrink container.
Time complexity: O(n). Single pass comparing adjacent elements. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
int main() {
std::vector<uint64_t> addresses = {0x1000, 0x1000, 0x1000,
0x1040, 0x1080, 0x1080};
auto newEnd = std::unique(addresses.begin(), addresses.end());
addresses.erase(newEnd, addresses.end());
for (const auto &addr : addresses) {
std::println("{:#x}", addr);
}
return 0;
}
Output
$ ./src/algorithms/build/std-unique
0x1000
0x1040
0x1080
std::reverse
Use case
Reverse element order.
Explanation
In-place reversal. reverse_copy for non-modifying version. Useful for endianness conversion or reversing instruction sequences.
Time complexity: O(n). Swaps n/2 pairs of elements. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
int main() {
std::vector<uint32_t> code = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
// sub sp, sp, #32 // encoding: [0xff,0x83,0x00,0xd1]
0xd10083ff,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f};
std::reverse(code.begin(), code.end());
for (const auto &insn : code) {
std::println("{:#x}", insn);
}
return 0;
}
Output
$ ./src/algorithms/build/std-reverse
0xd503201f
0xd10083ff
0x94000010
std::accumulate
Use case
Compute sum or fold operation (combining all elements into a single value) over range.
Explanation
Takes initial value and optional binary operation. Strictly left-to-right evaluation (not parallelizable). Use std::reduce for parallel-friendly version.
Time complexity: O(n). Process each element exactly once. See possible implementation.
Code
#include <algorithm>
#include <numeric>
#include <print>
#include <vector>
int main() {
std::vector<uint64_t> sectionSizes = {1024, 2048, 1024, 4096};
auto total = std::accumulate(sectionSizes.begin(), sectionSizes.end(), 0);
std::println("Total size: {}", total);
std::vector<uint8_t> bytes = {0x01, 0x02, 0x03, 0x04};
uint32_t checksum =
std::accumulate(bytes.begin(), bytes.end(), 0, std::bit_xor{});
std::println("Checksum: {:#x}", checksum);
return 0;
}
Output
$ ./src/algorithms/build/std-accumulate
Total size: 8192
Checksum: 0x4
std::iota
Use case
Fill range with sequentially increasing values.
Explanation
Fill with value, value+1, … .
Time complexity: O(n). Increments and assigns once per element. See possible implementation.
Code
#include <algorithm>
#include <numeric>
#include <print>
#include <vector>
int main() {
std::vector<uint64_t> addresses(5);
std::iota(addresses.begin(), addresses.end(), 0x10000000);
for (const auto &addr : addresses) {
std::println("{:#x}", addr);
}
return 0;
}
Output
$ ./src/algorithms/build/std-iota
0x10000000
0x10000001
0x10000002
0x10000003
0x10000004
std::reduce
Use case
Parallel-friendly version of accumulate.
Explanation
Unlike accumulate, operation must be associative (grouping does not matter) and commutative (order does not matter) for correct parallel execution.
Time complexity: O(n), sequential. See std::reduce.
Code
#include <cstdint>
#include <numeric>
#include <print>
#include <vector>
int main() {
std::vector<uint64_t> segmentSizes = {4096, 4096, 8192, 16384};
uint64_t total = std::reduce(segmentSizes.begin(), segmentSizes.end());
std::println("Total size: {}", total);
return 0;
}
Output
$ ./src/algorithms/build/std-reduce
Total size: 32768
std::min_element/std::max_element/std::minmax_element
Use case
Find smallest/largest element in range.
Explanation
Returns iterator, not value. minmax_element is more efficient than calling min_element and max_element separately.
Time complexity: O(n). See possible implementation: std::min_element, std::max_element, std::minmax_element.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <vector>
int main() {
std::vector<uint64_t> addresses = {0x10000000, 0x10001000, 0x10002000,
0x10003000};
auto [minIt, maxIt] = std::minmax_element(addresses.begin(), addresses.end());
std::println("Address range: {:#x}-{:#x}", *minIt, *maxIt);
return 0;
}
Output
$ ./src/algorithms/build/std-minmax
Address range: 0x10000000-0x10003000
std::clamp
Use case
Constrain value to a range.
Explanation
Returns lo if v < lo, hi if v > hi, otherwise v. Useful for bounds checking instruction operands like branch offsets.
Time complexity: O(1). Just two comparisons. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
constexpr int32_t operator""_MB(unsigned long long mb) {
return mb * 1024 * 1024;
}
int main() {
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/BL--Branch-with-link-?lang=en
constexpr int32_t BL_MAX = 128_MB;
constexpr int32_t BL_MIN = -128_MB;
int32_t requested = 200_MB;
int32_t clamped = std::clamp(requested, BL_MIN, BL_MAX);
std::println("Clamped: {:} MB", clamped / 1024 / 1024);
return 0;
}
Output
$ ./src/algorithms/build/std-clamp
Clamped: 128 MB
std::partition
Use case
Reorder elements so those satisfying predicate come first.
Explanation
Not stable. Use stable_partition to preserve relative order. Returns iterator to first element not satisfying the predicate.
Time complexity: O(n). Single pass. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <string>
#include <vector>
struct Section {
std::string name;
uint32_t flags;
};
// https://developer.apple.com/library/archive/documentation/DeveloperTools/Reference/Assembler/040-Assembler_Directives/asm_directives.html#//apple_ref/doc/uid/TP30000823-SW1
// https://llvm.org/doxygen/namespacellvm_1_1MachO.html
constexpr uint32_t S_ATTR_PURE_INSTRUCTIONS = 0x80000000;
constexpr uint32_t S_ATTR_SOME_INSTRUCTIONS = 0x00000400;
bool isExecutable(const Section &sec) {
return (sec.flags & S_ATTR_PURE_INSTRUCTIONS) ||
(sec.flags & S_ATTR_SOME_INSTRUCTIONS);
}
int main() {
std::vector<Section> sections = {{"__text", S_ATTR_PURE_INSTRUCTIONS},
{"__cstring", 0},
{"__const", 0},
{"__data", 0}};
auto partPoint =
std::partition(sections.begin(), sections.end(), isExecutable);
std::println("Executable:");
for (auto it = sections.begin(); it != partPoint; ++it) {
std::print("{} ", it->name);
}
std::println("");
std::println("Non-executable:");
for (auto it = partPoint; it != sections.end(); ++it) {
std::print("{} ", it->name);
}
std::println("");
return 0;
}
Output
$ ./src/algorithms/build/std-partition
Executable:
__text
Non-executable:
__cstring __const __data
std::set_intersection/std::set_union/std::set_difference
Use case
Perform set operations on sorted ranges.
Explanation
Input ranges must be sorted. Output is sorted. Single pass through both ranges. Useful for comparing symbol tables or finding common/unique symbols.
Time complexity: O(n + m) where n and m are the sizes of the two input ranges. See possible implementations: std::set_intersection, std::set_union, std::set_difference.
Code
#include <algorithm>
#include <iterator>
#include <print>
#include <string>
#include <vector>
int main() {
std::vector<std::string> binaryA = {"_main", "_helper", "_cleanup",
"_old_func"};
std::vector<std::string> binaryB = {"_main", "_helper", "_cleanup",
"_new_func"};
std::vector<std::string> common, onlyA, onlyB;
std::set_intersection(binaryA.begin(), binaryA.end(), binaryB.begin(),
binaryB.end(), std::back_inserter(common));
std::set_difference(binaryA.begin(), binaryA.end(), binaryB.begin(),
binaryB.end(), std::back_inserter(onlyA));
std::set_difference(binaryB.begin(), binaryB.end(), binaryA.begin(),
binaryA.end(), std::back_inserter(onlyB));
std::println("Common:");
for (const auto &sym : common) {
std::print("{} ", sym);
}
std::println("");
std::println("Only A:");
for (const auto &sym : onlyA) {
std::print("{} ", sym);
}
std::println("");
std::println("Only B:");
for (const auto &sym : onlyB) {
std::print("{} ", sym);
}
std::println("");
return 0;
}
Output
$ ./src/algorithms/build/std-set_intersection-set_union-set_difference
Common:
_main _helper _cleanup
Only A:
_old_func
Only B:
_new_func
std::merge
Use case
Merge two sorted ranges into one sorted output.
Explanation
Both inputs must be sorted. Useful for combining symbol tables.
Time complexity: O(n + m). Single pass through both ranges. See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <string>
#include <vector>
struct Symbol {
uint64_t address;
std::string name;
};
int main() {
std::vector<Symbol> obj1 = {{0x1000, "_init"}, {0x1100, "_helper"}};
std::vector<Symbol> obj2 = {{0x1200, "_main"}, {0x1300, "_cleanup"}};
std::vector<Symbol> merged(obj1.size() + obj2.size());
std::merge(
obj1.begin(), obj1.end(), obj2.begin(), obj2.end(), merged.begin(),
[](const Symbol &a, const Symbol &b) { return a.address < b.address; });
for (const auto &sym : merged) {
std::println("{:#x} {}", sym.address, sym.name);
}
return 0;
}
Output
$ ./src/algorithms/build/std-merge
0x1000 _init
0x1100 _helper
0x1200 _main
0x1300 _cleanup
std::generator
std::rotate
Use case
Rotate elements so that a given element becomes the first.
Explanation
Left rotation: elements before middle move to end. Use rotate_copy for non-modifying version. Returns iterator to original first element’s new position.
Time complexity: O(n). Each element is moved once. See possible implementation.
Code
#include <algorithm>
#include <print>
#include <string>
#include <vector>
int main() {
// https://100daysofredteam.medium.com/mach-o-file-format-for-red-team-professionals-fb89ea6c311e
std::vector<std::string> loadCommands = {"LC_SEGMENT_64 (__TEXT)",
"LC_SEGMENT_64 (__DATA)", "LC_MAIN",
"LC_SEGMENT_64 (__LINKEDIT)"};
auto mainIt = std::find(loadCommands.begin(), loadCommands.end(), "LC_MAIN");
if (mainIt != loadCommands.end()) {
std::rotate(loadCommands.begin(), mainIt, loadCommands.end());
}
std::println("Rotated load commands:");
for (const auto &lc : loadCommands) {
std::println("{}", lc);
}
return 0;
}
Output
$ ./src/algorithms/build/std-rotate
Rotated load commands:
LC_MAIN
LC_SEGMENT_64 (__LINKEDIT)
LC_SEGMENT_64 (__TEXT)
LC_SEGMENT_64 (__DATA)
std::shuffle
Use case
Randomly reorder elements using a random number generator.
Explanation
Requires a uniform random generator (like std::mt19937). Useful for obfuscation and fuzzing.
Time complexity: O(n) (exactly n - 1 swaps). See possible implementation.
Code
#include <algorithm>
#include <cstdint>
#include <print>
#include <random>
#include <vector>
int main() {
std::vector<uint32_t> instructions = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
// sub sp, sp, #32 // encoding: [0xff,0x83,0x00,0xd1]
0xd10083ff,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f};
// Fix seed.
std::mt19937 rng{0xFF};
std::shuffle(instructions.begin(), instructions.end(), rng);
for (const auto &insn : instructions) {
std::println("{:#x}", insn);
}
return 0;
}
Output
$ ./src/algorithms/build/std-shuffle
0xd10083ff
0xd503201f
0x94000010
std::next_permutation
Use case
Generate next permutation in dictionary order.
Explanation
“Lexicographically greater” means dictionary order (how words are sorted alphabetically). The permutations are generated in sorted order. Returns false when wrapper back to the smallest permutation (sorted order).
Time complexity: O(n) per call but n! total permutations exist. See possible implementation.
Code
#include <algorithm>
#include <print>
#include <string>
#include <vector>
int main() {
std::vector<std::string> regs = {"x0", "x1", "x2"};
std::println("Possible register orderings:");
std::sort(regs.begin(), regs.end());
do {
std::println("{}, {}, {}", regs[0], regs[1], regs[2]);
} while (std::next_permutation(regs.begin(), regs.end()));
return 0;
}
Output
$ ./src/algorithms/build/std-next_permutation
x0, x1, x2
x0, x2, x1
x1, x0, x2
x1, x2, x0
x2, x0, x1
x2, x1, x0
std::array
Explanation
Fixed-size array. Size must be known at compile-time.
See std::array.
Memory: elements stored inline (stack if local variable, heap if newed).
Time complexity:
| Operation | Complexity |
|---|---|
[], at() | O(1) |
front(), back() | O(1) |
size(),empty() | O(1) |
Use case
- Size known at compile-time.
- Small collections.
- Embedded systems avoiding heap allocation.
Code
#include <array>
#include <print>
int main() {
std::array<uint32_t, 3> opcodes = {
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f,
// $ echo "mov x0, #1" | llvm-mc -triple=aarch64 -show-encoding
// mov x0, #1 // encoding: [0x20,0x00,0x80,0xd2]
0xd2800020,
// $ echo "bl #0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010};
std::println("Size: {}", opcodes.size());
std::println("First: {:#x}", opcodes.front());
std::println("Last: {:#x}", opcodes.back());
std::println("At 2: {:#x}", opcodes.at(2));
return 0;
}
Output
$ ./src/data-structures/build/std-array
Size: 3
First: 0xd503201f
Last: 0x94000010
At 2: 0x94000010
std::vector
Explanation
Dynamic array. The default container, use this unless we have a specific reason not to.
See std::vector.
Memory: control block on stack, elements in heap (contiguous).
Time complexity:
| Operation | Complexity |
|---|---|
[], at() | O(1) |
front(), back() | O(1) |
size(),empty() | O(1) |
push_back(), pop_back() | O(1) amortized |
insert()/erase() | O(n) (must shift elements) |
reserve()/resize() | O(n) |
Amortized means it is not truly constant. When the array is full, it needs to allocate a new, larger block of memory and copy everything over. Some implementations might also shrink the array, therefore pop_back() is marked as amortized as well.
Use case
- Default choice for sequences.
- Size not known at compile time.
- Need fast random access.
Code
#include <print>
#include <vector>
int main() {
std::vector<uint32_t> instructions;
// Reserve to avoid reallocations.
instructions.reserve(100);
// $ echo "bl #0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
instructions.push_back(0x94000010);
instructions.push_back(0x94000010);
instructions.push_back(0x94000010);
std::println("Size: {}, capacity: {}", instructions.size(),
instructions.capacity());
return 0;
}
Output
$ ./src/data-structures/build/std-vector
Size: 3, capacity: 100
std::deque
Explanation
Double-ended queue. Fast insertion/removal at both front and back.
See std::deque.
Memory: array of pointers to fixed-size chunks (all on heap).
Time complexity:
| Operation | Complexity |
|---|---|
[], at() | O(1) |
front(), back() | O(1) |
size(),empty() | O(1) |
push_back(), pop_back() | O(1) |
push_front(), pop_front() | O(1) |
insert()/erase() | O(n) |
Use case
- Need fast
push_front/pop_front. - Need stable references (
push/insertdo not invalidate). - Do not need contiguous memory.
Code
#include <deque>
#include <print>
int main() {
std::deque<uint64_t> addresses;
addresses.push_back(0x10001000);
addresses.push_front(0x10002000);
std::println("Front: {:#x}", addresses.front());
std::println("Back: {:#x}", addresses.back());
return 0;
}
Output
$ ./src/data-structures/build/std-deque
Front: 0x10002000
Back: 0x10001000
std::list
Explanation
Doubly-linked list. Each element in its own heap allocation.
See std::list.
Memory: each node contains element + two pointers (prev/next). High memory overhead.
Time complexity:
| Operation | Complexity |
|---|---|
front(), back() | O(1) |
size(),empty() | O(1) |
push_back(), pop_back() | O(1) |
insert()/erase() | O(1) (with iterator) |
splice() | O(1) |
sort() | O(n log n) |
“With iterator” means we already have an iterator to the position where we want to insert/erase.
If sort() uses merge sort (like libstdc++ does), then the time complexity is O(n log n).
Use case
- Frequent insertion/removal in the middle (with iterator).
- Need stable iterator after modification.
std::listis rather rarely used.std::vectoris usually faster due to cache locality.
Code
#include <list>
#include <print>
int main() {
std::list<uint64_t> breakpoints = {0x1000, 0x2000, 0x3000};
auto it = std::next(breakpoints.begin());
breakpoints.insert(it, 0x1500);
for (const auto &br : breakpoints) {
std::println("{:#x}", br);
}
return 0;
}
Output
$ ./src/data-structures/build/std-list
0x1000
0x1500
0x2000
0x3000
std::forward_list
Explanation
Singly-linked list. Minimal memory overhead compared to std::list.
See std::forward_list.
Memory: each node contains element + one pointer (next).
Time complexity:
| Operation | Complexity |
|---|---|
front() | O(1) |
push_front(), pop_front() | O(1) |
insert_after()/erase_after() | O(1) (with iterator) |
“With iterator” means we already have an iterator to the position where we want to insert/erase.
Use case
- Rarely used. Only if memory is extremely tight and we only traverse forward.
std::map
Explanation
Sorted key-value pairs. Implemented as red-black tree (self-balancing BST).
See std::map.
Memory: each entry in its own tree node (key + value + 3 pointers + color bit). The 3 pointers are: left child pointer, right child pointer, parent pointer. High overhead.
Time complexity:
| Operation | Complexity |
|---|---|
find(), count(), contains() | O(log n) |
[], at | O(log n) |
insert()/emplace() | O(log n) |
erase() | O(log n) (by key) |
erase() | O(1) (by iterator, amortized) |
“By iterator” means we already have an iterator to the position where we want to erase.
Amortized means it is not truly constant. Rebalancing might be triggered.
Use case
- Need sorted iteration by key.
- Moderate number of elements.
Code
#include <map>
#include <print>
#include <string>
int main() {
std::map<uint64_t, std::string> symbols;
symbols.emplace(0x10001000, "_main");
symbols.emplace(0x10003000, "_init");
symbols.emplace(0x10002000, "_helper");
std::println("Symbols (sorted by address):");
for (const auto &[addr, name] : symbols) {
std::println("{:#x}: {}", addr, name);
}
return 0;
}
Output
$ ./src/data-structures/build/std-map
Symbols (sorted by address):
0x10001000: _main
0x10002000: _helper
0x10003000: _init
std::unordered_map
Explanation
Hash table.
See std::unordered_map.
Memory: array of buckets (each bucket is a linked list of entries). Can use more memory than std::map due to bucket overhead.
Time complexity:
| Operation | Complexity |
|---|---|
find(), count(), contains() | O(1) |
[], at | O(1) |
insert()/emplace() | O(1) |
erase() | O(1) (by key) |
Worst case is O(n) for every operation. It occurs with many hash collisions (all keys hash to same bucket).
Use case
- Fast lookup is priority.
- Do not need sorted iteration.
- Large number of elements.
std::unordered_mapcan be slower thanstd::mapfor small sizes due to hash function overhead.
Code
#include <print>
#include <string>
#include <unordered_map>
int main() {
std::unordered_map<std::string, uint64_t> symbolLookup;
symbolLookup.emplace("_main", 0x10001000);
symbolLookup.emplace("_helper", 0x10002000);
symbolLookup.emplace("_init", 0x10003000);
std::println("_main is at: {:#x}", symbolLookup["_main"]);
return 0;
}
Output
$ ./src/data-structures/build/std-unordered_map
_main is at: 0x10001000
std::flat_map
Explanation
Sorted key-value pairs stored in contiguous memory (two vectors internally).
See std::flat_map.
Memory: two std::vectors, one for keys, one for value. Cache-friendly.
Time complexity:
| Operation | Complexity |
|---|---|
find(), count(), contains() | O(log n) |
[], at | O(log n) |
insert()/emplace() | O(n) (must shift elements) |
erase() | O(n) (must shift elements) |
Use case
- Read-heavy workloads.
- Need sorted iteration with better cache performance than
std::map.
Code
// std::flat_map is not fully implemented everywhere yet.
// https://en.cppreference.com/w/cpp/compiler_support/23.html
#ifdef __APPLE__
#include <flat_map>
#include <print>
#include <string>
int main() {
std::flat_map<uint64_t, std::string> symbols;
symbols.emplace(0x10001000, "_main");
symbols.emplace(0x10003000, "_init");
symbols.emplace(0x10002000, "_helper");
for (const auto &[addr, name] : symbols) {
std::println("{:#x}: {}", addr, name);
}
return 0;
}
#else
int main() { return 0; }
#endif
Output
$ ./src/data-structures/build/std-flat_map
0x10001000: _main
0x10002000: _helper
0x10003000: _init
std::set/std::unordered_set
Explanation
Like map/unordered_map but stores only keys.
See std::set and std::unordered_set.
Time complexity:
| Operation | std::set | std::unordered_set |
|---|---|---|
find(), count(), contains() | O(log n) | O(1) avg, O(n) worst |
insert()/emplace() | O(log n) | O(1) avg, O(n) worst |
erase() | O(log n) | O(1) avg, O(n) worst |
Use case
- Deduplication.
Code
#include <print>
#include <set>
#include <unordered_set>
int main() {
std::set<uint64_t> visitedAddrs;
visitedAddrs.insert(0x1000);
visitedAddrs.insert(0x1100);
visitedAddrs.insert(0x1000);
std::println("Visited {} unique addresses.", visitedAddrs.size());
return 0;
}
Output
$ ./src/data-structures/build/std-set-std-unordered_set
Visited 2 unique addresses.
std::multimap/std::multiset
Explanation
Like map/set but allows duplicate keys.
See std::multimap and std::multiset.
Time complexity: same as std::map/std::set.
| Operation | Complexity |
|---|---|
find(), count(), contains() | O(log n) |
insert()/emplace() | O(log n) |
erase() by key | O(log n + k) where k = num of matching keys |
Use case
- Multiple values per key, grouping related items.
Code
#include <map>
#include <print>
#include <string>
int main() {
// Multiple symbols can have the same address (aliases)-
std::multimap<uint64_t, std::string> symbols;
symbols.emplace(0x10001000, "_main");
symbols.emplace(0x10001000, "main");
symbols.emplace(0x10002000, "_helper");
auto range = symbols.equal_range(0x10001000);
std::println("Symbols at 0x10001000:");
for (auto it = range.first; it != range.second; ++it) {
// it->second is the value (std::string name)
std::println("{}", it->second);
}
return 0;
}
Output
$ ./src/data-structures/build/std-multimap-std-multiset
Symbols at 0x10001000:
_main
main
std::pair
Explanation
Holds exactly two values of potentially different types.
See std::pair.
Memory: inline (stack if local). Size = sizeof(T1) + sizeof(T2) + padding.
Code
#include <print>
#include <utility>
int main() {
std::pair<uint64_t, std::string> symbol{0x10001000, "_main"};
auto [addr, name] = symbol;
std::println("{} @ {:#x}", name, addr);
return 0;
}
Output
$ ./src/data-structures/build/std-pair
_main @ 0x10001000
std::tuple
Explanation
Holds any number of values of potentially different types.
See std::tuple.
Memory: inline. Size = sum of all element sizes + padding.
Code
#include <print>
#include <string>
#include <tuple>
int main() {
std::tuple<uint64_t, std::string, size_t> symbol{0x10001000, "_main", 0x100};
auto [addr, name, size] = symbol;
std::println("{} @ {:#x} (size: {:#x})", name, addr, size);
return 0;
}
Output
$ ./src/data-structures/build/std-tuple
_main @ 0x10001000 (size: 0x100)
std::string
Explanation
Dynamic character array. Similar to std::vector<char> but with string-specific operations.
See std::string.
Memory: usually heap-allocated, but small strings might be stored inline (SSO).
Time complexity:
| Operation | Complexity |
|---|---|
[], at | O(1) |
length()/size()/empty() | O(1) |
append() | O(k) where k = appended length |
substr() | O(k) where k = substring length |
compare() | O(n) |
Code
#include <print>
#include <string>
int main() {
std::string symbolName = "_main";
symbolName.append("_v2");
std::println("Contains 'main': {}", symbolName.contains("main"));
return 0;
}
Output
$ ./src/data-structures/build/std-string
Contains 'main': true
std::span
Explanation
Non-owning view over contiguous memory. Does not allocate.
See std::span.
Time complexity:
| Operation | Complexity |
|---|---|
[], front(), back() | O(1) |
size()/empty() | O(1) |
data() | O(1) |
Use case
- Function parameters that accept any contiguous container (vector, array, C array).
Code
#include <array>
#include <print>
#include <span>
#include <vector>
void processInstruction(std::span<const uint32_t> instructions) {
for (auto insn : instructions) {
std::println("{:#x}", insn);
}
}
int main() {
std::vector<uint32_t> vec = {
// $ echo "bl 0x40" | llvm-mc -triple=aarch64 -show-encoding
// bl #64 // encoding: [0x10,0x00,0x00,0x94]
0x94000010,
// $ echo "sub sp, sp, 0x20" | llvm-mc -triple=aarch64 -show-encoding
// sub sp, sp, #32 // encoding: [0xff,0x83,0x00,0xd1]
0xd10083ff,
// $ echo "nop" | llvm-mc -triple=aarch64 -show-encoding
// nop // encoding: [0x1f,0x20,0x03,0xd5]
0xd503201f};
std::array<uint32_t, 2> arr = {0x94000010, 0xd10083ff};
uint32_t cArr[] = {0x94000010, 0xd10083ff};
processInstruction(vec);
processInstruction(arr);
processInstruction(cArr);
return 0;
}
Output
$ ./src/data-structures/build/std-span
0x94000010
0xd10083ff
0xd503201f
0x94000010
0xd10083ff
0x94000010
0xd10083ff
std::string_view
Explanation
Non-owning view over character data. Avoids copying strings.
See std::string_view.
Time complexity:
| Operation | Complexity |
|---|---|
[], front(), back() | O(1) |
size()/empty() | O(1) |
compare() | O(n) |
Code
#include <print>
#include <string_view>
void printSymbol(std::string_view name) { std::println("Symbol: {}", name); }
int main() {
std::string str = "_main";
const char *cstr = "_helper";
printSymbol(str);
printSymbol(cstr);
return 0;
}
Output
$ ./src/data-structures/build/std-string_view
Symbol: _main
Symbol: _helper
std::stack
Explanation
LIFO (Last In, First Out). Uses std::deque internally.
See std::stack.
Time complexity:
| Operation | Complexity |
|---|---|
push() | O(1) |
pop() | O(1) |
empty(), size() | O(1) |
Code
#include <print>
#include <stack>
int main() {
std::stack<uint64_t> callStack;
callStack.push(0x1000);
callStack.push(0x2000);
callStack.push(0x3000);
while (!callStack.empty()) {
std::println("Return to: {:#x}", callStack.top());
callStack.pop();
}
return 0;
}
Output
$ ./src/data-structures/build/std-stack
Return to: 0x3000
Return to: 0x2000
Return to: 0x1000
std::queue
Explanation
FIFO (First In, First Out). Uses std::deque internally.
See std::queue.
Time complexity:
| Operation | Complexity |
|---|---|
push() | O(1) |
pop() | O(1) |
empty(), size() | O(1) |
Code
#include <print>
#include <queue>
int main() {
std::queue<uint64_t> workQueue;
workQueue.push(0x1000);
workQueue.push(0x2000);
workQueue.push(0x3000);
while (!workQueue.empty()) {
std::println("Process: {:#x}", workQueue.front());
workQueue.pop();
}
return 0;
}
Output
$ ./src/data-structures/build/std-queue
Process: 0x1000
Process: 0x2000
Process: 0x3000
std::priority_queue
Explanation
Largest element always on top. Uses std::vector internally.
See std::priority_queue.
Time complexity:
| Operation | Complexity |
|---|---|
push() | O(log n) |
pop() | O(log n) |
empty(), size() | O(1) |
std::priority_queue is implemented as a binary heap.
Code
#include <print>
#include <queue>
int main() {
std::priority_queue<int, std::vector<int>, std::less<int>> maxHeap;
maxHeap.push(0x2000);
maxHeap.push(0x3000);
maxHeap.push(0x1000);
while (!maxHeap.empty()) {
std::println("Process: {:#x}", maxHeap.top());
maxHeap.pop();
}
return 0;
}
Output
$ ./src/data-structures/build/std-priority_queue
Process: 0x3000
Process: 0x2000
Process: 0x1000
Visitor pattern
Use case
Process different instruction types in a disassembler without modifying each instruction class.
Explanation
The visitor pattern separates algorithms from the objects they operate on. Modern C++ uses std::variant + std::visit.
Code
#include <cstdint>
#include <print>
#include <string>
#include <variant>
#include <vector>
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/MOV--register---Move-register-value--an-alias-of-ORR--shifted-register--?lang=en
struct MovInstr {
std::string dest;
std::string src;
};
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/BL--Branch-with-link-?lang=en
struct BlInstr {
uint64_t target;
std::string name;
};
// https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/RET--Return-from-subroutine-?lang=en
struct RetInstr {};
using Instruction = std::variant<MovInstr, BlInstr, RetInstr>;
struct DisasmVisitor {
void operator()(const MovInstr &m) const {
std::println("mov {}, {}", m.dest, m.src);
}
void operator()(const BlInstr &b) const {
std::println("bl {:#x} ; {}", b.target, b.name);
}
void operator()(const RetInstr &) const { std::println("ret"); }
};
int main() {
std::vector<Instruction> code = {MovInstr{"x0", "x1"},
BlInstr{0x10001000, "_printf"}, RetInstr{}};
std::println("Disassembly:");
for (const auto &instr : code) {
std::visit(DisasmVisitor{}, instr);
}
return 0;
}
Output
$ ./src/design-patterns-and-idioms/build/visitor-pattern
Disassembly:
mov x0, x1
bl 0x10001000 ; _printf
ret
Factory pattern
Use case
Create architecture-specific disassembler without hardcoding which class to instantiate.
Explanation
Factory encapsulates object creation logic. Client code works with abstract interface while concrete types are selected at runtime. Modern C++ uses std::unique_ptr for ownership.
Code
#include <cstdint>
#include <memory>
#include <print>
#include <string>
class Disassembler {
public:
virtual ~Disassembler() = default;
virtual std::string name() const = 0;
};
class X64Disassembler : public Disassembler {
public:
std::string name() const override { return "x86-64"; }
};
class ARM64Disassembler : public Disassembler {
public:
std::string name() const override { return "arm64"; }
};
std::unique_ptr<Disassembler> create_disassembler(const std::string &arch) {
if (arch == "x86-64") {
return std::make_unique<X64Disassembler>();
}
if (arch == "arm64") {
return std::make_unique<ARM64Disassembler>();
}
return nullptr;
}
int main() {
for (auto arch : {"x86-64", "arm64", "mips"}) {
if (auto disasm = create_disassembler(arch)) {
std::println("{}", disasm->name());
} else {
std::println("Not supported: {}", arch);
}
}
return 0;
}
Output
$ ./src/design-patterns-and-idioms/build/factory-pattern
x86-64
arm64
Not supported: mips
CRTP (Curiously Recurring Template Pattern)
Use case
Implement common functionality for memory region types without virtual function overhead.
Explanation
CRTP provides static polymorphism: a base class template takes the derived class as a parameter, enabling compile-time method dispatch (via static_cast). Zero runtime overhead compared to virtual functions (at the cost of increased binary size).
Code
#include <cstdint>
#include <print>
template <typename Derived> class MemoryRegion {
public:
void dump() const {
// Dereference this, then cast the base reference to derived reference.
const auto &self = static_cast<const Derived &>(*this);
std::println("{}: {:#x}", self.name(), self.start());
}
};
class CodeSection : public MemoryRegion<CodeSection> {
uint64_t base_;
public:
CodeSection(uint64_t base) : base_(base) {}
uint64_t start() const { return base_; }
const char *name() const { return ".text"; }
};
class DataSection : public MemoryRegion<DataSection> {
uint64_t base_;
public:
DataSection(uint64_t base) : base_(base) {}
uint64_t start() const { return base_; }
const char *name() const { return ".data"; }
};
int main() {
CodeSection text{0x10001000};
DataSection data{0x10002000};
text.dump();
data.dump();
return 0;
}
Output
$ ./src/design-patterns-and-idioms/build/crtp
.text: 0x10001000
.data: 0x10002000
RAII (Resource Acquisition Is Initialization)
Use case
Automatically release a lock on shared disassembler cache when leaving scope.
Explanation
RAII ties resource lifetime to object lifetime. Constructor acquires, destructor releases. Guarantees cleanup even on exceptions.
Code
#include <cstdint>
#include <mutex>
#include <print>
std::mutex g_cache_mutex;
class CacheLock {
public:
CacheLock(std::mutex &m) : mutex_(m) {
mutex_.lock();
std::println("Cache locked.");
}
~CacheLock() {
mutex_.unlock();
std::println("Cache unlocked.");
}
private:
std::mutex &mutex_;
};
void disassemble(uint64_t addr) {
CacheLock lock(g_cache_mutex);
std::println("Disassembling {:#x}...", addr);
}
int main() {
disassemble(0x10001000);
disassemble(0x10002000);
return 0;
}
Output
$ ./src/design-patterns-and-idioms/build/raii
Cache locked.
Disassembling 0x10001000...
Cache unlocked.
Cache locked.
Disassembling 0x10002000...
Cache unlocked.
Singleton
Use case
Global symbol table that must be initialized once and accessed from multiple analysis passes.
Explanation
Meyer’s singleton uses a function-local static variable.
Code
#include <cstdint>
#include <print>
#include <string>
#include <unordered_map>
class SymbolTable {
public:
static SymbolTable &instance() {
static SymbolTable table;
return table;
}
void add(uint64_t addr, const std::string &name) {
symbols_.emplace(addr, name);
}
std::string lookup(uint64_t addr) const {
auto it = symbols_.find(addr);
if (it != symbols_.end()) {
return it->second;
} else {
return std::format("sub_{:#x}", addr);
}
}
private:
SymbolTable() { std::println("Symbol table created."); }
std::unordered_map<uint64_t, std::string> symbols_;
};
int main() {
SymbolTable::instance().add(0x10001000, "_main");
SymbolTable::instance().add(0x10002000, "_decrypt");
std::println("call {}", SymbolTable::instance().lookup(0x10001000));
std::println("call {}", SymbolTable::instance().lookup(0x10002000));
std::println("call {}", SymbolTable::instance().lookup(0x10003000));
return 0;
}
Output
$ ./src/design-patterns-and-idioms/build/singleton
Symbol table created.
call _main
call _decrypt
call sub_0x10003000