Memory Optimization
1. Cache-friendly Algorithm Design
| Principle | Cache-unfriendly | Cache-friendly |
|---|---|---|
| Spatial Locality | Random memory access | Sequential access, contiguous data |
| Temporal Locality | Use data once and discard | Reuse recently accessed data |
| Array Traversal | Column-major in row-major array | Row-major traversal matches layout |
| Data Structure | Linked list (pointer chasing) | Array/Vector (contiguous memory) |
| Loop Order | Wrong loop nesting order | Correct order for memory layout |
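To make the pointer-chasing row concrete, here is a minimal sketch (the names sumArray, sumList, and data are illustrative): summing the same values stored contiguously versus behind per-node objects. The array walk touches memory sequentially; the list walk performs one dependent pointer load per element, which defeats the hardware prefetcher.

```javascript
// Summing contiguous values: sequential access, prefetcher-friendly
function sumArray(values) {
  let sum = 0;
  for (let i = 0; i < values.length; i++) sum += values[i];
  return sum;
}

// Summing a linked list: one dependent pointer load per element
function sumList(head) {
  let sum = 0;
  for (let node = head; node !== null; node = node.next) sum += node.value;
  return sum;
}

// Build both layouts from the same data
const data = Float64Array.from({length: 1 << 16}, (_, i) => i);
let head = null;
for (let i = data.length - 1; i >= 0; i--) {
  head = {value: data[i], next: head};
}
```

Both functions compute the same sum; timing them on large inputs is what reveals the layout difference.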
Example: Matrix Multiplication Cache Optimization
// Cache-unfriendly: standard ijk order
function matrixMultiplyNaive(A, B) {
const n = A.length;
const C = Array(n).fill(0).map(() => Array(n).fill(0));
for (let i = 0; i < n; i++) {
for (let j = 0; j < n; j++) {
for (let k = 0; k < n; k++) {
C[i][j] += A[i][k] * B[k][j];
// B[k][j] has poor spatial locality (column access)
}
}
}
return C;
}
// Cache-friendly: ikj order improves locality
function matrixMultiplyOptimized(A, B) {
const n = A.length;
const C = Array(n).fill(0).map(() => Array(n).fill(0));
for (let i = 0; i < n; i++) {
for (let k = 0; k < n; k++) {
const r = A[i][k];
for (let j = 0; j < n; j++) {
C[i][j] += r * B[k][j];
// B[k][j] accessed sequentially (row access)
}
}
}
return C;
}
// Blocked/Tiled matrix multiplication for L1 cache
function matrixMultiplyBlocked(A, B, blockSize = 32) {
const n = A.length;
const C = Array(n).fill(0).map(() => Array(n).fill(0));
for (let ii = 0; ii < n; ii += blockSize) {
for (let jj = 0; jj < n; jj += blockSize) {
for (let kk = 0; kk < n; kk += blockSize) {
// Process block
for (let i = ii; i < Math.min(ii + blockSize, n); i++) {
for (let k = kk; k < Math.min(kk + blockSize, n); k++) {
const r = A[i][k];
for (let j = jj; j < Math.min(jj + blockSize, n); j++) {
C[i][j] += r * B[k][j];
}
}
}
}
}
}
return C;
}
// Array of Structures vs Structure of Arrays
class Point2D_AoS {
constructor() {
this.points = []; // [{x, y}, {x, y}, ...]
}
addPoint(x, y) {
this.points.push({x, y});
}
// Poor cache locality when processing only x or y
sumX() {
let sum = 0;
for (const p of this.points) {
sum += p.x; // Loads both x and y into cache
}
return sum;
}
}
class Point2D_SoA {
constructor() {
this.x = []; // All x coordinates
this.y = []; // All y coordinates
}
addPoint(x, y) {
this.x.push(x);
this.y.push(y);
}
// Better cache locality for SIMD and sequential access
sumX() {
let sum = 0;
for (const val of this.x) {
sum += val; // Only loads x array into cache
}
return sum;
}
}
// Benchmark
const n = 1000;
const aos = new Point2D_AoS();
const soa = new Point2D_SoA();
for (let i = 0; i < n; i++) {
aos.addPoint(i, i * 2);
soa.addPoint(i, i * 2);
}
console.time("AoS sumX");
aos.sumX();
console.timeEnd("AoS sumX");
console.time("SoA sumX");
soa.sumX();
console.timeEnd("SoA sumX");
// Cache lines: typically 64 bytes
// Prefetching: hardware prefetchers detect sequential access and load upcoming cache lines
// False sharing: avoid in multi-threaded code
2. Space Optimization Techniques
| Technique | Description | Space Saving |
|---|---|---|
| Rolling Array | Keep only k previous rows in DP | O(n×m) → O(k×m) |
| State Compression | Pack multiple booleans into bits | 8× reduction for boolean arrays |
| Coordinate Compression | Map large sparse values to small range | O(max_value) → O(n) |
| Implicit Trees | Store heap in array without pointers | No pointer overhead |
| Lazy Evaluation | Compute values on demand | Avoid storing intermediate results |
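The lazy-evaluation row can be sketched as a value that is computed on first access and cached afterwards, so no intermediate result is stored up front (LazyValue is an illustrative name, not from the sections below):

```javascript
// Lazy evaluation sketch: defer an expensive computation until first use,
// then cache the result and release the closure.
class LazyValue {
  constructor(compute) {
    this.compute = compute;
    this.computed = false;
    this.value = undefined;
  }
  get() {
    if (!this.computed) {
      this.value = this.compute();
      this.computed = true;
      this.compute = null; // drop the closure so it can be collected
    }
    return this.value;
  }
}
```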
Example: Space Optimization in DP
// 2D DP: O(n×m) space
function longestCommonSubsequence2D(s1, s2) {
const m = s1.length, n = s2.length;
const dp = Array(m + 1).fill(0).map(() => Array(n + 1).fill(0));
for (let i = 1; i <= m; i++) {
for (let j = 1; j <= n; j++) {
if (s1[i - 1] === s2[j - 1]) {
dp[i][j] = dp[i - 1][j - 1] + 1;
} else {
dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
}
}
}
return dp[m][n];
}
// 1D DP: O(n) space with rolling array
function longestCommonSubsequence1D(s1, s2) {
const m = s1.length, n = s2.length;
let prev = Array(n + 1).fill(0);
let curr = Array(n + 1).fill(0);
for (let i = 1; i <= m; i++) {
for (let j = 1; j <= n; j++) {
if (s1[i - 1] === s2[j - 1]) {
curr[j] = prev[j - 1] + 1;
} else {
curr[j] = Math.max(prev[j], curr[j - 1]);
}
}
[prev, curr] = [curr, prev];
}
return prev[n];
}
// Bit packing: store 8 booleans in 1 byte
class BitArray {
constructor(size) {
this.size = size;
this.bytes = new Uint8Array(Math.ceil(size / 8));
}
set(index, value) {
const byteIndex = Math.floor(index / 8);
const bitIndex = index % 8;
if (value) {
this.bytes[byteIndex] |= (1 << bitIndex);
} else {
this.bytes[byteIndex] &= ~(1 << bitIndex);
}
}
get(index) {
const byteIndex = Math.floor(index / 8);
const bitIndex = index % 8;
return (this.bytes[byteIndex] & (1 << bitIndex)) !== 0;
}
}
// Coordinate compression
function compressCoordinates(arr) {
const sorted = [...new Set(arr)].sort((a, b) => a - b);
const map = new Map();
sorted.forEach((val, idx) => map.set(val, idx));
return arr.map(val => map.get(val));
}
const coords = [1000000, 5, 1000000, 42, 5];
console.log("Compressed:", compressCoordinates(coords));
// Output: [2, 0, 2, 1, 0] - now in range [0, 2]
// Test
const str1 = "ABCDGH";
const str2 = "AEDFHR";
console.log("\nLCS 2D space:", longestCommonSubsequence2D(str1, str2));
console.log("LCS 1D space:", longestCommonSubsequence1D(str1, str2));
const bits = new BitArray(100);
bits.set(0, true);
bits.set(50, true);
console.log("\nBit 0:", bits.get(0));
console.log("Bit 50:", bits.get(50));
console.log("Bit 25:", bits.get(25));
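The implicit-trees row from the table above can be sketched with an array-backed binary min-heap: parent and child links are pure index arithmetic (parent of i is (i-1)>>1, children are 2i+1 and 2i+2), so there is no pointer overhead at all (a minimal sketch; MinHeap is an illustrative name):

```javascript
// Implicit tree: binary min-heap stored in a flat array, no node objects
class MinHeap {
  constructor() { this.a = []; }
  push(v) {
    this.a.push(v);
    let i = this.a.length - 1;
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (this.a[parent] <= this.a[i]) break;
      [this.a[parent], this.a[i]] = [this.a[i], this.a[parent]];
      i = parent;
    }
  }
  pop() {
    const top = this.a[0];
    const last = this.a.pop();
    if (this.a.length > 0) {
      this.a[0] = last;
      let i = 0;
      while (true) {
        const l = 2 * i + 1, r = 2 * i + 2;
        let smallest = i;
        if (l < this.a.length && this.a[l] < this.a[smallest]) smallest = l;
        if (r < this.a.length && this.a[r] < this.a[smallest]) smallest = r;
        if (smallest === i) break;
        [this.a[i], this.a[smallest]] = [this.a[smallest], this.a[i]];
        i = smallest;
      }
    }
    return top;
  }
}
```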
3. In-place Algorithm Implementation
| Algorithm | Extra Space (naive) | In-place Space |
|---|---|---|
| Array Reversal | O(n) - create new array | O(1) - two pointers swap |
| Quick Sort | O(n) - copy partitions | O(log n) - recursion stack only |
| Heap Sort | O(n) - separate heap | O(1) - heapify in array |
| Array Rotation | O(n) - temporary array | O(1) - reversal algorithm |
| Dutch Flag | O(n) - counting sort | O(1) - three pointers |
Example: In-place Algorithms
// In-place array rotation
function rotateArray(arr, k) {
if (arr.length === 0) return arr; // guard: k % 0 would be NaN
k = k % arr.length;
function reverse(start, end) {
while (start < end) {
[arr[start], arr[end]] = [arr[end], arr[start]];
start++;
end--;
}
}
reverse(0, arr.length - 1);
reverse(0, k - 1);
reverse(k, arr.length - 1);
return arr;
}
// Dutch National Flag: sort 0s, 1s, 2s
function dutchFlagSort(arr) {
let low = 0, mid = 0, high = arr.length - 1;
while (mid <= high) {
if (arr[mid] === 0) {
[arr[low], arr[mid]] = [arr[mid], arr[low]];
low++;
mid++;
} else if (arr[mid] === 1) {
mid++;
} else {
[arr[mid], arr[high]] = [arr[high], arr[mid]];
high--;
}
}
return arr;
}
// In-place merge by shifting: O(1) space, but O(n²) worst-case time
function mergeTwoSortedArrays(arr, left, mid, right) {
let start2 = mid + 1;
if (arr[mid] <= arr[start2]) {
return; // Already sorted
}
while (left <= mid && start2 <= right) {
if (arr[left] <= arr[start2]) {
left++;
} else {
const value = arr[start2];
let index = start2;
// Shift elements
while (index !== left) {
arr[index] = arr[index - 1];
index--;
}
arr[left] = value;
left++;
mid++;
start2++;
}
}
}
// Remove duplicates in-place
function removeDuplicates(arr) {
if (arr.length === 0) return 0;
let writeIdx = 1;
for (let i = 1; i < arr.length; i++) {
if (arr[i] !== arr[i - 1]) {
arr[writeIdx] = arr[i];
writeIdx++;
}
}
return writeIdx; // New length
}
// Partition array (QuickSort partition)
function partition(arr, low, high) {
const pivot = arr[high];
let i = low - 1;
for (let j = low; j < high; j++) {
if (arr[j] < pivot) {
i++;
[arr[i], arr[j]] = [arr[j], arr[i]];
}
}
[arr[i + 1], arr[high]] = [arr[high], arr[i + 1]];
return i + 1;
}
// Tests
const arr1 = [1, 2, 3, 4, 5];
console.log("Rotated by 2:", rotateArray([...arr1], 2));
const arr2 = [2, 0, 2, 1, 1, 0];
console.log("Dutch flag sorted:", dutchFlagSort([...arr2]));
const arr3 = [1, 1, 2, 2, 3, 4, 4];
const newLen = removeDuplicates(arr3);
console.log("Remove duplicates:", arr3.slice(0, newLen));
// Space: O(1) for all algorithms above
// Key: modify input array, use swaps instead of copies
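The heap-sort row in the table has no example above; a minimal in-place sketch (heapify within the input array itself, O(1) auxiliary space) could look like:

```javascript
// In-place heap sort: build a max-heap inside arr, then repeatedly
// swap the max to the back and shrink the heap. No extra array needed.
function heapSort(arr) {
  const n = arr.length;
  function siftDown(i, end) {
    while (true) {
      let largest = i;
      const l = 2 * i + 1, r = 2 * i + 2;
      if (l < end && arr[l] > arr[largest]) largest = l;
      if (r < end && arr[r] > arr[largest]) largest = r;
      if (largest === i) return;
      [arr[i], arr[largest]] = [arr[largest], arr[i]];
      i = largest;
    }
  }
  for (let i = (n >> 1) - 1; i >= 0; i--) siftDown(i, n); // build max-heap
  for (let end = n - 1; end > 0; end--) {
    [arr[0], arr[end]] = [arr[end], arr[0]]; // move current max to the back
    siftDown(0, end);                        // restore the heap property
  }
  return arr;
}
```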
4. Memory Pool Allocation Strategies
| Strategy | Allocation | Deallocation |
|---|---|---|
| Object Pool | O(1) - reuse pre-allocated objects | O(1) - return to pool |
| Slab Allocator | Fixed-size blocks, O(1) | O(1) - mark as free |
| Arena/Region | Bump pointer, O(1) | Free entire arena at once |
| Free List | O(1) - pop from list | O(1) - push to list |
Shared benefits across these strategies: reduced fragmentation, better cache locality, predictable performance.
Example: Object Pool Implementation
// Object pool for frequently created/destroyed objects
class ObjectPool {
constructor(factory, reset, initialSize = 10) {
this.factory = factory;
this.reset = reset;
this.available = [];
this.inUse = new Set();
// Pre-allocate objects
for (let i = 0; i < initialSize; i++) {
this.available.push(this.factory());
}
}
acquire() {
let obj;
if (this.available.length > 0) {
obj = this.available.pop();
} else {
obj = this.factory(); // Create new if pool exhausted
}
this.inUse.add(obj);
return obj;
}
release(obj) {
if (!this.inUse.has(obj)) {
throw new Error("Object not from this pool");
}
this.inUse.delete(obj);
this.reset(obj);
this.available.push(obj);
}
size() {
return {
available: this.available.length,
inUse: this.inUse.size,
total: this.available.length + this.inUse.size
};
}
}
// Example: pool for vector objects
class Vector2D {
constructor(x = 0, y = 0) {
this.x = x;
this.y = y;
}
set(x, y) {
this.x = x;
this.y = y;
}
}
const vectorPool = new ObjectPool(
() => new Vector2D(),
(v) => v.set(0, 0),
5
);
// Usage
const v1 = vectorPool.acquire();
v1.set(10, 20);
console.log("Vector:", v1);
console.log("Pool size:", vectorPool.size());
vectorPool.release(v1);
console.log("After release:", vectorPool.size());
// Arena allocator (bump pointer)
class Arena {
constructor(size = 1024 * 1024) {
this.buffer = new ArrayBuffer(size);
this.offset = 0;
this.allocations = [];
}
allocate(bytes) {
if (this.offset + bytes > this.buffer.byteLength) {
throw new Error("Arena out of memory");
}
const ptr = this.offset;
this.offset += bytes;
this.allocations.push({ptr, bytes});
return new Uint8Array(this.buffer, ptr, bytes);
}
reset() {
this.offset = 0;
this.allocations = [];
// All allocations invalidated, but memory reused
}
usage() {
return {
used: this.offset,
total: this.buffer.byteLength,
percentage: (this.offset / this.buffer.byteLength * 100).toFixed(2) + '%'
};
}
}
const arena = new Arena(1024);
const arr1 = arena.allocate(100);
const arr2 = arena.allocate(200);
console.log("\nArena usage:", arena.usage());
arena.reset();
console.log("After reset:", arena.usage());
// Free list allocator
class FreeListAllocator {
constructor(blockSize, numBlocks) {
this.blockSize = blockSize;
this.freeList = [];
// Initialize free list
for (let i = 0; i < numBlocks; i++) {
this.freeList.push(new Uint8Array(blockSize));
}
}
allocate() {
if (this.freeList.length === 0) {
return null; // Out of memory
}
return this.freeList.pop();
}
free(block) {
block.fill(0); // Clear data
this.freeList.push(block);
}
available() {
return this.freeList.length;
}
}
const allocator = new FreeListAllocator(64, 10);
const block1 = allocator.allocate();
const block2 = allocator.allocate();
console.log("\nAvailable blocks:", allocator.available());
allocator.free(block1);
console.log("After freeing one:", allocator.available());
// Benefits:
// - Reduced GC pressure
// - Better cache locality
// - Predictable allocation times
// - Reduced memory fragmentation
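The slab-allocator row in the table has no example above; one way to sketch it (SlabAllocator is an illustrative name, and real slab allocators also track per-slab metadata) is a single buffer pre-divided into fixed-size slots with a stack of free slot indices:

```javascript
// Slab allocator sketch: one ArrayBuffer carved into fixed-size slots,
// O(1) allocate (pop a slot index) and O(1) free (push it back)
class SlabAllocator {
  constructor(slotSize, slotCount) {
    this.slotSize = slotSize;
    this.buffer = new ArrayBuffer(slotSize * slotCount);
    this.freeSlots = [];
    for (let i = slotCount - 1; i >= 0; i--) this.freeSlots.push(i);
  }
  allocate() {
    if (this.freeSlots.length === 0) return null; // slab exhausted
    const slot = this.freeSlots.pop();
    const view = new Uint8Array(this.buffer, slot * this.slotSize, this.slotSize);
    return {slot, view};
  }
  free(slot) {
    this.freeSlots.push(slot); // just mark the slot as reusable
  }
}
```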
5. Garbage Collection Aware Programming
| Technique | Problem | Solution |
|---|---|---|
| Object Reuse | Frequent allocation/deallocation | Object pooling, reuse instances |
| Avoid Temporary Objects | GC pressure from short-lived objects | Use primitives, modify in-place |
| Weak References | Memory leaks with caches | Use WeakMap/WeakSet for caches |
| Large Arrays | Full heap GC with big allocations | Typed arrays, pre-allocate capacity |
| Closure Memory | Closures retain outer scope | Nullify references when done |
Example: GC-Aware Patterns
// Bad: creates many temporary objects
function sumBad(arr) {
return arr
.map(x => x * 2) // Creates new array
.filter(x => x > 10) // Creates another array
.reduce((a, b) => a + b, 0);
}
// Good: single pass, no temporaries
function sumGood(arr) {
let sum = 0;
for (const x of arr) {
const doubled = x * 2;
if (doubled > 10) {
sum += doubled;
}
}
return sum;
}
// WeakMap for cache (auto garbage collection)
class ComputationCache {
constructor() {
this.cache = new WeakMap();
}
compute(obj, expensive) {
if (this.cache.has(obj)) {
return this.cache.get(obj);
}
const result = expensive(obj);
this.cache.set(obj, result);
return result;
}
}
// When obj is garbage collected, cache entry is auto-removed
// Pre-allocate for performance-critical code
class ParticleSystem {
constructor(maxParticles) {
this.maxParticles = maxParticles;
this.positions = new Float32Array(maxParticles * 2); // x, y pairs
this.velocities = new Float32Array(maxParticles * 2);
this.active = new Uint8Array(maxParticles);
this.count = 0;
}
addParticle(x, y, vx, vy) {
if (this.count >= this.maxParticles) {
return false;
}
const idx = this.count * 2;
this.positions[idx] = x;
this.positions[idx + 1] = y;
this.velocities[idx] = vx;
this.velocities[idx + 1] = vy;
this.active[this.count] = 1;
this.count++;
return true;
}
update(dt) {
for (let i = 0; i < this.count; i++) {
if (!this.active[i]) continue;
const idx = i * 2;
this.positions[idx] += this.velocities[idx] * dt;
this.positions[idx + 1] += this.velocities[idx + 1] * dt;
}
}
// No object creation in update loop!
}
// Avoid closure memory leaks
function createHandlerBad() {
const largeData = new Array(1000000).fill(0);
return function() {
// Referencing largeData keeps the whole array alive
// for as long as this handler is reachable
console.log("Handler called", largeData.length);
};
}
function createHandlerGood() {
// Don't capture largeData if not needed
return function() {
console.log("Handler called");
};
}
// String concatenation
function buildStringBad(arr) {
let str = "";
for (const item of arr) {
str += item + ","; // Creates new string each iteration
}
return str;
}
function buildStringGood(arr) {
return arr.join(","); // More efficient
}
// Array growth
function arrayGrowthBad() {
const arr = [];
for (let i = 0; i < 10000; i++) {
arr.push(i); // May cause multiple reallocations
}
return arr;
}
function arrayGrowthGood() {
const arr = new Array(10000); // Pre-allocate capacity (benefit is engine-dependent; measure)
for (let i = 0; i < 10000; i++) {
arr[i] = i;
}
return arr;
}
// Test
const testArr = Array.from({length: 100}, (_, i) => i);
console.time("Bad");
sumBad(testArr);
console.timeEnd("Bad");
console.time("Good");
sumGood(testArr);
console.timeEnd("Good");
const particles = new ParticleSystem(1000);
particles.addParticle(0, 0, 1, 1);
particles.update(0.016); // 60 FPS frame
console.log("\nParticle position:",
particles.positions[0], particles.positions[1]);
6. Space-Time Tradeoff Analysis
| Problem | Time-optimized (more space) | Space-optimized (more time) |
|---|---|---|
| Fibonacci | O(n) time, O(n) space - memoization | O(n) time, O(1) space - iterative |
| Two Sum | O(n) time, O(n) space - hash map | O(n²) time, O(1) space - nested loops |
| String Search | O(n+m) time, O(m) space - KMP | O(n×m) time, O(1) space - naive search |
| Range Query | O(log n) time, O(n) space - segment tree | O(n) time, O(1) space - linear scan |
| Caching | O(1) lookup, O(capacity) space | Recompute each time, O(1) space |
Example: Space-Time Tradeoffs
// Two Sum variations
function twoSumTimeFast(arr, target) {
const map = new Map();
for (let i = 0; i < arr.length; i++) {
const complement = target - arr[i];
if (map.has(complement)) {
return [map.get(complement), i];
}
map.set(arr[i], i);
}
return null;
// Time: O(n), Space: O(n)
}
function twoSumSpaceFast(arr, target) {
for (let i = 0; i < arr.length; i++) {
for (let j = i + 1; j < arr.length; j++) {
if (arr[i] + arr[j] === target) {
return [i, j];
}
}
}
return null;
// Time: O(n²), Space: O(1)
}
// Range sum query
class RangeSumPrecomputed {
constructor(arr) {
this.prefix = [0];
for (const num of arr) {
this.prefix.push(this.prefix[this.prefix.length - 1] + num);
}
// Space: O(n)
}
query(left, right) {
return this.prefix[right + 1] - this.prefix[left];
// Time: O(1)
}
}
class RangeSumOnDemand {
constructor(arr) {
this.arr = arr;
// Space: O(1) - just store reference
}
query(left, right) {
let sum = 0;
for (let i = left; i <= right; i++) {
sum += this.arr[i];
}
return sum;
// Time: O(n)
}
}
// Factorial: memoized vs iterative
const factorialMemo = (() => {
const cache = new Map([[0, 1], [1, 1]]);
return function factorial(n) {
if (cache.has(n)) return cache.get(n);
const result = n * factorial(n - 1);
cache.set(n, result);
return result;
};
})();
// Space: O(n) for cache
function factorialIterative(n) {
let result = 1;
for (let i = 2; i <= n; i++) {
result *= i;
}
return result;
}
// Space: O(1)
// LRU Cache: space for speed
class LRUCache {
constructor(capacity) {
this.capacity = capacity;
this.cache = new Map();
}
get(key) {
if (!this.cache.has(key)) return -1;
const value = this.cache.get(key);
this.cache.delete(key);
this.cache.set(key, value); // Move to end (most recent)
return value;
}
put(key, value) {
if (this.cache.has(key)) {
this.cache.delete(key);
} else if (this.cache.size >= this.capacity) {
const firstKey = this.cache.keys().next().value;
this.cache.delete(firstKey); // Remove least recent
}
this.cache.set(key, value);
}
// O(1) operations, O(capacity) space
}
// Test tradeoffs
const arr = Array.from({length: 1000}, (_, i) => i);
console.time("Precomputed setup");
const precomputed = new RangeSumPrecomputed(arr);
console.timeEnd("Precomputed setup");
console.time("Precomputed query");
for (let i = 0; i < 1000; i++) {
precomputed.query(10, 100);
}
console.timeEnd("Precomputed query");
console.time("On-demand query");
const onDemand = new RangeSumOnDemand(arr);
for (let i = 0; i < 1000; i++) {
onDemand.query(10, 100);
}
console.timeEnd("On-demand query");
console.log("\nFactorial 10 (memoized):", factorialMemo(10));
console.log("Factorial 10 (iterative):", factorialIterative(10));
// Choose based on:
// - Query frequency (many queries → precompute)
// - Available memory (limited → compute on demand)
// - Update frequency (frequent updates → on demand)
// - Access pattern (random → precompute, sequential → compute)
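Between the precompute-everything and compute-on-demand extremes above sits a middle ground not shown in this section: a Fenwick (binary indexed) tree, which supports both queries and point updates in O(log n) with O(n) space. A minimal sketch (FenwickTree is an illustrative name):

```javascript
// Fenwick tree: O(log n) point update and O(log n) range-sum query,
// useful when the array changes too often for plain prefix sums
class FenwickTree {
  constructor(n) {
    this.tree = new Float64Array(n + 1); // 1-indexed partial sums
  }
  add(i, delta) {            // point update at index i
    for (i += 1; i < this.tree.length; i += i & -i) this.tree[i] += delta;
  }
  prefixSum(i) {             // sum of [0, i]
    let s = 0;
    for (i += 1; i > 0; i -= i & -i) s += this.tree[i];
    return s;
  }
  query(left, right) {       // sum of [left, right]
    return this.prefixSum(right) - (left > 0 ? this.prefixSum(left - 1) : 0);
  }
}
```

This trades the O(1) queries of the prefix-sum class for updates that no longer require rebuilding the whole precomputed array.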
7. Auxiliary Space vs Total Space
| Concept | Definition | Example |
|---|---|---|
| Total Space | Input size + auxiliary space | Algorithm using O(n) input + O(log n) extra = O(n) total |
| Auxiliary Space | Extra space excluding input | QuickSort O(log n) for recursion stack |
| In-place | O(1) auxiliary space | Heap sort, bubble sort, selection sort |
| Out-of-place | O(n) or more auxiliary space | Merge sort with temp arrays |
| Recursion | Stack space = O(depth) | DFS on tree = O(h), BFS with queue = O(w) |
Example: Auxiliary Space Analysis
// Merge Sort: O(n) auxiliary space (out-of-place)
function mergeSort(arr) {
if (arr.length <= 1) return arr;
const mid = Math.floor(arr.length / 2);
const left = mergeSort(arr.slice(0, mid)); // O(n) space for slices
const right = mergeSort(arr.slice(mid));
return merge(left, right);
}
function merge(left, right) {
const result = []; // O(n) auxiliary space
let i = 0, j = 0;
while (i < left.length && j < right.length) {
if (left[i] < right[j]) {
result.push(left[i++]);
} else {
result.push(right[j++]);
}
}
return result.concat(left.slice(i)).concat(right.slice(j));
}
// Quick Sort: O(log n) auxiliary space (in-place, recursion only)
function quickSort(arr, low = 0, high = arr.length - 1) {
if (low < high) {
const pi = partition(arr, low, high);
quickSort(arr, low, pi - 1); // Stack space O(log n) average
quickSort(arr, pi + 1, high);
}
return arr;
}
function partition(arr, low, high) {
const pivot = arr[high];
let i = low - 1;
for (let j = low; j < high; j++) {
if (arr[j] < pivot) {
i++;
[arr[i], arr[j]] = [arr[j], arr[i]]; // In-place swap
}
}
[arr[i + 1], arr[high]] = [arr[high], arr[i + 1]];
return i + 1;
}
// Iterative QuickSort: explicit stack, O(log n) auxiliary on average (O(n) worst case)
function quickSortIterative(arr) {
const stack = [];
stack.push(0);
stack.push(arr.length - 1);
while (stack.length > 0) {
const high = stack.pop();
const low = stack.pop();
if (low < high) {
const pi = partition(arr, low, high);
stack.push(low);
stack.push(pi - 1);
stack.push(pi + 1);
stack.push(high);
}
}
return arr;
}
// Tree traversal space complexity
class TreeNode {
constructor(val, left = null, right = null) {
this.val = val;
this.left = left;
this.right = right;
}
}
// DFS Recursive: O(h) stack space, h = tree height
function dfsRecursive(root, result = []) {
if (!root) return result;
result.push(root.val);
dfsRecursive(root.left, result);
dfsRecursive(root.right, result);
return result;
}
// DFS Iterative: O(h) explicit stack space
function dfsIterative(root) {
if (!root) return [];
const result = [];
const stack = [root];
while (stack.length > 0) {
const node = stack.pop();
result.push(node.val);
if (node.right) stack.push(node.right);
if (node.left) stack.push(node.left);
}
return result;
}
// BFS: O(w) queue space, w = max width
function bfs(root) {
if (!root) return [];
const result = [];
const queue = [root];
while (queue.length > 0) {
const node = queue.shift();
result.push(node.val);
if (node.left) queue.push(node.left);
if (node.right) queue.push(node.right);
}
return result;
}
// Morris Traversal: O(1) auxiliary space (excluding the output array)
function morrisInorder(root) {
const result = [];
let current = root;
while (current) {
if (!current.left) {
result.push(current.val);
current = current.right;
} else {
// Find predecessor
let predecessor = current.left;
while (predecessor.right && predecessor.right !== current) {
predecessor = predecessor.right;
}
if (!predecessor.right) {
predecessor.right = current; // Create thread
current = current.left;
} else {
predecessor.right = null; // Remove thread
result.push(current.val);
current = current.right;
}
}
}
return result;
}
// Test
const arr1 = [64, 34, 25, 12, 22, 11, 90];
console.log("Merge sort:", mergeSort([...arr1]));
// Auxiliary: O(n)
console.log("Quick sort:", quickSort([...arr1]));
// Auxiliary: O(log n) average
console.log("Quick sort iterative:", quickSortIterative([...arr1]));
// Auxiliary: O(log n) explicit stack
const tree = new TreeNode(1,
new TreeNode(2, new TreeNode(4), new TreeNode(5)),
new TreeNode(3)
);
console.log("\nDFS recursive:", dfsRecursive(tree));
// Auxiliary: O(h)
console.log("DFS iterative:", dfsIterative(tree));
// Auxiliary: O(h)
console.log("BFS:", bfs(tree));
// Auxiliary: O(w)
console.log("Morris inorder:", morrisInorder(tree));
// Auxiliary: O(1) (excluding output)
8. Memory Hierarchy Optimization
| Level | Size | Latency | Optimization |
|---|---|---|---|
| L1 Cache | 32-64 KB | ~1 ns (4 cycles) | Keep hot data < 32KB |
| L2 Cache | 256-512 KB | ~3 ns (12 cycles) | Working set < 256KB |
| L3 Cache | 8-32 MB | ~12 ns (40 cycles) | Shared, avoid conflicts |
| RAM | 4-64 GB | ~100 ns (200-300 cycles) | Minimize cache misses |
| SSD | 256 GB - 2 TB | ~50-150 μs | Batch I/O, async reads |
Example: Cache Optimization Techniques
// Cache line size: typically 64 bytes
// Goal: maximize data in each loaded cache line
// Bad: strided access pattern
function sumColumnBad(matrix) {
const rows = matrix.length;
const cols = matrix[0].length;
const sums = Array(cols).fill(0);
for (let j = 0; j < cols; j++) {
for (let i = 0; i < rows; i++) {
sums[j] += matrix[i][j]; // Column access - cache miss heavy
}
}
return sums;
}
// Good: sequential access pattern
function sumColumnGood(matrix) {
const rows = matrix.length;
const cols = matrix[0].length;
const sums = Array(cols).fill(0);
for (let i = 0; i < rows; i++) {
for (let j = 0; j < cols; j++) {
sums[j] += matrix[i][j]; // Row access - cache friendly
}
}
return sums;
}
// Loop blocking/tiling for matrix operations
function matrixMultiplyTiled(A, B, tileSize = 32) {
const n = A.length;
const C = Array(n).fill(0).map(() => Array(n).fill(0));
// Tile loops for L1 cache
for (let ii = 0; ii < n; ii += tileSize) {
for (let jj = 0; jj < n; jj += tileSize) {
for (let kk = 0; kk < n; kk += tileSize) {
// Inner tile computation
const iMax = Math.min(ii + tileSize, n);
const jMax = Math.min(jj + tileSize, n);
const kMax = Math.min(kk + tileSize, n);
for (let i = ii; i < iMax; i++) {
for (let k = kk; k < kMax; k++) {
const r = A[i][k];
for (let j = jj; j < jMax; j++) {
C[i][j] += r * B[k][j];
}
}
}
}
}
}
return C;
}
// Data alignment for cache lines
class AlignedArray {
constructor(size, alignment = 64) {
// In JS, use typed arrays for better memory layout
this.data = new Float64Array(size);
this.alignment = alignment;
}
get(index) {
return this.data[index];
}
set(index, value) {
this.data[index] = value;
}
}
// Prefetching hint (manual in some languages, automatic in JS)
function sumWithPrefetch(arr) {
let sum = 0;
const prefetchDistance = 8; // Prefetch ahead
for (let i = 0; i < arr.length; i++) {
// In C/C++, would use __builtin_prefetch(&arr[i + prefetchDistance])
sum += arr[i];
}
return sum;
}
// False sharing avoidance (multi-threaded)
class CacheLinePadded {
constructor(value) {
this.value = value;
// Pad to cache line size (64 bytes)
// In JS, this is less critical, but in C/C++:
// char padding[64 - sizeof(value)];
this.padding = new Array(7).fill(0); // Approximate padding
}
}
// Memory access patterns
function compareAccessPatterns(size) {
const arr = new Float64Array(size);
// Sequential access (cache friendly)
console.time("Sequential");
for (let i = 0; i < arr.length; i++) {
arr[i] = i;
}
console.timeEnd("Sequential");
// Strided access (cache unfriendly)
console.time("Strided");
const stride = 16;
for (let i = 0; i < stride; i++) {
for (let j = i; j < arr.length; j += stride) {
arr[j] = j;
}
}
console.timeEnd("Strided");
// Random access (worst case)
console.time("Random");
for (let i = 0; i < arr.length; i++) {
const idx = Math.floor(Math.random() * arr.length);
arr[idx] = idx;
}
console.timeEnd("Random");
}
// Benchmark cache effects
const matrix = Array(100).fill(0).map(() =>
Array(100).fill(0).map(() => Math.random())
);
console.time("Column sum (bad)");
sumColumnBad(matrix);
console.timeEnd("Column sum (bad)");
console.time("Column sum (good)");
sumColumnGood(matrix);
console.timeEnd("Column sum (good)");
console.log("\nAccess pattern comparison:");
compareAccessPatterns(1000000);
// Key principles:
// 1. Sequential > strided > random access
// 2. Keep working set in L1/L2 cache
// 3. Align data to cache line boundaries
// 4. Avoid false sharing in multi-threaded code
// 5. Use loop tiling for large arrays
// 6. Prefer SoA over AoS for SIMD
// 7. Minimize pointer chasing
Memory Optimization Summary
- Cache-friendly: Sequential access, SoA layout, loop tiling, prefetching, aligned data structures
- Space Optimization: Rolling arrays, bit packing, coordinate compression, implicit trees, lazy evaluation
- In-place: O(1) auxiliary space, two-pointer swap, reversal algorithm, Dutch flag, partition schemes
- Memory Pools: Object pooling, arena/bump allocators, free lists, slab allocation, reduced fragmentation
- GC-Aware: Object reuse, avoid temporaries, WeakMap/WeakSet, typed arrays, closure management
- Space-Time Tradeoff: Precompute vs compute on-demand, memoization vs iteration, caching strategies
- Auxiliary Space: Extra space excluding input, recursion stack O(depth), in-place O(1), Morris traversal
- Memory Hierarchy: L1 < 32KB hot data, sequential > strided > random, cache line 64 bytes, loop blocking