Table of Contents

very rough notes for my own reference.

Branch Prediction

  • Branch prediction guesses the outcome of a branch before it is resolved, so the pipeline can keep fetching instead of stalling.
  • Static prediction: always guess the same way (e.g. predict taken, or predict backward branches taken).
  • Dynamic prediction
  • Saturating counter (small per-branch state machine, e.g. 2-bit)
  • Neural (latest!)

  • Speculative execution: run the predicted path, and discard the work on a misprediction.
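A minimal sketch of the saturating-counter idea, assuming the classic 2-bit scheme (the class name and the branch pattern are made up):

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0-1 predict not-taken, 2-3 predict taken."""

    def __init__(self):
        self.state = 0  # start strongly not-taken

    def predict(self):
        return self.state >= 2  # True = predict taken

    def update(self, taken):
        # saturate at 0 and 3, so one odd outcome doesn't flip the prediction
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)
```

For a loop branch (taken, taken, taken, not-taken, repeated), the counter warms up and then mispredicts only the single not-taken iteration at each loop exit.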

Semaphore

  • A semaphore is a variable that can be accessed only through two standard atomic operations: wait and signal.
  • wait: If the semaphore is positive, decrement it. Otherwise, block.
  • signal: Increment the semaphore. If there are blocked processes, wake one of them up.
  • One of the simplest ways to solve the critical-section problem.
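A sketch of wait/signal built on a condition variable (the class and names are illustrative; a real OS semaphore lives in the kernel, and Python already ships `threading.Semaphore`):

```python
import threading

class Semaphore:
    def __init__(self, value=1):
        self.value = value
        self.cond = threading.Condition()

    def wait(self):
        with self.cond:
            while self.value == 0:   # block while the semaphore is not positive
                self.cond.wait()
            self.value -= 1

    def signal(self):
        with self.cond:
            self.value += 1
            self.cond.notify()       # wake one blocked waiter, if any
```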

Memory

  • DRAM: Slow, cheap, high capacity.
  • SRAM: Fast, expensive, low capacity.

  • How to make slow memory appear fast? Caching!
  • How to make small memory appear large? Virtual memory!

Cache

Idea: Use SRAM as a cache for DRAM.

  • Temporal locality: If a memory location is accessed, it is likely to be accessed again soon.
  • Spatial locality: If a memory location is accessed, nearby memory locations are likely to be accessed soon.

Some terminology:

  • Cache hit: The data is in the cache.
  • Cache miss: The data is not in the cache.
  • Hit rate: The fraction of memory accesses that are hits.
  • Hit time: Time to access the cache.
  • Miss rate: 1 - hit rate.
  • Miss penalty: Time to fetch data from DRAM + hit time.
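These quantities combine into the average memory access time (AMAT). A worked example with made-up latencies, using the definition of miss penalty above (DRAM fetch time + hit time):

```python
hit_time = 1                          # cycles to probe the cache (assumed)
dram_time = 100                       # cycles to fetch a block from DRAM (assumed)
miss_rate = 0.05
miss_penalty = dram_time + hit_time   # per the definition above

# hits pay hit_time; misses pay the full penalty
amat = (1 - miss_rate) * hit_time + miss_rate * miss_penalty
print(amat)  # 6.0 cycles
```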

Cache Organization

  • Cache line / block: The smallest unit of data that can be stored in the cache.
  • A block can store multiple words.
  • Address, let’s say 0x19ac80c2, can be split into
    • Address of a word: [tag][index][offset]
    • Address of a block: [tag][offset]
    • Tag: 0x19ac8
    • Index: 0x0c
    • Offset: 0x2
  • A cache will store: [tag][valid bit][dirty bit][data (maybe 4 words)]
    • Valid bit: Whether the cache line is valid.
    • Dirty bit: Whether the cache line has been modified (used by write-back caches).
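The 0x19ac80c2 split above corresponds to a 4-bit offset and an 8-bit index (field widths assumed from how the hex digits divide); a sketch:

```python
def split_address(addr, offset_bits=4, index_bits=8):
    """Split a 32-bit address into (tag, index, offset) fields."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

tag, index, offset = split_address(0x19AC80C2)
# tag = 0x19AC8, index = 0x0C, offset = 0x2
```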

Cache Write Policy

On a write hit:

  • Write-through: Write to both cache and memory.
  • Write-back: Write to cache only, and write to memory only when the cache line is evicted.

On a write miss:

  • Write-allocate: Fetch the block from memory and write to cache.
  • Write-around: Write to memory only, and do not fetch the block.

Write-through problems

  • Slow, because we have to write to memory.
  • Solution: Write buffer. Write to buffer, and write to memory later.

Write-back problems

  • Wasteful to write to memory if the cache line is evicted without being modified.
  • Solution: Dirty bit. Only write to memory if the dirty bit is set.

Memory Abstraction

Say programs A and B both use address 0x100 to store their variable i. If addresses were physical, A and B would collide on the same memory location. That cannot work, so we need an abstraction.

  • We can use Base + Limit registers.
  • Base: The starting address of the program.
  • Limit: The size of the program.
  • Real Address = Base + Virtual Address, and assert(Real Address < Base + Limit)
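A sketch of the base + limit check (names are made up; in hardware the MMU does this on every access):

```python
def translate(virtual_addr, base, limit):
    """Relocate a virtual address; trap if it falls outside the program."""
    # assert(real < base + limit) is equivalent to virtual_addr < limit
    if virtual_addr >= limit:
        raise MemoryError("address beyond limit")
    return base + virtual_addr

# translate(0x10, base=0x1000, limit=0x100) -> 0x1010
```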

Partitioning

  • Fixed partitioning: Divide memory into fixed-size partitions.
    • Con: Partitions need to be large enough to hold the largest process.
    • Con: Internal fragmentation (a smaller process wastes the rest of its partition).
  • Dynamic partitioning: Divide memory into variable-size partitions.
    • Free space is known as holes.
    • External fragmentation.

Allocation Algorithms (Assume Memory is Contiguous)

  • First-fit: Allocate the first hole that is big enough.
  • Best-fit: Allocate the smallest hole that is big enough.
  • Worst-fit: Allocate the largest hole.
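The three policies above, sketched over a list of (start, length) holes (the representation is made up for illustration):

```python
def first_fit(holes, size):
    """holes: list of (start, length). Return index of chosen hole, or None."""
    for i, (start, length) in enumerate(holes):
        if length >= size:
            return i
    return None

def best_fit(holes, size):
    fits = [i for i, (_, length) in enumerate(holes) if length >= size]
    if not fits:
        return None
    return min(fits, key=lambda i: holes[i][1])  # smallest hole that fits

def worst_fit(holes, size):
    fits = [i for i, (_, length) in enumerate(holes) if length >= size]
    if not fits:
        return None
    return max(fits, key=lambda i: holes[i][1])  # largest hole
```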

Buddy System

Kind of like a segment tree: repeatedly split blocks in half (all sizes powers of two); on free, merge a block with its "buddy" when both are free.
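A toy buddy allocator, assuming block sizes are powers of two and a free list per order (all names made up; real allocators track much more state):

```python
class BuddyAllocator:
    """Free lists per order; split on alloc, merge with the buddy on free."""

    def __init__(self, total_order):
        self.total_order = total_order
        self.free = {k: set() for k in range(total_order + 1)}
        self.free[total_order].add(0)   # one big block covering everything

    def alloc(self, order):
        """Allocate a block of 2**order units; return its address or None."""
        k = order
        while k <= self.total_order and not self.free[k]:
            k += 1                       # find the smallest block big enough
        if k > self.total_order:
            return None
        addr = self.free[k].pop()
        while k > order:                 # split down to the requested size
            k -= 1
            self.free[k].add(addr + (1 << k))  # right half goes on a free list
        return addr

    def free_block(self, addr, order):
        while order < self.total_order:
            buddy = addr ^ (1 << order)  # buddy address differs in one bit
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)
            addr = min(addr, buddy)      # merged block starts at the lower half
            order += 1
        self.free[order].add(addr)
```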

Disjoint Memory Schemes

Paging Scheme

  • Divide logical memory (the process's address space) into fixed-size pages (keep the size a power of 2).
  • Divide physical memory into fixed-size frames (same size as a page).
  • Map pages to frames via a lookup table (the page table).
  • Physical Address = Frame Number * Page Size + Offset
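A sketch of that translation with a toy page table (the dict and the 4 KiB page size are assumptions for illustration):

```python
PAGE_SIZE = 4096                     # assumed 4 KiB pages
page_table = {0: 5, 1: 2, 2: 7}      # toy page table: page number -> frame number

def translate(vaddr):
    page, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table[page]         # a missing entry would be a page fault
    return frame * PAGE_SIZE + offset

# translate(0x1004): page 1 -> frame 2, so 2 * 4096 + 4 = 0x2004
```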

Observations

  • External fragmentation is eliminated.
  • Internal fragmentation is minimal (at most one page for each process).

Implementation

  • Pure software, issue: requires two memory accesses for every memory reference (one for the page table entry, one for the data).
  • Hardware Support! Translation Look-aside Buffer (TLB): A cache for page table entries.

Protection

  • Paging can include:
    • Access rights: Read, write, execute.

Scheduling Algorithms

Round-robin

  • Each process gets a fixed amount of time to run.
  • If the process finishes before its time quantum expires, the scheduler simply moves on to the next process.
  • If it does not finish in time, it is preempted and put at the back of the queue.
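A small round-robin simulation (the dict-of-burst-times representation is made up; it returns the order in which processes complete):

```python
from collections import deque

def round_robin(bursts, quantum):
    """bursts: {pid: remaining CPU time}. Returns the completion order."""
    queue = deque(bursts)
    remaining = dict(bursts)
    order = []
    while queue:
        pid = queue.popleft()
        run = min(quantum, remaining[pid])
        remaining[pid] -= run
        if remaining[pid] == 0:
            order.append(pid)        # finished within its quantum
        else:
            queue.append(pid)        # preempted: back of the queue
    return order
```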

Priority

  • Preemptive: Higher priority processes are run first.
  • Non-preemptive: Once a process is running, it will run until it finishes.

  • Shortcomings: Starvation.
    • Solution 1: Aging. Increase the priority of a process as it waits.
    • Solution 2: Minimum time quantum. Ensure low-priority processes get to run for a minimum amount of time.

Multi-Level Feedback Queue (MLFQ)

  • Multiple queues with different priorities.
  • A new job enters the highest-priority queue.
  • If a process uses up its full time quantum, demote it one level.
  • If a process blocks (e.g. for I/O) before its quantum expires, keep it at the same level.

Lottery

  • Each process gets a number of tickets.
  • A random ticket is drawn, and the process with the winning ticket gets to run.
  • The more tickets a process has, the more likely it is to win.
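A sketch of one lottery draw (the ticket dict is illustrative):

```python
import random

def lottery_pick(tickets, rng=random):
    """tickets: {pid: ticket count}. Draw one winning ticket uniformly."""
    total = sum(tickets.values())
    draw = rng.randrange(total)          # the winning ticket number
    for pid, count in tickets.items():
        if draw < count:
            return pid
        draw -= count                    # skip past this process's tickets
```

Over many draws, a process holding 3 of 4 tickets should win roughly 75% of the time.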

Threads

  • A thread is a sequence of instructions that can be executed independently.
  • A process can have multiple threads.
  • Threads share the same address space.