Random Notes on Computer Organization stuff
very rough notes for my own reference.
Branch Prediction
- Branch prediction is a way to improve the performance of branch instructions.
- Static prediction: always guess the same way (e.g. always predict taken).
- Dynamic prediction
- Saturating counter (e.g. a 2-bit counter per branch)
- Neural (latest!)
- Speculative execution: start running the predicted path before the branch resolves, roll back on a mispredict.
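The saturating-counter idea above can be sketched as a 2-bit predictor: two "taken" states, two "not taken" states, moving one step toward each actual outcome. This is a minimal illustrative sketch; the class and state encoding are my own choices, not from the notes.

```python
# 2-bit saturating counter: states 0-1 predict "not taken", 2-3 predict "taken".
class TwoBitPredictor:
    def __init__(self):
        self.state = 2  # start at "weakly taken" (an arbitrary choice)

    def predict(self):
        return self.state >= 2  # True means "predict taken"

    def update(self, taken):
        # Move one step toward the actual outcome, saturating at 0 and 3.
        if taken:
            self.state = min(self.state + 1, 3)
        else:
            self.state = max(self.state - 1, 0)

p = TwoBitPredictor()
outcomes = [True, True, False, True]  # actual branch results (made up)
correct = sum(p.predict() == t or p.update(t) for t in [])  # see loop below
correct = 0
for t in outcomes:
    if p.predict() == t:
        correct += 1
    p.update(t)
print(correct)  # 3: one mispredict, absorbed without flipping the prediction
```

The saturation is the point: a single anomalous outcome (like a loop exit) does not flip the prediction the way a 1-bit scheme would.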
Semaphore
- A semaphore is a variable that can be accessed only through two standard atomic operations: wait and signal.
- wait: If the semaphore is positive, decrement it. Otherwise, block.
- signal: Increment the semaphore. If there are blocked processes, wake one of them up.
- One of the simplest ways to solve the critical section problem.
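The wait/signal behavior above can be sketched with a lock plus condition variable. Names and structure here are my own illustration (Python also ships a ready-made `threading.Semaphore`).

```python
import threading

class Semaphore:
    def __init__(self, value=1):
        self._value = value
        self._cond = threading.Condition()

    def wait(self):
        with self._cond:
            while self._value == 0:   # not positive: block
                self._cond.wait()
            self._value -= 1          # positive: decrement

    def signal(self):
        with self._cond:
            self._value += 1          # increment
            self._cond.notify()       # wake one blocked process, if any
```

A semaphore initialized to 1 acts as a mutex: `wait()` before the critical section, `signal()` after.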
Memory
- DRAM: Slow, cheap, high capacity.
- SRAM: Fast, expensive, low capacity.
- How to make slow memory appear fast? Caching!
- How to make small memory appear large? Virtual memory!
Cache
Idea: Use SRAM as a cache for DRAM.
- Temporal locality: If a memory location is accessed, it is likely to be accessed again soon.
- Spatial locality: If a memory location is accessed, nearby memory locations are likely to be accessed soon.
Some terminology:
- Cache hit: The data is in the cache.
- Cache miss: The data is not in the cache.
- Hit rate: The fraction of memory accesses that are hits.
- Hit time: Time to access the cache.
- Miss rate: 1 - hit rate.
- Miss penalty: Time to fetch data from DRAM + hit time.
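These terms combine into the average memory access time (AMAT). The numbers below are made up for illustration; the formula follows the definitions above, where the miss penalty already includes the hit time.

```python
hit_time = 1         # cycles to access the cache (assumed)
dram_time = 100      # cycles to fetch from DRAM (assumed)
miss_rate = 0.05     # 1 - hit rate

miss_penalty = dram_time + hit_time  # per the definition above
amat = (1 - miss_rate) * hit_time + miss_rate * miss_penalty
print(amat)  # 6.0 cycles
```

Even a 5% miss rate dominates the average here, which is why miss rate matters so much more than raw hit time.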
Cache Organization
- Cache line / block: The smallest unit of data that can be stored in the cache.
- A block can store multiple words.
- Address, let’s say 0x19ac80c2, can be split into
- Address of a word: [tag][index][offset]
- Address of a block: [tag][index] (the offset selects a word within the block)
- Tag: 0x19ac8
- Index: 0x0c
- Offset: 0x2
- A cache will store: [tag][valid bit][dirty bit][data (maybe 4 words)]
- Valid bit: Whether the cache line is valid.
- Dirty bit: Whether the cache line has been modified. (used by write-back caches)
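The 0x19ac80c2 split above can be reproduced with bit operations, assuming a 4-bit offset and an 8-bit index so the fields line up with the hex digits (those widths are my assumption to match the example).

```python
addr = 0x19AC80C2
OFFSET_BITS = 4   # assumed block offset width
INDEX_BITS = 8    # assumed index width

offset = addr & ((1 << OFFSET_BITS) - 1)
index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
tag = addr >> (OFFSET_BITS + INDEX_BITS)

print(hex(tag), hex(index), hex(offset))  # 0x19ac8 0xc 0x2
```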
Cache Write Policy
On a write hit:
- Write-through: Write to both cache and memory.
- Write-back: Write to cache only, and write to memory only when the cache line is evicted.
On a write miss:
- Write-allocate: Fetch the block from memory and write to cache.
- Write-around: Write to memory only, and do not fetch the block.
Write-through problems
- Slow, because we have to write to memory.
- Solution: Write buffer. Write to buffer, and write to memory later.
Write-back problems
- Wasteful to write to memory if the cache line is evicted without being modified.
- Solution: Dirty bit. Only write to memory if the dirty bit is set.
Memory Abstraction
Say programs A and B both use “address 0x100” to store their variable i. If addresses were physical, A and B would be referring to the same memory location. That cannot work!
- We can use Base + Limit registers.
- Base: The starting address of the program.
- Limit: The size of the program.
- Real Address = Base + Virtual Address, and assert(Real Address < Base + Limit)
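The Base + Limit check above, as a small sketch (the base/limit values are made up):

```python
def translate(virtual_addr, base, limit):
    real = base + virtual_addr
    assert real < base + limit, "segmentation fault"  # the check from the note
    return real

# A and B both use virtual address 0x100 without clashing:
a = translate(0x100, base=0x4000, limit=0x1000)
b = translate(0x100, base=0x8000, limit=0x1000)
print(hex(a), hex(b))  # 0x4100 0x8100
```

The same virtual address lands in two different physical regions, which is the whole point of the abstraction.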
Partitioning (L9)
- Fixed partitioning: Divide memory into fixed-size partitions.
- Con: Partitions need to be large enough to hold the largest process.
- Con: Internal fragmentation (a smaller process wastes the rest of its partition).
- Dynamic partitioning: Divide memory into variable-size partitions.
- Free space is known as holes.
- External fragmentation.
Allocation Algorithms (Assume Memory is Contiguous)
- First-fit: Allocate the first hole that is big enough.
- Best-fit: Allocate the smallest hole that is big enough.
- Worst-fit: Allocate the largest hole.
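The three algorithms over a list of hole sizes, as a toy sketch (hole sizes are made up):

```python
def first_fit(holes, size):
    # First hole big enough, scanning left to right.
    return next((i for i, h in enumerate(holes) if h >= size), None)

def best_fit(holes, size):
    # Smallest hole that still fits.
    fits = [(h, i) for i, h in enumerate(holes) if h >= size]
    return min(fits)[1] if fits else None

def worst_fit(holes, size):
    # Largest hole.
    fits = [(h, i) for i, h in enumerate(holes) if h >= size]
    return max(fits)[1] if fits else None

holes = [100, 500, 200, 300, 600]    # hole sizes in KB (example)
print(first_fit(holes, 212))  # 1 -> the 500 KB hole
print(best_fit(holes, 212))   # 3 -> the 300 KB hole
print(worst_fit(holes, 212))  # 4 -> the 600 KB hole
```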
Buddy System
Kind of like segment tree, split binary.
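A toy sketch of the "split binary" idea, assuming a 1 KB region: requests round up to a power of two, and a larger block is halved repeatedly until it fits (coalescing buddies on free is omitted here).

```python
def round_up_pow2(n):
    p = 1
    while p < n:
        p *= 2
    return p

class Buddy:
    def __init__(self, total=1024):
        self.total = total
        self.free = {total: [0]}  # block size -> list of free block offsets

    def alloc(self, n):
        size = round_up_pow2(n)
        s = size
        while not self.free.get(s):   # find a free block of at least this size
            s *= 2
            if s > self.total:
                return None           # out of memory
        start = self.free[s].pop()
        while s > size:               # split in half; free the right "buddy"
            s //= 2
            self.free.setdefault(s, []).append(start + s)
        return start

b = Buddy()
print(b.alloc(100))  # 0   (rounded up to 128: 1024 -> 512 -> 256 -> 128)
print(b.alloc(240))  # 256 (rounded up to 256; a free 256 block already exists)
```

The segment-tree resemblance is exactly this: each block's two halves are siblings, and a freed block can merge with its buddy to rebuild the parent.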
Disjoint Memory Schemes
Paging Scheme
- Divide each process's logical memory into fixed-size pages (keep the size a power of 2).
- Divide physical memory into frames (same size as a page).
- Map pages to frames (called page table).
- Physical Address = Frame Number * Page Size + Offset
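The translation formula above as a sketch; the page table contents and page size are made up.

```python
PAGE_SIZE = 4096                  # bytes (a common choice)
page_table = {0: 5, 1: 2, 2: 7}   # page number -> frame number (made up)

def translate(virtual_addr):
    page = virtual_addr // PAGE_SIZE
    offset = virtual_addr % PAGE_SIZE
    frame = page_table[page]             # page fault if missing (not handled)
    return frame * PAGE_SIZE + offset    # Frame Number * Page Size + Offset

print(hex(translate(0x1ABC)))  # page 1 -> frame 2, so 0x2ABC
```

With a power-of-2 page size, the divide and modulo are just a shift and a mask, which is why the note says to keep the size a power of 2.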
Observations
- External fragmentation is eliminated.
- Internal fragmentation is minimal (at most one page for each process).
Implementation
- Pure software, issue: requires two memory accesses for every memory reference (one for the page table, one for the data).
- Hardware Support! Translation Look-aside Buffer (TLB): A cache for page table entries.
Protection
- Paging can include:
- Access rights: Read, write, execute.
Scheduling Algorithms
Round-robin
- Each process gets a fixed amount of time to run.
- If the process finishes before the time is up, it simply exits and the next process runs.
- If the process does not finish before the time is up, it is preempted and put at the back of the queue.
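The two rules above, simulated with a queue (burst times and the quantum of 2 are made up):

```python
from collections import deque

def round_robin(bursts, quantum=2):
    queue = deque(bursts.items())     # (name, remaining time)
    order = []                        # the sequence of scheduled runs
    while queue:
        name, remaining = queue.popleft()
        order.append(name)
        if remaining > quantum:       # quantum used up: back of the queue
            queue.append((name, remaining - quantum))
        # else: finished within the quantum; next process runs immediately
    return order

print(round_robin({"A": 3, "B": 1, "C": 5}))  # ['A', 'B', 'C', 'A', 'C', 'C']
```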
Priority
- Preemptive: Higher priority processes are run first.
- Non-preemptive: Once a process is running, it will run until it finishes.
- Shortcomings: Starvation.
- Solution 1: Aging. Increase the priority of a process as it waits.
- Solution 2: Minimum time quantum. Ensure low-priority processes get to run for a minimum amount of time.
Multi-Level Feedback Queue (MLFQ)
- Multiple queues with different priorities.
- New job, highest.
- If a process uses up its time quantum, demote.
- If a process blocks, retain.
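The three MLFQ rules above as a toy sketch with three queues (no aging or priority boost; function names are my own):

```python
from collections import deque

queues = [deque(), deque(), deque()]   # index 0 = highest priority

def admit(name):
    queues[0].append(name)             # new job enters at the highest level

def pick():
    # Run the front job of the highest-priority non-empty queue.
    for level, q in enumerate(queues):
        if q:
            return level, q.popleft()

def on_quantum_expired(name, level):
    lower = min(level + 1, len(queues) - 1)
    queues[lower].append(name)         # used its full quantum: demote

def on_block(name, level):
    queues[level].append(name)         # blocked (e.g. on I/O): keep its level

admit("A")
level, job = pick()           # A runs at level 0
on_quantum_expired(job, level)
admit("B")
print(pick())                 # (0, 'B'): the new job outranks the demoted A
```

The demote-on-quantum / retain-on-block pair is what lets MLFQ separate CPU-bound jobs from interactive ones without knowing burst lengths in advance.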
Lottery
- Each process gets a number of tickets.
- A random ticket is drawn, and the process with the winning ticket gets to run.
- The more tickets a process has, the more likely it is to win.
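A sketch of the draw; the ticket counts are made up, and the seeded loop just shows that wins land roughly in proportion to tickets.

```python
import random

def draw_winner(tickets, rng=random):
    # tickets: process name -> ticket count
    winning = rng.randrange(sum(tickets.values()))  # pick a ticket uniformly
    for name, count in tickets.items():
        if winning < count:
            return name
        winning -= count

tickets = {"A": 75, "B": 20, "C": 5}   # A should win ~75% of draws
wins = {name: 0 for name in tickets}
random.seed(0)
for _ in range(1000):
    wins[draw_winner(tickets)] += 1
print(wins)
```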
Threads
- A thread is a sequence of instructions that can be executed independently.
- A process can have multiple threads.
- Threads share the same address space.
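A small illustration of the shared address space: two threads update the same global counter, with a lock making each increment atomic (without the lock, interleaved read-modify-write could lose updates).

```python
import threading

counter = 0
lock = threading.Lock()

def worker(n):
    global counter
    for _ in range(n):
        with lock:           # both threads see and modify the same counter
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 20000
```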