

### **OBJECTIVES**

- Mon 3/11 (4pm): Husky Alumni Visit from T-Mobile Q&A
  - CS work life after graduation - room TBA
- Wed 3/13: Prof. Mohamed Ali - UWT CSS Grad Program
- Assignment 2
- Active Reading Quiz Posted - Chapter 19
- Assignment 3
- **Memory Virtualization**
  - Chapter 15 - Address Translation
  - Chapter 16 - Segmentation
  - Chapter 17 - Free Space Management
  - Chapter 18 - Introduction to Paging
  - Chapter 19 - Translation Lookaside Buffer (TLB)

March 4, 2019 TCSS422: Operating Systems [Winter 2019] School of Engineering and Technology, University of Washington - Tacoma

### FEEDBACK FROM 2/27

- Can we schedule producers / consumers to run on different CPU cores?
  - Yes, see the Sloppy Counter example from Ch. 29, which creates pthreads and assigns them to fixed CPU cores
  - Uses the sched_setaffinity() API call
  - http://faculty.washington.edu/wlloyd/courses/tcss422/examples/Chapter29/sloppy.c
- Does realloc() overwrite the header?
  - realloc() should rewrite (update) the header with any information that has changed


### FEEDBACK - 2

- Can we get an extension on our HW 2?
- 2 day extension until Tuesday @ 11:59p
- Be wary of using the debugger to find causes of deadlock in multithreaded programs
- What challenges may arise if trying to reproduce deadlock using a stepwise debugger?














### DYNAMIC RELOCATION OF PROGRAMS

- Hardware requirements:

| Requirements | HW support |
|--------------|------------|
| Privileged mode | CPU modes: kernel, user |
| Base / bounds registers | Registers to support address translation |
| Translate virtual addr; check if in bounds | Translation circuitry, check limits |
| Privileged instruction(s) to update base / bounds regs | Instructions for modifying base/bound registers |
| Privileged instruction(s) to register exception handlers | Set code pointers to OS code to handle faults |
| Ability to raise exceptions | For out-of-bounds memory access, or attempts to access privileged instr. |

### OS SUPPORT FOR MEMORY VIRTUALIZATION

- For base and bounds, OS support is required:
  - When a process starts running
    - Allocate address space in physical memory
  - When a process is terminated
    - Reclaim memory for reuse
  - When a context switch occurs
    - Save and restore the base-bounds pair
  - Exception handlers
    - Function pointers set at OS boot time







### **DYNAMIC RELOCATION**

- OS can move process data when not running
  1. OS deschedules process from scheduler
  2. OS copies address space from current to new location
  3. OS updates PCB (base and bounds registers)
  4. OS reschedules process
- When process runs, new base register is restored to CPU
- Process doesn't know it was even moved!
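
Base-and-bounds translation and relocation can be sketched in a few lines of C. This is only an illustration: the `relocation_t` struct and the function names are hypothetical, and the "exception" is modeled as a `false` return value rather than a real hardware trap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process relocation state, as it might be saved in the PCB. */
typedef struct {
    uint32_t base;    /* physical address where the address space starts */
    uint32_t bounds;  /* size of the address space in bytes */
} relocation_t;

/* Translate a virtual address. Returns false for an out-of-bounds access,
 * which real hardware would report by raising an exception. */
bool translate(const relocation_t *r, uint32_t vaddr, uint32_t *paddr) {
    assert(r != NULL && paddr != NULL);
    if (vaddr >= r->bounds)
        return false;
    *paddr = r->base + vaddr;
    return true;
}

/* Model the OS moving a descheduled process: after the copy, only the
 * saved base value changes. The process keeps using the same virtual
 * addresses. */
void relocate(relocation_t *r, uint32_t new_base) {
    assert(r != NULL);
    r->base = new_base;
}
```

With a base of 32 KB and bounds of 16 KB, virtual address 1000 maps to physical address 33768. After `relocate(&r, 65536)` the same virtual address maps to 66536, and the process never sees the difference.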






### **MULTIPLE SEGMENTS**

- Memory segmentation
- Address space has (3) segments
  - Contiguous portions of address space
  - Logically separate segments for: code, stack, heap
- Each segment can be placed separately
- Track base and bounds for each segment (registers)
















### **SHARED CODE SEGMENTS**

- Code sharing: enabled with HW support
- Supports storing shared libraries in memory only once
- DLL: dynamic-link library
- .so (Linux): shared object in Linux (under /usr/lib)
- Many programs can access them
- Protection bits: track permissions to segment

Segment Register Values (with Protection)

| Segment | Base | Size | Grows Positive? | Protection   |
|---------|------|------|-----------------|--------------|
| Code    | 32K  | 2K   | 1               | Read-Execute |
| Heap    | 34K  | 2K   | 1               | Read-Write   |
| Stack   | 28K  | 2K   | 0               | Read-Write   |
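
A rough C sketch of how hardware might use this table follows. The slide does not give the address layout, so the sketch assumes 14-bit virtual addresses whose top two bits select the segment (00 code, 01 heap, 11 stack) and a 4 KB maximum segment size. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { SEG_READ = 1, SEG_WRITE = 2, SEG_EXEC = 4 };

typedef struct {
    uint32_t base;
    uint32_t size;
    bool     grows_positive;
    int      prot;
} segment_t;

/* Assumed layout: 2 segment bits + 12 offset bits = 14-bit addresses. */
#define SEG_SHIFT       12u
#define SEG_OFFSET_MASK 0xFFFu
#define MAX_SEG_SIZE    4096u
#define VADDR_LIMIT     (1u << 14)

/* Values from the table above; index 2 is unused (size 0). */
static const segment_t segments[4] = {
    [0] = { 32 * 1024, 2 * 1024, true,  SEG_READ | SEG_EXEC  },  /* code  */
    [1] = { 34 * 1024, 2 * 1024, true,  SEG_READ | SEG_WRITE },  /* heap  */
    [3] = { 28 * 1024, 2 * 1024, false, SEG_READ | SEG_WRITE },  /* stack */
};

typedef enum { XLATE_OK, XLATE_SEGFAULT, XLATE_PROTFAULT } xlate_t;

xlate_t seg_translate(uint32_t vaddr, int access, uint32_t *paddr) {
    assert(paddr != NULL);
    if (vaddr >= VADDR_LIMIT)
        return XLATE_SEGFAULT;
    const segment_t *s = &segments[vaddr >> SEG_SHIFT];
    uint32_t offset = vaddr & SEG_OFFSET_MASK;
    uint32_t phys;
    if (s->grows_positive) {
        if (offset >= s->size)
            return XLATE_SEGFAULT;
        phys = s->base + offset;
    } else {
        /* Negative growth: measure the distance below the base. */
        uint32_t below = MAX_SEG_SIZE - offset;
        if (below > s->size)
            return XLATE_SEGFAULT;
        phys = s->base - below;
    }
    if ((s->prot & access) != access)
        return XLATE_PROTFAULT;
    *paddr = phys;
    return XLATE_OK;
}
```

Under these assumptions, a read of heap address 4200 lands at physical address 34920, a stack access at 15 KB lands at 27 KB, and a write to the code segment is rejected by the protection bits.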


### **SEGMENTATION GRANULARITY**

- Coarse-grained
- Manage memory as large, purpose-based segments:
  - Code segment
  - Heap segment
  - Stack segment












### FREE SPACE MANAGEMENT

- How should free space be managed, when satisfying variable-sized requests?
- What strategies can be used to minimize fragmentation?
- What are the time and space overheads of alternate approaches?


### FREE SPACE MANAGEMENT

- Management of memory using
- Only fixed-sized units
  - Easy: keep a list
  - Memory request → return first free entry
    - Simple search
- With variable sized units
  - More challenging
  - Results from variable sized malloc requests
  - Leads to fragmentation














### **MEMORY HEADERS - 3**

- Size of memory chunk is:
- Header size + user malloc size
- N bytes + sizeof(header)
- Easy to determine address of header

```
void free(void *ptr) {
        // The header sits immediately before the pointer handed to the user
        header_t *hptr = (header_t *)((char *)ptr - sizeof(header_t));
}
```
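
The header arithmetic can be tried out with a small wrapper around the C library allocator. This is a sketch only: `demo_alloc`, `demo_header`, `demo_free`, and the magic value are hypothetical names chosen for illustration, and a real allocator would manage its own heap rather than call `malloc()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define HEADER_MAGIC 1234567  /* arbitrary integrity-check value (assumption) */

typedef struct {
    int size;   /* bytes handed to the user, excluding the header */
    int magic;  /* lets free() sanity-check the pointer it was given */
} header_t;

/* Allocate n user bytes preceded by a header.
 * The whole chunk is n + sizeof(header_t) bytes. */
void *demo_alloc(int n) {
    assert(n >= 0);
    header_t *h = malloc(sizeof(header_t) + (size_t)n);
    if (h == NULL)
        return NULL;
    h->size = n;
    h->magic = HEADER_MAGIC;
    return h + 1;  /* user pointer starts right after the header */
}

/* Recover the header from a user pointer, as free() would. */
header_t *demo_header(void *ptr) {
    header_t *h = (header_t *)((char *)ptr - sizeof(header_t));
    assert(h->magic == HEADER_MAGIC);
    return h;
}

void demo_free(void *ptr) {
    if (ptr != NULL)
        free(demo_header(ptr));
}
```

After `void *p = demo_alloc(20)`, the call `demo_header(p)->size` yields 20, which is how a `free()` implementation learns the chunk size without being told.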


### THE FREE LIST

- Simple free list struct

```
typedef struct __node_t {
         int size;
         struct __node_t *next;
} node_t;
```

- Use mmap to create free list
- 4 KB heap, 8-byte header (4-byte `size` plus 4-byte `next` on a 32-bit system), one contiguous free chunk

```
// mmap() returns a pointer to a chunk of free space
node_t *head = mmap(NULL, 4096, PROT_READ|PROT_WRITE,
                             MAP_ANON|MAP_PRIVATE, -1, 0);
head->size = 4096 - sizeof(node_t);
head->next = NULL;
```


Slides by Wes J. Lloyd













### **MEMORY ALLOCATION STRATEGIES**

- Best fit
  - Traverse free list
  - Identify all candidate free chunks
  - Note which is smallest (has best fit)
  - When splitting, "leftover" pieces are small (and potentially less useful -- fragmented)
- Worst fit
  - Traverse free list
  - Identify largest free chunk
  - Split largest free chunk, leaving a still large free chunk




### **MEMORY ALLOCATION STRATEGIES - 2**

- First fit
  - Start search at beginning of free list
  - Find first chunk large enough for request
  - Split chunk, returning a "fit" chunk, saving the remainder
  - Avoids full free list traversal of best and worst fit
- Next fit
  - Similar to first fit, but start search at last search location
  - Maintain a pointer that "cycles" through the list
  - Helps balance chunk distribution vs. first fit
  - Find first chunk that is large enough for the request, and split
  - Avoids full free list traversal
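
The search step of first fit, best fit, and worst fit can be compared side by side in C. This is a minimal sketch over a singly linked free list; the function names are hypothetical, and splitting the chosen chunk and next fit's roving pointer are left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    int size;           /* size of this free chunk in bytes */
    struct node *next;  /* next free chunk, or NULL */
} node_t;

/* First fit: return the first chunk large enough for the request. */
node_t *first_fit(node_t *head, int request) {
    assert(request > 0);
    for (node_t *n = head; n != NULL; n = n->next)
        if (n->size >= request)
            return n;
    return NULL;
}

/* Best fit: scan the whole list, keep the smallest chunk that fits. */
node_t *best_fit(node_t *head, int request) {
    assert(request > 0);
    node_t *best = NULL;
    for (node_t *n = head; n != NULL; n = n->next)
        if (n->size >= request && (best == NULL || n->size < best->size))
            best = n;
    return best;
}

/* Worst fit: scan the whole list, keep the largest chunk that fits. */
node_t *worst_fit(node_t *head, int request) {
    assert(request > 0);
    node_t *worst = NULL;
    for (node_t *n = head; n != NULL; n = n->next)
        if (n->size >= request && (worst == NULL || n->size > worst->size))
            worst = n;
    return worst;
}
```

For a free list with chunks of 10, 30, and 20 bytes and a 15-byte request, first fit and worst fit both pick the 30-byte chunk, while best fit picks the 20-byte chunk and leaves only a 5-byte remainder after splitting.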


### **SEGREGATED LISTS**

- For popular-sized requests, e.g., for kernel objects such as locks, inodes, etc.
- Manage as segregated free lists
- Provide object caches: stores pre-initialized objects
- How much memory should be dedicated for specialized requests (object caches)?
- If a given cache is low in memory, can request "slabs" of memory from the general allocator for caches.
- General allocator will reclaim slabs when not used








### **PAGING**

- Split up address space of process into <u>fixed sized pieces</u> called pages
- Alternative to <u>variable sized pieces</u> (Segmentation) which suffers from significant fragmentation
- Physical memory is split up into an array of fixed-size slots called page frames.
- Each process has a page table which translates virtual addresses to physical addresses


### **ADVANTAGES OF PAGING**

- Flexibility
  - Abstracts the process address space into pages
  - No need to track direction of HEAP / STACK growth
    - Just add more pages...
  - No need to store unused space
    - As with segments...
- Simplicity
  - Pages and page frames are the same size
  - Easy to allocate and keep a free list of pages

### **PAGING: EXAMPLE**

- Consider a 128-byte physical address space with 16-byte pages
- Consider a 64-byte program address space
- Page table: VP0 → PF3, VP1 → PF7, VP2 → PF5, VP3 → PF2

A Simple 64-byte Address Space

| Virtual page | Virtual addresses |
|--------------|-------------------|
| page 0       | 0-15              |
| page 1       | 16-31             |
| page 2       | 32-47             |
| page 3       | 48-63             |

64-Byte Address Space Placed In Physical Memory

| Page frame | Physical addresses | Contents        |
|------------|--------------------|-----------------|
| 0          | 0-15               | reserved for OS |
| 1          | 16-31              | (unused)        |
| 2          | 32-47              | page 3 of AS    |
| 3          | 48-63              | page 0 of AS    |
| 4          | 64-79              | (unused)        |
| 5          | 80-95              | page 2 of AS    |
| 6          | 96-111             | (unused)        |
| 7          | 112-127            | page 1 of AS    |
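
The translation for this example (VP0 → PF3, VP1 → PF7, VP2 → PF5, VP3 → PF2, with 16-byte pages) can be worked through in C. The constant and function names below are hypothetical; this is a sketch of the arithmetic, not of any real MMU.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE  16u
#define PAGE_SHIFT 4u   /* log2(PAGE_SIZE) */
#define NUM_PAGES  4u   /* 64-byte address space / 16-byte pages */

/* Index = virtual page number, value = physical frame number. */
static const uint8_t page_table[NUM_PAGES] = { 3, 7, 5, 2 };

uint32_t translate_example(uint32_t vaddr) {
    assert(vaddr < NUM_PAGES * PAGE_SIZE);
    uint32_t vpn    = vaddr >> PAGE_SHIFT;        /* upper 2 bits */
    uint32_t offset = vaddr & (PAGE_SIZE - 1u);   /* lower 4 bits, unchanged */
    return ((uint32_t)page_table[vpn] << PAGE_SHIFT) | offset;
}
```

Virtual address 21 is page 1, offset 5. Page 1 lives in frame 7, so the physical address is 7 x 16 + 5 = 117.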





### PAGING DESIGN QUESTIONS

- (1) Where are page tables stored?
- (2) What are the typical contents of the page table?
- (3) How big are page tables?
- (4) Does paging make the system too slow?


### (1) WHERE ARE PAGE TABLES STORED?

- Example:
  - Consider a 32-bit process address space (up to 4GB)
  - With 4 KB pages
  - 20 bits for VPN (2<sup>20</sup> pages)
  - 12 bits for the page offset (2<sup>12</sup> unique bytes in a page)
- Page tables for each process are stored in RAM
  - Support potential storage of 2<sup>20</sup> translations
    - = 1,048,576 pages per process
  - Each page has a page table entry size of 4 bytes
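
The 20-bit / 12-bit split of a 32-bit virtual address amounts to a shift and a mask. The helper names in this sketch are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define OFFSET_BITS 12u                          /* 4 KB pages */
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1u)   /* 0xFFF */
#define VPN_BITS    20u

/* Upper 20 bits: which page. */
uint32_t vpn_of(uint32_t vaddr)    { return vaddr >> OFFSET_BITS; }

/* Lower 12 bits: which byte within the page. */
uint32_t offset_of(uint32_t vaddr) { return vaddr & OFFSET_MASK; }

/* Recombine a page number and offset into a 32-bit address. */
uint32_t make_addr(uint32_t vpn, uint32_t offset) {
    assert(vpn < (1u << VPN_BITS) && offset <= OFFSET_MASK);
    return (vpn << OFFSET_BITS) | offset;
}
```

For the address `0x12345678`, the VPN is `0x12345` and the offset is `0x678`. The largest possible VPN is `0xFFFFF`, which is entry 1,048,575 of the 2<sup>20</sup> entries.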


### PAGE TABLE EXAMPLE

- With 2<sup>20</sup> slots in our page table for a single process
- Each slot dereferences a VPN
- Provides physical frame number
- Each slot requires 4 bytes (32 bits)
  - 20 for the PFN on a 4GB system with 4KB pages
  - 12 for the offset which is preserved
  - (note we have no status bits, so this is unrealistically small)




- How much memory to store page table for 1 process?
  - 4,194,304 bytes (or 4MB) to index one process


### NOW FOR AN ENTIRE OS

- If 4 MB is required to store the page table for one process
- Consider how much memory is required for an entire OS
  - With, for example, 100 processes...
- Page table memory requirement is now 4 MB x 100 = 400 MB
- If the computer has 4 GB of memory (maximum for 32 bits), the page tables consume about 10% of memory
  - 400 MB / 4,096 MB ≈ 9.8%

Is this efficient?
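
The arithmetic behind these numbers can be checked with a short C sketch. The function names are hypothetical, and the model assumes one linear (single-level) page table per process.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for one linear page table:
 * (number of virtual pages) x (bytes per page table entry). */
uint64_t page_table_bytes(unsigned addr_bits, unsigned page_bits,
                          unsigned pte_bytes) {
    assert(addr_bits <= 63 && page_bits <= addr_bits);
    uint64_t entries = (uint64_t)1 << (addr_bits - page_bits);
    return entries * pte_bytes;
}

/* Percent of physical memory used by the page tables of nprocs processes. */
double page_table_percent(uint64_t table_bytes, unsigned nprocs,
                          uint64_t phys_bytes) {
    assert(phys_bytes > 0);
    return 100.0 * (double)(table_bytes * nprocs) / (double)phys_bytes;
}
```

`page_table_bytes(32, 12, 4)` gives 4,194,304 bytes (4 MB). For 100 processes on a 4 GB machine, `page_table_percent` gives roughly 9.8%.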






### **PAGE TABLE ENTRY - 2**

- Common flags:
- Valid Bit: Indicating whether the particular translation is valid.
- Protection Bit: Indicating whether the page could be read from, written to, or executed from
- Present Bit: Indicating whether this page is in physical memory or on disk (swapped out)
- Dirty Bit: Indicating whether the page has been modified since it was brought into memory
- Reference Bit (Accessed Bit): Indicating that a page has been accessed
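
One way to picture these flags is as bits packed alongside the PFN in a 32-bit entry. The layout below is purely hypothetical (20-bit PFN in the upper bits, flags in the low 12 bits) and is not taken from any particular architecture. The names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t pte_t;

/* Hypothetical flag positions within the low 12 bits of the entry. */
enum {
    PTE_VALID    = 1u << 0,
    PTE_PRESENT  = 1u << 1,
    PTE_READ     = 1u << 2,
    PTE_WRITE    = 1u << 3,
    PTE_EXEC     = 1u << 4,
    PTE_DIRTY    = 1u << 5,
    PTE_ACCESSED = 1u << 6
};

#define PTE_PFN_SHIFT 12u

/* Build an entry from a 20-bit PFN and a set of flags. */
pte_t pte_make(uint32_t pfn, uint32_t flags) {
    assert(pfn < (1u << 20) && flags < (1u << PTE_PFN_SHIFT));
    return (pfn << PTE_PFN_SHIFT) | flags;
}

uint32_t pte_pfn(pte_t pte) { return pte >> PTE_PFN_SHIFT; }

/* True only if every requested flag is set. */
bool pte_has(pte_t pte, uint32_t flags) { return (pte & flags) == flags; }

/* Model a store to the page: mark it accessed and dirty. */
pte_t pte_mark_written(pte_t pte) { return pte | PTE_ACCESSED | PTE_DIRTY; }
```

With this layout, a 4-byte entry holds a 20-bit PFN and still has 12 bits left over for status flags.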


### (3) HOW BIG ARE PAGE TABLES?

- Page tables are too big to store on the CPU
- Page tables are stored using physical memory
- Paging supports efficiently storing a sparsely populated address space
  - Reduced memory requirement
    - Compared to base and bounds, and segments


### (4) DOES PAGING MAKE THE SYSTEM TOO SLOW?

- Translation
- Issue #1: Starting location of the page table is needed
  - HW Support: Page-table base register
    - Stores the starting address of the active process's page table
    - Facilitates translation
  - Page table (stored in RAM): VP0 → PF3, VP1 → PF7, VP2 → PF5, VP3 → PF2

- Issue #2: Each memory address translation for paging requires an extra memory reference
  - HW Support: TLBs (Chapter 19)


### **PAGING MEMORY ACCESS**

```
// Extract the VPN from the virtual address
VPN = (VirtualAddress & VPN_MASK) >> SHIFT

// Form the address of the page-table entry (PTE)
PTEAddr = PTBR + (VPN * sizeof(PTE))

// Fetch the PTE
PTE = AccessMemory(PTEAddr)

// Check if process can access the page
if (PTE.Valid == False)
        RaiseException(SEGMENTATION_FAULT)
else if (CanAccess(PTE.ProtectBits) == False)
        RaiseException(PROTECTION_FAULT)
else
        // Access is OK: form physical address and fetch it
        offset = VirtualAddress & OFFSET_MASK
        PhysAddr = (PTE.PFN << PFN_SHIFT) | offset
        Register = AccessMemory(PhysAddr)
```
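
The same control flow can be simulated in ordinary C. This is a sketch under stated assumptions: a toy machine with 16-byte pages, a 4-page address space, and 128 bytes of physical memory. Exceptions are modeled as return codes, and the page table contents are made up so that each branch can be exercised (page 2 is unreadable, page 3 is invalid). All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_PAGE_SHIFT 4u
#define SIM_PAGE_SIZE  (1u << SIM_PAGE_SHIFT)
#define SIM_NUM_VPAGES 4u
#define SIM_NUM_FRAMES 8u

typedef struct {
    bool    valid;     /* is this translation in use? */
    bool    readable;  /* stand-in for the protection bits */
    uint8_t pfn;       /* physical frame number */
} sim_pte_t;

typedef enum { ACCESS_OK, ACCESS_SEGFAULT, ACCESS_PROTFAULT } access_t;

static uint8_t phys_mem[SIM_NUM_FRAMES * SIM_PAGE_SIZE];

/* Illustrative page table, indexed by VPN. */
static sim_pte_t sim_page_table[SIM_NUM_VPAGES] = {
    { true,  true,  3 },
    { true,  true,  7 },
    { true,  false, 5 },
    { false, false, 0 },
};

/* Load one byte from a virtual address, mirroring the steps above. */
access_t sim_load(uint32_t vaddr, uint8_t *out) {
    assert(out != NULL);
    uint32_t vpn = vaddr >> SIM_PAGE_SHIFT;
    if (vpn >= SIM_NUM_VPAGES)
        return ACCESS_SEGFAULT;            /* outside the address space */
    sim_pte_t pte = sim_page_table[vpn];   /* the extra memory reference */
    if (!pte.valid)
        return ACCESS_SEGFAULT;
    if (!pte.readable)
        return ACCESS_PROTFAULT;
    uint32_t offset = vaddr & (SIM_PAGE_SIZE - 1u);
    uint32_t paddr  = ((uint32_t)pte.pfn << SIM_PAGE_SHIFT) | offset;
    *out = phys_mem[paddr];                /* the actual data access */
    return ACCESS_OK;
}
```

Every successful load touches memory twice, once for the page table entry and once for the data, which is the overhead that Issue #2 refers to.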





### PAGING SYSTEM EXAMPLE

- Consider a 4GB Computer:
- With a 4096-byte page size (4KB)
- How many pages would fit in physical memory?
- Now consider a page table:
- For the page table entry, how many bits are required for the VPN?
- If we assume the use of 4-byte (32 bit) page table entries, how many bits are available for status bits?
- How much space does this page table require? (Page table entry size x number of pages)
- How many page tables (for user processes) would fill the entire 4GB of memory?




### **OBJECTIVES**

- Chapter 19
  - TLB Algorithm
  - TLB Tradeoffs
  - TLB Context Switch


### TRANSLATION LOOKASIDE BUFFER

- Legacy name...
- Better name, "Address Translation Cache"
- TLB is an on-CPU cache of address translations
  - virtual → physical memory












### TLB - ADDRESS TRANSLATION CACHE

- Key detail: for a TLB miss, we first access the page table in RAM to populate the TLB... we then requery the TLB
- All address translations go through the TLB
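
The miss-then-requery behavior can be sketched as a small software model. Assumptions made for illustration: a 2-entry fully associative TLB with round-robin replacement, 16-byte pages, and a 4-entry page table (VP0 → PF3, VP1 → PF7, VP2 → PF5, VP3 → PF2). Real TLBs are hardware structures, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES    2u
#define TLB_PAGE_SHIFT 4u
#define TLB_NUM_VPAGES 4u

typedef struct { bool valid; uint32_t vpn; uint32_t pfn; } tlb_entry_t;

static const uint32_t tlb_page_table[TLB_NUM_VPAGES] = { 3, 7, 5, 2 };
static tlb_entry_t tlb[TLB_ENTRIES];
static unsigned tlb_next;               /* round-robin victim index */
static unsigned tlb_hits, tlb_misses;

static bool tlb_lookup(uint32_t vpn, uint32_t *pfn) {
    for (unsigned i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *pfn = tlb[i].pfn;
            return true;
        }
    }
    return false;
}

static void tlb_insert(uint32_t vpn, uint32_t pfn) {
    tlb[tlb_next] = (tlb_entry_t){ true, vpn, pfn };
    tlb_next = (tlb_next + 1u) % TLB_ENTRIES;
}

uint32_t tlb_translate(uint32_t vaddr) {
    uint32_t vpn = vaddr >> TLB_PAGE_SHIFT;
    uint32_t pfn = 0;
    assert(vpn < TLB_NUM_VPAGES);
    if (tlb_lookup(vpn, &pfn)) {
        tlb_hits++;
    } else {
        tlb_misses++;
        tlb_insert(vpn, tlb_page_table[vpn]);  /* page table access in RAM */
        bool found = tlb_lookup(vpn, &pfn);    /* requery the TLB */
        assert(found);
        (void)found;
    }
    return (pfn << TLB_PAGE_SHIFT) | (vaddr & ((1u << TLB_PAGE_SHIFT) - 1u));
}
```

Translating addresses 21 and then 22 produces one miss followed by one hit, because both addresses fall on virtual page 1. Every translation, hit or miss, gets its final PFN from the TLB.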









