

















OBJECTIVES - 5/30

Questions from 5/30
Assignment 2 - May 31 (June 4- no late penalty)
Assignment 3: (Tutorial) Introduction to Linux Kernel Modules
Memory Segmentation Activity + answers (available in Canvas)
Quiz 4 - Page Tables - Due June 6 @ 11:59am
Final exam - Thursday June 6 @ 3:40pm
Tutorial 3 - File Systems (Optional, Extra Credit)
Chapter 21/22: Beyond Physical Memory
Swapping Mechanisms, Swapping Policies
Ch. 36 I/O Devices, Ch. 37 Hard Disk Drives
Practice Final Exam


```
ASSIGNMENT 3:
INTRODUCTION TO LINUX KERNEL MODULES

Assignment 3 provides an introduction to kernel programming by demonstrating how to create a Linux Kernel Module

Kernel modules are commonly used to write device drivers and can access protected operating system data structures

For example: Linux task_struct process data structure

Assignment 3 Survey - select grade category:

Assignment category (45%)

Quizzes / Activities / Tutorials category (15%)

Lowest two grades in this category are dropped

May 30, 2024

TCSS422: Operating Systems [Spring 2024]
School of Engineering and Technology, University of Washington - Tacoma
```


Slides by Wes J. Lloyd






LFU
■ LFU: Least frequently used
Always replace page with the fewest # of accesses (front)
Incorporates frequency of use - must track page accesses
Consider frequency of page accesses
0 1 2 0 1 3 0 3 1 2 1
What is the hit/miss ratio?
Hit/miss ratio = 6 hits
May 28, 2024
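The LFU trace above can be checked with a short simulation. This is a minimal sketch, not the slide's own method: the cache size of 3 pages is an assumption (the extracted slide text does not state it), and `simulate_lfu` with its tie-breaking rule is a hypothetical helper written for illustration.

```python
from collections import defaultdict


def simulate_lfu(accesses, cache_size):
    """Return (hits, misses) for an LFU cache over a page access sequence."""
    cache = set()
    freq = defaultdict(int)  # access count per page, kept even after eviction
    hits = misses = 0
    for page in accesses:
        if page in cache:
            hits += 1
        else:
            misses += 1
            if len(cache) >= cache_size:
                # Evict the resident page with the fewest accesses.
                # Ties go to the lowest page number (an arbitrary choice).
                victim = min(cache, key=lambda p: (freq[p], p))
                cache.remove(victim)
            cache.add(page)
        freq[page] += 1
    return hits, misses


trace = [0, 1, 2, 0, 1, 3, 0, 3, 1, 2, 1]
hits, misses = simulate_lfu(trace, cache_size=3)  # cache size is assumed
print(f"hits={hits} misses={misses}")
```

With a 3-page cache the simulation reports 6 hits, which matches the count on the slide.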











With small cache sizes, for the looping sequential workload, why do FIFO and LRU fail to provide cache hits?
■ Unpredictable accesses require a random cache replacement policy for cache hits
■ Accesses are too spread apart temporally to benefit from caching
■ Unlike Random cache replacement, both FIFO and LRU fail to speculate memory accesses in advance to improve caching
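A small experiment may help when thinking about this question. The sketch below is illustrative only: `lru_hits` and `fifo_hits` are hypothetical helpers, and the workload (5 pages looped 10 times) and cache sizes are assumed values, not taken from the slides.

```python
from collections import OrderedDict, deque


def lru_hits(accesses, cache_size):
    """Count hits for an LRU cache."""
    cache = OrderedDict()  # least recently used page is first
    hits = 0
    for page in accesses:
        if page in cache:
            hits += 1
            cache.move_to_end(page)  # mark as most recently used
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict least recently used
            cache[page] = True
    return hits


def fifo_hits(accesses, cache_size):
    """Count hits for a FIFO cache."""
    queue = deque()  # oldest loaded page is at the left
    resident = set()
    hits = 0
    for page in accesses:
        if page in resident:
            hits += 1
        else:
            if len(queue) >= cache_size:
                resident.remove(queue.popleft())  # evict oldest page
            queue.append(page)
            resident.add(page)
    return hits


loop = list(range(5)) * 10  # pages 0..4 accessed in order, 10 times
print(lru_hits(loop, 4), fifo_hits(loop, 4))  # cache one page too small
print(lru_hits(loop, 5), fifo_hits(loop, 5))  # cache fits the whole loop
```

When the cache is one page smaller than the loop, both policies evict each page shortly before it is needed again, so neither records a hit. Once the cache holds the whole loop, only the first pass misses.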




IMPLEMENTING LRU - 2

Harness the Page Table Entry (PTE) Use Bit

HW sets to 1 when page is used

OS sets to 0

Clock algorithm (approximate LRU)

Refer to pages in a circular list

Clock hand points to current page

Loops around

IF USE\_BIT=1 set to USE\_BIT = 0

IF USE\_BIT=0 replace page
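The clock sweep described above can be sketched as follows. This is a simplified model under stated assumptions: `ClockCache` is a hypothetical class, the "hardware" setting of the use bit is simulated in `access`, and a newly loaded page starts with its use bit set to 1.

```python
class ClockCache:
    """Approximate LRU using one use bit per frame and a clock hand."""

    def __init__(self, num_frames):
        self.frames = [None] * num_frames  # page held by each frame
        self.use_bit = [0] * num_frames
        self.hand = 0  # index of the frame the clock hand points to

    def access(self, page):
        """Access a page. Return True on a hit, False on a miss."""
        if page in self.frames:
            # Simulates the hardware setting the use bit on access.
            self.use_bit[self.frames.index(page)] = 1
            return True
        # Miss: advance the hand, clearing use bits, until a 0 is found.
        while self.use_bit[self.hand] == 1:
            self.use_bit[self.hand] = 0
            self.hand = (self.hand + 1) % len(self.frames)
        self.frames[self.hand] = page  # replace the page in this frame
        self.use_bit[self.hand] = 1    # assumption: loading counts as a use
        self.hand = (self.hand + 1) % len(self.frames)
        return False


cache = ClockCache(3)
results = [cache.access(p) for p in [0, 1, 2, 0, 3, 1, 4]]
print(results, cache.frames)
```

The sweep always terminates, because one full pass clears every use bit.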


CLOCK ALGORITHM

Not as efficient as LRU, but better than other replacement algorithms that do not consider history

[Figure: The 80-20 Workload, comparing replacement policies including OPT and LRU]

CLOCK ALGORITHM - 2

Consider dirty pages in cache
If DIRTY (modified) bit is FALSE
No cost to evict page from cache

If DIRTY (modified) bit is TRUE
Cache eviction requires updating memory
Contents have changed

Clock algorithm should favor no cost eviction
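One way to favor no-cost evictions is to look first for a page that is both unused and clean, and only then accept a dirty page. The sketch below is an assumption about how this could be done, not the slide's specification. It resembles what is often called the enhanced second-chance algorithm, and `pick_victim` and the frame dictionaries are hypothetical.

```python
def pick_victim(frames, hand):
    """Choose a frame to evict, preferring clean pages.

    frames: list of dicts with integer 'use' and 'dirty' bits.
    hand:   index where the clock hand currently points.
    Returns (victim_index, new_hand).
    """
    n = len(frames)
    while True:
        # Pass 1: a page with use == 0 and dirty == 0 costs nothing to evict.
        for step in range(n):
            i = (hand + step) % n
            if frames[i]["use"] == 0 and frames[i]["dirty"] == 0:
                return i, (i + 1) % n
        # Pass 2: accept a dirty page with use == 0, clearing use bits
        # of the pages passed over (their second chance).
        for step in range(n):
            i = (hand + step) % n
            if frames[i]["use"] == 0:
                return i, (i + 1) % n
            frames[i]["use"] = 0
        # All use bits are now 0, so the next iteration must find a victim.


frames = [
    {"use": 1, "dirty": 0},
    {"use": 0, "dirty": 1},
    {"use": 0, "dirty": 0},
]
print(pick_victim(frames, hand=0))
```

In this example frame 2 is chosen over frame 1, since evicting frame 1 would first require writing its modified contents back.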


 OTHER SWAPPING POLICIES

Page swaps / writes
Group/cluster pages together
Collect pending writes, perform as batch
Grouping disk writes helps amortize latency costs

Thrashing
Occurs when system runs many memory intensive processes and is low in memory
Everything is constantly swapped to-and-from disk







**OBJECTIVES**
Chapter 36
■ I/O: Polling vs Interrupts
Programmed I/O (PIO)
Port-mapped I/O (PMIO)
Memory-mapped I/O (MMIO)
Direct memory access (DMA)



**COMPUTER SYSTEM ARCHITECTURE**
[Figure: system architecture showing a General I/O Bus (e.g., PCI) with graphics, and a Peripheral I/O Bus (e.g., SCSI, SATA, USB)]
FAST: High-speed devices (e.g., video) are connected via a general I/O bus







**OS DEVICE INTERACTION**
Common example of device interaction

Poll - Is device available?
    while (STATUS == BUSY)
        ; // wait until device is not busy
Command parameterization
    write data to data register
Send command
    write command to command register
    Doing so starts the device and executes the command
Poll - Is device done?
    while (STATUS == BUSY)
        ; // wait until device is done
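The four-step protocol above can be exercised against a simulated device. This is only a sketch: `FakeDevice`, its registers, and the fixed number of busy status reads are all hypothetical stand-ins for real hardware registers.

```python
class FakeDevice:
    """Hypothetical device that reports BUSY for a fixed number of status reads."""

    BUSY, READY = "BUSY", "READY"

    def __init__(self, busy_reads=3):
        self.busy_reads = busy_reads
        self._remaining = 0  # status reads left before the device is ready
        self.data = None
        self.log = []        # commands the device has executed

    @property
    def status(self):
        """Reading the status register; each read while busy counts down."""
        if self._remaining > 0:
            self._remaining -= 1
            return self.BUSY
        return self.READY

    def write_data(self, value):
        self.data = value

    def write_command(self, cmd):
        # Writing the command register starts the device working.
        self._remaining = self.busy_reads
        self.log.append((cmd, self.data))


def polled_write(dev, value):
    """Run the polling protocol; return the number of wasted polls."""
    polls = 0
    while dev.status == dev.BUSY:   # poll: is the device available?
        polls += 1
    dev.write_data(value)           # command parameterization
    dev.write_command("WRITE")      # send command
    while dev.status == dev.BUSY:   # poll: is the device done?
        polls += 1
    return polls


dev = FakeDevice(busy_reads=3)
print(polled_write(dev, 42), dev.log)
```

Each counted poll is a loop iteration in which the CPU does no useful work, which is the cost that interrupts are meant to avoid.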




INTERRUPTS VS POLLING
■ For longer waits, put process waiting on I/O to sleep
Context switch (C/S) to another process
■ When I/O completes, fire an interrupt to initiate C/S back
Advantage: better multi-tasking and CPU utilization
Avoids: unproductive CPU cycles (polling)
[Diagram of CPU utilization by interrupt; 1 : task 1, 2 : task 2]



**INTERRUPTS VS POLLING - 3**
■ Alternative: two-phase hybrid approach
Initially poll, then sleep and use interrupts
■ Issue: livelock problem
Common with network I/O
Many arriving packets generate many, many interrupts
Overloads the CPU!
No time to execute code, just interrupt handlers!
■ Livelock optimization
Coalesce multiple arriving packets (for different processes) into fewer interrupts
Must consider number of interrupts a device could generate

**DEVICE I/O**
■ To interact with a device we must send/receive DATA
■ Two general approaches:
Programmed I/O (PIO):
Port mapped I/O (PMIO)
Memory mapped I/O (MMIO)
Direct memory access (DMA)

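Transfer Modes

| Mode | Maximum transfer rate (MB/s) | Cycle time |
|------|------------------------------|------------|
| PIO | 5.2 | 383 ns |
| PIO | 8.3 | 240 ns |
| PIO | 11.1 | 180 ns |
| PIO | 16.7 | 120 ns |
| Single-word DMA | 2.1 | 960 ns |
| Single-word DMA | 4.2 | 480 ns |
| Multi-word DMA | 4.2 | 480 ns |
| Multi-word DMA | 13.3 | 150 ns |
| Multi-word DMA | 16.7 | 120 ns |
| Multi-word DMA | 20 | 100 ns |
| Ultra DMA | | 240 ns ÷ 2 |
| Ultra DMA | 25.0 | 160 ns ÷ 2 |
| Ultra DMA 2 (Ultra ATA/33) | 33.3 | 120 ns ÷ 2 |
| Ultra DMA | 44.4 | 90 ns ÷ 2 |
| Ultra DMA 4 (Ultra ATA/66) | | |
| Ultra DMA 5 (Ultra ATA/100) | 100 | 40 ns ÷ 2 |
| Ultra DMA 6 (Ultra ATA/133) | 133 | 30 ns ÷ 2 |
| Ultra DMA 7 (Ultra ATA/167) | 167 | 24 ns ÷ 2 |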


**PROGRAMMED I/O (PIO)**
■ CPU time is consumed performing I/O data movement (input/output)
■ CPU is occupied with meaningless work
■ With PIO the CPU is "over-burdened"
CPU   1 1 1 1 C C C 2 2 2 2 2 1 1 1
Disk  1 1 1 1 1
Diagram of CPU utilization

**PIO DEVICES**
■ Legacy serial ports
■ Legacy parallel ports
■ PS/2 keyboard and mouse
■ Legacy MIDI, joysticks
■ Old network interfaces



PORT MAPPED I/O (PMIO)
■ Device specific CPU I/O instructions
Follows a Complex Instruction Set (CISC) model (Intel): specific CPU instructions are used for device I/O
■ x86/x86-64: in and out instructions
outb, outw, outl



**DIRECT MEMORY ACCESS (DMA)**
Copy data in memory by offloading to "DMA controller"
■ Many devices (including CPUs) integrate DMA controllers
CPU gives DMA: memory address, size, and copy instruction
■ DMA performs I/O independent of the CPU
■ DMA controller generates CPU interrupt when I/O completes
[Diagram of CPU utilization by DMA; 1 : task 1, 2 : task 2, C : copy data from memory]



**DEVICE INTERACTION**
■ The OS must interact with a variety of devices
Example: Consider a file system that works across a variety of types of disks:
SCSI, IDE, USB flash drive, DVD, etc.
File system should be general purpose, where device specific I/O implementation details are abstracted
■ Device drivers use abstraction to provide general interfaces for vendor specific hardware
In Linux: block devices
















**HDD INTERFACE**
■ Writing disk sectors is atomic (512 bytes)
Sector writes are completely successful, or fail
■ Many file systems will read/write 4KB at a time
Linux ext3/4 default filesystem blocksize - 4096
Same as typical memory page size



**EXAMPLE: USDA SOIL EROSION MODEL WEB SERVICE (RUSLE2)**
■ Host ~2,000,000 small XML files totaling 9.5 GB on a ~20GB filesystem on a cloud-based Virtual Machine
With default inode ratio (4096 block size), only ~488,000 files will fit
Drive less than half full, but files will not fit!
■ HDDs support a minimum block size of 512 bytes
OS filesystems such as ext3/ext4 can support "finer grained" management at the expense of a larger catalog
Small inode ratio - inodes will consume a considerable % of disk space



**HDD INTERFACE - 2**
■ Torn write
When OS uses larger block size than HDD
Block writes not atomic - they SPAN multiple HDD sectors
Upon power failure only a portion of the OS block is written - can lead to data corruption...
■ HDD access
Sequential reads of sectors are fastest
Random sector reads are slow
Disk head must continuously jump to different tracks






## Concentric circle of sectors

Single side of platter contains 290 K tracks (2008)

Zones: groups of tracks with same # of sectors

Outer tracks have more sectors

EXAMPLE: SIMPLE DISK DRIVE

Single track disk
Head: one per surface of drive
Arm: moves heads across surface of platters

[Figure: A Single Track Plus A Head]




SINGLE-TRACK LATENCY:
THE ROTATIONAL DELAY

Rotational latency (T_rotation): time to rotate to desired sector

Average T_rotation is about half the time of a full rotation

How to calculate T_rotation from rpm
Calculate time for 1 rotation based on rpm
Convert rpm to rps, then take 1 / rps
Divide by two (average rotational latency)

7200rpm = 8.33ms per rotation /2= ~4.166ms
10000rpm = 6ms per rotation /2= ~3ms
15000rpm = 4ms per rotation /2= ~2ms
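The same arithmetic can be written as a short function. This is a minimal sketch; `avg_rotational_latency_ms` is a hypothetical helper name.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency in milliseconds for a given spindle speed."""
    rps = rpm / 60.0                  # convert rpm to rotations per second
    full_rotation_ms = 1000.0 / rps   # time for one full rotation
    return full_rotation_ms / 2.0     # on average, wait half a rotation


for rpm in (7200, 10000, 15000):
    print(f"{rpm} rpm: {avg_rotational_latency_ms(rpm):.3f} ms")
```

The results agree with the three worked values above, up to rounding in the last digit.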








TRACK SKEW

Sectors are offset across tracks to allow time for head to reposition for sequential reads

Without track skew, when head is repositioned sector would have already been passed

[Figure: Three Tracks: Track Skew Of 2]








**MODERN HDD SPECS**
■ See sample HDD configurations here: Up to 20 TB
https://www.westerndigital.com/products/data-centerdrives#hard-disk-hdd



SSTF - SHORTEST SEEK TIME FIRST
Disk scheduling - which I/O request to schedule next
■ Shortest Seek Time First (SSTF)
Order queue of I/O requests by nearest track
Issue the request to 21 → issue the request to 2
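SSTF ordering can be sketched as a greedy loop that always picks the pending request closest to the current head position. The function name `sstf_order`, the starting track, and the request queue below are hypothetical values chosen for illustration. They are track numbers, not the sector numbers used in the slide's figure.

```python
def sstf_order(head, requests):
    """Return the order in which SSTF would service the requested tracks."""
    pending = list(requests)
    order = []
    while pending:
        # Pick the request with the shortest seek distance from the head.
        nearest = min(pending, key=lambda track: abs(track - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest  # the head is now positioned at the serviced track
    return order


# Hypothetical example: head at track 50 with four queued requests.
print(sstf_order(50, [10, 55, 48, 90]))
```

Because the choice is greedy, a request far from the head can wait indefinitely if nearer requests keep arriving.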






















**QUESTION 3 - TWO-LEVEL PAGE TABLE**
Consider a computer with 1 GB (2^30) of physical memory where the page size is 1024 bytes (1 KB) (2^10). We would like to index memory pages using a two-level page table consisting of a page directory which refers to page tables which are created on demand to index the entire memory space.
For simplicity assume that 1 GB = 1000 MB, 1 MB = 1000 KB, 1 KB = 1000 bytes
(a) For a two-level page table, divide the VPN in half. How many bits are required for the page directory index (PDI) in a two-level scheme?
(b) How many bits are required for the page table index (PTI)?
(c) How many bits are required for an offset to address any byte in the 1 KB page?
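One way to approach parts (a) through (c) is to compute the bit widths directly. This sketch assumes that addresses are exactly wide enough to cover the 2^30 bytes of memory, and that both sizes are powers of two. `two_level_split` is a hypothetical helper, not part of the question.

```python
def two_level_split(memory_bytes, page_bytes):
    """Return (PDI bits, PTI bits, offset bits) for a two-level page table.

    Assumes both sizes are powers of two and that the address width
    equals log2(memory_bytes).
    """
    address_bits = memory_bytes.bit_length() - 1  # log2 for a power of two
    offset_bits = page_bytes.bit_length() - 1
    vpn_bits = address_bits - offset_bits         # bits left for the VPN
    pdi_bits = vpn_bits // 2                      # divide the VPN in half
    pti_bits = vpn_bits - pdi_bits
    return pdi_bits, pti_bits, offset_bits


pdi, pti, offset = two_level_split(2**30, 2**10)
print(f"PDI={pdi} bits, PTI={pti} bits, offset={offset} bits")
```

Under these assumptions the 30 address bits split into a 10-bit offset and a 20-bit VPN, which is then halved between the page directory index and the page table index.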












Q5 - 2
Consider the following free space list:
(a) Consider the **next fit** allocation strategy. For the free list above, how many comparison operations must be performed to identify a free chunk of 30 bytes?
(b) After the last free space identification, the chunk is split and the remaining free space is returned to the free space list. Now, consider the next fit allocation strategy. After finding a free space for the previous request, how many comparisons are required to identify a free chunk of 10 bytes?



