TDT4186: Operating Systems
# Introduction
# Processes
A process is an instance of a computer program that is being executed. It contains the program code and its current memory state. A CPU can only execute a single process at a time. Multiprogramming is a way for CPUs to fake concurrency by rapidly switching between different processes. The scheduler, using a scheduling algorithm, decides which process gets to use a CPU at any given time.
## Process control
### CPU Modes of execution
### Process interruption
||**Mechanism** || **Cause** || **Use** ||
|| Interrupt || External to the execution of the current instruction || Reaction to an async external event ||
|| Trap || Associated with the execution of the current instruction || Handling of an error or an exception condition||
|| Supervisor call || Explicit request || Call to an operating system function. ||
### Process switching
|| Clock interrupt || Time slice is up, and process is switched.||
|| I/O interrupt || OS determines that an I/O action is complete, and moves all processes which were waiting for that I/O to the ready queue. ||
## Scheduling
A scheduling algorithm can either be preemptive or non-preemptive. A non-preemptive algorithm lets the active process decide when it is done using the CPU, while a preemptive algorithm can decide to switch processes at any time, that is, it can interrupt the currently running process.
### Scheduling Algorithms in Batch Systems
First-Come First-Served
: Non-preemptive. Active processes are run until they complete or block. New or unblocked processes are put at the back of the ready queue. Maximizes CPU usage.
Shortest Job First
: Non-preemptive. Uses statistics to make educated guesses about the run time of processes (see the estimation sketch below). The ready queue is a priority queue prioritizing the shortest jobs first. Minimizes turnaround time.
Shortest Remaining Time Next
: Preemptive. Like Shortest Job First, except a new process may take the place of the active process if its remaining time is lower than that of the active process.
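Shortest Job First and Shortest Remaining Time Next both need an estimate of each process's next run time. A common way to produce such an estimate (an illustrative assumption here, not something these notes prescribe) is exponential averaging of previously observed bursts. A minimal C sketch:

```c
#include <stdio.h>

/* Exponential averaging of CPU burst lengths, one common way to "guess"
 * the next run time for SJF/SRTN. alpha close to 1 weights recent
 * behaviour heavily; alpha close to 0 weights history heavily. */
static double next_estimate(double prev_estimate, double last_burst, double alpha)
{
    return alpha * last_burst + (1.0 - alpha) * prev_estimate;
}

int main(void)
{
    double estimate = 10.0;                    /* initial guess, in ms (made up) */
    double bursts[] = { 6.0, 4.0, 6.0, 13.0 }; /* observed bursts (made up) */

    for (int i = 0; i < 4; i++) {
        estimate = next_estimate(estimate, bursts[i], 0.5);
        printf("after burst %d: estimate = %.2f ms\n", i + 1, estimate);
    }
    return 0;
}
```

With alpha = 0.5 the estimate quickly tracks recent behaviour while still smoothing out single outliers.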
### Scheduling Algorithms in Interactive Systems
Round-Robin Scheduling
: Preemptive. Each process is assigned a time interval, or quantum. Processes that run longer than their quantum are switched out. Switching also occurs when the active process blocks or terminates. The quantum must be tuned to strike the best balance between response time and CPU utilization.
Priority Scheduling
: Preemptive. Processes are given a priority score, assigned statically or dynamically. The ready queue is a priority queue on process priority. May use quantums.
Shortest Process Next
: Preemptive. Estimates which process is shortest, and prioritizes accordingly.
Guaranteed Scheduling
: Preemptive. When there are n processes, each process gets 1/n of the CPU time.
Lottery Scheduling
: Preemptive. Prioritize processes at (optionally weighted) random.
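Lottery scheduling is simple to sketch: each process holds some number of tickets (its weight), and at every quantum the scheduler draws a random winning ticket. A minimal C sketch with made-up ticket counts:

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

struct proc { const char *name; int tickets; };

/* Draw a random ticket and return the index of the process holding it. */
static int hold_lottery(const struct proc *procs, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += procs[i].tickets;

    int winner = rand() % total;   /* the winning ticket number */
    for (int i = 0; i < n; i++) {
        if (winner < procs[i].tickets)
            return i;
        winner -= procs[i].tickets;
    }
    return n - 1;                  /* not reached */
}

int main(void)
{
    struct proc procs[] = { { "A", 10 }, { "B", 5 }, { "C", 1 } };
    srand((unsigned) time(NULL));

    for (int slice = 0; slice < 5; slice++)
        printf("quantum %d goes to %s\n", slice, procs[hold_lottery(procs, 3)].name);
    return 0;
}
```

On average, process A above gets 10/16 of the quanta, B gets 5/16 and C gets 1/16.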
### Scheduling Algorithms in Real Time Systems
TBD.
## Threads
### Relationship between threads and processes
A process has the following:
- A virtual address space that holds the process image (program, data, stack).
- Protected access to processors, other processes (so it can communicate with other processes), files and I/O resources.
So what, then, is a thread? A process can hold one or more threads. A thread is an independent sequence of instructions being executed within the process. This means you can have one set of instructions for checking if there is a new message in your chat client, and another set of instructions which handles you sending a message. Each of these threads has the following:
- A thread execution state
- An execution stack, keeping track of where it is and what the rest of its instructions are
- Access to the memory and other resources of the process, shared with the other threads of the process
### Types of threads
There are different types of threads. Since thread management needs some degree of OS functionality, we must decide where that management lives and what the threads can access. There are therefore two broad categories of thread implementations:
#### User-level threads (ULT)
With pure ULT, all thread management is done by the application, typically through a threading library offered to the programmer. The kernel is not even aware that any threading is going on. The program usually starts with a single thread, then spawns more threads as it executes.
#### Kernel-level threads (KLT)
Windows is an example of a KLT system.
With KLT you have the obvious advantage that the kernel is aware of the threads, which means it can run threads of the same process on different processors. It also means that if one thread is blocked, the kernel can schedule another thread of the same process, so even if part of a program is waiting for, say, an I/O operation, you can still do other things, like interacting with the GUI.
Finally, KLT means you can multithread kernel routines.
On the other hand, transferring control from one thread to another within the same process requires a switch to kernel mode, which means that ULT is much faster (as it runs in user mode all the time).
#### Combined ULT-KLT
### Multicore and multithreading
According to Amdahl's law, if a fraction $f$ of a program can be parallelized while the rest is inherently serial, the speedup on $N$ processors is
$$\text{Speedup} = \frac{\text{time to execute program on a single processor}}{\text{time to execute program on N parallel processors}} = \frac{1}{(1-f) + \frac{f}{N}}$$
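As a worked example with made-up numbers: if $f = 0.9$ of a program can be parallelized and we run it on $N = 4$ processors,
$$\text{Speedup} = \frac{1}{(1 - 0.9) + \frac{0.9}{4}} = \frac{1}{0.1 + 0.225} = \frac{1}{0.325} \approx 3.1,$$
so even a mostly parallel program falls well short of the ideal speedup of 4 because of its serial fraction.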
## Concurrency
**Key terms in concurrency:**
|| Critical section || A section of code in a process which accesses shared resources, and which must not be executed while another process is in a corresponding section of code accessing the same resource. ||
|| Race condition || When the final result of a computation depends on which thread or process locks a resource first, and the relative timing between them. ||
### Deadlocks
A set of processes are deadlocked if each process in the set is waiting for an event that only another process in the set can cause. Four conditions must hold for there to be a deadlock:
|| Mutual exclusion || A resource can be assigned to at most one process at a time. ||
|| Hold and wait || Processes currently holding resources can request more resources. ||
|| No preemption || A process cannot have a resource taken away from it against its will. ||
|| Circular wait || There must be a circular chain of two or more processes, each waiting for a resource held by the next member of the chain. ||
Deadlock prevention can be done by preventing one or more of the four conditions above.
Removing a deadlock after it has occurred usually entails killing processes, or doing resource rollbacks.
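As a sketch of prevention: the circular-wait condition can be broken by making every process acquire resources in the same fixed global order. A minimal POSIX-threads sketch (the two mutexes and the ordering rule are illustrative assumptions, not from these notes):

```c
#include <pthread.h>

pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

/* Every thread that needs both resources must take lock_a before lock_b.
 * With a fixed global ordering there can be no circular chain of waits,
 * so the circular-wait condition (and hence deadlock) cannot arise. */
void *worker(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&lock_a);
    pthread_mutex_lock(&lock_b);
    /* ... use both shared resources ... */
    pthread_mutex_unlock(&lock_b);
    pthread_mutex_unlock(&lock_a);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```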
#### Methods for handling deadlocks
1. Allow system to enter deadlock and then recover.
2. Ensure that the system will never enter a deadlock.
3. Ignore the problem and pretend that deadlocks never occur in the system.
### Starvation
### Mutual exclusion
As mentioned in the introduction for concurrency: Mutual exclusion is a requirement for processes so that their read/write-operations do not interfere with each other. If two processes read or write from/to the same location in memory at the same time, then you'll have no clue about what you end up with there.
We'll look at a few hardware implementations of mutual exclusion:
#### Interrupt disabling
On a uniprocessor, a process can guarantee mutual exclusion by disabling interrupts before entering its critical section and re-enabling them afterwards: nothing can preempt it in between. This is costly, and it does not work when there are multiple processors.
#### Special machine instruction
The memory value is used as a key. A process that finds the memory value equal to, say, 0 may enter its critical section after exchanging the memory value with its own register key, which for some process may be 1. In this example the process inside its critical section now holds register key 0, while the memory location holds the value 1 after the exchange. When it's done, it resets the memory value.
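A minimal sketch of this exchange idea using C11 atomics (the variable and function names are mine, not from the notes): the lock word starts at 0 (free); a process atomically exchanges in a 1 and may only enter its critical section if it got a 0 back.

```c
#include <stdatomic.h>

atomic_int lock_word = 0;   /* the shared memory value: 0 = free, 1 = taken */

void enter_critical_section(atomic_int *lock)
{
    /* Atomically swap in 1. If the old value was 0 the lock was free and is
     * now ours; otherwise another process holds it, so busy-wait and retry. */
    while (atomic_exchange(lock, 1) != 0)
        ;   /* spin */
}

void leave_critical_section(atomic_int *lock)
{
    atomic_store(lock, 0);  /* reset the memory value so others may enter */
}
```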
### Process synchronization
#### Semaphores
A semaphore is an integer variable that is only manipulated through two atomic operations:
- `semWait(s)`, decrements the value. If the value becomes negative, the calling process blocks.
- `semSignal(s)`, increments the value. If the value is not positive, one process blocked on `semWait` is unblocked.
A variant of semaphores is the **binary semaphore**, which can only take the values 0 and 1. It has the following atomic operations:
- `semWaitB(s)`, if the semaphore is 0, the calling process is blocked. If it is 1, the value is set to 0 and the process continues.
- `semSignalB(s)`, if some process is blocked on the semaphore, one of them is unblocked. Otherwise the value is set to 1.
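As a concrete sketch, POSIX counting semaphores offer `sem_wait`/`sem_post`, which correspond to `semWait`/`semSignal` above. Initializing the semaphore to 1 makes it behave like a binary semaphore guarding a critical section:

```c
#include <semaphore.h>
#include <pthread.h>
#include <stdio.h>

sem_t mutex;            /* initialized to 1: behaves as a binary semaphore */
int shared = 0;

void *worker(void *arg)
{
    (void) arg;
    for (int i = 0; i < 100000; i++) {
        sem_wait(&mutex);   /* semWait: blocks while the value is 0 */
        shared++;           /* critical section */
        sem_post(&mutex);   /* semSignal: lets one waiter continue */
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    sem_init(&mutex, 0, 1);
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("shared = %d\n", shared);   /* always 200000 with the semaphore */
    return 0;
}
```

Without the semaphore the two threads would race on `shared` and the final value would be unpredictable.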
#### Monitors
A monitor has three characteristics:
1. The local data variables of the monitor are only accessible through the monitor's procedures, not by any external procedure.
2. A process enters the monitor by invoking one of its procedures.
3. Only one process may be executing in the monitor at a time.
Further, we need some way for a process to wait inside the monitor until something it needs has happened, and to alert waiting processes when it has. For this we have two operations on condition variables:
- `cwait(c)`, where c is some condition. The calling process is suspended until that condition is signalled.
- `csignal(c)`, resumes one process suspended after a `cwait` on that condition; if there is none, the signal is lost.
In the monitor we'll have queues for each wait-condition. When `csignal` is called on the condition, we take one process from the queue and let it do its stuff.
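C has no monitor construct, but the same pattern can be sketched with a pthread mutex (enforcing "only one process in the monitor at a time") and a condition variable playing the role of the per-condition queue with `cwait`/`csignal`. The names below are mine, not from the notes:

```c
#include <pthread.h>

/* "Monitor" state: the mutex ensures only one thread is inside at a time;
 * the condition variable plays the role of cwait/csignal. */
static pthread_mutex_t monitor_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  data_ready   = PTHREAD_COND_INITIALIZER;
static int ready = 0;

void monitor_wait_for_data(void)
{
    pthread_mutex_lock(&monitor_lock);           /* enter the monitor */
    while (!ready)
        pthread_cond_wait(&data_ready, &monitor_lock);  /* cwait(c) */
    ready = 0;                                   /* consume the data */
    pthread_mutex_unlock(&monitor_lock);         /* leave the monitor */
}

void monitor_signal_data(void)
{
    pthread_mutex_lock(&monitor_lock);
    ready = 1;
    pthread_cond_signal(&data_ready);            /* csignal(c): wake one waiter */
    pthread_mutex_unlock(&monitor_lock);
}
```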
#### Message passing
So we want to send messages between two processes, which means one needs to send the message, and another needs to receive it. There are many implementations of this, but it's usually something similar to:
- `send(destination, message)`
- `receive(source, message)`
- Blocking send and receive: Both are required to wait until message passing is done.
- Nonblocking send and receive: Both get to continue. Danger of losing messages, as a receive can have been executed before a send.
A message may have a header with type, source and destination, length and control information, like messages passed over the Internet. Control information can give information about which instruction the message is related to, or info about other messages coming in.
**Queueing discipline:** Here you have three options:
- FIFO
- Priority, e.g. by message type
- Let the receiver inspect the message queue and select which message to receive next
**Mutual exclusion:** We can implement mutual exclusion with message passing by having a mailbox for the resource. The mailbox starts out containing a single message. A process wanting the resource does a blocking receive on the mailbox; if the message is not available, the process is queued until it is. When the process is done with the resource, it sends the message back to the mailbox.
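A minimal single-process sketch of this token scheme using POSIX message queues (the queue name and token value are made up); with several cooperating processes, each would open the same queue and follow the same receive/use/send pattern:

```c
/* Compile on Linux with: cc mailbox.c -lrt */
#include <fcntl.h>
#include <mqueue.h>
#include <stdio.h>

#define BOX_NAME "/resource_box"   /* illustrative queue name */

int main(void)
{
    struct mq_attr attr = { .mq_maxmsg = 1, .mq_msgsize = 1 };
    mqd_t box = mq_open(BOX_NAME, O_CREAT | O_RDWR, 0600, &attr);

    char token = 'T';
    mq_send(box, &token, 1, 0);        /* mailbox starts with one message */

    /* Acquire the resource: receive blocks until the token is available. */
    char buf[1];
    mq_receive(box, buf, sizeof(buf), NULL);
    printf("got the token, using the resource\n");   /* critical section */

    /* Release: return the token so another process can receive it. */
    mq_send(box, buf, 1, 0);

    mq_close(box);
    mq_unlink(BOX_NAME);
    return 0;
}
```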
# Memory management
The essential requirement of memory management is to provide a memory abstraction which allows dynamically allocating portions of memory to processes. This section details different memory management strategies.
Terms in memory management:
|| Frame || A fixed-size block of main memory. ||
|| Page || A fixed-size block of data in secondary memory, which can be copied temporarily to a frame of main memory. ||
__Physical representation:__ Physically, memory structures are usually stored in a tree-like hierarchy, with pointers to the next nodes.
## Memory partitioning
### Fragmentation
Fragmentation describes a situation where memory space is wasted due to some design decision. In other words: high fragmentation means a lot of wasted space in memory.
There are different ways to address locations in memory:
|| Logical address || A reference to a memory location independent of the current assignment of data to memory. Must be translated to a physical address. ||
|| Relative address || An address given relative to some known point, usually a value in a register. ||
|| Physical address || The absolute address of an actual location in main memory. ||
A process is given a private address space, which is a private area of memory denoted by a base address and a size limit. Processes use relative addresses which are translated to physical addresses by the OS.
### Memory management techniques
**No memory management technique:** The entire memory is addressable by every process. Usually a complete disaster when multiprogramming is involved. Therefore each process is given its own address space.
**Dynamic partitioning placement:** With dynamic partitioning, each process is allocated a contiguous block of exactly the size it needs. When a new process must be placed in a free block, common placement algorithms are (see the sketch below):
- Best fit: Place it in the free block closest in size to the request.
- First fit: Scan from the top, and place it in the first free block it fits in.
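The sketch below illustrates first fit over a made-up array of memory blocks (real allocators typically use free lists); best fit would instead scan all blocks and keep the smallest one that is still large enough:

```c
#include <stdio.h>

struct block { int start; int size; int free; };

/* First fit: scan from the top and take the first free block that is
 * large enough for the request. Returns the block index, or -1. */
static int first_fit(struct block *blocks, int n, int request)
{
    for (int i = 0; i < n; i++)
        if (blocks[i].free && blocks[i].size >= request)
            return i;
    return -1;   /* no free block is large enough */
}

int main(void)
{
    struct block mem[] = { { 0, 100, 0 }, { 100, 500, 1 }, { 600, 200, 1 } };
    int i = first_fit(mem, 3, 150);
    if (i >= 0)
        printf("placed request in block starting at %d\n", mem[i].start);
    return 0;
}
```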
**Buddy system:** In this system, memory blocks are available in sizes $2^K$, where $L \leq K \leq U$, and where
$$ 2^L = \text{smallest size block that is allocated} $$
$$ 2^U = \text{largest size block that is allocated} $$
This means that $2^U$ usually is equal to the memory space available for allocation.
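A small sketch of the size calculation: a request is rounded up to the nearest power of two within $[2^L, 2^U]$. The constants below are made up:

```c
#include <stdio.h>

#define L 4    /* smallest block: 2^4  = 16 bytes    (illustrative) */
#define U 10   /* largest block:  2^10 = 1024 bytes  (illustrative) */

/* Return the buddy-system block size used to satisfy a request, i.e. the
 * smallest power of two >= request, but at least 2^L.
 * Returns 0 if the request is larger than the whole 2^U space. */
static unsigned buddy_block_size(unsigned request)
{
    unsigned size = 1u << L;
    while (size < request) {
        if (size >= (1u << U))
            return 0;
        size <<= 1;
    }
    return size;
}

int main(void)
{
    printf("%u\n", buddy_block_size(100));  /* -> 128                 */
    printf("%u\n", buddy_block_size(5));    /* -> 16 (minimum block)  */
    printf("%u\n", buddy_block_size(2000)); /* -> 0 (too large)       */
    return 0;
}
```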
### Virtual Memory
To release the programmer and the user from the constraints of physical memory, we let them perceive a much larger memory. One part of it resides in main memory (called _real memory_), and another part is located on disk and is called _virtual memory_.
This is used to decrease swapping times, since only the pieces of a process that are actually needed have to be resident in main memory.
|| Virtual address || The address assigned to a location in virtual memory, so that it can be accessed as though it were part of main memory. ||
|| Virtual address space || The virtual storage assigned to a process. ||
|| Address space || The range of memory addresses available to a process. ||
|| Real address || The address of a storage location in main memory. ||
**Virtual memory paging:** The process is divided into fixed-size pages, which are loaded into frames of main memory only as they are needed.
**Virtual memory segmentation:** The process is divided into variable-size segments (e.g. code, data, stack), which are loaded into main memory as needed.
## Page Replacement Algorithms
Now we've looked at memory management techniques which can be used. Let's go further and look at how they combine with the actual swapping. You see, if a process in memory is not ready to be executed, it takes up space in memory. So you want to swap out processes that are not ready in favor of those that are; those that aren't ready are dumped to secondary storage.
Swapping takes time, especially when using an HDD instead of an SSD. There are several algorithms that can decide what to swap out for us, each with its own upsides and downsides.
### Not Recently Used (NRU)
Pages are given two flags: referenced and modified. Each time a page is read from, the referenced flag is set. Each time a page is written to, the modified flag is set. Every now and then (e.g. every 20 msec), a clock signal triggers the resetting of the referenced flag. Based on these flags, we can classify the pages at any given time in one of four classes:
||Class 0 || Not referenced, not modified.||
||Class 1 || Not referenced, modified.||
||Class 2 || Referenced, not modified.||
||Class 3 || Referenced, modified.||
NRU swaps out a random page from the lowest-numbered non-empty class.
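A sketch of the selection itself, assuming each page carries its referenced (R) and modified (M) flag so that the class number is simply $2R + M$; the page-table layout below is made up:

```c
#include <stdio.h>
#include <stdlib.h>

struct page { int referenced; int modified; };

/* Pick a victim according to NRU: a random page from the lowest-numbered
 * non-empty class, where class = 2*R + M. */
static int nru_victim(const struct page *pages, int n)
{
    for (int cls = 0; cls <= 3; cls++) {
        int candidates[64], count = 0;
        for (int i = 0; i < n && count < 64; i++)
            if (2 * pages[i].referenced + pages[i].modified == cls)
                candidates[count++] = i;
        if (count > 0)
            return candidates[rand() % count];
    }
    return -1;   /* no pages at all */
}

int main(void)
{
    struct page pages[] = { {1, 1}, {1, 0}, {0, 1}, {1, 1} };
    printf("evict page %d\n", nru_victim(pages, 4));  /* page 2: class 1 */
    return 0;
}
```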
### First-In, First-Out (FIFO)
Oldest page gets swapped out first.
### Least Recently Used (LRU)
Swaps out the page which has remained unused for the longest time.
### WSClock
```
t = some constant, e.g. 120 msec

on page fault:   // a page which is not in memory is requested; it gets loaded on the fly
    if current page is referenced:
        set current page as not referenced
        set current page to next page
        restart procedure
    else:
        if time since current page's last use is larger than t:
            if current page is modified:
                schedule disk write for current page
                set current page to next page
                restart procedure
            else:
                swap out current page
        else:
            // page was used within the last t: it is in the working set, keep it
            set current page to next page
            restart procedure
```
This means it will replace the first page encountered which has neither been referenced nor modified within the last time t. Pages before it that were referenced are marked as not referenced, and pages that are modified are scheduled for a disk write.
Additionally, we must consider the case where the current page pointer completes one entire circle in the circular list of frames.
In the case where at least one write has been scheduled, the algorithm simply continues until the write is complete, at which point the newly-written page will be ready for eviction.
On the other hand, if no writes have been scheduled, the first non-modified page will be swapped out. If there are no non-modified pages, the current page gets swapped out.
# File systems
A file system is an abstraction to store, retrieve and update a set of files. A file is a logical unit of information created by a process. Information stored in a file is persistent, and can be accessed by different processes.
## File Structure
There are many different ways to logically structure a file. Here is a list of three common structures:
Byte sequence
: An unstructured sequence of bytes; the OS doesn't know or care what is in the file. This is what UNIX and Windows use.
Record sequence
: Crap. Reads and writes entire fixed-size records at once. Old, punch card-based systems used this, but it is really unpopular now.
Tree
: Tree of variable size records, allows quick lookup on a specific key. Some mainframes use this.
## File Types
Operating systems often support different file types. Here are some common file types:
|| Regular files || Good ol' regular binary files. ||
|| Directories || System files for maintaining the structure of the file system. ||
|| Character special files || IO-devices masquerading as byte-sequenced files to abstract writing and reading to and from IO devices. ||
|| Block special files || IO-devices masquerading as record-sequenced files to abstract writing and reading to and from IO devices. ||
## Directories
A file system usually organizes its files in some sort of directory organization. Here are some different ways to organize files:
|| Single-Level Directory Systems || Every file is in one large root directory. Often used on simple embedded devices. ||
|| Hierarchical Directory Systems || Allows directories to contain other directories, creating a directory tree. Tree directory systems allow for sane organization of files. ||
## File System Layout
A file system is typically stored on a partition on a disk. Every partition starts with a boot block, but other than that, layout on disk can vary wildly from file system to file system.
|| Boot block || Here, the boot block contains administrative data about the file system, in addition to the bootstrapping program which is run if an operating system attempts to boot the partition. ||
|| File allocation table || The file allocation table (FAT) contains a list of all the clusters/chunks in the file data area, together with a pointer to the next cluster for each cluster. ||
|| Root directory || The root directory is the statically allocated logical root of the file system, containing files and subdirectories. ||
|| File data area || The file data area contains the rest of the data, organized in clusters as referenced from the file allocation table. ||
# I/O
## Organization of the I/O-function
With _programmed I/O_ the processor issues an I/O command, then busy-waits until the I/O operation completes. This is obviously inefficient.
With _interrupt-driven I/O_ the processor issues the I/O command. If the instruction is _nonblocking_, the process can continue executing and is interrupted when the I/O operation is done. If the operation is _blocking_, the process is put in a blocked state and is not scheduled again until the I/O operation completes.
### Direct memory access (DMA)
With _Direct memory access (DMA)_ you have a DMA module that controls the exchange of data between main memory and the I/O-module.
In this case the processor sends a transfer request to the DMA module, and is interrupted only when the DMA module is done and the entire block has been transferred.
There are three major designs you can use to implement DMA. You can have the DMA module, CPU and I/O modules all on one bus, but this is obviously not very efficient. Instead you can have one system bus, but give the DMA module a direct path to the I/O devices; this may be called integrated DMA and I/O.
A third option is to have one system bus, and a separate I/O bus between the DMA module and the I/O devices. That is, not a direct path to each I/O device, but one common bus for all of them.
## I/O Buffering
I/O also has buffering: the OS performs input transfers into a buffer in main memory in advance of requests, and performs output transfers from a buffer some time after the requests. This is done for two reasons:
- A program needing I/O must wait for the I/O to complete. This can be either by busy waiting or via interrupts, but it has to wait in both cases. I/O devices are relatively slow compared to the rest of the computer, so the wait can be very long.
- If data is transferred directly to or from a user process's memory, it interferes with the OS's swapping of processes: the OS cannot swap a process out of memory while some of its memory locations are locked for an ongoing block transfer.
It's important to note that I/O devices transfer and store data differently. Some devices are _block-oriented_, which means they store and transfer data in blocks; drives, USB keys and other secondary storage do this. Other devices are _stream-oriented_, which means they transfer data in and out as a stream of bytes; printers, communication ports (e.g. LAN), mice, keyboards and other pointing devices do this (and mostly everything that's not secondary storage). For stream-oriented devices you can transfer either a byte or a line at a time.
Now we have three ways of implementing this:
- With a _single buffer_ the OS assigns a part of main memory to the I/O operation when it is requested. It reads a block into the buffer, then transfers it to user space, and immediately starts reading the next block. This is called _reading ahead_.
- With a _double buffer_ the process can transfer to/from one buffer while the OS empties or fills the other. This means it can, e.g., read and write data at the same time.
- The third option is a _circular buffer_, which is simply more than two buffers.
By using buffering we can smooth out peaks in I/O demand, but no amount of buffering lets an I/O device keep up with a process indefinitely. It will increase performance, but it isn't perfect.
## Disk scheduling
### Disk properties
Disks are very slow, especially compared to other components in the computer.
|| Seek time || Time it takes to position the head at the right track. ||
|| Rotational delay || Time it takes for the appropriate sector to line up with the head. ||
|| Access time || Sum of seek time and rotational delay. ||
|| Transfer time || The time it takes to perform the read or write operation: $T = \frac{b}{rN}$, where $T$ is the transfer time, $b$ is the number of bytes to be transferred, $N$ is the number of bytes on a track, and $r$ is the rotation speed (rev/s). See the worked example below. ||
As data is often spread over different tracks (seemingly at random), you cannot rely on just improving the specs of the disks; we need some extra techniques to help us.
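As a worked example with made-up numbers: a disk spinning at 7200 rpm gives $r = 120$ rev/s; with $N = 256\,000$ bytes per track and a transfer of $b = 4096$ bytes,
$$T = \frac{b}{rN} = \frac{4096}{120 \cdot 256\,000} \approx 1.3 \cdot 10^{-4}\ \text{s} \approx 0.13\ \text{ms},$$
which is tiny compared to typical seek times of several milliseconds. This is why the scheduling policies below focus on reducing seeks.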
### Disk Scheduling policies
The system spends most of its time on seeks, since sector access involves selecting tracks more or less at random, so this is what we need to improve. How about helping the disk by letting the OS order the I/O requests in some way, so that the seek time is reduced?
Clever. There are several policies for this, falling into two groups: one group uses the requested item (where the data lies on disk) to decide which request to serve first, while the other group does not depend on the requested item at all. We'll start with the latter group.
#### Selection according to requestor
|| Random || A benchmark used to compare these policies is _random scheduling_, which takes a random I/O request and passes it to the disk. ||
|| First-in-first-out (FIFO) || A simple queue: new requests at the end, and the oldest is served first. This one's the fairest of them all, as the oldest always goes first. It performs well if the requests are to clustered file sectors, but it will often approximate random scheduling in performance. ||
|| Priority (PRI) || Perform I/O in order of the priority of the process which sent the request. This leaves control over I/O operations outside the disk queue management, so it's not used to optimize disk utilization, but rather when other goals of the system are more important. ||
|| Last in first out (LIFO) || A stack: new requests at the top, and the newest is served first. This means there'll be little to no arm movement (a process often sends successive requests to nearby sectors), which increases throughput. There is a possibility of starvation though, if the request load is high. ||
#### Selection according to requested item
|| Shortest service time first (SSTF) || Selects the I/O request which requires the least movement of the disk arm, i.e. always the minimum seek time from the current position (see the sketch after these tables). As the arm can go both ways, the distance may be the same to two requested locations; in that case it picks a direction at random. It performs better than FIFO, but it does not guarantee that the average seek time is minimal (it is greedy and only looks one step ahead). ||
|| SCAN || The arm only moves in one direction at a time, and satisfies all requests in that direction until it reaches the last track, then reverses. Prevents starvation. Performs similarly to SSTF unless the request pattern is very unusual, but it does not exploit locality as well as SSTF, as it is biased against the area it has most recently traversed. ||
|| SCAN with LOOK policy || Same as SCAN, but as the arm moves in a direction it does not have to go all the way to the last track; it can reverse at the last request in that direction. _It looks ahead for requests._ ||
SCAN (with or without LOOK) still has some problems: it favors the latest-arriving jobs, and it favors requests for tracks nearest to the innermost and outermost tracks. We need to look further. The following algorithms address these issues.
|| C-SCAN (circular SCAN) || Scans in one direction only, and jumps back to the start when it reaches the end. This reduces the maximum delay that new requests experience. With SCAN, if the expected time for a sweep from inner to outer track is $t$, the expected service interval for sectors at the periphery is $2t$; with C-SCAN the interval is on the order of $t + s_{\text{max}}$, where $s_{\text{max}}$ is the maximum seek time. ||
|| N-step-SCAN || Segments the request queue into subqueues of length $N$. Queues are processed one at a time using SCAN, and new requests are added to a new queue. Approaches SCAN in performance for large $N$; performs like FIFO for $N = 1$. ||
|| FSCAN || Two queues. When a SCAN starts, it serves all requests in one queue, while the other queue is empty and takes all new requests. The new requests thus have to wait for the next SCAN. ||
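As promised above, a minimal SSTF sketch: given the current head position and the pending track requests, serve the request with the smallest arm movement next. The request numbers are made up:

```c
#include <stdio.h>
#include <stdlib.h>

/* Shortest Service Time First: return the index of the pending request
 * closest to the current head position (minimum seek distance). */
static int sstf_next(const int *tracks, int n, int head)
{
    int best = 0;
    for (int i = 1; i < n; i++)
        if (abs(tracks[i] - head) < abs(tracks[best] - head))
            best = i;
    return best;
}

int main(void)
{
    int pending[] = { 98, 183, 37, 122, 14, 124, 65, 67 };
    int head = 53;
    int next = sstf_next(pending, 8, head);
    printf("head at %d, serve track %d next\n", head, pending[next]);  /* 65 */
    return 0;
}
```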
## I/O Caching
Disk cache is a buffer in main memory for disk sectors which exploits the principle of locality. When a request for a particular sector arrives, the OS checks if the requested data is in the cache; if it is, the request is serviced from the cache, and if it isn't, the sector is loaded into the cache. There's a huge design consideration when it comes to caching: which sector in the cache should be replaced?
|| **Name** || **Description** || **Implementation** ||
|| Least recently used (LRU) || Replace the least recently used, that is the block which has been in memory longest without any reference to it. || A stack of pointers. Top is the most recently used block, while the bottom is the least recently used. When a block is used, move it to the top of the stack. When you need to swap, remove the bottom one, and push the new one on the top of the stack. ||
|| Least frequently used (LFU) || Replace the least frequently used. || Have a counter on each block. Increment on use. ||
LFU may seem good, but it can be misleading. Say that, because of the principle of locality, a block is suddenly used a lot over a short period and then not at all. You then have a huge counter for a block which may not be used again for a long time. What now? This led to FBS, which combines LRU and LFU.
**Frequency-based stack (FBS)**
You have a stack as with LRU, and a count on each block as with LFU. The most recently used block is at the top of the stack, and the least recently used at the bottom. The stack is divided into three sections: "New", "Middle" and "Old". Only blocks in the "Old" section are eligible for replacement. Reference counts are not incremented for blocks in the "New" section; that is, a block's count is only incremented when it is referenced while in the "Middle" or "Old" section.
# Multicore and multicomputers
## UMA (uniform memory access)
Every memory word can be accessed by every CPU with the same access time.
## NUMA (non-uniform memory access)
All memory is still visible to all CPUs, but each CPU has local memory that is faster to access than memory attached to other CPUs.
## Multi CPU OS implementations
|| __Master/slave__ || One CPU is responsible for handing out tasks to the other CPUs. Scales badly, as message passing takes up more and more time and space on the bus as the number of CPUs increases. For small systems, however, it's a simple and workable model. ||
|| __To each their own__ || Each CPU has its own private OS (i.e. they can even be different), though the OS code can be shared. However, __a huge minus__ is the lack of a shared I/O buffer: if one CPU performs an I/O operation, other CPUs won't be able to, for instance, read from its I/O buffer and thereby share the workload. For this reason this model is rarely, if ever, used. ||
|| __Symmetric multiprocessor (SMP)__ || All CPUs share one copy of the same OS, and each can take on system tasks. Makes for a complicated OS (especially for a general __n__-CPU system), and synchronization issues pose a significant challenge. Still, this is the most widely used model, as it scales well. ||
## Synchronization
|| Test-and-set (TSL) instruction || Check to see if memory is locked. If not, lock it and then read or write, all treated as one atomic operation. ||
|| Peterson's algorithm || A software-only solution to mutual exclusion. ||
## Scheduling
Multiprocessor scheduling decides both which thread to run and which CPU to run it on.
### Time sharing
Makes use of a shared priority queue (which may become a bottleneck).
|| Smart scheduling || Critical threads (e.g. those holding a spin lock) are allowed to keep running instead of being switched out, so they finish their critical work quickly. ||
|| Affinity scheduling || Have same CPU run the same thread again for good cache and TLB use. ||
### Space sharing
- Related threads are run simultaneously on dedicated CPUs.
### Gang scheduling
Related threads run at the same time across the CPUs. Useful for cooperating threads.
## Intercommunication
### Message passing
### Process distribution
# Virtualization
A virtual machine is a software computer. It is less critical if it crashes, as it's just a program that can be killed and restarted.
|| Type 1 hypervisor || Runs directly on the hardware, without an OS underneath. When a guest OS makes a system call, it traps to the hypervisor. ||
|| Type 2 hypervisor || Runs on top of a host OS (VirtualBox, VMware). Runs normal instructions directly on the hardware, but has to emulate sensitive instructions. ||
|| Paravirtualization || The guest OS is modified so that sensitive instructions are replaced with hypercalls (calls to the hypervisor). ||