TDT4186: Operating Systems
# Introduction
An operating system (OS) is a collection of software that provides an abstract machine on top of a physical machine. The OS manages the machine's resources and makes the computer easier to use for both end users and programmers. Typical services provided by an OS include storage/file systems, process/thread management, security, and communication.
# Processes
A process is an instance of a computer program that is being executed. It contains the program code and its current memory state. A single CPU (core) can execute only one process at a time. Multiprogramming is a way for CPUs to fake concurrency by rapidly switching between different processes. The scheduler, using a scheduling algorithm, decides which process gets to use a CPU at any given time.
## Scheduling
Scheduling algorithms should be fair and balanced. For batch systems, the goal is to maximize process throughput, minimize turnaround time, and maximize CPU utilization. For interactive systems, the goal is to minimize response time, and ensure proportionality in latency (meeting the user's expectations). For real time systems, the goal is to meet deadlines and behave predictably.
A scheduling algorithm can be either preemptive or non-preemptive. A non-preemptive algorithm lets the running process decide when it is done using the CPU, while a preemptive algorithm can suspend the running process and switch to another at any time.
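As an illustration of preemptive scheduling, here is a minimal round-robin simulation in Python (a hypothetical sketch, not from the course material): each process runs for at most one fixed quantum before it is preempted and moved to the back of the ready queue.

```python
from collections import deque

def round_robin(bursts, quantum):
    """Simulate preemptive round-robin scheduling.

    bursts: dict mapping process name -> CPU time it needs (all arrive at t=0).
    quantum: time slice a process may run before the scheduler preempts it.
    Returns the completion time of each process.
    """
    queue = deque(bursts)            # ready queue, FIFO order
    remaining = dict(bursts)
    clock, finish = 0, {}
    while queue:
        p = queue.popleft()
        run = min(quantum, remaining[p])   # run until quantum expires or process finishes
        clock += run
        remaining[p] -= run
        if remaining[p] == 0:
            finish[p] = clock              # process is done
        else:
            queue.append(p)                # preempted: back of the ready queue
    return finish
```

With a quantum of 2, three processes needing 3, 5 and 2 time units interleave in slices of at most 2 before completing.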
### Scheduling Algorithms in Batch Systems
Common algorithms include first-come, first-served (FCFS), shortest job first (SJF), and shortest remaining time next.
### Scheduling Algorithms in Interactive Systems
Common algorithms include round-robin, priority scheduling, and multilevel feedback queues.
### Scheduling Algorithms in Real Time Systems
TBD.
# Memory Management
The essential requirement of memory management is to provide a memory abstraction which allows dynamically allocating portions of memory to a process. This section details different memory management strategies.
## No Memory Abstraction
The entire memory is addressable by every process. Usually a complete disaster when multiprogramming is involved, since any process can overwrite the memory of another.
## Address Spaces
A process is given a private address space, which is a private area of memory denoted by a base address and a size limit. Processes use relative addresses, which are translated to absolute addresses by the hardware using base and limit registers set up by the OS.
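The base-and-limit translation can be sketched in a few lines of Python (an illustrative sketch; real translation happens in hardware on every memory access):

```python
def translate(relative_addr, base, limit):
    """Translate a process-relative address to an absolute one,
    enforcing the bounds of the address space (base + limit)."""
    if not 0 <= relative_addr < limit:
        # The process tried to address outside its address space
        raise ValueError("segmentation fault: address outside address space")
    return base + relative_addr
```

A relative address of 100 in an address space loaded at 0x4000 translates to absolute address 0x4064; anything at or beyond the limit is rejected.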
## Swapping
Swapping dumps a non-running process' address space to disk to free up memory when the system runs out of space. Swapping takes time, especially when using an HDD instead of an SSD.
## Virtual Memory
To decrease swapping times, address spaces are split into fixed-size chunks called pages. Each page can then be swapped independently, even while the process is running. Accessing a page that is not currently in memory triggers a page fault, and the OS loads the page on the fly.
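Demand paging can be sketched as follows (a hypothetical toy model: pages are "loaded" into the next free frame, and we only count faults):

```python
class Pager:
    """Minimal demand-paging sketch: a page is loaded only when first touched."""
    def __init__(self):
        self.page_table = {}   # virtual page number -> frame number
        self.faults = 0

    def access(self, page):
        if page not in self.page_table:
            # Page fault: the page is not resident, so "load" it now
            self.faults += 1
            self.page_table[page] = len(self.page_table)  # next free frame
        return self.page_table[page]
```

Touching pages 0, 1, 0, 2, 1 causes exactly three faults: one per distinct page, and none on repeated accesses.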
## Physical Representation
Physically, page tables are usually stored in a tree-like hierarchy (a multi-level page table), where each node contains pointers to the nodes of the next level.
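A two-level page table walk can be sketched in Python (the 10/10/12 bit split mirrors the classic 32-bit x86 layout; the nested dicts stand in for page-table nodes):

```python
PAGE_BITS = 12    # 4 KiB pages -> 12 offset bits
INDEX_BITS = 10   # 10 index bits per level (32-bit virtual addresses)

def walk(root, vaddr):
    """Walk a two-level page table tree: root node -> second-level node -> frame."""
    top = (vaddr >> (PAGE_BITS + INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
    mid = (vaddr >> PAGE_BITS) & ((1 << INDEX_BITS) - 1)
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    second_level = root[top]       # follow the pointer to the next tree node
    frame = second_level[mid]      # leaf entry holds the physical frame number
    return frame * (1 << PAGE_BITS) + offset
```

The tree only needs nodes for the parts of the address space actually in use, which is why the hierarchy saves memory compared to one flat table.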
## Page Replacement Algorithms
### Not Recently Used (NRU)
Pages are given two flags: referenced and modified. Each time a page is read from, the referenced flag is set. Each time a page is written to, the modified flag is set. Every now and then (e.g. every 20 msec), a clock signal triggers the resetting of the referenced flag on every page. Based on these flags, we can classify the pages at any given time in one of four classes:
||Class 0 || Not referenced, not modified.||
||Class 1 || Not referenced, modified.||
||Class 2 || Referenced, not modified.||
||Class 3 || Referenced, modified.||
NRU swaps out a random page from the lowest-numbered non-empty class.
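The classification and victim selection above can be sketched directly (an illustrative sketch; the flag pairs would come from the page table in a real OS):

```python
import random

def nru_victim(pages):
    """Pick an NRU victim: a random page from the lowest-numbered non-empty class.

    pages: dict mapping page id -> (referenced, modified) flag pair (0 or 1).
    Class number = 2 * referenced + modified.
    """
    classes = {0: [], 1: [], 2: [], 3: []}
    for page, (referenced, modified) in pages.items():
        classes[2 * referenced + modified].append(page)
    for c in range(4):
        if classes[c]:
            return random.choice(classes[c])
    raise ValueError("no pages to evict")
```

Note that a class-1 page (not referenced, but modified) is preferred over a class-2 page: not having been referenced recently matters more than being dirty.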
### First-In, First-Out (FIFO)
Oldest page gets swapped out first.
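A FIFO fault counter is a few lines (a sketch over a reference string of page numbers, with a fixed number of frames):

```python
from collections import deque

def fifo_faults(reference_string, frame_count):
    """Count page faults under FIFO replacement."""
    frames, queue, faults = set(), deque(), 0
    for page in reference_string:
        if page in frames:
            continue                          # hit: nothing to do
        faults += 1
        if len(frames) == frame_count:
            frames.discard(queue.popleft())   # evict the oldest resident page
        frames.add(page)
        queue.append(page)
    return faults
```

On the classic reference string 1,2,3,4,1,2,5,1,2,3,4,5, FIFO takes 9 faults with 3 frames but 10 faults with 4 frames: adding memory makes it worse, which is Belady's anomaly.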
### Least Recently Used (LRU)
Swaps out the page which has remained unused for the longest time.
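LRU can be sketched with an ordered dictionary whose insertion order doubles as recency order (a toy model; real hardware approximates LRU, since maintaining exact recency per access is expensive):

```python
from collections import OrderedDict

def lru_faults(reference_string, frame_count):
    """Count page faults under LRU: evict the page unused for the longest time."""
    frames = OrderedDict()   # key order = recency order, least recent first
    faults = 0
    for page in reference_string:
        if page in frames:
            frames.move_to_end(page)       # hit: mark as most recently used
            continue
        faults += 1
        if len(frames) == frame_count:
            frames.popitem(last=False)     # evict the least recently used page
        frames[page] = True
    return frames, faults
```

Unlike FIFO, LRU (like all stack algorithms) cannot exhibit Belady's anomaly.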
### WSClock
Combination of the Working Set and Clock algorithms. We regard the memory as an initially-empty circular list of frames, and maintain a pointer to a single frame ("current page"). Each page has a referenced flag and a modified flag, as well as a time-of-last-use field. Again, periodic clock ticks reset the referenced flag. The algorithm:
    t = some constant, e.g. 120 msec
    On page fault:
        if current page is referenced:
            set current page as not referenced
            set current page to next page
            restart procedure
        else:
            if time difference between now and current page's time of last use is larger than t:
                if current page is modified:
                    schedule disk write for current page
                    set current page to next page
                    restart procedure
                else:
                    swap out current page
            else (current page is within the working set):
                set current page to next page
                restart procedure
Additionally, we must consider the case where the current page pointer completes one entire circle in the circular list of frames.
In the case where at least one write has been scheduled, the algorithm simply continues until the write is complete, at which point the newly-written page will be ready for eviction.
On the other hand, if no writes have been scheduled, the first non-modified page will be swapped out. If there are no non-modified pages, the current page gets swapped out.
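The main loop can be made runnable in Python. This is a simplified sketch: it assumes a clean page outside the working set exists somewhere, so it omits the wrap-around handling described above.

```python
class Page:
    def __init__(self):
        self.referenced = False
        self.modified = False
        self.last_use = 0.0   # time of last use

def wsclock_victim(frames, hand, now, tau, schedule_write):
    """One sweep of the WSClock hand over the circular list `frames`.

    frames: list of Page objects; hand: index of the current page.
    tau: working-set window; schedule_write: callback for dirty pages.
    Returns (victim_index, new_hand).
    """
    while True:
        page = frames[hand]
        if page.referenced:
            page.referenced = False                    # second chance
        elif now - page.last_use > tau:                # outside the working set
            if page.modified:
                schedule_write(hand)                   # dirty: write back, keep looking
            else:
                return hand, (hand + 1) % len(frames)  # clean and old: evict
        hand = (hand + 1) % len(frames)
```

A referenced page gets its flag cleared and is skipped; a dirty old page gets a write scheduled; the first clean old page is the victim.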
# File systems
A file system is an abstraction to store, retrieve and update a set of files. A file is a logical unit of information created by a process. Information stored in a file is persistent, and can be accessed by different processes.
## File Structure
There are many different ways to logically structure a file. Here is a list of three common structures:
|| Byte sequence || The file is an unstructured sequence of bytes; any meaning is imposed by user programs. This is the approach taken by UNIX and Windows. ||
|| Record sequence || The file is a sequence of fixed-length records, read and written one record at a time. ||
|| Tree || The file is a tree of records, each containing a key field used to look records up. ||
## File Types
Operating systems often support different file types. Here are some common file types:
|| Regular files || Good ol' regular binary files. ||
|| Directories || System files for maintaining the structure of the file system. ||
|| Character special files || IO-devices masquerading as byte-sequenced files to abstract writing and reading to and from IO devices. ||
|| Block special files || IO-devices masquerading as record-sequenced files to abstract writing and reading to and from IO devices. ||
## Directories
A file system usually organizes its files in some sort of directory organization. Here are some different ways to organize files:
|| Single-Level Directory Systems || Every file is in one large root directory. Often used on simple embedded devices. ||
|| Hierarchical Directory Systems || Allows directories to contain other directories, creating a directory tree. Tree directory systems allow for sane organization of files. ||
## File System Layout
A file system is typically stored on a partition on a disk. Every partition starts with a boot block, but other than that, the on-disk layout can vary wildly from file system to file system. As an example, a FAT-style layout looks like this:
|| Boot block || Here, the boot block contains administrative data about the file system, in addition to the bootstrapping program which is run if an operating system attempts to boot the partition. ||
|| File allocation table || The file allocation table (FAT) contains a list of all the clusters/chunks in the file data area, together with a pointer to the next cluster for each cluster. ||
|| Root directory || The root directory is the statically allocated logical root of the file system, containing files and subdirectories. ||
|| File data area || The file data area contains the rest of the data, organized in clusters as referenced from the file allocation table. ||
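Reading a file under this layout means following its cluster chain through the FAT. A small sketch (the dict stands in for the on-disk table, and -1 is a stand-in end-of-chain marker; real FAT variants use reserved values such as 0xFFF8):

```python
FAT_EOF = -1   # illustrative end-of-chain marker

def read_file_clusters(fat, first_cluster):
    """Follow a file's cluster chain through the file allocation table."""
    chain = []
    cluster = first_cluster
    while cluster != FAT_EOF:
        chain.append(cluster)
        cluster = fat[cluster]   # each FAT entry points to the next cluster
    return chain
```

The directory entry stores only the first cluster number; everything after that is recovered by chasing pointers in the FAT, which is why the whole table is usually kept cached in memory.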
# IO
# Deadlocks
A set of processes is deadlocked if each process in the set is waiting for an event that only another process in the set can cause. Four conditions must hold for there to be a deadlock:
|| Mutual exclusion || A resource can be assigned to at most one process at a time. ||
|| Hold and wait || Processes currently holding resources can request more resources. ||
|| No preemption || A resource cannot be taken away from a process against its will; the holding process must release it voluntarily. ||
|| Circular wait || There must be a circular chain of two or more processes, each waiting for a resource held by the next member of the chain. ||
Deadlock prevention can be done by preventing one or more of the four conditions above.
Removing a deadlock after it has occurred usually entails killing processes, or doing resource rollbacks.
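Detecting circular wait amounts to finding a cycle in the wait-for graph (an edge from P to Q means P waits for a resource Q holds). A depth-first-search sketch:

```python
def has_deadlock(wait_for):
    """Detect circular wait: is there a cycle in the wait-for graph?

    wait_for: dict mapping a process to the set of processes it waits on.
    """
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited / on current path / done
    color = {}

    def visit(p):
        color[p] = GREY
        for q in wait_for.get(p, ()):
            if color.get(q, WHITE) == GREY:   # back edge: cycle found
                return True
            if color.get(q, WHITE) == WHITE and visit(q):
                return True
        color[p] = BLACK
        return False

    return any(color.get(p, WHITE) == WHITE and visit(p) for p in wait_for)
```

Three processes each waiting on the next in a ring deadlock; breaking any one edge (killing a process or rolling back a resource) removes the cycle.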
# Multicore and Multicomputers
## UMA (uniform memory access)
All CPUs access every memory word with the same latency. Typical for small, bus-based multiprocessors.
## NUMA (non-uniform memory access)
Memory is split between nodes; a CPU accesses its local memory faster than remote memory, so memory placement affects performance.
## Multi CPU OS implementations
|| __Master/slave__ || One CPU is responsible for handing out tasks to the other CPUs. Scales badly, as message passing takes up more and more time and space on the bus as the number of CPUs increases. For small systems, however, it is a simple and good design. ||
|| __To each their own__ || Each CPU has its own private OS instance (they can even be different operating systems), although the OS code itself can be shared. __A huge minus__ is the lack of shared I/O buffers: if one CPU performs an I/O operation, other CPUs cannot, for instance, read from its I/O buffer and share the workload. For this reason this model is rarely, if ever, used. ||
|| __Symmetric multiprocessor (SMP)__ || All CPUs run the same shared OS, each executing system tasks as needed. Makes for a complicated OS (especially in the general __n__-CPU case), and synchronization issues pose a significant challenge. Still, this is the most widely used model, as it scales well. ||
## Synchronization
|| Hardware locking (e.g. TSL) || Check whether a memory word is locked; if not, lock it and then read or write, performed as a single atomic operation. ||
|| Peterson's algorithm || A software-only solution to mutual exclusion between two processes. ||
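Peterson's solution for two threads can be sketched in Python. A heavy caveat: this demo only works because CPython's global interpreter lock makes simple loads and stores effectively sequentially consistent; on real hardware, compilers and CPUs reorder memory operations, so production code uses atomic instructions and memory barriers instead.

```python
import sys
import threading

# Peterson's algorithm for two threads (ids 0 and 1)
flag = [False, False]   # flag[i] is True while thread i wants the critical section
turn = 0                # which thread must yield when both want in
counter = 0             # shared variable protected by the lock

def worker(me, iterations):
    global turn, counter
    other = 1 - me
    for _ in range(iterations):
        flag[me] = True
        turn = other                          # politely let the other go first
        while flag[other] and turn == other:
            pass                              # busy-wait (spin)
        counter += 1                          # critical section
        flag[me] = False                      # exit protocol

sys.setswitchinterval(1e-4)   # switch threads often so the spinning demo finishes quickly
threads = [threading.Thread(target=worker, args=(i, 1000)) for i in (0, 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the entry protocol, the unprotected `counter += 1` could lose updates; with it, all 2000 increments survive.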
## Scheduling
Deciding which thread to run, and which CPU to run it on.
### Time sharing
Makes use of a shared priority queue (which may become a bottleneck).
|| Smart scheduling || Critical threads (e.g. threads holding a spin lock) may be allowed to keep running on their tasks instead of being switched out. ||
|| Affinity scheduling || Have the same CPU run the same thread again, to exploit still-warm cache and TLB contents. ||
### Space sharing
Related threads run simultaneously on their own dedicated sets of CPUs, with the CPUs partitioned between thread groups.
### Gang scheduling
Related threads run at the same time across the CPUs. Useful for cooperating threads.
## Intercommunication
### Message passing
### Process distribution
# Virtualization
A virtual machine is a software computer. A crash inside it is less critical, as the VM is just a program that can be killed and restarted.
|| Type 1 hypervisor || Runs directly on the hardware, without an OS underneath. When the guest OS makes a system call, it traps to the hypervisor. ||
|| Type 2 hypervisor || Runs on top of a host OS (VirtualBox, VMware). Runs normal instructions directly on the hardware, but has to emulate sensitive instructions. ||
|| Paravirtualization || The guest OS is modified so that sensitive instructions are replaced with explicit calls (hypercalls) to the hypervisor. ||