TTK4147: Real-time Systems
Introduction
Real-time computing
- Has a guaranteed or predictable response time
- Worst-case execution time is the most interesting metric
- Optimized for predictability, while general-purpose computing is optimized for speed and throughput
- Not really possible with general-purpose operating systems
- Not a fancy way of saying fast computing
- Not irrelevant even if the hardware is fast enough
Constraints
| Type | Missed deadline |
| --- | --- |
| Hard real-time | total system failure |
| Firm real-time | results have no value |
| Soft real-time | results have less value |
Task triggers
| Type | Tasks performed |
| --- | --- |
| Periodic | at a regular time interval |
| Sporadic | when an event occurs; limits on how often the event can occur |
| Aperiodic | when an event occurs; no limits on when it can occur |
General purpose vs. real-time
| General purpose | Real-time |
| --- | --- |
| Optimized for speed and throughput | Optimized for predictability |
| Usually comes with a wide range of applications, tools, etc. | Usually a minimized system without anything unnecessary |
| Mouse, keyboard and screen | Serial interface or SSH (specialized I/O) |
Real-time operating systems (RTOS)
FreeRTOS
- Minimalistic operating system for microcontrollers, which often do not run an operating system otherwise
- Runs on many platforms
VxWorks and QNX
- Proprietary operating systems for embedded and real-time systems
- Stable and proven platforms
- Expensive licenses
Real-time variants of Linux
- Various methods for improving the real-time capabilities of Linux
- RT-Linux, RTAI, Xenomai, PREEMPT_RT
From book
Real-time system definition
Any system in which the time at which output is produced is significant. This is usually because the input corresponds to some movement in the physical world, and the output has to relate to that same movement. The lag from input time to output time must be sufficiently small for acceptable timeliness.
- Cohesion is concerned with how well a module holds together – its internal strength.
- Coupling, by comparison, is a measure of the interdependence of program modules. If two modules pass control information between them, they are said to possess high (or tight) coupling.
Within all design methods, a good decomposition is one that has strong cohesion and loose coupling. This principle is equally true in sequential and concurrent programming domains.
Various types of real-time systems: reactive systems, time-aware systems, time-triggered and event-triggered systems
Basic characteristics of a general real-time or embedded computer system
Real-time control, concurrent control of separate system components, low-level programming, support for numerical computation, largeness and complexity, extreme reliability and safety
Stages of design and implementation
requirements specification → system design → detailed design → coding → testing
Real-time language should be secure, readable, flexible, simple, portable and efficient
Real-time definitions overview
Hardware and Operating systems
Microprocessors
Systems-on-Chip (SoC) are typically used for more complex microcontrollers; they also integrate graphics processing units, digital signal processors, FPGAs, etc.
4 types of instructions
- Processor-memory – data transferred between the processor and memory
- Processor-I/O – data transferred between the processor and an I/O module
- Data processing – arithmetic or logic operation on data
- Control – alters the sequence of execution, e.g. a jump instruction
Interrupts
- Program – generated by some condition that results from an instruction execution
- Timer – generated by a timer within the processor that can be programmed to trigger at a regular interval or after a certain amount of time has passed
- I/O – generated by an I/O controller to signal completion of an operation or some other condition
- Hardware failure – generated by a failure in the hardware
Can have sequential interrupts or nested interrupts (interrupts within interrupts)
Memory hierarchy
- Inboard memory
  - Registers
  - Cache
- Outboard storage
  - Magnetic disk
  - CD-ROM, CD-RW, DVD-RW, DVD-RAM, Blu-Ray
- Off-line storage
  - Magnetic tape
Cache
- Write policy determines when changes in the cache are written back to main memory
  - Write-through – update main memory every time the cache block is changed
  - Write-back – update main memory only when a modified cache block is replaced – leaves main memory temporarily in an obsolete state
- Caches are divided into levels to exploit locality (lower levels = faster).
CPU ↔ L1 cache ↔ L2 cache ↔ L3 cache ↔ Main memory
Transferring data from I/O to main memory
- Continuous polling – the CPU repeatedly checks the I/O module's status and moves the data itself
- Interrupt driven – the I/O module interrupts the CPU when it is ready, and the CPU moves the data
- Direct memory access (DMA) – the I/O module writes to the DMA module instead of the CPU; the CPU is only informed when the transfer is complete
Reasons for having an operating system
- Convenience – the OS is the layer between computer hardware and user; interact with the OS instead of the hardware
  - Instruction set architecture (ISA) – the machine language instructions that a computer can perform
  - Facilitation of concurrency and scheduling
  - Abstraction of hardware and stability
  - Hardware to OS: application binary interface (ABI) – user programs and libraries can access OS and hardware (ISA) services through system calls
  - OS to libraries: application programming interface (API) – provides functions that user programs can use to access the OS and hardware through system calls, plus other commonly used functionality
  - Libraries come with the OS but are not strictly part of the OS
- Efficiency – the OS allows efficient use of the computer system's resources – scheduling, memory control, I/O access
- Ability to evolve
  - Integrate updated hardware
  - Development of new services
  - Upgrading the OS and programs
From book
- Processor: controls the operation of the computer and performs its data processing functions. Where there is only one processor, it is often referred to as the central processing unit (CPU).
  - One of the processor's functions is to exchange data with memory. It makes use of two internal registers: a memory address register (MAR), which specifies the address in memory for the next read or write, and a memory buffer register (MBR), which contains the data to be written into memory or receives the data read from memory.
- Main memory: stores data and programs. This memory is typically volatile.
- I/O modules: move data between the computer and its external environment.
- System bus: provides communication among processors, main memory, and I/O modules.
- Microprocessors are now multiprocessors; each chip (called a socket) contains multiple processors (called cores), each with multiple levels of large memory caches, and multiple logical processors sharing the execution units of each core.
- Info about the kernel: http://www.linfo.org/kernel.html
- The classic microprocessor is giving way to the system on a chip (SoC), where not just the CPUs and caches are on the same chip, but also many of the other components of the system, such as DSPs, GPUs, I/O devices (such as radios and codecs), and main memory.
- The replacement algorithm chooses, within the constraints of the mapping function, which block to replace when a new block is to be loaded into the cache and the cache already has all slots filled with other blocks.
- An OS is a program that controls the execution of application programs and acts as an interface between applications and the computer hardware. It has three objectives:
  - Convenience – an OS makes a computer more convenient to use
  - Efficiency – an OS allows the computer system resources to be used in an efficient manner
  - Ability to evolve – an OS should be constructed in such a way as to permit the effective development, testing, and introduction of new system functions without interfering with service
- A monolithic kernel is implemented as a single process, with all elements sharing the same address space.
- A microkernel architecture assigns only a few essential functions to the kernel, including address spaces, IPC, and basic scheduling. Other OS services are provided by processes, sometimes called servers, that run in user mode and are treated like any other application by the microkernel.
Multiprogramming and memory
Multiprogramming:
- Maximize processor utilization, provide reasonable response times, let different tasks use different resources, allow tasks to communicate with each other
Process
- A process is an entity that can be assigned to and executed by a processor
- Each process has its own set of data
- Dispatcher – tells the OS which process to start running
- 5-state process model:
  - Running – process currently executing on the CPU
  - Ready – process is in a queue waiting to be executed
  - Blocked – a running process has to wait for an event and remains blocked until the event occurs; there is a separate queue for blocked processes
  - New – process has been created but not yet added to the pool of executable processes
  - Exit – process has been removed from the pool of executable processes, but still exists in memory
- Swapping – when there is no room for a process in main memory, it is swapped to disk
- The seven-state process model adds suspend states for both Ready and Blocked
- Process image
  - User data – the modifiable part of the user space. May include program data, a user stack area, and programs that may be modified
  - User program – the program to be executed
  - Stack – each process has one or more LIFO stacks associated with it. A stack is used to store parameters and calling addresses for procedure and system calls
  - Process control block – data needed by the OS to control the process
- Modes of execution – necessary to protect OS data structures from interference by user programs
  - User mode – less-privileged mode; user programs typically execute in this mode
  - System mode – more-privileged mode, also referred to as control mode or kernel mode; the kernel of the operating system executes in this mode
- Process creation
  - The OS assigns a process identifier to the new process, allocates memory space for the process, initializes the process control block, and adds the new process to the appropriate list (ready or suspended)
Switching
- Process switch: change which process is running
- Mode switch: switch between user and kernel mode, e.g. when an interrupt occurs; if no interrupts are pending, the processor simply continues running the program
Threads vs processes
- The part that owns the resources and data is called a process or task
- The part or parts that follow an execution path are called threads
- Multithreading means that the operating system supports multiple threads within each process
- Threads are faster than processes – both for switching and for creation/exit
- Threads share resources, which makes it easier to communicate, synchronize, etc.
User-level threads
- The application performs the thread management and scheduling
- The kernel does not know that the threads exist
- Advantages: fast thread switch, portability, custom scheduling per task
Kernel-level threads
- The kernel schedules each of the threads in the process
- Advantages: if one thread of a process is blocked, the kernel can schedule another to run; different threads of a process can run on different CPU cores
User and kernel combined
- All threads are made in user space; multiple user-level threads are mapped onto a number of kernel threads
Memory management
- Protection
  - No process should modify the memory another process is currently using
  - No normal program should be able to access any memory belonging to the kernel
  - The processor should have hardware functionality for enforcing memory protection
- Sharing
  - Processes need to co-operate – protection needs to be flexible enough to allow inter-process communication
- Organization – logical and physical
- Placement algorithms
  - Best fit – chooses the block that is closest in size to the request
  - First fit – begins to scan memory from the beginning and chooses the first available block that is large enough
  - Next fit – begins to scan memory from the location of the last placement and chooses the next available block that is large enough
- Paging and segmentation – both divide the memory of a program into smaller parts, and both introduce some variant of a table that keeps track of where the different parts of a process are stored
  - Paging divides the data of a process into fixed-size pages that are stored in fixed-size frames in memory
  - Segmentation divides the data of a process into variable-length segments and stores these in memory
  - Combination – segmentation from the user/programmer's point of view, paging from the memory's point of view
- Virtual memory – secondary memory can be accessed as if it were main memory
  - A process accesses memory with a virtual address, which can be in either main or secondary memory
  - Some pages/segments of a process can be in main memory while others are in secondary memory
  - Advantages: "unlimited" memory; only data that is actually used is placed in main memory
  - Disadvantages: thrashing, meaning that the system spends most of its time moving pages/segments between main and secondary memory
  - Problem for real-time – if a part of a real-time process is suddenly in secondary memory, it can give an unexpected delay
From book
- When the OS creates a process (child) at the explicit request of another process (parent), the action is referred to as process spawning
- When all processes in main memory are Blocked, the OS can suspend one process by putting it in the Suspend state and transferring it to secondary memory. The same applies to Ready.
- A process image consists of user data, user program, stack and process control block (process id, processor state information, process control information)
- Functions of an OS kernel:
  - Process management – creation and termination, scheduling and dispatching, switching, synchronization and support for IPC, management of process control blocks
  - Memory management – allocation of address space to processes, swapping, page and segment management
  - I/O management – buffer management, allocation of I/O channels and devices to processes
  - Support functions – interrupt handling, accounting, monitoring
- Process switching may occur any time the OS has gained control from the currently running process, caused by events like a clock interrupt, an I/O interrupt, or a memory fault
- In a multiprocessor system, more than one process can be in the running state
- Two main characteristics of a process:
  - Resource ownership – process or task – a virtual address space to hold the process image. A process may from time to time be allocated control or ownership of resources (main memory, I/O or files). The OS performs a protection function to prevent unwanted interference between processes with respect to resources.
  - Scheduling/execution – lightweight process – follows an execution path through one or more programs. May be interleaved with other processes. Thus, a process has an execution state and a dispatching priority and is the entity that is scheduled and dispatched by the OS.
- Threads within the same process can communicate with each other without invoking the kernel, because they share memory and files
- Any alteration of a resource by one thread affects the environment of the other threads in the same process. It is therefore necessary to synchronize the activities of the various threads so that they do not interfere with each other or corrupt data structures.
- User-level threads (ULT) – all of the work of thread management is done by the application, and the kernel is not aware of the existence of threads (it only sees the process)
- Advantages of user-level threads over kernel-level threads (KLT):
  - Thread switching does not require kernel-mode privileges, because all of the thread management data structures are within the user address space of a single process. This saves the overhead of mode switches (user to kernel and back)
  - Scheduling can be application specific. One application may benefit most from round-robin scheduling while another benefits from priority-based scheduling.
  - ULTs can run on any OS. No changes are required to the underlying kernel to support ULTs.
- Disadvantages of ULTs compared to KLTs:
  - In a typical OS, many system calls are blocking. As a result, when a ULT executes a blocking system call, not only is that thread blocked, but all of the threads within the process are blocked
  - In a pure ULT strategy, a multithreaded application cannot take advantage of multiprocessing. The kernel assigns one process to only one processor at a time.
- KLTs are threads within a process that are maintained by the kernel. Because they are recognized by the kernel, multiple threads within one process can execute in parallel on a multiprocessor, and the blocking of one thread does not block the entire process. However, a mode switch is required to switch from one thread to another.
- Multithreaded native applications are characterized by having a small number of highly threaded processes
- Multiprocess applications are characterized by the presence of many single-threaded processes
- A process relates to resource ownership; a thread relates to program execution
- Memory management terms
  - Frame – a fixed-length block of main memory
  - Page – a fixed-length block of data that resides in secondary memory. A page of data may temporarily be copied into a frame of main memory
  - Segment – a variable-length block of data that resides in secondary memory. An entire segment may temporarily be copied into an available region of main memory (segmentation), or the segment may be divided into pages which can be individually copied into memory (combined segmentation and paging)
  - Internal fragmentation – wasted space internal to a partition, due to the fact that the block of data loaded is smaller than the partition
  - External fragmentation – memory that is external to all partitions becomes increasingly fragmented; lots of small holes in memory, and memory utilization declines
- Memory management techniques
  - Fixed partitioning – main memory is divided into a number of static partitions at system generation time. A process may be loaded into a partition of equal or greater size. Simple, but inefficient use of memory due to internal fragmentation
  - Dynamic partitioning – partitions are created dynamically, so that each process is loaded into a partition of exactly the same size as that process. No internal fragmentation and more efficient use of main memory, but inefficient use of the processor due to the need for compaction to counter external fragmentation
  - Simple paging – main memory is divided into a number of equal-size frames. Each process is divided into a number of equal-size pages of the same length as the frames. A process is loaded by loading all of its pages into available, not necessarily contiguous, frames. No external fragmentation, but a small amount of internal fragmentation.
  - Simple segmentation – each process is divided into a number of segments. A process is loaded by loading all of its segments into dynamic partitions that need not be contiguous. No internal fragmentation, improved memory utilization and reduced overhead compared to dynamic partitioning, but external fragmentation
  - Virtual memory paging – as with simple paging, except that it is not necessary to load all of the pages of a process. Nonresident pages that are needed are brought in later automatically. No external fragmentation, higher degree of multiprogramming, large virtual address space, but the overhead of complex memory management
  - Virtual memory segmentation – as with simple segmentation, except it is not necessary to load all of the segments of a process. Nonresident segments that are needed are brought in later automatically. No internal fragmentation, higher degree of multiprogramming, large virtual address space, protection and sharing support, but the overhead of complex memory management
- Paging – main memory is divided into many small equal-size frames. Each process is divided into frame-size pages. A smaller process requires fewer pages; a larger process requires more. When a process is brought in, all of its pages are loaded into available frames, and a page table is set up. This approach solves many of the problems with partitioning.
- Segmentation – a process is divided into a number of segments that need not be of equal size. When a process is brought in, all of its segments are loaded into available regions of memory, and a segment table is set up
- Virtual memory terminology
  - Virtual memory – a storage allocation scheme in which secondary memory can be addressed as though it were part of main memory. The addresses a program may use to reference memory are distinguished from the addresses the memory system uses to identify physical storage sites, and program-generated addresses are translated automatically to the corresponding machine addresses. The size of virtual storage is limited by the addressing scheme of the computer system and by the amount of secondary memory available, not by the actual number of main storage locations.
  - Virtual address – the address assigned to a location in virtual memory to allow that location to be accessed as though it were part of main memory
  - Virtual address space – the virtual storage assigned to a process
  - Address space – the range of memory addresses available to a process
  - Real address – the address of a storage location in main memory
- Virtual memory paging – not all pages of a process need be in main memory frames for the process to run. Pages may be read in as needed. Reading a page into main memory may require writing a page out to secondary memory
- Virtual memory segmentation – not all segments of a process need be in main memory for the process to run. Segments may be read in as needed. Reading a segment into main memory may require writing one or more segments out to secondary memory
- Thrashing – throwing out a piece of memory just before needing it and having to go get it back almost immediately – swapping pieces rather than executing instructions
Uniprocessor scheduling, inter-process communication and FreeRTOS
- Long-term scheduling – decides which processes are allowed to be added to the pool of processes being executed
- Medium-term scheduling – decides which processes are fully or partially swapped into main memory
- Short-term scheduling – decides which of the currently ready processes the CPU should execute; performed by the dispatcher, whose main objective is to allocate processor time to optimize certain aspects of system behavior
Scheduling criteria
- Turnaround time – interval between submission of a process and its completion
- Response time – time from the submission of a request until the start of the response
- Deadlines – when process completion deadlines can be specified, as many deadlines as possible should be met
- Predictability – a given job should run in about the same amount of time and at about the same cost regardless of the load on the system
- Throughput – maximize the number of processes completed per unit of time
- Processor utilization – percentage of time the processor is busy
- Fairness – processes are treated the same, and no process suffers starvation
- Enforcing priorities – favor higher-priority processes
- Balancing resources – if a resource is overused, processes not needing it should be favored
- Priority-based scheduling – each priority gets its own queue; the dispatcher checks higher-priority queues first
Scheduling methods
- First come first served – dispatcher chooses the process that has been in the ready queue the longest
- Round robin – each process is given a time slice
- Shortest process next – shortest expected execution time first
- Shortest remaining time – preemptive version of SPN
- Highest response ratio next – dispatcher chooses a process based on its execution time and the time it has waited in the queue: $R = \frac{w+s}{s}$, where $R$ is the response ratio, $w$ the time waited in the queue, and $s$ the service time
- Feedback – decrease priority each time a process is preempted; does not require knowledge of execution time
FreeRTOS
- Minimalistic operating system for microcontrollers, which often do not run an operating system otherwise
- Runs on many platforms
- Very small OS
- Tasks – regular processes with their own stacks; preemptively scheduled
- Co-routines – intended for use on small processors with severe RAM constraints; share a single stack
- Task model: running, ready, blocked and suspended
- Scheduling
  - Either co-operative or preemptive
  - Scheduler activated by a timer interrupt every clock tick
  - Equal-priority tasks are scheduled with round-robin
- Uses semaphores, mutexes and queues (message passing)
- Task notifications
Inter-process communication – sharing information and synchronization with:
- Semaphores – wait/signal, counting/binary – provide synchronization between threads but do not by themselves protect a shared buffer
- Mutex – a binary semaphore WITH the concept of ownership – only the process/thread that locked the mutex can unlock it
- Message passing – synchronization and exchange of data, blocking/non-blocking
- Race condition – multiple processes or threads read and write the same data locations, and the result depends on their relative timing
From book
- Multiprogramming – the management of multiple processes within a uniprocessor system
- Multiprocessing – the management of multiple processes within a multiprocessor
- Distributed processing – the management of multiple processes executing on multiple, distributed computer systems
- Concurrency encompasses a host of design issues, including communication among processes, sharing of and competing for resources, synchronization of the activities of multiple processes, and allocation of processor time to processes
- Atomic operation – a function or action implemented as a sequence of one or more instructions that appears to be indivisible. The sequence of instructions is guaranteed to execute as a group, or not execute at all, having no visible effect on system state. Atomicity guarantees isolation from concurrent processes.
- Critical section – a section of code within a process that requires access to shared resources, and that must not be executed while another process is in a corresponding section of code
- Deadlock – two or more processes are unable to proceed because each is waiting for one of the others to do something
- Livelock – two or more processes continuously change their states in response to changes in the other process(es) without doing any useful work
- Mutual exclusion – the requirement that when one process is in a critical section that accesses shared resources, no other process may be in a critical section that accesses any of those shared resources
- Race condition – multiple threads or processes read and write a shared data item, and the final result depends on the relative timing of their execution
- Starvation – a runnable process is overlooked indefinitely by the scheduler
- On a uniprocessor, it is sufficient to prevent a process from being interrupted to guarantee mutual exclusion. This capability can be provided in the form of primitives defined by the OS kernel for disabling and enabling interrupts.
- Concurrency mechanisms
  - Semaphore – an integer value used for signaling among processes. Only three operations may be performed on a semaphore, all of which are atomic: initialize, decrement, increment.
  - Binary semaphore – takes only the values 0 and 1
  - Mutex – a binary semaphore with the concept of ownership; only the process/thread that locked the mutex can unlock it
  - Condition variable – a data type that is used to block a process or thread until a particular condition is true
  - Event flags – a memory word used as a synchronization mechanism
  - Messages – two processes exchange information; may also be used for synchronization
  - Spinlock – a mutual exclusion mechanism in which a process executes in an infinite loop waiting for the value of a lock variable to indicate availability
- In message passing, both sender and receiver can be blocking or non-blocking
  - A non-blocking receive avoids the problem where a message might be lost, or a process fails before it sends an anticipated message; it prevents the receiving process from being blocked indefinitely
- Indirect addressing – messages are sent to a shared data structure consisting of queues that can temporarily hold messages. The strength of indirect addressing is that, by decoupling sender and receiver, it allows greater flexibility in the use of messages
Linux and real-time variants
Linux kernel – monolithic kernel with loadable modules
Concurrency
- Atomic operations – guaranteed to perform without interruption or interference
- Spinlocks – a thread spins instead of sleeping until the lock it is waiting for is released
- Semaphores – semaphores for kernel space
- Barriers – enforce the order in which memory operations are executed
Modular monolithic kernel – includes virtually all of the OS functionality in one large block of code that runs as a single process with a single address space. All the functional components of the kernel have access to all of its internal data structures and routines. Linux is structured as a collection of modules.
- Loadable modules – a module is an object file whose code can be linked to and unlinked from the kernel at run-time. A module is executed in kernel mode on behalf of the current process.
- Dynamic linking and stackable modules
Linux page replacement
- Least frequently used policy – each page in main memory has a counter which is incremented each time the page is accessed and periodically decremented. The lower the counter, the less frequently the page is used, and the page will be preferred for replacement.
Kernel memory allocation – manages physical main memory page frames
- The primary function is to allocate and deallocate frames for particular users (user-space processes, dynamically allocated kernel data, static kernel code, the page cache)
- A buddy algorithm is used so memory for the kernel can be allocated and deallocated in units of one or more pages
- The page allocator alone would be inefficient, because the kernel requires small short-term memory chunks in odd sizes
- Slab allocation – used by Linux to accommodate small chunks
Linux scheduling
- Three classes – FIFO, round-robin, and non-real-time threads
- The Linux kernel was not originally intended to be preemptive, and cannot be made fully preemptive regardless of kernel configuration
Linux thread model
- Originally there were no threads in Linux, only processes. Threads could be implemented in user space inside the processes; one blocked thread would block the whole process
- Now, processes and threads are almost the same; both run as separate "kernel threads". Threads share the same virtual memory, and context switches between threads are faster than between processes. The kernel schedules each kernel thread individually.
User space – use system calls to request kernel services. The kernel can notify a user-space program with signals
Kernel space
- Reasons for developing in kernel space: user-space programs can only access the services already provided by the kernel, so you may need kernel space to access an unsupported device or a special feature of the CPU. Functionality can also be implemented in kernel space to reduce the number of context switches, which hurt performance.
- Disadvantage of kernel space: the normal user-space libraries cannot be used in kernel space
- Root is the highest level of privilege you can have in user space. It is not the same as kernel space.
Real-time
- Linux is not a real-time system, and the kernel is not fully preemptive
- Soft real-time capabilities can be improved by configuring the kernel to be as preemptive as possible
- How to get harder real-time:
  - Make the Linux kernel fully preemptive, and thus more predictable – the effect is uncertain
  - Add a small real-time co-kernel that runs below Linux (RT-Linux, RTAI, Xenomai); Linux tasks cannot disturb the real-time tasks
  - Outsource the hard real-time tasks to a slave microcontroller – provides better control of latencies
    - When outsourcing some of the real-time tasks to a specialized microcontroller, the Linux computer acts as a supervisor. Suitable for systems that have a small, separate real-time component.
Xenomai
previous versions used a co-kernel or a dual kernel
-
Dual kernel approach – Cobalt core runs beside the Linux kernel. Best performance, lowest latency.
-
Single kernel approach – native Linux kernel normally patched with PREEMPT-RI – Mercury core
-
Real-time device model (RTDM) – if a Xenomai program uses a Linux driver, e.g. for some hardware device, the driver will execute in Linux context, not Xenomai – not hard real-time, and inefficient due to many switches between Linux and Xenomai
From book
-
Most UNIX kernels are monolithic. Monolithic kernels include virtually all the OS functionality in one large block of code that runs as a single process with a single address space.
-
Linux is structured as a collection of modules, a number of which can be automatically loaded and unloaded on demand.
-
A module is an object file whose code can be linked to and unlinked from the kernel at runtime.
-
Although Linux may be considered monolithic, its modular structure overcomes some of the difficulties in developing and evolving the kernel.
-
Dynamic linking – a kernel module can be loaded and linked into the kernel while the kernel is already in memory and executing. A module can also be unlinked and removed from memory at any time
-
Stackable modules – the modules are arranged in a hierarchy. Individual modules serve as libraries when they are referenced by client modules higher up in the hierarchy, and as clients when they reference modules further down.
-
Kernel components
- Signals
- Kernel uses signals to call into a process. Used to notify a process of certain faults.
- System calls
- The system call is the means by which a process requests a specific kernel service. Six categories – file system, process, scheduling, IPC, socket and miscellaneous
- Process and scheduler
- Manages and schedules processes
- Virtual memory
- Allocates and manages virtual memory for processes
- File system
- Provide a global hierarchical namespace for files, directories, and other file-related objects and provide file system functions
- Network protocols
- Support the sockets interface to users for the TCP/IP protocol suite
- Character device drivers
- Manage devices that require the kernel to send or receive data one byte at a time, such as terminals, modems and printers
- Block device drivers
- Manage devices that read and write data in blocks, such as various forms of secondary memory
- Network device drivers
- Manage network interface cards and communications ports that connect to network devices, such as bridges and routers
- Traps and faults
- Handle traps and faults generated by the processor, such as a memory fault
- Physical memory
- Manages the pool of page frames in real memory and allocates pages for virtual memory
- Interrupts
- handle interrupts for peripheral devices
Linux tasks
-
A process, or task, in Linux is represented by a task_struct data structure. The structure contains information in the following categories:
- State, scheduling information, identifiers, IPC, links, times and timers, file system, address space, processor-specific context
-
States – running, interruptible, uninterruptible, stopped, zombie (terminated but still in the process table)
Linux threads
-
Traditional UNIX systems support a single thread of execution per process; modern UNIX systems typically provide support for multiple kernel-level threads per process
-
A new process is created by copying the attributes of the current process. A new process can be cloned so that it shares resources, such as files, signal handlers and virtual memory. When the two processes share the same virtual memory, they function as threads within a single process.
-
Although cloned processes that are part of the same process group can share the same memory space, they cannot share the same user stacks. Thus, the clone() call creates separate stack spaces for each process.
Linux namespace
- A namespace enables a process (or multiple processes that share the same namespace) to have a different view of the system than other processes that have other associated namespaces. One of the overall goals of namespaces is to support the implementation of control groups, a tool for lightweight virtualization that provides a process or group of processes with the illusion that they are the only processes in the system
Memory management
-
Three-level page table structure
-
Page directory – an active process has a single page directory that is the size of one page. Each entry in the page directory points to one page of the page middle directory. The page directory must be in main memory for an active process.
-
Page middle directory – may span multiple pages. Each entry in the page middle directory points to one page in the page table
-
Page table – may span multiple pages. Each page table entry refers to one virtual page in the process
-
Each time a page is accessed, its age variable is incremented. Linux periodically sweeps through the global page pool and decrements the age variable for each page as it rotates through all the pages in main memory. A page with age 0 is the best candidate for replacement.
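The aging policy can be sketched as a toy simulation (`Page`, `sweep` and `replacement_candidate` are illustrative names, not kernel internals):

```python
# Toy sketch of Linux-style page aging (illustrative only, not kernel code).
# Accessing a page increments its age; a periodic sweep decrements all ages;
# a page with age 0 is the best replacement candidate.

class Page:
    def __init__(self, name):
        self.name = name
        self.age = 0

    def access(self):
        self.age += 1          # referenced recently -> worth keeping longer

def sweep(pages):
    # Periodic decrement, like the kernel rotating through main memory.
    for p in pages:
        if p.age > 0:
            p.age -= 1

def replacement_candidate(pages):
    # Lowest age (ideally 0) is the best candidate for replacement.
    return min(pages, key=lambda p: p.age)

pages = [Page("a"), Page("b"), Page("c")]
pages[0].access(); pages[0].access()      # "a" is hot
pages[1].access()                         # "b" is warm
sweep(pages)                              # ages: a=1, b=0, c=0
print(replacement_candidate(pages).name)  # a cold page is picked first
```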
Real-time Scheduling
Fixed-Priority Scheduling (FPS)
-
If preemptive (which it probably should be), then the highest-priority ready task will always run.
-
Optimal priority assignment for periodic, independent tasks is rate monotonic.
- The shorter the period, the higher priority
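Rate monotonic assignment is just a sort by period; a minimal sketch with made-up periods:

```python
# Rate monotonic priority assignment (sketch): shorter period -> higher priority.
tasks = {"A": 50, "B": 10, "C": 20}        # task -> period in ms (hypothetical values)

# Sort task names by period, shortest first = highest priority first.
by_priority = sorted(tasks, key=lambda t: tasks[t])
print(by_priority)  # ['B', 'C', 'A']
```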
Earliest deadline first (EDF)
-
The priority of each task changes dynamically based on the current deadlines. EDF is able to schedule any task set as long as it takes less time to execute than what is available – it is optimal.
-
EDF is more difficult to implement than fixed priority and has more overhead. It is difficult to incorporate tasks without deadlines in EDF. FPS is more predictable if the system gets overloaded.
Scheduling tests
- Sufficient
- If the test is passed, the tasks are certain to meet their deadlines. Utilization based test. Periods that are multiples of each other are considered to be in the same family.
- Necessary
- If the test is failed, the tasks will miss their deadlines.
- Exact
- Both necessary and sufficient. Response-time analysis – a method for calculating the worst-case response time of a task when there are higher-priority tasks that interfere.
The utilization-based schedulability test is a simple test for fixed-priority scheduling with rate monotonic priorities. It is sufficient (not necessary):
U = sum(Ci / Ti) <= N * (2^(1/N) - 1)
where Ci is the worst-case execution time of task i, Ti is its period and N is the number of tasks. Tasks that have periods that are multiples of each other are considered to be in the same family, and a family counts as one task when choosing the limit.
Example
- All tasks are in the same family (periods: 20, 40, 80)
-
Not necessary to use the 3-task limit (78.0 %); the 1-task limit (100 %) can be used instead.
-
Schedulability for EDF can also be tested with utilization, but with another formula than for FPS:
U = sum(Ci / Ti) <= 1
This means that EDF is able to schedule any task set as long as it takes less time to execute than what is available. In this sense we can say that EDF is optimal.
- Utilization test cannot be used on deadline monotonic!
- Have to do graphical analysis or response time analysis
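The FPS and EDF utilization tests can be sketched in a few lines; the (C, T) values below are hypothetical (C = worst-case execution time, T = period):

```python
def utilization(tasks):
    # tasks: list of (C, T) pairs
    return sum(c / t for c, t in tasks)

def rm_test(tasks):
    # Sufficient (not necessary) test for FPS with rate monotonic priorities:
    # U <= N * (2**(1/N) - 1)
    n = len(tasks)
    return utilization(tasks) <= n * (2 ** (1 / n) - 1)

def edf_test(tasks):
    # EDF utilization test (D = T): schedulable iff U <= 1
    return utilization(tasks) <= 1.0

tasks = [(5, 20), (10, 40), (20, 80)]    # hypothetical task set
print(round(utilization(tasks), 2))      # 0.75
print(rm_test(tasks))                    # 0.75 <= 0.780 -> True
print(edf_test(tasks))                   # True
```

Note that these periods are harmonic (one family), so the 100 % limit would in fact apply for FPS as well.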
Priority inversion
– when a high-priority task has to wait for a lower-priority task. Okay if the high-priority task is blocked by a low-priority task due to a shared resource. Not okay if a medium-priority task prevents the low-priority task from completing and releasing the mutex (unbounded priority inversion).
Priority Ceiling protocols
- Priority inheritance
- Reduces the effect of priority inversions but does not prevent deadlocks – ceiling protocols do. Makes it easier to calculate the worst case blocking time.
- Immediate ceiling priority protocol (ICPP)
- Each shared resource is assigned a ceiling priority equal to the priority of the highest-priority task that uses it. When a task accesses a shared resource, it inherits that ceiling priority. This means that once a task starts using a resource, no other task that could contend for the resource can preempt it until it is done with the resource. Minimizes the time a resource is locked.
- Original ceiling priority protocol (OCPP)
- A task inherits the priority of a resource ONLY if it blocks a higher priority task. A task can only access a shared resource if its priority is higher than the priority of all locked resources. Does not need to know the priority of the tasks in advance.
ICPP is easier to implement than OCPP, which has to keep track of what is blocking what. ICPP leads to fewer context switches, as blocking happens prior to first execution. ICPP requires more priority movements, as these happen with all resource usage. OCPP changes priority only if an actual block has occurred.
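A small sketch of the ICPP bookkeeping, with hypothetical task priorities, resources and hold times (higher number = higher priority):

```python
# ICPP sketch (illustrative; all names and numbers are hypothetical).
task_prio = {"low": 1, "mid": 2, "high": 3}
uses = {"r1": ["low", "high"], "r2": ["low", "mid"]}   # resource -> tasks using it
hold_time = {"r1": 4, "r2": 7}                          # worst-case time each resource is locked

# Each resource's ceiling is the priority of the highest-priority task using it;
# a task holding the resource runs at that ceiling.
ceiling = {r: max(task_prio[t] for t in users) for r, users in uses.items()}
print(ceiling)  # {'r1': 3, 'r2': 2}

def blocking(task):
    # Under ICPP a task is blocked at most once, by the longest critical
    # section of a lower-priority task using a resource with ceiling >= its priority.
    p = task_prio[task]
    candidates = [hold_time[r] for r, users in uses.items()
                  if ceiling[r] >= p and any(task_prio[t] < p for t in users)]
    return max(candidates, default=0)

print(blocking("high"))  # 4  ("low" may hold r1)
print(blocking("mid"))   # 7  ("low" may hold r2)
```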
Deadline monotonic priority
Rate monotonic is not suitable when the deadline is shorter than the period (D < T); deadline monotonic assigns higher priority to tasks with shorter deadlines. Not the same as EDF, as priorities are still static. Same as rate monotonic when D = T.
Adding block time to response time analysis
The response time becomes Ri = Ci + Bi + sum over higher-priority tasks j of ceil(Ri / Tj) * Cj, where Bi is the worst-case blocking time. The usage function: 1 for shared resources that are used by both lower- and higher-priority (than i) tasks, 0 for others. Bi is the longest time such a resource can be held by a lower-priority task.
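Response-time analysis with blocking solves the recurrence R = C + B + sum(ceil(R/Tj) * Cj) iteratively; a minimal sketch with hypothetical task parameters and a deadline cap to stop when no fixed point exists:

```python
from math import ceil

def response_time(C, B, higher, deadline):
    # C: worst-case execution time, B: worst-case blocking time,
    # higher: list of (Cj, Tj) for higher-priority tasks.
    R = C + B
    while R <= deadline:
        nxt = C + B + sum(ceil(R / Tj) * Cj for Cj, Tj in higher)
        if nxt == R:
            return R       # converged: worst-case response time
        R = nxt
    return None            # no fixed point within the deadline -> unschedulable

# Hypothetical task: C=5, blocking B=2, one higher-priority task (C=3, T=10).
print(response_time(5, 2, [(3, 10)], 20))           # 10
print(response_time(5, 2, [(3, 10), (4, 12)], 20))  # None (misses its deadline)
```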
Scheduling aperiodic tasks
Deadline monotonic is suitable for sporadic tasks. In a rate/deadline monotonic scheme, an aperiodic task could be given the lowest priority (and suffer from starvation). For better response of aperiodic tasks, we use execution-time servers, which maintain a budget for their clients. An execution-time server runs as one task/process (periodic or sporadic) in the system; modelled as such a task, it can be included in a response-time analysis.
- Periodic/Polling server
- Each period the computation-time budget is available for any pending clients – if a client needs more, it continues in the next period. If one or more clients are pending, they are scheduled within the server. The budget has to be used while the server is running, or it is wasted.
- Deferrable server
- Only loses budget when it is used. Better response time for aperiodic tasks. Suffers from the back-to-back effect, where one budget is used just after another.
- Sporadic server
- The budget is replenished one time period after it started to be used.
Finding worst case execution time
Highest theoretical value, actual worst case (impossible to obtain in practice), worst execution time found when testing x number of times.
From book
-
Cyclic executive – no actual tasks exist at run time; each minor cycle is just a sequence of procedure calls
- Task based scheduling – tasks exist at run time
-
Earliest deadline first – the absolute deadlines are computed at run time and hence the scheme is described as dynamic
Tests
- Sufficient
- Pass the test will meet deadlines
- Necessary
- Fail the test will miss deadlines
- Exact
- Necessary and sufficient
Simple task model
- Fixed set of tasks
- Periodic, with known periods
- Independent
- Overheads, context-switching times are ignored
- Deadline equal to period
- Fixed worst-case execution time
Sporadic tasks have a minimum inter-arrival time – require D<T
Aperiodic tasks – do not have minimum inter-arrival times. Aperiodic tasks can run at a priority below the priorities assigned to hard processes; in preemptive systems they then cannot steal resources from the hard processes, but they will often miss their own deadlines.
Fixed-Priority Scheduling(FPS) vs. Earliest deadline first (EDF)
- FPS is easier to implement as priorities are static
- EDF is dynamic and requires a more complex runtime system which will have higher overhead
- It is easier to incorporate tasks without deadlines into FPS; arbitrary deadline is more artificial
- Easier to incorporate other factors into the notion of priority rather than deadline
- FPS more predictable during overload situations, but EDF gets more out of the processor
QNX
QNX microkernel
-
A microkernel is an operating system kernel that only contains what has to run in kernel mode (scheduling, memory management and inter-process communication (IPC)).
-
There is no “kernel process”; the CPU executes code in the kernel only when an explicit kernel call is made, an exception is raised, or in response to a hardware interrupt. Everything is based around IPC.
Fundamental services in QNX
-
Thread services via POSIX thread-creation primitives
-
Signal services via POSIX signal primitives
-
Message-passing services – the microkernel handles the routing of all messages between all threads throughout the entire system
-
Synchronization services via POSIX thread-synchronization
-
Scheduling services – the microkernel schedules threads for execution using the various POSIX real-time scheduling policies.
-
Timer services – the microkernel provides the rich set of POSIX timer services
-
Process management services – managing processes, memory and the pathname space
-
POSIX – portable operating system interface
-
To make compatibility between operating systems easier
-
Defines the API (application programming interface) between user and programs and the kernel
-
QNX is POSIX certified, but non-UNIX
QNX is suitable for real-time and embedded because
-
QNX is scalable – possible to make very small/large
-
Microkernel is in general suitable for real-time applications
-
No large non-preemptive kernel processes
-
Only very short periods that a process is in a non-preemptive kernel state (entry and exit)
-
Scheduling – highest priority process will run. Priority inheritance. Sporadic server
-
Built in inter-process communication
QNX is not open source – very expensive
Threads and processes
-
A thread is the unit of scheduling – QNX schedules threads, not processes
-
A process is a container for threads, each process must have at least one thread and it also contains address space, etc.
-
Large thread state model – but most states are variants of the blocked state
Scheduling
-
One ready queue for each priority level
- 256 priorities; 0 is idle, 64–255 are root-only priorities, the rest are configurable
-
Scheduling is performed when:
-
The running thread blocks because it has to wait for some event. When unblocked, it is placed at the end of its ready queue.
-
The running thread is preempted because a higher-priority thread becomes ready. The preempted thread is placed at the front of its ready queue.
-
The running thread voluntarily yields the processor and is placed at the end of its ready queue. The scheduler selects the next thread, which can be the one that just yielded.
-
Scheduling methods – FIFO, round robin, sporadic server
-
Sporadic server
- In QNX, implemented as a thread that keeps its normal priority as long as it is within its budget. When the budget is exceeded, the priority drops to a low level until the budget is replenished.
Interrupts
- Low latencies, because interrupts are enabled almost all of the time (only certain critical code sections require interrupts to be disabled). QNX supports nested interrupts.
Message passing
-
Things that typically are implemented in the kernel are instead implemented as user-space processes.
-
The primary inter-process communication mechanism in QNX is message passing
- One process creates a channel. When another process attaches to the channel, they can exchange messages
-
Synchronous message passing
-
Client side – MsgSend() makes the client SEND-blocked until the server calls MsgReceive() (skipped if the server is already RECEIVE-blocked); the client is then REPLY-blocked while the server handles the message, and becomes READY when the server calls MsgReply().
-
Server side – MsgReceive() makes the server RECEIVE-blocked until a client calls MsgSend() (skipped if a client is already SEND-blocked); the server becomes READY when it has a message, and MsgReply() never blocks, since the client is already synced.
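The blocking rules can be illustrated with a toy state machine (this only models the states; the real MsgSend/MsgReceive/MsgReply are QNX kernel primitives operating on channels):

```python
# Toy model of QNX synchronous message-passing states (illustrative only).
READY, SEND_BLOCKED, RECEIVE_BLOCKED, REPLY_BLOCKED = (
    "READY", "SEND-blocked", "RECEIVE-blocked", "REPLY-blocked")

class Channel:
    def __init__(self):
        self.client = READY
        self.server = READY

    def msg_send(self):                     # client side
        if self.server == RECEIVE_BLOCKED:  # server already waiting: deliver now
            self.server = READY
            self.client = REPLY_BLOCKED
        else:
            self.client = SEND_BLOCKED

    def msg_receive(self):                  # server side
        if self.client == SEND_BLOCKED:     # client already waiting: deliver now
            self.client = REPLY_BLOCKED
            self.server = READY
        else:
            self.server = RECEIVE_BLOCKED

    def msg_reply(self):                    # server side; never blocks
        self.client = READY

ch = Channel()
ch.msg_send()                # no server yet -> client SEND-blocked
ch.msg_receive()             # server picks up message -> client REPLY-blocked
ch.msg_reply()               # client unblocked
print(ch.client, ch.server)  # READY READY
```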
-
Message passing has priority inheritance
-
Server receives a message from a high priority client – server inherits the (higher) priority when the client sends the message
-
Server receives a message from a low priority client – server inherits the (lower) priority when the server receives the message
-
Processes share memory – protected by MUTEX etc.
QNX process manager (procnto) is required for all run-time systems
-
User process, not in kernel mode
-
Provides process management, manages process creation, destruction and process attributes such as user ID and group ID. Manages a range of memory-protection capabilities, shared libraries and inter-process POSIX shared-memory primitives
Memory management
– provides protection, unlike many other RTOSes (which often choose not to because of the overhead)
- The overhead is less of a problem than the advantages it provides – adds robustness: threads that try to access memory they are not allowed to access can be killed. Mitigates problems caused by accidentally overwriting some memory location, which is typically very difficult to debug.
QNX Qnet
-
Used for accessing remote file systems, scale applications and extend applications from a single machine to several machines and distribute processes among those CPUs
-
The Qnet file system is shared between Qnet nodes. Resources and IPC mechanisms available on the file system become available to other nodes. This includes message passing, named semaphores, signals and message queues.
VxWorks
-
Features like – microkernel, fast response time, broad support, scalable and deterministic
-
Board support package (BSP) – supports different CPUs, like x86, ARM and Power PC
Memory management
-
Originally no memory protection – all tasks share memory. Convenient for communication between tasks, but one task can overwrite the memory of another.
-
Newest version has memory management
-
Newer versions also have real-time processes (RTPs) – similar to processes; a task is similar to a thread. Like QNX, all tasks are scheduled independently by the kernel.
-
Non-overlapped memory model – all virtual addresses in the system are unique – allows the same system to work with the memory management unit (MMU) in the CPU enabled or disabled
Kernel
-
Earlier versions ran on top of 3rd-party kernels (VRTX or pSOS) – Wind River Systems then developed their own “Wind kernel”
-
Older versions were running in “kernel mode” and had access to all memory
-
In newer versions, there are both user and kernel mode, and memory protection (more of a microkernel in the right sense of the term)
Task model
- Consists of states ready, suspended, delayed, pended. Tasks can be in more than one state at a time
Single-core scheduling
- VxWorks' native scheduler is preemptive; the highest-priority task will always run, and same-priority tasks can either be FCFS or round robin (same as QNX)
Multi-core scheduling
-
All tasks can run on any CPU core by default
-
A task can be given a CPU affinity – a task will only run on that CPU core (same memory cache – less data moved)
-
A task can reserve a CPU core – only this task can run on that CPU core
Foreground and background RTPs
-
Foreground RTPs are assigned to a time partition
-
Background RTPs run within a time partition when all foreground RTPs have completed
Inter-process communication
-
Semaphores, mutex, message queues
-
Also has task lock – can disable and re-enable preemption. Faster than semaphores and useful if you need frequent and short protected access. Prevents all other tasks from running, not just the tasks contending for the resource
Wind River Linux
-
A Linux-based system as an alternative to a VxWorks-based system. Provided with both PREEMPT-RT and a co-kernel
-
VxWorks for hard real-time and Wind River Linux for soft/hard-ish real-time systems
Embedded Linux and Multiprocessor scheduling
Embedded development
-
One or more targets and one host computer (usually)
-
Host computer – program and compile code (more powerful)
-
Target computer – embedded computer that runs the code
Embedded Linux
-
Advantages – customizable and configurable, runs on a large number of architectures, large and active user community, free
-
Disadvantages – not hard real-time by default, too many alternatives, difficult, hardware capability issues
Consumer grade ARM boards
-
Typically uses an ARM SoC CPU in the range of single-core 700MHz to quad-Core 1.8 GHz – similar CPUs that are used in mobile phones
-
Hybrid between a microcontroller and a desktop computer – HDMI, VGA, USB, Ethernet, UART, SPI, I2C, GPIO, PWM, ADC and external interrupts
-
Advantages – inexpensive, small, low power consumption, working Linux distribution is usually supplied, open hardware platform, low-level and high-level I/O
-
Disadvantages – not real-time by default, unknown reliability, intended for hobbyists, not industry
Target vs Host
- Use only the target – easy to set up, but less powerful and with limited development tools
Develop on host, build on target
- Might not have necessary build tools on target. Host has good development tools
Develop and build on host, transfer to target
-
Transfer binaries built on host to target
-
Will not work if host and targets are too different (different shared libraries)
-
Difficult to develop and test functionality that only the target has
-
Cross-compile – a compiler that runs on one computer and makes binaries for another. Challenging to set up but very good solution when it works.
-
Cross-compile and Flash – for hardware that does not have removable storage, network etc. use an external tool or debugger, to flash some content to some internal flash memory (JTAG)
-
A cross-compiler toolchain for C consists of at least a C library, a compiler and binary tools (linker, assembler, etc.)
-
Easier ways to get a cross toolchain – install a pre-compiled toolchain, use a tool to make the toolchain, use a build system
Embedded Linux build systems
-
compromise between using a binary distribution and having to make everything from scratch
-
Buildroot – builds a root file-system image, no binary packages, much simpler to use, understand and modify
Best is to use whatever tool offers the best support for your hardware
uClibc
-
Used as an alternative C library for embedded Linux, instead of glibc
-
No ABI compatibility between versions. Not everything in glibc is supported in uClibc.
Embedded Linux vs QNX/VxWorks
-
The setup of an embedded Linux system and development environment is provided when using commercial systems such as QNX or VxWorks (cross-toolchain, tool to configure and build a file-system image)
-
Freeness of Linux comes at a cost
Multiprocessor scheduling
-
Multicore computers have multiple CPU cores on the same physical CPU chip; the cores share memory and usually the operating system
-
Assignment of processes to cores:
-
Static assignment to a core – each process is permanently assigned to a core, one queue of processes for each core, minimal scheduling overhead, may result in one core being idle while another one is very busy
-
Global queue – each process is added to a global queue of processes. When a processor core is idle it will select the next process and start running it. No core is idle as long as there is available work. Preempted or blocked processes are likely to run on a different core next, which makes caching less effective
-
Dynamic load balancing – each process is assigned to a core – may be moved, depending on load
-
How to assign processes:
-
Master/slave architecture – kernel scheduling functions always run on the same master core, slave cores send requests to the master. Simple and requires little changes from uniprocessor scheduling. Master can become a bottleneck and a single point of failure
-
Peer architecture – kernel scheduling functions can run on any core. Each core schedules itself with available processes. Must make sure that no two cores run the same process.
-
A process can get full access to a processor core of its own, like VxWorks
-
Real-time on multicore
-
Difficult since another potential uncertainty is introduced
-
Moving a thread from one core to another will cause a longer and more unpredictable delay than normal scheduling
-
From book
- Real-time computing may be defined as that type of computing in which the correctness of the system depends not only on the logical results of the computation but also on the time at which the results are produced.
Windows
Kernel and executive
-
Typical kernel-level tasks are divided into the Kernel and the Executive
-
The Kernel controls how the processors are used – schedules threads, handles exceptions and interrupts, and synchronizes between multiprocessors
-
The Executive provides the core OS services – runs as kernel-level thread. Provides memory management; creates, manages and deletes processes and threads; manages I/O etc.
-
The Kernel is in many ways similar to a microkernel. But since the Executive (and many other components) also runs in kernel mode, Windows is not a microkernel system
-
Other kernel-mode components – the hardware abstraction layer sits between the kernel and the hardware, which means that the kernel does not need to consider the actual hardware. Device drivers – extend the functionality of the Executive, typically used for interacting with specific I/O devices
User-mode processes
-
Special system processes – user-mode services that manage the system, session manager and authentication subsystem
-
Service processes – “server” programs. Only way to run a non-system background program
-
Environment subsystems – different “OS personalities” or environment (Win32, POSIX)
-
User applications – normal user programs (.exe) and libraries (.dll)
Client/server model
-
For OS services
-
Use remote procedure calls, which is a type of inter-process communication
- The caller/client causes a procedure to run in another address space belonging to a server
- Could be another computer or the same computer but in different virtual address space
Windows objects
-
Windows is written in C but has an object-based design
-
Some of the internal system components of Windows exist as objects. Things like processes, threads, etc. are objects. The object manager keeps track of all the objects in the system
-
Objects have both data (attributes) and procedures (services). Attributes are only accessible through services; thus it is possible to protect against unauthorized control
Windows processes
-
Implemented as a process object
-
Virtual address space and other attributes
-
Contains one or more threads that execute code and can be scheduled by the Kernel
-
Has a handle table with references to the different objects known to the process
Windows threads
-
Threads are similar to other systems – scheduled by the Kernel, share address space with other threads in the same process. A thread pool is a collection of worker threads that are available to perform asynchronous callbacks in the background
-
Fibers – one level below threads; one thread can schedule multiple fibers within itself. A fiber runs in the context of the thread that schedules it
-
User-mode scheduling (UMS) – a lightweight mechanism for applications to schedule their own threads. Differs from fibers in that a UMS thread has its own context
-
Both fibers and UMS have potential problems, like one blocked fiber/thread blocking all
-
States – same as the 5-state model except for the extra transition state, which means that a thread is ready to run, but all its resources are not yet available
Windows scheduling
-
Real-time priority class (16 to 31), fixed priorities that never change. Round-robin between threads with same priority
-
Variable priority class (0 to 15), vary depending on use, can get boosted to 15 if starved. Will never become higher than 15
Multiprocessor scheduling
-
Any thread can run on any processor
-
By default, soft affinity is used, which means that the Kernel will try to assign a thread to its previous processor so that cache can be reused
-
Can also be assigned with hard affinity, which means that it will only run on one processor
Windows dispatcher objects are used for synchronization
Concurrency mechanisms
-
Critical section protects critical code – similar to mutex
-
Slim reader-writer – like critical section
-
Condition variables
-
Lock-free synchronization – atomic access to a resource without using a lock/mutex; the hardware guarantees that the atomic operation completes without interference, so a thread cannot be interrupted partway through the update
Windows embedded/real-time – Windows Embedded Compact Edition
-
Familiar for many developers.
-
256 priority levels – guaranteed upper bound on high-priority thread scheduling and on delay in running high-priority interrupt service routines (ISRs)
-
Typical RT functionality:
-
Support for handling priority inversion with priority inheritance
-
Support for nested interrupts to ensure that high-priority events are not delayed
-
Support for 1-ms system tick timing
-
Advanced thread timing and scheduling
-
Support for synchronization objects such as semaphores, mutexes and critical sections
-
Board support packages (BSP) – x86, ARM, MIPS, SH
From book
Core
- The core of Windows NT, consisting of the Kernel and Executive was designed for 32-bit machines and included object-oriented features
Kernel-mode components
-
Executive – contains the core OS services, such as memory management, process and thread management, security, I/O, IPC
-
Kernel – controls execution of processors. The Kernel manages thread scheduling, process switching, exception and interrupt handling, and multiprocessor synchronization. Unlike the rest of the Executive and the user level, the Kernel’s own code does not run in threads
-
Hardware abstraction layer (HAL) – maps between generic hardware commands and responses and those unique to a specific platform. It isolates the OS from platform-specific hardware differences. The HAL makes each computer’s system bus, DMA controller, interrupt controller, system timers, and memory controller look the same to the Executive and Kernel components. It also delivers the support needed for SMP
-
Device drivers – dynamic libraries that extend the functionality of the Executive. These include hardware device drivers that translate user I/O function calls into specific hardware device I/O requests and software components for implementing file systems, network protocols, and any other system extensions that need to run in kernel mode
-
Windowing and graphics system – implements the GUI functions, such as dealing with windows, user interface controls and drawing
Executive modules
- I/O manager, Cache manager, object manager, plug-and-play manager, power manager, security reference monitor, virtual memory manager, process/thread manager, configuration manager, advanced local procedure call (ALPC) facility
User-mode processes
- special system processes, service processes, environment subsystem, user applications
The Windows OS services, the environment subsystems, and the applications are structured using the client/server computing model
Threads and symmetric multiprocessing (SMP)
-
OS routines can run on any available processor, and different routines can execute simultaneously on different processors
-
Multiple threads within the same process may execute on different processors simultaneously
-
Server processes may use multiple threads to process requests from more than one client simultaneously
-
Processes share data and resources with IPC
Windows objects
-
Used in cases where data are intended for user mode access or when data access is shared or restricted
-
Encapsulation – an object consists of one or more items of data, called attributes, and one or more procedures that may be performed on those data, called services
-
Object class and instance – an object class is a template that lists the attributes and services of an object and defines certain object characteristics. The OS can create specific instances of an object class as needed
-
Inheritance – the Executive uses inheritance to extend object classes by adding new features. Every Executive class is based on a base class which specifies virtual methods that support creating, naming, securing and deleting objects. Dispatcher objects are Executive objects that inherit the properties of an event object, so they can use common synchronization methods
-
Polymorphism – API functions are used to manipulate objects of any type
-
Process and thread management
-
An application consists of one or more processes.
-
A process provides the necessary resources to execute a program. A process has a virtual address space, executable code, open handles to system objects, a security context, a unique process identifier, environment variables, a priority class, minimum and maximum working set sizes, and at least one thread of execution. Each process is started with a single thread, but can create additional threads from any of its threads
-
A thread is the entity within a process that can be scheduled for execution. All threads of a process share its virtual address space and system resources. In addition, each thread maintains exception handlers, a scheduling priority, thread local storage, a unique thread identifier, and a set of structures the system will use to save the thread context until it is scheduled. On a multiprocessor computer, the system can simultaneously execute as many threads as there are processors on the computer
-
A job object allows groups of processes to be managed as a unit. Job objects are nameable, securable, shareable objects that control attributes of the processes associated with them
-
A thread pool is a collection of worker threads that efficiently execute asynchronous callbacks on behalf of the application. Used to reduce the number of application threads and provide memory management of the worker threads
-
A fiber is a unit of execution that must be manually scheduled by the application. A fiber has state information about the thread's stack, a subset of its registers, and the fiber data provided during fiber creation
-
User-mode scheduling (UMS) is a lightweight mechanism that applications can use to schedule their own threads
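The thread-pool idea can be illustrated with Python's `ThreadPoolExecutor` (a hedged analogue, not the Windows API): a small, fixed set of worker threads executes queued callbacks on behalf of the application instead of the application creating one thread per work item.

```python
from concurrent.futures import ThreadPoolExecutor

# Hedged sketch: Python's thread pool as an analogue of the Windows thread
# pool described above. 'callback' stands in for an asynchronous work item.
def callback(n):
    return n * n

# four worker threads service eight queued callbacks
with ThreadPoolExecutor(max_workers=4) as pool:
    squares = list(pool.map(callback, range(8)))

print(squares)
```

The point is the same as in the notes: the pool bounds the number of application threads and manages their lifetime for you.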
Characteristics of Windows processes
-
Implemented as objects
-
Processes can be created as a new process or as a copy of an existing process
-
An executable process may contain one or more threads
-
Both process and thread objects have built-in synchronization capabilities
Concurrency mechanisms
- Provide synchronization among threads
-
Executive dispatcher objects, user-mode critical sections, slim reader-writer locks, condition variables and lock-free operations
- Synchronization objects – notification event, synchronization event, mutex, semaphore, waitable timer, file, process, thread
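As a hedged illustration (Python's threading primitives as rough analogues of the dispatcher objects, not the Windows API itself): a manual-reset `Event` behaves like a notification event, releasing all waiters, while an auto-reset synchronization event can be emulated with a semaphore that wakes exactly one waiter per signal.

```python
import threading

mutex = threading.Lock()    # mutex object: one owner at a time
notify = threading.Event()  # notification event: manual-reset, wakes ALL waiters

results = []

def waiter(name):
    notify.wait()           # blocks until the event is signaled
    with mutex:             # mutex protects the shared list
        results.append(name)

threads = [threading.Thread(target=waiter, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
notify.set()                # a single signal releases every waiter
for t in threads:
    t.join()

# synchronization event (auto-reset) emulated with a semaphore:
# each release() wakes exactly one acquire()r
auto = threading.Semaphore(0)
auto.release()
print(sorted(results), auto.acquire(timeout=1))
```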
Memory management
-
The Windows virtual memory manager controls how memory is allocated and how paging is performed. The memory manager is designed to operate over a variety of platforms and to use page sizes ranging from 4 Kbytes to 64 Kbytes
-
Virtual address map – on 32-bit platforms, each Windows user process sees a separate 32-bit address space, allowing 4 GB of virtual memory per process. By default, half of this is reserved for the OS, so each user process actually has 2 GB of available virtual address space, and all processes share most of the upper 2 GB of system space when running in kernel mode
-
Swapping – Paging will hold items that haven’t been accessed in a long time, whereas swapping holds items that were recently taken out of memory
Scheduling
-
Windows implements a preemptive scheduler with a flexible system of priority levels that includes RR scheduling within each level and, for some levels, dynamic priority variation based on current thread activity. Threads, rather than processes, are the unit of scheduling in Windows
-
Priorities in Windows are organized into two classes, real-time and variable, each consisting of 16 priority levels. Each level has a FIFO queue.
-
Soft affinity – dispatcher tries to assign a ready thread to the same processor it last ran on
-
Hard affinity – thread executes only on certain processors
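Hard affinity has a POSIX-side analogue that can be shown directly; a hedged, Linux-only sketch using `os.sched_setaffinity` (the counterpart of Windows `SetThreadAffinityMask`). Soft affinity has no call here: it is a scheduler heuristic, not something the program requests.

```python
import os

# Hedged sketch (Linux-only): pin this process to a single CPU, i.e. hard
# affinity - the process may then run only on the listed CPUs.
allowed = os.sched_getaffinity(0)  # CPUs this process may currently use
cpu0 = min(allowed)
os.sched_setaffinity(0, {cpu0})    # hard affinity: pin to a single CPU
print(os.sched_getaffinity(0))     # the mask now contains only cpu0
os.sched_setaffinity(0, allowed)   # restore the original mask
```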
Android
-
Based on the Linux kernel, but with some changes: instead of glibc it uses its own standard C library, Bionic, and it does not use the X Window System that most Linux graphical user interfaces use
-
Regular Linux programs and libraries are not directly compatible with Android
-
Uses ahead-of-time compilation instead of just-in-time compilation
Android applications
-
Consists of one or more instances of four types of components:
-
Activities – a screen visible as a user interface. An application can have multiple activities, each for a separate task. Activities can be exposed externally, so they can be started by another application
-
Services – a background task. Can continue to run even when another application is used.
-
Content providers – An interface to application data. Data can be stored in a file, SQLite database etc.
-
Broadcast receivers – Responds to system wide broadcast announcements. Events from other applications or the system can have an effect on the application
Inter-process communication
-
Does not use the typical Linux IPC mechanisms
-
Has its own addition to the Linux kernel called Binder, which implements a lightweight remote procedure call (RPC) capability. Allows communication between processes running on two different virtual machines
Android and real-time
- Android is probably not the first choice for a real-time system, but it could be useful in some situations: a mixed-criticality system with a touch-screen interface, or a personal (always carried) device that needs to do something in real time
From book
-
Standard OS for IoT
-
Defined as a software stack that includes the OS kernel, middleware, and key applications
- Complete software stack – not just OS. Android is a form of embedded Linux
-
All applications that the user interacts with directly are part of the application layer
-
Open-source android architecture
-
Every Android application runs in its own process, with its own instance of the Dalvik virtual machine
-
Linux kernel
-
The OS kernel for Android is similar to, but not identical with, the standard Linux kernel distribution. One noteworthy change is that the Android kernel lacks drivers not applicable in mobile environments, making the kernel smaller
-
Relies on its Linux kernel for core system services such as security, memory management, process management, network stack and driver model
-
The kernel acts as an abstraction layer between the hardware and the rest of the software stack and enables Android to use the wide range of hardware drivers that Linux supports
-
-
Process and thread management
-
Four types of application components
-
Activities – correspond to a single screen visible as a user interface
-
Services – perform background operations that take a considerable amount of time to finish
-
Content providers – act as an interface to application data that can be used by the application
-
Broadcast receivers – respond to system-wide broadcast announcements
-
-
The default allocation is a single process with a single thread per application. To avoid slowing down the user interface when slow and/or blocking operations occur in a component, the developer can create multiple threads within a process and/or multiple processes within an application
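The idea can be sketched in plain Python (hedged: illustrative names only, not the Android API, which would use e.g. a background Service): a slow or blocking operation runs in a worker thread so the main ("UI") thread stays responsive.

```python
import threading
import time

result = {}

def slow_operation():
    time.sleep(0.1)          # stands in for I/O or heavy computation
    result["value"] = 42

worker = threading.Thread(target=slow_operation)
worker.start()
# the main thread is free to handle "UI" events here while the worker runs
worker.join()                # collect the background result
print(result["value"])
```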
-
-
IPC
-
Does not use pipes, shared memory, messages, sockets, semaphores or signals
-
Binder – provides a lightweight remote procedure call (RPC) capability that is efficient in terms of both memory and processing requirements, and is well suited to the requirements of an embedded system
- The Binder is used to mediate all interaction between two processes. A component in one process (the client) issues a call. This call is directed to the Binder in the kernel, which passes the call on to the destination process (the service). The return goes through the Binder and is delivered to the calling process.
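The mediation described above can be sketched as a toy model (all names are illustrative, not the real Binder API): the client issues a call to a router, which forwards it to the registered service and passes the reply back, so client and service never call each other directly.

```python
class Binder:
    """Toy router standing in for the Binder kernel component."""

    def __init__(self):
        self.services = {}

    def register(self, name, handler):
        self.services[name] = handler

    def transact(self, name, method, args):
        # the binder mediates: look up the destination and forward the call
        return self.services[name](method, args)

def adder_service(method, args):
    # a trivial "service" living, conceptually, in another process
    if method == "add":
        return sum(args)
    raise ValueError("unknown method")

binder = Binder()
binder.register("adder", adder_service)
# client side: the call goes through the binder, not to the service directly
reply = binder.transact("adder", "add", [2, 3])
print(reply)
```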
-
Comparison of OS
No OS
-
When to use
-
Simple applications
-
When the system is only doing one (or very few) things simultaneously
-
-
Advantages
-
Full control
-
Can be deterministic
-
No CPU time or memory is spent on an OS, which matters on low-powered microcontrollers
-
Some low-level operations are much simpler than with an OS
-
-
Disadvantages
-
Tricky to do multiple things
-
Often need to read CPU datasheet to use features
-
Often not scalable
-
FreeRTOS
-
When to use
-
Simple applications
-
When you need multitasking etc. on a simple microcontroller
-
-
Advantages
-
The advantages of bare metal combined with simple operating-system capabilities
-
Operating system for hardware that can’t run other operating systems
-
-
Disadvantages
- For better or worse, a different type of system than the others we have covered
Android
-
When to use
-
On hand-held devices
-
Graphical systems on ARM CPUs
-
-
Advantages
-
Wide adoption
-
Many apps, libraries etc. available
-
Linux kernel customized and minimized for hardware
-
Every app runs within its own virtual machine
-
-
Disadvantages
- Not intended for real-time or control systems
Linux
-
When to use
-
Soft real-time
-
Developers have Linux experience
-
When many different programs and libraries are needed/useful
-
-
Advantages
-
Support for many platforms and system types
-
Large user community
-
Scalable and configurable
-
Many free and open source tools, programs, libraries etc.
-
-
Disadvantages
-
Not hard real-time
-
Unpredictable latency during periods when the kernel is non-preemptible
-
Costly to spend time setting up the system instead of buying something that just works
-
Real-time Linux
-
When to use
-
Linux with hard real-time
-
Mixed criticality systems
-
-
Advantages
-
A fully functional Linux operating system with the addition of hard real-time
-
Separation between real-time tasks and regular Linux processes
-
-
Disadvantages
-
Real-time tasks cannot fully use the Linux system
-
Difficult to set up; experience with Linux is not directly applicable for development with real-time tasks
-
Windows
-
When to use
-
Developers have Windows experience
-
Existing Windows code
-
-
Advantages
-
Windows Embedded Compact has capabilities suitable for hard real-time
-
Many tools, programs, libraries etc.
-
-
Disadvantages
-
Cost
-
Consumer versions are not suitable for embedded/real-time
-
QNX
-
When to use
-
When a proven, commercial real-time system is required
-
Distributed control systems
-
-
Advantages
-
Hard real-time
-
Specialized development tools
-
Microkernel
-
Customizable
-
-
Disadvantages
-
Cost
-
Environment is unfamiliar for many
-
VxWorks
-
When to use
-
When a proven, commercial real-time system is required
-
Arguably more lightweight than QNX
-
-
Advantages
-
Hard real-time
-
Specialized development tools
-
Microkernel
-
Customizable
-
Same vendor provides commercial Linux alternative
-
-
Disadvantages
-
Cost
-
Environment is unfamiliar for many
-
Uncategorized
Microkernel vs monolithic kernel
A microkernel differs from a monolithic kernel in that only the most basic functions are available as system calls; the rest of the kernel is broken down into separate processes, known as servers. This structure is implemented in many real-time operating systems. The practical purpose of using a microkernel is to sacrifice some performance for reliability. In a monolithic structure, a service is obtained by a single system call. In a microkernel structure, a service is obtained through inter-process communication (IPC), which causes overhead due to the required context switches. However, since the functionality is divided into different servers, if one server fails, the others continue to work. In critical applications, as control systems often are, this is an extremely important feature.
Utilization test example
Is this task set schedulable?
Task | Execution time | Period |
A | 2 | 8 |
B | 3 | 10 |
C | 4 | 16 |
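A worked check, assuming rate-monotonic priorities and the Liu–Layland sufficient utilization test:

```latex
U = \sum_{i=1}^{3} \frac{C_i}{T_i} = \frac{2}{8} + \frac{3}{10} + \frac{4}{16} = 0.80
\qquad
n\left(2^{1/n} - 1\right) = 3\left(2^{1/3} - 1\right) \approx 0.780
```

Since $0.80 > 0.780$ the utilization test fails. Because the test is sufficient but not necessary, this is inconclusive on its own; a response-time analysis can still settle schedulability.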
Response time analysis example
Use earliest deadline first (EDF)
Task | Execution time | Period | Deadline |
A | 2 | 8 | 8 |
B | 3 | 10 | 7 |
C | 4 | 16 | 16 |
Shortest deadline is $D_B = 7$, so for the response-time analysis the tasks are assigned fixed priorities deadline-monotonically: B highest, then A ($D_A = 8$), then C ($D_C = 16$).
Response time is $R_i = C_i + I_i$
where $I_i = \sum_{j \in hp(i)} \lceil R_i / T_j \rceil C_j$ is the interference from higher-priority tasks, found by iterating $w_i^{n+1} = C_i + \sum_{j \in hp(i)} \lceil w_i^n / T_j \rceil C_j$ until it converges.
Finding convergent response times:
$R_B = 3 \le 7$
$R_A$: $w^0 = 2$, $w^1 = 2 + \lceil 2/10 \rceil \cdot 3 = 5$, $w^2 = 2 + \lceil 5/10 \rceil \cdot 3 = 5$, so $R_A = 5 \le 8$
$R_C$: $w^0 = 4$, $w^1 = 4 + \lceil 4/10 \rceil \cdot 3 + \lceil 4/8 \rceil \cdot 2 = 9$, $w^2 = 4 + 3 + 4 = 11$, $w^3 = 4 + 6 + 4 = 14$, $w^4 = 4 + 6 + 4 = 14$, so $R_C = 14 \le 16$
All response times are within the deadlines, so the task set is schedulable.
Graphical EDF
Using the same task set as above.
Priority inversion examples
Higher priority = higher value
Symbols:
X
- Normal execution
Q
- Execution with resource $Q$
V
- Execution with resource $V$
B
- Execution with both resources ($Q$ and $V$)
-
- Sleeping (preempted)
*
- Waiting on resource
Inheritance example
Task set:
Task | Arrival | Priority | Execution |
A | 0 | 1 | XQQQX |
B | 4 | 2 | XXXXXXXXX |
C | 2 | 3 | XQX |
Without priority inheritance
|1
|0|1|2|3|4|5|6|7|8|9|0|1|2|3|4|5|6|
A |X|Q|-|Q|-|-|-|-|-|-|-|-|-|Q|-|-|X|
B | | | | |X|X|X|X|X|X|X|X|X| | | | |
C | | |X|*|*|*|*|*|*|*|*|*|*|*|Q|X| |
With priority inheritance
|1
|0|1|2|3|4|5|6|7|8|9|0|1|2|3|4|5|6|
A |X|Q|-|Q|Q|-|-|-|-|-|-|-|-|-|-|-|X|
B | | |-|-|-|-|-|X|X|X|X|X|X|X|X|X| |
C | | |X|*|*|Q|X| | | | | | | | | | |
Priority ceiling example
Task set:
Task | Arrival | Priority | Execution |
A | 0 | 1 | XQQQX |
B | 4 | 2 | XXXXXXXXX |
C | 2 | 3 | XQBVX |
D | 4 | 4 | XVBQX |
Without ceiling protocol (nor inheritance)
|1 |2
|0|1|2|3|4|5|6|7|8|9|0|1|2|3|4|5|6|7|8|9|0|1|2|3|
A |X|Q|-|Q|-|-|-|-|-|-|-|-|-|-|-|Q|-|-|-|-|-|-|-|X|
B | | | | |-|-|X|X|X|X|X|X|X|X|X| | | | | | | | | |
C | | |X|*|*|*|*|*|*|*|*|*|*|*|*|*|-|-|-|Q|B|V|X| |
D | | | | |X|V|*|*|*|*|*|*|*|*|*|*|B|Q|X| | | | | |
With ceiling protocol (ICPP)
|1 |2
|0|1|2|3|4|5|6|7|8|9|0|1|2|3|4|5|6|7|8|9|0|1|2|3|
A |X|Q|Q|Q|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|X|
B | | | | |-|-|-|-|-|-|-|-|-|-|X|X|X|X|X|X|X|X|X| |
C | | |-|-|-|-|-|-|-|X|Q|B|V|X| | | | | | | | | | |
D | | | | |X|V|B|Q|X| | | | | | | | | | | | | | | |