TTK4147: Real-time Systems
Introduction
Real-time computing
- Has a guaranteed or predictable response time
- Worst-case execution time is the most interesting metric
- Optimized for predictability, while general-purpose computing is optimized for speed and throughput
- Not really possible with general-purpose operating systems
- Not a fancy way of saying fast computing
- Not irrelevant even if the hardware is fast enough
Constraints
| Type | Missed deadline |
| --- | --- |
| Hard real-time | total system failure |
| Firm real-time | results have no value |
| Soft real-time | results have less value |
Task triggers
| Type | Tasks performed |
| --- | --- |
| Periodic | at a regular time interval |
| Sporadic | when an event occurs; limits on how often the event can occur |
| Aperiodic | when an event occurs; no limits on when it can occur |
General purpose vs. real-time
| General purpose | Real-time |
| --- | --- |
| Optimized for speed and throughput | Optimized for predictability |
| Usually comes with a wide range of applications, tools, etc. | Usually a minimized system without anything unnecessary |
| Mouse, keyboard and screen | Serial interface or SSH (specialized I/O) |
Real-time operating systems (RTOS)
FreeRTOS
- Minimalistic operating system for microcontrollers, which often do not run an operating system otherwise
- Runs on many platforms
VxWorks and QNX
- Proprietary operating systems for embedded and real-time systems
- Stable and proven platforms
- Expensive licenses
Real-time variants of Linux
- Various methods for improving the real-time capabilities of Linux
- RT-Linux, RTAI, Xenomai, PREEMPT_RT
From book
Real-time system definition
Any system in which the time at which output is produced is significant. This is usually because the input corresponds to some movement in the physical world, and the output has to relate to that same movement. The lag from input time to output time must be sufficiently small for acceptable timeliness.
- Cohesion is concerned with how well a module holds together – its internal strength.
- Coupling, by comparison, is a measure of the interdependence of program modules. If two modules pass control information between them, they are said to possess high (or tight) coupling.
Within all design methods, a good decomposition is one that has strong cohesion and loose coupling. This principle is equally true in sequential and concurrent programming domains.
Various types of real-time systems: reactive systems, time-aware systems, time-triggered and event-triggered systems
Basic characteristics of a general real-time or embedded computer system
Real-time control, concurrent control of separate system components, low-level programming, support for numerical computation, largeness and complexity, extreme reliability and safety
Stages of design and implementation
requirements specification → system design → detailed design → coding → testing
Real-time language should be secure, readable, flexible, simple, portable and efficient
Real-time definitions overview
Hardware and Operating systems
Microprocessors
Systems-on-Chip (SoC) are typically used for more complex microcontrollers; they also integrate graphics processing units, digital signal processors, FPGAs, etc.
4 types of instructions
- Processor-memory – data transferred between the processor and memory
- Processor-I/O – data transferred between the processor and an I/O module
- Data processing – arithmetic or logic operation on data
- Control – alters the sequence of execution, e.g. a jump instruction
Interrupts
- Program – generated by some condition that results from an instruction execution
- Timer – generated by a timer within the processor that can be programmed to trigger at a regular interval or after a certain amount of time has passed
- I/O – generated by an I/O controller to signal completion of an operation or some other condition
- Hardware failure – generated by a failure in the hardware
Can have sequential interrupts or nested interrupts (interrupts within interrupts)
Memory hierarchy
- Inboard memory
  - Registers
  - Cache
- Outboard storage
  - Magnetic disk
  - CD-ROM, CD-RW, DVD-RW, DVD-RAM, Blu-Ray
- Off-line storage
  - Magnetic tape
Cache
- Write policy determines when changes in the cache are written back to main memory
  - Write-through – update main memory every time the cache block is changed
  - Write-back – update main memory only when a modified cache block is replaced – leaves main memory temporarily in an obsolete state
- Caches are divided into levels to exploit locality (lower levels = faster).
CPU ↔ L1 cache ↔ L2 cache ↔ L3 cache ↔ Main memory
Transferring data from I/O to main memory
- Continuous polling – the CPU repeatedly checks the I/O module's status and moves the data itself
- Interrupt driven – the I/O module interrupts the CPU when it is ready, and the CPU moves the data
- Direct memory access (DMA) – the I/O module writes to the DMA module instead of the CPU; the CPU is only informed when the transfer is complete
Reasons for having an operating system
- Convenience – the OS is the layer between computer hardware and user; interact with the OS instead of the hardware
  - Instruction set architecture (ISA) – the machine language instructions that a computer can perform
  - Facilitation of concurrency and scheduling
  - Abstraction of hardware and stability
  - Hardware to OS: application binary interface (ABI) – user programs and libraries can access OS and hardware (ISA) services through system calls
  - OS to libraries: application programming interface (API) – provides functions that user programs can use to access the OS and hardware through system calls, plus other commonly used functionality
  - Libraries come with the OS but are not strictly part of the OS
- Efficiency – the OS allows efficient use of the computer system's resources – scheduling, memory control, I/O access
- Ability to evolve
  - Integrate updated hardware
  - Development of new services
  - Upgrading the OS and programs
From book
- Processor: controls the operation of the computer and performs its data processing functions. Where there is only one processor, it is often referred to as the central processing unit (CPU).
  - One of the processor's functions is to exchange data with memory. It makes use of two internal registers: a memory address register (MAR), which specifies the address in memory for the next read or write, and a memory buffer register (MBR), which contains the data to be written into memory or receives the data read from memory.
- Main memory: stores data and programs. This memory is typically volatile.
- I/O modules: move data between the computer and its external environment.
- System bus: provides communication among processors, main memory, and I/O modules.
- Microprocessors are now multiprocessors; each chip (called a socket) contains multiple processors (called cores), each with multiple levels of large memory caches, and multiple logical processors sharing the execution units of each core.
- Info about the kernel: http://www.linfo.org/kernel.html
- The classic microprocessor is giving way to the system on a chip (SoC), where not just the CPUs and caches are on the same chip, but also many of the other components of the system, such as DSPs, GPUs, I/O devices (such as radios and codecs), and main memory.
- The replacement algorithm chooses, within the constraints of the mapping function, which block to replace when a new block is to be loaded into the cache and the cache already has all slots filled with other blocks.
- An OS is a program that controls the execution of application programs and acts as an interface between applications and the computer hardware. It has three objectives:
  - Convenience – an OS makes a computer more convenient to use
  - Efficiency – an OS allows the computer system resources to be used in an efficient manner
  - Ability to evolve – an OS should be constructed in such a way as to permit the effective development, testing, and introduction of new system functions without interfering with service
- A monolithic kernel is implemented as a single process, with all elements sharing the same address space.
- A microkernel architecture assigns only a few essential functions to the kernel, including address spaces, IPC, and basic scheduling. Other OS services are provided by processes, sometimes called servers, that run in user mode and are treated like any other application by the microkernel.
Multiprogramming and memory
Multiprogramming:
- Maximize processor utilization, provide reasonable response times, let different tasks use different resources, allow tasks to communicate with each other
Process
- A process is an entity that can be assigned to and executed by a processor
- Each process has its own set of data
- Dispatcher – tells the OS which process to start running
- 5-state process model:
  - Running – process currently executing on the CPU
  - Ready – process is in a queue waiting to be executed
  - Blocked – a running process has to wait for an event and remains blocked until the event occurs; there is a separate queue for blocked processes
  - New – process has been created but not yet added to the pool of executable processes
  - Exit – process has been removed from the pool of executable processes, but still exists in memory
- Swapping – when there is no room for a process in main memory, it is swapped to disk
- The seven-state process model adds suspend states for both Ready and Blocked
- Process image
  - User data – the modifiable part of the user space. May include program data, a user stack area, and programs that may be modified
  - User program – the program to be executed
  - Stack – each process has one or more LIFO stacks associated with it. A stack is used to store parameters and calling addresses for procedure and system calls
  - Process control block – data needed by the OS to control the process
- Modes of execution – necessary to protect OS data structures from interference by user programs
  - User mode – less-privileged mode; user programs typically execute in this mode
  - System mode – more-privileged mode, also referred to as control mode or kernel mode; the kernel of the operating system executes in this mode
- Process creation
  - The OS assigns a process identifier to the new process, allocates memory space for the process, initializes the process control block, and adds the new process to the appropriate list (ready or suspended)
Switching
- Process switch: change which process is running
- Mode switch: switch between user and kernel mode, e.g. when an interrupt occurs; if no interrupts are pending, the processor simply continues running the program
Threads vs processes
- The part that owns the resources and data is called a process or task
- The part or parts that follow an execution path are called threads
- Multithreading means that the operating system supports multiple threads within each process
- Threads are faster than processes – both for switching and for creation/exit
- Threads share resources, which makes it easier to communicate, synchronize, etc.
User-level threads
- The application performs the thread management and scheduling
- The kernel does not know that the threads exist
- Advantages: fast thread switch, portability, custom scheduling per task
Kernel-level threads
- The kernel schedules each of the threads in the process
- Advantages: if one thread of a process is blocked, the kernel can schedule another to run; different threads of a process can run on different CPU cores
User and kernel combined
- All threads are made in user space; multiple user-level threads are mapped onto a number of kernel threads
Memory management
- Protection
  - No process should modify the memory another process is currently using
  - No normal program should be able to access any memory belonging to the kernel
  - The processor should have hardware functionality for enforcing memory protection
- Sharing
  - Processes need to co-operate – protection needs to be flexible enough to allow inter-process communication
- Organization – logical and physical
- Placement algorithms
  - Best fit – chooses the block that is closest in size to the request
  - First fit – begins to scan memory from the beginning and chooses the first available block that is large enough
  - Next fit – begins to scan memory from the location of the last placement and chooses the next available block that is large enough
- Paging and segmentation – both divide the memory of a program into smaller parts, and both introduce some variant of a table that keeps track of where the different parts of a process are stored
  - Paging divides the data of a process into fixed-size pages that are stored in fixed-size frames in memory
  - Segmentation divides the data of a process into variable-length segments and stores these in memory
  - Combination – segmentation from the user/programmer's point of view, paging from the memory's point of view
- Virtual memory – secondary memory can be accessed as if it were main memory
  - A process accesses memory with a virtual address, which can be in either main or secondary memory
  - Some pages/segments of a process can be in main memory while others are in secondary memory
  - Advantages: "unlimited" memory; only data that is actually used is placed in main memory
  - Disadvantages: thrashing, meaning that the system spends most of its time moving pages/segments between main and secondary memory
  - Problem for real-time – if a part of a real-time process is suddenly in secondary memory, it can give an unexpected delay
From book
- When the OS creates a process (child) at the explicit request of another process (parent), the action is referred to as process spawning
- When all processes in main memory are Blocked, the OS can suspend one process by putting it in the Suspend state and transferring it to secondary memory. The same applies to Ready.
- A process image consists of user data, user program, stack and process control block (process id, processor state information, process control information)
- Functions of an OS kernel:
  - Process management – creation and termination, scheduling and dispatching, switching, synchronization and support for IPC, management of process control blocks
  - Memory management – allocation of address space to processes, swapping, page and segment management
  - I/O management – buffer management, allocation of I/O channels and devices to processes
  - Support functions – interrupt handling, accounting, monitoring
- Process switching may occur any time the OS has gained control from the currently running process, caused by events like a clock interrupt, an I/O interrupt, or a memory fault
- In a multiprocessor system, more than one process can be in the running state
- Two main characteristics of a process:
  - Resource ownership – process or task – a virtual address space to hold the process image. A process may from time to time be allocated control or ownership of resources (main memory, I/O or files). The OS performs a protection function to prevent unwanted interference between processes with respect to resources.
  - Scheduling/execution – lightweight process – follows an execution path through one or more programs. May be interleaved with other processes. Thus, a process has an execution state and a dispatching priority and is the entity that is scheduled and dispatched by the OS.
- Threads within the same process can communicate with each other without invoking the kernel, because they share memory and files
- Any alteration of a resource by one thread affects the environment of the other threads in the same process. It is therefore necessary to synchronize the activities of the various threads so that they do not interfere with each other or corrupt data structures.
- User-level threads (ULT) – all of the work of thread management is done by the application, and the kernel is not aware of the existence of threads (it only sees the process)
- Advantages of user-level threads over kernel-level threads (KLT):
  - Thread switching does not require kernel-mode privileges, because all of the thread management data structures are within the user address space of a single process. This saves the overhead of mode switches (user to kernel and back)
  - Scheduling can be application specific. One application may benefit most from round-robin scheduling while another benefits from priority-based scheduling.
  - ULTs can run on any OS. No changes are required to the underlying kernel to support ULTs.
- Disadvantages of ULTs compared to KLTs:
  - In a typical OS, many system calls are blocking. As a result, when a ULT executes a blocking system call, not only is that thread blocked, but all of the threads within the process are blocked
  - In a pure ULT strategy, a multithreaded application cannot take advantage of multiprocessing. The kernel assigns one process to only one processor at a time.
- KLTs are threads within a process that are maintained by the kernel. Because they are recognized by the kernel, multiple threads within one process can execute in parallel on a multiprocessor, and the blocking of one thread does not block the entire process. However, a mode switch is required to switch from one thread to another.
- Multithreaded native applications are characterized by having a small number of highly threaded processes
- Multiprocess applications are characterized by the presence of many single-threaded processes
- A process relates to resource ownership; a thread relates to program execution
- Memory management terms
  - Frame – a fixed-length block of main memory
  - Page – a fixed-length block of data that resides in secondary memory. A page of data may temporarily be copied into a frame of main memory
  - Segment – a variable-length block of data that resides in secondary memory. An entire segment may temporarily be copied into an available region of main memory (segmentation), or the segment may be divided into pages which can be individually copied into memory (combined segmentation and paging)
  - Internal fragmentation – wasted space internal to a partition, due to the fact that the block of data loaded is smaller than the partition
  - External fragmentation – memory that is external to all partitions becomes increasingly fragmented; lots of small holes in memory, and memory utilization declines
- Memory management techniques
  - Fixed partitioning – main memory is divided into a number of static partitions at system generation time. A process may be loaded into a partition of equal or greater size. Simple, but inefficient use of memory due to internal fragmentation
  - Dynamic partitioning – partitions are created dynamically, so that each process is loaded into a partition of exactly the same size as that process. No internal fragmentation and more efficient use of main memory, but inefficient use of the processor due to the need for compaction to counter external fragmentation
  - Simple paging – main memory is divided into a number of equal-size frames. Each process is divided into a number of equal-size pages of the same length as the frames. A process is loaded by loading all of its pages into available, not necessarily contiguous, frames. No external fragmentation, but a small amount of internal fragmentation.
  - Simple segmentation – each process is divided into a number of segments. A process is loaded by loading all of its segments into dynamic partitions that need not be contiguous. No internal fragmentation, improved memory utilization and reduced overhead compared to dynamic partitioning, but external fragmentation
  - Virtual memory paging – as with simple paging, except that it is not necessary to load all of the pages of a process. Nonresident pages that are needed are brought in later automatically. No external fragmentation, higher degree of multiprogramming, large virtual address space, but the overhead of complex memory management
  - Virtual memory segmentation – as with simple segmentation, except it is not necessary to load all of the segments of a process. Nonresident segments that are needed are brought in later automatically. No internal fragmentation, higher degree of multiprogramming, large virtual address space, protection and sharing support, but the overhead of complex memory management
- Paging – main memory is divided into many small equal-size frames. Each process is divided into frame-size pages. A smaller process requires fewer pages; a larger process requires more. When a process is brought in, all of its pages are loaded into available frames, and a page table is set up. This approach solves many of the problems with partitioning.
- Segmentation – a process is divided into a number of segments that need not be of equal size. When a process is brought in, all of its segments are loaded into available regions of memory, and a segment table is set up
- Virtual memory terminology
  - Virtual memory – a storage allocation scheme in which secondary memory can be addressed as though it were part of main memory. The addresses a program may use to reference memory are distinguished from the addresses the memory system uses to identify physical storage sites, and program-generated addresses are translated automatically to the corresponding machine addresses. The size of virtual storage is limited by the addressing scheme of the computer system and by the amount of secondary memory available, not by the actual number of main storage locations.
  - Virtual address – the address assigned to a location in virtual memory to allow that location to be accessed as though it were part of main memory
  - Virtual address space – the virtual storage assigned to a process
  - Address space – the range of memory addresses available to a process
  - Real address – the address of a storage location in main memory
- Virtual memory paging – not all pages of a process need be in main memory frames for the process to run. Pages may be read in as needed. Reading a page into main memory may require writing a page out to secondary memory
- Virtual memory segmentation – not all segments of a process need be in main memory for the process to run. Segments may be read in as needed. Reading a segment into main memory may require writing one or more segments out to secondary memory
- Thrashing – throwing out a piece of memory just before needing it and having to go get it back almost immediately – swapping pieces rather than executing instructions
Uniprocessor scheduling, inter-process communication and FreeRTOS
- Long-term scheduling – decides which processes are allowed to be added to the pool of processes being executed
- Medium-term scheduling – decides which processes are fully or partially swapped into main memory
- Short-term scheduling – decides which of the currently ready processes the CPU should execute; performed by the dispatcher, whose main objective is to allocate processor time to optimize certain aspects of system behavior
Scheduling criteria
- Turnaround time – interval between submission of a process and its completion
- Response time – time from the submission of a request until the start of the response
- Deadlines – when process completion deadlines can be specified, as many deadlines as possible should be met
- Predictability – a given job should run in about the same amount of time and at about the same cost regardless of the load on the system
- Throughput – maximize the number of processes completed per unit of time
- Processor utilization – percentage of time the processor is busy
- Fairness – processes are treated the same, and no process suffers starvation
- Enforcing priorities – favor higher-priority processes
- Balancing resources – if a resource is overused, processes not needing it should be favored
- Priority-based scheduling – each priority gets its own queue; the dispatcher checks higher-priority queues first
Scheduling methods
- First come first served – dispatcher chooses the process that has been in the ready queue the longest
- Round robin – each process is given a time slice
- Shortest process next – shortest expected execution time first
- Shortest remaining time – preemptive version of SPN
- Highest response ratio next – dispatcher chooses a process based on its execution time and the time it has waited in the queue: $R = \frac{w+s}{s}$, where $R$ is the response ratio, $w$ the time waited in the queue, and $s$ the service time
- Feedback – decrease priority each time a process is preempted; does not require knowledge of execution time
FreeRTOS
- Minimalistic operating system for microcontrollers, which often do not run an operating system otherwise
- Runs on many platforms
- Very small OS
- Tasks – regular processes with their own stacks; preemptively scheduled
- Co-routines – intended for use on small processors with severe RAM constraints; share a single stack
- Task model: running, ready, blocked and suspended
- Scheduling
  - Either co-operative or preemptive
  - Scheduler activated by a timer interrupt every clock tick
  - Equal-priority tasks are scheduled with round-robin
- Uses semaphores, mutexes and queues (message passing)
- Task notifications
Inter-process communication – sharing information and synchronization with:
- Semaphores – wait/signal, counting/binary – provide synchronization between threads but do not by themselves protect a shared buffer
- Mutex – a binary semaphore WITH the concept of ownership – only the process/thread that locked the mutex can unlock it
- Message passing – synchronization and exchange of data, blocking/non-blocking
- Race condition – multiple processes or threads read and write the same data locations, and the result depends on their relative timing
From book
- Multiprogramming – the management of multiple processes within a uniprocessor system
- Multiprocessing – the management of multiple processes within a multiprocessor
- Distributed processing – the management of multiple processes executing on multiple, distributed computer systems
- Concurrency encompasses a host of design issues, including communication among processes, sharing of and competing for resources, synchronization of the activities of multiple processes, and allocation of processor time to processes
- Atomic operation – a function or action implemented as a sequence of one or more instructions that appears to be indivisible. The sequence of instructions is guaranteed to execute as a group, or not execute at all, having no visible effect on system state. Atomicity guarantees isolation from concurrent processes.
- Critical section – a section of code within a process that requires access to shared resources, and that must not be executed while another process is in a corresponding section of code
- Deadlock – two or more processes are unable to proceed because each is waiting for one of the others to do something
- Livelock – two or more processes continuously change their states in response to changes in the other process(es) without doing any useful work
- Mutual exclusion – the requirement that when one process is in a critical section that accesses shared resources, no other process may be in a critical section that accesses any of those shared resources
- Race condition – multiple threads or processes read and write a shared data item, and the final result depends on the relative timing of their execution
- Starvation – a runnable process is overlooked indefinitely by the scheduler
- On a uniprocessor, it is sufficient to prevent a process from being interrupted to guarantee mutual exclusion. This capability can be provided in the form of primitives defined by the OS kernel for disabling and enabling interrupts.
- Concurrency mechanisms
  - Semaphore – an integer value used for signaling among processes. Only three operations may be performed on a semaphore, all of which are atomic: initialize, decrement, increment.
  - Binary semaphore – takes only the values 0 and 1
  - Mutex – a binary semaphore with the concept of ownership; only the process/thread that locked the mutex can unlock it
  - Condition variable – a data type that is used to block a process or thread until a particular condition is true
  - Event flags – a memory word used as a synchronization mechanism
  - Messages – two processes exchange information; may also be used for synchronization
  - Spinlock – a mutual exclusion mechanism in which a process executes in an infinite loop waiting for the value of a lock variable to indicate availability
- In message passing, both sender and receiver can be blocking or non-blocking
  - A non-blocking receive avoids the problem where a message might be lost, or a process fails before it sends an anticipated message; it prevents the receiving process from being blocked indefinitely
- Indirect addressing – messages are sent to a shared data structure consisting of queues that can temporarily hold messages. The strength of indirect addressing is that, by decoupling sender and receiver, it allows greater flexibility in the use of messages
Linux and real-time variants
Linux kernel – monolithic kernel with loadable modules
Concurrency
- Atomic operations – guaranteed to perform without interruption or interference
- Spinlocks – a thread spins instead of sleeping until the lock it is waiting for is released
- Semaphores – semaphores for kernel space
- Barriers – enforce the order in which memory operations are executed
Modular monolithic kernel – includes virtually all of the OS functionality in one large block of code that runs as a single process with a single address space. All the functional components of the kernel have access to all of its internal data structures and routines. Linux is structured as a collection of modules.
- Loadable modules – a module is an object file whose code can be linked to and unlinked from the kernel at run-time. A module is executed in kernel mode on behalf of the current process.
- Dynamic linking and stackable modules
Linux page replacement
- Least frequently used policy – each page in main memory has a counter which is incremented each time the page is accessed and periodically decremented. The lower the counter, the less frequently the page is used, and the page will be preferred for replacement.
Kernel memory allocation – manages physical main memory page frames
- The primary function is to allocate and deallocate frames for particular users (user-space processes, dynamically allocated kernel data, static kernel code, the page cache)
- A buddy algorithm is used so memory for the kernel can be allocated and deallocated in units of one or more pages
- The page allocator alone would be inefficient, because the kernel requires small short-term memory chunks in odd sizes
- Slab allocation – used by Linux to accommodate small chunks
Linux scheduling
- Three classes – FIFO, round-robin, and non-real-time threads
- The Linux kernel was not originally intended to be preemptive, and cannot be made fully preemptive regardless of kernel configuration
Linux thread model
- Originally there were no threads in Linux, only processes. Threads could be implemented in user space inside the processes; one blocked thread would block the whole process
- Now, processes and threads are almost the same; both run as separate "kernel threads". Threads share the same virtual memory, and context switches between threads are faster than between processes. The kernel schedules each kernel thread individually.
User space – use system calls to request kernel services. The kernel can notify a user-space program with signals
Kernel space
- Reasons for developing in kernel space: user-space programs can only access the services already provided by the kernel, so you may need kernel space to access an unsupported device or a special feature of the CPU. Functionality can also be implemented in kernel space to reduce the number of context switches, which hurt performance.
- Disadvantage of kernel space: the normal user-space libraries cannot be used in kernel space
- Root is the highest level of privilege you can have in user space. It is not the same as kernel space.
Real-time
- Linux is not a real-time system, and the kernel is not fully preemptive
- Soft real-time capabilities can be improved by configuring the kernel to be as preemptive as possible
- How to get harder real-time:
  - Make the Linux kernel fully preemptive, and thus more predictable – the effect is uncertain
  - Add a small real-time co-kernel that runs below Linux (RT-Linux, RTAI, Xenomai); Linux tasks cannot disturb the real-time tasks
  - Outsource the hard real-time tasks to a slave microcontroller – provides better control of latencies
    - When outsourcing some of the real-time tasks to a specialized microcontroller, the Linux computer acts as a supervisor. Suitable for systems that have a small, separate real-time component.
Xenomai
previous versions used a co-kernel or a dual kernel
-
Dual kernel approach – Cobalt core runs beside the Linux kernel. Best performance, lowest latency.
-
Single kernel approach – native Linux kernel normally patched with PREEMPT-RI – Mercury core
-
Real-time device model (RTDM) – if a Xenomai program uses a Linux driver, e.g. for some hardware device, the driver will execute in Linux context, not Xenomai – not hard real-time, and inefficient due to many switches between Linux and Xenomai
From book
-
Most UNIX kernels are monolithic. Monolithic kernels include virtually all the OS functionality in one large block of code that runs as a single process with a single address space.
-
Linux is structured as a collection of modules, a number of which can be automatically loaded and unloaded on demand.
-
A module is an object file whose code can be linked to and unlinked from the kernel at runtime.
-
Although Linux may be considered monolithic, its modular structure overcomes some of the difficulties in developing and evolving the kernel.
-
Dynamic linking – a kernel module can be loaded and linked into the kernel while the kernel is already in memory and executing. A module can also be unlinked and removed from memory at any time
-
Stackable modules – the modules are arranged in a hierarchy. Individual modules serve as libraries when they are referenced by client modules higher up in the hierarchy, and as clients when they reference modules further down.
-
Kernel components
- Signals
- Kernel uses signals to call into a process. Used to notify a process of certain faults.
- System calls
- The system call is the means by which a process requests a specific kernel service. Six categories – file system, process, scheduling, IPC, socket and miscellaneous
- Process and scheduler
- Manages and schedules processes
- Virtual memory
- Allocates and manages virtual memory for processes
- File system
- Provide a global hierarchical namespace for files, directories, and other file-related objects and provide file system functions
- Network protocols
- Support the sockets interface to users for the TCP/IP protocol suite
- Character device drivers
- Manage devices that require the kernel to send or receive data one byte at a time, such as terminals, modems and printers
- Block device drivers
- Manage devices that read and write data in blocks, such as various forms of secondary memory
- Network device drivers
- Manage network interface cards and communications ports that connect to network devices, such as bridges and routers
- Traps and faults
- Handle traps and faults generated by the processor, such as a memory fault
- Physical memory
- Manages the pool of page frames in real memory and allocates pages for virtual memory
- Interrupts
- handle interrupts for peripheral devices
Linux tasks
-
A process, or task, in Linux is represented by a task_struct data structure. The structure contains information in the following categories:
- State, scheduling information, identifiers, IPC, links, times and timers, file system, address space, processor-specific context
-
States – running, interruptible, uninterruptible, stopped, zombie (terminated but still in the process table)
Linux threads
-
Traditional UNIX systems support a single thread of execution per process; modern UNIX systems typically provide support for multiple kernel-level threads per process
-
A new process is created by copying the attributes of the current process. A new process can be cloned so that it shares resources, such as files, signal handlers and virtual memory. When the two processes share the same virtual memory, they function as threads within a single process.
-
Although cloned processes that are part of the same process group can share the same memory space, they cannot share the same user stacks. Thus, the clone() call creates separate stack spaces for each process.
Linux namespace
- A namespace enables a process (or multiple processes that share the same namespace) to have a different view of the system than other processes that have other associated namespaces. One of the overall goals of namespaces is to support the implementation of control groups, a tool for lightweight virtualization that provides a process or group of processes with the illusion that they are the only processes in the system
Memory management
-
Three-level page table structure
-
Page directory – an active process has a single page directory that is the size of one page. Each entry in the page directory points to one page of the page middle directory. The page directory must be in main memory for an active process.
-
Page middle directory – may span multiple pages. Each entry in the page middle directory points to one page in the page table
-
Page table – may span multiple pages. Each page table entry refers to one virtual page in the process
-
Each time a page is accessed, its age variable is incremented. Linux periodically sweeps through the global page pool and decrements the age variable for each page as it rotates through all the pages in main memory. A page with age 0 is the best candidate for replacement.
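The aging policy can be sketched as a toy simulation (`Page`, `sweep` and `replacement_candidate` are illustrative names, not kernel internals):

```python
# Toy sketch of Linux-style page aging (illustrative only, not kernel code).
# Accessing a page increments its age; a periodic sweep decrements all ages;
# a page with age 0 is the best replacement candidate.

class Page:
    def __init__(self, name):
        self.name = name
        self.age = 0

    def access(self):
        self.age += 1          # referenced recently -> worth keeping longer

def sweep(pages):
    # Periodic decrement, like the kernel rotating through main memory.
    for p in pages:
        if p.age > 0:
            p.age -= 1

def replacement_candidate(pages):
    # Lowest age (ideally 0) is the best candidate for replacement.
    return min(pages, key=lambda p: p.age)

pages = [Page("a"), Page("b"), Page("c")]
pages[0].access(); pages[0].access()      # "a" is hot
pages[1].access()                         # "b" is warm
sweep(pages)                              # ages: a=1, b=0, c=0
print(replacement_candidate(pages).name)  # a cold page is picked first
```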
Real-time Scheduling
Fixed-Priority Scheduling (FPS)
-
If preemptive (which it probably should be), then the highest-priority ready task will always run.
-
Optimal priority assignment for periodic, independent tasks is rate monotonic.
- The shorter the period, the higher priority
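Rate monotonic assignment is just a sort by period; a minimal sketch with made-up periods:

```python
# Rate monotonic priority assignment (sketch): shorter period -> higher priority.
tasks = {"A": 50, "B": 10, "C": 20}        # task -> period in ms (hypothetical values)

# Sort task names by period, shortest first = highest priority first.
by_priority = sorted(tasks, key=lambda t: tasks[t])
print(by_priority)  # ['B', 'C', 'A']
```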
Earliest deadline first (EDF)
-
The priority of each task changes dynamically based on the current deadlines. EDF is able to schedule any task set as long as it takes less time to execute than what is available – it is optimal.
-
EDF is more difficult to implement than fixed priority and has more overhead. It is difficult to incorporate tasks without deadlines in EDF. FPS is more predictable if the system gets overloaded.
Scheduling tests
- Sufficient
- If the test is passed, the tasks are certain to meet their deadlines. Utilization based test. Periods that are multiples of each other are considered to be in the same family.
- Necessary
- If the test is failed, the tasks will miss their deadlines.
- Exact
- Both necessary and sufficient. Response-time analysis – a method for calculating the worst-case response time of a task when there are higher-priority tasks that interfere.
The utilization-based schedulability test is a simple test for fixed-priority scheduling with rate monotonic priorities. It is sufficient (not necessary):
U = sum(Ci / Ti) <= N * (2^(1/N) - 1)
where Ci is the worst-case execution time of task i, Ti is its period and N is the number of tasks. Tasks that have periods that are multiples of each other are considered to be in the same family, and a family counts as one task when choosing the limit.
Example
- All tasks are in the same family (periods: 20, 40, 80)
-
Not necessary to use the 3-task limit (78.0 %); the 1-task limit (100 %) can be used instead.
-
Schedulability for EDF can also be tested with utilization, but with another formula than for FPS:
U = sum(Ci / Ti) <= 1
This means that EDF is able to schedule any task set as long as it takes less time to execute than what is available. In this sense we can say that EDF is optimal.
- Utilization test cannot be used on deadline monotonic!
- Have to do graphical analysis or response time analysis
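The FPS and EDF utilization tests can be sketched in a few lines; the (C, T) values below are hypothetical (C = worst-case execution time, T = period):

```python
def utilization(tasks):
    # tasks: list of (C, T) pairs
    return sum(c / t for c, t in tasks)

def rm_test(tasks):
    # Sufficient (not necessary) test for FPS with rate monotonic priorities:
    # U <= N * (2**(1/N) - 1)
    n = len(tasks)
    return utilization(tasks) <= n * (2 ** (1 / n) - 1)

def edf_test(tasks):
    # EDF utilization test (D = T): schedulable iff U <= 1
    return utilization(tasks) <= 1.0

tasks = [(5, 20), (10, 40), (20, 80)]    # hypothetical task set
print(round(utilization(tasks), 2))      # 0.75
print(rm_test(tasks))                    # 0.75 <= 0.780 -> True
print(edf_test(tasks))                   # True
```

Note that these periods are harmonic (one family), so the 100 % limit would in fact apply for FPS as well.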
Priority inversion
– when a high-priority task has to wait for a lower-priority task. Okay if the high-priority task is blocked by a low-priority task due to a shared resource. Not okay if a medium-priority task prevents the low-priority task from completing and releasing the mutex (unbounded priority inversion).
Priority Ceiling protocols
- Priority inheritance
- Reduces the effect of priority inversions but does not prevent deadlocks – ceiling protocols do. Makes it easier to calculate the worst case blocking time.
- Immediate ceiling priority protocol (ICPP)
- Each shared resource is assigned a ceiling priority equal to the priority of the highest-priority task that uses it. When a task accesses a shared resource, it inherits that ceiling priority. This means that once a task starts using a resource, no other task that could contend for the resource can preempt it until it is done with the resource. Minimizes the time a resource is locked.
- Original ceiling priority protocol (OCPP)
- A task inherits the priority of a resource ONLY if it blocks a higher priority task. A task can only access a shared resource if its priority is higher than the priority of all locked resources. Does not need to know the priority of the tasks in advance.
ICPP is easier to implement than OCPP, which has to keep track of what is blocking what. ICPP leads to fewer context switches, as blocking happens prior to first execution. ICPP requires more priority movements, as these happen with all resource usage. OCPP changes priority only if an actual block has occurred.
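A small sketch of the ICPP bookkeeping, with hypothetical task priorities, resources and hold times (higher number = higher priority):

```python
# ICPP sketch (illustrative; all names and numbers are hypothetical).
task_prio = {"low": 1, "mid": 2, "high": 3}
uses = {"r1": ["low", "high"], "r2": ["low", "mid"]}   # resource -> tasks using it
hold_time = {"r1": 4, "r2": 7}                          # worst-case time each resource is locked

# Each resource's ceiling is the priority of the highest-priority task using it;
# a task holding the resource runs at that ceiling.
ceiling = {r: max(task_prio[t] for t in users) for r, users in uses.items()}
print(ceiling)  # {'r1': 3, 'r2': 2}

def blocking(task):
    # Under ICPP a task is blocked at most once, by the longest critical
    # section of a lower-priority task using a resource with ceiling >= its priority.
    p = task_prio[task]
    candidates = [hold_time[r] for r, users in uses.items()
                  if ceiling[r] >= p and any(task_prio[t] < p for t in users)]
    return max(candidates, default=0)

print(blocking("high"))  # 4  ("low" may hold r1)
print(blocking("mid"))   # 7  ("low" may hold r2)
```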
Deadline monotonic priority
Rate monotonic is not suitable when the deadline is shorter than the period (D < T); deadline monotonic assigns higher priority to tasks with shorter deadlines. Not the same as EDF, as priorities are still static. Same as rate monotonic when D = T.
Adding block time to response time analysis
The response time becomes Ri = Ci + Bi + sum over higher-priority tasks j of ceil(Ri / Tj) * Cj, where Bi is the worst-case blocking time. The usage function: 1 for shared resources that are used by both lower- and higher-priority (than i) tasks, 0 for others. Bi is the longest time such a resource can be held by a lower-priority task.
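Response-time analysis with blocking solves the recurrence R = C + B + sum(ceil(R/Tj) * Cj) iteratively; a minimal sketch with hypothetical task parameters and a deadline cap to stop when no fixed point exists:

```python
from math import ceil

def response_time(C, B, higher, deadline):
    # C: worst-case execution time, B: worst-case blocking time,
    # higher: list of (Cj, Tj) for higher-priority tasks.
    R = C + B
    while R <= deadline:
        nxt = C + B + sum(ceil(R / Tj) * Cj for Cj, Tj in higher)
        if nxt == R:
            return R       # converged: worst-case response time
        R = nxt
    return None            # no fixed point within the deadline -> unschedulable

# Hypothetical task: C=5, blocking B=2, one higher-priority task (C=3, T=10).
print(response_time(5, 2, [(3, 10)], 20))           # 10
print(response_time(5, 2, [(3, 10), (4, 12)], 20))  # None (misses its deadline)
```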
Scheduling aperiodic tasks
Deadline monotonic is suitable for sporadic tasks. In a rate/deadline monotonic scheme, an aperiodic task could be given the lowest priority (and suffer from starvation). For better response of aperiodic tasks, we use execution-time servers, which maintain a budget for their clients. An execution-time server runs as one task/process (periodic or sporadic) in the system; modelled as such a task, it can be included in a response-time analysis.
- Periodic/Polling server
- Each period the computation-time budget is available for any pending clients – if a client needs more, it continues in the next period. If one or more clients are pending, they are scheduled within the server. The budget has to be used while the server is running, or it is wasted.
- Deferrable server
- Only loses budget when it is used. Better response time for aperiodic tasks. Suffers from the back-to-back effect, where one budget is used just after another.
- Sporadic server
- The budget is replenished one time period after it started to be used.
Finding worst case execution time
Highest theoretical value, actual worst case (impossible to obtain in practice), worst execution time found when testing x number of times.
From book
-
Cyclic executive – no actual tasks exist at run time; each minor cycle is just a sequence of procedure calls
- Task based scheduling – tasks exist at run time
-
Earliest deadline first – the absolute deadlines are computed at run time and hence the scheme is described as dynamic
Tests
- Sufficient
- Pass the test will meet deadlines
- Necessary
- Fail the test will miss deadlines
- Exact
- Necessary and sufficient
Simple task model
- Fixed set of tasks
- Periodic, with known periods
- Independent
- Overheads, context-switching times are ignored
- Deadline equal to period
- Fixed worst-case execution time
Sporadic tasks have a minimum inter-arrival time – require D<T
Aperiodic tasks – do not have minimum inter-arrival times. Aperiodic tasks can run at a priority below the priorities assigned to hard processes; in preemptive systems they then cannot steal resources from the hard processes, but they will often miss their own deadlines.
Fixed-Priority Scheduling(FPS) vs. Earliest deadline first (EDF)
- FPS is easier to implement as priorities are static
- EDF is dynamic and requires a more complex runtime system which will have higher overhead
- It is easier to incorporate tasks without deadlines into FPS; arbitrary deadline is more artificial
- Easier to incorporate other factors into the notion of priority rather than deadline
- FPS more predictable during overload situations, but EDF gets more out of the processor
QNX
QNX microkernel
-
A microkernel is an operating system kernel that only contains what has to run in kernel mode (scheduling, memory management and inter-process communication (IPC)).
-
There is no “kernel process”; the CPU executes code in the kernel only when an explicit kernel call is made, an exception is raised, or in response to a hardware interrupt. Everything is based around IPC.
Fundamental services in QNX
-
Thread services via POSIX thread-creation primitives
-
Signal services via POSIX signal primitives
-
Message-passing services – the microkernel handles the routing of all messages between all threads throughout the entire system
-
Synchronization services via POSIX thread-synchronization
-
Scheduling services – the microkernel schedules threads for execution using the various POSIX real-time scheduling policies.
-
Timer services – the microkernel provides the rich set of POSIX timer services
-
Process management services – managing processes, memory and the pathname space
-
POSIX – portable operating system interface
-
To make compatibility between operating systems easier
-
Defines the API (application programming interface) between user and programs and the kernel
-
QNX is POSIX certified, but non-UNIX
QNX is suitable for real-time and embedded because
-
QNX is scalable – possible to make very small/large
-
Microkernel is in general suitable for real-time applications
-
No large non-preemptive kernel processes
-
Only very short periods that a process is in a non-preemptive kernel state (entry and exit)
-
Scheduling – highest priority process will run. Priority inheritance. Sporadic server
-
Built in inter-process communication
QNX is not open source – very expensive
Threads and processes
-
A thread is the unit of scheduling – QNX schedules threads, not processes
-
A process is a container for threads, each process must have at least one thread and it also contains address space, etc.
-
Large thread state model – but most states are variants of the blocked state
Scheduling
-
One ready queue for each priority level
- 256 priorities; 0 is idle, 64–255 are root-only priorities, the rest are configurable
-
Scheduling is performed when:
-
The running thread blocks because it has to wait for some event. When unblocked, it is placed at the end of its ready queue.
-
The running thread is preempted because a higher-priority thread becomes ready. The preempted thread is placed at the front of its ready queue.
-
The running thread voluntarily yields the processor and is placed at the end of its ready queue. The scheduler selects the next thread, which can be the one that just yielded.
-
Scheduling methods – FIFO, round robin, sporadic server
-
Sporadic server
- In QNX, implemented as a thread that keeps its normal priority as long as it is within its budget. When the budget is exceeded, the priority drops to a low level until the budget is replenished.
Interrupts
- Low latencies, because interrupts are enabled almost all of the time (only certain critical code sections require interrupts to be disabled). QNX supports nested interrupts.
Message passing
-
Things that typically are implemented in the kernel are instead implemented as user-space processes.
-
The primary inter-process communication mechanism in QNX is message passing
- One process creates a channel. When another process attaches to the channel, they can exchange messages
-
Synchronous message passing
-
Client side – MsgSend() makes the client SEND-blocked until the server calls MsgReceive() (skipped if the server is already RECEIVE-blocked); the client is then REPLY-blocked while the server handles the message, and becomes READY when the server calls MsgReply().
-
Server side – MsgReceive() makes the server RECEIVE-blocked until a client calls MsgSend() (skipped if a client is already SEND-blocked); the server becomes READY when it has a message, and MsgReply() never blocks, since the client is already synced.
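The blocking rules can be illustrated with a toy state machine (this only models the states; the real MsgSend/MsgReceive/MsgReply are QNX kernel primitives operating on channels):

```python
# Toy model of QNX synchronous message-passing states (illustrative only).
READY, SEND_BLOCKED, RECEIVE_BLOCKED, REPLY_BLOCKED = (
    "READY", "SEND-blocked", "RECEIVE-blocked", "REPLY-blocked")

class Channel:
    def __init__(self):
        self.client = READY
        self.server = READY

    def msg_send(self):                     # client side
        if self.server == RECEIVE_BLOCKED:  # server already waiting: deliver now
            self.server = READY
            self.client = REPLY_BLOCKED
        else:
            self.client = SEND_BLOCKED

    def msg_receive(self):                  # server side
        if self.client == SEND_BLOCKED:     # client already waiting: deliver now
            self.client = REPLY_BLOCKED
            self.server = READY
        else:
            self.server = RECEIVE_BLOCKED

    def msg_reply(self):                    # server side; never blocks
        self.client = READY

ch = Channel()
ch.msg_send()                # no server yet -> client SEND-blocked
ch.msg_receive()             # server picks up message -> client REPLY-blocked
ch.msg_reply()               # client unblocked
print(ch.client, ch.server)  # READY READY
```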
-
Message passing has priority inheritance
-
Server receives a message from a high priority client – server inherits the (higher) priority when the client sends the message
-
Server receives a message from a low priority client – server inherits the (lower) priority when the server receives the message
-
Processes share memory – protected by MUTEX etc.
QNX process manager (procnto) is required for all run-time systems
-
User process, not in kernel mode
-
Provides process management, manages process creation, destruction and process attributes such as user ID and group ID. Manages a range of memory-protection capabilities, shared libraries and inter-process POSIX shared-memory primitives
Memory management
– provides protection, unlike many other RTOSes (which often choose not to because of the overhead)
- The overhead is less of a problem than the advantages it provides – adds robustness: threads that try to access memory they are not allowed to access can be killed. Mitigates problems caused by accidentally overwriting some memory location, which is typically very difficult to debug.
QNX Qnet
-
Used for accessing remote file systems, scale applications and extend applications from a single machine to several machines and distribute processes among those CPUs
-
The Qnet file system is shared between Qnet nodes. Resources and IPC mechanisms available on the file system become available to other nodes. This includes message passing, named semaphores, signals and message queues.
VxWorks
-
Features like – microkernel, fast response time, broad support, scalable and deterministic
-
Board support package (BSP) – supports different CPUs, like x86, ARM and Power PC
Memory management
-
Originally no memory protection – all tasks share memory. Convenient for communication between tasks, but one task can overwrite the memory of another.
-
Newest version has memory management
-
Newer versions also have real-time processes (RTPs) – similar to processes; a task is similar to a thread. Like QNX, all tasks are scheduled independently by the kernel.
-
Non-overlapped memory model – all virtual addresses in the system are unique – allows the same system to work with the memory management unit (MMU) in the CPU enabled or disabled
Kernel
-
Earlier versions ran on top of 3rd-party kernels (VRTX or pSOS) – Wind River Systems then developed their own “Wind kernel”
-
Older versions were running in “kernel mode” and had access to all memory
-
In newer versions, there are both user and kernel mode, and memory protection (more of a microkernel in the right sense of the term)
Task model
- Consists of states ready, suspended, delayed, pended. Tasks can be in more than one state at a time
Single-core scheduling
- VxWorks' native scheduler is preemptive; the highest-priority task will always run, and same-priority tasks can either be FCFS or round robin (same as QNX)
Multi-core scheduling
-
All tasks can run on any CPU core by default
-
A task can be given a CPU affinity – a task will only run on that CPU core (same memory cache – less data moved)
-
A task can reserve a CPU core – only this task can run on that CPU core
Foreground and background RTPs
-
Foreground RTPs are assigned to a time partition
-
Background RTPs run within a time partition when all foreground RTPs have completed
Inter-process communication
-
Semaphores, mutex, message queues
-
Also has task lock – can disable and re-enable preemption. Faster than semaphores and useful if you need frequent and short protected access. Prevents all other tasks from running, not just the tasks contending for the resource
Wind River Linux
-
A Linux-based system as an alternative to a VxWorks-based system. Provided with both PREEMPT-RT and a co-kernel
-
VxWorks for hard real-time and Wind River Linux for soft/hard-ish real-time systems
Embedded Linux and Multiprocessor scheduling
Embedded development
-
One or more targets and one host computer (usually)
-
Host computer – program and compile code (more powerful)
-
Target computer – embedded computer that runs the code
Embedded Linux
-
Advantages – customizable and configurable, runs on a large number of architectures, large and active user community, free
-
Disadvantages – not hard real-time by default, too many alternatives, difficult, hardware capability issues
Consumer grade ARM boards
-
Typically uses an ARM SoC CPU in the range of single-core 700MHz to quad-Core 1.8 GHz – similar CPUs that are used in mobile phones
-
Hybrid between a microcontroller and a desktop computer – HDMI, VGA, USB, Ethernet, UART, SPI, I2C, GPIO, PWM, ADC and external interrupts
-
Advantages – inexpensive, small, low power consumption, working Linux distribution is usually supplied, open hardware platform, low-level and high-level I/O
-
Disadvantages – not real-time by default, unknown reliability, intended for hobbyists, not industry
Target vs Host
- Use only the target – easy to set up, but less powerful and with limited development tools
Develop on host, build on target
- Might not have necessary build tools on target. Host has good development tools
Develop and build on host, transfer to target
-
Transfer binaries built on host to target
-
Will not work if host and targets are too different (different shared libraries)
-
Difficult to develop and test functionality that only the target has
-
Cross-compile – a compiler that runs on one computer and makes binaries for another. Challenging to set up but very good solution when it works.
-
Cross-compile and Flash – for hardware that does not have removable storage, network etc. use an external tool or debugger, to flash some content to some internal flash memory (JTAG)
-
A cross-compiler toolchain for C consists of at least a C library, a compiler and binary tools (linker, assembler, etc.)
-
Easier ways to get a cross toolchain – install a pre-compiled toolchain, use a tool to make the toolchain, use a build system
Embedded Linux build systems
-
compromise between using a binary distribution and having to make everything from scratch
-
Buildroot – builds a root file-system image, no binary packages, much simpler to use, understand and modify
Best is to use whatever tool offers the best support for your hardware
uClibc
-
Used as an alternative C library for embedded Linux, instead of glibc
-
No ABI compatibility between versions. Not everything in glibc is supported in uClibc.
Embedded Linux vs QNX/VxWorks
-
The setup of an embedded Linux system and development environment is provided when using commercial systems such as QNX or VxWorks (cross-toolchain, tool to configure and build a file-system image)
-
Freeness of Linux comes at a cost
Multiprocessor scheduling
-
Multicore computers have multiple CPU cores on the same physical CPU chip; the cores share memory and usually the operating system
-
Assignment of processes to cores:
-
Static assignment to a core – each process is permanently assigned to a core, one queue of processes for each core, minimal scheduling overhead, may result in one core being idle while another one is very busy
-
Global queue – each process is added to a global queue of processes. When a processor core is idle it will select the next process and start running it. No core is idle as long as there is available work. Preempted or blocked processes are likely to run on a different core next, which makes caching less effective
-
Dynamic load balancing – each process is assigned to a core – may be moved, depending on load
-
How to assign processes:
-
Master/slave architecture – kernel scheduling functions always run on the same master core, slave cores send requests to the master. Simple and requires little changes from uniprocessor scheduling. Master can become a bottleneck and a single point of failure
-
Peer architecture – kernel scheduling functions can run on any core. Each core schedules itself with available processes. Must make sure that no two cores run the same process.
-
A process can get full access to a processor core of its own, like VxWorks
-
Real-time on multicore
-
Difficult since another potential uncertainty is introduced
-
Moving a thread from one core to another will cause a longer and more unpredictable delay than normal scheduling
-
From book
- Real-time computing may be defined as that type of computing in which the correctness of the system depends not only on the logical results of the computation but also on the time at which the results are produced.
Windows
Kernel and executive
-
Typical kernel-level tasks are divided into the Kernel and the Executive
-
The Kernel controls how the processors are used – schedules threads, handles exceptions and interrupts, and synchronizes between multiprocessors
-
The Executive provides the core OS services – runs as kernel-level thread. Provides memory management; creates, manages and deletes processes and threads; manages I/O etc.
-
The Kernel is in many ways similar to a microkernel. But since the Executive (and many other components) also runs in kernel mode, Windows is not a microkernel system
-
Other kernel-mode components – the hardware abstraction layer sits between the kernel and the hardware, which means that the kernel does not need to consider the actual hardware. Device drivers – extend the functionality of the Executive, typically used for interacting with specific I/O devices
User-mode processes
-
Special system processes – user-mode services that manage the system, session manager and authentication subsystem
-
Service processes – “server” programs. Only way to run a non-system background program
-
Environment subsystems – different “OS personalities” or environment (Win32, POSIX)
-
User applications – normal user programs (.exe) and libraries (.dll)
Client/server model
-
For OS services
-
Use remote procedure calls, which is a type of inter-process communication
- The caller/client causes a procedure to run in another address space belonging to a server
- Could be another computer or the same computer but in different virtual address space
Windows objects
-
Windows is written in C but has an object-based design
-
Some of the internal system components of Windows exist as objects. Things like processes, threads, etc. are objects. The object manager keeps track of all the objects in the system
-
Objects have both data (attributes) and procedures (services). Attributes are only accessible through services; thus it is possible to protect against unauthorized control
Windows processes
-
Implemented as a process object
-
Virtual address space and other attributes
-
Contains one or more threads that execute code and can be scheduled by the Kernel
-
Has a handle table with references to the different objects known to the process
Windows threads
-
Threads are similar to other systems – scheduled by the Kernel, share address space with other threads in the same process. A thread pool is a collection of worker threads that are available to perform asynchronous callbacks in the background
-
Fibers – one level below threads; one thread can schedule multiple fibers within itself. A fiber runs in the context of the thread that schedules it
-
User-mode scheduling (UMS) – a lightweight mechanism for applications to schedule their own threads. Differs from fibers in that a UMS thread has its own context
-
Both fibers and UMS have potential problems, like one blocked fiber/thread blocking all
-
States – same as the 5-state model except for the extra transition state, which means that a thread is ready to run, but all its resources are not yet available
Windows scheduling
-
Real-time priority class (16 to 31), fixed priorities that never change. Round-robin between threads with same priority
-
Variable priority class (0 to 15), vary depending on use, can get boosted to 15 if starved. Will never become higher than 15
Multiprocessor scheduling
-
Any thread can run on any processor
-
By default, soft affinity is used, which means that the Kernel will try to assign a thread to its previous processor so that cache can be reused
-
Can also be assigned with hard affinity, which means that it will only run on one processor
Windows dispatcher objects are used for synchronization
Concurrency mechanisms
-
Critical section protects critical code – similar to mutex
-
Slim reader-writer – like critical section
-
Condition variables
-
Lock-free synchronization – atomic access to a resource without using a lock/mutex; the hardware guarantees that the atomic operation completes without interference, so a thread cannot be interrupted partway through the update
Windows embedded/real-time – Windows Embedded Compact Edition
-
Familiar for many developers.
-
256 priority levels – guaranteed upper bound on high-priority thread scheduling and on delay in running high-priority interrupt service routines (ISRs)
-
Typical RT functionality:
-
Support for handling priority inversion with priority inheritance
-
Support for nested interrupts to ensure that high-priority events are not delayed
-
Support for 1-ms system tick timing
-
Advanced thread timing and scheduling
-
Support for synchronization objects such as semaphores, mutexes and critical sections
-
Board support packages (BSP) – x86, ARM, MIPS, SH
From book
Core
- The core of Windows NT, consisting of the Kernel and Executive was designed for 32-bit machines and included object-oriented features
Kernel-mode components
-
Executive – contains the core OS services, such as memory management, process and thread management, security, I/O, IPC
-
Kernel – controls execution of processors. The Kernel manages thread scheduling, process switching, exception and interrupt handling, and multiprocessor synchronization. Unlike the rest of the Executive and the user level, the Kernel’s own code does not run in threads
-
Hardware abstraction layer (HAL) – maps between generic hardware commands and responses and those unique to a specific platform. It isolates the OS from platform-specific hardware differences. The HAL makes each computer’s system bus, DMA controller, interrupt controller, system timers, and memory controller look the same to the Executive and Kernel components. It also delivers the support needed for SMP
-
Device drivers – dynamic libraries that extend the functionality of the Executive. These include hardware device drivers that translate user I/O function calls into specific hardware device I/O requests and software components for implementing file systems, network protocols, and any other system extensions that need to run in kernel mode
-
Windowing and graphics system – implements the GUI functions, such as dealing with windows, user interface controls and drawing
Executive modules
- I/O manager, Cache manager, object manager, plug-and-play manager, power manager, security reference monitor, virtual memory manager, process/thread manager, configuration manager, advanced local procedure call (ALPC) facility
User-mode processes
- special system processes, service processes, environment subsystem, user applications
The Windows OS services, the environment subsystems, and the applications are structured using the client/server computing model
Threads and symmetric multiprocessing (SMP)
-
OS routines can run on any available processor, and different routines can execute simultaneously on different processors
-
Multiple threads within the same process may execute on different processors simultaneously
-
Server processes may use multiple threads to process requests from more than one client simultaneously
-
Processes share data and resources with IPC
Windows objects
-
Used in cases where data are intended for user mode access or when data access is shared or restricted
-
Encapsulation – an object consists of one or more items of data, called attributes, and one or more procedures that may be performed on those data, called services
-
Object class and instance – an object class is a template that lists the attributes and services of an object and defines certain object characteristics. The OS can create specific instances of an object class as needed
-
Inheritance – the Executive uses inheritance to extend object classes by adding new features. Every Executive class is based on a base class which specifies virtual methods that support creating, naming, securing and deleting objects. Dispatcher objects are Executive objects that inherit the properties of an event object, so they can use common synchronization methods
-
Polymorphism – API functions are used to manipulate objects of any type
-
Process and thread management
-
An application consists of one or more processes.
-
A process provides the necessary resources to execute a program. A process has a virtual address space, executable code, open handles to system objects, a security context, a unique process identifier, environment variables, a priority class, minimum and maximum working set sizes, and at least one thread of execution. Each process is started with a single thread, but can create additional threads from any of its threads
-
A thread is the entity within a process that can be scheduled for execution. All threads of a process share its virtual address space and system resources. In addition, each thread maintains exception handlers, a scheduling priority, thread local storage, a unique thread identifier, and a set of structures the system will use to save the thread context until it is scheduled. On a multiprocessor computer, the system can simultaneously execute as many threads as there are processors on the computer
-
A job object allows groups of processes to be managed as a unit. Job objects are nameable, securable, shareable objects that control attributes of the processes associated with them
-
A thread pool is a collection of worker threads that efficiently execute asynchronous callbacks on behalf of the application. Used to reduce the number of application threads and provide memory management of the worker threads
-
A fiber is a unit of execution that must be manually scheduled by the application. A fiber has state information about the thread's stack, a subset of its registers, and the fiber data provided during fiber creation
-
User-mode scheduling (UMS) is a lightweight mechanism that applications can use to schedule their own threads
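The thread-pool idea can be illustrated with Python's `ThreadPoolExecutor` (a hedged analogue, not the Windows API): a small, fixed set of worker threads executes queued callbacks on behalf of the application instead of the application creating one thread per work item.

```python
from concurrent.futures import ThreadPoolExecutor

# Hedged sketch: Python's thread pool as an analogue of the Windows thread
# pool described above. 'callback' stands in for an asynchronous work item.
def callback(n):
    return n * n

# four worker threads service eight queued callbacks
with ThreadPoolExecutor(max_workers=4) as pool:
    squares = list(pool.map(callback, range(8)))

print(squares)
```

The point is the same as in the notes: the pool bounds the number of application threads and manages their lifetime for you.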
Characteristics of Windows processes
-
Implemented as objects
-
Processes can be created as a new process or as a copy of an existing process
-
An executable process may contain one or more threads
-
Both process and thread objects have built-in synchronization capabilities
Concurrency mechanisms
- Provide synchronization among threads
-
Executive dispatcher objects, user-mode critical sections, slim reader-writer locks, condition variables and lock-free operations
- Synchronization objects – notification event, synchronization event, mutex, semaphore, waitable timer, file, process, thread
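As a hedged illustration (Python's threading primitives as rough analogues of the dispatcher objects, not the Windows API itself): a manual-reset `Event` behaves like a notification event, releasing all waiters, while an auto-reset synchronization event can be emulated with a semaphore that wakes exactly one waiter per signal.

```python
import threading

mutex = threading.Lock()    # mutex object: one owner at a time
notify = threading.Event()  # notification event: manual-reset, wakes ALL waiters

results = []

def waiter(name):
    notify.wait()           # blocks until the event is signaled
    with mutex:             # mutex protects the shared list
        results.append(name)

threads = [threading.Thread(target=waiter, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
notify.set()                # a single signal releases every waiter
for t in threads:
    t.join()

# synchronization event (auto-reset) emulated with a semaphore:
# each release() wakes exactly one acquire()r
auto = threading.Semaphore(0)
auto.release()
print(sorted(results), auto.acquire(timeout=1))
```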
Memory management
-
The Windows virtual memory manager controls how memory is allocated and how paging is performed. The memory manager is designed to operate over a variety of platforms and to use page sizes ranging from 4 Kbytes to 64 Kbytes
-
Virtual address map – on 32-bit platforms, each Windows user process sees a separate 32-bit address space, allowing 4 GB of virtual memory per process. By default, half of this is reserved for the OS, so each user process actually has 2 GB of available virtual address space, and all processes share most of the upper 2 GB of system space when running in kernel mode
-
Swapping – Paging will hold items that haven’t been accessed in a long time, whereas swapping holds items that were recently taken out of memory
Scheduling
-
Windows implements a preemptive scheduler with a flexible system of priority levels that includes RR scheduling within each level and, for some levels, dynamic priority variation based on current thread activity. Threads, rather than processes, are the unit of scheduling in Windows
-
Priorities in Windows are organized into two classes, real-time and variable, each consisting of 16 priority levels. Each level has a FIFO queue.
-
Soft affinity – dispatcher tries to assign a ready thread to the same processor it last ran on
-
Hard affinity – thread executes only on certain processors
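Hard affinity has a POSIX-side analogue that can be shown directly; a hedged, Linux-only sketch using `os.sched_setaffinity` (the counterpart of Windows `SetThreadAffinityMask`). Soft affinity has no call here: it is a scheduler heuristic, not something the program requests.

```python
import os

# Hedged sketch (Linux-only): pin this process to a single CPU, i.e. hard
# affinity - the process may then run only on the listed CPUs.
allowed = os.sched_getaffinity(0)  # CPUs this process may currently use
cpu0 = min(allowed)
os.sched_setaffinity(0, {cpu0})    # hard affinity: pin to a single CPU
print(os.sched_getaffinity(0))     # the mask now contains only cpu0
os.sched_setaffinity(0, allowed)   # restore the original mask
```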
Android
-
Based on the Linux kernel, but with some changes: instead of glibc it uses its own standard C library, Bionic, and it does not use the X Window System that most Linux graphical user interfaces use
-
Regular Linux programs and libraries are not directly compatible with Android
-
Uses ahead-of-time compilation instead of just-in-time compilation
Android applications
-
Consists of one or more instances of four types of components:
-
Activities – a screen visible as a user interface. An application can have multiple activities, each for a separate task. Activities can be exposed externally, so they can be started by another application
-
Services – a background task. Can continue to run even when another application is used.
-
Content providers – An interface to application data. Data can be stored in a file, SQLite database etc.
-
Broadcast receivers – Responds to system wide broadcast announcements. Events from other applications or the system can have an effect on the application
Inter-process communication
-
Does not use the typical Linux IPC mechanisms
-
Has its own addition to the Linux kernel called Binder, which implements a lightweight remote procedure call (RPC) capability. Allows communication between processes running on two different virtual machines
Android and real-time
- Android is probably not the first choice for a real-time system, but it could be useful in some situations: a mixed-criticality system with a touch-screen interface, or a personal (always carried) device that needs to do something in real time
From book
-
Standard OS for IoT
-
Defined as a software stack that includes the OS kernel, middleware, and key applications
- Complete software stack – not just OS. Android is a form of embedded Linux
-
All applications that the user interacts with directly are part of the application layer
-
Open-source android architecture
-
Every Android application runs in its own process, with its own instance of the Dalvik virtual machine
-
Linux kernel
-
The OS kernel for Android is similar to, but not identical with, the standard Linux kernel distribution. One noteworthy change is that the Android kernel lacks drivers not applicable in mobile environments, making the kernel smaller
-
Relies on its Linux kernel for core system services such as security, memory management, process management, network stack and driver model
-
The kernel acts as an abstraction layer between the hardware and the rest of the software stack and enables Android to use the wide range of hardware drivers that Linux supports
-
-
Process and thread management
-
Four types of application components
-
Activities – correspond to a single screen visible as a user interface
-
Services – perform background operations that take a considerable amount of time to finish
-
Content providers – act as an interface to application data that can be used by the application
-
Broadcast receivers – respond to system-wide broadcast announcements
-
-
The default allocation is a single process with a single thread per application. To avoid slowing down the user interface when slow and/or blocking operations occur in a component, the developer can create multiple threads within a process and/or multiple processes within an application
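The idea can be sketched in plain Python (hedged: illustrative names only, not the Android API, which would use e.g. a background Service): a slow or blocking operation runs in a worker thread so the main ("UI") thread stays responsive.

```python
import threading
import time

result = {}

def slow_operation():
    time.sleep(0.1)          # stands in for I/O or heavy computation
    result["value"] = 42

worker = threading.Thread(target=slow_operation)
worker.start()
# the main thread is free to handle "UI" events here while the worker runs
worker.join()                # collect the background result
print(result["value"])
```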
-
-
IPC
-
Does not use pipes, shared memory, messages, sockets, semaphores or signals
-
Binder – provides a lightweight remote procedure call (RPC) capability that is efficient in terms of both memory and processing requirements, and is well suited to the requirements of an embedded system
- The Binder is used to mediate all interaction between two processes. A component in one process (the client) issues a call. This call is directed to the Binder in the kernel, which passes the call on to the destination process (the service). The return goes through the Binder and is delivered to the calling process.
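The mediation described above can be sketched as a toy model (all names are illustrative, not the real Binder API): the client issues a call to a router, which forwards it to the registered service and passes the reply back, so client and service never call each other directly.

```python
class Binder:
    """Toy router standing in for the Binder kernel component."""

    def __init__(self):
        self.services = {}

    def register(self, name, handler):
        self.services[name] = handler

    def transact(self, name, method, args):
        # the binder mediates: look up the destination and forward the call
        return self.services[name](method, args)

def adder_service(method, args):
    # a trivial "service" living, conceptually, in another process
    if method == "add":
        return sum(args)
    raise ValueError("unknown method")

binder = Binder()
binder.register("adder", adder_service)
# client side: the call goes through the binder, not to the service directly
reply = binder.transact("adder", "add", [2, 3])
print(reply)
```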
-
Comparison of OS
No OS
-
When to use
-
Simple applications
-
When the system is only doing one (or very few) things simultaneously
-
-
Advantages
-
Full control
-
Can be deterministic
-
No CPU time or memory is spent on an OS, which matters on low-powered microcontrollers
-
Some low-level operations are much simpler than with an OS
-
-
Disadvantages
-
Tricky to do multiple things
-
Often need to read CPU datasheet to use features
-
Often not scalable
-
FreeRTOS
-
When to use
-
Simple applications
-
When you need multitasking etc. on a simple microcontroller
-
-
Advantages
-
The advantages of bare metal combined with simple operating-system capabilities
-
Operating system for hardware that can’t run other operating systems
-
-
Disadvantages
- For better or worse, a different type of system than the others we have covered
Android
-
When to use
-
On hand-held devices
-
Graphical systems on ARM CPUs
-
-
Advantages
-
Wide adoption
-
Many apps, libraries etc. available
-
Linux kernel customized and minimized for hardware
-
Every app runs within its own virtual machine
-
-
Disadvantages
- Not intended for real-time or control systems
Linux
-
When to use
-
Soft real-time
-
Developers have Linux experience
-
When many different programs and libraries are needed/useful
-
-
Advantages
-
Support for many platforms and system types
-
Large user community
-
Scalable and configurable
-
Many free and open source tools, programs, libraries etc.
-
-
Disadvantages
-
Not hard real-time
-
Unpredictable latency during periods when the kernel is non-preemptible
-
Costly to spend time setting up the system instead of buying something that just works
-
Real-time Linux
-
When to use
-
Linux with hard real-time
-
Mixed criticality systems
-
-
Advantages
-
A fully functional Linux operating system with the addition of hard real-time
-
Separation between real-time tasks and regular Linux processes
-
-
Disadvantages
-
Real-time tasks cannot fully use the Linux system
-
Difficult to set up; experience with Linux is not directly applicable for development with real-time tasks
-
Windows
-
When to use
-
Developers have Windows experience
-
Existing Windows code
-
-
Advantages
-
Windows Embedded Compact has capabilities suitable for hard real-time
-
Many tools, programs, libraries etc.
-
-
Disadvantages
-
Cost
-
Consumer versions are not suitable for embedded/real-time
-
QNX
-
When to use
-
When a proven, commercial real-time system is required
-
Distributed control systems
-
-
Advantages
-
Hard real-time
-
Specialized development tools
-
Microkernel
-
Customizable
-
-
Disadvantages
-
Cost
-
Environment is unfamiliar for many
-
VxWorks
-
When to use
-
When a proven, commercial real-time system is required
-
Arguably more lightweight than QNX
-
-
Advantages
-
Hard real-time
-
Specialized development tools
-
Microkernel
-
Customizable
-
Same vendor provides commercial Linux alternative
-
-
Disadvantages
-
Cost
-
Environment is unfamiliar for many
-
Uncategorized
Microkernel vs monolithic kernel
A microkernel differs from a monolithic kernel in that only the most basic functions are available as system calls; the rest of the kernel is broken down into separate processes, known as servers. This structure is implemented in many real-time operating systems. The practical purpose of using a microkernel is to sacrifice some performance for reliability. In a monolithic structure, a service is obtained by a single system call. In a microkernel structure, a service is obtained through inter-process communication (IPC), which causes overhead due to the required context switches. However, since the functionality is divided into different servers, if one server fails, the others continue to work. In critical applications, as control systems often are, this is an extremely important feature.
Utilization test example
Is this task set schedulable?
Task | Execution time | Period |
A | 2 | 8 |
B | 3 | 10 |
C | 4 | 16 |
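A worked check, assuming rate-monotonic priorities and the Liu–Layland sufficient utilization test:

```latex
U = \sum_{i=1}^{3} \frac{C_i}{T_i} = \frac{2}{8} + \frac{3}{10} + \frac{4}{16} = 0.80
\qquad
n\left(2^{1/n} - 1\right) = 3\left(2^{1/3} - 1\right) \approx 0.780
```

Since $0.80 > 0.780$ the utilization test fails. Because the test is sufficient but not necessary, this is inconclusive on its own; a response-time analysis can still settle schedulability.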
Response time analysis example
Use earliest deadline first (EDF)
Task | Execution time | Period | Deadline |
A | 2 | 8 | 8 |
B | 3 | 10 | 7 |
C | 4 | 16 | 16 |
Shortest deadline is $D_B = 7$, so for the response-time analysis the tasks are assigned fixed priorities deadline-monotonically: B highest, then A ($D_A = 8$), then C ($D_C = 16$).
Response time is $R_i = C_i + I_i$
where $I_i = \sum_{j \in hp(i)} \lceil R_i / T_j \rceil C_j$ is the interference from higher-priority tasks, found by iterating $w_i^{n+1} = C_i + \sum_{j \in hp(i)} \lceil w_i^n / T_j \rceil C_j$ until it converges.
Finding convergent response times:
$R_B = 3 \le 7$
$R_A$: $w^0 = 2$, $w^1 = 2 + \lceil 2/10 \rceil \cdot 3 = 5$, $w^2 = 2 + \lceil 5/10 \rceil \cdot 3 = 5$, so $R_A = 5 \le 8$
$R_C$: $w^0 = 4$, $w^1 = 4 + \lceil 4/10 \rceil \cdot 3 + \lceil 4/8 \rceil \cdot 2 = 9$, $w^2 = 4 + 3 + 4 = 11$, $w^3 = 4 + 6 + 4 = 14$, $w^4 = 4 + 6 + 4 = 14$, so $R_C = 14 \le 16$
All response times are within the deadlines, so the task set is schedulable.
Graphical EDF
Using the same task set as above.
Priority inversion examples
Higher priority = higher value
Symbols:
X
- Normal execution
Q
- Execution with resource $Q$
V
- Execution with resource $V$
B
- Execution with both resources ($Q$ and $V$)
-
- Sleeping (preempted)
*
- Waiting on resource
Inheritance example
Task set:
Task | Arrival | Priority | Execution |
A | 0 | 1 | XQQQX |
B | 4 | 2 | XXXXXXXXX |
C | 2 | 3 | XQX |
Without priority inheritance
|1
|0|1|2|3|4|5|6|7|8|9|0|1|2|3|4|5|6|
A |X|Q|-|Q|-|-|-|-|-|-|-|-|-|Q|-|-|X|
B | | | | |X|X|X|X|X|X|X|X|X| | | | |
C | | |X|*|*|*|*|*|*|*|*|*|*|*|Q|X| |
With priority inheritance
|1
|0|1|2|3|4|5|6|7|8|9|0|1|2|3|4|5|6|
A |X|Q|-|Q|Q|-|-|-|-|-|-|-|-|-|-|-|X|
B | | |-|-|-|-|-|X|X|X|X|X|X|X|X|X| |
C | | |X|*|*|Q|X| | | | | | | | | | |
Priority ceiling example
Task set:
Task | Arrival | Priority | Execution |
A | 0 | 1 | XQQQX |
B | 4 | 2 | XXXXXXXXX |
C | 2 | 3 | XQBVX |
D | 4 | 4 | XVBQX |
Without ceiling protocol (nor inheritance)
|1 |2
|0|1|2|3|4|5|6|7|8|9|0|1|2|3|4|5|6|7|8|9|0|1|2|3|
A |X|Q|-|Q|-|-|-|-|-|-|-|-|-|-|-|Q|-|-|-|-|-|-|-|X|
B | | | | |-|-|X|X|X|X|X|X|X|X|X| | | | | | | | | |
C | | |X|*|*|*|*|*|*|*|*|*|*|*|*|*|-|-|-|Q|B|V|X| |
D | | | | |X|V|*|*|*|*|*|*|*|*|*|*|B|Q|X| | | | | |
With ceiling protocol (ICPP)
|1 |2
|0|1|2|3|4|5|6|7|8|9|0|1|2|3|4|5|6|7|8|9|0|1|2|3|
A |X|Q|Q|Q|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|X|
B | | | | |-|-|-|-|-|-|-|-|-|-|X|X|X|X|X|X|X|X|X| |
C | | |-|-|-|-|-|-|-|X|Q|B|V|X| | | | | | | | | | |
D | | | | |X|V|B|Q|X| | | | | | | | | | | | | | | |