- The PREEMPT_RT patchset enables Linux to meet real-time requirements. Different preemption settings allow a trade-off between latency and data throughput.
- A minimal Board Support Package (BSP) and distribution can improve latencies.
- Kernel security hardening measures have a limited impact on latencies.
Real-time systems and Linux
Systems that are required to guarantee a response to an event or command before a specified deadline are called real-time systems. A soft real-time system has its service degraded when a deadline is missed (e.g. a video/audio player). When a single miss results in a system failure (e.g. a plane engine control system), the system is considered hard real-time. To meet these constraints, real-time systems must be deterministic both logically, i.e. the same inputs produce the same outputs, and temporally, i.e. the duration of a task is bounded.
Embedded real-time systems can be found in automotive control systems, industrial machinery, multimedia systems, Internet of Things edge devices, and many more markets. Nowadays, these systems face increasing requirements, which include the integration of HMIs (Human-Machine Interfaces), advanced connectivity and cybersecurity measures. These features are often too complex to implement on micro-controllers running a standard RTOS (Real-Time OS). In contrast, Linux is fully equipped to deal with these new requirements, but was not originally designed for real-time systems.
Since 2004, the PREEMPT_RT patchset has been used to make the Linux kernel able to meet real-time requirements. This patchset is now being integrated in the mainline branch. This makes Linux a compelling candidate for modern real-time systems development. In this article, the performance of Linux kernel preemption settings is compared. The use of a minimal root filesystem is investigated to further reduce latency. Moreover, the impact of kernel security hardening settings is evaluated.
This evaluation is done on a Toradex Apalis IMX6Q module and evaluation board. The i.MX6Q is a quad-core system-on-chip from NXP. It is often used in industrial and multimedia applications, due to its longevity and extensive Linux support. Toradex designs System-on-Modules (SoM). They focus on platform scalability and long availability. Savoir-Faire Linux has accompanied multiple customers using Toradex modules and can vouch for their performance and reliability.
All the latencies are measured using the cyclictest tool, which is part of the rt-tests test suite. The measurements are run over a 1-hour period, with the priority set to 90. In the background, the stress-ng tool is used to stress all the CPUs to a specified load for the test duration. At the end of the test, the temperature and load of the CPU are measured, to confirm that the stress was effective. Below is a capture of the htop output, to show that load and priorities are correctly applied.
Fig. 1 : htop output during stress and cyclictest
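The exact command lines are not given in this article, so the following is only a sketch of how such a run could be launched. The options shown are existing cyclictest and stress-ng options, but which ones were actually used is an assumption, and the helper names `cyclictest_cmd` and `stress_cmd` are hypothetical.

```python
import shlex


def cyclictest_cmd(priority=90, duration="1h"):
    """Build a cyclictest invocation (assumed flags, not the article's exact ones)."""
    return [
        "cyclictest",
        "--mlockall",              # lock memory to avoid page-fault delays
        "--smp",                   # one measurement thread per CPU
        f"--priority={priority}",  # real-time priority of the measurement threads
        f"--duration={duration}",  # stop after this long (m/h/d suffixes accepted)
        "--quiet",                 # only print the summary at the end
    ]


def stress_cmd(load_percent, timeout="1h"):
    """Build a stress-ng invocation loading every CPU to load_percent."""
    return [
        "stress-ng",
        "--cpu", "0",                     # 0 means one worker per online CPU
        "--cpu-load", str(load_percent),  # target load per worker, in percent
        "--timeout", timeout,             # stop after the test duration
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    print(shlex.join(stress_cmd(50)))
    print(shlex.join(cyclictest_cmd()))
```

Running stress-ng in the background and cyclictest in the foreground, once per load level, would reproduce the general shape of the test described above.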
The evaluation is made using different images generated with Yocto. The tested images are detailed below:
- preempt-none: A mainline kernel without preemption (CONFIG_PREEMPT_NONE) with a minimal poky distribution and only test utilities added, using a Savoir Faire Linux BSP
- preempt-voluntary: A mainline kernel with voluntary preemption (CONFIG_PREEMPT_VOLUNTARY) with a minimal poky distribution and only test utilities added, using a Savoir Faire Linux BSP
- preempt: A mainline kernel with preemption (CONFIG_PREEMPT) with a minimal poky distribution and only test utilities added, using a Savoir Faire Linux BSP
- preempt-rt: A mainline kernel with full preemption (CONFIG_PREEMPT_RT) with a minimal poky distribution and only test utilities added, using a Savoir Faire Linux BSP
- preempt-rt-hardened: A mainline kernel with full preemption (CONFIG_PREEMPT_RT) and kernsec recommendations applied with a minimal poky distribution and only test utilities added, using a Savoir Faire Linux BSP
- ref-tdx: A reference image built on Toradex BSP, using a mainline kernel with full preemption (CONFIG_PREEMPT_RT), a reference Toradex distribution and core-image-minimal with test utilities added
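When several images differ only by their preemption model, it can help to confirm which model a given kernel was built with. The sketch below reads the text of a kernel configuration file and reports the strongest preemption option that is enabled. The function names and the sample configuration are illustrative assumptions, not part of the BSPs described here.

```python
# Preemption options named in this article, ordered from strongest to weakest.
PREEMPTION_MODELS = [
    "CONFIG_PREEMPT_RT",
    "CONFIG_PREEMPT",
    "CONFIG_PREEMPT_VOLUNTARY",
    "CONFIG_PREEMPT_NONE",
]


def parse_kconfig(text):
    """Return {option: value} for every 'CONFIG_X=value' line.

    Lines such as '# CONFIG_X is not set' are comments and are skipped.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def preemption_model(text):
    """Return the strongest enabled preemption option, or None if none is set."""
    options = parse_kconfig(text)
    for name in PREEMPTION_MODELS:
        if options.get(name) == "y":
            return name
    return None


if __name__ == "__main__":
    # Hypothetical excerpt of a kernel .config, for illustration only.
    sample = "CONFIG_PREEMPT_RT=y\n# CONFIG_PREEMPT_NONE is not set\n"
    print(preemption_model(sample))
```

On a running target, the configuration text can often be read from `/proc/config.gz` when the kernel was built with that feature; otherwise the `.config` file from the build directory can be used.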
A first measurement campaign is made to evaluate the different preemption options. Stronger preemption is expected to reduce the maximum latency. It also tends to reduce the data throughput that can be processed, but this is not measured in this study.
Fig. 2 : Effect of different preemption settings on the max. latency at different loads.
Results show that the PREEMPT_RT patchset efficiently reduces the maximum latency. At all loads, the maximum latency is lower than 150µs. This makes it possible to use Linux to implement real-time systems. When implementing hard real-time systems however, the maximum latency should be extensively tested to ensure no deadlines are missed. Lower latencies can be attained through driver and kernel optimizations. An alternative solution is to use a micro-controller and an RTOS with proven determinism. But for a large variety of systems, the real-time capabilities of the Linux kernel are sufficient.
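Checking a measurement against a latency bound can be automated. The sketch below parses the per-thread summary lines that cyclictest prints and compares the worst maximum to a bound, here the 150µs figure mentioned above. The sample output is made up for illustration (only the 18µs average and 83µs maximum echo values quoted later in this article), and the function names are hypothetical.

```python
import re

# Matches cyclictest summary lines such as:
# "T: 1 ( 1202) P:90 I:1000 C:3599990 Min:      6 Act:   16 Avg:   18 Max:      83"
SUMMARY_RE = re.compile(r"T:\s*(\d+).*?Avg:\s*(\d+)\s+Max:\s*(\d+)")

# Illustrative output only, not measured data.
SAMPLE = """\
T: 0 ( 1201) P:90 I:1000 C:3600000 Min:      6 Act:   15 Avg:   17 Max:      61
T: 1 ( 1202) P:90 I:1000 C:3599990 Min:      6 Act:   16 Avg:   18 Max:      83
T: 2 ( 1203) P:90 I:1000 C:3599985 Min:      6 Act:   14 Avg:   17 Max:      70
T: 3 ( 1204) P:90 I:1000 C:3599980 Min:      6 Act:   15 Avg:   17 Max:      58
"""


def parse_summary(output):
    """Return a list of (thread, avg_us, max_us) tuples, one per summary line."""
    rows = []
    for line in output.splitlines():
        match = SUMMARY_RE.search(line)
        if match:
            thread, avg_us, max_us = (int(group) for group in match.groups())
            rows.append((thread, avg_us, max_us))
    return rows


def worst_case(output):
    """Return the (thread, avg_us, max_us) tuple with the highest maximum latency."""
    rows = parse_summary(output)
    if not rows:
        raise ValueError("no cyclictest summary lines found")
    return max(rows, key=lambda row: row[2])


def meets_bound(output, bound_us=150):
    """True if the worst maximum latency is strictly below bound_us."""
    return worst_case(output)[2] < bound_us


if __name__ == "__main__":
    print(worst_case(SAMPLE), meets_bound(SAMPLE))
```

A check like this only reports what was observed during one run; it does not replace the extensive testing recommended for hard real-time systems.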
Impact of minimized BSP
A second point is the impact of the userland on the real-time performance of the system. Our BSP is minimal, as it only integrates the services and applications required to run the test applications. In contrast, the standard Toradex images are more generic. The difference in performance between the two BSPs is shown below. Both are configured with the CONFIG_PREEMPT_RT option for full preemption.
Fig. 3 : Comparison of latencies between Toradex and our BSP for different loads.
This difference in performance can be explained by the number of concurrent applications running on each system. The Toradex BSP embeds a large number of services, to speed up product evaluation and development for a large range of customers. The following table details the differences between the images.
|Metric|Toradex reference image|Savoir Faire Linux image|
|---|---|---|
|Boot time to prompt|20s|12s|
|Running processes at boot|171|100|
|Active systemd services|35|20|
|Running systemd services|16|11|
When a larger number of services run in the background, a process is more likely to be preempted by a higher-priority service task. This increases its completion time, and thus its latency. In contrast, a minimal image reduces the number of running services to the strict minimum, which limits interruptions.
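To put the table figures in relative terms, the short calculation below computes how much lower each metric is on the minimal image. The numbers are those of the table; the dictionary and function names are only illustrative.

```python
# (Toradex reference image, minimal image) values taken from the table above.
METRICS = {
    "Boot time to prompt (s)": (20, 12),
    "Running processes at boot": (171, 100),
    "Active systemd services": (35, 20),
    "Running systemd services": (16, 11),
}


def reduction_percent(reference, minimal):
    """Relative reduction of the minimal image compared to the reference, in percent."""
    return 100.0 * (reference - minimal) / reference


if __name__ == "__main__":
    for name, (reference, minimal) in METRICS.items():
        percent = reduction_percent(reference, minimal)
        print(f"{name}: {reference} -> {minimal} ({percent:.1f}% lower)")
```

The reductions range from roughly 31% to 43% depending on the metric.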
Impact of security configuration
Fig. 4 : Impact of kernel hardening measures on real-time performance
In some use cases, the addition of security constraints can decrease the performance of the application. For example, it is recommended to enable the kernel options CONFIG_INIT_ON_ALLOC_DEFAULT_ON and CONFIG_INIT_ON_FREE_DEFAULT_ON. These options force memory to be zeroed when it is allocated and when it is freed, to avoid heap content exposure. However, these options have been measured to have a performance impact of up to 7% in some synthetic benchmarks.
This can have an impact on the reliability of real-time systems, if a deadline is missed because of security constraints. However, our measurements show that although hardening has an effect on the maximum latency of real-time systems, this impact is small. Moreover, it has no effect on the average latency. Note that the measurement point at 50% load is likely due to an outlier.
Fig. 5 : Impact of kernel hardening measures on real-time performance.
Indeed, longer measurements (6 hours) show more consistency. At all loads, the hardened kernel configuration increases the maximum latency, but this impact stays negligible, especially at high loads. The average latency remains constant at all loads. This shows that real-time performance requirements are not a reason to avoid good security practices.
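As a small sanity check when comparing a hardened and an unhardened image, the two memory-initialization options named in this section can be looked up in the kernel configuration. This is a minimal sketch: it only covers those two options, not the full set of kernsec recommendations, and the function name is hypothetical.

```python
# The two hardening options named in this section.
HARDENING_OPTIONS = (
    "CONFIG_INIT_ON_ALLOC_DEFAULT_ON",
    "CONFIG_INIT_ON_FREE_DEFAULT_ON",
)


def missing_hardening(config_text, required=HARDENING_OPTIONS):
    """Return the required options that are not set to 'y' in the config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Enabled boolean options appear as 'CONFIG_X=y'; disabled ones appear
        # as '# CONFIG_X is not set' and are therefore ignored here.
        if line.startswith("CONFIG_") and line.endswith("=y"):
            enabled.add(line.split("=", 1)[0])
    return [name for name in required if name not in enabled]


if __name__ == "__main__":
    # Hypothetical configuration excerpt, for illustration only.
    sample = (
        "CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y\n"
        "# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set\n"
    )
    print(missing_hardening(sample))
```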
Load distribution observation
Some CPU cores show worse latencies than others in the same system-on-chip. When the system load increases, one or more cores tend to show higher latency measurements. This can be seen below when looking at the results at 50% load for the unhardened PREEMPT_RT configuration.
Fig. 6 : Latency plots for each CPU at 50%, with CONFIG_PREEMPT_RT enabled
The CPU1 core seems to have slightly worse performance than the others. When looking at the final results, we see this is the core with the maximum latency (83µs). The average latency is only slightly higher (18µs vs. 17µs), as the high latency measurements happen orders of magnitude less often than average measurements. This hints that the scheduler may not perfectly allocate tasks to CPU cores, and that this could increase the maximum latency. However, this is a complex topic which would require more in-depth investigation.
Our results show that Linux is a suitable solution for real-time embedded systems development. While the maximum measured latency is higher than that of dedicated RTOS solutions, Linux can be used to implement hard real-time systems. However, the Linux kernel is not formally proven to be deterministic. Latencies should be thoroughly evaluated to verify that the system does not miss deadlines. The use of a minimal and customer-tailored BSP enables a reduction of latencies. Moreover, measurements show that a security-hardened kernel has a limited impact on real-time performance. This proves that security measures do not have to be avoided to meet performance requirements.
Savoir-faire Linux has developed several BSPs for real-time and secured systems.