Key takeaways:
- The PREEMPT_RT patch series enables the Linux kernel to meet the needs of real-time systems. Different settings allow a trade-off between performance and latency.
- A minimal board support package (BSP) reduces latency.
- Linux kernel hardening measures have a limited impact on real-time performance.
Real-time systems and Linux
A real-time system is a system that must respond to an event or a command within a specified time. A soft real-time system sees its quality of service degrade when a task misses its deadline (e.g. an audio or video player). When a missed deadline causes a critical situation (e.g. an aircraft control system), the system is called hard real-time. To meet these requirements, real-time systems must be deterministic both logically, i.e. the same inputs produce the same outputs, and temporally, i.e. the duration of a task is bounded.
Embedded real-time systems are used in the automotive sector, in industrial and multimedia equipment, in Internet of Things systems, and in many other markets. Today, these systems face new requirements, including the integration of human-machine interfaces, advanced connectivity and cybersecurity measures. These features are often too complex for microcontrollers running a real-time OS. Linux, conversely, can meet these new requirements but was not designed for real-time constraints.
Since 2004, the PREEMPT_RT patch series has enabled the Linux kernel to meet these constraints. These patches are in the process of being merged into the mainline kernel. This makes Linux a solution of choice for developing modern real-time systems. This article evaluates the performance of different Linux kernel preemption settings. It studies the use of a minimal root file system to reduce the measured latencies. It also measures the impact of Linux kernel hardening measures on real-time performance.
Evaluation setup
This evaluation is performed on a Toradex[1] Apalis IMX6Q module and evaluation board. The i.MX6Q[2] is a quad-core system-on-chip from NXP. It is often used in industrial and multimedia applications because of its longevity and extensive Linux support. Toradex designs System-on-Modules (SoMs), with a focus on platform scalability and long-term availability. Savoir-faire Linux has accompanied multiple customers using Toradex modules and can vouch for their performance and reliability.
All latencies are measured with the cyclictest[3] tool, which is part of the rt-tests[4] test suite. Each measurement runs for one hour, with the cyclictest priority set to 90. In the background, the stress-ng[5] tool stresses all the CPUs to a specified load for the duration of the test. At the end of the test, the temperature and load of the CPU are measured to confirm that the stress was effective. The htop capture below shows that the load and priorities are correctly applied.
Fig. 1: htop output during stress and cyclictest
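To make the measured quantity concrete, here is a minimal Python sketch of the kind of cyclic wakeup measurement that cyclictest performs: a loop sleeps until an absolute deadline and records how late it actually wakes up. This is not cyclictest's implementation. The function names `measure_wakeup_latency` and `summarize`, the interval and the loop count are illustrative assumptions, and the Python interpreter adds its own overhead, so the values it reports are not comparable to the figures in this article.

```python
import time


def measure_wakeup_latency(interval_us=1000, loops=200):
    """Return a list of wakeup latencies in microseconds.

    Each cycle sleeps until an absolute deadline, then records how far
    past that deadline the thread actually resumed.
    """
    interval_ns = interval_us * 1000
    latencies_us = []
    next_deadline = time.monotonic_ns() + interval_ns
    for _ in range(loops):
        remaining_ns = next_deadline - time.monotonic_ns()
        if remaining_ns > 0:
            time.sleep(remaining_ns / 1e9)
        now = time.monotonic_ns()
        latencies_us.append((now - next_deadline) / 1000)
        # Deadlines are absolute, so a late wakeup does not shift later cycles.
        next_deadline += interval_ns
    return latencies_us


def summarize(latencies_us):
    """Return (min, avg, max) of the samples in microseconds."""
    return (min(latencies_us),
            sum(latencies_us) / len(latencies_us),
            max(latencies_us))


if __name__ == "__main__":
    low, avg, high = summarize(measure_wakeup_latency())
    print(f"min={low:.1f}us avg={avg:.1f}us max={high:.1f}us")
```

The maximum value is the one that matters for real-time behaviour, since a single late wakeup is enough to miss a deadline.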
The evaluation is made using different images generated with Yocto. The tested images are detailed below:
- preempt-none: A mainline kernel without preemption (CONFIG_PREEMPT_NONE) with a minimal poky distribution and only test utilities added, using a Savoir-faire Linux BSP
- preempt-voluntary: A mainline kernel with voluntary preemption (CONFIG_PREEMPT_VOLUNTARY) with a minimal poky distribution and only test utilities added, using a Savoir-faire Linux BSP
- preempt: A mainline kernel with preemption (CONFIG_PREEMPT) with a minimal poky distribution and only test utilities added, using a Savoir-faire Linux BSP
- preempt-rt: A mainline kernel with full preemption (CONFIG_PREEMPT_RT) with a minimal poky distribution and only test utilities added, using a Savoir-faire Linux BSP
- preempt-rt-hardened: A mainline kernel with full preemption (CONFIG_PREEMPT_RT) and kernsec[6] recommendations applied with a minimal poky distribution and only test utilities added, using a Savoir-faire Linux BSP
- ref-tdx: A reference image built on Toradex BSP, using a mainline kernel with full preemption (CONFIG_PREEMPT_RT), a reference Toradex distribution and core-image-minimal with test utilities added
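Since the images differ mainly in their kernel preemption option, it can be useful to confirm which option a given kernel was built with. The sketch below parses kernel configuration text (for example the contents of a `.config` file) and reports the strongest of the preemption options named above that is enabled. The function names and the sample configuration are hypothetical, and real configurations vary between kernel versions, so treat it as an illustration rather than a complete checker.

```python
# Preemption options from the list above, ordered from weakest to strongest.
PREEMPTION_OPTIONS = (
    "CONFIG_PREEMPT_NONE",
    "CONFIG_PREEMPT_VOLUNTARY",
    "CONFIG_PREEMPT",
    "CONFIG_PREEMPT_RT",
)


def enabled_options(config_text):
    """Return the set of options set to 'y' in kernel configuration text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skips blanks and "# CONFIG_FOO is not set" lines
        name, _, value = line.partition("=")
        if value == "y":
            enabled.add(name)
    return enabled


def preemption_model(config_text):
    """Return the strongest enabled preemption option, or None."""
    enabled = enabled_options(config_text)
    for option in reversed(PREEMPTION_OPTIONS):
        if option in enabled:
            return option
    return None


if __name__ == "__main__":
    # Hypothetical configuration fragment, for illustration only.
    sample = "# CONFIG_PREEMPT_NONE is not set\nCONFIG_PREEMPT_RT=y\n"
    print(preemption_model(sample))
```

Checking the strongest option first keeps the result meaningful if a configuration enables more than one of these symbols.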
A first measurement campaign evaluates the different preemption options. Stronger preemption is expected to reduce the maximum latency. It also tends to reduce the data throughput that can be processed, but throughput is not measured in this study.
Fig. 2: Effect of different preemption settings on the max. latency at different loads.
Results show that the PREEMPT_RT patchset efficiently reduces the maximum latency. At all loads, the maximum latency is lower than 150 µs. This makes it possible to use Linux to implement real-time systems. When implementing hard real-time systems, however, the maximum latency should be extensively tested to ensure that no deadlines are missed. Lower latencies can be attained through driver and kernel optimizations. An alternative solution is to use a microcontroller and an RTOS with proven determinism. But for a large variety of systems, the real-time capabilities of the Linux kernel are sufficient.
Impact of minimized BSP
A second point is the impact of the userland on the real-time performance of the system. Our BSP is minimal: it only integrates the services and applications required to run the test applications. In contrast, the standard Toradex images are more generic. The difference in performance between the two BSPs is shown below. Both are configured with the CONFIG_PREEMPT_RT option for full preemption.
Fig. 3: Comparison of latencies between the Toradex BSP and our BSP for different loads.
This difference in performance can be explained by the number of concurrent applications running on each system. The Toradex BSP embeds a large number of services to speed up product evaluation and development for a wide range of customers. The following table compares the two images.
| Metric | Toradex reference image | Savoir-faire Linux image |
| Image size | 46 MB | 16 MB |
| Installed packages | 549 | 64 |
| Boot time to prompt | 20 s | 12 s |
| Running processes at boot | 171 | 100 |
| Active systemd services | 35 | 20 |
| Running systemd services | 16 | 11 |
When a larger number of services run in the background, a process is more likely to be preempted by a higher-priority service task. This increases its completion time, and thus its latency. In contrast, a minimal image reduces the number of running services to the strict minimum, which limits interruptions.
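The relative differences implied by the table can be computed directly. The short sketch below does this arithmetic with the values copied from the table; the dictionary layout and the function name `relative_reduction` are illustrative choices.

```python
# (Toradex reference image, Savoir-faire Linux image), values from the table above.
IMAGE_METRICS = {
    "Image size (MB)": (46, 16),
    "Installed packages": (549, 64),
    "Boot time to prompt (s)": (20, 12),
    "Running processes at boot": (171, 100),
    "Active systemd services": (35, 20),
    "Running systemd services": (16, 11),
}


def relative_reduction(reference, minimal):
    """Return the reduction from reference to minimal, as a percentage."""
    return 100.0 * (reference - minimal) / reference


for name, (reference, minimal) in IMAGE_METRICS.items():
    print(f"{name}: {relative_reduction(reference, minimal):.1f}% lower")
```

For instance, the number of installed packages is about 88% lower in the minimal image, and the boot time is 40% lower.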
Impact of security configuration
Fig. 4: Impact of kernel hardening measures on real-time performance.
In some use cases, adding security constraints can decrease the performance of the application. For example, it is recommended to enable the kernel options CONFIG_INIT_ON_ALLOC_DEFAULT_ON and CONFIG_INIT_ON_FREE_DEFAULT_ON. These options force memory to be wiped when it is allocated or freed, to avoid exposing heap content. However, they have been measured to have a performance impact of up to 7% in some synthetic benchmarks[7].
This can affect the reliability of real-time systems if a deadline is missed because of security constraints. However, our measurements show that although hardening has an effect on the maximum latency, this impact is small. Moreover, it has no effect on the average latency. Note that the measurement point at 50% load is likely due to an outlier.
Fig. 5: Impact of kernel hardening measures on real-time performance.
Indeed, longer measurements (6 hours) show more consistency. At all loads, the hardened kernel configuration increases the maximum latency, but this impact stays negligible, especially at high loads. The average latency remains constant at all loads. This shows that real-time performance requirements are not a reason to avoid good security practices.
Load distribution observation
Some CPU cores show worse latencies than others in the same system-on-chip. When the system load increases, one or more cores tend to show higher latency measurements. This can be seen below in the results at 50% load for the unhardened PREEMPT_RT configuration.
Fig. 6: Latency plots for each CPU at 50% load, with CONFIG_PREEMPT_RT enabled
The CPU1 core seems to have slightly worse performance than the others. The final results confirm that this is the core with the maximum latency (83 µs). The average latency is only slightly higher (18 µs vs. 17 µs), as the high-latency measurements occur orders of magnitude less often than average measurements. This hints that the scheduler may not perfectly allocate tasks to CPU cores, and that this could increase the maximum latency. However, this is a complex topic that would require more in-depth investigation.
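The following sketch illustrates why a rare outlier sets a core's maximum latency while barely moving its average. The samples are synthetic and invented for illustration; they are not measurements from this study, and only the 17 µs and 83 µs magnitudes echo the values quoted above. The function names are hypothetical.

```python
from statistics import mean


def per_cpu_summary(samples_by_cpu):
    """Map each CPU id to (average, maximum) latency in microseconds."""
    return {cpu: (mean(samples), max(samples))
            for cpu, samples in samples_by_cpu.items()}


def worst_cpu(samples_by_cpu):
    """Return the CPU id with the highest maximum latency."""
    return max(samples_by_cpu, key=lambda cpu: max(samples_by_cpu[cpu]))


# Synthetic, made-up samples in microseconds (NOT data from this study).
synthetic = {
    0: [17] * 10_000,
    1: [17] * 9_999 + [83],
}

for cpu, (average, maximum) in per_cpu_summary(synthetic).items():
    print(f"CPU{cpu}: avg={average:.4f}us max={maximum}us")
print("worst CPU:", worst_cpu(synthetic))
```

In this synthetic case, a single 83 µs sample among 10,000 determines the maximum while shifting the average by less than 0.01 µs, which is why maximum and average latency have to be examined separately.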
Conclusions
Our results show that Linux is a suitable solution for real-time embedded systems development. Although the maximum measured latency is higher than that of dedicated RTOS solutions, Linux can be used to implement hard real-time systems. However, the Linux kernel is not formally proven to be deterministic, so latencies should be thoroughly evaluated to verify that the system does not miss deadlines. A minimal, customer-tailored BSP reduces latencies. Moreover, measurements show that a security-hardened kernel has a limited impact on real-time performance, so security measures do not have to be avoided to meet performance requirements.
Savoir-faire Linux has developed several BSPs for real-time and secured systems.
References: