Contents vii
Abstract xv
Résumé xvii
Acknowledgements xix
Author’s publication list xxi
List of Figures xxiii
List of Tables xxvii
List of Algorithms xxix
List of Listings xxxi
List of Abbreviations xxxiii
I State of the art 1
1 Introduction 3
1.1 Real-time embedded system design . . . . 5
1.2 SWaP constraints and modern parallelism . . . . 8
1.3 The Hipperos project . . . . 9
1.4 Research objectives of the thesis . . . 10
1.5 State of the art and contributions . . . 14
1.6 Dissertation structure . . . 15
2 Power-aware computing 19
2.1 Digital system power consumption . . . 19
2.2 Impacts of device power consumption . . . 28
2.2.1 Electric energy consumption . . . 29
2.2.2 Operating temperature . . . 29
2.2.3 Large scale impacts . . . 31
2.3 Measuring power consumption . . . 32
2.4 Techniques of power management . . . 34
2.4.1 Capacitance reduction . . . 34
2.4.2 Transistor leakage reduction . . . 35
2.4.3 Voltage scaling . . . 35
2.4.4 Frequency scaling . . . 36
2.4.5 Dynamic Voltage and Frequency Scaling . . . 41
2.4.6 Switching activity control . . . 45
2.4.7 High level strategies . . . 47
2.4.8 CPU characteristics . . . 49
2.5 A case study: the Linux kernel . . . 49
2.5.1 Core modules presentation . . . 50
2.5.2 Implementation techniques at scheduler level . . . 51
2.5.3 CPU frequency selection strategy . . . 51
2.5.4 Idle power usage . . . 54
2.5.5 Module decoupling problems . . . 55
2.6 Conclusions on power management . . . 56
3 The multi-core era 57
3.1 Principles of processor design . . . 57
3.2 Limitations of the single-core approach . . . 63
3.3 The multi-core processor design paradigm . . . 66
3.4 Multi-core platforms for embedded systems . . . 68
3.5 Conclusion . . . 70
4 Parallel programming model 71
4.1 Multi-core scheduling . . . 71
4.2 Parallel computation model . . . 73
4.2.1 Model, algorithms and limitations . . . 73
4.2.2 Parallel languages . . . 74
4.2.3 Multi-threaded algorithms . . . 75
4.2.4 Theoretical limits . . . 84
4.3 Programming frameworks . . . 94
4.4 Issues and risks of using parallelism . . . 99
4.5 Conclusion on parallel programming . . . 100
5 Real-time scheduling theory 103
5.1 A real-time embedded system example . . . 103
5.2 A model for time in embedded systems . . . 105
5.3 Model formalism . . . 106
5.3.1 Time model . . . 106
5.3.2 Application model . . . 106
5.3.3 Platform model . . . 112
5.3.4 Power consumption model . . . 114
5.3.5 Scheduling model . . . 118
5.3.6 Parallelism model . . . 125
5.4 Chosen results and scheduling algorithms . . . 127
5.4.1 Definitions and feasibility . . . 127
5.4.2 Single-core scheduling . . . 128
5.4.3 Multi-core scheduling . . . 130
5.4.4 Power-aware scheduling . . . 133
5.4.5 Parallel scheduling . . . 133
II Contributions 135
6 The design of an operating system micro-kernel for multi-core embedded systems 137
6.1 The need for a new kernel . . . 137
6.2 RTOS research and implementation challenges . . . 140
6.3 A new RTOS for embedded multi-core platforms . . . 143
6.3.1 Hipperos project presentation . . . 143
6.3.2 System overview . . . 144
6.3.3 Process model . . . 145
6.3.4 Asymmetric kernel architecture . . . 147
6.3.5 Kernel configurability . . . 150
6.4 Summary . . . 150
7 Porting a safety-critical industrial application on a mixed-criticality enabled real-time operating system 153
7.1 Introduction . . . 154
7.2 Hipperos RTOS details for the use case . . . 155
7.3 The Thales use case . . . 156
7.4 Mixed-criticality model . . . 158
7.4.1 Task types . . . 158
7.4.2 Mode switching . . . 159
7.4.3 Mixed-criticality scheduling algorithms . . . 159
7.5 Mixed-criticality considerations . . . 160
7.6 Experiments . . . 161
7.6.1 Hardware platforms . . . 161
7.6.2 Experimental design . . . 163
7.6.3 The task set . . . 163
7.6.4 Effect of the LO WCET . . . 166
7.6.5 Per-core utilisation . . . 171
7.7 Conclusions . . . 171
8 Power minimisation for parallel real-time systems with malleable jobs and homogeneous frequencies 173
8.1 Introduction . . . 173
8.2 Related work . . . 176
8.3 Models . . . 177
8.3.1 Parallel jobs and task model . . . 177
8.3.2 Power and processor model . . . 180
8.3.3 Scheduling algorithm . . . 186
8.3.4 Problem definition . . . 188
8.4 Preliminary results . . . 188
8.4.1 Background . . . 188
8.4.2 Schedulability criteria of malleable task system with homogeneous frequency . . . 189
8.4.3 Sustainability of the frequency for the schedulability . . . 190
8.5 Optimal processor/frequency-selection algorithm . . . 197
8.5.1 Algorithm description . . . 197
8.5.2 An example . . . 201
8.5.3 Proof of correctness . . . 201
8.6 Practical considerations . . . 204
8.6.1 Continuous range frequency selection . . . 204
8.6.2 Linear dependency between frequency and job execution speed . . . 206
8.7 Simulations . . . 207
8.7.1 Methodology overview . . . 207
8.7.2 Results & Discussion . . . 212
8.8 Conclusions . . . 217
9 Quantifying energy consumption for practical fork-join parallelism on an embedded real-time operating system 219
9.1 Introduction . . . 220
9.2 Parallel run-time framework . . . 222
9.2.1 Intra-task parallelism and OpenMP . . . 222
9.2.2 The Hipperos RTOS . . . 226
9.2.3 Embedded platform . . . 227
9.3 Experimental setup . . . 228
9.3.1 Experimental test bed presentation . . . 228
9.3.2 Use cases description . . . 230
9.4 Expected power savings . . . 231
9.4.1 Power dissipation model . . . 232
9.4.2 Measuring a running OpenMP program . . . 233
9.5 System model . . . 238
9.5.1 Parallel task model . . . 238
9.5.2 Platform, power and energy Models . . . 241
9.6 Algorithms and policies . . . 242
9.6.1 Offline Power Optimisation . . . 243
9.6.2 Partitioned Schedulability Test . . . 244
9.6.3 Online Scheduler . . . 244
9.7 System experiments . . . 245
9.7.1 Methodology overview . . . 245
9.7.2 Results & Discussion . . . 247
9.7.3 Flaw in schedulability test . . . 250
9.7.4 Methodology details . . . 251
9.8 Related work . . . 256
9.9 Future work . . . 257
9.10 Conclusions . . . 258
III Conclusions and future work 261
10 Scheduling on heterogeneous reconfigurable platforms 263
10.1 Introduction . . . 263
10.2 Recent innovations in embedded processor architecture . . . 264
10.3 Reconfigurable platforms . . . 265
10.4 Related work . . . 266
10.5 Hipperos supporting DPR on RSoC devices . . . 268
10.5.1 DPR implementation . . . 268
10.5.2 Hardware-accelerated image filters . . . 271
10.6 The H2020 Tulipp project . . . 275
10.6.1 Project presentation . . . 275
10.6.2 The Tulipp use cases . . . 276
10.6.3 The Tulipp Reference Platform Instance . . . 284
10.7 Tulipp and Hipperos challenges for heterogeneous systems . . . 292
11 Conclusions 295
11.1 Summary of contributions and research questions . . . 295
11.2 Weaknesses and future directions . . . 300
11.3 Journey and lessons learned . . . 304
Bibliography 307