Software systems are heavily configurable, in the sense that users can adapt them to their needs through configuration options. But not all configurations are equal: some are clearly more efficient than others in terms of performance. For a human being, it is too complex to reason about all the possible configurations of a system and to choose the one that reaches a given performance goal. Research has shown that machine learning can bridge this gap and predict the performance of a software system based on its configuration.

Problem. These techniques do not include the execution environment as part of the training data, even though it can interact with configuration options and change their performance distributions. In short, such machine learning models are too simplistic to be useful or applicable for end users.

Contributions. In this thesis, we first propose the term deep variability to refer to the interactions between the environment and the configuration options of a software system that alter its performance distribution. We then empirically demonstrate the existence of deep variability and propose a few solutions to tame the related issues. Finally, we show that machine learning models can be adapted to be robust to deep variability by design.
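To make the core idea concrete, here is a minimal sketch (not from the thesis; all data, names, and numbers are illustrative) of training a regression model that maps a configuration, together with one environment feature, to a measured performance value. The synthetic ground truth makes the environment interact with one option, which is exactly the kind of deep variability a configuration-only model would miss.

```python
# Minimal sketch of a configuration -> performance model, assuming
# synthetic data; in practice each row would come from benchmarking a
# real configurable system (e.g., x264) in a concrete environment.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500

# Three configuration options plus one environment feature
# (a hypothetical input-size factor).
option_a = rng.integers(0, 2, n)          # binary flag
option_b = rng.integers(0, 2, n)          # binary flag
threads = rng.integers(1, 9, n)           # numeric option
input_size = rng.uniform(0.5, 2.0, n)     # environment feature

# Synthetic ground truth: the environment (input_size) interacts with
# option_a, changing that option's effect on execution time.
time = (10 + 4 * option_a * input_size - 2 * option_b
        + 8 / threads + rng.normal(0, 0.5, n))

X = np.column_stack([option_a, option_b, threads, input_size])
X_train, X_test, y_train, y_test = train_test_split(X, time, random_state=0)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print(f"R^2 on held-out configurations: {model.score(X_test, y_test):.2f}")
```

Dropping the input_size column and retraining illustrates the problem the thesis targets: the interaction term then looks like unexplained noise to the configuration-only model, and its accuracy degrades.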
Authors
- Luc Lesoil

Bibliographic Reference
- Luc Lesoil. Deep software variability for resilient performance models of configurable systems. Other [cs.OH]. Université de Rennes, 2023. English. ⟨NNT : 2023URENS009⟩. ⟨tel-04190983v2⟩

Department
- Langage et Génie Logiciel (Language and Software Engineering)
HAL Identifier
- tel-04190983
Institutions
- Institut National de Recherche en Informatique et en Automatique
- Université de Rennes
- Institut National des Sciences Appliquées - Rennes
- Université de Bretagne Sud
- École normale supérieure - Rennes
- CentraleSupélec
- IMT Atlantique

Laboratories
- Inria Rennes – Bretagne Atlantique
- Institut de Recherche en Informatique et Systèmes Aléatoires

Published in
- France
Table of Contents
- List of acronyms
- List of figures
- List of tables
- I State Of The Art
- Background
- Configurable Systems
- Software Product Lines
- Software Variability
- Performance Properties
- On the Complexity of Predicting Performance
- Performance Models
- Machine Learning
- Sampling, Measuring, Learning
- Benefits
- Drawbacks
- State-of-the-Art
- Browsing the Related Work
- Impact of the Software Environment on Performance
- Input Sensitivity
- Hardware Platforms
- Compile-time Variability
- Existing Solutions
- Transfer Learning
- Contextual Performance Models
- II Empirical Evidence of Deep Variability
- Introducing Deep Variability
- Definition
- Motivational Example
- Challenges and Opportunities
- Conclusion
- Exploring the Input Sensitivity of Configurable Systems
- Problem Statement
- Motivational Example
- Sensitivity to Inputs of Configurable Systems
- The Input Dataset
- Performance Correlations between Inputs (RQ1)
- Effects of Options (RQ2)
- Impact of Inputs on Performance (RQ3)
- Threats to Validity
- A Score to Quantify Input Sensitivity
- Implications, Insights and Open Challenges
- Conclusion
- The Interplay between Compile-time and Run-time Variability
- Problem Statement
- Motivational Example
- Interplay between Compile-time and Run-time Options
- The Compile-time Dataset
- Run-time Performance Distributions (RQ1)
- Quantify Compile-time Performance Variations (RQ2)
- Interplay between Compile-time and Run-time Options (RQ3)
- Cross-Layer Tuning (RQ4)
- Threats
- Discussion
- Conclusion
- III Exploit Deep Variability to Extend the Lifespan of Performance Models
- Reuse Performance Models across Environments
- Motivational Example
- Group Inputs across Space (RQ1)
- Group Hardware Platforms across Time (RQ2)
- Protocol
- Results
- Group Inputs across Time (RQ3)
- Protocol
- Results
- Discussion
- Conclusion
- Reuse Performance Models across Software Systems
- Identify Similar Software Systems (RQ1)
- Protocol
- Evaluation
- Transfer Learning across Distinct Software Systems: A Proof of Concept (RQ2)
- Protocol
- Evaluation
- Threats to Validity
- Discussion
- Find Transferable Software Systems
- When and How to Transfer?
- What to Transfer?
- Conclusion
- IV Train Performance Models Resilient to Deep Variability
- Train Input-aware Performance Models
- Problem Statement
- Input-Aware Performance Models
- Offline and Online Costs
- User Stories
- Using Properties to Discriminate Inputs
- Implementation of Performance Prediction Models
- Selecting Algorithm (RQ1)
- Selecting Inputs (RQ2)
- Selecting Configurations (RQ3)
- Selecting Approaches (RQ4)
- Discussion
- Threats to Validity
- Conclusion
- Conclusion and Perspectives
- Conclusion
- Perspectives
- Estimate the Uncertainty of Scientific Results
- Towards Collectively Defining a Dataset for Deep Variability
- Mining Open Data to Infer Information related to Deep Variability
- Deep Variability-Aware Hacking
- List of publications
- Appendix
- The Phoronix Dataset
- Mine Open Data to Predict Hardware Performance
- What is my Hardware Model Worth?
- Bibliography