My Publications
You can also browse my Google Scholar profile
-
P. Lago, P. Runeson, Q. Song, R. Verdecchia,
Threats to Validity in Software Engineering–hypocritical paper section or essential analysis?
International Symposium on Empirical Software Engineering and Measurement (ESEM), 2024.
[Abstract] [BibTeX] [PDF]@article{lago2023threats, title={Threats to Validity in Software Engineering–hypocritical paper section or essential analysis?}, author={Lago, Patricia and Runeson, Per and Song, Qunying and Verdecchia, Roberto}, journal={Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement}, publisher={ACM/IEEE}, year={2024} }
Background: In recent years, a discourse on how to systematically consider and report threats to validity started to gain momentum within the empirical software engineering community.
Aims: With this study, we aim to systematically underpin the current state of threats to validity practices in software engineering research.
Method: We conduct a literature review comprising 91 papers awarded with the ACM SIGSOFT Distinguished Paper Award at the ACM/IEEE International Conference on Software Engineering. Data is extracted and analyzed by considering six main facets of threats to validity, e.g., their explicit documentation, categorization, discussion of limitations, and trade-offs.
Results: The results corroborate current critiques of the state of the art in threat management. Threats are seldom discussed in depth and are mostly treated as an enforced afterthought rather than an active concern of research design and execution.
Conclusions: To improve the observed practice, we derived items to consider for researchers, reviewers and readers, and call for a community action to increase the understanding of knowledge creation in empirical software engineering research. -
S. Migliorini, R. Verdecchia, I. Malavolta, P. Lago, E. Vicario
Architectural Views: The State of Practice in Open-Source Software Projects
European Conference on Software Architecture (ECSA), 2024.
🏆Best Paper Award.
[Abstract] [BibTeX] [PDF] [Slides]@article{migliorini2024architectural, title={Architectural Views: The State of Practice in Open-Source Software Projects}, author={Migliorini, Sofia and Verdecchia, Roberto and Malavolta, Ivano and Lago, Patricia and Vicario, Enrico}, journal={European Conference on Software Architecture (ECSA)}, publisher={Springer}, year={2024} }
Context: Architectural views serve as fundamental artefacts for designing and communicating software architectures. In the context of collaborative software development, producing sound architectural documentation, where architectural views play a central role, is a crucial aspect for effective teamwork. Despite their importance, the use of architectural views in open-source projects to date remains only marginally explored.
Goal: We aim at conducting a comprehensive analysis on an extensive corpus of open-source architectural views. The goal is to understand (i) what the "history" of architectural views is, (ii) how architectural views are represented, and (iii) what architectural views are used for in the context of open-source projects.
Methods: We leverage a software repository mining process to systematically construct a dataset of 15k architectural views. Then, we perform (i) a quantitative analysis on the metadata of all 15k views and (ii) a qualitative analysis on a statistically-relevant sample of 373 views.
Results: Most projects rely on a single architectural view, which is often used to document a medium or high level description of the architecture. Views are usually created at either the beginning or at the end of a project, are rarely updated, and tend to be maintained by a single contributor. Views usually adopt an informal colored notation without a supporting legend and frequently report technologies used. Deployment and control flow are the most recurrent viewpoints, and commonly cover concerns related to software maintainability and functional suitability.
Conclusion: The state of the practice about architectural views in open-source software systems seems to favor informal descriptions. Despite this, the effort needed to create views might hinder keeping views up to date, and a common syntactic ground between viewpoints seems hard to find. To address current needs, we speculate that a solution could lie in defining and popularizing versionable, templateable views that can be integrated in collaborative programming environments. -
Á. Domingo Reguero, S. Martínez-Fernández, R. Verdecchia
Energy-efficient neural network training through runtime layer freezing, model quantization, and early stopping
Computer Standards & Interfaces, 2024.
[Abstract] [BibTeX] [PDF]@article{reguero2024energy, title={Energy-efficient neural network training through runtime layer freezing, model quantization, and early stopping}, author={Domingo Reguero, \'Alvaro and Mart\'inez-Fern\'andez, Silverio and Verdecchia, Roberto}, journal={Computer Standards \& Interfaces}, publisher={Elsevier}, year={2024} }
Background: In the last years, neural networks have been massively adopted by industry and research in a wide variety of contexts. Neural network milestones are generally reached by scaling up computation, completely disregarding the carbon footprint required for the associated computations. This trend has become unsustainable given the ever-growing use of deep learning, and could cause irreversible damage to the environment of our planet if it is not addressed soon.
Objective: In this study, we aim to analyze not only the effects of different energy saving methods for neural networks but also the effects of the moment of intervention, and what makes certain moments optimal.
Method: We developed a novel dataset by training convolutional neural networks on 12 different computer vision datasets and applying runtime decisions regarding layer freezing, model quantization, and early stopping at different epochs in each run. We then fit an auto-regressive prediction model on the collected data, capable of predicting the accuracy and energy consumption achieved in future epochs for different methods. The predictions on accuracy and energy are used to estimate the optimal training path.
Results: Following the predictions of the model can save 56.5% of the energy consumed while also increasing validation accuracy by 2.38% by avoiding overfitting. The prediction model can predict the validation accuracy with an error of 8.4%, the energy consumed with an error of 14.3%, and the trade-off between both with an error of 8.9%.
Conclusions: This prediction model could potentially be used by the training algorithm to decide which methods to apply to the model, and at what moment, in order to maximize the accuracy-energy trade-off. -
T. Trinci, S. Magistri, R. Verdecchia, A. D. Bagdanov
How green is continual learning, really? Analyzing the energy consumption in continual training of vision foundation models
International Workshop on Green Foundation Models (GreenFOMO), 2024.
[Abstract] [BibTeX] [PDF]@article{trinci2024how, title={How green is continual learning, really? Analyzing the energy consumption in continual training of vision foundation models}, author={Trinci, Tomaso and Magistri, Simone and Verdecchia, Roberto and Bagdanov, Andrew D.}, journal={International Workshop on Green Foundation Models (GreenFOMO)}, publisher={Springer}, year={2024} }
With the ever-growing adoption of AI, its impact on the environment is no longer negligible. Despite the potential that continual learning could have towards Green AI, its environmental sustainability remains relatively uncharted. In this work we aim to gain a systematic understanding of the energy efficiency of continual learning algorithms. To that end, we conducted an extensive set of empirical experiments comparing the energy consumption of recent representation-, prompt-, and exemplar-based continual learning algorithms and two standard baselines (fine tuning and joint training) when used to continually adapt a pre-trained ViT-B/16 foundation model. We performed our experiments on three standard datasets: CIFAR-100, ImageNet-R, and DomainNet. Additionally, we propose a novel metric, the Energy NetScore, which we use to measure algorithm efficiency in terms of the energy-accuracy trade-off. Through numerous evaluations varying the number and size of the incremental learning steps, our experiments demonstrate that different types of continual learning algorithms have very different impacts on energy consumption during both training and inference. Although often overlooked in the continual learning literature, we found that the energy consumed during the inference phase is crucial for evaluating the environmental sustainability of continual learning models. -
Z. Codabux, F. Fard, R. Verdecchia, F. Palomba, D. Di Nucci, G. Recupito
Teaching Mining Software Repositories
Teaching Empirical Research Methods in Software Engineering (Springer), 2024.
[Abstract] [BibTeX] [PDF]@article{codabux2024teaching, title={Teaching Mining Software Repositories}, author={Codabux, Zadia and Fard, Fatemeh and Verdecchia, Roberto and Palomba, Fabio and Di Nucci, Dario and Recupito, Gilberto}, journal={Teaching Empirical Research Methods in Software Engineering}, publisher={Springer}, year={2024} }
Mining Software Repositories (MSR) has become a popular research area recently. MSR analyzes different sources of data, such as version control systems, code repositories, defect tracking systems, archived communication, deployment logs, and so on, to uncover interesting and actionable insights from the data for improved software development, maintenance, and evolution. This chapter provides an overview of MSR and how to conduct an MSR study, including setting up a study, formulating research goals and questions, identifying repositories, extracting and cleaning the data, performing data analysis and synthesis, and discussing MSR study limitations. Furthermore, the chapter discusses MSR as part of a mixed method study, how to mine data ethically, and gives an overview of recent trends in MSR as well as reflects on the future. As a teaching aid, the chapter provides tips for educators, exercises for students at all levels, and a list of repositories that can be used as a starting point for an MSR study. -
N. Pollini, K. Maggi, R. Verdecchia, E. Vicario
Learning Programming without Teachers: An Ongoing Ethnographic Study at 42
Workshop on evaLuation and assEssment in softwARe eNgineers Education and tRaining (LEARNER), 2024.
[Abstract] [BibTeX] [PDF]@article{pollini2024learning, title={Learning Programming without Teachers: An Ongoing Ethnographic Study at 42}, author={Pollini, Niccoló and Maggi, Kevin and Verdecchia, Roberto and Vicario, Enrico}, journal={Workshop on evaLuation and assEssment in softwARe eNgineers Education and tRaining (LEARNER)}, publisher={ACM}, year={2024} }
Context: With the ever-evolving software landscape, methods to train software programmers are continuously advancing and evolving. In this investigation, we study the case of 42, a programming school with over 50 campuses worldwide. 42’s pedagogical method blends elements of problem-based learning, peer pedagogy, community building, and gamification.
Objectives: The goal of the research is twofold: On one hand, to gain a deep understanding of the pedagogical method itself, and on the other hand, to study how its different components affect learning.
Method: We adopt an ethnographic qualitative inquiry, with two academic researchers conducting participant observation over a period of six months by using activity theory as theoretical underpinning.
Results: Problems of incremental difficulty, albeit challenging, foster virtuous cycles of reinforcing feedback and community building. Gamification and peer learning elements, which are deeply rooted in the carefully crafted educational recipe, further support the pedagogical method.
Conclusions: The characteristic nature of 42 positions it as an outlier compared to the recurrent academic setting of frontal lectures followed by a final exam, making it a valuable case study to understand how various pedagogical components may function, interact, and affect student learning. -
K. Maggi, R. Verdecchia, L. Scommegna, E. Vicario
CLAIM: a Lightweight Approach to Identify Microservices in Dockerized Environments
International Conference on Evaluation and Assessment in Software Engineering (EASE), 2024.
[Abstract] [BibTeX] [PDF]@article{maggi2023claim, title={CLAIM: a Lightweight Approach to Identify Microservices in Dockerized Environments}, author={Maggi, Kevin and Verdecchia, Roberto and Scommegna, Leonardo and Vicario, Enrico}, journal={International Conference on Evaluation and Assessment in Software Engineering (EASE)}, publisher={ACM}, year={2024} }
Background: Over the past decade, microservices have surged in popularity within software engineering. From a research viewpoint, mining studies are frequently employed to assess the evolution of diverse microservice properties. Despite the growing need, a validated static method to swiftly identify microservices seems to be currently missing in the literature.
Aims: We present CLAIM, a lightweight static approach that analyzes configuration files to identify microservices in Dockerized environments, specifically designed with mining studies in mind.
Method: To validate CLAIM, we conduct an empirical experiment comprising 20 repositories, 160 microservices, and 13k commits. A priori and manually defined ground truths are used to evaluate CLAIM's microservice identification effectiveness and efficiency.
Results: CLAIM detects microservices with an accuracy of 82.0%, reports a median execution time of 61ms per commit, and requires in the worst case scenario 125.5s to analyze the history of a repository comprising 1509 commits. With respect to its closest competitor, CLAIM shines most in terms of false positive reduction (-40%).
Conclusions: While not able to reconstruct a microservice architecture in its entirety, CLAIM is an effective and efficient option to swiftly identify microservices in Dockerized environments, and seems especially suited for software evolution mining studies. -
R. Verdecchia, K. Maggi, L. Scommegna, E. Vicario
Technical Debt in Microservices: A Mixed-Method Case Study
Post-proceedings of the European Conference on Software Architecture, 2024.
[Abstract] [BibTeX] [PDF]@article{verdecchia2024technical, title={Technical Debt in Microservices: A Mixed-Method Case Study}, author={Verdecchia, Roberto and Maggi, Kevin and Scommegna, Leonardo and Vicario, Enrico}, journal={Post-proceedings of the European Conference on Software Architecture}, publisher={Springer}, year={2024} }
Background: Despite the rising interest of both academia and industry in microservice-based architectures and technical debt, the landscape remains uncharted when it comes to exploring the technical debt evolution in software systems built on this architecture.
Aims: This study aims to unravel how technical debt evolves in software-intensive systems that utilize microservice architecture, focusing on (i) the patterns of its evolution, and (ii) the correlation between technical debt and the number of microservices.
Method: We employ a mixed-method case study on an application with 13 microservices, 977 commits, and 38k lines of code. Our approach combines repository mining, automated code analysis, and manual inspection. The findings are discussed with the lead developer in a semi-structured interview, followed by a reflexive thematic analysis.
Results: Despite periods of no TD growth, TD generally increases over time. TD variations can occur irrespective of microservice count or commit activity. TD and microservice numbers are often correlated. Adding or removing a microservice impacts TD similarly, regardless of existing microservice count.
Conclusions: Developers must be cautious about the potential technical debt they might introduce, irrespective of the development activity conducted or the number of microservices involved. Maintaining steady technical debt over prolonged periods of time is possible, but growth, particularly during innovative phases, may be unavoidable. While monitoring technical debt is the key to start managing it, technical debt code analysis tools must be used wisely, as their output always also requires a qualitative understanding of the system to gain the complete picture. -
L. Scommegna, R. Verdecchia, E. Vicario
Unveiling Faulty User Sequences: A Model-based Approach to Test Three-Tier Software Architectures
Journal of Software and Systems, 2024.
[Abstract] [BibTeX] [PDF]@article{sgommegna2024testing, title={Unveiling Faulty User Sequences: A Model-based Approach to Test Three-Tier Software Architectures}, author={Scommegna, Leonardo and Verdecchia, Roberto and Vicario, Enrico}, journal={Journal of Software and Systems}, publisher={Springer}, year={2024} }
Context. When testing three-tiered architectures, strategies often rely on superficial information, e.g., black-box input. However, the correct behavior of software-intensive systems based on such an architectural pattern also depends on the logic hidden behind the interface. Verifying the response process is thus often complex and requires ad-hoc strategies.
Objective. We propose an approach to identify faults hidden behind the presentation layer. The model-based approach uses an architectural abstraction called managed component Data Flow Graph (mcDFG). The mcDFG is aware of the interactions between all layers of the architecture and guides the generation of tests based on different mcDFG coverage criteria to identify faults in the business logic.
Method. To evaluate the viability of the approach, we consider a three-tiered web application and 32 faults. The fault detection capability is assessed by comparing a set of test suites created by following our method and a set of test suites developed by utilizing traditional testing strategies.
Results. The collected data show that the proposed model-based approach is a viable option to identify faults hidden in the logic layer, as it can outperform standard strategies based solely on the presentation layer while keeping the number of test cases and number of interactions per test case low. -
M. Funke, P. Lago, R. Verdecchia, R. Donker
A Process for Monitoring the Impact of Architecture Principles on Sustainability: An Industrial Case Study
MDPI Software, 2024.
[Abstract] [BibTeX] [PDF]@article{funke2024aprocess, title={A Process for Monitoring the Impact of Architecture Principles on Sustainability: An Industrial Case Study}, author={Funke, Markus and Lago, Patricia and Verdecchia, Roberto and Donker, Roel}, journal={Software}, publisher={MDPI}, year={2024} }
Abstract: Architecture principles affect a software system holistically. Given their alignment with a business strategy, they should be incorporated within the validation process covering aspects of sustainability. However, current research discusses the influence of architecture principles on sustainability in a limited context. Our objective is to introduce a reusable process for monitoring and evaluating the impact of architecture principles on sustainability from a software architecture perspective. We seek to demonstrate the application of such a process in professional practice. A qualitative case study was conducted in the context of a Dutch airport management company. Data collection involved the case analysis and the execution of two rounds of expert interviews. We (i) identified a set of case-related Key Performance Indicators, (ii) utilized commonly accepted measurement tools, and (iii) employed graphical representations in the form of spider charts to monitor the sustainability impact. The real-world observations were evaluated through a concluding focus group. Our findings indicate that architecture principles are a feasible mechanism to address sustainability across all different architecture layers within the enterprise. The experts considered the sustainability analysis valuable in guiding the software architecture process towards sustainability. With the emphasis on principles, we facilitate industry adoption by embedding sustainability in existing mechanisms. -
R. Verdecchia, L. Scommegna, E. Vicario, T. Pecorella
Designing a Future-Proof Reference Architecture for Network Digital Twins
Post-proceedings of the European Conference on Software Architecture, 2024.
[Abstract] [BibTeX] [PDF]@article{verdecchia2024designing, title={Designing a Future-Proof Reference Architecture for Network Digital Twins}, author={Verdecchia, Roberto and Scommegna, Leonardo and Pecorella, Tommaso and Vicario, Enrico}, journal={Post-proceedings of the European Conference on Software Architecture}, publisher={Springer}, year={2024} }
As the complexity, distribution, and heterogeneity of networks continue to grow, how to architect and monitor these networking environments is becoming an increasingly critical open issue. Digital twins, which can replicate the structure and behavior of a physical network, are seen as a potential solution to address the problem. While reference architectures for digital twins exist in other fields, a comprehensive reference architecture for the networking context has yet to be developed. This paper discusses the need for such a reference architecture and outlines the key elements necessary for its design. We present the findings of a preliminary survey that explores the need for a network digital twin reference architecture, the crucial information it should include, and practical insights into its design. The survey results confirm that existing standards are inadequate for modeling network digital twins, outlining the necessity of a new reference architecture. We then articulate our position on the need for a reference architecture for network digital twins, focusing on three main aspects, namely: (i) digital twins of what, (ii) for what, and (iii) how to deploy them. We then proceed to delineate the fundamental obstacles that a reference architecture must confront, in tandem with the essential characteristics it needs to embody to successfully navigate these challenges. In conclusion, we present our vision for the reference architecture and outline the main research steps we plan to take to address this open problem. Our ultimate goal is to tightly collaborate both with the networking and digital twin software architecture communities to jointly establish a sound network digital twin architecture of the future. -
R. Verdecchia, E. Engström, P. Lago, P. Runeson, Q. Song
Threats to Validity in Software Engineering Research: A Critical Reflection
Information and Software Technology (IST), 2023.
[Abstract] [BibTeX] [PDF]@article{verdecchia2023threats, title={Threats to Validity in Software Engineering Research: A Critical Reflection}, author={Verdecchia, Roberto and Engstr\"om, Emelie and Lago, Patricia and Runeson, Per and Song, Qunying}, journal={Information and Software Technology (IST)}, publisher={Elsevier}, year={2023} }
Context: In the contemporary body of software engineering literature, some recurrent shortcomings characterize how threats to validity (TTV) are considered in studies.
Objective: With this position paper, we aim to open a discourse on the current use of TTV sections. The goal of our position is to jointly reflect and systematically improve how we, as a research community, consider TTV in our studies.
Method: Based on our personal experience as researchers, authors, reviewers, and editors, we critically reflect on the treatment of TTV in current empirical software engineering literature.
Results: We discuss the key shortcomings of TTV consideration, including the failure to acknowledge different types of validity categorizations and the tendency to treat threats just as an afterthought. For each identified problem, we propose a vision for an improved state, intending to catalyze thoughtful engagement and improvements in the way our community addresses TTV.
Conclusion: We posit there is an urgent need to reconsider how we approach, document, and evaluate TTV in software engineering research. -
R. Verdecchia, K. Maggi, L. Scommegna, E. Vicario
Tracing the Footsteps of Technical Debt in Microservices: A Preliminary Case Study
International Workshop on Quality in Software Architecture (QUALIFIER), 2023.
[Abstract] [BibTeX] [PDF]@article{verdecchia2023tracing, title={Tracing the Footsteps of Technical Debt in Microservices: A Preliminary Case Study}, author={Verdecchia, Roberto and Maggi, Kevin and Scommegna, Leonardo and Vicario, Enrico}, journal={International Workshop on Quality in Software Architecture (QUALIFIER)}, publisher={Springer}, year={2023} }
Background: Despite the growing academic and industrial interest in microservice architectures and technical debt, to date no study has investigated the evolution characteristics of technical debt in software-intensive systems based on such an architecture.
Aims: The goal of this study is to understand how technical debt evolves in microservice-based software-intensive systems, in terms of (i) evolution trends, and (ii) relation between technical debt and number of microservices.
Method: We adopt a case study based on an application comprising 13 microservices, 977 commits, and 38k lines of code. The research method is based on repository mining and automated source code analysis complemented via manual code inspection.
Results: While long periods of development without TD increase are observed, TD overall increases over time. TD variations can happen regardless of the number of microservices and development activity considered in a commit. TD and number of microservices are strongly correlated, albeit not always. Adding (or removing) a microservice has a similar impact on TD regardless of the number of microservices already present in a software-intensive system.
Conclusions: Adherence to microservice architecture principles might keep technical debt compartmentalized within microservices and hence more manageable. Developers should pay keen attention to the technical debt they may introduce, regardless of the number of microservices they touch with a commit and the development activity they carry out. Keeping technical debt constant during the evolution of a microservice-based architecture is possible, but the growth of technical debt as a software-intensive system becomes bigger and more complex might be inevitable. -
J. Balanza-Martinez, P. Lago, R. Verdecchia
Tactics for Software Energy Efficiency: A Review
International Conference on Environmental Informatics (EnviroInfo), 2023.
[Abstract] [BibTeX] [PDF]@article{balanza2023tactics, title={Tactics for Software Energy Efficiency: A Review}, author={Balanza-Martinez, Jose and Lago, Patricia and Verdecchia, Roberto}, journal={International Conference on Environmental Informatics (EnviroInfo)}, publisher={Springer}, year={2023} }
Context: Over the years, software systems have become increasingly widespread. With this, the energy they consume has grown exponentially, surpassing that of the entire aviation sector. Energy efficiency tactics can be used to optimize software energy consumption.
Objectives: In this work, we aim at understanding the state of the art of energy efficient tactics, in terms of activities in the field, tactic properties, tactic evaluation rigor, and potential for industrial adoption.
Method: We leverage a systematic literature review based on a search query and two rounds of bi-directional snowballing. We identify 142 primary studies, reporting on 163 tactics, which we extract and analyze via a mix of qualitative and quantitative research methods.
Results: The research interest in the topic peaked in 2015 and then steadily declined. Tactics on source code static optimizations and application level dynamic monitoring are the most frequently studied. Industry involvement is limited. This potentially creates a vicious cycle in which practitioners cannot apply tactics due to low industrial relevance, and academic researchers struggle to increase the industrial relevance of their findings.
Conclusions: Although the energy consumed by software is a growing concern, the future of energy efficiency tactics research does not look bright. Our results point to a call for action: academic researchers and industrial practitioners need to join forces to create real impact. -
J. A. Edbert, S. J. Oishwee, S. Karmakar, Z. Codabux, R. Verdecchia
Exploring Technical Debt in Security Questions on Stack Overflow
International Symposium on Empirical Software Engineering and Measurement (ESEM), 2023.
[Abstract] [BibTeX] [PDF]@article{ebert2023exploring, title={Exploring Technical Debt in Security Questions on Stack Overflow}, author={Edbert, Aldrich Joshua and Oishwee, Sahrima Jannat and Karmakar, Shubhashis and Codabux, Zadia and Verdecchia, Roberto}, journal={International Symposium on Empirical Software Engineering and Measurement (ESEM)}, publisher={ACM/IEEE}, year={2023} }
Background: Software security is crucial to ensure that the users are protected from undesirable consequences such as malware attacks which can result in loss of data and, subsequently, financial loss. Technical Debt (TD) is a metaphor incurred by suboptimal decisions resulting in long-term consequences such as increased defects and vulnerabilities if not managed. Although previous studies have studied the relationship between security and TD, examining their intersection in developers’ discussion on Stack Overflow (SO) is still unexplored.
Aims: This study investigates the characteristics of security-related TD questions on SO. More specifically, we explore the prevalence of TD in security-related queries, identify the security tags most prone to TD, and investigate which user groups are more aware of TD.
Method: We mined 117,233 security-related questions on SO and used a deep-learning approach to identify 45,078 security-related TD questions. Subsequently, we conducted quantitative and qualitative analyses of the collected security-related TD questions, including sentiment analysis.
Results: Our analysis revealed that 38% of the security questions on SO are security-related TD questions. The most recurrent tags among the security-related TD questions emerged as “security” and “encryption.” The latter typically have a neutral sentiment, are lengthier, and are posed by users with higher reputation scores.
Conclusions: Our findings reveal that developers implicitly discuss TD, suggesting developers have a potential knowledge gap regarding the TD metaphor in the security domain. Moreover, we identified the most common security topics mentioned in TD-related posts, providing valuable insights for developers and researchers to assist developers in prioritizing security concerns in order to minimize TD and enhance software security. -
R. Verdecchia, J. Sallou, L. Cruz
A Systematic Review of Green AI
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2023.
[Abstract] [BibTeX] [PDF]@article{verdecchia2023green, title={A Systematic Review of Green AI}, author={Verdecchia, Roberto and Sallou, June and Cruz, Luís}, journal={Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery}, publisher={Wiley}, year={2023} }
With the ever-growing adoption of AI-based systems, the carbon footprint of AI is no longer negligible. AI researchers and practitioners are therefore urged to hold themselves accountable for the carbon emissions of the AI models they design and use. In recent years, this has led to the emergence of research tackling AI environmental sustainability, a field referred to as Green AI. Despite the rapid growth of interest in the topic, a comprehensive overview of Green AI research is to date still missing. To address this gap, in this paper, we present a systematic review of the Green AI literature. From the analysis of 98 primary studies, different patterns emerge. The topic experienced a considerable growth from 2020 onward. Most studies consider monitoring AI model footprint, tuning hyperparameters to improve model sustainability, or benchmarking models. A mix of position papers, observational studies, and solution papers are present. Most papers focus on the training phase, are algorithm-agnostic or study neural networks, and use image data. Laboratory experiments are the most common research strategy. Reported Green AI energy savings go up to 115%, with savings over 50% being rather common. Industrial parties are involved in Green AI studies, albeit most target academic readers. Green AI tool provisioning is scarce. In conclusion, the Green AI research field appears to have reached a considerable level of maturity. Therefore, this review suggests that the time is right to adopt other Green AI research strategies, and port the numerous promising academic results to industrial practice. -
M. Funke, P. Lago, R. Verdecchia
Variability Features: Extending Sustainability Decision Maps via an Industrial Case Study
International Conference on Software Architecture (ICSA), 2023.
[Abstract] [BibTeX] [PDF]@article{funke2023variability, title={Variability Features: Extending Sustainability Decision Maps via an Industrial Case Study}, author={Funke, Markus and Lago, Patricia and Verdecchia, Roberto}, journal={International Conference on Software Architecture (ICSA)}, publisher={IEEE}, year={2023} }
Over the years, various thinking frameworks have been developed to address sustainability as a quality property of software-intensive systems. Nevertheless, selecting in practice the quality concerns that have a significant impact on sustainability remains a challenge.
In this experience report, we propose the notion of variability features, i.e., specific software features which are implemented in a number of possible alternative variants, each with a potentially different impact on sustainability. We extended sustainability decision maps to incorporate these variability features into an already existing thinking framework. Our findings were derived from a qualitative case study and evaluated in an industrial context. Data was collected by analysing a real-world application and conducting working sessions together with expert interviews.
The variability features allowed us to identify and evaluate alternative usage scenarios of one real-world software-intensive system, enabling data-driven sustainability choices and suggestions for professional practices. By providing concrete measurements, we can support software architects at design time, and decision makers towards achieving sustainability goals. -
R. Verdecchia, L. Scommegna, E. Vicario, T. Pecorella
Network Digital Twins: Towards a Future Proof Reference Architecture
International Workshop on Digital Twin Architecture (TwinArch), 2023.
[Abstract] [BibTeX] [PDF]@article{verdecchia2023towards, title={Network Digital Twins: Towards a Future Proof Reference Architecture}, author={Verdecchia, Roberto and Scommegna, Leonardo and Vicario, Enrico and Pecorella, Tommaso}, journal={International Workshop on Digital Twin Architecture (TwinArch)}, publisher={Springer}, year={2023} }
With the ever-growing popularization of complex, distributed, and heterogeneous networks, how to architect and monitor networking environments is becoming a crucial open problem. In this context, digital twins can be used to mimic the structure and behavior of physical networks. Although digital twin reference architectures exist for other domains, to date, no comprehensive reference architecture for digital twins in the networking context has been established. In this position paper, we discuss the current need for a network digital twin reference architecture, and describe the essential elements in the road ahead to design it. We open the paper with the results of a preliminary survey we conducted to investigate the need for the reference architecture, the key information it should convey, and more practical insights on how to design it. Among other results, the survey corroborated that current standards are not best fitted to model network digital twins, and that a new reference architecture is needed. Next, we document our position on the need for a reference architecture for network digital twins. Our discussion is organized around three main facets, namely digital twins (i) of what, (ii) for what, and (iii) how to deploy them. In conclusion, we outline our vision of the reference architecture, and the main research steps we plan to undertake to tackle the problem. As an end goal, we intend to reach out to both the networking and digital twin software architecture communities, towards the joint establishment of a future proof digital twin network architecture. -
R. Verdecchia, P. Lago
Tales of Hybrid Teaching in Software Engineering: Lessons Learned and Guidelines
IEEE Transactions on Education (ToE), 2022.
[Abstract] [BibTeX] [PDF]@article{verdecchia2022tales, title={Tales of Hybrid Teaching in Software Engineering: Lessons Learned and Guidelines}, author={Verdecchia, Roberto and Lago, Patricia}, journal={IEEE Transactions on Education}, year={2022}, publisher={IEEE} }
Contribution: This paper contributes empirical insights on hybrid teaching of software engineering courses. Results include the systematic analysis of hybrid teaching attendance and interaction, perception of hybrid teaching, and grade distributions. Results are synthesised into eight evidence-based guidelines.
Background: Hybrid teaching, i.e., teaching simultaneously to in-person and online students, is gaining increasing adoption. However, how to improve the experience of students with respect to hybrid teaching is still an open question.
Research questions: RQ: How can the experience of students with respect to hybrid teaching be improved? RQ1: Are there differences between in-person and online student attendance and interaction? RQ2: What is the student perception of hybrid teaching? RQ3: Is in-person and online supervision influencing grades of students?
Methodology: A mixed-method empirical research process is used, by considering two Master courses in software engineering. The process leverages three data sources, namely quantitative and qualitative data collected during lectures, a student survey, and student grades. Summary statistics, coding processes, and a statistical analysis are used to answer the research questions.
Findings: Students prefer to attend online more frequently, as it provides (among other factors) flexibility and convenience, while coming at the cost of lower focus and interaction quality. Following in-person is statistically a better choice to gain a median grade, while following online is more likely to lead to a higher or lower grade. Various guidelines are presented, ranging from hybrid classroom setup, to online student management, and course component design. -
N. Kozanidis, R. Verdecchia, E. Guzmán
Asking about Technical Debt: Characteristics and Automatic Identification of Technical Debt Questions on Stack Overflow
International Symposium on Empirical Software Engineering and Measurement (ESEM), 2022.
[Abstract] [BibTeX] [PDF]@article{kozanidis2022asking, title={Asking about Technical Debt: Characteristics and Automatic Identification of Technical Debt Questions on Stack Overflow}, author={Kozanidis, N. and Verdecchia, Roberto and Guzmán, E.}, journal={International Symposium on Empirical Software Engineering and Measurement (ESEM)}, year={2022}, publisher={IEEE} }
Background: Numerous methodologies have been used to study technical debt. Among different data sources, Q&A sites provide an opportunity to study how users reference and request support on technical debt. To date, only a few studies, focusing on narrow aspects, investigate technical debt through the lens of Stack Overflow.
Aims: We aim to gain an in-depth understanding of the characteristics of technical debt questions on Stack Overflow. In addition, we assess whether identification strategies based on machine learning can be used to automatically identify and classify technical debt questions.
Method: We use a combination of automated and manual processes to identify technical debt questions on Stack Overflow. The final set of 415 questions is analyzed both quantitatively and qualitatively to study (i) technical debt types, (ii) question length, (iii) perceived urgency, (iv) sentiment, and (v) emerging themes. Natural language processing and machine learning techniques are used to evaluate whether technical debt questions can be identified and classified automatically.
Results: Architecture debt is the most recurring debt type, followed by code and design debt. Most questions display mild urgency, with frequency steadily declining as urgency rises. Question length varies across debt types. Sentiment of questions is mostly neutral. 29 recurrent themes emerge in the questions. Machine learning models can be used to identify technical debt questions and binary urgency accurately, but not debt types.
Conclusions: Different patterns emerge from the analysis of technical debt questions on Stack Overflow. The results provide further insights on the phenomenon, and support the adoption of a more comprehensive strategy to identify technical debt questions. -
R. Verdecchia, P. Lago, C. de Vries
The future of sustainable digital infrastructures: A landscape of solutions, adoption factors, impediments, open problems, and scenarios
Sustainable Computing: Informatics and Systems, 2022.
[Abstract] [BibTeX] [PDF]@article{verdecchia2022future, title={The future of sustainable digital infrastructures: A landscape of solutions, adoption factors, impediments, open problems, and scenarios}, author={Verdecchia, Roberto and Lago, Patricia and de Vries, Carol}, journal={Sustainable Computing: Informatics and Systems}, pages={100767}, year={2022}, publisher={Elsevier} }
Background: Digital infrastructures, i.e., ICT systems, or system-of-systems, providing digital capabilities, such as storage and computational services, are experiencing an ever-growing demand for data consumption, which is only expected to increase in the future. This trend leads to a question we need to answer: How can we evolve digital infrastructures to keep up with the increasing data demand in a sustainable way?
Objective: The goal of this study is to understand what is the future of sustainable digital infrastructures, in terms of: which solutions are, or will be, available to sustainably evolve digital infrastructures, and which are the related adoption factors, impediments, and open problems.
Method: We carried out a 3-phase mixed-method qualitative empirical study, comprising semi-structured interviews, followed by focus groups, and a plenary session with parallel working groups. In total, we conducted 13 sessions involving 48 digital infrastructure practitioners and researchers.
Results: From our investigation emerges a landscape for sustainable digital infrastructures, composed of 30 solutions, 5 adoption factors, 4 impediments, and 13 open problems. We further synthesized our results in 4 incremental scenarios, which outline the future evolution of sustainable digital infrastructures.
Conclusions: From an initial shift from on-premise to the cloud, as time progresses, digital infrastructures are expected to become increasingly distributed, until it becomes possible to dynamically allocate resources by following time, space, and energy. Numerous solutions will support this change, but digital infrastructures are envisaged to be able to evolve sustainably only by (i) gaining a wider awareness of digital sustainability, (ii) holding every party accountable for their sustainability throughout value chains, and (iii) establishing cross-domain collaborations. -
R. Verdecchia, L. Cruz, J. Sallou, M. Lin, J. Wickenden, and E. Hotellier
Data-Centric Green AI: An Exploratory Empirical Study
International Conference on ICT for Sustainability (ICT4S), 2022.
[Abstract] [BibTeX] [PDF]@article{verdecchia2022data, title={Data-Centric Green AI: An Exploratory Empirical Study}, author={Verdecchia, Roberto and Cruz, Lu\'{i}s and Sallou, June and Lin, Michelle and Wickenden, James and Hotellier, Estelle}, journal={International Conference on ICT for Sustainability (ICT4S)}, year={2022}, publisher={IEEE} }
With the growing availability of large-scale datasets, and the popularization of affordable storage and computational capabilities, the energy consumed by AI is becoming a growing concern. To address this issue, in recent years, studies have focused on demonstrating how AI energy efficiency can be improved by tuning the model training strategy. Nevertheless, how modifications applied to datasets can impact the energy consumption of AI is still an open question. To fill this gap, in this exploratory study, we evaluate if data-centric approaches can be utilized to improve AI energy efficiency. To achieve our goal, we conduct an empirical experiment, executed by considering 6 different AI algorithms, a dataset comprising 5,574 data points, and two dataset modifications (number of data points and number of features). Our results show evidence that, by exclusively conducting modifications on datasets, energy consumption can be drastically reduced (up to 92.16%), often at the cost of a negligible or even absent accuracy decline. As additional introductory results, we demonstrate how, by exclusively changing the algorithm used, energy savings up to two orders of magnitude can be achieved. In conclusion, this exploratory investigation empirically demonstrates the importance of applying data-centric techniques to improve AI energy efficiency. Our results call for a research agenda that focuses on data-centric techniques, to further enable and democratize Green AI. -
R. Verdecchia, I. Malavolta, P. Lago, and I. Ozkaya
Empirical evaluation of an architectural technical debt index in the context of the Apache and ONAP ecosystems
PeerJ Computer Science, 2022.
[Abstract] [BibTeX] [PDF]@article{verdecchia2022empirical, title={Empirical evaluation of an architectural technical debt index in the context of the Apache and ONAP ecosystems}, author={Verdecchia, Roberto and Malavolta, Ivano and Lago, Patricia and Ozkaya, Ipek}, journal={PeerJ Computer Science}, year={2022}, publisher={O'Reilly and SAGE} }
Background: Architectural Technical Debt (ATD) in a software-intensive system denotes architectural design choices which, while being suitable or even optimal when adopted, lower the maintainability and evolvability of the system in the long term, hindering future development activities. Despite the growing research interest in ATD, how to gain an informative and encompassing viewpoint of the ATD present in a software-intensive system is still an open problem. Objective: In this study, we evaluate ATDx, a data-driven approach providing an overview of the ATD present in a software-intensive system. The approach, based on the analysis of a software portfolio, calculates severity levels of architectural rule violations via a clustering algorithm, and aggregates results into different ATD dimensions. Method: To evaluate ATDx, we implement an instance of the approach based on SonarQube, and run the analysis on the Apache and ONAP ecosystems. The analysis results are then shared with the portfolio contributors, who are invited to participate in an online survey designed to evaluate the representativeness and actionability of the approach. Results: The survey results confirm the representativeness of the ATDx, in terms of both the ATDx analysis results and the used architectural technical debt dimensions. Results also showed the actionability of the approach, although to a lower extent when compared to the ATDx representativeness, with usage scenarios including refactoring, code review, communication, and ATD evolution analysis. Conclusions: With ATDx, we strive for the establishment of a sound, comprehensive, and intuitive architectural view of the ATD identifiable via source code analysis. The collected results are promising, and display both the representativeness and actionability of the approach. As future work, we plan to consolidate the approach via further empirical experimentation, by considering other development contexts (e.g., proprietary portfolios and other source code analysis tools), and enhancing the ATDx report capabilities. -
A. Bertolino, E. Cruciani, B. Miranda, and R. Verdecchia
Testing non-testable programs using association rules
International Conference on Automation of Software Test (AST), 2022.
[Abstract] [BibTeX] [PDF]@article{bertolino2022testing, title={Testing non-testable programs using association rules}, author={Bertolino, Antonia and Cruciani, Emilio and Miranda, Breno and Verdecchia, Roberto}, journal={International Conference on Automation of Software Test (AST)}, year={2022}, publisher={ACM} }
We propose a novel scalable approach for testing non-testable programs denoted as ARMED testing. The approach leverages efficient Association Rules Mining algorithms to determine relevant implication relations among features and actions observed while the system is in operation. These relations are used as the specification of positive and negative tests, allowing for identifying plausible or suspicious behaviors: for those cases when oracles are inherently unknowable, such as in social testing, ARMED testing introduces the novel concept of testing for plausibility. To illustrate the approach we walk through an application example. -
L. Wattenbach, A. Basel, M. Maria Fiore, H. Ding, R. Verdecchia, and I. Malavolta
Do You Have the Energy for This Meeting? An Empirical Study on the Energy Consumption of the Google Meet and Zoom Android apps
International Conference on Mobile Software Engineering and Systems (MobileSoft), 2022.
[Abstract] [BibTeX] [PDF]@inproceedings{wattenbach2022do, title={Do You Have the Energy for This Meeting? An Empirical Study on the Energy Consumption of the Google Meet and Zoom Android apps}, author={Wattenbach, Leonhard and Basel, Aslan and Maria Fiore, Matteo and Ding, Henley and Verdecchia, Roberto and Malavolta, Ivano}, booktitle={Proceedings of International Conference on Mobile Software Engineering and Systems (MOBILESoft 2022)}, year={2022}, publisher={IEEE/ACM} }
Context. With “work from home” policies becoming the norm during the COVID-19 pandemic, videoconferencing apps have soared in popularity, especially on mobile devices. However, mobile devices only have limited energy capacities, and their batteries degrade slightly with each charge/discharge cycle. Goal. With this research we aim at comparing the energy consumption of two Android videoconferencing apps, and studying the impact that different features and settings of these apps have on energy consumption. Method. We conduct an empirical experiment by utilizing as subjects Google Meet and Zoom. We test the impact of multiple factors on the energy consumption: number of call participants, microphone and camera use, and virtual backgrounds. Results. Zoom proves to be more energy efficient than Google Meet, albeit only to a small extent. Camera use is the most energy greedy feature, while the use of virtual backgrounds only marginally impacts energy consumption. The number of participants affects the energy consumption of the two apps differently. As an exception, microphone use does not significantly affect energy consumption. Conclusions. Most features of Android videoconferencing apps significantly impact their energy consumption. As an implication for users, selecting which features to use can significantly prolong their mobile battery charge. For developers, our results provide empirical evidence on which features are more energy-greedy, and how features can impact energy consumption differently across apps. -
S. Vos, P. Lago, R. Verdecchia, and I. Heitlager
Architectural Tactics to Optimize Software for Energy Efficiency in the Public Cloud
International Conference on ICT for Sustainability (ICT4S), 2022.
[Abstract] [BibTeX] [PDF]@article{vos2022architectural, title={Architectural Tactics to Optimize Software for Energy Efficiency in the Public Cloud}, author={Vos, Sophie and Lago, Patricia and Verdecchia, Roberto and Heitlager, Ilja}, journal={International Conference on ICT for Sustainability (ICT4S)}, year={2022}, publisher={IEEE} }
A promise of cloud computing is the reduction of energy footprint enabled by economies of scale. Unfortunately, little research is available on how cloud consumers can reduce their energy footprint when running software in the public cloud. Moreover, cloud consumers do not have full access to information regarding their cloud infrastructure usage, which is required to understand the impact of design decisions on energy usage. The purpose of our study is to support cloud consumers in developing energy-efficient workloads in the public cloud. To achieve our goal, we collaborated with a large cloud solution provider to discover an initial set of reusable architectural tactics for software energy efficiency. Starting from interviews with 17 practitioners, we reviewed and selected available tactics to improve the energy efficiency of individual workloads in the public cloud, and synthesized the identified tactics in a reusable model. In addition, we conducted a case study to assess the impact of utilizing a tactic, which was selected following a prioritization provided by the practitioners. Our results demonstrate the possibility to architect cloud workloads for energy efficiency through reasoning and estimation of resource optimization. However, the process is not (yet) straightforward due to the current lack of transparency of cloud providers. -
R. Verdecchia, Philippe Kruchten, P. Lago, and I. Malavolta
Building and evaluating a theory of architectural technical debt in software-intensive systems
Journal of Systems and Software (JSS), 2021.
🏆Best Paper Award.
[Abstract] [BibTeX] [PDF] [Video]@article{verdecchia2021building, title={Building and evaluating a theory of architectural technical debt in software-intensive systems}, author={Verdecchia, Roberto and Kruchten, Philippe and Lago, Patricia and Malavolta, Ivano}, journal={Journal of Systems and Software}, pages={110925}, year={2021}, publisher={Elsevier} }
Architectural technical debt in software-intensive systems is a metaphor used to describe the “big” design decisions (e.g., choices regarding structure, frameworks, technologies, languages, etc.) that, while being suitable or even optimal when made, significantly hinder progress in the future. While other types of debt, such as code-level technical debt, can be readily detected by static analyzers, and often be refactored with minimal or only incremental efforts, architectural debt is hard to be identified, of wide-ranging remediation cost, daunting, and often avoided. In this study, we aim at developing a better understanding of how software development organizations conceptualize architectural debt, and how they deal with it. In order to do so, in this investigation we apply a mixed empirical method, constituted by a grounded theory study followed by focus groups. With the grounded theory method we construct a theory on architectural technical debt by eliciting qualitative data from software architects and senior technical staff from a wide range of heterogeneous software development organizations. We applied the focus group method to evaluate the emerging theory and refine it according to the new data collected. The result of the study, i.e., a theory emerging from the gathered data, constitutes an encompassing conceptual model of architectural technical debt, identifying and relating concepts such as its symptoms, causes, consequences, management strategies, and communication problems. From the conducted focus groups, we assessed that the theory adheres to the four evaluation criteria of classic grounded theory, i.e., the theory fits its underlying data, is able to work, has relevance, and is modifiable as new data appears. By grounding the findings in empirical evidence, the theory provides researchers and practitioners with novel knowledge on the crucial factors of architectural technical debt experienced in industrial contexts. -
R. Verdecchia, P. Lago, Christof Ebert, Carol de Vries
Green IT and Green Software
IEEE Software, 2021.
[BibTeX] [PDF]@article{verdecchia2021green, title={Green IT and Green Software}, author={Verdecchia, Roberto and Lago, Patricia and Ebert, Christof and De Vries, Carol}, journal={IEEE Software}, volume={38}, number={6}, pages={7--15}, year={2021}, publisher={IEEE} }
-
R. Verdecchia
Architectural Technical Debt: Identification and Management
Doctor of Philosophy Thesis (PhD).
Gran Sasso Science Institute and Vrije Universiteit Amsterdam, 2021.
Promotors: P. Lago and R. De Nicola
Co-Promotors: I. Malavolta and C. Trubiani
[Abstract] [BibTeX] [PDF]@phdthesis{verdecchia2021architectural, title={Architectural Technical Debt: Identification and Management}, author={Verdecchia, Roberto}, school={Gran Sasso Science Institute and Vrije Universiteit Amsterdam}, ISBN="978-94-6423-368-1", year={2021} }
Architectural technical debt (ATD) in a software-intensive system is the sum of all design choices that may have been suitable or even optimal at the time they were made, but which today are significantly impeding progress: structure, framework, technology, languages, etc. Unlike code-level technical debt, which can be readily detected by static analysers and can often be refactored with minimal or only incremental efforts, architectural debt is hard to detect, and its remediation is rather wide-ranging, daunting, and often avoided. The objective of this thesis is to develop a better understanding of architectural technical debt, and determine what strategies can be used to identify and manage it. In order to do so, we adopt a wide range of research techniques, including literature reviews, case studies, interviews with practitioners, and grounded theory. The result of our investigation, deeply grounded in empirical data, advances the field not only by providing novel insights into ATD related phenomena, but also by presenting approaches to pro-actively identify ATD instances, leading to its eventual management and resolution. -
S. Ospina, R. Verdecchia, I. Malavolta, and P. Lago
ATDx: A tool for Providing a Data-driven Overview of Architectural Technical Debt in Software-intensive Systems
European Conference on Software Architecture (ECSA), 2021.
[Abstract] [BibTeX] [PDF] [Demo]@article{ospina2021atdx, title={ATDx: A tool for Providing a Data-driven Overview of Architectural Technical Debt in Software-intensive Systems}, author={Ospina, Sebastian and Verdecchia, Roberto and Malavolta, Ivano and Lago, Patricia}, journal={European Conference on Software Architecture}, year={2021}, publisher={Springer} }
Architectural technical debt (ATD) in software-intensive systems is mostly invisible to software developers, can be widespread throughout entire code-bases, and its remediation cost is often steep. In recent years, numerous approaches have been proposed to identify, keep track, and ultimately manage ATD. The variety of approaches available opens a new problem, namely how to gain an encompassing overview of the ATD identified in a software-intensive system. With this paper we make available the ATDx tool, an implementation of ATDx written in Python, designed in a plug-in fashion. ATDx is an approach designed to provide a data-driven, intuitive, and actionable overview of the ATD present in a portfolio of software projects. ATDx is based on third-party source code analysis tools, architectural issue severity calculation via clustering, and aggregation of measurements into different architectural technical debt dimensions. The ATDx tool allows users to automatically run the ATDx analysis, generate reports containing the ATDx analysis results, and is integrated with GitHub. In addition to the tool, we provide two already implemented plugins, allowing users to run the ATDx tool out-of-the-box. GitHub repository: https://github.com/S2-group/ATDx Video: https://www.youtube.com/watch?v=ULT9fgxuB7E -
R. Verdecchia, Philippe Kruchten, P. Lago, and I. Malavolta
Summary: Building and evaluating a theory of architectural technical debt in software-intensive systems
European Conference on Software Architecture (ECSA), 2021.
[Abstract] [BibTeX] [PDF] [Video]@article{verdecchia2021building2, title={Building and evaluating a theory of architectural technical debt in software-intensive systems}, author={Verdecchia, Roberto and Kruchten, Philippe and Lago, Patricia and Malavolta, Ivano}, journal={Journal of Systems and Software}, pages={110925}, year={2021}, publisher={Elsevier} }
Architectural technical debt in software-intensive systems is a metaphor used to describe the “big” design decisions (e.g., choices regarding structure, frameworks, technologies, languages, etc.) that, while being suitable or even optimal when made, significantly hinder progress in the future. While other types of debt, such as code-level technical debt, can be readily detected by static analyzers, and often be refactored with minimal or only incremental efforts, architectural debt is hard to be identified, of wide-ranging remediation cost, daunting, and often avoided. In this study, we aim at developing a better understanding of how software development organizations conceptualize architectural debt, and how they deal with it. In order to do so, in this investigation we apply a mixed empirical method, constituted by a grounded theory study followed by focus groups. With the grounded theory method we construct a theory on architectural technical debt by eliciting qualitative data from software architects and senior technical staff from a wide range of heterogeneous software development organizations. We applied the focus group method to evaluate the emerging theory and refine it according to the new data collected. The result of the study, i.e., a theory emerging from the gathered data, constitutes an encompassing conceptual model of architectural technical debt, identifying and relating concepts such as its symptoms, causes, consequences, management strategies, and communication problems. From the conducted focus groups, we assessed that the theory adheres to the four evaluation criteria of classic grounded theory, i.e., the theory fits its underlying data, is able to work, has relevance, and is modifiable as new data appears. By grounding the findings in empirical evidence, the theory provides researchers and practitioners with novel knowledge on the crucial factors of architectural technical debt experienced in industrial contexts. -
J. Bogner, R. Verdecchia, and I. Gerostathopoulos
Characterizing Technical Debt and Antipatterns in AI-Based Systems: A Systematic Mapping Study
International Conference on Technical Debt (TechDebt), 2021.
🏆Best Presentation Award.
[Abstract] [BibTeX] [PDF]@article{bogner2021characterizing, title={Characterizing Technical Debt and Antipatterns in AI-Based Systems: A Systematic Mapping Study}, author={Bogner, Justus and Verdecchia, Roberto and Gerostathopoulos, Ilias}, journal={International Conference on Technical Debt}, year={2021}, publisher={IEEE} }
Background: With the rising popularity of Artificial Intelligence (AI), there is a growing need to build large and complex AI-based systems in a cost-effective and manageable way. Like with traditional software, Technical Debt (TD) will emerge naturally over time in these systems, therefore leading to challenges and risks if not managed appropriately. The influence of data science and the stochastic nature of AI-based systems may also lead to new types of TD or antipatterns, which are not yet fully understood by researchers and practitioners. Objective: The goal of our study is to provide a clear overview and characterization of the types of TD (both established and new ones) that appear in AI-based systems, as well as the antipatterns and related solutions that have been proposed. Method: Following the process of a systematic mapping study, 21 primary studies are identified and analyzed. Results: Our results show that (i) established TD types, variations of them, and four new TD types (data, model, configuration, and ethics debt) are present in AI-based systems, (ii) 72 antipatterns are discussed in the literature, the majority related to data and model deficiencies, and (iii) 46 solutions have been proposed, either to address specific TD types, antipatterns, or TD in general. Conclusions: Our results can support AI professionals with reasoning about and communicating aspects of TD present in their systems. Additionally, they can serve as a foundation for future research to further our understanding of TD in AI-based systems. -
R. Verdecchia, E. Cruciani, B. Miranda, and A. Bertolino
Know Your Neighbor: Fast Static Prediction of Test Flakiness
IEEE Access, 2021.
[Abstract] [BibTeX] [PDF]@article{verdecchia2021know, title={Know Your Neighbor: Fast Static Prediction of Test Flakiness}, author={Verdecchia, Roberto and Cruciani, Emilio and Miranda, Breno and Bertolino, Antonia}, journal={IEEE Access}, year={2021}, pages={76119-76134}, publisher={IEEE} }
Context: Flaky tests plague regression testing in Continuous Integration environments by slowing down change releases and wasting testing time and effort. Despite the growing interest in mitigating the burden of test flakiness, how to efficiently and effectively detect flaky tests is still an open problem. Objective: In this study, we present and evaluate FLAST, an approach designed to statically predict test flakiness. FLAST leverages vector-space modeling, similarity search, dimensionality reduction, and k-Nearest Neighbor classification in order to detect test flakiness in a timely and efficient manner. Method: In order to gain insights into the efficiency and effectiveness of FLAST, we conduct an empirical evaluation of the approach by considering 13 real-world projects, for a total of 1,383 flaky and 26,702 non-flaky tests. We carry out a quantitative comparison of FLAST with the state-of-the-art methods to detect test flakiness, by considering a balanced dataset comprising 1,402 real-world flaky and as many non-flaky tests. Results: From the results we observe that the effectiveness of FLAST is comparable with the state-of-the-art, while providing considerable gains in terms of efficiency. In addition, the results demonstrate how, by tuning the threshold of the approach, FLAST can be made more conservative, so as to reduce false positives, at the cost of missing more potentially flaky tests. Conclusion: The collected results demonstrate that FLAST provides a fast, low-cost and reliable approach that can be used to guide test rerunning, or to gate the inclusion of new potentially flaky tests. -
R. Verdecchia, P. Lago, C. de Vries
The LEAP Technology Landscape: Lower Energy Acceleration Program (LEAP) Solutions, Adoption Factors, Impediments, Open Problems, and Scenarios
VU Technical Reports, 2021.
[Abstract] [BibTeX] [PDF]@article{verdecchia2021leap, title={The LEAP Technology Landscape: Lower Energy Acceleration Program (LEAP) Solutions, Adoption Factors, Impediments, Open Problems, and Scenarios}, author={Verdecchia, Roberto and Lago, Patricia and de Vries, Carol}, journal={VU Technical Reports}, year={2021}, publisher={Vrije Universiteit Amsterdam} }
This technology landscape is intended for all stakeholders that aim at contributing to building a future-proof energy efficient digital infrastructure, from business organizations like data centers, software development companies, telecommunication service providers, to business customers, NGOs and end users; but also governmental organizations, decision makers and funding agencies. -
M. Autili, I. Malavolta, A. Perucci, G.L. Scoccia, and R. Verdecchia
Software Engineering Techniques for Statically Analyzing Mobile Apps: Research Trends, Characteristics, and Potential for Industrial Adoption
Journal of Internet Services and Applications (JISA), 2021.
[Abstract] [BibTeX] [PDF]@article{autili2021software, title={Software Engineering Techniques for Statically Analyzing Mobile Apps: Research Trends, Characteristics, and Potential for Industrial Adoption}, author={Autili, Marco and Malavolta, Ivano and Perucci, Alexander and Scoccia, Gianluca and Verdecchia, Roberto}, journal={Journal of Internet Services and Applications (JISA)}, year={2021}, publisher={Springer} }
Mobile platforms are rapidly and continuously changing, with support for new sensors, APIs, and programming abstractions. Static analysis is gaining a growing interest, allowing developers to predict properties about the run-time behavior of mobile apps without executing them. Over the years, literally hundreds of static analysis techniques have been proposed, ranging from structural and control-flow analysis to state-based analysis. In this paper, we present a systematic mapping study aimed at identifying, evaluating and classifying characteristics, trends and potential for industrial adoption of existing research in static analysis of mobile apps. Starting from over 12,000 potentially relevant studies, we applied a rigorous selection procedure resulting in 261 primary studies along a time span of 9 years. We analyzed each primary study according to a rigorously-defined classification framework. The results of this study give a solid foundation for assessing existing and future approaches for static analysis of mobile apps, especially in terms of their industrial adoptability. Researchers and practitioners can use the results of this study to (i) identify existing research/technical gaps to target, (ii) understand how approaches developed in academia can be successfully transferred to industry, and (iii) better position their (past and future) approaches for static analysis of mobile apps. -
Steffen Herbold, Alexander Trautsch, Benjamin Ledel, Alireza Aghamohammadi, Taher Ahmed Ghaleb, Kuljit Kaur Chahal, Tim Bossenmaier, Bhaveet Nagaria, Philip Makedonski, Matin Nili Ahmadabadi, Kristof Szabados, Helge Spieker, Matej Madeja, Nathaniel Hoy, Valentina Lenarduzzi, Shangwen Wang, Gema Rodríguez-Pérez, Ricardo Colomo-Palacios, Roberto Verdecchia, Paramvir Singh, Yihao Qin, Debasish Chakroborti, Willard Davis, Vijay Walunj, Hongjun Wu, Diego Marcilio, Omar Alam, Abdullah Aldaeej, Idan Amit, Burak Turhan, Simon Eismann, Anna-Katharina Wickert, Ivano Malavolta, Matus Sulir, Fatemeh Fard, Austin Z Henley, Stratos Kourtzanidis, Eray Tuzun, Christoph Treude, Simin Maleki Shamasbi, Ivan Pashchenko, Marvin Wyrich, James Davis, Alexander Serebrenik, Ella Albrecht, Ethem Utku Aktas, Daniel Strüber, and Johannes Erbel.
A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits
Empirical Software Engineering (EMSE), 2021.
[Abstract] [BibTeX] [PDF]@article{herbold2021bug, title={A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits}, author={Steffen Herbold and Alexander Trautsch and Benjamin Ledel and Alireza Aghamohammadi and Taher Ahmed Ghaleb and Kuljit Kaur Chahal and Tim Bossenmaier and Bhaveet Nagaria and Philip Makedonski and Matin Nili Ahmadabadi and Kristof Szabados and Helge Spieker and Matej Madeja and Nathaniel Hoy and Valentina Lenarduzzi and Shangwen Wang and Gema Rodriguez-Perez and Ricardo Colomo-Palacios and Roberto Verdecchia and Paramvir Singh and Yihao Qin and Debasish Chakroborti and Willard Davis and Vijay Walunj and Hongjun Wu and Diego Marcilio and Omar Alam and Abdullah Aldaeej and Idan Amit and Burak Turhan and Simon Eismann and Anna-Katharina Wickert and Ivano Malavolta and Matus Sulir and Fatemeh Fard and Austin Z Henley and Stratos Kourtzanidis and Eray Tuzun and Christoph Treude and Simin Maleki Shamasbi and Ivan Pashchenko and Marvin Wyrich and James Davis and Alexander Serebrenik and Ella Albrecht and Ethem Utku Aktas and Daniel Strüber and Johannes Erbel}, journal={Empirical Software Engineering}, year={2021}, publisher={Springer} }
Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Methods: We use a crowd sourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus. Results: We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case. Conclusion: Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptics and assume that unvalidated data is likely very noisy, until proven otherwise. -
R. Verdecchia, Philippe Kruchten, and P. Lago
Architectural Technical Debt: A Grounded Theory
European Conference on Software Architecture (ECSA), 2020.
[Abstract] [BibTeX] [PDF] [Video]@inproceedings{verdecchia2020ecsa, title = "Architectural Technical Debt: A Grounded Theory", abstract = "Architectural technical debt in a software-intensive system is driven by design decisions about its structure, frameworks, technologies, languages, etc. Unlike code-level technical debt, which can be readily detected by static analysers, and can often be refactored with minimal efforts, architectural debt is hard to detect, and its remediation is wide-ranging, daunting, and often avoided. The objective of this study is to develop a better understanding of how software development organisations conceptualize their architectural debt, and how they deal with it, if at all. We used a grounded theory method, eliciting qualitative data from software architects and senior technical staff from a wide range of software development organizations. The result of the study, i.e., the theory emerging from the collected data, constitutes an encompassing conceptual theory of architectural debt, identifying and relating concepts such as symptoms, causes, consequences, and management strategies. By grounding the findings in empirical data, the theory provides researchers and practitioners with evidence of which crucial factors of architectural technical debt are experienced in industrial contexts.", author = "Verdecchia, R. and Kruchten, P. and Lago, Patricia", year = "2020", booktitle = "European Conference on Software Architecture (ECSA)" }
Architectural technical debt in a software-intensive system is driven by design decisions about its structure, frameworks, technologies, languages, etc. Unlike code-level technical debt, which can be readily detected by static analysers, and can often be refactored with minimal efforts, architectural debt is hard to detect, and its remediation is wide-ranging, daunting, and often avoided. The objective of this study is to develop a better understanding of how software development organisations conceptualize their architectural debt, and how they deal with it, if at all. We used a grounded theory method, eliciting qualitative data from software architects and senior technical staff from a wide range of software development organizations. The result of the study, i.e., the theory emerging from the collected data, constitutes an encompassing conceptual theory of architectural debt, identifying and relating concepts such as symptoms, causes, consequences, and management strategies. By grounding the findings in empirical data, the theory provides researchers and practitioners with evidence of which crucial factors of architectural technical debt are experienced in industrial contexts. -
R. Verdecchia, P. Lago, I. Malavolta, and I. Ozkaya
ATDx: Building an Architectural Technical Debt Index
Evaluation of Novel Approaches to Software Engineering (ENASE), 2020.
[Abstract] [BibTeX] [PDF] [Slides]@inproceedings{verdecchia2020vu, title = {ATDx: Building an Architectural Technical Debt Index}, author = {Roberto Verdecchia and Patricia Lago and Ivano Malavolta and Ipek Ozkaya}, year = {2020}, booktitle = {Evaluation of Novel Approaches to Software Engineering (ENASE)} }
Architectural technical debt (ATD) in software-intensive systems refers to the architecture design decisions which work as expedient in the short term, but later negatively impact system evolvability and maintainability. Over the years numerous approaches have been proposed to detect particular types of ATD at a refined level of granularity via source code analysis. Nevertheless, how to gain an encompassing overview of the ATD present in a software-intensive system is still an open question. In this study, we present a multi-step approach designed to build an ATD index (ATDx), which provides insights into a set of ATD dimensions building upon existing architectural rules by leveraging statistical analysis. The ATDx approach can be adopted by researchers and practitioners alike in order to gain a better understanding of the nature of the ATD present in software-intensive systems, and provides a systematic framework to implement concrete instances of ATDx according to specific project and organizational needs. -
R. Verdecchia, P. Lago, I. Malavolta, and I. Ozkaya
ATDx: Prototype Implementation Technical Report
VU Technical Reports, 2020.
[Abstract] [BibTeX] [PDF]@inproceedings{verdecchia2020enase, title = {ATDx: Prototype Implementation Technical Report}, author = {Roberto Verdecchia and Patricia Lago and Ivano Malavolta and Ipek Ozkaya}, year = {2020}, booktitle = {VU Technical Reports} }
In this technical report we document a preliminary investigation carried out to evaluate the viability and the implementation feasibility of ATDx, an index designed to gain an overview of the architectural technical debt (ATD) present in a software-intensive system. We implement a prototype via the ATDx method by considering the source code static analysis tool SonarQube. This process is carried out by manually identifying 45 architectural rules, and subsequently applying the constructed prototype on a large-scale dataset composed of 6,706 open source Java-based projects. Among other results, this technical report provides insights into the benefits and drawbacks entailed by the concrete implementation of ATDx, the distribution of architectural debt across 6 distinct ATD dimensions (marked by the prominence of issues related to interfaces), and the correlation among the identified dimensions. -
P. Lago, R. Verdecchia, N. Condori-Fernandez, E. Rahmadian, J. Sturm, T. van Nijnanten, R. Bosma, C. Debuysscher, and Paulo Ricardo
Designing for Sustainability: Lessons Learned from Four Industrial Projects
International Conference on Environmental Informatics (EnviroInfo), 2020.
[Abstract] [BibTeX] [PDF]@inproceedings{lago2020enviroinfo, title = {Designing for Sustainability: Lessons Learned from Four Industrial Projects}, author = {Lago, Patricia and Verdecchia, Roberto and Condori-Fernandez, Nelly and Rahmadian, Eko and Sturm, Janina and van Nijnanten, Thijmen and Bosma, Rex and Debuysscher, Christophe and Ricardo, Paulo}, year = {2020}, booktitle = {International Conference on Environmental Informatics}, }
Scientific research addressing the relation between software and sustainability is slowly maturing in two focus areas, related to ‘sustainable software’ and ‘software for sustainability’. The first is better understood and may include research foci like energy efficient software and software maintainability. It most frequently covers ‘technical’ concerns. The second, ‘software for sustainability’, is much broader in both scope and potential impact, as it entails how software can contribute to sustainability goals in any sector or application domain. Next to the technical concerns, it may also cover economic, social, and environmental sustainability. Differently from researchers, practitioners are often not aware or well-trained in all four types of software sustainability concerns. To address this need, in previous work we have defined the Sustainability-Quality Assessment Framework (SAF) and assessed its viability via the analysis of a series of software projects. Nevertheless, it was never used by practitioners themselves, hence triggering the question: What can we learn from the use of SAF in practice? To answer this question, we report the results of practitioners applying the SAF to four industrial cases. The results show that the SAF helps practitioners in (1) creating a sustainability mindset in their practices, (2) uncovering the relevant sustainability-quality concerns for the software project at hand, and (3) reasoning about the inter-dependencies and trade-offs of such concerns as well as the related short- and long-term implications. Next to improvements for the SAF, the main lesson for us as researchers is the missing explicit link between the SAF and the (technical) architecture design. -
F. Corò, R. Verdecchia, E. Cruciani, B. Miranda, and A. Bertolino
JTeC: A Large Collection of Java Test Classes for Test Code Analysis and Processing
International Conference on Mining Software Repositories (MSR), 2020.
[Abstract] [BibTeX] [PDF] [Demo]@inproceedings{coro2020msr, title = {JTeC: A Large Collection of Java Test Classes for Test Code Analysis and Processing}, author = {Cor\`o, Federico and Verdecchia, Roberto and Cruciani, Emilio and Miranda, Breno and Bertolino, Antonia}, year = {2020}, booktitle = {International Conference on Mining Software Repositories}, }
The recent push towards test automation and test-driven development continues to scale up the dimensions of test code that needs to be maintained, analysed, and processed side-by-side with production code. As a consequence, on the one side regression testing techniques, e.g., for test suite prioritization or test case selection, capable of handling such large-scale test suites become indispensable; on the other side, as test code exposes its own characteristics, specific techniques for its analysis and refactoring are actively sought. We present JTeC, a large-scale dataset of test cases that researchers can use for benchmarking the above techniques or any other type of tool expressly targeting test code. JTeC collects more than 2.5M test classes belonging to 31K+ GitHub projects and summing up to more than 430 Million SLOCs of ready-to-use real-world test code. -
A. Bertolino, E. Cruciani, B. Miranda, and R. Verdecchia
Know Your Neighbor: Fast Static Prediction of Test Flakiness
ISTI Technical Reports, 2020.
[Abstract] [BibTeX] [PDF]@inproceedings{bertolino2020isti, title = {Know Your Neighbor: Fast Static Prediction of Test Flakiness}, author = {Bertolino, Antonia and Cruciani, Emilio and Miranda, Breno and Verdecchia, Roberto}, year = {2020}, booktitle = {ISTI Technical Reports}, doi = {10.32079/ISTI-TR-2020/001} }
Flaky tests plague regression testing in Continuous Integration environments by slowing down change releases, wasting development effort, and also eroding testers' trust in the test process. We present FLAST, the fast static approach to flakiness detection using test code similarity. Our extensive evaluation on 24 projects taken from repositories used in three previous studies showed that FLAST can identify flaky tests with up to 0.98 Median and 0.92 Mean precision. For six of those projects it could already yield ∼0.98 average precision values with a training set containing less than 100 tests. Besides, where known flaky tests are classified according to their causes, the same approach can also predict a flaky test category with similar precision values. The cost of the approach is negligible: the average train time over a dataset of ∼1,700 test methods is less than one second, while the average prediction time for a new test is less than one millisecond. -
E. Cruciani, B. Miranda, R. Verdecchia, A. Bertolino
Scalable Approaches for Test Suite Reduction
International Conference on Software Engineering (ICSE), 2019.
🏆ACM SIGSOFT Distinguished Paper Award.
[Abstract] [BibTeX] [PDF]@inproceedings{cruciani2019FAST-R, title = {Scalable Approaches for Test Suite Reduction}, author = {Cruciani, Emilio and Miranda, Breno and Verdecchia, Roberto and Bertolino, Antonia}, year = {2019}, booktitle = {Proceedings of the 40th International Conference on Software Engineering}, organization={IEEE Press} }
Test suite reduction approaches aim at decreasing software regression testing costs by selecting a representative subset from large-size test suites. Most existing techniques are too expensive for handling modern massive systems and moreover depend on artifacts, such as code coverage metrics or specification models, that are not commonly available at large scale. We present a family of novel very efficient approaches for similarity-based test suite reduction that apply algorithms borrowed from the big data domain together with smart heuristics for finding an evenly spread subset of test cases. The approaches are very general since they only use as input the test cases themselves (test source code or command line input). We evaluate four approaches in a version that selects a fixed budget B of test cases, and also in an adequate version that does the reduction guaranteeing some fixed coverage. The results show that the approaches yield a fault detection loss comparable to state-of-the-art techniques, while providing huge gains in terms of efficiency. When applied to a suite of more than 500K real world test cases, the most efficient of the four approaches could select B test cases (for varying B values) in less than 10 seconds. -
R. Verdecchia, I. Malavolta, and P. Lago
Guidelines for Architecting Android Apps: A mixed-method Empirical Study
International Conference on Software Architecture (ICSA), 2019.
[Abstract] [BibTeX] [PDF]@inproceedings{verdecchia2019icsa, title = {Guidelines for Architecting Android Apps: A mixed-method Empirical Study}, author = {Roberto Verdecchia and Ivano Malavolta and Patricia Lago}, year = {2019}, booktitle = {International Conference on Software Architecture (ICSA)} }
For surviving in the highly competitive market of Android apps, it is fundamental for app developers to deliver apps of high quality and with short release times. A well architected Android app is beneficial for developers, e.g. in terms of maintainability, testability, performance, and avoidance of resource leaks. However, how to properly architect Android apps is still debated and subject to conflicting opinions usually influenced by technological hypes rather than objective evidence. In this paper we present an empirical study on how developers architect Android apps, what architectural patterns and practices Android apps are based on, and their potential impact on quality. We apply a mixed-method empirical research design that combines (i) semi-structured interviews with Android practitioners in the field and (ii) a systematic analysis of both the grey (i.e., websites, on-line blogs) and white literature (i.e., academic studies) on the architecture of Android apps. Based on the analysis of the state of the art and practice about architecting Android apps, we systematically extract a set of 42 evidence-based guidelines supporting developers when architecting their Android apps. -
P. Lago, Jia F. Cai, Remco C. de Boer, Philippe Kruchten, and R. Verdecchia
DecidArch v2: An improved Game to teach Architecture Design Decision Making
International Workshop on decision Making in Software ARCHitecture (MARCH), 2019.
[Abstract] [BibTeX] [PDF]@inproceedings{deBoer2019MARCH, title = "DecidArch v2: An improved Game to teach Architecture Design Decision Making", abstract = "We report on the use of our DecidArch game to teach software architecture design decision making in two consecutive years. We compare the support of three learning goals for the first version of the game with the second, revised version. Results show how the game has clearly improved. For the remaining issues, we suggest final improvements.", author = "{de Boer}, R.C. and P. Lago and R. Verdecchia and Philippe Kruchten", year = "2019", booktitle = "3rd International Workshop on decision Making in Software ARCHitecture (MARCH)" }
We report on the use of our DecidArch game to teach software architecture design decision making in two consecutive years. We compare the support of three learning goals for the first version of the game with the second, revised version. Results show how the game has clearly improved. For the remaining issues, we suggest final improvements. -
P. Lago, Jia F. Cai, Remco C. de Boer, Philippe Kruchten, and R. Verdecchia
DecidArch: Playing Cards as Software Architects
Hawaii International Conference on System Sciences, 2019.
🏆Best Paper Award.
🏆ISSIP-IBM-CBA Student Paper Award for Best Industry Studies Paper.
[Abstract] [BibTeX] [PDF]@inproceedings{lago2019hicss, title = "DecidArch: Playing Cards as Software Architects", abstract = "Teaching software architecture is a challenge because of the difficulty to expose students to actual meaningful design situations. Games can provide a useful illustration of the design decision making process, and teach students the power of team interaction for making sound decisions. We introduce a game –DecidArch– developed to achieve three learning objectives: 1) create awareness about the rationale involved in design decision making, 2) enable appreciation of the reasoning behind candidate design decisions proposed by others, and 3) create awareness about interdependencies between design decisions. The game has been played by 22 groups with a total of 83 players, all of them students of the VU software architecture course. We present some of the lessons learned, both from our observation and through participant survey. We conclude that the game well supports our three learning objectives, and we identify several improvement points for future game editions.", author = "P. Lago and Cai, {Jia F.} and {de Boer}, R.C. and Philippe Kruchten and R. Verdecchia", year = "2019", booktitle = "52nd Hawaii International Conference on System Sciences (HICSS)" }
Teaching software architecture is a challenge because of the difficulty to expose students to actual meaningful design situations. Games can provide a useful illustration of the design decision making process, and teach students the power of team interaction for making sound decisions. We introduce a game –DecidArch– developed to achieve three learning objectives: 1) create awareness about the rationale involved in design decision making, 2) enable appreciation of the reasoning behind candidate design decisions proposed by others, and 3) create awareness about interdependencies between design decisions. The game has been played by 22 groups with a total of 83 players, all of them students of the VU software architecture course. We present some of the lessons learned, both from our observation and through participant survey. We conclude that the game well supports our three learning objectives, and we identify several improvement points for future game editions. -
R. Verdecchia, I. Malavolta, and P. Lago
Architectural Technical Debt Identification: The Research Landscape
International Conference on Technical Debt (TechDebt), 2018.
[Abstract] [BibTeX] [PDF] [Slides]@inproceedings{verdecchia2018techdebt, title = {Architectural Technical Debt Identification: The Research Landscape}, keywords = {Software Engineering, Software Architecture, Technical Debt, Systematic Mapping Study}, author = {Roberto Verdecchia and Ivano Malavolta and Patricia Lago}, year = {2018}, booktitle = {International Conference on Technical Debt (TechDebt)} }
Architectural Technical Debt (ATD) regards sub-optimal design decisions that bring short-term benefits at the cost of long-term gradual deterioration of the quality of the software architecture. The identification of ATD strongly influences the technical and economic sustainability of software systems and is attracting growing interest in the scientific community. Over the years several approaches for ATD identification have been conceived. Each of them, however, addresses ATD from different perspectives and with heterogeneous characteristics. In this paper we present a systematic mapping study on ATD identification. Our goal is to identify, classify, and evaluate the state of the art on ATD identification from the following three perspectives: publication trends, characteristics, and potential for industrial adoption. Specifically, starting from a set of 509 potentially relevant studies, we systematically selected 47 primary studies and analyzed them according to a rigorously predefined classification framework. The analysis of the obtained results will support both researchers and practitioners by providing (i) an assessment of current research trends and gaps in ATD identification, (ii) a solid foundation for understanding existing (and future) research, and (iii) a rigorous evaluation of its potential for industrial adoption. -
I. Malavolta, R. Verdecchia, M. Bruntink, B. Filipovic and P. Lago
On the Evolution of Maintainability Issues of Android Applications
IEEE International Conference on Software Maintenance and Evolution (ICSME), 2018.
[Abstract] [BibTeX] [PDF]@inproceedings{malavolta2018icsme, title = {On the Evolution of Maintainability Issues of Android Applications}, keywords = {Software Engineering, Software Maintenance, Android Application}, author = {Ivano Malavolta and Roberto Verdecchia and Magiel Bruntink and Bojan Filipovic and Patricia Lago}, year = {2018}, booktitle = {International Conference on Software Maintenance and Evolution (ICSME)} }
Context. Android is the largest mobile platform today, with thousands of apps published and updated in the Google Play store every day. Maintenance is an important factor in the lifecycle of Android apps, as it allows developers to constantly expand their apps and better tailor them to their user base.
Goal. In this paper we investigate the evolution of various maintainability issues along the lifetime of Android apps.
Method. We designed and conducted an empirical study on 434 GitHub repositories containing open, real (i.e., published in the Google Play store), and actively maintained Android apps. We statically analyzed 9,945 weekly snapshots of all apps for identifying their maintainability issues over time. We also identified maintainability hotspots along the lifetime of Android apps according to how their density of maintainability issues evolves over time. More than 2,000 GitHub commits belonging to identified hotspots have been manually categorized to understand the context in which maintainability hotspots occur.
Results. Our results shed light on (i) how often various types of maintainability issues occur over the lifetime of Android apps, (ii) the stationarity and trends of the maintainability issue density of Android apps over time, and (iii) an in-depth characterization of development activities related to maintainability hotspots.
Conclusions. Independently of the type of development activity, maintainability issues grow until they stabilize, but are never fully resolved.
-
R. Verdecchia, R. A. Saez, Giuseppe Procaccianti, and P. Lago
Empirical Evaluation of the Energy Impact of Refactoring Code Smells
International Conference on ICT for Sustainability (ICT4S), 2018.
🏆Runner-up Best Paper Award.
[Abstract] [BibTeX] [PDF]@inbook{verdecchia2018codeSmells, title = "Empirical Evaluation of the Energy Impact of Refactoring Code Smells", keywords = "Software Engineering, sustainability, energy efficiency", author = "Roberto Verdecchia and Saez, {Rene' Aparicio} and Giuseppe Procaccianti and Patricia Lago", year = "2018", month = "5", series = "ICT4S", booktitle = "International Conference on ICT for Sustainability" }
Software energy efficiency has gained increasing attention from the research community. Evidence on how to improve it, however, is still lacking. Specifically, the impact of code smell refactoring on energy efficiency has been scarcely investigated. In the exploratory study reported here, we investigate the impact on performance and energy consumption of refactoring well-known code smells in Java software applications. In order to understand if software metrics can be used as indicators of the energy impact of refactoring, we also measured the variation caused by refactoring on a set of well-established software metrics. We conducted a controlled experiment using state-of-the-art power measurement equipment. Statistical hypothesis testing and effect size estimation were performed on the experimental results, which show that in one out of three applications, refactoring each smell significantly impacted power and energy consumption. For example, refactoring Feature Envy and Long Method smells led to a 49% energy efficiency improvement. No software metric, however, significantly correlated with execution time, power or energy consumption. In conclusion, refactoring code smells proved to be a viable way to significantly improve software energy efficiency. The magnitude of the impact may depend on application properties, e.g. size or age. Further research is needed to understand the relationship between software metrics and energy efficiency. -
R. Verdecchia, A. Guldner, Y. Becker, and E. Kern
Code-level Energy Hotspot Localization via Naive Spectrum Based Testing
International Conference On Environmental Informatics (EnviroInfo), 2018.
[Abstract] [BibTeX] [PDF]@inproceedings{verdecchia2018enviroinfo, title = {Code-level Energy Hotspot Localization via Naive Spectrum Based Testing}, keywords = {Software Energy Efficiency; Software Testing; Empirical Software Engineering;}, author = {Roberto Verdecchia and Achim Guldner and Yannick Becker and Eva Kern}, year = {2018}, booktitle = {International Conference On Environmental Informatics} }
With the growing adoption of ICT solutions, the requirement for developing energy efficient software becomes increasingly important. Current methods aimed at analyzing energy demanding portions of code, referred to as "energy hotspots", often require ad-hoc analyses that constitute an additional process in the development life cycle. This leads to the scarce adoption of such methods in practice, leaving an open gap between source code energy optimization research and its concrete application. Thus, our underlying goal is to provide developers with a technique that enables them to efficiently gather source code energy consumption information without requiring excessive time overhead and resources. In this research we present a naive spectrum-based fault localization technique aimed at efficiently locating energy hotspots in source code. More specifically, our research aims to understand the viability of spectrum based energy hotspot localization and the tradeoffs which can be made between performance and precision for such techniques. Our naive yet effective approach takes as input an application and its test suite, and utilizes a simple algorithm to localize portions of code which are potentially energy-greedy. This is achieved by combining test case coverage information with runtime energy consumption measurements. The viability of the approach is assessed through an empirical experiment. We conclude that the naive spectrum based energy hotspot localization approach can effectively support developers by efficiently providing insights into the energy consumption of software at source code level. Since we use processes already in place in most companies and adopt straightforward data analysis processes, naive spectrum based energy hotspot localization can reduce the effort and time required for assessing energy consumption of software and thus make including the energy consumption in the development process viable.
As future work we plan to (i) further investigate the tradeoffs between performance and precision of spectrum based energy hotspot approaches, and (ii) compare our approach to similar ones through large-scale experiments. Our ultimate goal is to conceive ad-hoc tradeoff tuning of performance and precision according to development and organizational needs. -
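The idea of combining test coverage with per-test energy measurements can be illustrated with a minimal Python sketch. The scoring rule used here (the mean energy of the tests covering a code unit) is an assumption made for illustration only and is not necessarily the algorithm used in the paper. The function name `rank_energy_hotspots`, the unit names, and the energy values are likewise hypothetical.

```python
from collections import defaultdict


def rank_energy_hotspots(coverage, energy):
    """Rank code units by the mean energy of the tests that cover them.

    coverage: dict mapping test name -> set of covered code units
    energy:   dict mapping test name -> measured energy (e.g. joules)
    Returns a list of (unit, score) pairs, highest score first.
    NOTE: the mean-energy score is an illustrative assumption.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for test, units in coverage.items():
        for unit in units:
            totals[unit] += energy[test]
            counts[unit] += 1
    scores = {unit: totals[unit] / counts[unit] for unit in totals}
    # Sort by descending score; break ties by unit name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical coverage and energy data for three test cases.
coverage = {
    "test_login": {"auth.check", "db.query"},
    "test_report": {"db.query", "report.render"},
    "test_ping": {"net.ping"},
}
energy = {"test_login": 4.0, "test_report": 10.0, "test_ping": 1.0}

ranking = rank_energy_hotspots(coverage, energy)
for unit, score in ranking:
    print(f"{unit}: {score:.1f}")
```

Units that appear only in energy-hungry tests rise to the top of the ranking, while units shared with cheap tests are pulled down. That is the intuition behind reusing coverage spectra for energy analysis.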
R. Verdecchia
Identifying Architectural Technical Debt: Moving Forward
IEEE International Conference On Software Architecture (ICSA), 2018.
[Abstract] [BibTeX] [PDF] [Slides]@inproceedings{verdecchia2018icsa, title = {Identifying Architectural Technical Debt: Moving Forward}, keywords = {Software Architecture; Technical Debt; Software Maintenance;}, author = {Roberto Verdecchia}, year = {2018}, booktitle = {IEEE International Conference On Software Architecture} }
In software-intensive systems, technical debt is a metaphor encompassing design and implementation constructs that are used as expedients in the short term, but that hinder future maintainability and evolvability. Architectural technical debt, in turn, applies this concept to sub-optimal architectural design and implementation choices that bring short-term benefits at the cost of the long-term gradual deterioration of the quality of the software architecture. Architectural technical debt is an active field of research. Nevertheless, how to accurately identify and manage architectural technical debt is still an open question. Our research aims to fill this gap. In particular, our goal is to: (i) consolidate the existing knowledge of architectural technical debt identification and its management in practice, (ii) conceive novel identification and management approaches built upon the existing state-of-the-art techniques and industrial needs, and (iii) provide empirical evidence of architectural technical debt phenomena and assess the viability of the conceived approaches. As a result, we envision a sound methodology aimed at supporting software architects in the identification and management of architectural technical debt throughout the software development process. -
B. Miranda, E. Cruciani, R. Verdecchia, A. Bertolino
FAST Approaches to Scalable Similarity-based Test Case Prioritization
International Conference on Software Engineering (ICSE), 2018.
[Abstract] [BibTeX] [PDF]@inproceedings{miranda2018fast, title = {FAST Approaches to Scalable Similarity-based Test Case Prioritization}, author = {Miranda, Breno and Cruciani, Emilio and Verdecchia, Roberto and Bertolino, Antonia}, year = {2018}, booktitle = {Proceedings of the 40th International Conference on Software Engineering}, organization={IEEE Press} }
Many test case prioritization criteria have been proposed for speeding up fault detection. Among them, similarity-based approaches give priority to the test cases that are the most dissimilar from those already selected. However, the proposed criteria do not scale up to the test suites of modern industrial systems, which contain many thousands or even millions of test cases, and simple heuristics are used instead. We introduce the FAST family of test case prioritization techniques that radically changes this landscape by borrowing algorithms commonly exploited in the big data domain to find similar items. FAST techniques provide scalable similarity-based test case prioritization in both white-box and black-box fashion. The results from experimentation on real world C and Java subjects show that the fastest members of the family outperform other black-box approaches in efficiency with no significant impact on effectiveness, and also outperform white-box approaches, including greedy ones, if preparation time is not counted. A simulation study of scalability shows that one FAST technique can prioritize a million test cases in less than 20 minutes. -
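The general idea of similarity-based prioritization mentioned in the abstract (pick next the test most dissimilar from those already selected) can be sketched in Python as follows. This is the straightforward pairwise baseline, not one of the FAST techniques. Representing each test as a token set, using Jaccard distance, and the function names are all assumptions made for illustration.

```python
def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def prioritize(tests):
    """Order tests so that each next test is the farthest from those selected.

    tests: dict mapping test name -> set of tokens (hypothetical representation)
    The cost grows quadratically with the number of tests, which is exactly
    the scalability problem that motivates faster techniques.
    """
    remaining = dict(tests)
    # Start with the test that has the most tokens; break ties by name.
    first = min(remaining, key=lambda n: (-len(remaining[n]), n))
    order = [first]
    del remaining[first]
    while remaining:
        def min_dist(name):
            # Distance from a candidate to its closest already-selected test.
            return min(jaccard_distance(remaining[name], tests[s]) for s in order)

        nxt = min(remaining, key=lambda n: (-min_dist(n), n))
        order.append(nxt)
        del remaining[nxt]
    return order


# Hypothetical test cases described by the operations they exercise.
tests = {
    "t1": {"open", "read", "close"},
    "t2": {"open", "read", "close", "seek"},
    "t3": {"connect", "send"},
    "t4": {"open", "write", "close"},
}
order = prioritize(tests)
print(order)
```

Because every candidate is compared with every selected test, this baseline becomes impractical for very large suites. That limitation is the one the abstract points to.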
R. Verdecchia
Identifying Architectural Technical Debt in Android Applications through Compliance Checking
International Conference on Mobile Software Engineering and Systems (MobileSoft), 2018.
🏆Bronze medal - ACM Student Research Competition.
[Abstract] [BibTeX] [PDF] [Slides] [Poster]@inproceedings{verdecchia2018mobilesoft, title = {Identifying Architectural Technical Debt in Android Applications through Compliance Checking}, keywords = {Software Engineering, Software Architecture, Technical Debt, Android}, author = {Roberto Verdecchia}, year = {2018}, booktitle = {International Conference on Mobile Software Engineering and Systems} }
Considering the fast pace at which mobile applications need to evolve, Architectural Technical Debt proves to be a crucial yet implicit factor of success. In this research we present an approach to automatically identify Architectural Technical Debt in Android applications. The approach takes advantage of architectural guidelines extraction and modeling, architecture reverse engineering, and compliance checking. As future work, we plan to automate the process and empirically evaluate it via large-scale experiments. -
R. Verdecchia, G. Procaccianti, I. Malavolta, P. Lago, and J. Koedijk
Estimating Energy Impact of Software Releases and Deployment Strategies: The KPMG Case Study
Empirical Software Engineering and Measurement (ESEM), 2017.
[Abstract] [BibTeX] [PDF]@inproceedings{verdecchia2017estimating, title={Estimating Energy Impact of Software Releases and Deployment Strategies: The KPMG Case Study}, author={Verdecchia, Roberto and Procaccianti, Giuseppe and Malavolta, Ivano and Lago, Patricia and Koedijk, Joost}, booktitle={Empirical Software Engineering and Measurement (ESEM), 2017 ACM/IEEE International Symposium on}, pages={257--266}, year={2017}, organization={IEEE} }
Often motivated by optimization objectives, software products are characterized by different subsequent releases and deployed through different strategies. The impact of these two aspects of software on energy consumption is yet to be fully understood and can be improved by carrying out ad-hoc analyses for specific software products. In this research we report on an industrial collaboration aiming at assessing the different impact that releases and deployment strategies of a software product can have on the energy consumption of its underlying hardware infrastructure. We designed and performed an empirical experiment in a controlled environment. Deployment strategies, releases and use case scenarios of an industrial third-party software product were adopted as experimental factors. The use case scenarios were used as a blocking factor and adopted to dynamically load-test the software product. Power consumption and execution time were selected as response variables to measure the energy consumption. We observed that both deployment strategies and software releases significantly influence the energy consumption of the hardware infrastructure. A strong interaction between the two factors was identified. The impact of this interaction varied greatly depending on which use case scenario was considered, making the identification of the most frequently adopted use case scenario critical for energy optimisation. The collaboration between industry and academia has been productive for both parties, even if some practitioners showed low interest in, or awareness of, software energy efficiency. For the software product considered there is no absolutely preferable release or deployment strategy with respect to energy efficiency, as the interaction of these factors has to be considered. The number of machines involved in a software deployment strategy does not simply have an additive effect on the energy consumption of the underlying hardware infrastructure. -
R. Verdecchia, F. Ricchiuti, A. Hankel, P. Lago, and G. Procaccianti
Green ICT Research and Challenges
Advances and New Trends in Environmental Informatics, pp. 37-48, 2017.
[Abstract] [BibTeX] [PDF]@article{verdecchia2017green, title={Green {ICT} {R}esearch and {C}hallenges}, author={Verdecchia, Roberto and Ricchiuti, Fabio and Hankel, Albert and Lago, Patricia and Procaccianti, Giuseppe}, booktitle={Advances and New Trends in Environmental Informatics}, pages={37--48}, year={2017}, publisher={Springer} }
Green ICT is a young and pioneering field. Therefore, as often pointed out in the literature, studies evaluating the main research activities and the general direction of this new and continuously evolving research field are scarce and often incomplete. This study presents a quantitative analysis, through a systematic literature review, of the main activities, trends and issues that can be found in the Green ICT literature. The research reports the analysis of various characteristics of the studies gathered for this review, such as addressed type of effect and year of publication. It also led to the identification of the most recurrent issues of the research and development of Green ICT strategies. Finally, this study proposes a new category of effect (people awareness) that, even if often addressed by the field, is not included in current Green ICT frameworks.
R. Verdecchia, L. Scommegna, B. Picano, M. Becattini, E. Vicario
Network Digital Twins: A Systematic Review
IEEE Access, 2024.
[Abstract] [BibTeX] [PDF]
@article{verdecchia2024network, title={Network Digital Twins: A Systematic Review}, author={Verdecchia, Roberto and Scommegna, Leonardo and Picano, Benedetta and Becattini, Marco and Vicario, Enrico}, journal={IEEE Access}, publisher={IEEE}, year={2024} }
Network management is becoming more complex due to various factors. The growth of IoT increases the number of nodes to control. The combination of Edge and Fog Computing with distributed algorithms makes network synchronization challenging. Softwarized technologies simplify network management but create integration issues with legacy networks. Even in industrial settings where drones and mobile robots are used, proper network management is crucial yet challenging. In this context, digital twins can be used to replicate the structure and behavior of the physical network and, at the same time, to manage the complexity and heterogeneity of current networks. Despite the rapid growth of interest in the topic, a comprehensive overview of Network Digital Twin research is currently missing. To address this gap, in this paper, we present a systematic review of the Network Digital Twin literature. From the analysis of 138 primary studies, various insights emerge. The Network Digital Twin is a particularly recent concept that has been explored in the literature since 2017, and interest in it has grown steadily to this day. The vast majority of the studies propose solutions to optimize network performance, but there are also many oriented towards other goals such as security and functional suitability. The three most recurrent application domains, as self-reported in the primary studies, are smart industry, edge computing, and the vehicular domain. The main research topics are network optimization, support for offloading, resource allocation, and floor monitoring, but also support in the implementation of machine learning algorithms such as federated learning. In conclusion, the Network Digital Twin proves to be a promising emerging field for both academics and practitioners.
All of the documents linked above are made available as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that the works are offered here electronically. It is understood that all persons utilizing this information will adhere to the terms and constraints invoked by each copyright holder. These works may not be reposted without the explicit permission of the copyright holders.