Recent work on open source sustainability shows that successful trajectories of projects in the Apache Software Foundation Incubator (ASFI) can be predicted early on, using a set of socio-technical measures. Because OSS projects are socio-technical systems centered around code artifacts, we hypothesize that sustainable projects may exhibit different code and process patterns than unsustainable ones, and that those patterns can grow more apparent as projects evolve over time. Here we studied the code and coding processes of over 200 ASFI projects, and found that ASFI graduated projects have different patterns of code quality and complexity than retired ones. The same holds for the coding processes: e.g., feature commits or bug-fixing commits are correlated with project graduation success. We find that minor contributors and major contributors (who contribute <5% and >=95% of commits, respectively) are associated with graduation outcomes, implying that developers who contribute fewer commits are also important for a project’s success. This study provides evidence that OSS projects, especially nascent ones, can benefit from introspection and instrumentation using multidimensional modeling of the whole system, including code, processes, and code quality measures, and how they are interconnected over time.
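As a rough illustration of the contributor thresholds mentioned above, the following sketch labels each author by their share of a project's commits. Reading the thresholds as a per-developer share of all commits is an assumption, as are the function name `classify_contributors`, its parameters, and the "other" label; the study's actual operationalization may differ.

```python
from collections import Counter


def classify_contributors(commit_authors, minor_below=0.05, major_at_least=0.95):
    """Label each author by share of commits (hypothetical helper).

    commit_authors: one author name per commit.
    The thresholds mirror the <5% and >=95% figures in the abstract;
    treating them as per-author commit shares is an assumption.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    labels = {}
    for author, n in counts.items():
        share = n / total
        if share < minor_below:
            labels[author] = "minor"
        elif share >= major_at_least:
            labels[author] = "major"
        else:
            labels[author] = "other"
    return labels


if __name__ == "__main__":
    # Toy history: one dominant author and one occasional contributor.
    print(classify_contributors(["alice"] * 97 + ["bob"] * 3))
```

Such labels could then be counted per project and related to graduation outcomes, which is the kind of association the abstract reports.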
Feature models are arguably one of the most intuitive and successful notations for modeling the features of software product lines. Feature models help developers maintain an overall understanding, and they support scoping, planning, development, product derivation (configuration), and maintenance activities that sustain complex software product lines. Unfortunately, feature models are difficult to build and maintain. Features need to be identified, grouped, organized in a hierarchy, and mapped to software assets, and feature dependencies need to be declared. While feature models have been the subject of three decades of research, resulting in various feature-modeling dialects and automated analysis or configuration methods, a generic process for engineering feature models is still missing. So far, it is not even clear whether engineering feature models can follow recurrent principles across domains, which hampers their practical applicability. We address this gap and show that such principles in fact exist. We analyzed feature-modeling practices expressed in 31 publications over the last decades and in ten interviews conducted with industrial practitioners. We synthesized a set of 34 principles covering eight different phases of feature modeling, from planning through model construction to model maintenance and evolution. Grounded in empirical evidence, these principles provide practical, context-specific advice on how to perform feature modeling, describe what information sources to consider, and highlight common characteristics of feature models. We discuss the relevance of our principles for improving feature-modeling tools and synthesis and analysis techniques, as well as future research directions.
Cloning is a simple way to create new variants of a system. While cheap at first, it unfortunately creates a maintenance cost in the long term. Eventually, the cloned variants need to be integrated into a configurable platform. Such an integration is challenging: it involves merging the usual code improvements between the variants, and also integrating the variable code (features) into the platform. Thus, variant integration differs from traditional software merging, which does not produce or organize configurable code, but creates a single system that cannot be configured into variants. In practice, variant integration requires fine-grained code edits, performed in an exploratory manner, in multiple iterations. Unfortunately, little tool support exists for integrating cloned variants. In this work, we show that the burden of fine-grained code edits needed for integration can be alleviated by a small set of integration intentions: domain-specific actions, declared over code snippets, that control the integration. Developers can interactively explore the integration space by declaring (or revoking) intentions on code elements. We contribute the intentions (e.g., ‘keep functionality’ or ‘keep as a configurable feature’) and the IDE tool INCLINE, which implements the intentions and five editable views that visualize the integration process and allow developers to declare intentions, producing a configurable integrated platform. In a series of experiments, we evaluated the completeness of the proposed intentions, the correctness and performance of INCLINE, and the benefits of using intentions for variant integration. The experiments show that INCLINE can handle complex integration tasks, that its views help to navigate the code, and that it consistently reduces mistakes made by developers during variant integration.
Fork-based development has been widely used both in open source communities and in industry, because it gives developers flexibility to modify their own fork without affecting others. Unfortunately, this mechanism has downsides: When the number of forks becomes large, it is difficult for developers to get or maintain an overview of activities in the forks. Current tools provide little help. We introduce Infox, an approach to automatically identify non-merged features in forks and to generate an overview of active forks in a project. The approach clusters cohesive code fragments using code and network-analysis techniques and uses information-retrieval techniques to label clusters with keywords. The clustering is effective, with 90% accuracy on a set of known features. In addition, a human-subject evaluation shows that Infox can provide actionable insight for developers of forks.
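To make the keyword-labeling step more concrete, here is a minimal sketch that labels already-formed clusters of code fragments with their highest-scoring TF-IDF terms. TF-IDF is used here as one common information-retrieval weighting; the abstract does not say which technique Infox uses, so this is an assumption, and `tokenize`, `label_clusters`, and the toy fragments are hypothetical names and data, not part of Infox.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split code text into lowercase alphabetic tokens (bed_temp -> bed, temp)."""
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]


def label_clusters(clusters, top_k=3):
    """Return the top_k TF-IDF terms per cluster (illustrative only).

    clusters: dict mapping a cluster name to a list of code fragments.
    Each cluster is treated as one document for the IDF computation.
    """
    term_counts = {
        name: Counter(tok for frag in frags for tok in tokenize(frag))
        for name, frags in clusters.items()
    }
    n_clusters = len(term_counts)
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    labels = {}
    for name, counts in term_counts.items():
        scores = {
            term: tf * math.log(n_clusters / doc_freq[term])
            for term, tf in counts.items()
        }
        # Sort by descending score, then alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        labels[name] = ranked[:top_k]
    return labels


if __name__ == "__main__":
    toy = {
        "cluster_1": ["int bed_temp = read_bed_temp();", "set bed_temp"],
        "cluster_2": ["int fan_speed = read_fan_speed();"],
    }
    print(label_clusters(toy, top_k=2))
```

Terms that occur in every cluster (such as `int` in the toy data) receive a score of zero, so the labels favor vocabulary that distinguishes one cluster from the others.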
Variability-sensitive verification pursues effective analysis of the exponentially many variants of a program family. Several variability-aware techniques have been proposed, but researchers still lack examples of concrete bugs induced by variability, occurring in real large-scale systems. A collection of real-world bugs is needed to evaluate tool implementations of variability-sensitive analyses by testing them on real bugs. We present a qualitative study of 98 diverse variability bugs (i.e., bugs that occur in some variants and not in others) collected from bug-fixing commits in the Linux, Apache, BusyBox, and Marlin repositories. We analyze each of the bugs, and record the results in a database. For each bug, we create a self-contained simplified version and a simplified patch, in order to help researchers who are not experts on these subject studies to understand them, so that they can use these bugs for evaluation of their tools. In addition, we provide single-function versions of the bugs, which are useful for evaluating intra-procedural analyses. A web-based user interface for the database allows users to conveniently browse and visualize the collection of bugs. Our study provides insights into the nature and occurrence of variability bugs in four highly-configurable systems implemented in C/C++, and shows in what ways variability hinders comprehension and the uncovering of software bugs.
Highly configurable software often uses preprocessor annotations to handle variability. However, understanding, maintaining, and evolving code with such annotations is difficult, mainly because a developer has to work with all variants at once. Dedicated methods and tools that allow working on a subset of all variants could ease the engineering of highly configurable software. We investigate the potential of one kind of such tools: projection-based variation control systems. For such systems we aim to understand: (i) what end-user operations they need to support, and (ii) whether they can realize the actual evolution of real-world, highly configurable software. We conduct an experiment that investigates variability-related evolution patterns and that evaluates the feasibility of a projection-based variation control system by replaying parts of the history of a highly configurable real-world 3D printer firmware project. Among other findings, we show that the prototype variation control system does indeed support the evolution of a highly configurable system and that, in general, it does not degrade the code.
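The idea of working on a projection of annotated code can be sketched as follows: given source lines with preprocessor conditionals and a set of enabled features, keep only the lines visible under that selection. This is a simplified illustration under stated assumptions, not the prototype evaluated in the study. The function `project` is hypothetical, it handles only `#ifdef`, `#ifndef`, `#else`, and `#endif` with a single macro name, and it covers only the projection direction, not writing edits back into the full code base.

```python
def project(lines, enabled):
    """Return the lines visible when exactly the features in `enabled` are on.

    Hypothetical sketch: supports #ifdef NAME, #ifndef NAME, #else, #endif.
    Input is assumed to be well formed (balanced conditionals).
    """
    visible = []
    stack = []  # one bool per open conditional: is its current branch active?
    for line in lines:
        parts = line.strip().split()
        directive = parts[0] if parts else ""
        if directive in ("#ifdef", "#ifndef"):
            is_on = parts[1] in enabled
            stack.append(is_on if directive == "#ifdef" else not is_on)
        elif directive == "#else":
            stack[-1] = not stack[-1]  # switch to the other branch
        elif directive == "#endif":
            stack.pop()
        elif all(stack):
            # Shown only if every enclosing conditional branch is active.
            visible.append(line)
    return visible


if __name__ == "__main__":
    source = [
        "void loop() {",
        "#ifdef HEATED_BED",
        "  heat_bed();",
        "#else",
        "  skip_bed();",
        "#endif",
        "}",
    ]
    print("\n".join(project(source, {"HEATED_BED"})))
```

A developer would edit the projected view for a chosen subset of variants; mapping those edits back onto the annotated source is the harder part and is deliberately left out of this sketch.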
In large-scale software ecosystems, many developers contribute extensions to a common software platform. Due to the independent development efforts and the lack of a central steering mechanism, similar functionality may be developed multiple times by different developers. We tackle this problem by contributing a role-based collaboration model for software ecosystems to make such implicit similarities explicit and to raise awareness among developers during their ongoing efforts. We extract this model based on realization artifacts in a specific programming language located in a particular source code repository and present it in a technology-neutral way. We capture five essential collaborations as independent role models that may be composed to present developer collaborations of a software ecosystem in their entirety, which fosters an overview of the software ecosystem, analyses of duplicated development efforts, and information on ongoing development efforts. Finally, using the collaborations defined in the formalism, we model real artifacts from Marlin, a firmware for 3D printers, and we show that for the selected scenarios, the five collaborations were sufficient to raise awareness and make implicit information explicit.
Code cloning has been reported both on small (code fragments) and large (entire projects) scale. Cloning-in-the-large, or forking, is gaining ground as a reuse mechanism thanks to the availability of better tools for maintaining forked project variants, including distributed version control systems and interactive source management platforms such as GitHub. We study advantages and disadvantages of forking using the case of Marlin, an open source firmware for 3D printers. We find that many problems and advantages of cloning do translate to forking. Interestingly, the Marlin community uses both forking and integrated variability management (conditional compilation) to create variants and features. Thus, studying it increases our understanding of the choice between integrated and clone-based variant management. It also allows us to observe mechanisms governing source code maturation, in particular when, why, and how feature implementations are migrated from forks to the main integrated platform. We believe that this understanding will ultimately help the development of tools mixing clone-based and integrated variant management, combining the advantages of both.
Variability management aims at taming variability in large and complex software product lines. To efficiently manage variability, it has to be modeled using formal representations, such as feature or decision models. Such models are efficient in many domains, where variability is about switching on and off features, or using parameters to customize products of the product line. However, variability can be represented in the form of a topology in domains where variability is about connecting components in a certain order, in specific interconnected hierarchies, or in different quantities. In this experience report, we explore topological variability within a case study of large-scale fire alarm systems. We identify core characteristics of the variability, derive modeling requirements, model the variability using UML2 class diagrams, and discuss the applicability of further variability modeling languages. We show that, although challenging, class diagrams can suffice to represent topological variability in order to generate a configurator tool. However, modeling parallel and recursive structures, cycles, informal constraints, and orthogonal hierarchies were among the main challenges we experienced, and they require further research.
Cloning is widely used for creating new product variants. While it has low adoption costs, it often leads to maintenance problems. Long-term reliance on cloning is discouraged in favor of systematic reuse offered by product line engineering (PLE) with a central platform integrating all reusable assets. Unfortunately, adopting an integrated platform requires a risky and costly migration. However, industrial experience shows that some benefits of an integrated platform can be achieved by properly managing a set of cloned variants. In this paper, we propose an incremental and minimally invasive PLE adoption strategy called virtual platform. The virtual platform covers a spectrum of strategies between ad-hoc clone-and-own and PLE with a fully integrated platform, divided into six governance levels. Transitioning to a governance level requires some effort and provides some incremental benefits. We discuss tradeoffs among the levels and illustrate the strategy on an example implementation.