Metrics Model to Complement the Evaluation of DevOps in Software Companies

This article presents a model to complement the evaluation of DevOps in software companies. It was designed by harmonizing the elements of the DevOps process identified through a systematic mapping of the literature, which aimed to establish the state of the art of methodological solutions and tools to evaluate DevOps in the industry.


I. INTRODUCTION
Currently, software development companies face challenges in deploying solutions with high quality standards in short time intervals [1]. To achieve this, companies seek to improve their processes by implementing approaches and/or frameworks that allow them to enhance the quality of their products [1]. In this sense, proposals related to the software product implementation life cycle (Dev), which can be classified as traditional and agile, have been made. Some of the most used traditional solutions are CMMI [2], RUP [3], the Waterfall model [4], the Spiral model [5], and Rapid Application Development (RAD) [6]. Some common agile solutions are Scrum [7], Lean Software Development [8], Test-Driven Development (TDD) [9], Extreme Programming (XP) [10], [11], Crystal Clear [12], Adaptive Software Development [13], and the Dynamic Systems Development Method [14]. Moreover, hybrid solutions that take advantage of both approaches have been proposed, e.g., Scrum & XP [15], Scrumban [16], and Scrum & CMMI [17]. However, software companies have also paid special attention to the processes related to operations management in Information Technology (Ops), which are applied to establish strategies for defining and implementing a set of best practices that guarantee the stability and reliability of solutions in production environments. Software development life cycle management brings multiple benefits to companies, including continuously reducing development, integration, and deployment times; delegating repetitive tasks to automated processes; and reducing errors caused by human intervention [18], [19], among others. To achieve this, solutions related to operations management such as ITIL [20], COBIT [21], the ISO/IEC 20000 standard [22], and the ISO/IEC 27000 standard [23] have been proposed. Debois [24] introduced the term DevOps in 2009 with the aim of integrating the best practices proposed for development and operations (Dev and Ops).
Over the years, DevOps has proven to bring multiple benefits related to the improvement of activities across the project life cycle, especially in the productivity, quality, and competitiveness of software development companies [25], [26]. In general, DevOps focuses on defining practices that enhance tasks related to continuous integration [27], change management [28], automated testing [29], continuous deployment [30], and continuous maintenance [31], among others. According to the global survey report on the state of agility in 2021 [32], 75% of the participants mentioned that a transformation towards a culture supported by DevOps brings multiple benefits for companies in terms of reduced effort, cost, and time. However, adopting DevOps in software companies is not a simple task [33]; to minimize the risk of error in its adoption, companies must establish mechanisms that quantify how DevOps is applied in their projects and identify improvement opportunities to fine-tune their practices and improve their internal processes [34]. The efforts and proposals related to the evaluation of DevOps in software companies were identified with a systematic mapping of the literature carried out in [35]. Two kinds of mechanisms were identified: methodological solutions (models, metrics, certification standards) and tools developed by active players in the industry, both seeking to assess DevOps in multiple ways. However, the results show a high degree of heterogeneity in the proposed solutions, since there is no consensus on the definitions, relationships, and concepts related to DevOps [36]. In consequence, the solutions identified in the literature were proposed according to the set of values, principles, activities, roles, practices, and tasks considered relevant by each author. Although the analyzed solutions pursue the same objective, "assess the degree of DevOps capacity, maturity and/or competence", they have different perceptions and scopes and, in some cases, they are ambiguous.
Likewise, the solutions described in [35] establish "what" to do; however, they do not define "how" to implement the proposed practices, which can cause confusion when applying DevOps in software companies. Besides, although there are studies related to the evaluation of DevOps in companies of different sizes, most of them focus on large and medium-sized companies and leave aside small and micro software companies. According to the 2021 digital transformation report of the Economic Commission for Latin America and the Caribbean (CEPAL) [37], these companies correspond to approximately 99% of the legally constituted companies in Latin America and have gradually become active industry players looking to apply DevOps in their projects.
Hence, there are solutions and tools to evaluate DevOps; however, each author suggests their own terminology, evaluation criteria, concepts, practices, and process elements, which results in a high degree of heterogeneity that can generate confusion. This article presents a metrics model defined following the Goal, Question, Metric (GQM) approach [38], which aims to complement the evaluation of DevOps. The model organizes its elements around four dimensions: people, culture, technology, and processes, and aims to define what and how to evaluate DevOps compliance in the software industry. The paper is structured as follows: Section 2 analyzes the state of the art of solutions to evaluate DevOps in software companies; Section 3 presents a metrics model to evaluate DevOps according to the practices, dimensions, and values found, analyzed, and harmonized from the literature; Section 4 describes the protocol to form a focus group as an evaluation method.
Finally, Section 5 presents the conclusions and future work.

A. Background
The systematic mapping of the literature (SML) reported in [35] was analyzed to identify the solutions proposed by different authors in relation to the definition of processes, models, techniques, and/or tools to evaluate DevOps in software companies. Three types of studies were identified: (i) exploratory studies, (ii) methodological solutions, and (iii) tools. The results obtained are presented below.

1) Exploratory Studies.
In [39], an exploratory study was carried out to analyze different tools to evaluate DevOps in small and medium software companies. In [11], [36], [40], [41], SMLs were conducted to identify the process elements that must be considered to certify that a company applies DevOps appropriately. In [42]-[45], studies were conducted to examine the use of maturity models to evaluate DevOps.
3) Tools.
The studies in [39], [50], [61] mentioned the following tools: DevOps Maturity Assessment [62], Microsoft DevOps Self-Assessment [63], IBM DevOps Self-Assessment [64], and IVI's DevOps Assessment [65]. However, the tools presented in these studies were not assessed exhaustively. To expand the knowledge on tools to evaluate DevOps, an exploratory study was carried out based on the methodology proposed in [66]. As a result, 13 tools were identified; they are presented in Table 1.
The tools analysis considered accessibility (A1): free to access, trial period, or paid; evaluation method (A2): surveys, frameworks, consulting, or another mechanism; and the objective or scope of the evaluation (A3): whether the tool evaluates the process, practices, activities, tasks, or other aspects/elements. In relation to accessibility (A1), it was observed that 7 of the 13 tools (54%) share the same accessibility classification.

B. Protocol to Harmonize DevOps Process Elements
A harmonization process was necessary to identify the elements for defining a generic model to evaluate DevOps; nevertheless, each model and tool has its own structure, concepts, and characteristics. To establish a homogeneous solution, the HPROCESS method [76] was used to harmonize the models through the following activities: identification (carried out during the SML), homogenization, comparison, and integration.

1) Homogenization Method.
This method compares the general information of each solution and tool in a common structure that shows the characteristics of each study in relation to the rest [77]. The structure was defined from the process elements established in the PrMO ontology [78]. The characterization is available at https://bit.ly/3QDJOT9, and the detailed results can be consulted at https://bit.ly/3dItx0M. Finally, an activity was conducted to identify the relationships between practices, dimensions, and values; they can be consulted at https://bit.ly/3T05Q45.

III. RESULTS AND DISCUSSION
The goal of metrics in software engineering is to identify the essential parameters present in projects [82]. The harmonization process made it possible to obtain 12 practices, classified as fundamental and complementary.

B. Goals
Initially, a set of goals related to each practice defined at the harmonization stage was established. As a result, 42 goals and 63 questions were defined for the fundamental practices, and 19 goals and 29 questions for the complementary practices. Table 3 shows the goals related to the continuous deployment (DC) practice; the rest of the goals can be consulted at https://bit.ly/3SZQSuZ.

C. Questions
Each goal is associated with one or more questions that capture the aspects to be evaluated quantitatively. The questions use a nominal scale with two possible values (YES: 100%, NO: 0%) and were defined following the criteria proposed in [83], which seek to avoid ambiguous questions, vague terms, and cognitive overload, among others. Table 4 presents the questions associated with DC; the rest of the questions can be consulted at https://bit.ly/3ChM1zi. A questionnaire-type evaluation instrument was designed with two possible answers ("YES", "NO"); the template used to answer the questions can be found at https://bit.ly/3LDgfik. The answer to each question is given according to the following criteria: "YES" if there is (i) a collection of opinions from each role involved in the practices or (ii) consistent historical records that evidence compliance; "NO" if the company does not present evidence of compliance with the practice. Table 5 shows the scale to assess the degree of implementation of practices, dimensions, and values, defined following the formalism proposed in [80]. The metrics were then defined by assigning weights to each practice, dimension, and value with the linear weighting method [84] and applying the GQM approach [38].
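As an illustration, the nominal YES/NO scale can be turned into a compliance percentage per practice by averaging the question scores. The following sketch is hypothetical: the question identifiers and answers are invented for the example and are not taken from the model.

```python
# Hypothetical sketch: scoring a questionnaire on the nominal scale
# (YES -> 100%, NO -> 0%) and averaging per practice.
# Question identifiers below are illustrative, not taken from the model.

def question_score(answer: bool) -> float:
    """Nominal scale: YES counts as 100%, NO as 0%."""
    return 100.0 if answer else 0.0

def practice_compliance(answers: dict) -> float:
    """Percentage of compliance with a practice: mean of its question scores."""
    scores = [question_score(a) for a in answers.values()]
    return sum(scores) / len(scores)

# Example: invented answers for the continuous deployment (DC) practice
dc_answers = {"DC-Q1": True, "DC-Q2": True, "DC-Q3": False, "DC-Q4": True}
print(practice_compliance(dc_answers))  # 75.0
```

An equal-weight mean is only one possible aggregation; the model's own weighting of questions, if any, would come from the linear weighting method [84].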

1) Metrics to Evaluate Practices.
As a result of the weighting process [84], each practice has an associated weighted percentage, and the combined weighted percentage corresponds to the weight associated with all practices during the total evaluation. Table 6 shows the weights of each fundamental and complementary practice; the complementary practices jointly account for 30% of the evaluation (see Table 6). The metric to evaluate the degree of implementation of practices combines the percentage of compliance with fundamental practices and the percentage of compliance with complementary practices according to these weights.
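A minimal sketch of this linear weighting step follows. The 30% complementary weight appears in the text; the 70% fundamental weight is an assumed complement, and the compliance values are invented for illustration.

```python
# Hypothetical sketch of the linear weighting step: combine compliance
# with fundamental and complementary practices into one percentage.
# ASSUMPTION: 70/30 split (only the 30% complementary weight is stated
# in the text); the compliance values passed in are illustrative.

W_FUNDAMENTAL = 0.70    # assumed weight of fundamental practices
W_COMPLEMENTARY = 0.30  # weight of complementary practices (see Table 6)

def practices_score(fundamental_pct: float, complementary_pct: float) -> float:
    """Combined weighted percentage of compliance with all practices."""
    return W_FUNDAMENTAL * fundamental_pct + W_COMPLEMENTARY * complementary_pct

print(practices_score(80.0, 60.0))  # 0.7*80 + 0.3*60 = 74.0
```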

2) Metrics to Evaluate Dimensions.
A total of 4 dimensions were obtained: (i) tools, (ii) processes, (iii) culture, and (iv) people. Each dimension has an associated set of practices and a weighted percentage of 25%. Table 8 shows the metrics to evaluate the degree of implementation of dimensions; they combine the weights of the fundamental and complementary practices (see Table 6) with the percentage of individual compliance with each specific fundamental and complementary practice (see Table 7).

3) Metrics to Evaluate Values.
A total of 4 values were obtained: (i) automation, (ii) collaboration, (iii) measurement, and (iv) communication. Each value has a set of associated dimensions and a weighted percentage of 25%. Table 9 presents the metrics to evaluate the degree of implementation of the values; they combine the percentages of compliance with the associated dimensions (see Table 8). Table 10 presents the metric to determine the degree of implementation of DevOps in a software development company, computed from the percentage of individual compliance with each value (see Table 9).
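The same aggregation pattern extends upward: dimensions weigh 25% each within a value, and values weigh 25% each in the overall result. A hypothetical sketch follows; the dimension and value names come from the text, while all compliance figures are invented for illustration.

```python
# Hypothetical sketch: equal-weight aggregation of dimensions into values
# and of values into an overall DevOps implementation degree (25% each,
# as stated in the text). All compliance figures below are illustrative.

def aggregate(percentages: list) -> float:
    """Equal-weight linear combination of compliance percentages."""
    return sum(percentages) / len(percentages)

# Illustrative compliance per dimension (tools, processes, culture, people)
dimensions = {"tools": 80.0, "processes": 70.0, "culture": 60.0, "people": 90.0}

# Illustrative compliance per value (automation, collaboration,
# measurement, communication), each derived from its dimensions
values = {"automation": 75.0, "collaboration": 75.0,
          "measurement": 70.0, "communication": 80.0}

devops_degree = aggregate(list(values.values()))
print(devops_degree)  # (75 + 75 + 70 + 80) / 4 = 75.0
```

Because every level uses equal weights of 25%, the overall degree is a simple mean at each step; only the practice level uses unequal weights from Table 6.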

A. Focus Group Protocol
The procedure to form the focus group followed the guidelines defined in [85], which propose 5 phases: (i) planning, (ii) recruitment, (iii) moderation, (iv) analysis and report of results, and (v) limitations. To conduct the focus group, a questionnaire was designed to assess the suitability, completeness, ease of understanding, and applicability of the metrics model. The materials were the questionnaire, a work agenda, a protocol structure, and the proposal to be evaluated. Table 12 presents the details of the questions. The participants stated that the metrics can offer a result that allows companies to identify aspects to be improved. Finally, a favorable opinion was observed regarding the mathematical rigor and usefulness of the proposed metrics; however, some observations related to the applicability of the proposal in small and medium-sized companies were identified. They were considered and applied to refine a new version of the proposal. The detail of the improvement actions can be consulted at the following link: https://bit.ly/3pw659y.

B. Limitations
Each limitation found during the focus group and the solution applied to it are presented below. Although all the participants met the selection criteria, they did not have the same level of knowledge and experience with DevOps; therefore, the metrics model was sent to all participants three weeks in advance to guarantee that everyone was aware of the context of the proposal. According to [85], a focus group should have at least 6 participants; therefore, 15 people were invited to reduce the possibility of not reaching the minimum number of attendees. At the beginning of the session, participation was low; this was corrected by the rapporteur and the moderator, who encouraged attendees to participate by asking questions and inviting them to express their comments. Due to the number of participants, some of the comments made during the discussion were outside the scope of the proposed evaluation objectives; each one was clarified quickly to continue with the discussion. The focus group was carried out following biosafety protocols to avoid crowds: the session was held remotely, and permission was requested to record it in order to analyze observations and comments that might have been omitted during the session.

V. CONCLUSIONS
The metrics model was the result of several stages executed in a structured and systematic way. Likewise, the focus group allowed receiving feedback from software engineering experts with experience in the definition, adoption, and application of DevOps processes, and identifying aspects to be improved. Those aspects were analyzed to obtain a refined version of the proposal.
Finally, the future work currently being addressed includes the execution of multiple case studies to evaluate the metrics model in operational environments and the construction of a tool to automate the application of the metrics.