Abstract

Context. Coupling between classes is an important metric of complexity in software systems. Objective. To overcome the shortcomings of existing coupling measurement methods and to investigate the weighted coupling of classes in large-scale software systems, this study analyzed the relationships between classes at the package level, class level, and method level. Method. The software system is treated as a set of special bipartite graphs in complex networks, and a method for coupling measurement is proposed on that basis. The method is theoretically proved to satisfy the mathematical properties of coupling measurement, which overcomes a disadvantage of the majority of existing methods. An analysis of existing coupling measurement methods further indicated that the proposed method is efficient. Finally, an algorithm was designed and a program was developed to calculate the coupling between classes in three open-source software systems. Results. The statistical data exhibited the scale-free characteristic of complex networks. The calculated power-law value was then used as a metric in the coupling measurement to calculate the coupling of the three open-source software systems. Conclusions. The coupling degrees of the open-source software systems had a certain impact on the evaluation of software complexity. Moreover, the statistical characteristics of complex networks provided a reliable reference for further in-depth study of coupling. The empirical evidence showed that, within a certain range, reducing coupling helps to reduce the complexity of software, whereas an excessive, blind pursuit of low coupling increases the complexity of software systems.

1. Introduction

Coupling refers to the degree of interdependence between software modules: a measure of how closely connected two routines or modules are [1] and of the strength of the relationships between modules. Structured design, including cohesion and coupling, was published in an article and a book by Stevens et al. [2, 3], and cohesion and coupling subsequently became standard terms. Coupling is a double-edged sword in object-oriented programming. On the one hand, object-oriented software development (OOSD), which includes object-oriented requirement analysis and object-oriented design, is a practical method of developing a software system that focuses on the objects of a problem throughout development. Interactions between objects reflect their interdependence. If objects are isolated, the software system can only achieve simple functions. Objects are comparable to cells in the human body: cells that are completely isolated from the body play no significant role, and likewise the functions of a software system require coupling between objects. On the other hand, tight coupling between objects leads to a ripple effect, in which changes in one object force further changes in other objects. In the worst case, an “avalanche” effect may spread through the whole system, leading to a sharp decline in its testability, understandability, reliability, and maintainability. Therefore, classes are expected to be loosely coupled in software design. A system can be tightly coupled in one aspect while being loosely coupled in another. However, software developers generally prefer systems that are as loosely coupled as possible, so that the design, testing, and maintenance of the system are relatively independent and more reasonable. Moreover, errors are less likely to propagate between modules when there are few connections between them [4]. Coupling has been widely used in the evaluation of the degree of failure in classification [5–7], effective analysis [8, 9], and design patterns [10] of software systems.

The present article has the following organization: Section 2 summarizes the materials and methods. Section 3 shows the results. Section 4 provides a conclusion and suggests perspectives.

2. Materials and Methods

2.1. Methods

Currently, the methods for coupling measurement of object-oriented software systems are mainly structure-oriented measurement methods (Tables 1 and 2) [8, 11–15].

Comparative analysis of typical methods is shown in Tables 1–3, indicating that
(1) Methods for coupling measurement are mainly based on method invocation between classes.
(2) Calculation of coupling strength is defined as the degree of method invocation, which is weighted coupling.
(3) A small number of methods use fan-in and fan-out as metrics.
(4) Inheritance is dominant.
(5) Few methods use static method invocation, system measurement, and package-level metrics.

In addition to the abovementioned structural information methods, some scholars have recently used dynamic information methods [17, 18], semantic information methods [19–21], and logical information methods [22, 23]. Based on the results of previous studies, the methods of coupling measurement can be characterized as follows:
(1) The DCMs are more accurate than structural information methods, but their coupling metrics are difficult to measure. Structural information methods, in turn, are more intuitive and easier to understand than semantic information and logical information methods.
(2) At present, the majority of traditional structural information methods analyze coupling based on the degree of the connecting edges between classes; they focus on the complexity between classes and emphasize measurement from a local, fine-grained aspect. Moreover, these degrees only consider one or a limited number of aspects of software engineering. These methods therefore have limitations and cannot properly satisfy the requirements of effective coupling measurement in complex software systems.
(3) Although a number of coupling measurement methods analyze network relationships from an overall, macro perspective based on graph theory, the majority of them use classes, packages, or methods as nodes to construct undirected, directed, unweighted, or weighted network models and ignore the complex relationships that object-oriented characteristics create between different classes. In addition, some methods have not been theoretically verified against the mathematical properties of measurement metrics.

Establishing an effective method for coupling measurement between classes in a software system depends on two aspects: reasonable measurement metrics and theoretical support for those metrics [24, 25]. Briand et al. mathematically analyzed the measurement metrics of software systems and provided robust theoretical support for them [4, 26, 27]. Many complex networks have been shown to share features such as “scale-free” and “small world” [28, 29]. Pan et al. recently revealed many physics-like laws in software systems from a complex network perspective [30, 31]. Studies on complex networks and software engineering revealed that the class-level, method-level, and package-level diagrams of a software system can show the “scale-free” and “small world” characteristics, which provides a novel perspective for finding more reasonable measurement metrics [32–34]. Complex network theories have been applied to measure software [35, 36], identify key software elements [37], and cluster Web services [38, 39]. Researchers have also found that many real networks have the bipartite graph characteristics of complex networks [40–45]. Combining the package level, class level, and method level, this study analyzed the complex relationships between classes in the same layer and in all layers of a package and proposed a coupling measurement method for object-oriented systems based on the bipartite graphs of complex networks, named CSBG, in which an object-oriented software system is expressed as a set of special bipartite graphs.

2.2. Problem Description

In this study, a statistical method for complex networks was used to analyze the degree of fan-out and the heterogeneity of classes at the same layer and all layers of a package, in addition to the calculation of coupling of software systems.

2.2.1. Relationship between Classes

Definition 1. ASS relationship (association and aggregation).
Association describes which classes interact with each other and how; it can be represented by a line between the classes with an arrow indicating the navigation direction. Aggregation means that one class exists in another class in the form of an attribute.

Definition 2. DEP relationship (dependency).
DEP_D: dynamic dependency refers to an instance method in a class that invokes methods and attributes in another class.
DEP_S: static dependency refers to a static method in a class that invokes methods and attributes in another class.

Definition 3. GEN relationship (generalization).
One class inherits from another class, implements an interface, or extends an abstract class.
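
The relationship kinds in Definitions 1 to 3 can be illustrated with a small data model. The following is a minimal sketch rather than the implementation used in this study: the names `Relation`, `Edge`, and `fan_out_by_relation`, as well as the classes A, B, and C and their edge weights, are assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    ASS = "association or aggregation"
    DEP_D = "dynamic dependency (instance method)"
    DEP_S = "static dependency (static method)"
    GEN = "generalization (inheritance or interface implementation)"


@dataclass(frozen=True)
class Edge:
    source: str          # class on the fan-out side of the relationship
    target: str          # class that is referenced, invoked, or inherited
    relation: Relation
    weight: int = 1      # number of occurrences of this relationship


def fan_out_by_relation(edges, cls):
    """Sum the weights of the edges leaving `cls`, grouped by relation kind."""
    totals = {relation: 0 for relation in Relation}
    for edge in edges:
        if edge.source == cls:
            totals[edge.relation] += edge.weight
    return totals


# Hypothetical edges between hypothetical classes A, B, and C.
edges = [
    Edge("A", "B", Relation.ASS, 2),
    Edge("A", "B", Relation.DEP_D, 3),
    Edge("A", "C", Relation.GEN),
    Edge("B", "C", Relation.DEP_S),
]
totals = fan_out_by_relation(edges, "A")
```

Keeping the relation kind on each edge allows the four fan-out totals of a class to be read off separately before any weighting is applied.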

2.2.2. Definition of Package Hierarchy

A package of an object-oriented software system contains the classes at its own level of the hierarchy together with subpackages, and each subpackage in turn contains its own classes and subpackages. A software system can therefore be regarded as a tree-shaped hierarchy of packages.

t layer of a package is defined as . represents a set of classes in the t layer, while this layer does not contain subpackages of this layer. represents class relationship in the t layer, that is, .

is defined as a set of weighted fan-out of at the t layer of the package, that is, . is the set of weighted cross-package fan-out of , that is, .

2.3. CSBG for Coupling Measurement

Software stability and modularity can be measured based on complex network theories. In this study, a software system is expressed as a set of bipartite graphs in which the classes are the nodes and the ASS and DEP relationships are edges formed between the attributes of one class and those of another class. GEN, in contrast, is a direct connection between classes. Object-oriented software systems are therefore treated as a set of special bipartite graphs constituted by the classes in a package, as shown in Figure 1. Coupling metrics that contain both fan-in and fan-out are defective, because the total fan-in and the total fan-out of a software system are equal [46]. This study therefore analyzed only the fan-out metric. The coupling strength between classes is correlated with the complexity of the information exchange between modules: the more complex the information interaction (such as CBO), the tighter the coupling [47]. The coupling measurement metrics are the weighted fan-out of classes in the special bipartite graphs. In these graphs, a class may be associated with a class at the same layer of the same package or at a different layer of the package. This study therefore analyzed the fan-out of classes both within the same layer and across all layers of the package, together with the heterogeneity of the resulting weighted out-degree.
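
The observation that total fan-in equals total fan-out can be checked on any directed weighted graph, since every edge contributes its weight once to each total. The sketch below uses a hypothetical edge list; the function name `weighted_degrees` and the classes A to D are illustrative assumptions, not part of the study.

```python
from collections import defaultdict


def weighted_degrees(edges):
    """Return (fan_out, fan_in) dictionaries for (source, target, weight) edges."""
    fan_out = defaultdict(int)
    fan_in = defaultdict(int)
    for source, target, weight in edges:
        fan_out[source] += weight   # weight leaves the source class
        fan_in[target] += weight    # the same weight arrives at the target class
    return dict(fan_out), dict(fan_in)


# Hypothetical weighted edges between hypothetical classes.
edges = [("A", "B", 2), ("A", "C", 1), ("B", "C", 3), ("D", "A", 1)]
fan_out, fan_in = weighted_degrees(edges)
total_out = sum(fan_out.values())
total_in = sum(fan_in.values())
```

Because the two totals always coincide at system level, only one of them carries information there, which is consistent with restricting the analysis to fan-out.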

2.3.1. Demonstration of CSBG for Coupling Measurement

The proposed scheme is explained below and illustrated in Figure 2:
(1) The object-oriented software system is modeled as a directed weighted network graph, in which classes are the nodes and the relationships between classes are the edges.
(2) The package level, class level, and method level are combined to construct special weighted bipartite graphs between classes, in preparation for calculating the weighted out-degree of classes at the same layer of the package and the weighted out-degree of classes across layers of the package (see step 3).
(3) The four metrics (ASS, DEP_D, DEP_S, and GEN) are calculated at the same layer of the package, and the same-layer weighted out-degree is determined by applying the corresponding weights to them. The same four metrics are then calculated between classes across different layers of the package, and the corresponding weights are applied to determine the cross-layer weighted out-degree.
(4) The two weighted out-degrees are combined to obtain the weighted out-degree of the software system. The system coupling is calculated by dividing this value by the number of classes.
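
The overall flow of these steps can be sketched in code. This is an illustrative sketch under stated assumptions, not the formula of this study: the weights are taken as inputs, combining the same-layer and all-layer terms by a plain sum is an assumption, and the function names and all numbers are hypothetical.

```python
RELATIONS = ("ASS", "DEP_D", "DEP_S", "GEN")


def weighted_out_degree(totals, weights):
    """Weighted sum of the four per-relation out-degree totals."""
    return sum(weights[name] * totals[name] for name in RELATIONS)


def system_coupling(same_layer, all_layers, w_same, w_all, n_classes):
    """Combine same-layer and all-layer weighted out-degrees, then
    divide by the number of classes (assumed combination: plain sum)."""
    if n_classes <= 0:
        raise ValueError("n_classes must be positive")
    total = (weighted_out_degree(same_layer, w_same)
             + weighted_out_degree(all_layers, w_all))
    return total / n_classes


# Hypothetical out-degree totals and unit weights, for illustration only.
same_layer = {"ASS": 4, "DEP_D": 6, "DEP_S": 1, "GEN": 2}
all_layers = {"ASS": 2, "DEP_D": 3, "DEP_S": 0, "GEN": 1}
unit = {name: 1.0 for name in RELATIONS}
coupling = system_coupling(same_layer, all_layers, unit, unit, n_classes=6)
```

Passing the weights in as parameters keeps the sketch independent of how they are derived from the power-law values.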

2.3.2. Calculation of the Weighted Fan-Out of Classes in a Software System

A special bipartite graph is constructed between classes of a software system. Weighted fan-out of classes in the special bipartite graph is analyzed based on characteristics of the object-oriented software.

(1) Construction of the Special Bipartite Graph in Software Systems. In the graph , if we divide the set V of nodes into two complementary subsets S and T, , the graph is the bipartite graph. In the graph constructed by classes and for the software, if only coupling relationship between classes is considered, coupling of methods and attributes in the class wouldn’t be taken into account; then, the property of bipartite graph is satisfied. A network diagram constructed by classes and satisfies the following formula: . Moreover, the two points of a connecting edge between classes and are in classes and , respectively.

In summary, the complex coupling between classes and constructs a bipartite graph . However, not only possesses the general properties of a bipartite graph, including method invocation and dependencies, but also possesses its own characteristics. In aggregation, reference, inheritance, and interface implementation between classes and , the two points of the connecting edge are in classes and , respectively. This bipartite graph is defined as a special bipartite graph. However, the software system can be considered as a set of special bipartite graphs as well (Figure 3).

In the present study, the coupling of the complete bipartite graph constructed by classes and was used to analyze the coupling of the software system .

(2) Modeling the Coupling of Special Bipartite and Calculating the Number of Weighted Fan-Out in Software Systems. In this study, a software system was defined. Classes and were defined as two different classes in a software system: . Among them, was the set of instantiated objects in class and p is the number of instantiated objects. is the attribute set of class , and q is the number of attribute. is the method set of class , and r is the number of methods. The methods included class methods and instance methods, that is, .

The relationship of the special bipartite graph between classes and can be summarized as follows:ASSIn class , there was instantiation of class (association), or one class existed in another class in the form of attribute (aggregation), which was defined as , where .In the class , instantiated object of class was implemented , or was a part of class ; then, there was an ASS edge between classes and , that is, . The set of ASS weighted fan-out of class wasDEPRelationship between classes is implemented defined by instance methods and variables.In class , there are instance methods of class : if , then the relationship between classes and is defined as . In class , there are instance variables of class : if , then relationship between classes and is defined as . Under the condition of instance methods and instance variables, the set of weighted fan-out for wasImplementing connecting edges between two classes through class methods and class variables.If there are class methods of (static methods) in class : , then the relationship between classes and is defined as . If there are class variables (static variables) of in class :, then the relationship between classes and is defined as . Thereafter, under the conditions of class methods and class variables, the set of weighted out-degree for class wasGENAs the inheritance is preferred in software engineering, if one class is a subclass of another class, the derived connecting edge was taken into account only once in this study. Because transfer of derived connecting edges would make the software system network more complex, this study did not consider transfer of derived connecting edge but only considered the conditions that class was a direct subclass of class (through extension), or class was implemented through interface class (through implementation), or class was implemented by abstract class (through extension). 
Thus, there was a GEN connecting edge between classes and , which was defined as , and a set of GEN weighted out-degree for class was

(3) Determination of Weights. In a software system, one class can form one or more special bipartite graphs with other classes. Suppose that the number of classes and the total weighted fan-out of all classes in a software system are fixed. Three cases can then be distinguished. In the first case, the weighted fan-out of every class is the same or roughly the same. In the second case, the weighted fan-out of classes follows no regular distribution. In the third case, the weighted fan-out of classes is heterogeneous and approximately obeys a power-law distribution. The out-degree in the second case is more heterogeneous than in the first case but falls far short of the third case. In the third case, the fan-out of the majority of classes is limited and only a few classes have a large fan-out; maintenance staff can therefore concentrate their effort on these few classes, and the maintenance workload is lower than in the first case.

In this study, heterogeneity under the situation of fan-out was analyzed. If the distribution was the abovementioned third case, then the larger the power-law value, the easier the maintenance, and the smaller the coupling degree. However, if the distribution was one of the other two cases, then it was considered in this study that the power-law value was equal to 1. are the power-law values for the distribution of , , , and , respectively. are the power-law values for the distribution of , , , and , respectively. The calculating formula for weights was as follows:

In the present study, statistical analyses were performed for the out-degree of the three open-source software systems, and the distributions were the first and the third cases as mentioned above, demonstrating that the proposed method had a certain practical value.

2.4. Theoretical Verification of Coupling Metrics

Whether the proposed CSBG method met the mathematical properties of the measurement metrics [4] was theoretically verified.

CSBG Property 1. CSBG satisfies nonnegativity.

Proof. In an object-oriented software system , there are two classes . When , , and are all 0, the minimum value of the software system is 0. However, there is a maximum value , so that the value is in the range of . Thus, nonnegativity of CSBG is satisfied, and the proposition is proved.

CSBG Property 2. CSBG satisfies zero value.

Proof. As described in CSBG property 1, if the minimum value is 0, then CSBG satisfies zero-value, and the proposition is proved as well.

CSBG Property 3. CSBG satisfies monotonicity.

Proof. If one edge is arbitrarily added in the system, the weighted out-degree of classes would increase according to CSBG measurement metrics. Obviously, the coupling increases as well. Thus, CSBG meets monotonicity and the proposition is proved.

CSBG Property 4. CSBG meets the property of class merging.

Proof. In an object-oriented software system , there are two classes , and class is a merger of classes and . The object-oriented system is a system in which classes and in are replaced by class . CSBG mainly calculates the weighted out-degree of classes in software systems. Therefore,

CSBG Property 5. CSBG satisfies the merge property of two irrelevant classes.

Proof. In an object-oriented software system , there are two classes , and the two classes are not coupled. Moreover, class is the merger of classes and . The object-oriented system is a system in which classes and in are replaced by class . CSBG mainly calculates the weighted out-degree of classes in software systems. Therefore,

2.5. Comparative Experiment

In the following sections, the proposed CSBG method is compared with existing coupling measurement methods in order to verify the rationality of the CSBG measurement results.

2.5.1. Calculating the Coupling of the Software System Using CSBG

In this section, CSBG for coupling measurement was compared with the existing measurement methods.

This experiment used a simple system as an example to analyze and compare the values produced by the existing coupling measurements. The system was composed of 6 classes (Shape.java, Point.java, Line.java, Triangle.java, Quadrilateral.java, and Square.java), which described shapes, points, edges, triangles, quadrilaterals, and squares, respectively. The first three classes were in package graph, and the last three classes were in package graph.polygon (the package-level hierarchy of the classes is shown in Figure 4). The classes were related by inheritance, combination, variable declaration, and method invocation, which made the system suitable for analyzing the coupling model. The code of the classes is shown in Figures 5–10.

There were three classes in the package graph, including class Shape, class Point, and class Line.

There were three classes in the package graph.polygon, which were classes of Triangle, Square, and Quadrilateral.
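
The package layering of this example can be reproduced from the fully qualified class names alone. In the sketch below, the class and package names come from the description above, while the helper names and the rule that two classes are in the same layer when the same package directly contains them are assumptions made for illustration.

```python
from collections import defaultdict

# Fully qualified names of the six classes in the example system.
CLASSES = [
    "graph.Shape", "graph.Point", "graph.Line",
    "graph.polygon.Triangle", "graph.polygon.Quadrilateral",
    "graph.polygon.Square",
]


def package_of(qualified_name):
    """Return the package part of a fully qualified class name."""
    return qualified_name.rpartition(".")[0]


def classes_by_package(names):
    """Group simple class names by the package that directly contains them."""
    groups = defaultdict(list)
    for name in names:
        groups[package_of(name)].append(name.rpartition(".")[2])
    return dict(groups)


def same_layer(a, b):
    """Assumed rule: same layer means the same directly containing package."""
    return package_of(a) == package_of(b)


layers = classes_by_package(CLASSES)
```

With this grouping, an edge between graph.Shape and graph.Line would count toward the same-layer out-degree, whereas an edge from graph.polygon.Square to graph.Shape would count only toward the all-layer out-degree.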

In this study, an algorithm was designed and a program was developed based on the aforementioned mathematical model. The program calculates the four out-degree metrics of classes (ASS, DEP_D, DEP_S, and GEN) in the same layer and in different layers of the package. These metrics correspond to the cases described in Section 2.3. The out-degrees of classes in the various layers are shown in Table 4.

The mathematical model described in Section 2.3 was herein used. Because the number of classes was small, the heterogeneity of out-degree of classes could not be reflected. Moreover, heterogeneity had little impact on the coupling in this example. Therefore, it was considered that heterogeneity was approximately the same. Coupling of software systems was calculated as follows:

2.5.2. Analysis of the Results of Various Methods for Coupling Measurement

The coupling of the software system was also calculated with existing measurement methods, and the results are shown in Table 5. Apart from CSBG, the measurement methods mainly focus on a particular local, fine-grained aspect. They are based on reductionism and do not investigate the coupling of software systems from an overall, global perspective. Consequently, their measurement values were mostly either very large or very small, several metrics were equal to 0, and the metrics discriminated poorly. The metrics calculated by CSBG discriminated better. The existing methods therefore have limitations and cannot provide effective coupling measurement for complex software systems. CSBG not only considers the complex relationships between classes in object-oriented software systems but also analyzes the complexity of classes and of the special bipartite graphs composed of classes from the perspective of the overall package level. The CSBG measurement method is therefore reasonable in theory.

3. Results

3.1. Application of CSBG Measurement Metrics in the Three Open-Source Software Systems

To further validate CSBG, this study used it to measure and analyze the coupling between classes in three Java open-source software systems from different fields: Art of Illusion [48], JabRef [49], and GanttProject [50]. Class cohesion metrics have already been reported for these three systems [51–53], so a more reasonable method for coupling measurement makes it feasible to study their complexity further. Art of Illusion is a software system for 3D rendering, modeling, and animation. JabRef is a graphical application for managing bibliographic databases. GanttProject is a software system for project scheduling that features resource calendars, management, and import and export (MS Project, PDF, HTML). These three systems were chosen because (1) they are written in Java, an object-oriented language; (2) their classes have a certain scale; (3) they come from different fields; and (4) their source code is available. The source code can be freely downloaded from an open-source website (http://sourceforge.net).

3.2. Association of Coupling with Statistical Characteristics of the Three Open-Source Software Systems

Firstly, the program was developed and out-degree of classes at the same layer and all layers of the package was eventually obtained, including ASS, DEP_D, DEP_S, and GEN.

In this section, DEP_D and DEP_S were analyzed, and the results are shown in Figures 11–18. In the experimental results, DEP(i) was a nonstandardized part of probability distribution . If, , then . A linear function was fitted on double logarithmic axes to estimate the Gamma index (R is Pearson’s correlation coefficient and SD is the standard deviation; the Gamma index is also expressed as B in the following table).
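
A double logarithmic fit of this kind can be sketched with ordinary least squares on the logarithms of the data. The sketch below is not the program used in this study: the function name `fit_power_law` is hypothetical, and the input is synthetic data generated from an exact power law with exponent 2, chosen only so that the result is easy to check.

```python
import math


def fit_power_law(counts):
    """Fit log(count) = c - gamma * log(k) by least squares.

    `counts` maps a fan-out value k > 0 to the number of classes with
    that fan-out. Returns (gamma, r), where r is Pearson's correlation
    coefficient of the log-log points (negative for a decaying law).
    """
    points = [(math.log(k), math.log(c)) for k, c in counts.items()
              if k > 0 and c > 0]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two positive data points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0 or syy == 0:
        raise ValueError("data are constant; no line can be fitted")
    slope = sxy / sxx
    r = sxy / math.sqrt(sxx * syy)
    return -slope, r


# Synthetic, noise-free data following count = 1000 * k**-2.
synthetic = {k: 1000 * k ** -2.0 for k in range(1, 11)}
gamma, r = fit_power_law(synthetic)
```

For real fan-out data the points scatter around the fitted line, and the magnitude of r indicates how closely the distribution follows a power law. Values of k equal to 0 are skipped because their logarithm is undefined.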

Although inheritance between classes increases the coupling of the system, it is encouraged in software engineering because it reduces the function and attribute definitions needed to create a new class; it is therefore a weak coupling. The linear distribution of GEN (Table 6) shows that not all classes have an inheritance relationship and that the GEN fan-out of classes is neither uniformly very large nor uniformly very small; classes with values equal to 0 or 1 were dominant.

Pearson’s correlation coefficient (R) and the SD value indicate the quality of the linear fitting: the larger the R value and the smaller the SD value, the better the fit. B is the estimated Gamma index. As shown in Table 7, if 0.95 is taken as the minimum value, the distributions approximately obeyed the power-law distribution, except that the ASS values in JabRef were relatively small (0.91651 and 0.88148). Nevertheless, the distributions of the ASS layer, ASS_all layer, DEP_D layer, DEP_S layer, DEP_D_all layer, and DEP_S_all layer were all assumed to obey the power-law distribution. The results demonstrated that the fan-out of classes in the form of ASS and DEP follows a certain rule: the values are not mostly large or mostly small but exhibit the “scale-free” property of complex networks and obey the power-law distribution. In actual software development, if developers excessively pursue low coupling between classes, a class may be divided into two or more subclasses, which increases system complexity. Determining the appropriate range of coupling between classes in software systems is therefore important. The data analysis shows that the “scale-free” property of complex networks should motivate software developers to pay more attention to the distribution range of coupling in large-scale software systems, which could provide a reliable reference for developing more reasonable software systems.

3.3. Coupling Measurement for the Three Open-Source Software Systems

According to the results of the abovementioned analysis, out-degrees of classes were often equal to 0, 1, and 2 for class inheritance in generalization, interface implementation, and implementation of abstract classes, which were approximately linearly distributed. Therefore, the power-law value of GEN was approximated to 1.

3.3.1. Calculation of Coupling Measurement for Art of Illusion

According to the CSBG method for coupling measurement, coupling of the software system for Art of Illusion was calculated as follows:

3.3.2. Calculation of Coupling Measurement for JabRef

According to CSBG for coupling measurement, coupling of the software system for JabRef was calculated as follows:

3.3.3. Calculation of Coupling Measurement for GanttProject

According to CSBG for coupling measurement, coupling of the software system for GanttProject was calculated as follows:

The three aforementioned open-source software systems were analyzed at the package level, class level, and method level using CSBG for coupling measurement, and a program was developed to calculate the various metrics. The coupling of the three systems, in descending order, was Art of Illusion, JabRef, and GanttProject, suggesting that CSBG is feasible for the coupling measurement of software systems and has a certain practical value.

4. Conclusion

Based on the bipartite graphs of complex networks, and by comprehensively considering the weighted fan-out between classes at the package level, class level, and method level, this study expressed the interaction of classes as a special bipartite graph and a software system as a set of these special bipartite graphs. First, this study analyzed the four relationships in a software system (ASS, DEP_D, DEP_S, and GEN), considering both the coupling of a class with other classes in the same layer of the package and the coupling of classes in a package with classes in different layers of the package. On this basis, the CSBG method for coupling measurement of software systems was proposed, and it was shown to comply with the mathematical properties of widely accepted metrics. Second, CSBG was compared with other typical methods on an example software system. The values measured by the other methods were mostly either very large or very small, and several were equal to 0, because these methods do not analyze coupling from an overall, global perspective. The other measurement methods therefore have defects, whereas CSBG is reasonable. Finally, a program was developed based on the CSBG method and applied to three open-source software systems (Art of Illusion, JabRef, and GanttProject). The results demonstrated that the coupling of the three systems, in descending order, was Art of Illusion, JabRef, and GanttProject.

Although inheritance between classes increases the coupling of the system, it is encouraged in software engineering because it reduces the function and attribute definitions needed to create a new class; it is therefore a weak coupling. The linear distribution of GEN shows that not all classes had an inheritance relationship and that the GEN fan-out of classes was neither uniformly very large nor uniformly very small; classes with values equal to 0 or 1 were dominant. Furthermore, in the same layer and across all layers of the package, the fan-out values of ASS, DEP_D, and DEP_S obeyed the scale-free property of complex networks. These findings provided empirical support for the CSBG method. The statistical power-law values were applied in the proposed coupling measurement to calculate the coupling of the three open-source software systems, which provides a reliable reference for further investigation of coupling between classes using the statistics of complex networks. In [54], it was reported that the cohesion distribution of the majority of classes in a software system has a certain regularity; in other words, the cohesion of classes is neither uniformly very large nor uniformly very small. In the empirical analysis of coupling, the values of the coupling metrics showed a regularity similar to that of class cohesion. Coupling represents the degree of interdependence between classes, and intuitively, the greater the coupling, the more complex the software. However, an excessive pursuit of “high cohesion and low coupling” increases the workload of software developers and the complexity of software systems as well. The empirical evidence therefore showed that, within a certain range, reducing coupling helps to reduce the complexity of software, whereas an excessive, blind pursuit of low coupling increases the complexity of software systems.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 61602400). This work was also supported in part by the Key Research and Development Program of Hangzhou under Grant 20182011A46.