Abstract
Software reliability is an important feature that influences systems’ reliability. Software reliability models are a common tool to evaluate software reliability quantitatively. Various reliability models have been suggested based on the NHPP (nonhomogeneous Poisson process). In this article, a new NHPP model based on the Lindley distribution is proposed. The mathematical formulas for its measures of reliability are obtained and graphically illustrated. The proposed model’s parameters are estimated using both the NLSE (nonlinear least squares estimation) and the WNLSE (weighted nonlinear least squares estimation) methods. The model is then validated based on several different reliability datasets. The methods of estimation are evaluated and compared using three different criteria. The performance of the new model is also evaluated and compared, both objectively and subjectively, with three previously suggested models. The application results show that our new model demonstrates good performance in our selected failure data.
1. Introduction
The development of software systems is becoming more expensive and time-consuming because of their increasing complexity. Consequently, the reliable performance of software systems is becoming more important. Numerous SRGMs (software reliability growth models) with various assumptions have been proposed since the 1970s [1–3]. Several researchers have used reliability models based on the NHPP in recent years [4–8].
Kapur et al. [9] proposed a new SRGM based on an Itô-type stochastic differential equation; the proposed model performs comparatively better than the existing NHPP models. Xu and Yao [10] also suggested a novel NHPP model based on a partial differential equation, and their suggested model fits the observations more closely. Li and Yi [11] proposed a modified SRGM to reconsider the reliability of open source software and showed that it fits the failure data well and provides powerful prediction capability. Ramasamy and Lakshmanan [12] proposed an SRGM with an infinite testing effort function. Recently, Al-Turk [13] proposed an NHPP model based on the two-parameter log-logistic distribution. The essential model characteristics were obtained, and the parameters of the model were estimated using the MLE (maximum likelihood estimation) and the NLSE methods. The results of the application indicate that the considered model gives a reasonable prediction capability for the real datasets studied. Hui and Liu [14] proposed an SRGM based on the Gaussian new distribution. The proposed model was confirmed by experiments to have a better fit and prediction performance than other reliability models.
In this article, we propose a new model that belongs to the NHPP class and is based on the Lindley distribution. Several properties of the proposed model are outlined in Section 2 with graphical representations. These properties include the MVF (mean value function), failure intensity, number of remaining faults, error detection rate, MTBF (mean time between failures), and conditional reliability. The NLSE and WNLSE methods are used to estimate the parameters of the proposed model in Section 3. An application to real datasets is provided in Section 4. The last section concludes the article.
2. NHPP Lindley Model
2.1. Model Construction
A one-parameter Lindley distribution was suggested by Lindley [15] for the analysis of failure data. This model can capture failure data with different shapes of hazard rates. It has been studied by several authors as a good alternative to the exponential and Weibull distributions [16–18]. As with all statistical distributions, the Lindley distribution is specified by its PDF (probability density function),
$$f(t;\theta) = \frac{\theta^{2}}{1+\theta}\,(1+t)\,e^{-\theta t}, \tag{1}$$
or its CDF (cumulative distribution function),
$$F(t;\theta) = 1 - \left(1 + \frac{\theta t}{1+\theta}\right) e^{-\theta t}, \tag{2}$$
where $t > 0$ and $\theta > 0$ is the shape parameter.
The main aim of the NHPP model is to assess and predict the expected number of detected faults up to a specific point of time, which can be achieved using its MVF. Suppose $N(t)$ denotes the cumulative number of faults discovered at time $t$, and $F(t)$ is the distribution function of the time between two successive failures; then the MVF of the NHPP model, $m(t) = E[N(t)]$, can be expressed as follows [5]:
$$m(t) = a\,F(t), \tag{3}$$
while the corresponding intensity function is given by
$$\lambda(t) = \frac{dm(t)}{dt} = a\,f(t), \tag{4}$$
where $a$ is the expected number of faults. By substituting equations (1) and (2) into equations (3) and (4), respectively, we get the MVF of the NHPP L (Lindley) model as follows:
$$m(t) = a\left[1 - \left(1 + \frac{\theta t}{1+\theta}\right) e^{-\theta t}\right], \tag{5}$$
and its corresponding intensity function as follows:
$$\lambda(t) = \frac{a\,\theta^{2}}{1+\theta}\,(1+t)\,e^{-\theta t}. \tag{6}$$
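The construction above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard one-parameter Lindley CDF $F(t) = 1 - \left(1 + \theta t/(1+\theta)\right)e^{-\theta t}$ and the usual NHPP relations $m(t) = aF(t)$ and $\lambda(t) = af(t)$; the function names and the parameter values are hypothetical and are not taken from the paper's datasets.

```python
import math


def lindley_pdf(t, theta):
    """Lindley density: theta^2 / (1 + theta) * (1 + t) * exp(-theta * t)."""
    return theta ** 2 / (1.0 + theta) * (1.0 + t) * math.exp(-theta * t)


def lindley_cdf(t, theta):
    """Lindley distribution function: 1 - (1 + theta*t/(1 + theta)) * exp(-theta * t)."""
    return 1.0 - (1.0 + theta * t / (1.0 + theta)) * math.exp(-theta * t)


def mvf(t, a, theta):
    """Mean value function of the NHPP L model: expected faults detected by time t."""
    return a * lindley_cdf(t, theta)


def intensity(t, a, theta):
    """Failure intensity of the NHPP L model: derivative of the MVF."""
    return a * lindley_pdf(t, theta)


# Illustrative parameter values only (assumed, not estimated from any dataset).
a, theta = 100.0, 0.5
for t in (0.0, 1.0, 5.0, 20.0):
    print(t, mvf(t, a, theta), intensity(t, a, theta))
```

The MVF starts at zero and approaches $a$ as $t$ grows, which matches the interpretation of $a$ as the expected total number of faults.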
2.2. Model Characteristics
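The NHPP model has very useful reliability measures for describing failure phenomena. In this section, the mathematical formulas of some of these measures for the new model will be given. First, the number of remaining faults of the NHPP L model is given by
$$n(t) = a - m(t) = a\left(1 + \frac{\theta t}{1+\theta}\right) e^{-\theta t}, \tag{7}$$
and then the error detection rate can be defined as follows:
$$d(t) = \frac{\lambda(t)}{a - m(t)} = \frac{\theta^{2}(1+t)}{1+\theta+\theta t}, \tag{8}$$
while the MTBF is as follows:
$$\mathrm{MTBF}(t) = \frac{1}{\lambda(t)} = \frac{(1+\theta)\,e^{\theta t}}{a\,\theta^{2}(1+t)}. \tag{9}$$
The conditional reliability is the probability that no fault is detected in the interval $(t, t+x]$, given that a fault occurred at time $t$, where $x$ is the interval of operation time according to some practical or administrative requirements [19]. Mathematically, the conditional reliability of the NHPP L model can be obtained as follows:
$$R(x \mid t) = \exp\left\{-\left[m(t+x) - m(t)\right]\right\}, \tag{10}$$
with $m(\cdot)$ the MVF of the NHPP L model.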
2.3. Graphs of the Model Characteristics
The plots of the NHPP L model’s characteristics for different selected values of parameters are shown in Figures 1–6. Figure 1 illustrates the MVF, which represents the variation of the number of faults detected with respect to time. From this figure, we can see that the number of detected faults grows quickly at the start of testing and later stabilizes, and that larger parameter values give a higher MVF curve. Figure 2 shows that the intensity function varies in shape over the different selected shape parameters, and it reaches a higher peak for larger parameter values. The number of remaining errors in Figure 3 decreases as the testing time increases; smaller values of the parameter give a lower curve. In Figure 4, the error detection rate increases as the testing time increases; a larger value of the shape parameter gives a higher error detection rate (the failure occurrence rate per fault of the software). The MTBF function in Figure 5 increases with the progress of the testing time. In Figure 6, we can see that as $t$ tends to infinity the conditional reliability approaches 1.

3. Estimation of Model Parameters
In this section, the NLSE and WNLSE methods are applied for the estimation of parameters of our proposed model.
3.1. The NLSE and WNLSE Methods
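Assume that a software system is tested for $T$ units of time and $n$ faults were detected. Let $t_1, t_2, \ldots, t_n$ be the times at which the failures were observed. $m(t; \boldsymbol{\beta})$ is the MVF, and $\boldsymbol{\beta}$ is its vector of unknown parameters. The parameters are thus derived from $n$ observed data pairs $(t_i, y_i)$, $i = 1, 2, \ldots, n$, where $y_i$ is the total number of faults detected within time $(0, t_i)$. Then, the NLSE method aims to minimize the following function:
$$S(\boldsymbol{\beta}) = \sum_{i=1}^{n}\left[y_i - m(t_i; \boldsymbol{\beta})\right]^{2}, \tag{11}$$
while the WNLSE method aims to minimize the following function:
$$S_w(\boldsymbol{\beta}) = \sum_{i=1}^{n} w_i\left[y_i - m(t_i; \boldsymbol{\beta})\right]^{2}, \tag{12}$$
where $w_i$ are positive weights [20].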
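The two objective functions can be written generically for any MVF, as in the following sketch. The exponential MVF, the data, and the weights $1/y_i$ are assumptions made purely for illustration; the weights actually used in the paper are not recoverable from the text.

```python
import math


def nlse_objective(mvf, params, times, counts):
    """Unweighted sum of squared residuals between observed counts and the MVF."""
    return sum((y - mvf(t, *params)) ** 2 for t, y in zip(times, counts))


def wnlse_objective(mvf, params, times, counts, weights):
    """Weighted sum of squared residuals; weights must be positive."""
    return sum(w * (y - mvf(t, *params)) ** 2
               for t, y, w in zip(times, counts, weights))


def example_mvf(t, a, b):
    """Hypothetical exponential MVF, used only to exercise the objectives."""
    return a * (1.0 - math.exp(-b * t))


times = [1.0, 2.0, 3.0, 4.0]
counts = [10.0, 18.0, 24.0, 28.0]     # made-up cumulative fault counts
weights = [1.0 / y for y in counts]   # illustrative choice, not the paper's weights

print(nlse_objective(example_mvf, (40.0, 0.3), times, counts))
print(wnlse_objective(example_mvf, (40.0, 0.3), times, counts, weights))
```

With all weights equal to 1 the weighted objective reduces to the unweighted one, so the NLSE can be viewed as a special case of the WNLSE.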
3.2. The NLSE and WNLSE Methods for the NHPP L Model
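For the NLSE, we substitute equation (5) in equation (11) as follows:
$$S(a,\theta) = \sum_{i=1}^{n}\left[y_i - a\left(1 - \left(1 + \frac{\theta t_i}{1+\theta}\right)e^{-\theta t_i}\right)\right]^{2}. \tag{13}$$
Taking the partial derivative of equation (13) with respect to $a$ and $\theta$, respectively, we get
$$\frac{\partial S}{\partial a} = -2\sum_{i=1}^{n}\left(y_i - aF_i\right)F_i, \qquad \frac{\partial S}{\partial \theta} = -2a\sum_{i=1}^{n}\left(y_i - aF_i\right)G_i, \tag{14}$$
where, for brevity, $F_i = 1 - \left(1 + \frac{\theta t_i}{1+\theta}\right)e^{-\theta t_i}$ and $G_i = \frac{\partial F_i}{\partial \theta} = \frac{\theta t_i\left(2 + \theta + t_i + \theta t_i\right)}{(1+\theta)^{2}}\,e^{-\theta t_i}$.
By setting the derivatives equal to zero, we get the following nonlinear equations:
$$\hat{a} = \frac{\sum_{i=1}^{n} y_i F_i}{\sum_{i=1}^{n} F_i^{2}}, \tag{15}$$
$$\sum_{i=1}^{n}\left(y_i - \hat{a}F_i\right)G_i = 0. \tag{16}$$
A closed-form expression for the NLS estimates of $a$ and $\theta$ cannot be obtained. Consequently, an estimate of the parameter $\theta$ can be obtained by numerically solving the nonlinear equation (16), and then by substituting this estimate in equation (15), the estimate of the parameter $a$ can be obtained.
For the WNLSE, we substitute equation (5) in equation (12), and thus we obtain
$$S_w(a,\theta) = \sum_{i=1}^{n} w_i\left[y_i - a\left(1 - \left(1 + \frac{\theta t_i}{1+\theta}\right)e^{-\theta t_i}\right)\right]^{2}. \tag{17}$$
Taking the partial derivative of equation (17) with respect to $a$ and $\theta$, respectively, we get
$$\frac{\partial S_w}{\partial a} = -2\sum_{i=1}^{n} w_i\left(y_i - aF_i\right)F_i, \qquad \frac{\partial S_w}{\partial \theta} = -2a\sum_{i=1}^{n} w_i\left(y_i - aF_i\right)G_i. \tag{18}$$
By setting the derivatives equal to zero, we have the following nonlinear equations:
$$\hat{a} = \frac{\sum_{i=1}^{n} w_i y_i F_i}{\sum_{i=1}^{n} w_i F_i^{2}}, \tag{19}$$
$$\sum_{i=1}^{n} w_i\left(y_i - \hat{a}F_i\right)G_i = 0. \tag{20}$$
A closed-form expression for the WNLS estimate of $\theta$ cannot be obtained. By solving equation (20) using the Gauss–Newton method, we obtain the value of the estimate $\hat{\theta}$, and then by substituting this estimate in equation (19), the estimate of the parameter $a$ can be obtained.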
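The two-step procedure (solve for $\theta$ numerically, then recover $a$ in closed form) can be sketched as follows. This is not the Gauss–Newton iteration named above: for simplicity the sketch minimizes the profiled objective over $\theta$ with a coarse grid followed by a golden-section search, which reaches the same stationary point when the objective has a single minimum in the search interval. It assumes $m(t) = aF(t)$ with the Lindley CDF, so that for fixed $\theta$ the minimizing $a$ is $\sum w_i y_i F_i / \sum w_i F_i^2$. The search bounds, function names, and synthetic data are all hypothetical.

```python
import math


def lindley_cdf(t, theta):
    """Lindley CDF: 1 - (1 + theta*t/(1 + theta)) * exp(-theta * t)."""
    return 1.0 - (1.0 + theta * t / (1.0 + theta)) * math.exp(-theta * t)


def profile_a(theta, times, counts, weights):
    """Closed-form minimizer over a for fixed theta."""
    num = sum(w * y * lindley_cdf(t, theta) for t, y, w in zip(times, counts, weights))
    den = sum(w * lindley_cdf(t, theta) ** 2 for t, w in zip(times, weights))
    return num / den


def profile_sse(theta, times, counts, weights):
    """Weighted sum of squared residuals with a replaced by its profiled value."""
    a = profile_a(theta, times, counts, weights)
    return sum(w * (y - a * lindley_cdf(t, theta)) ** 2
               for t, y, w in zip(times, counts, weights))


def fit(times, counts, weights=None, lo=1e-3, hi=5.0, grid=200, tol=1e-10):
    """Return (a_hat, theta_hat); unit weights give the NLSE, other weights the WNLSE."""
    if weights is None:
        weights = [1.0] * len(times)

    def f(theta):
        return profile_sse(theta, times, counts, weights)

    # Coarse grid to bracket the minimum (assumed bounds lo..hi on theta).
    step = (hi - lo) / grid
    best = min(range(grid + 1), key=lambda k: f(lo + k * step))
    left = lo + max(best - 1, 0) * step
    right = lo + min(best + 1, grid) * step

    # Golden-section refinement inside the bracket.
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while right - left > tol:
        c = right - inv_phi * (right - left)
        d = left + inv_phi * (right - left)
        if f(c) < f(d):
            right = d
        else:
            left = c
    theta_hat = 0.5 * (left + right)
    return profile_a(theta_hat, times, counts, weights), theta_hat


# Synthetic, noise-free data generated from the model itself (a = 120, theta = 0.3).
times = [float(k) for k in range(1, 21)]
counts = [120.0 * lindley_cdf(t, 0.3) for t in times]
print(fit(times, counts))                                        # NLSE
print(fit(times, counts, weights=[1.0 / y for y in counts]))     # WNLSE, illustrative weights
```

Profiling out $a$ reduces the problem to a one-dimensional search, which is why a derivative-free method is sufficient here; a Gauss–Newton or other root-finding scheme applied to the score equation for $\theta$ would be the closer analogue of the procedure described in the text.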
4. Application to Failure Data
In this section, examples of real data are used to compare the two considered methods of estimation for the proposed model. Also, we perform a comparative study to evaluate the effectiveness of the proposed model with three of the previously existing models. Useful results based on the studied real datasets are presented and discussed at the end of this section. To facilitate mathematical computation, a software tool was developed using R language version 3.6.1.
4.1. Datasets
Nine published datasets with different sizes were chosen for our evaluation study. References for the selected datasets are shown in Table 1.
4.2. Models
In addition to our proposed model (NHPP L), three other well-known reliability models are considered, and the names and MVFs of these models are listed in Table 2.
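For orientation, the sketch below writes out MVFs for the four models in the forms in which they are usually stated in the literature: the Goel–Okumoto (GO), delayed S-shaped, and inflection S-shaped models, alongside the Lindley-based MVF. It is an assumption that Table 2 uses exactly these parameterizations; the parameter names `b` and `beta` are the conventional ones and are not taken from the paper.

```python
import math


def go_mvf(t, a, b):
    """Goel-Okumoto model: a * (1 - exp(-b t))."""
    return a * (1.0 - math.exp(-b * t))


def delayed_s_mvf(t, a, b):
    """Delayed S-shaped model: a * (1 - (1 + b t) * exp(-b t))."""
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))


def inflection_s_mvf(t, a, b, beta):
    """Inflection S-shaped model: a * (1 - exp(-b t)) / (1 + beta * exp(-b t))."""
    e = math.exp(-b * t)
    return a * (1.0 - e) / (1.0 + beta * e)


def lindley_mvf(t, a, theta):
    """NHPP L model: a * (1 - (1 + theta t / (1 + theta)) * exp(-theta t))."""
    return a * (1.0 - (1.0 + theta * t / (1.0 + theta)) * math.exp(-theta * t))


# Illustrative parameter values only (assumed).
for t in (0.0, 2.0, 10.0, 50.0):
    print(t,
          go_mvf(t, 100.0, 0.2),
          delayed_s_mvf(t, 100.0, 0.2),
          inflection_s_mvf(t, 100.0, 0.2, 2.0),
          lindley_mvf(t, 100.0, 0.2))
```

Written this way, the Lindley-based MVF differs from the delayed S-shaped one only in the coefficient of $t$ inside the bracket ($\theta/(1+\theta)$ instead of $b$), which places it between the GO and delayed S-shaped curves for the same rate parameter.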
4.3. Evaluation Criteria
To check the performance of the considered models, we used the following three criteria based on equations (21)–(23). The mean square error (MSE) measures the variation between the predicted values and the actual observations. It is defined as [19]
$$\mathrm{MSE} = \frac{\sum_{i=1}^{n}\left[\hat{m}(t_i) - y_i\right]^{2}}{n - k}, \tag{21}$$
where $\hat{m}(t_i)$ is the estimated number of faults at time $t_i$ obtained from the considered model; $y_i$ is the total number of faults detected within time $(0, t_i)$, $i = 1, 2, \ldots, n$; $n$ is the number of observations; and $k$ is the number of parameters. A lower value of the MSE indicates more confidence in the model and thus better performance. The variance is defined as follows [29, 30]:
$$\mathrm{Variance} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left[\hat{m}(t_i) - y_i - \mathrm{Bias}\right]^{2}}, \tag{22}$$
where the bias is defined as $\mathrm{Bias} = \frac{1}{n}\sum_{i=1}^{n}\left[\hat{m}(t_i) - y_i\right]$. The average of the prediction errors is referred to as the prediction bias, and its standard deviation is often used as a measure of the variance in the predictions. A small value of the variance indicates that the model fits the data well. The coefficient of determination measures how well the fit describes the variation in the data. It is defined as [19]
$$R^{2} = 1 - \frac{\sum_{i=1}^{n}\left[y_i - \hat{m}(t_i)\right]^{2}}{\sum_{i=1}^{n}\left[y_i - \bar{y}\right]^{2}}, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i. \tag{23}$$
Values for this coefficient range from 0 to 1. The value of $R^{2}$ closest to 1 indicates the best model.
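The three criteria can be computed as in the sketch below. It assumes the common definitions: MSE with denominator $n - k$, bias as the mean prediction error, variance as the sample standard deviation of the prediction errors about the bias, and $R^2 = 1 - \mathrm{SS}_{\mathrm{res}}/\mathrm{SS}_{\mathrm{tot}}$. The function names and the toy data are hypothetical.

```python
import math


def mse(actual, predicted, n_params):
    """Sum of squared prediction errors divided by (n - number of parameters)."""
    n = len(actual)
    return sum((p - y) ** 2 for y, p in zip(actual, predicted)) / (n - n_params)


def bias(actual, predicted):
    """Mean prediction error (predicted minus actual)."""
    return sum(p - y for y, p in zip(actual, predicted)) / len(actual)


def variance(actual, predicted):
    """Standard deviation of the prediction errors about the bias (n - 1 denominator)."""
    b = bias(actual, predicted)
    n = len(actual)
    return math.sqrt(sum((p - y - b) ** 2 for y, p in zip(actual, predicted)) / (n - 1))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


# Toy data: every prediction overshoots the observation by exactly one fault.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
print(mse(actual, predicted, n_params=2))
print(bias(actual, predicted), variance(actual, predicted))
print(r_squared(actual, predicted))
```

In this toy case the bias is 1 while the variance is 0, which illustrates why the criteria can rank models differently: a constant offset inflates the MSE and lowers $R^2$ but leaves the variance untouched.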
4.4. Results and Discussion
4.4.1. Comparative Study of the Estimation Methods
This section evaluates the performance of the NLSE and WNLSE methods for the NHPP L model based on eight datasets. The results are shown in Table 3. From the evaluation criteria values in Table 3, we derived the following conclusions:
(i) The NHPP L model provides values indicative of a better model for most of the evaluation criteria in most cases when using the WNLSE method.
(ii) The different evaluation criteria gave different results, and this indicates the necessity to study several criteria during the comparison.
Figures 7 and 8 illustrate the actual and fitted curves of software failures using the NLSE and WNLSE methods. According to these figures, we can see our new model provides a good fit for all considered datasets when using either the NLSE or WNLSE methods. In particular, the proposed model is more suitable for modeling the failure datasets when using the WNLSE method rather than the NLSE method.

4.4.2. Comparing the Performance of Various SRGMs for Some Real Datasets
Since the proposed model is new with respect to the prediction/estimation of software reliability, we compared its accuracy with some well-known and widely used SRGMs, namely, the GO model, the delayed S-shaped model, and the inflection S-shaped model. Our comparative study is based on five datasets, DS2, DS3, DS4, DS5, and DS9, and used the WNLSE as the method of estimation. The Kolmogorov–Smirnov test was used to check and compare the fit between these datasets and our studied reliability models. The results are presented in Table 4. From the table, we can observe the following:
(i) The MSE values for all studied models are very close, indicating that all studied models can describe the five selected systems effectively, with minor differences between them in terms of their performance. The NHPP L model ranked second for DS2, DS3, and DS5, first for DS4, and third for DS9.
(ii) The values for the coefficient of determination for all studied models are close to 1. Therefore, it can be said that all studied models are suitable for modeling the considered software projects.
Figure 9 illustrates the actual and predicted results based on the four considered models. According to this figure, we can see that all the selected models fit the studied failure data well. In particular, the proposed model is one of the most suitable for modeling the selected datasets.

5. Conclusions
In this article, we propose a new reliability model based on the Lindley distribution. Several essential characteristics of our proposed model, the NHPP L model, were obtained. The considered model parameters were estimated using the NLSE and WNLSE methods. The performance of the estimators for each studied method was evaluated using different criteria based on eight datasets. A comparative study between the proposed model and three other common models was conducted based on five real datasets. The WNLSE method was determined to have better performance than the NLSE method for the chosen failure datasets. Thus, it is recommended that the WNLSE method be used with the NHPP models. The performance of the NHPP L model is encouraging in comparison with other selected models. The present study can be extended by incorporating SRGMs with learning effects to increase the flexibility of models and to enhance their capability for accurately describing software failure phenomena.
Abbreviations
SRGMs: | Software reliability growth models |
NHPP: | Nonhomogeneous Poisson process |
PDF: | Probability density function |
CDF: | Cumulative distribution function |
MVF: | Mean value function |
MTBF: | Mean time between failures |
MSE: | Mean of squared errors |
$R^{2}$: | Coefficient of multiple determination |
NLSE: | Nonlinear least squares estimation |
WNLSE: | Weighted nonlinear least squares estimation |
MLE: | Maximum likelihood estimation. |
Data Availability
Previously published data were used to support this study. These prior studies are cited at relevant places within the text as references.
Conflicts of Interest
The authors declare that they have no conflicts of interest regarding the publication of this paper.