Abstract
We present a discrete analogue of the continuous generalized inverted exponential distribution, denoted the discrete generalized inverted exponential (DGIE) distribution. In reliability analysis, it is often cumbersome or impossible to measure a large number of observations on a continuous scale. Although a number of discrete distributions exist in the literature, these distributions have difficulty fitting large amounts of data properly in a variety of fields. The presented distribution fits data more efficiently than several existing distributions. In this study, we discuss the basic distributional properties of the new DGIE: its probability function, moments, reliability indices, characteristic function, and order statistics. Estimation of the parameters is illustrated using both the method of moments and the maximum likelihood method, and simulations are used to show the performance of the estimated parameters. The model is also examined on two real data sets. In addition, the developed DGIE is applied to color image segmentation, which aims to cluster the pixels into their groups. To evaluate the performance of the DGIE, a set of six color images is used, and the method is compared with other image segmentation methods, including the Gaussian mixture model, K-means, and fuzzy subspace clustering. The DGIE provides higher performance than the competing methods.
1. Introduction
In the field of reliability analysis, it is often inconvenient or impossible to measure many observations on a continuous scale. For example, in many practical situations [1–9], reliability data are measured in terms of the number of cases, the number of runs, or the number of days patients with a deadly disease survive after therapy. For more examples in reliability and lifetime applications, see Meeker and Escobar [10]. Several recognized techniques exist for building a discrete distribution from a continuous one [11].
Indeed, this technique has been widely applied to generate new discrete distributions; see, for example, [12–18] and the references cited therein. Abouammoh and Alshingiti [19] introduced a shape parameter into the inverted exponential distribution to obtain the generalized inverted exponential (GIE) distribution. The GIE distribution is derived from the exponentiated Fréchet distribution [20]. The hazard rate of the GIE distribution can be decreasing or increasing, depending on its shape parameter. The GIE is effective in modeling many kinds of data and can be applied in several areas, such as horse racing, life testing, queues, and wind speeds [20]. Abouammoh and Alshingiti [19] show that the GIE distribution provides a better fit than the Weibull, gamma, and generalized exponential distributions. The GIE can be widely used in many fields; see, for example, [21, 22]. However, although a number of discrete distributions exist in the literature, such as the geometric, discrete Lindley, and discrete logistic distributions, they have limitations in effectively fitting large amounts of data in many areas. There is still a need to develop new discretized distributions suitable for applications such as image segmentation. This motivated us to present a new distribution.
The presented distribution, the discrete generalized inverted exponential (DGIE), is constructed from the generalized inverted exponential distribution. Its parameters are estimated using two methods, namely the method of moments and maximum likelihood. The consistency of the estimated parameters is illustrated using simulation. Based on two data sets, the proposed distribution is shown to be more suitable for analyzing the given data than competing distributions. The proposed distribution is also applied to color segmentation, which helps in clustering the pixels into their groups; here the DGIE provides higher performance than other competing methods.
The main contributions of the current study can be summarized as follows:
(1) Present a new distribution, the discrete generalized inverted exponential (DGIE), to avoid the limitations of other distributions.
(2) Compute the basic distributional properties of the DGIE: probability function, moments, reliability indices, characteristic function, and order statistics.
(3) Evaluate the applicability of the DGIE by using it to improve color image segmentation.
The paper is organized as follows: in Section 2, we introduce the distribution and present its statistical properties, such as the failure and survival functions. In addition, we list further properties of the proposed distribution, including the moment generating function, moments, quantiles, entropy, stress-strength reliability, mean residual lifetime, and order statistics. We analyze two real data sets in Section 3. Finally, the conclusion is given in Section 4.
2. Materials and Methods
2.1. Discrete Generalized Inverted Exponential Distribution
Definition 1. A random variable is said to have a discrete generalized inverted exponential distribution with parameters β > 0 and θ > 0 if its probability mass function (PMF) has the form given in equation (1). We denote this distribution as DGIE(β, θ). Figure 1 illustrates several examples of the probability mass function of the DGIE distribution for various values of β and θ.
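For concreteness, a discrete analogue obtained by the usual survival-discretization technique [11] takes the following form (a sketch under our own assumed parameterization, which may differ from the paper's exact equation (1)):

```latex
P(X = x) \;=\; \left(1 - e^{-\theta/x}\right)^{\beta} \;-\; \left(1 - e^{-\theta/(x+1)}\right)^{\beta},
\qquad x = 0, 1, 2, \ldots,
```

with the convention \((1 - e^{-\theta/0})^{\beta} := 1\), so that the probabilities telescope and sum to one.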

2.1.1. Cumulative Distribution Function
The cumulative distribution function (CDF) of the DGIE distribution is given by equation (2), where β > 0 and θ > 0. As for monotonicity, one can verify that the expression in equation (3) is a decreasing function of x for all β > 0 and θ > 0.
The DGIE distribution is log-concave. Based on this log-concavity (Mark, 1996), the proposed distribution is unimodal, has an increasing failure rate, and possesses all of its moments.
Furthermore, the quantile function of the DGIE distribution, say x_q, is given by equation (4), where β > 0, θ > 0, and 0 < q < 1, and ⌊·⌋ denotes the greatest integer function.
Hence, the median can be obtained by putting q = 1/2 in equation (4).
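Under the survival-discretization sketch above (an assumption on our part), the CDF is \(F(x) = 1 - (1 - e^{-\theta/(x+1)})^{\beta}\), and the smallest integer x with \(F(x) \ge q\) can be written in closed form:

```latex
x_q \;=\; \max\!\left\{0,\;\left\lceil \frac{\theta}{-\ln\!\left(1 - (1-q)^{1/\beta}\right)} - 1 \right\rceil\right\},
\qquad 0 < q < 1,
```

with the median obtained at \(q = 1/2\).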
Survival function: the survival function of the DGIE distribution is defined as
Hazard rate: the hazard rate function (HRF), r(x), is given by
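Under the same assumed survival-discretization form, with \(S(x) = P(X \ge x) = (1 - e^{-\theta/x})^{\beta}\), the hazard rate simplifies to a ratio of successive survival values:

```latex
r(x) \;=\; \frac{P(X = x)}{P(X \ge x)}
\;=\; 1 - \left[\frac{1 - e^{-\theta/(x+1)}}{1 - e^{-\theta/x}}\right]^{\beta}.
```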
Figure 2 shows the HRF plots of the DGIE distribution for various values of β and θ.

The reversed hazard rate function (RHRF) of the DGIE distribution takes the form
Figure 3 shows the RHRF plots of the DGIE distribution for various values of β and θ.

2.2. Statistical Properties
The rth moment of the DGIE distribution about the origin is obtained as follows:
The moment generating function (MGF) of the DGIE distribution is computed as follows:
The mean (μ) of the DGIE distribution is derived as
The second moment is obtained as
Hence, the variance could be derived as
The third and fourth moments are, respectively, obtained as
The measure of skewness of the DGIE distribution is obtained as follows:
The measure of kurtosis of the DGIE distribution is obtained as follows:
The probability generating function (PGF) of the DGIE distribution is obtained as follows:
For simplicity, we compute the PGF numerically, where the factorial moment is computed as
The variance (σ²) of the DGIE distribution is given by the following:
Characteristic function: the characteristic function (CF) of the DGIE distribution has the form:
Because the moments do not have closed forms, the mean and variance can only be calculated numerically. We computed the mean and variance for various values of β and θ in Tables 1 and 2, respectively.
2.3. Order Statistics
Order statistics play an important role in both theoretical and practical aspects of statistics, notably in statistical inference and nonparametric statistics. Let X1, ..., Xn be a random sample from the DGIE distribution, and let X(1) ≤ ... ≤ X(n) denote the corresponding order statistics. Then, the CDF of the ith order statistic can be represented in the form
Therefore, the PMF of the ith order statistic has the form:
So, the kth moment of the ith order statistic is written in the form:
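The expressions above rest on classical identities that hold for any discrete distribution with CDF F; written out (our notation, not the paper's), they are:

```latex
F_{(i)}(x) \;=\; \sum_{k=i}^{n} \binom{n}{k}\,[F(x)]^{k}\,[1 - F(x)]^{n-k},
\qquad
p_{(i)}(x) \;=\; F_{(i)}(x) - F_{(i)}(x-1),
\qquad
E\!\left[X_{(i)}^{k}\right] \;=\; \sum_{x} x^{k}\, p_{(i)}(x).
```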
2.3.1. Renyi Entropy
Rényi entropy plays a vital role in information theory. The Rényi entropy of a random variable X is defined as shown below, where γ > 0 and γ ≠ 1 (Rényi, 1961). For the DGIE distribution, when γ is an integer, we compute
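For a discrete random variable, the Rényi entropy referred to above is the standard definition:

```latex
H_{\gamma}(X) \;=\; \frac{1}{1-\gamma}\,\log \sum_{x} \left[P(X = x)\right]^{\gamma},
\qquad \gamma > 0,\; \gamma \neq 1,
```

which reduces to the Shannon entropy in the limit \(\gamma \to 1\).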
2.4. Unknown Parameters Estimation
In this section, we use two methods to estimate the unknown parameters of the DGIE distribution.
2.4.1. Maximum Likelihood Method
Let x1, x2, ..., xn represent the lifetimes of n independent test units following the DGIE(β, θ) distribution. Its log-likelihood function is then written as:
The likelihood equations are then obtained as follows:
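In practice the likelihood equations are solved numerically. The sketch below illustrates the idea in Python, assuming the survival-discretization form S(x) = (1 − e^(−θ/x))^β with S(0) = 1 (our assumption; the paper's exact PMF may differ), and using a coarse grid search as a stand-in for a proper numerical root-finder; the data vector is illustrative, not from the paper.

```python
import math

def dgie_logpmf(x, beta, theta):
    # Assumed survival-discretization PMF: p(x) = S(x) - S(x+1),
    # with S(t) = (1 - exp(-theta/t))^beta and S(0) = 1.
    def surv(t):
        return 1.0 if t == 0 else (1.0 - math.exp(-theta / t)) ** beta
    return math.log(surv(x) - surv(x + 1))

def loglik(data, beta, theta):
    # Log-likelihood is the sum of log-PMF values over the sample.
    return sum(dgie_logpmf(x, beta, theta) for x in data)

# Coarse grid search over (beta, theta) in (0.5, 5.9) x (0.5, 5.9).
data = [1, 1, 2, 2, 2, 3, 3, 4, 5, 7]   # illustrative counts, not from the paper
candidates = [(loglik(data, b / 10, t / 10), b / 10, t / 10)
              for b in range(5, 60) for t in range(5, 60)]
ll_hat, beta_hat, theta_hat = max(candidates)
```

A real implementation would refine the grid optimum with a Newton-type solver on the score equations.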
We can obtain the solution of these equations numerically; then, we compute Fisher's information matrix by finding the second partial derivatives.
One can infer that the DGIE distribution satisfies the regularity conditions [23]. Then, the MLE vector is asymptotically normal and consistent. Fisher's information matrix can be approximated as shown below, where the entries are evaluated at the MLEs of β and θ [24].
The elements of the Hessian matrix are obtained from
2.4.2. Method of Moments Estimation
We can find the moment estimates (MMEs) of β and θ by solving the equations below, where m1 and m2 represent the first and second sample moments.
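A numerical sketch of the moment-matching idea, again under the assumed survival form S(x) = (1 − e^(−θ/x))^β (the paper's parameterization may differ). It uses the standard identities E[X] = Σ_{x≥1} P(X ≥ x) and E[X²] = Σ_{x≥1} (2x − 1) P(X ≥ x) for nonnegative integer variables, with truncated sums, and matches moments over a grid.

```python
import math

def surv(x, beta, theta):
    # Assumed survival function of the DGIE sketch, P(X >= x) for x >= 1.
    return (1.0 - math.exp(-theta / x)) ** beta

def model_moments(beta, theta, xmax=5000):
    # E[X] = sum S(x); E[X^2] = sum (2x - 1) S(x); both truncated at xmax.
    m1 = m2 = 0.0
    for x in range(1, xmax):
        s = surv(x, beta, theta)
        m1 += s
        m2 += (2 * x - 1) * s
    return m1, m2

def fit_moments(m1_obs, m2_obs, betas, thetas):
    # Pick the grid point whose model moments best match the observed moments.
    def err(b, t):
        m1, m2 = model_moments(b, t)
        return (m1 - m1_obs) ** 2 + (m2 - m2_obs) ** 2
    return min(((err(b, t), b, t) for b in betas for t in thetas))[1:]
```

With sample moments in place of `m1_obs` and `m2_obs`, the same matching step yields the moment estimates.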
3. Results and Discussion
3.1. A Simulation Study
In this section, we assess the performance of the maximum likelihood estimates with respect to the sample size n. The assessment is based on the following simulation study:
(1) Generate 10,000 samples of size n from equation (1). The inversion method is used to generate the samples; that is, variates of the discrete generalized inverted exponential distribution are generated by evaluating the quantile function at u, where u is a uniform variable on the unit interval.
(2) Compute the maximum likelihood estimates for the 10,000 samples.
(3) Compute the biases and mean squared errors given by
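The inversion step above can be sketched as follows, assuming the survival-discretization form S(x) = (1 − e^(−θ/x))^β with S(0) = 1 (our assumption, not necessarily the paper's exact equation (1)); the quantile inversion then has a closed form.

```python
import math
import random

def dgie_pmf(x, beta, theta):
    # Assumed survival-discretization PMF: p(x) = S(x) - S(x+1).
    def surv(t):
        return 1.0 if t == 0 else (1.0 - math.exp(-theta / t)) ** beta
    return surv(x) - surv(x + 1)

def dgie_sample(beta, theta, rng):
    # Inversion method: smallest x with F(x) >= u, solved in closed form
    # from F(x) = 1 - (1 - exp(-theta/(x+1)))^beta.
    u = rng.random()
    a = 1.0 - (1.0 - u) ** (1.0 / beta)
    if a <= 0.0:                      # u == 0 edge case
        return 0
    return max(math.ceil(theta / (-math.log(a)) - 1.0), 0)
```

Repeating the draw n times gives one simulated sample; 10,000 such samples feed the bias and MSE computations.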
The empirical results are given in Table 3, from which the following observations can be made: the magnitude of the bias always decreases to zero as n increases, and the MSEs always decrease to zero as n increases. This shows the consistency of the estimators.
3.1.1. Data Application
We point out here the superiority of the discrete generalized inverted exponential distribution over the geometric, discrete logistic, and discrete Lindley distributions. Two real data sets are used. The first data set, given in Table 4, consists of 30 failure times of the air conditioning system of an airplane; these data are taken from [25].
The MLEs of (β, θ) have been computed in all these cases, along with the Kolmogorov–Smirnov (K–S) statistic and the associated P value for each model. The results are reported in Table 5.
The Akaike information criterion (AIC), corrected Akaike information criterion (CAIC), and Bayesian information criterion (BIC) values for the models have been computed. The results are reported in Table 6. These information criteria indicate that the DGIE distribution fits this data set better than the existing distributions considered.
The data set given in Table 7 consists of uncensored data from [23]. The data give 100 observations on the breaking stress of carbon fibers (in GPa). The MLEs of (β, θ) have been computed in all these cases, along with the Kolmogorov–Smirnov (K–S) statistic and the associated P value for each model. The results are reported in Table 8. A comparison between the observed and fitted distributions is shown in Figures 4 and 5.


The information criteria indicate that the DGIE distribution fits this data set better than the existing distributions considered, as shown in Table 9. For both data sets, the discrete generalized inverted exponential distribution attains the most favorable values, and the distribution plots suggest that it offers the best fit among the competing distributions. On the basis of the tabulated results, we conclude that the discrete generalized inverted exponential distribution provides the best fit compared with its submodels. Summary statistics of data sets 1 and 2 are listed in Table 10.
3.2. Image Segmentation
In this section, we assess the ability of the DGIE distribution to improve image segmentation performance. This is done by using it as a clustering method.
3.2.1. Clustering Problem Formulation for Image Segmentation
In this part, we introduce the mathematical formulation of the automatic clustering (AC)-based image segmentation problem. In general, the main aim of AC is to split the given image into a set of groups. To perform this task, the between-cluster variation must be maximized while the within-cluster variation is minimized.
Therefore, the mathematical representation of AC can be given by dividing the image into K clusters that satisfy the following criteria:
Gaussian mixture models (GMMs): the GMM is one of the most popular clustering techniques, and it has been used as an image segmentation method in different applications, for example, image retrieval [26], analysis of the chemical and physical properties of Italian wines [27, 28], and others [29].
The mathematical formulation of the Gaussian mixture model (GMM) can be given by considering the pixels of the given image as observations of a random variable. The GMM is then defined as
In equation (36), K represents the number of components, and the w_k refer to the mixing weights, which sum to one. In addition, each component density is defined as shown below, where μ_k and σ_k are the mean and standard deviation of class k. For a given image, these parameters need to be estimated, and to achieve this estimation, the Expectation-Maximization (EM) method is used. The steps of EM can be summarized as in Algorithm 1:
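The EM steps can be illustrated with a minimal one-dimensional, two-component sketch (the paper's algorithm operates on image pixels; this toy version uses scalar data, and the initialization and iteration count are our own choices):

```python
import math
import random

def em_gmm_1d(data, iters=50):
    # Minimal EM for a two-component 1-D Gaussian mixture.
    n = len(data)
    w = [0.5, 0.5]                    # mixing weights
    mu = [min(data), max(data)]       # crude mean initialization
    var = [1.0, 1.0]                  # component variances
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [w[k] / math.sqrt(2 * math.pi * var[k])
                    * math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                    for k in range(2)]
            s = sum(dens)
            resp.append([d / s for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)   # guard against variance collapse
    return w, mu, var
```

The DGIE mixture model follows the same E/M alternation with the Gaussian density replaced by the DGIE PMF.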
However, the traditional GMM has some limitations that affect its performance, such as its inability to model all data types, including the discrete data that arise in applications such as machine learning [30]. To avoid these limitations, we use the new discrete generalized inverted exponential distribution. In general, the DGIE is able to overcome the inaccuracy of general-purpose distributions such as the Gaussian mixture.
3.2.2. Dataset Description
In this study, the performance of the developed clustering-based color image segmentation using the DGIE mixture model (DGIEMM) is evaluated on a set of six color images (Figures 6(a)–6(f)) [31]. In addition, we compare the results of DGIEMM with GMM, K-means, and fuzzy subspace clustering (FSC).

(a)

(b)

(c)

(d)

(e)

(f)
3.2.3. Performance Measures
To evaluate the efficacy of the developed image segmentation method, a set of performance measures is used: accuracy, adjusted Rand index, Hubert statistic, and normalized mutual information. The details of these measures are given as follows:
Accuracy: it is a measure used to assess the ability of the method to determine the correct cluster for each pixel. It is formulated as shown below, where TP, FP, TN, and FN are the true positives, false positives, true negatives, and false negatives.
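A minimal sketch of the accuracy computation (the counts below are illustrative, not from the paper; for clustering, predicted labels are first matched to the ground-truth labels before counting):

```python
def accuracy(tp, tn, fp, fn):
    # Accuracy = (TP + TN) / (TP + TN + FP + FN)
    return (tp + tn) / (tp + tn + fp + fn)
```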
Adjusted Rand index: it is a measure used to assess the similarity between two groupings. It is defined as shown below, where n_ij denotes the number of objects in common between classes, and a_i and b_j are the sums of the rows and columns of the contingency table, respectively.
Hubert: it is a measure used to compute the correlation coefficient between the two partitions. It is defined as shown below, where σ_i and σ_j are the standard deviations of cluster i and cluster j, respectively.
Normalized mutual information (NMI): it is defined as a normalization of the mutual information, as shown below, where Y and C are the class labels and the cluster labels, respectively, and H(·) and I(Y, C) denote the entropy and the mutual information between Y and C, respectively.
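A self-contained NMI sketch using the arithmetic-mean normalization (other normalizations, e.g. by the square root of the two entropies, are also common; the paper does not specify which variant it uses):

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a labeling (natural log).
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_info(a, b):
    # I(A; B) = sum over joint cells of p(x,y) * log(p(x,y) / (p(x) p(y))).
    n = len(a)
    joint = Counter(zip(a, b))
    pa, pb = Counter(a), Counter(b)
    return sum((c / n) * math.log(n * c / (pa[x] * pb[y]))
               for (x, y), c in joint.items())

def nmi(a, b):
    # Arithmetic-mean normalization of the mutual information.
    h = (entropy(a) + entropy(b)) / 2
    return mutual_info(a, b) / h if h > 0 else 1.0
```

Identical partitions (up to relabeling) score 1, and independent partitions score 0.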
3.2.4. Results and Discussion
The comparison between the developed color image segmentation method (i.e., DGIEMM) and the other methods is given in Table 11. These results show the high ability of the developed method to cluster the images into their objects compared with the other methods. For example, in terms of accuracy, the values show that DGIEMM has a high ability to assign each pixel to its true label (i.e., the object that contains it). FSC and GMM provide better results than K-means, as can be observed in Figure 7(a), which shows the average over the six tested images.

(a)

(b)

(c)

(d)
In terms of AR, it can be seen that DGIEMM still provides better results than the other methods. The same observations hold for the other three measures (i.e., RI, NMI, and Hubert); Figures 7(b)–7(d) also show the superiority of DGIEMM.
To justify the superiority of DGIEMM, the nonparametric Friedman test is used. In general, this test is applied to decide whether the difference between DGIEMM and the other methods is significant. There are two hypotheses: the null hypothesis assumes that there is no difference between the tested methods, while the alternative hypothesis assumes that there is a difference. We accept the alternative hypothesis when the obtained P value is less than the significance level of 0.05.
Table 12 shows the mean ranks obtained using the Friedman test in terms of the performance measures (i.e., accuracy, AR, RI, Hubert, and NMI). From these values, it can be seen that the developed color image segmentation method has the highest mean rank across the performance measures. In addition, FSC attains the second-best mean rank, followed by K-means, which provides better results than the traditional GMM. Finally, Figure 8 shows an example of an image segmented by the competing algorithms.

4. Concluding Remarks
In this study, a new two-parameter distribution for modeling large numbers of observations in nature has been presented. It is constructed from the continuous generalized inverted exponential distribution and is therefore called the discrete generalized inverted exponential (DGIE) distribution. Some important probabilistic properties of this distribution have been studied. Using two methods, namely the method of moments and the maximum likelihood technique, the parameters of the distribution have been estimated. To evaluate the quality of the DGIE, a set of experiments has been conducted using synthetic and real data. The results show the efficiency of the DGIE in fitting data better than some existing distributions in the case of synthetic data. In addition, the DGIE has been applied to clustering-based image segmentation, which aims to avoid the limitations of the traditional Gaussian mixture model (GMM); this is achieved by using the DGIE instead of the Gaussian distribution. The developed image segmentation method has demonstrated its performance on a set of color images, providing better results than GMM, K-means, and fuzzy subspace clustering (FSC).
According to these properties and results, the DGIE can be applied to a wide range of applications, including reliability, physics, and machine learning.
Data Availability
The data used to support the findings of this study are available from the authors upon request.
Conflicts of Interest
The authors declare no conflict of interest.