Estimation Using Suggested EM Algorithm Based on Progressively Type-II Censored Samples from a Finite Mixture of Truncated Type-I Generalized Logistic Distributions with an Application

Ateya, Saieed F.; Kilai, Mutua; Aldallal, Ramy

doi:https://doi.org/10.1155/2022/1720033

Mathematical Problems in Engineering


Special Issue

Application of Mathematical Methods in Nature-Inspired Computation


Research Article | Open Access

Volume 2022 | Article ID 1720033 | https://doi.org/10.1155/2022/1720033


Saieed F. Ateya,¹,² Mutua Kilai,³ and Ramy Aldallal⁴

Academic Editor: Aida Mustapha

Received 27 Feb 2022

Accepted 08 Apr 2022

Published 11 May 2022

Abstract

In this paper, the identifiability property is studied for a suggested truncated type-I generalized logistic mixture model. A suggested form of the EM algorithm is applied to type-II progressive censored samples to obtain the maximum likelihood estimates of the parameters, survival function, and hazard rate function of the studied mixture model. A Monte Carlo simulation is used to study the behavior of the mean squared errors of the estimates. A comparative study is also conducted between the suggested EM algorithm and the ordinary algorithm for maximizing the likelihood function, which depends on differentiating the log-likelihood function. The results of this paper are applied to a real dataset.

1. Introduction

The progressive type-II censored model is a very important model in the field of reliability and life testing (see [1]). This censoring model can be shown as follows.

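Consider a lifetime test in which $n$ identical units are tested. Once the $i$th failure has occurred, $R_i$ surviving units are removed randomly from the experiment, $i = 1, 2, \ldots, m$. Thus, if the number of observed failures is $m$, then $R_1 + R_2 + \cdots + R_m$ units are progressively censored, and hence $n = m + R_1 + R_2 + \cdots + R_m$, and $x_{1:m:n} < x_{2:m:n} < \cdots < x_{m:m:n}$ describe the progressively censored failure times. The likelihood function based on the type-II progressive censored data, written for simplicity as $\mathbf{x} = (x_1, x_2, \ldots, x_m)$, is given by

$$L(\theta; \mathbf{x}) = A \prod_{i=1}^{m} f(x_i; \theta)\,[S(x_i; \theta)]^{R_i}, \tag{1}$$

where $A = n(n - 1 - R_1)(n - 2 - R_1 - R_2) \cdots (n - m + 1 - R_1 - \cdots - R_{m-1})$ (see [1]). The functions $f(x; \theta)$ and $S(x; \theta)$ are the probability density function (pdf) and the survival function (SF) of the studied distribution at a value $x$.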
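As an illustration of the progressive type-II likelihood described above, the following minimal Python sketch evaluates its logarithm, $\log A + \sum_i [\log f(x_i) + R_i \log S(x_i)]$, for any distribution supplied through its log-pdf and log-SF. The function names `progressive_constant` and `progressive_loglik` are hypothetical and do not come from the paper; the unit-rate exponential distribution in the usage lines is only a convenient stand-in.

```python
import math
from typing import Callable, Sequence


def progressive_constant(n: int, R: Sequence[int]) -> int:
    """Constant A = n (n-1-R_1) (n-2-R_1-R_2) ... (n-m+1-R_1-...-R_{m-1})."""
    A, removed = 1, 0
    for i in range(len(R)):
        A *= n - i - removed  # units still on test just before failure i+1
        removed += R[i]       # R_{i+1} units are withdrawn after that failure
    return A


def progressive_loglik(
    x: Sequence[float],
    R: Sequence[int],
    logpdf: Callable[[float], float],
    logsf: Callable[[float], float],
) -> float:
    """Log of the progressive type-II censored likelihood."""
    if len(x) != len(R):
        raise ValueError("x and R must have the same length m")
    n = len(x) + sum(R)  # total number of units placed on test
    ll = math.log(progressive_constant(n, R))
    for xi, Ri in zip(x, R):
        ll += logpdf(xi) + Ri * logsf(xi)
    return ll


# Usage with a unit-rate exponential stand-in: log f(x) = -x and log S(x) = -x.
x = [0.5, 1.0, 2.0]
R = [1, 0, 2]
print(progressive_loglik(x, R, lambda t: -t, lambda t: -t))
```

With all removal numbers equal to zero, the constant reduces to $n!$ and the expression becomes the usual complete-sample likelihood of the ordered data.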

The importance of mixture models appears in both theoretical and applied fields when the population under study is heterogeneous. For details about mixture models, see [2–5].

The contributions of this paper are suggesting an EM algorithm suitable for estimation based on progressively type-II censored samples from a finite mixture of distributions, studying the identifiability property of a finite mixture of truncated type-I generalized logistic distributions, and finally using a real dataset as an application.

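A random variable $X$ is said to have a finite mixture of $k$ component distributions with pdfs $f_j(x; \theta_j)$, $j = 1, 2, \ldots, k$, and vector of parameters $\theta = (p_1, \ldots, p_k, \theta_1, \ldots, \theta_k)$ if its pdf is given by

$$f(x; \theta) = \sum_{j=1}^{k} p_j f_j(x; \theta_j),$$

where $0 \le p_j \le 1$ and $\sum_{j=1}^{k} p_j = 1$.

The cumulative distribution function (cdf) and survival function (SF) are

$$F(x; \theta) = \sum_{j=1}^{k} p_j F_j(x; \theta_j), \qquad S(x; \theta) = \sum_{j=1}^{k} p_j S_j(x; \theta_j).$$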

This paper is organized as follows. In Section 2, a suggested form of the EM algorithm is introduced to compute the maximum likelihood estimates (MLEs) of the parameters of a finite mixture of distributions based on progressively type-II censored samples. In Section 3, the identifiability property of the finite mixture of truncated type-I generalized logistic distributions is studied using Chandra’s theorem in [6]. In Section 4, the suggested form of the EM algorithm is applied to a finite mixture of truncated type-I generalized logistic distributions. In Section 5, the main results are presented. Finally, concluding remarks are given in Section 6.

2. Maximum Likelihood Estimation Using EM Algorithm

It is generally known that the maximum likelihood estimate (MLE) of the parameter vector $\theta$ maximizes the logarithm of the likelihood function (1). The log-likelihood function is challenging to optimize since it includes the logarithm of a sum and has a large number of parameters.

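In this section, we will employ McLachlan and Krishnan’s idea [7] of missing data. The vector which represents the missing data is $\mathbf{z} = (z_1, z_2, \ldots, z_m)$, where $z_i = (z_{i1}, z_{i2}, \ldots, z_{ik})$ and $z_{ij}$ equals 1 if $x_i$ comes from the $j$th component and 0 otherwise.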

Based on the observed data $\mathbf{x}$ and the missing data $\mathbf{z}$, we can write what is known as the complete-data likelihood function.

In this paper, the E and M steps of the suggested algorithm may be written as follows.

2.1. E Step

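In this step, the function $Q$, the conditional expectation of the complete-data log-likelihood given the observed data, is evaluated at $\theta^{(0)}$, where $\theta^{(0)}$ is an initial value for $\theta$.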

2.2. M Step

By maximizing the function $Q$ with respect to the mixing proportions and the component parameters, we get the updated value of $\theta$.

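The E and M steps should be repeated until the change in the likelihood value between successive iterations becomes sufficiently small. In this case, the final iterate will be the MLE of $\theta$, denoted by $\hat{\theta}$.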
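The alternation of E and M steps can be sketched in Python as follows. This is not the authors' suggested algorithm: to keep the M step in closed form, exponential components are swapped in for the components studied in the paper, and only the component labels of the failed and removed units are treated as missing data. All function and variable names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def mixture_loglik(x, R, p, rates) -> float:
    """Progressive type-II log-likelihood of an exponential mixture,
    omitting the constant log A, which does not depend on the parameters."""
    ll = 0.0
    for xi, Ri in zip(x, R):
        f = sum(pj * r * math.exp(-r * xi) for pj, r in zip(p, rates))
        s = sum(pj * math.exp(-r * xi) for pj, r in zip(p, rates))
        ll += math.log(f) + Ri * math.log(s)
    return ll


def em_step(x, R, p, rates) -> Tuple[List[float], List[float]]:
    """One E step (posterior label probabilities) and one M step."""
    k = len(p)
    n = len(x) + sum(R)
    w_fail = [0.0] * k    # expected number of observed failures per component
    w_all = [0.0] * k     # expected number of units (failed or removed)
    exposure = [0.0] * k  # expected total time on test per component
    for xi, Ri in zip(x, R):
        f_parts = [pj * r * math.exp(-r * xi) for pj, r in zip(p, rates)]
        s_parts = [pj * math.exp(-r * xi) for pj, r in zip(p, rates)]
        f, s = sum(f_parts), sum(s_parts)
        for j in range(k):
            tau = f_parts[j] / f    # P(label = j | unit failed at xi)
            tau_c = s_parts[j] / s  # P(label = j | unit removed at xi)
            w_fail[j] += tau
            w_all[j] += tau + Ri * tau_c
            exposure[j] += (tau + Ri * tau_c) * xi
    new_p = [w / n for w in w_all]
    new_rates = [w_fail[j] / exposure[j] for j in range(k)]
    return new_p, new_rates


def fit_em(x, R, p0, rates0, tol=1e-8, max_iter=1000):
    """Repeat E and M steps until the log-likelihood change is below tol."""
    p, rates = list(p0), list(rates0)
    ll = mixture_loglik(x, R, p, rates)
    history = [ll]
    for _ in range(max_iter):
        p, rates = em_step(x, R, p, rates)
        new_ll = mixture_loglik(x, R, p, rates)
        history.append(new_ll)
        if abs(new_ll - ll) < tol:
            break
        ll = new_ll
    return p, rates, history


# Usage on a small made-up censored sample (illustrative values only).
x = [0.1, 0.3, 0.8, 0.9, 1.2, 2.5, 4.0, 6.2, 9.5, 13.0]
R = [1, 0, 0, 2, 0, 0, 1, 0, 0, 3]
p_hat, rates_hat, history = fit_em(x, R, [0.5, 0.5], [1.0, 0.1])
print(p_hat, rates_hat, len(history) - 1)
```

Because each iteration is a genuine EM update for this simplified model, the recorded log-likelihood values should not decrease from one iteration to the next, which is the property the stopping rule relies on.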

3. Identifiability of the Finite Mixture of Truncated Type-I Generalized Logistic Distributions

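A random variable $X$ is said to follow a finite mixture of $k$ truncated type-I generalized logistic distributions if its pdf, referred to below as (12), is a mixture of $k$ component pdfs of that family with mixing proportions $p_j$, where $0 \le p_j \le 1$ and $\sum_{j=1}^{k} p_j = 1$.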

Al-Hussaini and Ateya [8, 9] studied the estimation problem under a finite mixture of truncated type-I generalized logistic distributions using the classical and Bayes methods based on complete classified type-I censoring scheme. Ateya and Alharthi [10] studied the estimation problem under a finite mixture of modified Weibull distributions under type-I, type-II, and type-II progressive censoring schemes using the ordinary likelihood method, which depends on differentiating the log-likelihood function with respect to the parameters, without studying the identifiability property. Ateya and Alharthi [11] also studied the estimation problem under the same mixture model under type-I and type-II censoring schemes using the EM algorithm, without studying the type-II progressive censoring case. Ateya [12] studied the identifiability property and the estimation problem using the EM algorithm under a finite mixture of generalized exponential distributions under type-I and type-II censoring schemes, without studying the type-II progressive censoring case. For more details about the truncated type-I generalized logistic distribution and its mixtures, see [8, 9, 13–15]. In our study, we take $k = 2$, so the vector of parameters consists of one mixing proportion and the parameters of the two components.

For a value of the random variable , let

So, (12) and its corresponding and can be written in the following forms (with ):

In the next section, we will write instead of and instead of .

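The survival function (SF) and hazard rate function (HRF) of the finite mixture can be written as

$$S(x) = \sum_{j=1}^{k} p_j S_j(x)$$

and

$$h(x) = \frac{f(x)}{S(x)} = \frac{\sum_{j=1}^{k} p_j f_j(x)}{\sum_{j=1}^{k} p_j S_j(x)}.$$

Note that the SF of a finite mixture of distributions is a finite mixture of the SFs of the component distributions, but this is not true of the HRF.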
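A small numerical check may make this remark concrete. The sketch below uses a two-component exponential mixture as a stand-in (an assumption chosen because its component hazard rates are constant, not the model of the paper) and compares the hazard rate of the mixture with the weighted average of the component hazard rates. All names are hypothetical.

```python
import math
from typing import Sequence


def mixture_pdf(x: float, p: Sequence[float], rates: Sequence[float]) -> float:
    """pdf of the mixture: weighted sum of the component pdfs."""
    return sum(pj * r * math.exp(-r * x) for pj, r in zip(p, rates))


def mixture_sf(x: float, p: Sequence[float], rates: Sequence[float]) -> float:
    """Survival function of the mixture: weighted sum of the component SFs."""
    return sum(pj * math.exp(-r * x) for pj, r in zip(p, rates))


def mixture_hrf(x: float, p: Sequence[float], rates: Sequence[float]) -> float:
    """Hazard rate of the mixture: ratio of the mixture pdf to the mixture SF."""
    return mixture_pdf(x, p, rates) / mixture_sf(x, p, rates)


def averaged_component_hrf(p: Sequence[float], rates: Sequence[float]) -> float:
    """Weighted average of the (constant) exponential component hazard rates."""
    return sum(pj * r for pj, r in zip(p, rates))


p, rates = [0.5, 0.5], [1.0, 3.0]
for x in (0.0, 1.0, 5.0):
    print(x, mixture_hrf(x, p, rates), averaged_component_hrf(p, rates))
```

The two quantities agree at $x = 0$ and then separate: the mixture hazard rate drifts toward the smaller component rate as $x$ grows, because units from the component with the larger rate tend to fail earlier.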

It is very important to know that the statistical inference problem for the parameters in case of the mixture distributions cannot be discussed before proving the identifiability property (see [5]).

The identifiability property has been explored by a variety of authors [12, 16–27].

In this section, the identifiability property of the suggested mixture has been proved using Chandra’s theorem in [6].

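Theorem 1 (see [6]). Let $\Phi$ be a class of cdfs with elements $F_j$, and let $M : F_j \mapsto \phi_j$ be a linear mapping, where the transform $\phi_j$ has domain $D_{\phi_j}$. Assume that there is a total ordering $\preceq$ of $\Phi$ such that
(1) $F_1 \preceq F_2$ implies $D_{\phi_1} \subseteq D_{\phi_2}$.
(2) For each $F_1 \in \Phi$, there exists $s_1$ in the closure of $T_1 = \{s : \phi_1(s) \neq 0\}$ such that $\lim_{s \to s_1} \phi_2(s)/\phi_1(s) = 0$ for all $F_1 \prec F_2$, $F_1, F_2 \in \Phi$.
Then, the class of all finite mixtures of elements of $\Phi$ is identifiable relative to $\Phi$.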

Proposition 1. The class of all finite mixtures of truncated type-I generalized logistic distributions is identifiable.

Proof. Let , with given by (12). Define the transform as a moment generating function of the , which can be written aswhereis the incomplete beta (type-II) function.
Also, the of and can be written in the following forms:andFrom (17) and (18), we can see that and .
Now, we will make sure that the two conditions of the previous theorem are met.

Condition 1. The family of all may be ordered as follows: and , implying that . As a result, it is easy to show that

Condition 2. If we take , then .
This means that all finite mixtures of truncated type-I generalized logistic distributions are identifiable.

4. Maximum Likelihood Estimation under a Finite Mixture of Truncated Type-I Generalized Logistic Distributions Based on Type-II Progressive Censoring Scheme

In this section, the maximum likelihood estimates (MLEs) of the parameters of a finite mixture of truncated type-I generalized logistic distributions are obtained using the results and formulas from Sections 2 and 3.

The MLEs of all parameters of the finite mixture can be obtained by iterating the suggested EM algorithm, using the quantities defined in Section 3 and starting from initial values of the parameters.

As mentioned in Section 2, after the sequence of likelihood values converges, the final iterate will be the maximum likelihood estimate (MLE) of $\theta$, denoted by $\hat{\theta}$.

The MLEs of the survival function and the hazard rate function can be obtained by replacing each parameter by its MLE in each function.

5. Main Results

In this section, a Monte Carlo simulation is used to compare the suggested EM algorithm with the ordinary method using the mean squared error (MSE) criterion. At the end of this section, the suggested algorithm is applied to a real dataset.

5.1. Simulated Results

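In this section, a type-II progressive censored sample from a mixture of two truncated type-I generalized logistic distributions is generated for different censoring schemes as follows:
(1) Generate $m$ independent random variates $W_1, W_2, \ldots, W_m$ from the uniform distribution on $(0, 1)$.
(2) Define $V_i = W_i^{1/(i + R_m + R_{m-1} + \cdots + R_{m-i+1})}$, $i = 1, 2, \ldots, m$.
(3) Define $U_i = 1 - V_m V_{m-1} \cdots V_{m-i+1}$, $i = 1, 2, \ldots, m$, which represent a progressive type-II censored sample from the uniform distribution on $(0, 1)$.
(4) Generate a random variate $u$ from the uniform distribution on $(0, 1)$.
(5) If $u \le p$, generate the observation from the first component using its inverse cdf; otherwise, generate it from the second component using its inverse cdf.
(6) Based on the generated type-II progressive censored sample and for different censoring schemes, the maximum likelihood estimates of all parameters and functions are obtained using the suggested form of the EM algorithm and also using the ordinary algorithm for maximizing the likelihood function.
(7) Over the generated samples, the mean squared errors (MSEs) of all estimates are computed based on the suggested and ordinary algorithms of estimation.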
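The first three steps of this procedure can be sketched in Python as below. The sketch produces the uniform progressive type-II censored sample for a given removal scheme and then maps it through an inverse cdf; an exponential quantile function is used purely as a stand-in for the component inverse cdfs of the paper, and all names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def progressive_uniform_sample(R: Sequence[int], rng: random.Random) -> List[float]:
    """Uniform(0, 1) progressive type-II censored sample for removal scheme R."""
    m = len(R)
    # Step 1: m independent uniforms; 1 - random() lies in (0, 1], never 0.
    W = [1.0 - rng.random() for _ in range(m)]
    # Step 2: V_i = W_i ** (1 / (i + R_m + R_{m-1} + ... + R_{m-i+1})).
    V = []
    for i in range(1, m + 1):
        tail = sum(R[m - i:])  # sum of the last i removal numbers
        V.append(W[i - 1] ** (1.0 / (i + tail)))
    # Step 3: U_i = 1 - V_m V_{m-1} ... V_{m-i+1}.
    U, prod = [], 1.0
    for i in range(1, m + 1):
        prod *= V[m - i]  # multiply in V_{m-i+1} (0-based index m - i)
        U.append(1.0 - prod)
    return U


def exponential_quantile(u: float, rate: float) -> float:
    """Inverse cdf of the exponential distribution (stand-in component)."""
    return -math.log(1.0 - u) / rate


rng = random.Random(2022)
R = [2, 0, 1, 0, 3]  # m = 5 observed failures out of n = 5 + 6 = 11 units
U = progressive_uniform_sample(R, rng)
x = [exponential_quantile(u, rate=0.5) for u in U]
print(U)
print(x)
```

Because the running product of the $V$ values can only shrink, the resulting $U_i$ come out in nondecreasing order, and any increasing inverse cdf preserves that order in the transformed sample.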

In our study, using the suggested EM algorithm, the parameters, survival function, and hazard rate function have been estimated based on simulated type-II progressive censored samples with different values of $n$ and $m$ to study the behavior of the mean squared errors (MSEs) of the estimates.

The average estimates and MSEs under the suggested algorithm are summarized in Tables 1 and 2. Also, the MSEs of all estimates using the suggested and ordinary algorithms are summarized in Tables 3 and 4 as comparative results.

From the results in Tables 1–4, we can conclude that for fixed $n$, the MSEs decrease as $m$ increases, and for fixed $m$, the MSEs increase as $n$ increases. Also, the scheme with no removals represents the complete sample case, in which $m = n$.

From the results in Tables 3 and 4, we can conclude that for given values of $n$ and $m$, the MSEs obtained with the suggested algorithm are smaller than those obtained with the ordinary algorithm, which means that the suggested algorithm performs better than the ordinary algorithm.
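For reference, the MSE criterion used in these comparisons is the average squared deviation of the estimates from the true parameter value over the simulated samples. A minimal Python sketch follows; the function name and the numbers in the usage line are illustrative only and are not results from the paper.

```python
from typing import Sequence


def mean_squared_error(estimates: Sequence[float], true_value: float) -> float:
    """Average of (estimate - true value)^2 over the simulated samples."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


# Illustrative values only: three estimates of a parameter whose true value is 2.
print(mean_squared_error([1.0, 2.0, 3.0], 2.0))
```

Comparing two estimation methods under this criterion amounts to computing the function once per method on estimates obtained from the same simulated samples.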

5.2. An Application

In this section, two real datasets, each modeled by a truncated type-I generalized logistic distribution, are presented. They represent the waiting times (in minutes) before customer service in two different banks (see [28]).

The first real dataset is as follows: 0.8, 0.8, 1.3, 1.5, 1.8, 1.9, 1.9, 2.1, 2.6, 2.7, 2.9, 3.1, 3.2, 3.3,3.5, 3.6, 4.0, 4.1, 4.2, 4.2, 4.3, 4.3, 4.4, 4.4, 4.6, 4.7, 4.7, 4.8, 4.9, 4.9, 5.0, 5.3, 5.5, 5.7, 5.7, 6.1, 6.2, 6.2, 6.2, 6.3, 6.7, 6.9, 7.1, 7.1, 7.1, 7.1, 7.4, 7.6, 7.7, 8.0, 8.2, 8.6, 8.6, 8.6, 8.8, 8.8, 8.9, 8.9, 9.5, 9.6, 9.7, 9.8, 10.7, 10.9, 11.0, 11.0, 11.1, 11.2, 11.2, 11.5, 11.9, 12.4, 12.5, 12.9, 13.0, 13.1, 13.3, 13.6, 13.7, 13.9, 14.1, 15.4, 15.4, 17.3, 17.3, 18.1, 18.2, 18.4, 18.9, 19.0, 19.9, 20.6, 21.3, 21.4, 21.9, 23.0, 27.0, 31.6, 33.1, and 38.5.

Also, the second real dataset is as follows: 0.1, 0.2, 0.3, 0.7, 0.9, 1.1, 1.2, 1.8, 1.9, 2.0, 2.2, 2.3, 2.3, 2.3, 2.5, 2.6, 2.7, 2.7, 2.9, 3.1, 3.1, 3.2, 3.4, 3.4, 3.5, 3.9, 4.0, 4.2, 4.5, 4.7, 5.3, 5.6, 5.6, 6.2, 6.3, 6.6, 6.8, 7.3, 7.5, 7.7, 7.7, 8.0, 8.0, 8.5, 8.5, 8.7, 9.5, 10.7, 10.9, 11.0, 12.1, 12.3, 12.8, 12.9, 13.2, 13.7, 14.5, 16.0, 16.5, and 28.0.

Al-Mutairi et al. [29] showed that the Lindley distribution fits these data well, and Rao [28] showed that truncated type-I generalized logistic distributions fit the two datasets reasonably well.

By combining the two real datasets, the new ordered real dataset will be 0.1, 0.2, 0.3, 0.7, 0.8, 0.8, 0.9, 1.1, 1.2, 1.3, 1.5, 1.8, 1.8, 1.9, 1.9, 1.9, 2.0, 2.1, 2.2, 2.3, 2.3, 2.3, 2.5, 2.6, 2.6, 2.7, 2.7, 2.7, 2.9, 2.9, 3.1, 3.1, 3.1, 3.2, 3.2, 3.3, 3.4, 3.4, 3.5, 3.5, 3.6, 3.9, 4.0, 4.0, 4.1, 4.2, 4.2, 4.2, 4.3, 4.3, 4.4, 4.4, 4.5, 4.6, 4.7, 4.7, 4.7, 4.8, 4.9, 4.9, 5.0, 5.3, 5.3, 5.5, 5.6, 5.6, 5.7, 5.7, 6.1, 6.2, 6.2, 6.2, 6.2, 6.3, 6.3, 6.6, 6.7, 6.8, 6.9, 7.1, 7.1, 7.1, 7.1, 7.3, 7.4, 7.5, 7.6, 7.7, 7.7, 7.7, 8.0, 8.0, 8.0, 8.2, 8.5, 8.5, 8.6, 8.6, 8.6, 8.7, 8.8, 8.8, 8.9, 8.9, 9.5, 9.5, 9.6, 9.7, 9.8, 10.7, 10.7, 10.9, 10.9, 11.0, 11.0, 11.0, 11.1, 11.2, 11.2, 11.5, 11.9, 12.1, 12.3, 12.4, 12.5, 12.8, 12.9, 12.9, 13.0, 13.1, 13.2, 13.3, 13.6, 13.7, 13.7, 13.9, 14.1, 14.5, 15.4, 15.4, 16.0, 16.5, 17.3, 17.3, 18.1, 18.2, 18.4, 18.9, 19.0, 19.9, 20.6, 21.3, 21.4, 21.9, 23.0, 27.0, 28.0, 31.6, 33.1, and 38.5.

In this paper, the combined real dataset of size $n = 160$ is analyzed using a mixture of two truncated type-I generalized logistic distributions. The estimated parameters of the mixture model, the associated Kolmogorov-Smirnov (K-S) test statistic, and the $p$ value are summarized in Table 5.

It is clear that the computed K-S test statistic is less than the critical value of the K-S test statistic at a significance level of 0.05, which is equal to 0.108; also, the computed $p$ value is greater than the chosen significance level (0.05), which means that the suggested mixture model fits the combined real dataset quite well.
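The quoted critical value can be checked with a short calculation, assuming that the test is the one-sample Kolmogorov-Smirnov test, that the combined sample size is 160 (the 100 and 60 observations listed above), and that the common large-sample approximation $1.36/\sqrt{n}$ for the 0.05 level is used. These assumptions are ours; the function name is hypothetical.

```python
import math


def ks_critical_value(n: int, coefficient: float = 1.36) -> float:
    """Large-sample approximation to the K-S critical value at the 0.05 level."""
    return coefficient / math.sqrt(n)


n = 100 + 60  # sizes of the first and second bank datasets
print(round(ks_critical_value(n), 3))  # prints 0.108
```

The result agrees with the critical value of 0.108 quoted in the text.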

For further illustration, Figure 1 shows the histogram of the real data and the fitted pdf of the suggested mixture model computed at the estimated parameters.

Also, Figure 2 shows the fitted and the empirical of the suggested mixture model, where the dotted curve represents the empirical curve and the continuous curve represents the fitted curve computed at the estimated parameters.

Three type-II progressive censored samples are generated from the combined real dataset using the schemes = , , and . The three samples, respectively, are as follows: Sample (1): 0.1, 0.3, 0.8, 0.9, 1.2, 1.5, 1.8, 1.9, 2.0, 2.2, 2.3, 2.5, 2.6, 2.7, 2.9, 3.1, 3.1, 3.2, 3.4, 3.5, 3.6, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.7, 4.7, 4.9, 5.0, 5.3, 5.6, 5.7, 6.1, 6.2, 6.2, 6.3, 6.7, 6.9, 7.1, 7.1, 7.4, 7.6, 7.7, 8.0, 8.0, 8.5, 8.6, 8.6, 8.8, 8.9, 9.5, 9.6, 9.8, 10.7, 10.9, 11.0, 11.1, 11.2, 11.9, 12.3, 12.5, 12.9, 13.0, 13.2, 13.6, 13.7, 14.1, 15.4, 16.0, 17.3, 18.1, 18.4, 19.0, 20.6, 21.4, 23.0, 28.0, and 33.1 Sample (2): 0.1, 0.3, 0.8, 0.9, 1.2, 1.5, 1.8, 1.9, 2.0, 2.2, 2.3, 2.5, 2.6, 2.7, 2.9, 3.1, 3.1, 3.2, 3.4, 3.5, 3.6, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.7, 4.7, 4.9, 5.0, 5.3, 5.6, 5.7, 6.1, 6.2, 6.2, 6.3, 6.7, 6.9, 7.1, 7.1, 7.4, 7.6, 7.7, 8.0, 8.0, 8.5, 8.6, 8.6, 8.8, 8.9, 9.5, 9.6, 9.8, 10.7, 10.9, 11.0, 11.1, 11.2, 11.9, 12.1, 12.3, 12.4, 12.5, 12.8, 12.9, 12.9, 13.0, 13.1, 13.2, 13.3, 13.6, 13.7, 13.7, 13.9, 14.1, 14.5, 15.4, 15.4, 16.0, 16.5, 17.3, 18.1, 18.4, 19.0, 20.6, 21.4, 23.0, 28.0, and 33.1 Sample (3): 0.1, 0.3, 0.8, 0.9, 1.2, 1.5, 1.8, 1.9, 2.0, 2.2, 2.3, 2.5, 2.6, 2.7, 2.9, 3.1, 3.1, 3.2, 3.4, 3.5, 3.6, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.7, 4.7, 4.9, 5.0, 5.3, 5.6, 5.7, 6.1, 6.2, 6.2, 6.3, 6.7, 6.9, 7.1, 7.1, 7.4, 7.6, 7.7, 8.0, 8.0, 8.5, 8.6, 8.6, 8.8, 8.8, 8.9, 8.9, 9.5, 9.5, 9.6, 9.7, 9.8, 10.7, 10.7, 10.9, 10.9, 11.0, 11.0, 11.0, 11.1, 11.2, 11.2, 11.5, 11.9, 12.1, 12.3, 12.4, 12.5, 12.8, 12.9, 12.9, 13.0, 13.1, 13.2, 13.3, 13.6, 13.7, 13.7, 13.9, 14.1, 14.5, 15.4, 15.4, 16.0, 17.3, 18.1, 18.4, 19.0, 20.6, 21.4, 23.0, 28.0, and 33.1.

Then, all parameters, the survival function, and the hazard rate function are estimated. The results are given in Tables 6 and 7.

6. Conclusions

In this paper, based on type-II progressive censored samples, the estimation problem for the parameters of a finite mixture of truncated type-I generalized logistic distributions has been studied using a suggested EM algorithm, after studying the identifiability property of the mixture. A comparative study has been carried out between the suggested algorithm and the ordinary algorithm for maximizing the likelihood function, and it is found that the suggested algorithm performs better than the ordinary algorithm. This can be interpreted as follows: the accuracy of the estimates obtained with the ordinary algorithm decreases when the number of parameters is large, as in our case. Finally, the results of the paper have been applied to simulated and real data.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

N. Balakrishnan and R. Aggarwala, Progressive Censoring: Theory, Methods, and Applications, Springer, Berlin, Germany, 2000.
Y. K. Bozidar, C. Gauss, M. O. Edwin, and A. R. F. Marcelino, “A new extended mixture normal distribution,” Mathematical Communications, vol. 22, pp. 53–73, 2017.
G. J. McLachlan and D. Peel, Finite Mixture Models, Wiley, New York, USA, 2000.
S. Kumar C and L. Manju, “A note on logistic mixture distributions,” Biostatistics and Biometrics Open Access Journal, vol. 2, no. 5, pp. 555–598, 2017.
D. M. Titterington, A. F. M. Smith, and U. E. Makov, Statistical Analysis of Finite Mixture Distributions, Wiley, New York, USA, 1985.
S. Chandra, “On mixture of probability distributions,” Scandinavian Journal of Statistics, vol. 4, pp. 105–112, 1977.
G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, Wiley, New York, USA, 1997.
E. K. Al-Hussaini and S. F. Ateya, “Maximum likelihood estimations under a mixture of truncated type I generalized logistic components model,” J. Statist.Th. Appl., vol. 2, pp. 47–60, 2003.
E. K. Al-Hussaini and S. F. Ateya, “Bayes estimations under a mixture of truncated type I generalized logistic components model,” J. Statist. Th. Appl., vol. 4, pp. 183–208, 2005.
S. F. Ateya and A. S. Alharthi, “Maximum likelihood estimation under a finite mixture of modified Weibull distributions based on censored data with application,” JASS, vol. 20, pp. 231–239, 2014a.
S. F. Ateya and A. S. Alharthi, “Estimation under a finite mixture of modified Weibull distributions based on censored data via EM algorithm with application,” Journal of Statistical Theory and Applications, vol. 13, no. 3, pp. 196–204, 2014b.
S. F. Ateya, “Maximum likelihood estimation under a finite mixture of generalized exponential distributions based on censored data,” Statistical Papers, vol. 55, no. 2, pp. 311–325, 2014.
A. Al-Angary, “Truncated Logistic Distributions as Lifetime Models,” Department of Statistics, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia, 1997, M. Sc. Thesis.
E. K. Al-Hussaini, G. R. Al-Dayan, and A. Al-Angary, “Bayesian prediction bounds under the truncated type I generalized logistic model,” J. Egyptian Math. Soc., vol. 14, pp. 55–67, 2006.
S. F. Ateya and A. E.-B. A. Ahmad, “Inferences based on generalized order statistics under truncated type-I generalized logistic distribution,” Statistics, vol. 47, no. 1, pp. 141–155, 2013.
K. E. Ahmad, “Identifiability of finite mixtures using a new transform,” Annals of the Institute of Statistical Mathematics, vol. 40, no. 2, pp. 261–265, 1988.
K. E. Ahmad and E. K. Al-Hussaini, “Remarks on the non-identifiability of mixtures of distributions,” Annals of the Institute of Statistical Mathematics, vol. 34, no. 3, pp. 543-544, 1982.
A. S. Al-Moisheer, “A mixture of two burr type III distributions: identifiability and estimation under type II censoring,” Mathematical Problems in Engineering, vol. 2016, Article ID 7035279, 12 pages, 2016.
G. Menges, “Three essays in econometrics,” Statistische Hefte, vol. 4, no. 1, pp. 1–37, 1963.
N. C. Mohanty, “On the identifiability of finite mixture of Laguerre distributions,” IEEE Transactions on Information Theory, vol. 18, pp. 514-515, 1972.
C. E. G. Otiniano, C. R. Gonçalves, and C. C. Y. Dorea, “Mixture of extreme-value distributions: identifiability and estimation,” Communications in Statistics - Theory and Methods, vol. 46, no. 13, pp. 6528–6542, 2017.
G. L. M. Pezzott, L. E. B. Salasar, J. G. Leite, and F. Louzada-Neto, “A note on identifiability and maximum likelihood estimation for a heterogeneous capture-recapture model,” Communications in Statistics - Theory and Methods, vol. 49, no. 21, pp. 5273–5293, 2020.
R. R. Rennie, “On the interdependence of the identifiability of multivariate mixtures and the identifiability of the marginal mixtures,” Sankhya A, vol. 34, pp. 449–452, 1972.
H. Teicher, “Identifiability of mixtures,” The Annals of Mathematical Statistics, vol. 32, no. 1, pp. 244–248, 1961.
H. Teicher, “Identifiability of finite mixtures,” The Annals of Mathematical Statistics, vol. 34, no. 4, pp. 1265–1269, 1963.
H. Teicher, “Identifiability of mixtures of product measures,” The Annals of Mathematical Statistics, vol. 38, no. 4, pp. 1300–1302, 1967.
S. J. Yakowitz and J. D. Spragins, “On the identifiability of finite mixtures,” The Annals of Mathematical Statistics, vol. 39, no. 1, pp. 209–214, 1968.
G. S. Rao, “Estimation of stress-strength reliability from truncated type-I generalised logistic distribution,” International Journal of Mathematics in Operational Research, vol. 7, no. 4, pp. 372–381, 2015.
D. K. Al-Mutairi, M. E. Ghitany, and D. Kundu, “Inferences on stress-strength reliability from Lindley distributions,” Communications in Statistics - Theory and Methods, vol. 42, no. 8, pp. 1443–1463, 2013.

Copyright

Copyright © 2022 Saieed F. Ateya et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
