Bayesian Estimations of Exponential Distribution Based on Interval-Censored Data with a Cure Fraction

Ahmed, Al Omari Mohammed

doi:https://doi.org/10.1155/2021/9822870

Journal of Mathematics


Research Article | Open Access

Volume 2021 | Article ID 9822870 | https://doi.org/10.1155/2021/9822870


Al Omari Mohammed Ahmed¹

Academic Editor: Musavarah Sarwar

Received 14 May 2021

Revised 23 Sept 2021

Accepted 29 Sept 2021

Published 30 Oct 2021

Abstract

Data are interval-censored when an event's failure time cannot be observed directly and is known only to lie between two inspection times. The analysis of interval-censored data has attracted attention because such data are common in the fields of reliability and medicine. A proportion of patients enrolled in clinical trials can sometimes be cured. In some instances, their symptoms mostly disappear without any recurrence of the disease. In this study, the proportion of such cured patients is estimated. The Bayesian approach under a gamma prior and maximum likelihood estimation (MLE) are used to estimate the cure fraction under the bounded cumulative hazard (BCH) model based on interval-censored data with an exponential distribution. The Bayesian approach uses three loss functions: squared error, linear exponential, and general entropy. The resulting Bayesian estimators are compared with the MLE using the mean squared error to identify the best estimator of the parameters. The results show that the theta parameter of the BCH model and the lambda parameter of the exponential distribution based on interval-censored data are best estimated using the Bayesian approach with a gamma prior under the linear exponential loss function with a positive loss parameter.

1. Introduction

The exponential distribution is commonly used in survival analysis, particularly in cure rate modeling, because of its constant failure rate and memoryless property. Cure fraction models are fitted to data from diseases, such as cancer, that are collected during clinical trials. Two types of cure models, both of which include a cure fraction, are used to fit survival data. The first is the mixture cure rate model, also known as the standard cure rate model. It was first developed by Boag [1] and then modified by Berkson and Gage [2]. Several studies on mixture cure models have been reported in the literature ([3–5]; Sy and Taylor [6]; Peng and Dear [7]; [8–10]).

The second type is a nonmixture cure model, called the promotion time cure model or bounded cumulative hazard (BCH) model, which was developed by Yakovlev et al. [11]. Chen et al. [12] proposed an alternative BCH model and discussed its advantages and disadvantages. The BCH model has been applied and examined in medical research. For instance, Aljawadi et al. [13] estimated the cure fraction in cancer trials with interval-censored data using maximum likelihood estimation (MLE). Various applications of the BCH model have been demonstrated [14, 15]; Ramos et al., 2017 [16].

For the cure fraction, MLE has been applied to the BCH model using the Newton–Raphson method to estimate the parameters of an exponential distribution with interval-censored data. The Bayesian approach to the cure fraction has received increasing attention in recent years; in it, the Metropolis–Hastings (MH) algorithm serves as a general Markov chain Monte Carlo method. Numerous methods have been used to estimate the parameters and survival function in the Bayesian framework. For example, Upadhyay and Gupta [17], Monfared et al. [18], Yousaf et al. [19], Soliman et al. [20], and Al Omari [21] used the MH algorithm to obtain Bayesian estimates of parameters under various censoring schemes and priors.

In the literature, the cure fraction in the BCH model has been estimated with MLE. The objective of this study is to estimate the parameters and survival function of an exponential distribution with a cure fraction from interval-censored data using both MLE and the Bayesian approach. The novelty of this study is that the Bayesian estimation with interval-censored data is carried out under three loss functions, i.e., squared error (SELF), linear exponential (LINEX), and general entropy (GELF), via the MH algorithm. The mean squared error (MSE) is used to compare the methods and determine the best estimator.

2. Methodology

2.1. Maximum Likelihood Estimation (MLE)

The MLE approach uses an exponential distribution, whose cumulative distribution function is denoted by and the probability density function (pdf) is represented by . X represents a random variable with pdf, where parameter needs to be estimated. A random sample with different sizes is represented by . The likelihood function is as follows:

Let be the indicators of censoring and curing for the patient, respectively, defined as follows: .

If , then . However, if , then will not be observed. If it is zero or one, then we assume that the censored data are independent of the failure times. The MLE considers the exponential distribution function to represent the distributional function for the dataset. Furthermore, the survival function , and the pdf for the same group is , as presented in Klein and Moeschberger [22].
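For orientation, the BCH (promotion time) model is usually written in the following form. This is the standard formulation associated with Chen et al. [12], not a reconstruction of the displayed equations of this article; the symbols S_p and F and the rate parameterization of the exponential distribution are assumptions.

```latex
S_p(t) = \exp\{-\theta F(t)\}, \qquad
F(t) = 1 - e^{-\lambda t}, \qquad
\lim_{t \to \infty} S_p(t) = e^{-\theta}.
```

Under this form, the population survival curve levels off at e^{-\theta}, which is the cure fraction, so a larger theta corresponds to a smaller proportion of cured patients.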

The likelihood function is

Furthermore, the log-likelihood function is

We partially differentiate equation (3).
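A log-likelihood of this kind can be sketched in code as follows. The construction is an assumption, not the article's equation (3): a failure known to lie in (left, right] contributes S_p(left) − S_p(right), a subject still event-free at the last visit contributes S_p(left), and the exponential distribution is parameterized by its rate. The names `bch_survival`, `log_likelihood`, and the toy data are hypothetical.

```python
import math


def bch_survival(t, theta, lam):
    """Population survival exp(-theta * F(t)) with F(t) = 1 - exp(-lam * t)."""
    if math.isinf(t):
        return math.exp(-theta)  # plateau of the curve, i.e. the cure fraction
    return math.exp(-theta * (1.0 - math.exp(-lam * t)))


def log_likelihood(theta, lam, data):
    """Sum of log contributions; data holds (left, right, event) triples.

    event = 1: failure occurred in (left, right].
    event = 0: subject was still event-free at time `left` (right is ignored).
    """
    total = 0.0
    for left, right, event in data:
        if event:
            mass = bch_survival(left, theta, lam) - bch_survival(right, theta, lam)
        else:
            mass = bch_survival(left, theta, lam)
        total += math.log(mass)
    return total


# Hypothetical toy data, for illustration only.
toy = [(0.2, 0.6, 1), (0.5, 1.1, 1), (1.4, math.inf, 0)]
print(log_likelihood(2.0, 1.5, toy))
```

Keeping the survival function separate from the likelihood makes it easy to swap in another baseline distribution while leaving the interval-censoring logic unchanged.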

The parameter is expressed as follows:

Equation (5) cannot be solved analytically, so the Newton–Raphson method is used to solve it numerically.
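The Newton–Raphson iteration itself can be illustrated with a minimal sketch. The example below does not use the score equation of this article; it uses a stand-in problem, fully observed exponential lifetimes with rate lambda, chosen because its closed-form MLE (n divided by the sum of the lifetimes) allows the iteration to be checked. The function name `newton_raphson` and the data are hypothetical.

```python
def newton_raphson(score, hessian, start, tol=1e-10, max_iter=50):
    """Solve score(x) = 0 by iterating x <- x - score(x) / hessian(x)."""
    x = start
    for _ in range(max_iter):
        step = score(x) / hessian(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")


# Stand-in problem: log-likelihood n*log(lam) - lam*sum(times).
times = [0.4, 1.1, 0.7, 2.3, 0.5]  # hypothetical lifetimes
n, total = len(times), sum(times)


def score(lam):
    """First derivative of the log-likelihood with respect to lam."""
    return n / lam - total


def hessian(lam):
    """Second derivative of the log-likelihood with respect to lam."""
    return -n / lam ** 2


lam_hat = newton_raphson(score, hessian, start=0.5)
print(lam_hat, n / total)  # numerical solution next to the closed form
```

For the interval-censored cure model, the same loop would be applied to the derivatives of the log-likelihood in equation (3); a sensible starting value matters, because Newton–Raphson can diverge when started far from the solution.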

The survival function of the exponential distribution is estimated by equation (6), in which the parameter is replaced by its MLE.

2.2. Bayesian Estimations

We assume that X is a random variable that follows the exponential pdf and that a random sample of size n is available. We consider a gamma prior in the Bayesian method; the priors of the two parameters are given as follows:

The posterior of the exponential distribution is as follows:
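In generic form, the joint posterior is proportional to the likelihood multiplied by the priors. The sketch below assumes independent gamma priors with shape a and rate b on each parameter; the hyperparameter symbols a_1, b_1, a_2, and b_2 are assumptions and not the notation of this article.

```latex
\pi(\theta) \propto \theta^{a_1 - 1} e^{-b_1 \theta}, \qquad
\pi(\lambda) \propto \lambda^{a_2 - 1} e^{-b_2 \lambda}, \qquad
\pi(\theta, \lambda \mid \text{data}) \propto
  L(\theta, \lambda \mid \text{data}) \, \pi(\theta) \, \pi(\lambda).
```

When every hyperparameter equals 1, as in the simulation study, each gamma prior reduces to an exponential density with unit rate.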

2.3. Loss Functions

Many loss functions have been proposed to explain various loss structures. Here, we consider three loss functions: one symmetric (i.e., SELF) and two asymmetric (i.e., LINEX and GELF).

2.3.1. Bayesian Estimation Using SELF

The parameters in SELF with a cure fraction are estimated as follows:

The Bayesian estimation of the survival function under SELF is as follows:

The estimation of the parameters and survival function in equations (12)–(14) cannot be performed analytically. Therefore, the MH algorithm is used to obtain the solution.

2.3.2. Bayesian Estimation Using the LINEX Loss Function

LINEX assumes that the minimal loss occurs when the estimate equals the true parameter value. With loss parameter r, overestimation is penalized more heavily than underestimation when r > 0, and underestimation is penalized more heavily when r < 0. When r approaches zero, the LINEX loss function approximates the SELF.
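For reference, the LINEX loss is commonly written as follows, with estimation error Δ and loss parameter r ≠ 0, together with the Bayes estimator that minimizes its posterior expectation. This is the standard form from the literature and is assumed, not confirmed, to match the equations of this article.

```latex
L(\Delta) = e^{r\Delta} - r\Delta - 1, \qquad
\Delta = \hat{\theta} - \theta, \qquad
\hat{\theta}_{\mathrm{LINEX}} = -\frac{1}{r}
  \ln E\!\left[ e^{-r\theta} \mid \text{data} \right].
```

A second-order expansion gives L(Δ) ≈ r²Δ²/2 for small r, which is why the loss behaves like a scaled squared error as r approaches zero.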

We obtain the expected posterior of the loss function of LINEX as follows:

The estimated parameters of the exponential distribution using the Bayesian method with the LINEX loss function are shown as follows:

Furthermore, the survival function of the exponential distribution is shown as follows:

The parameters and survival function of equations (17)–(19) cannot be estimated analytically. Therefore, the MH algorithm is used to obtain the solution.

2.3.3. Bayesian Estimation Using GELF

GELF, presented below, is the second asymmetric loss function used in this study:

The parameters of the Bayesian estimation under GELF are

The survival function of the exponential distribution under GELF is given as follows:

The parameters and survival function in equations (21)–(24) cannot be estimated analytically. Therefore, the MH algorithm is used to obtain the solution.
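Once posterior draws are available from the MH algorithm, the three Bayes estimators can be approximated by sample averages. The sketch below uses the standard results: the posterior mean under SELF, −(1/r) ln E[exp(−rθ)] under LINEX, and (E[θ^(−q)])^(−1/q) under GELF. The GELF loss parameter name q, the function names, and the sample draws are assumptions made for illustration.

```python
import math


def bayes_self(draws):
    """Posterior mean, the Bayes estimator under squared error loss."""
    return sum(draws) / len(draws)


def bayes_linex(draws, r):
    """-(1/r) * log of the posterior mean of exp(-r * theta), for r != 0."""
    mean_exp = sum(math.exp(-r * x) for x in draws) / len(draws)
    return -math.log(mean_exp) / r


def bayes_gelf(draws, q):
    """(posterior mean of theta**(-q)) ** (-1/q), for q != 0 and positive draws."""
    mean_pow = sum(x ** (-q) for x in draws) / len(draws)
    return mean_pow ** (-1.0 / q)


# Hypothetical posterior draws of a positive parameter.
draws = [1.2, 1.5, 1.1, 1.8, 1.4]
print(bayes_self(draws), bayes_linex(draws, 0.7), bayes_gelf(draws, 0.7))
```

With a positive loss parameter, both asymmetric estimators fall below the posterior mean, reflecting the heavier penalty on overestimation; with a negative LINEX parameter the estimate lies above it.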

2.4. MH Algorithm

We combined the gamma prior with the likelihood function in MH algorithms. The full conditional posterior density function is given as follows:

The conditional posterior of the theta parameter is

The conditional posterior of the lambda parameter is

Equations (18) and (19) give the conditional posteriors of theta and lambda, respectively. Neither follows a standard distribution, so the MH algorithm is used to draw samples from them. The implemented MH algorithm (Algorithm 1) is explained as follows:

(1)	Start with initial values .
(2)	are the current values. Thereafter, generate the candidate values from the uniform distribution (0, 1).
(3)	The increment in is shown as follows: , where .
(4)	Generate the candidate value from uniform (0, 1).
(5)	If , then accept with probability p and return to step 2; otherwise, accept and return to step 2.
(6)	The value of is increased shown as follows: where .
(7)	Generate candidate value from uniform (0, 1).
(8)	Accept with the probability of if and return to step 2; else, accept and return to step 2.
(9)	The Bayesian with a cure fraction depends on the interval-censored type of the parameters under SELF, given as follows: and .
(10)	The Bayesian with a cure fraction depends on the interval-censored type of the parameters under the LINEX loss function, shown as follows: and .
(11)	The Bayesian with the cure fraction-based interval-censored data of the parameters under the GELF is and .
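The overall structure of such a sampler can be sketched as follows. This is not Algorithm 1 itself: the likelihood is the assumed standard BCH form for interval-censored data with an exponential baseline in its rate parameterization, the priors are independent gamma(1, 1), and the proposal is a swapped-in symmetric uniform increment on (−width, width) in place of the uniform (0, 1) candidates described in the steps. All function names, tuning values, and data are hypothetical.

```python
import math
import random


def bch_survival(t, theta, lam):
    """Population survival exp(-theta * (1 - exp(-lam * t)))."""
    if math.isinf(t):
        return math.exp(-theta)
    return math.exp(-theta * (1.0 - math.exp(-lam * t)))


def log_posterior(theta, lam, data, shape=1.0, rate=1.0):
    """Assumed log-likelihood plus independent gamma(shape, rate) log-priors."""
    if theta <= 0.0 or lam <= 0.0:
        return -math.inf
    value = (shape - 1.0) * (math.log(theta) + math.log(lam)) - rate * (theta + lam)
    for left, right, event in data:
        mass = bch_survival(left, theta, lam)
        if event:
            mass -= bch_survival(right, theta, lam)
        if mass <= 0.0:
            return -math.inf  # guards against numerical underflow
        value += math.log(mass)
    return value


def metropolis_hastings(data, n_iter=5000, width=0.5, start=(1.0, 1.0), seed=0):
    """Update theta, then lambda, each with a symmetric uniform increment."""
    rng = random.Random(seed)
    theta, lam = start
    current = log_posterior(theta, lam, data)
    draws = []
    for _ in range(n_iter):
        for which in (0, 1):
            delta = rng.uniform(-width, width)
            cand = (theta + delta, lam) if which == 0 else (theta, lam + delta)
            cand_lp = log_posterior(cand[0], cand[1], data)
            # Symmetric proposal: accept with probability min(1, posterior ratio).
            if rng.random() < math.exp(min(0.0, cand_lp - current)):
                theta, lam = cand
                current = cand_lp
        draws.append((theta, lam))
    return draws


# Hypothetical toy data: (left, right, event) triples.
toy = [(0.2, 0.6, 1), (0.5, 1.1, 1), (0.1, 0.9, 1), (1.4, math.inf, 0)]
kept = metropolis_hastings(toy, n_iter=2000)[500:]  # discard a burn-in period
print(sum(t for t, _ in kept) / len(kept), sum(l for _, l in kept) / len(kept))
```

The retained draws can then be passed to the estimators for SELF, LINEX, and GELF. The burn-in length and increment width are tuning choices that would need checking against trace plots in a real analysis.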

3. Simulation Study

We conducted a Monte Carlo experiment to compare four methods: MLE and the Bayesian approach under three loss functions, i.e., SELF, LINEX, and GELF. Sample sizes of n = 20, 40, and 80 were used to represent small, medium, and large samples, respectively, and each setting was repeated 10,000 times. The initial value of the theta parameter, which in the BCH model is assumed to be the mean of the Poisson distribution, was 2.
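The comparison criterion can be stated in a few lines of code. The sketch below computes the Monte Carlo MSE as the average squared deviation of the replicate estimates from the true parameter value; the function name and the replicate values are hypothetical.

```python
def mean_squared_error(estimates, true_value):
    """Average of (estimate - true_value)^2 over the Monte Carlo replicates."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


# Hypothetical estimates of lambda from four replicates, true value 1.5.
lambda_hats = [1.31, 1.62, 1.44, 1.58]
print(mean_squared_error(lambda_hats, 1.5))
```

Because the MSE combines the variance and the squared bias of an estimator, a biased estimator with a small spread can still achieve a lower MSE than an unbiased one.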

The steps are explained as follows:

(1)	Lifetime T was generated from the exponential distribution with an initial value of lambda parameters (1.5 and 3) for various sample sizes (20, 40, and 80).
(2)	A vector V was generated from a set of clinic visits, which is considered the sample size in this study (20, 40, and 80 clinic visits). The first visit in this study was generated from a uniform (0, 1), and the second visit was generated from uniform (, ). The following generations employ a similar approach.
(3)	In each dataset, a set of matrices was generated. The following equations were used to obtain the lower and upper bounds:
(4)	The indicators are defined as follows:
(5)	The MLE parameter depends on interval-censored data with the cure fraction (equation (4)). Furthermore, the dependencies of the parameter lambda and survival function are based on equations (5) and (6), respectively.
(6)	The MH algorithm in equations (12)–(14) was used for the Bayesian under SELF to estimate the parameters and survival function. Furthermore, each hyperparameter in the gamma priors is equal to 1.
(7)	The MH algorithm was also used in equations (17)–(19) for the Bayesian with the LINEX loss function. The Bayesian with the GELF in equations (21)–(24) estimates the parameters and survival function, which depend on interval-censored data with a cure fraction. The values of the loss parameters are (for details, see [14]).
(8)	The steps mentioned above were repeated 10,000 times. The MSE was calculated for the parameters and survival function of the MLE and Bayesian methods. The results are shown in Tables 1–5, which show the choice of the scale parameter, loss parameter, censoring rate, and sample size.
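The data-generation steps can be sketched as follows, under several stated assumptions rather than the exact scheme used here. A latent count is drawn from a Poisson distribution with mean theta; a subject with a zero count is cured, and otherwise the lifetime is the minimum of that many exponential draws with rate lambda (the usual promotion time construction). Visit times are cumulative sums of uniform (0, 1) gaps, which is a guess at the visit scheme. All names and the number of visits per subject are hypothetical.

```python
import math
import random


def poisson(rng, mean):
    """Knuth's multiplication method; adequate for small means."""
    limit, count, product = math.exp(-mean), 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_subject(rng, theta, lam, n_visits):
    """Return (left, right, event) for one subject."""
    n_latent = poisson(rng, theta)
    if n_latent == 0:
        lifetime = math.inf  # cured subject never fails
    else:
        lifetime = min(rng.expovariate(lam) for _ in range(n_latent))
    left, clock = 0.0, 0.0
    for _ in range(n_visits):
        clock += rng.uniform(0.0, 1.0)  # assumed gap between visits
        if lifetime <= clock:
            return left, clock, 1  # failure detected between two visits
        left = clock
    return left, math.inf, 0  # event-free at the last visit


def simulate_dataset(n, theta=2.0, lam=1.5, n_visits=5, seed=0):
    rng = random.Random(seed)
    return [simulate_subject(rng, theta, lam, n_visits) for _ in range(n)]


sample = simulate_dataset(20)
print(sum(event for _, _, event in sample), "failures observed out of", len(sample))
```

In this sketch the censoring rate is not set directly; it follows from theta, lambda, and the number of visits, so matching the 15%, 30%, and 45% rates would require tuning those quantities.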

4. Results and Discussion

The lambda parameter of the exponential distribution based on interval-censored data with a cure fraction was obtained using the MLE method and Bayesian with SELF (BS), LINEX (BL), and GELF (BG) loss functions (see Table 1).

Table 2 presents a comparison of the estimates of the lambda parameter of the exponential distribution in terms of MSE. The results show that the Bayesian estimation with the LINEX loss function is more effective than MLE for r = +0.7. Moreover, the Bayesian estimators under the SELF and LINEX loss functions perform better than the MLE for r = −0.7, except at the 45% censoring rate with sample sizes of 40 and 80. Furthermore, the estimator provides an MSE value of less than 1.5 at a parameter value of 3 for all sample sizes.

For a sample size of 20, an initial lambda value of 1.5, and a censoring rate of 15%, the MLE of lambda and its MSE over the 10,000 repetitions are 1.2424 and 0.1184, respectively. Under the same settings, the Bayesian estimate under SELF and its MSE are 1.2473 and 0.1171, respectively (see Tables 1 and 2).

Table 3 shows a comparison of the estimated theta parameter with respect to the MSE. The results show that MLE is more effective than the other estimations except for the censoring rates of 15% and 45% for a sample size of 20. The Bayesian estimators under the SELF and LINEX loss functions perform better than the MLE and other estimations for the 15% and 45% censoring rates, a sample size of 20, and r = +0.7.

For a sample size of 20, lambda equal to 1.5, and a censoring rate of 15%, the MLE of theta and its MSE over the 10,000 repetitions are 1.7646 and 0.0963, respectively. Under the same settings, the Bayesian estimate under SELF and its MSE are 1.7605 and 0.0974, respectively (see Tables 3 and 4).

Table 5 shows the MSE of the survival function of the exponential distribution. The Bayesian with LINEX is the best method for r = +0.7. Moreover, the Bayesian with SELF and LINEX has a lower MSE value than the MLE method for r = −0.7, except when the censoring rate is 45% for sample sizes of 40 and 80. Furthermore, the estimator provides an MSE of less than 1.5 when the parameter value is 3, which is maintained for all the sample sizes.

The comparison of the censoring rates of 15%, 30%, and 45% from Tables 1–5 shows that a censoring rate of 15% is better than the other rates when estimating the parameters and survival function. The finding indicates that the smaller the censoring rate is, the more accurate the estimates become. Conversely, the larger the censoring rate, the poorer the estimates of the parameters. Tables 1–5 also show that, as the sample size n increases, the MSE of the parameters and of the survival function of the exponential distribution based on interval-censored data with the cure fraction decreases in all cases.

5. Real Data Analysis

The dataset considered was obtained from the study by Zhou et al. [23], and analyses were performed using our methods from the MLE and Bayesian approach with the cure fraction and interval-censored data.

The dataset comprised 7703 males and 1611 females. The lifetime is the age at diagnosis of hypertension (HTN). Each participant visited the clinic for periodic preventive medical examinations, and blood pressure was tested at each visit. The onset of HTN is therefore known only to lie between two consecutive visits.

A bootstrapping method was used in which 50 lifetimes were randomly selected from the dataset; this was repeated 10,000 times.

The standard error for our methods was determined by calculating the variance as follows:

Hence, the standard error is the square root of this variance, where S denotes the observed value and R is the number of repetitions. See Blair et al. [24] and Lee et al. [25] for more details.
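A bootstrap standard error of this kind can be sketched as follows. The sketch assumes resampling with replacement and the usual sample variance with divisor R − 1 across the R replicate statistics; neither detail is confirmed by the text, and the function name, the estimator, and the data are hypothetical.

```python
import math
import random


def bootstrap_standard_error(sample, estimator, size=50, repetitions=10000, seed=0):
    """Standard deviation of `estimator` over bootstrap resamples of `sample`."""
    rng = random.Random(seed)
    stats = []
    for _ in range(repetitions):
        resample = [rng.choice(sample) for _ in range(size)]  # with replacement
        stats.append(estimator(resample))
    centre = sum(stats) / repetitions
    variance = sum((s - centre) ** 2 for s in stats) / (repetitions - 1)
    return math.sqrt(variance)


def sample_mean(values):
    return sum(values) / len(values)


# Hypothetical lifetimes; in the analysis these would come from the dataset.
lifetimes = [41.0, 52.5, 38.0, 60.2, 47.3, 55.1, 44.8, 49.9]
print(bootstrap_standard_error(lifetimes, sample_mean, repetitions=2000))
```

Any of the estimators considered here (MLE, or Bayesian under SELF, LINEX, or GELF) could be passed in place of `sample_mean`, at the cost of rerunning the estimation for every resample.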

6. Results and Discussion

The lambda parameter was obtained from the dataset using the MLE method and Bayesian with SELF (BS), LINEX (BL), and GELF (BG) (see Table 6).

Table 7 presents a comparison of the estimated lambda parameter of the exponential distribution using the standard error. The outputs show that the Bayesian estimation with the LINEX loss function is more effective than its maximum likelihood counterparts for r = +0.7.

For the dataset, with an initial lambda value of 1.5 and a censoring rate of 15%, the MLE of lambda and its standard error over the 10,000 repetitions were 1.2802 and 0.1008, respectively. Under the same settings, the Bayesian estimate under SELF and its standard error were 1.2848 and 0.0994, respectively (see Tables 6 and 7).

Table 8 shows a comparison of the estimated theta parameter with respect to the standard error. The results show that MLE is more effective than the other estimations except for the censoring rates of 15% and 45%. Theta was calculated to be 1.9154 using MLE, with lambda equal to 1.5 and a censoring rate of 15%. The results after repeating the steps 10,000 times are presented in Tables 8 and 9.

The comparison between censoring rates of 15%, 30%, and 45% from Tables 6–9 shows that a censoring rate of 15% is better than the other values when estimating the parameters. This finding indicates that the smaller the censoring rate is, the more accurate the estimates become. Conversely, the larger the censoring rate, the poorer the estimates of the parameters.

The survival function for our method was computed in R, and the output is presented in Figure 1, which shows only slight changes in the curves of the survival function.

7. Conclusions

This study considers the estimation of the parameters and survival function of the BCH model via the Bayesian approach with a gamma prior based on interval-censored data. The Bayesian estimates under three loss functions, i.e., SELF, LINEX, and GELF, were compared with the maximum likelihood estimates using both the simulation and the dataset. The comparison between the censoring rates shows that a censoring rate of 15% is better than the other values when estimating the parameters. This finding indicates that the smaller the censoring rate is, the more accurate the estimates become. Conversely, the larger the censoring rate is, the poorer the estimates of the parameters are. The theta parameter of the BCH model and the lambda parameter of the exponential distribution based on interval-censored data can be best estimated using the Bayesian approach with a gamma prior and a positive LINEX loss parameter. In the future, this study can be extended to other censoring approaches, such as hybrid and progressive censoring schemes. Furthermore, the models could include covariates through the use of exponential models.

Abbreviations

MLE:	Maximum likelihood estimation
MH:	Metropolis–Hastings algorithm
MSE:	Mean squared error
SELF:	Squared error loss function
LINEX:	Linear exponential loss function
GELF:	General entropy loss function

Data Availability

The data considered were obtained from the study by Zhou et al. [23].

Conflicts of Interest

The author declares that there are no conflicts of interest.

References

[1] J. W. Boag, “Maximum likelihood estimates of the proportion of patients cured by cancer therapy,” Journal of the Royal Statistical Society: Series B, vol. 11, no. 1, pp. 15–44, 1949.
[2] J. Berkson and R. P. Gage, “Survival curve for cancer patients following treatment,” Journal of the American Statistical Association, vol. 47, no. 259, pp. 501–515, 1952.
[3] V. T. Farewell, “Mixture models in survival analysis: are they worth the risk?” Canadian Journal of Statistics, vol. 14, no. 3, pp. 257–262, 1986.
[4] J. W. Gamel, I. W. McLean, and S. H. Rosenberg, “Proportion cured and mean log survival time as functions of tumour size,” Statistics in Medicine, vol. 9, no. 8, pp. 999–1006, 1990.
[5] A. Y. C. Kuk and C.-H. Chen, “A mixture model combining logistic regression with proportional hazards regression,” Biometrika, vol. 79, no. 3, pp. 531–541, 1992.
[6] J. P. Sy and J. M. G. Taylor, “Estimation in a Cox proportional hazards cure model,” Biometrics, vol. 56, no. 1, pp. 227–236, 2000.
[7] Y. Peng and K. B. G. Dear, “A non-parametic mixture model for cure rate estimation,” Biometrics, vol. 56, pp. 237–243, 2000.
[8] M. R. Abu Bakar, K. A. Salah, N. A. Ibrahim, and K. Haron, “Bayesian approach for joint longitudinal and time-to-event data with survival fraction,” Bulletin of the Malaysian Mathematical Sciences Society, vol. 32, pp. 75–100, 2009.
[9] P. Naseri, A. R. Baghestani, N. Momenyan, and M. E. Akabari, “Application of a mixture cure fraction model based on the generalized modified Weibull distribution for analyzing survival of patients with breast cancer,” International Journal of Cancer Management, vol. 11, pp. 1–8, 2018.
[10] M. E. A. M. E. Omer, M. R. A. Bakar, M. B. Adam, and M. S. Mustafa, “Cure models with exponentiated Weibull exponential distribution for the analysis of melanoma patients,” Mathematics, vol. 8, p. 8, 2020.
[11] A. Y. Yakovlev, B. Asselain, V. J. Bardou et al., A simple stochastic model of tumor recurrence and its applications to data on pre-menopausal breast cancer, In Biometrieet Analyse de Dormees Spatio, Société Francaise de Biométrie, ENSA Renned, Rennes, France, 1993.
[12] M.-H. Chen, J. G. Ibrahim, and D. Sinha, “A new Bayesian model for survival data with a surviving fraction,” Journal of the American Statistical Association, vol. 94, no. 447, pp. 909–919, 1999.
[13] B. A. I. Aljawadi, M. R. A. Bakar, N. A. Ibrahim, and M. Al-Omari, “Parametric maximum likelihood estimation of cure fraction using interval-censored data,” Journal of Advanced Computing, vol. 1, pp. 43–53, 2013.
[14] C. M. Carvalho Lopes and H. Bolfarine, “Random effects in promotion time cure rate models,” Computational Statistics & Data Analysis, vol. 56, no. 1, pp. 75–87, 2012.
[15] M. T. Uddin, M. N. Islam, and Q. I. U. Ibrahim, “An analytical approach on cure rate estimation based on uncensored data,” Journal of Applied Sciences, vol. 6, no. 3, pp. 548–552, 2006.
[16] S. Lipovetsky and M. Conklin, “Decreasing respondent heterogeneity by Likert scales adjustment via multipoles,” Stats, vol. 1, no. 1, pp. 169–175, 2018.
[17] S. K. Upadhyay and A. Gupta, “A Bayes analysis of modified Weibull distribution via Markov chain Monte Carlo simulation,” Journal of Statistical Computation and Simulation, vol. 80, no. 3, pp. 241–254, 2010.
[18] M. Mohammadi Monfared, R. Arabi Belaghi, M. H. Behzadi, and S. Singh, “Estimation and prediction based on type-I hybrid censored data from the Poisson-exponential distribution,” Communications in Statistics - Simulation and Computation, vol. 68, pp. 1–26, 2019.
[19] R. Yousaf, S. Ali, and M. Aslam, “Bayesian estimation of transmuted Weibull distribution under different loss functions,” Journal of Reliability and Statistical Studies, vol. 13, pp. 287–324, 2020.
[20] A. Soliman, E. Ahmed, A. Farghal, and A. Al-Shibany, “Estimation of generalized inverted exponential distribution based on adaptive type-II progressive censoring data,” Journal of Statistics Applications & Probability, vol. 9, no. 2, pp. 215–230, 2020.
[21] M. A. Al Omari, “Comparison on the Bayesian estimation of Gompertz distribution based on Type I censored data,” Journal of Mathematical Theory and Modeling, vol. 10, 2020.
[22] J. P. Klein and M. L. Moeschberger, Survival Analysis Techniques for Censored and Truncated Data, Springer, New York, NY, USA, 2nd edition, 2003.
[23] J. Zhou, J. Zhang, and W. Lu, Computationally Efficient Estimation for the Generalized Odds Rate Mixture Cure Model with Interval-Censored Data, Taylor & Francis, New York, NY, USA, 2017.
[24] S. N. Blair, J. B. Kampert, H. W. Kohl et al., “Influences of cardiorespiratory fitness and other precursors on cardiovascular disease and all-cause mortality in men and women,” JAMA: The Journal of the American Medical Association, vol. 276, no. 3, pp. 205–210, 1996.
[25] D.-C. Lee, X. Sui, T. S. Church, C. J. Lavie, A. S. Jackson, and S. N. Blair, “Changes in fitness and fatness on the development of cardiovascular disease risk factors,” Journal of the American College of Cardiology, vol. 59, no. 7, pp. 665–672, 2012.

Copyright

Copyright © 2021 Al Omari Mohammed Ahmed. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
