Abstract

In this study, two new distributions are developed by compounding the sine-Weibull and zero-truncated geometric distributions. The quantile functions and ordinary moments of the distributions are obtained. Plots of the hazard rate functions of the distributions show that the distributions exhibit nonmonotonic failure rates. Also, plots of the densities of the distributions show that they exhibit decreasing, skewed, and approximately symmetric shapes, among others. Mixture and nonmixture cure rate models based on these distributions are also developed. Simulation studies indicate that the estimators of the parameters of the cure rate models are consistent. Covariates are introduced into the cure rate models via the logit link function. Finally, the performance of the distributions and the cure rate and regression models is demonstrated using real datasets. The results show that the developed distributions can serve as alternatives to existing models for survival data analyses.

1. Introduction

Parametric distributions play an important role in modeling survival data. A well-known classical distribution is the Weibull distribution. Although the Weibull distribution, like other classical distributions, is widely used in many fields, it cannot model data that exhibit nonmonotonic failure rates. For this reason, researchers have developed several extensions of the distribution to overcome this shortcoming. In this study, the sine-G family of distributions proposed by Kumar et al. [1] is used to modify the Weibull distribution. The resulting distribution is also a special case of the exponentiated sine-Weibull (ESW) distribution proposed by Muhammad et al. [2].

Let a random variable $X$ follow the sine-G family of distributions proposed by Kumar et al. [1]. Its cumulative distribution function (CDF) is given by
\[ F(x) = \sin\left(\frac{\pi}{2}\,G(x)\right), \tag{1} \]
where $G(x)$ is the CDF of the baseline distribution. In this study, the Weibull distribution is taken as the baseline distribution with CDF

Substituting the CDF of the Weibull distribution in equation (2) into equation (1) gives the CDF of the sine-Weibull (SW) distribution as
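Equations (1) to (3) can be written out explicitly once a Weibull parameterization is fixed. The sketch below assumes the rate-shape form with rate $\alpha > 0$ and shape $\beta > 0$; this parameterization and these symbols are assumptions made for illustration and may differ from those used in the paper. The last line is the SW density implied by differentiating the CDF.

```latex
% Assumption: Weibull baseline in rate-shape form, alpha > 0, beta > 0.
\begin{align*}
F(x) &= \sin\!\left(\frac{\pi}{2}\,G(x)\right), \\
G(x) &= 1 - e^{-\alpha x^{\beta}}, \qquad x > 0, \\
F_{SW}(x) &= \sin\!\left(\frac{\pi}{2}\left[1 - e^{-\alpha x^{\beta}}\right]\right), \\
f_{SW}(x) &= \frac{\pi}{2}\,\alpha\beta\,x^{\beta-1} e^{-\alpha x^{\beta}}
             \cos\!\left(\frac{\pi}{2}\left[1 - e^{-\alpha x^{\beta}}\right]\right).
\end{align*}
```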

Compounding continuous distributions with discrete distributions is one way of generating new distributions that are more flexible and useful; the power series method is one such approach. In this study, two extensions of the SW distribution are developed using the zero-truncated geometric distribution.

In modeling survival data, it is important to be able to model the proportion of individuals who are cured, known as the cure fraction, and who are therefore no longer susceptible to the event of interest. Cure rate models are very popular for this purpose as they allow more information to be used for analyzing survival data. There are two main types of cure rate models, known as mixture and nonmixture cure rate models. The mixture cure rate model was introduced by Boag [3] and further developed by Berkson and Gage [4]. The nonmixture cure rate model was introduced by Klebanov et al. [5]. For both models, parametric, semiparametric, and nonparametric methods have been used to estimate the cure fraction [6]. There are several research studies on both types of cure rate models and their extensions to include covariates. Some of these include generalized log-gamma regression models with cure fraction [7], cure fraction models using mixture and nonmixture models based on Weibull distribution [8], mixture and nonmixture cure fraction models based on generalized modified Weibull distribution [9], exponentiated exponential mixture and nonmixture cure rate model with covariates [10], nonmixture cure model with Fréchet distribution [6], cure models based on exponentiated Weibull exponential distributions [11], destructive power series cure model with covariates [12], and cure model based on generalized Weibull distribution with covariates [13]. In this study, mixture and nonmixture cure rate models are developed based on the SW geometric distributions. Furthermore, regression models are developed based on these mixture and nonmixture models to accommodate covariates.

The rest of the paper is organized as follows: Section 2 presents the Sine-Weibull geometric distribution for the first and last activation schemes. Mixture and nonmixture cure rate models with simulation studies to assess the estimators of the parameters of the models are presented in Sections 3 and 4, respectively. The regression models based on the cure rate models are presented in Section 5. The applications of the developed SW geometric distributions, the cure rate, and regression models are presented in Section 6. The conclusion of the research is presented in Section 7.

2. Sine-Weibull Geometric Distribution

The SW geometric (SWG) distribution under the first and last activation schemes is presented in this section. Suppose that $X_1, X_2, \ldots, X_N$, representing the failure times of a subsystem, are independent and identically distributed (iid) SW random variables with CDF given by equation (3) and that $N$ follows the zero-truncated geometric distribution. Then, $\min(X_1, \ldots, X_N)$ and $\max(X_1, \ldots, X_N)$ follow the SW geometric I (SWGI) and SW geometric II (SWGII) distributions, respectively. The CDFs of the SWGI and SWGII distributions are defined in equations (4) and (5), respectively, where $F(x)$ is the CDF of the SW distribution. It is worth noting that the distributions are well defined for . Substituting the CDF of the SW distribution in equation (3) and the definition of into equation (4) gives the CDF of the SWGI distribution as

The corresponding probability density function (PDF) obtained by differentiating equation (6) is given as

Also, the hazard rate function (HRF) of the SWGI distribution is given as
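As a hedged sketch of how the first activation scheme leads to closed forms, assume the zero-truncated geometric law is written as $P(N = n) = (1 - p)p^{n-1}$ for $n = 1, 2, \ldots$ with $0 < p < 1$. The symbol $p$ and this form are assumptions, and the paper's parameterization may differ. Writing $F$, $f$, and $h$ for the SW CDF, PDF, and HRF, summing the geometric series gives:

```latex
% Assumption: P(N = n) = (1 - p) p^{n-1}, n = 1, 2, ..., with 0 < p < 1.
\begin{align*}
F_{I}(x) &= 1 - \sum_{n=1}^{\infty} [1 - F(x)]^{n} (1 - p)\,p^{n-1}
          = \frac{F(x)}{1 - p\,[1 - F(x)]}, \\
f_{I}(x) &= \frac{(1 - p)\,f(x)}{\{1 - p\,[1 - F(x)]\}^{2}}, \\
h_{I}(x) &= \frac{f_{I}(x)}{1 - F_{I}(x)} = \frac{h(x)}{1 - p\,[1 - F(x)]}.
\end{align*}
```

Under this assumed form, letting $p \to 0$ recovers the SW distribution itself.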

The quantile function of the SWGI distribution is useful for generating random numbers from the distribution. It is the inverse of the CDF of the SWGI distribution given in equation (6) and is obtained as
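The inversion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a Weibull baseline $G(x) = 1 - e^{-\alpha x^{\beta}}$ and a zero-truncated geometric law $P(N = n) = (1 - p)p^{n-1}$, and the function names (`sw_cdf`, `swg1_cdf`, `swg1_quantile`, `swg1_sample`) are hypothetical.

```python
import math
import random


def sw_cdf(x, alpha, beta):
    """Sine-Weibull CDF, assuming Weibull baseline G(x) = 1 - exp(-alpha * x**beta)."""
    g = 1.0 - math.exp(-alpha * x ** beta)
    return math.sin(0.5 * math.pi * g)


def swg1_cdf(x, alpha, beta, p):
    """CDF of min(X_1..X_N) with P(N = n) = (1 - p) * p**(n - 1) (assumed form)."""
    f = sw_cdf(x, alpha, beta)
    return f / (1.0 - p * (1.0 - f))


def swg1_quantile(u, alpha, beta, p):
    """Closed-form inverse of swg1_cdf for 0 < u < 1."""
    f = u * (1.0 - p) / (1.0 - p * u)      # value of the SW CDF at the quantile
    g = 2.0 * math.asin(f) / math.pi       # value of the Weibull CDF at the quantile
    return (-math.log(1.0 - g) / alpha) ** (1.0 / beta)


def swg1_sample(n, alpha, beta, p, seed=0):
    """Inverse-transform sampling: apply the quantile function to uniform draws."""
    rng = random.Random(seed)
    return [swg1_quantile(rng.random(), alpha, beta, p) for _ in range(n)]
```

A round trip such as `swg1_cdf(swg1_quantile(0.3, 1.5, 0.8, 0.4), 1.5, 0.8, 0.4)` should return a value very close to 0.3, which is a quick way to check an inversion of this kind.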

Figure 1 shows plots of some possible shapes of the PDF and HRF of the SWGI distribution. It can be observed that the PDF can assume decreasing, right-skewed, left-skewed, and approximately symmetric shapes. Also, the HRF can assume decreasing, increasing, J-shaped, and modified bathtub shapes.

The PDF of the SWGI distribution can be written in a mixture form aswhere , , and is the derivative of .

Similarly, the CDF of the SWGII distribution is obtained by substituting the CDF of the SW distribution in equation (3) and the definition of into equation (5). This gives the CDF of the SWGII distribution as

The corresponding PDF and HRF are given, respectively, as

The quantile function of the SWGII distribution is also obtained as
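For the last activation scheme, the same kind of sketch applies with the maximum in place of the minimum. Assuming again $P(N = n) = (1 - p)p^{n-1}$ and a Weibull baseline $G(x) = 1 - e^{-\alpha x^{\beta}}$ (both assumptions, with hypothetical function names), summing $\sum_n F(x)^n P(N = n)$ gives $(1 - p)F(x) / [1 - pF(x)]$, which inverts in closed form:

```python
import math


def sw_cdf(x, alpha, beta):
    """Sine-Weibull CDF, assuming Weibull baseline G(x) = 1 - exp(-alpha * x**beta)."""
    g = 1.0 - math.exp(-alpha * x ** beta)
    return math.sin(0.5 * math.pi * g)


def swg2_cdf(x, alpha, beta, p):
    """CDF of max(X_1..X_N) with P(N = n) = (1 - p) * p**(n - 1) (assumed form)."""
    f = sw_cdf(x, alpha, beta)
    return (1.0 - p) * f / (1.0 - p * f)


def swg2_quantile(u, alpha, beta, p):
    """Closed-form inverse of swg2_cdf for 0 < u < 1."""
    f = u / (1.0 - p + p * u)              # value of the SW CDF at the quantile
    g = 2.0 * math.asin(f) / math.pi       # value of the Weibull CDF at the quantile
    return (-math.log(1.0 - g) / alpha) ** (1.0 / beta)
```

Because the maximum of one or more SW lifetimes is never smaller than a single SW lifetime, `swg2_cdf` should not exceed `sw_cdf` at any point, which gives a simple sanity check.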

Possible shapes of the PDF and HRF of the SWGII distribution are given in Figure 2. It can be observed that the PDF of the SWGII distribution can assume decreasing, left-skewed, right-skewed, and approximately symmetric shapes. Also, the HRF of the distribution shows increasing, decreasing, J-shaped, and upside-down bathtub shapes.

The PDF of the SWGII distribution can be written aswhere and is as defined in equation (10).

The ordinary moment of a distribution is defined as . Substituting the mixture representations of the SWGI and SWGII distributions in equations (10) and (15) into the definition gives the ordinary moment aswhere and is defined as and in equations (10) and (15) for the SWGI and SWGII distributions, respectively.
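When a closed-form quantile function $Q(u)$ is available, ordinary moments can also be checked numerically through the identity $E(X^r) = \int_0^1 Q(u)^r\,du$. The sketch below is not the series expression derived in the paper; it uses a plain midpoint rule and the SWGI quantile under the assumed forms $G(x) = 1 - e^{-\alpha x^{\beta}}$ and $P(N = n) = (1 - p)p^{n-1}$. The names `raw_moment` and `swg1_quantile` are hypothetical.

```python
import math


def swg1_quantile(u, alpha, beta, p):
    """SWGI quantile under the assumed Weibull and geometric parameterizations."""
    f = u * (1.0 - p) / (1.0 - p * u)
    g = 2.0 * math.asin(f) / math.pi
    return (-math.log(1.0 - g) / alpha) ** (1.0 / beta)


def raw_moment(quantile, r, n_grid=20000):
    """Midpoint-rule approximation of E[X**r] = integral over (0, 1) of Q(u)**r."""
    h = 1.0 / n_grid
    return h * sum(quantile((k + 0.5) * h) ** r for k in range(n_grid))


# Example: first two ordinary moments for illustrative (hypothetical) parameter values.
mean = raw_moment(lambda u: swg1_quantile(u, 1.0, 1.5, 0.5), 1)
second = raw_moment(lambda u: swg1_quantile(u, 1.0, 1.5, 0.5), 2)
variance = second - mean ** 2
```

The midpoint rule avoids evaluating the quantile at the endpoints 0 and 1, where it is either zero or unbounded; a finer grid or an adaptive integrator would be preferable for heavy-tailed parameter choices.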

3. Mixture Cure Rate Models

Cure in a population occurs when the mortality of a cohort of patients returns to the level expected in the general population. Cure rate models are used to model time-to-event data arising from many conditions, especially cancer. The cohort is divided into two groups: individuals who are cured, with probability $\rho$, and individuals who remain susceptible, with probability $1 - \rho$, whose lifetimes have a proper survival function $S(t)$. This gives an improper population survival function expressed in a mixture form as
\[ S_{pop}(t) = \rho + (1 - \rho)\,S(t). \tag{16} \]

Let $F(t)$ be the CDF of the SWGI or SWGII distribution; then, we obtain the SWGI and SWGII mixture cure rate models. Substituting equations (6) and (11) into equation (16) gives the survival functions of the SWGI and SWGII cure rate models, respectively, as

Also, the PDFs of the SWGI and SWGII mixture cure rate models are given, respectively, as
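The mixture construction is independent of which lifetime distribution is plugged in, so it can be sketched generically. In the hedged example below, `cure_prob` denotes the cure fraction and `cdf` and `pdf` are callables for the lifetime distribution of susceptible individuals (the SWGI or SWGII CDF and PDF would be passed here); all names are hypothetical.

```python
import math


def mixture_survival(t, cure_prob, cdf):
    """Population survival: cure_prob + (1 - cure_prob) * S(t), with S(t) = 1 - F(t)."""
    return cure_prob + (1.0 - cure_prob) * (1.0 - cdf(t))


def mixture_pdf(t, cure_prob, pdf):
    """Improper population density: (1 - cure_prob) * f(t)."""
    return (1.0 - cure_prob) * pdf(t)


# Illustration with a unit exponential lifetime standing in for SWGI or SWGII.
exp_cdf = lambda t: 1.0 - math.exp(-t)
exp_pdf = lambda t: math.exp(-t)
plateau = mixture_survival(50.0, 0.3, exp_cdf)   # approaches the cure fraction
```

The survival curve starts at 1 and levels off at the cure fraction rather than at 0, which is what makes the population survival function improper.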

Consider $n$ pairs of times and censoring indicators $(t_i, \delta_i)$, where $\delta_i = 1$ if $t_i$ is a time-to-event and $\delta_i = 0$ if $t_i$ is censored, for $i = 1, 2, \ldots, n$. The log-likelihood function is given by

To obtain the log-likelihood function of the SWGI mixture cure rate model, equations (6) and (7) are substituted into equation (22). Similarly, substituting equations (11) and (12) into equation (22) gives the log-likelihood function of the SWGII mixture cure rate model. The estimates of the parameters of the models are obtained directly by maximizing the log-likelihood function given in equation (22).
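For right-censored data, the usual form of such a log-likelihood lets each observed event contribute the log of the population density and each censored time contribute the log of the population survival function. The sketch below assumes that standard form and a generic lifetime distribution passed as `pdf` and `cdf` callables; the function name and arguments are hypothetical, and the parameters of the lifetime distribution are taken as fixed inside the callables.

```python
import math


def mixture_cure_loglik(times, events, cure_prob, pdf, cdf):
    """Right-censored log-likelihood of a mixture cure rate model.

    events[i] is 1 for an observed event and 0 for a censored time.
    """
    ll = 0.0
    for t, d in zip(times, events):
        if d == 1:
            # Observed event: log of (1 - cure_prob) * f(t)
            ll += math.log((1.0 - cure_prob) * pdf(t))
        else:
            # Censored: log of cure_prob + (1 - cure_prob) * S(t)
            ll += math.log(cure_prob + (1.0 - cure_prob) * (1.0 - cdf(t)))
    return ll
```

In practice this function would be handed to a numerical optimizer over the cure fraction and the distribution parameters jointly, typically after transforming the parameters to an unconstrained scale.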

3.1. Simulation Studies

Simulation studies are conducted in this section to assess the performance of the maximum likelihood estimators for the parameters of the mixture cure fraction models. The steps used to achieve this are given as follows:
(I) Generate a sample of size $n$ from .
(II) Given that is the cure fraction, where is the quantile function of the SWGI and SWGII distributions.
(III) Generate censoring times from the exponential distribution with a rate equal to .
(IV) Obtain the right-censored data as .
(V) Obtain the pairs $(t_i, \delta_i)$, where $\delta_i = 1$ if and $\delta_i = 0$ otherwise.
(VI) Obtain the maximum likelihood estimates of the parameters using equation (22) and obtain the average estimates (AE), absolute bias (AB), and the root mean square error (RMSE) of the estimates. Also, the censoring rate (CR) is computed.
(VII) Repeat steps I–VI 5000 times for sample sizes .
(VIII) Steps I–VII are repeated for parameter set and cure fraction of , , and .
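The data-generation part of the steps above can be sketched as follows. Several specifics are assumptions rather than details taken from the paper: a subject is treated as cured when a uniform draw falls below the cure fraction, susceptible lifetimes are drawn by applying a supplied quantile function to a fresh uniform draw, and an observation counts as an event when the lifetime does not exceed the censoring time. All function and argument names are hypothetical.

```python
import math
import random


def simulate_mixture_cure(n, cure_prob, quantile, censor_rate, seed=0):
    """Generate n right-censored (time, indicator) pairs from a mixture cure model."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        if rng.random() < cure_prob:
            lifetime = math.inf                  # cured: the event never occurs
        else:
            lifetime = quantile(rng.random())    # susceptible: inverse-transform draw
        censor = rng.expovariate(censor_rate)    # exponential censoring time
        time = min(lifetime, censor)
        delta = 1 if lifetime <= censor else 0   # 1 = event observed, 0 = censored
        data.append((time, delta))
    return data


def censoring_rate(data):
    """Proportion of censored observations in the simulated sample."""
    return sum(1 - delta for _, delta in data) / len(data)
```

Passing the SWGI or SWGII quantile function as `quantile`, fitting the model to each simulated sample, and repeating over many replications would then yield the AE, AB, and RMSE summaries. Since every cured subject is necessarily censored, the censoring rate rises with the cure fraction.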

The results of the simulation studies for SWGI and SWGII are given in Tables 1 and 2, respectively. It can be observed that the estimators of all the parameters appear to be consistent, as the ABs and RMSEs decrease with increasing sample size for all the cure fractions considered. It can also be observed that as the cure fraction increases, the censoring rate also increases. The AEs of the parameters are close to the true parameter values.

4. Nonmixture Cure Rate Models

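Suppose that a cancer patient has $N$ cancer cells remaining after treatment. Assume that $N$ is distributed as a Poisson random variable with mean $\theta$ and that these cells may grow rapidly and later produce detectable cancer. If $Z_j$ is the random time for the $j$th cancer cell to produce a detectable cancer mass, then the relapse time is $T = \min\{Z_1, \ldots, Z_N\}$, with $T = \infty$ when $N = 0$. Given that the $Z_j$ are independently and identically distributed with CDF and survival function $F(t)$ and $S(t)$, respectively, then the population survival function is defined as
\[ S_{pop}(t) = \exp\{-\theta F(t)\}, \]
where $\rho = \exp(-\theta)$ is the cure fraction. Alternatively,
\[ S_{pop}(t) = \exp\{\ln(\rho)\,F(t)\} = \rho^{F(t)}. \]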
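The form of the nonmixture survival function follows from averaging over the Poisson count. As a short sketch, with $\theta$ denoting the Poisson mean, $F$ and $S = 1 - F$ the common CDF and survival function of the $Z_j$, and $\rho$ the cure fraction (these symbols are assumptions chosen for illustration):

```latex
\begin{align*}
S_{pop}(t) &= P(N = 0) + \sum_{n=1}^{\infty} P(N = n)\,[S(t)]^{n}
            = \sum_{n=0}^{\infty} \frac{e^{-\theta}\theta^{n}}{n!}\,[S(t)]^{n} \\
           &= e^{-\theta + \theta S(t)} = e^{-\theta F(t)}, \\
\rho &= \lim_{t \to \infty} S_{pop}(t) = e^{-\theta}
      \quad\Longrightarrow\quad
S_{pop}(t) = \rho^{F(t)}, \\
f_{pop}(t) &= -\frac{d}{dt} S_{pop}(t) = -\ln(\rho)\,f(t)\,\rho^{F(t)}, \qquad
h_{pop}(t) = \frac{f_{pop}(t)}{S_{pop}(t)} = -\ln(\rho)\,f(t).
\end{align*}
```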

Thus, SWGI and SWGII nonmixture cure rate models are given, respectively, as

It should be noted that , where is the hazard rate function. Therefore, the log-likelihood function of a nonmixture cure rate model is given as

The estimates of the parameters are obtained by maximizing equation (28) directly.
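A hedged sketch of a nonmixture log-likelihood for right-censored data is given below. It assumes the population survival function $\rho^{F(t)}$ with cure fraction $\rho$, so that the population hazard is $-\ln(\rho) f(t)$; each observation contributes the log survival term, and each observed event additionally contributes the log hazard. The function names and the use of generic `pdf` and `cdf` callables are illustrative choices, not the paper's notation.

```python
import math


def nonmixture_survival(t, cure_prob, cdf):
    """Population survival of the nonmixture model: cure_prob ** F(t)."""
    return cure_prob ** cdf(t)


def nonmixture_cure_loglik(times, events, cure_prob, pdf, cdf):
    """Right-censored log-likelihood: sum of delta * log h_pop(t) + log S_pop(t)."""
    log_rho = math.log(cure_prob)                # negative for 0 < cure_prob < 1
    ll = 0.0
    for t, d in zip(times, events):
        if d == 1:
            ll += math.log(-log_rho * pdf(t))    # log population hazard
        ll += log_rho * cdf(t)                   # log population survival
    return ll
```

As with the mixture model, the SWGI or SWGII PDF and CDF would be supplied as the callables, and the resulting function would be maximized numerically.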

4.1. Simulation Studies

The steps for the simulation studies are similar to those used for the SWGI and SWGII mixture cure rate models. However, equation (28) is used for the estimation of the parameters of the model. Tables 3 and 4 show the simulation results for the SWGI and SWGII nonmixture cure rate models, respectively. The ABs and RMSEs again decrease as the sample size increases, suggesting that the maximum likelihood estimators for the parameters of the nonmixture cure rate models are also consistent.

5. Regression

In survival analysis, regression models with a cure fraction are useful. In this section, a regression model in which the time-to-event of competing causes of the event of interest follows the SWGI and SWGII distributions is considered. If the lifetimes are affected by covariates, then we develop a regression model that accounts for the covariates. To achieve this, we relate the cure fraction $\rho_i$ of subject $i$ to the covariates using the logit link function given as
\[ \rho_i = \frac{\exp(\mathbf{x}_i^{\top}\boldsymbol{\beta})}{1 + \exp(\mathbf{x}_i^{\top}\boldsymbol{\beta})}, \]
where $\mathbf{x}_i$ is the vector of covariates and $\boldsymbol{\beta}$ is the vector of regression coefficients. This gives the regression model structure as

The log-likelihood functions for the mixture cure rate and nonmixture cure rate regression models are obtained by substituting the cure fraction given by the logit link into equations (22) and (28), respectively. Thus, the estimates of the regression parameters can be obtained via the maximum likelihood method.
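The logit link maps a linear predictor to a cure fraction strictly between 0 and 1. A minimal sketch is shown below; the function name `cure_fraction` and the convention that the first coefficient is an intercept are assumptions for illustration.

```python
import math


def cure_fraction(covariates, coefs):
    """Cure fraction under the logit link.

    coefs[0] is the intercept; coefs[1:] pair with the entries of covariates.
    """
    eta = coefs[0] + sum(b * x for b, x in zip(coefs[1:], covariates))
    # Evaluate the logistic function in a form that avoids overflow for large |eta|.
    if eta >= 0.0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)
```

In a regression fit, each subject's cure fraction would be computed this way inside the log-likelihood, so that the regression coefficients are estimated together with the distribution parameters.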

6. Applications

In this section, the applications of the developed distributions and their corresponding mixture cure rate, nonmixture cure rate, and regression models are demonstrated.

6.1. Application of SWGI and SWGII Distribution

The usefulness of the SWGI and SWGII distributions is demonstrated in this section. The performance of the distributions is compared with the performance of the sine-Topp-Leone exponentiated exponential (STLEE) [14], sine-Weibull (SW), and Weibull (W) distributions. The distributions are compared using the Akaike information criterion (AIC), Bayesian information criterion (BIC), Hannan–Quinn information criterion (HQIC), Cramér–von Mises (CVM), and Anderson–Darling (AD) goodness-of-fit measures. The distribution with the least values of these measures and the highest $p$-values of the CVM and AD tests is considered the distribution that best fits the data.
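The three information criteria are simple functions of the maximized log-likelihood $\ell$, the number of estimated parameters $k$, and the sample size $n$. The sketch below uses their standard definitions; the function name and the numerical values in the usage note are hypothetical and are not results from the paper.

```python
import math


def information_criteria(loglik, k, n):
    """Return AIC, BIC, and HQIC from a maximized log-likelihood."""
    aic = 2 * k - 2 * loglik
    bic = k * math.log(n) - 2 * loglik
    hqic = 2 * k * math.log(math.log(n)) - 2 * loglik
    return {"AIC": aic, "BIC": bic, "HQIC": hqic}
```

For example, `information_criteria(-410.0, 3, 128)` evaluates the criteria for a hypothetical three-parameter fit to 128 observations; smaller values indicate a better trade-off between fit and complexity.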

The data used consist of remission times of 128 bladder cancer patients. The data are obtained from Lee and Wang [15] and are given as follows: 0.08, 2.09, 3.48, 4.87, 6.94, 8.66, 13.11, 23.63, 0.20, 2.23, 3.52, 4.98, 6.97, 9.02, 13.29, 0.40, 2.26, 3.57, 5.06, 7.09, 9.22, 13.80, 25.74, 0.50, 2.46, 3.64, 5.09, 7.26, 9.47, 14.24, 25.82, 0.51, 2.54, 3.70, 5.17, 7.28, 9.74, 14.76, 26.31, 0.81, 2.62, 3.82, 5.32, 7.32, 10.06, 14.77, 32.15, 2.64, 3.88, 5.32, 7.39, 10.34, 14.83, 34.26, 0.90, 2.69, 4.18, 5.34, 7.59, 10.66, 15.96, 36.66, 1.05, 2.69, 4.23, 5.41, 7.62, 10.75, 16.62, 43.01, 1.19, 2.75, 4.26, 5.41, 7.63, 17.12, 46.12, 1.26, 2.83, 4.33, 5.49, 7.66, 11.25, 17.14, 79.05, 1.35, 2.87, 5.62, 7.87, 11.64, 17.36, 1.40, 3.02, 4.34, 5.71, 7.93, 1.46, 18.10, 11.79, 4.40, 5.85, 8.26, 11.98, 19.13, 1.76, 3.25, 4.50, 6.25, 8.37, 12.02, 2.02, 13.31, 4.51, 6.54, 8.53, 12.03, 20.28, 2.02, 3.36, 12.07, 6.76, 21.73, 2.07, 3.36, 6.93, 8.65, 12.63, and 22.69.

Table 5 shows the parameter estimates of all the competing distributions with their corresponding standard errors and $p$-values.

Table 6 shows the information criteria and goodness-of-fit measures of the estimated distributions. It can be observed that the SWGI distribution has the least values of the information criteria and the goodness-of-fit statistics and the highest corresponding $p$-values of the goodness-of-fit tests. This is followed by the SWGII distribution.

The performance of the models is illustrated graphically using the estimated densities of the fitted distribution plotted over the histogram of the bladder cancer data. This is shown in Figure 3. It can be observed that SWGI and SWGII distributions best describe the histogram of the data as compared to the other distributions.

Probability-probability (P-P) plots of the fitted distributions are shown in Figure 4. It can be observed that SWGI and SWGII distributions have points clustering more along the diagonal line. Thus, these confirm that the SWGI and SWGII distributions best describe the bladder cancer data and can be used as alternative distributions to model lifetime data.

6.2. Cure Rate Models without Covariates

The usefulness of the cure rate models developed in this study is demonstrated in this section. The data used for the demonstration are the melanoma data from Eastern Cooperative Oncology Group (ECOG) phase III clinical trial e1684. The data are available in R package smcure [16]. The study was conducted from 1984 to 1990 and consisted of 286 patients. About 69% of the data are censored. A total of 284 observations are used for this analysis after missing data were deleted.

The mixture and nonmixture cure rate models based on SWGI and SWGII distributions are fitted to the data. Also, Weibull and exponentiated exponential (EE) mixture and nonmixture cure fraction models [10] are fitted to the data. The models are compared using AIC, BIC, and HQIC. Table 7 shows the parameter estimates of the fitted models with their corresponding standard errors.

Table 8 shows the goodness-of-fit measures of the fitted models. It can be observed that the SWGII mixture and nonmixture models provide a better fit to the data as they have the least of , AIC, BIC, and HQIC as compared to the other competing models.

To illustrate how the models fit the data, the fitted survival curves of the mixture and nonmixture cure rate models are overlaid on the Kaplan–Meier survival curve for each distribution. These are shown in Figure 5. It can be observed that the survival curves for the mixture and nonmixture SWGII cure rate models are closer to the Kaplan–Meier estimates than the other survival curves. This confirms the results in Table 8.

6.3. Cure Fraction Models with Covariates

The usefulness of the cure rate regression model developed in this research is also demonstrated. The data used are the melanoma data from the Eastern Cooperative Oncology Group (ECOG) phase III clinical trial e1684. The data are available in the R package smcure. Again, about 69% of the data are censored and a total of 284 observations are used for this analysis after missing data were deleted. The dependent variable is the survival time, while the covariates are age and gender. Thus, the regression structure used is of the form
\[ \rho_i = \frac{\exp(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2})}{1 + \exp(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2})}, \]
where $\rho_i$ is the cure fraction of subject $i$, and $x_{i1}$ and $x_{i2}$ represent the age and gender of subject $i$, respectively. Mixture and nonmixture regression models for the SWGI and SWGII distributions are fitted to the data. The performances of these models are compared with Weibull and EE mixture and nonmixture models. Table 9 shows the parameter estimates with their corresponding standard errors and $p$-values for all the fitted regression models. It can be observed that the covariates, age and gender, are not significant in any of the fitted regression models at the significance level, with gender being highly insignificant with a $p$-value greater than .

Table 10 shows the goodness-of-fit measures of the fitted regression models. It can be observed that the mixture and nonmixture regression models of SWGII distribution performed better than all the other competing regression models as they have the least of the information criteria.

The adequacy of the regression models is assessed via Cox–Snell residual analysis [17]. Given the CDF $F(t)$ of a distribution, the Cox–Snell residual is given as $r_i = -\log[1 - F(t_i)]$. If the model fits the data, the Cox–Snell residuals approximately follow the standard exponential distribution. A P-P plot of the empirical probabilities of the residuals against the theoretical probabilities from the standard exponential distribution can be used to check the behavior of the Cox–Snell residuals. Figure 6 shows the P-P plots for all the fitted regression models. It can be observed that the plotted points for the mixture and nonmixture cure rate SWGII regression models are closer to the diagonal as compared to the other regression models. This confirms the results in Table 10.
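The residual check described above can be sketched as follows. The example assumes the common definition of the Cox–Snell residual as minus the log of the fitted survival function and, as a simplification, ignores censoring when forming the empirical probabilities; the function names and the plotting positions $(i - 0.5)/n$ are illustrative choices.

```python
import math


def cox_snell_residuals(times, cdf):
    """Cox-Snell residuals r_i = -log(1 - F(t_i)) for a fitted CDF."""
    return [-math.log(1.0 - cdf(t)) for t in times]


def pp_points(residuals):
    """Pairs (theoretical, empirical) of probabilities against the standard exponential.

    Simplification: all residuals are treated as uncensored.
    """
    ordered = sorted(residuals)
    n = len(ordered)
    return [(1.0 - math.exp(-r), (i + 0.5) / n) for i, r in enumerate(ordered)]
```

Points from `pp_points` that lie close to the diagonal indicate that the residuals behave like a standard exponential sample, which is the pattern expected under a well-fitting model.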

7. Conclusion

In this study, two new distributions based on the sine-Weibull geometric distribution, known as the SWGI and SWGII distributions, were developed. Some properties, including the quantile functions and ordinary moments, were obtained. Plots of the hazard rate functions of the distributions show that the distributions exhibit nonmonotonic failure rates. The plots of the density functions of the distributions also show that they exhibit various desirable shapes. Mixture and nonmixture cure rate models based on these distributions were also developed. Simulation studies were performed to assess the estimators of the parameters of the models, and the results indicate that the estimators are consistent. Regression models based on the cure rate models were also developed. Finally, the performance of the distributions and the cure rate and regression models was demonstrated using real datasets. The results show that the developed distributions can serve as alternatives to existing models for fitting lifetime data. Also, the developed mixture and nonmixture cure rate and regression models can be used to perform survival analyses. Future work may consider using Bayesian methods to estimate the parameters of the sine-Weibull geometric distribution. Also, other zero-truncated distributions may be considered in place of the zero-truncated geometric distribution for describing the number of occurrences of the cases, with the sine-Weibull distribution describing the failure times.

Data Availability

The data used to support the findings of the research are either presented in the article or their sources are given in the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.