Abstract

Ranked set sampling is a useful method for collecting data when actual measurement of the units in a population is difficult or expensive. The generalized quasi-Lindley distribution was recently proposed as a new continuous lifetime distribution. In this article, ranked set sampling is used to estimate the parameters of the generalized quasi-Lindley distribution. Six estimation methods are considered: maximum likelihood, maximum product of spacings, ordinary least squares, weighted least squares, Cramer–von Mises, and Anderson–Darling. The performance of the proposed ranked set sampling estimators is assessed through a simulation study in terms of bias and mean squared error and compared with that of their simple random sampling counterparts. Additional results are obtained from real data on the survival times of 72 guinea pigs and on 23 ball bearings. Both the simulation study and the real data applications show that the proposed ranked set sampling estimators outperform their simple random sampling competitors based on the same number of measured units.

1. Introduction

Cost-effective sampling is an important area of statistics. Its appeal lies in easing data collection, especially when measuring the variable of interest is time-consuming or expensive. Over the past decades, researchers have developed various sampling methods to obtain reliable and more accurate results at low cost. McIntyre [1] proposed a new sampling method for estimating the mean of pasture and forage yields in Australia. This method is known as ranked set sampling (RSS) and is more efficient than the commonly used simple random sampling (SRS). Later, Halls and Dell [2]; Takahasi and Wakimoto [3]; and Dell and Clutter [4] published further studies on the RSS method. Owing to its cost-effectiveness, RSS is used in a wide range of applications, including reliability, estimation of population parameters, statistical quality control, medicine, and acceptance sampling plans (see Chen et al. [5–7], Haq et al. [8], and Al-Omari and Haq [9, 10]).

It is well documented that the RSS method is an attractive procedure that adapts well to the nature of the underlying data. This advantage has motivated researchers to propose and study new RSS schemes (see, for example, double ranked set sampling by Al-Saleh and Al-Kadiri [11]; median ranked set sampling by Muttlak [12]; neoteric ranked set sampling by Zamanzade and Al-Omari [13]; extreme ranked set sampling by Samawi et al. [14]; and L ranked set sampling by Al-Nasser [15]). For a good review of RSS and further motivation for this method, one can refer to Al-Omari and Bouza [16]; Al-Hadhrami and Al-Omari [17]; Haq et al. [18]; Santiago et al. [19]; and Haq et al. [20].

The problem of estimating distribution parameters has, in general, been considered by many authors. For example, Nassar et al. [21] considered parameter estimation for the new extension of the Weibull distribution. Nassar et al. [22] treated the estimation problem for the alpha power exponential distribution. Later, Afify and Mohamed [23] and Afify et al. [25] dealt with parameter estimation for the new three-parameter exponential distribution and the Weibull Marshall–Olkin Lindley distribution, respectively. Recently, Alfaer et al. (2021) considered the extended log-logistic distribution and estimated its parameters. Although these works considered parameter estimation for related distributions, they did not deal with sampling design techniques.

Undoubtedly, parametric estimation based on a sampling design plays a vital role in statistical inference. Many studies have considered the estimation of parameters based on RSS designs and their extensions using different estimation methods. Yousef and Al-Subh [26] estimated the Gumbel parameters using the maximum likelihood method, the method of moments, and the method of regression. Hussian [27] used the Bayesian and maximum likelihood estimation methods to estimate the Kumaraswamy distribution parameters. Chen et al. [28] estimated the scale parameter for the scale distribution using moving extreme ranked set sampling, and Abu-Dayyeh et al. [29] considered the logistic method for parameter estimation based on both SRS and RSS. Pedroso et al. [30] used RSS to estimate the parameters of the two-parameter Birnbaum–Saunders distribution, and Akgul et al. [31] used RSS in system reliability estimation for the generalized inverse Lindley distribution (see also Taconeli and Bonat [32] for some estimation methods based on RSS). For more details on parameter estimation, readers are referred to Stokes [33]; Dey et al. [34]; Khamnei and Mayan [35]; and Al-Saleh and Al-Hadhrami [36].

This paper studies the performance of the RSS design in estimating the parameters of the generalized quasi-Lindley distribution (GQLD) introduced recently by Benchiha and Al-Omari [37]. A random variable is said to follow a GQLD with parameters and if its pdf is given by with cdf given by

The first two moments of , respectively, are

The variance of the GQLD distribution is given by

The corresponding reliability and hazard functions of the GQLD distribution, for , are given, respectively, by

To the best of our knowledge, no published work has used RSS to estimate the parameters of the GQLD. The remainder of this paper is organized as follows. Section 2 explains the RSS method and presents the suggested estimators for the GQLD. In Section 3, a simulation study is provided to investigate the performance of the RSS estimators relative to their SRS counterparts based on the same number of measured units. Applications to real datasets fitted to the GQLD are given in Section 4. Section 5 concludes the paper with remarks and suggestions for future work.

2. Methods of Estimation

In this section, six methods of estimation are considered for estimating the unknown parameters and of the GQLD distribution using RSS design. These methods are the maximum likelihood method, method of maximum product of spacings, ordinary least squares method, weighted least squares method, Cramer–von Mises method, and Anderson–Darling method. The RSS strategy can be described as follows:
(1) Select a simple random sample of size units from the desired population. Randomly partition them into sets of each size , where is known as the set size.
(2) Rank the units within each set of size from smallest to largest with respect to the variable of interest.
(3) Obtain the order statistic from the set, for , 2, …, .
(4) Repeat Steps (1)–(3), times (cycles) if needed, to get a ranked set sample of size .

The resulting sample is denoted as , where is the largest unit in a set of size in the cycle. It is of interest to note that perfect ranking is assumed in this study.
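The RSS steps above can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the function name `ranked_set_sample` and its arguments are hypothetical, an exponential population stands in for the GQLD (whose density is not reproduced here), and ranking is perfect, that is, units are ranked by their actual values. Drawing a fresh set for each rank is equivalent to partitioning one larger simple random sample into sets.

```python
import random

def ranked_set_sample(draw, set_size, cycles, rng):
    """Return a ranked set sample as a list of cycles.

    draw(rng) returns one observation from the population.
    Each cycle holds set_size measured units; perfect ranking is assumed.
    """
    sample = []
    for _ in range(cycles):
        cycle = []
        for i in range(set_size):
            # Draw a fresh set of set_size units and rank it by actual value.
            ranked = sorted(draw(rng) for _ in range(set_size))
            # Measure only the (i + 1)-th smallest unit of the (i + 1)-th set.
            cycle.append(ranked[i])
        sample.append(cycle)
    return sample

# Stand-in population: exponential with rate 1 (an assumption, not the GQLD).
rng = random.Random(1)
rss = ranked_set_sample(lambda r: r.expovariate(1.0), set_size=5, cycles=3, rng=rng)
```

With set size 5 and 3 cycles, 75 units are drawn and ranked but only 15 are measured, which is where the cost saving of the design comes from.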

2.1. Maximum Likelihood Estimation

Let denote the order statistics from the set of size at the cycle, and take it as the RSS data for of a sample size . Then, the likelihood function based on the RSS sample is given by where

The log-likelihood function, , is

The estimators of and of the GQLD using RSS can be obtained by solving the nonlinear equations
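As a hedged illustration of the likelihood under perfect ranking, the sketch below evaluates the standard RSS log-likelihood, in which each measured unit contributes the density of the corresponding order statistic from a set of the given size. The helper names `rss_log_likelihood` and `exponential` are hypothetical, and a one-parameter exponential model is used as a stand-in because the GQLD pdf and cdf are not reproduced here; the GQLD functions would be plugged in the same way.

```python
import math

def rss_log_likelihood(sample, pdf, cdf):
    """Log-likelihood of a ranked set sample under perfect ranking.

    sample[j][i - 1] is the i-th order statistic of a set of size m in cycle j.
    """
    total = 0.0
    for cycle in sample:
        m = len(cycle)
        for i, x in enumerate(cycle, start=1):
            F = cdf(x)
            # log of m! / ((i - 1)! (m - i)!)
            total += math.lgamma(m + 1) - math.lgamma(i) - math.lgamma(m - i + 1)
            if i > 1:
                total += (i - 1) * math.log(F)
            if i < m:
                total += (m - i) * math.log(1.0 - F)
            total += math.log(pdf(x))
    return total

def exponential(rate):
    """Stand-in model (assumption): pdf and cdf of an exponential distribution."""
    pdf = lambda x: rate * math.exp(-rate * x)
    cdf = lambda x: -math.expm1(-rate * x)
    return pdf, cdf

# Small made-up sample: 2 cycles, set size 3.
toy_sample = [[0.1, 0.5, 1.2], [0.2, 0.7, 2.0]]
ll_rate_1 = rss_log_likelihood(toy_sample, *exponential(1.0))
ll_rate_5 = rss_log_likelihood(toy_sample, *exponential(5.0))
```

In practice the log-likelihood would be maximized numerically over the model parameters, since the likelihood equations have no closed-form solution.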

2.2. Method of Maximum Product of Spacings

Cheng and Amin [38, 39] introduced the method of maximum product of spacings (MPS). This method is based on maximization of the geometric mean of spacings in the data. The MPS estimator is consistent and efficient in most general cases. Consider the ordered units that form a ranked set sample of size where is the number of cycles and is the set size from the GQLD. Then, the uniform spacing is given by where and . Clearly, .

The MPS estimators, and , are the estimates of and , which maximize the geometric mean of the spacing , where

The natural logarithm of (11) is

The estimators and of the parameters and , respectively, can also be obtained by solving the nonlinear equations: where and can then be obtained numerically.
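The MPS criterion can be sketched as follows, using the usual Cheng and Amin form: the spacings are differences of the cdf at consecutive ordered observations, with the cdf taken as 0 below the smallest and 1 above the largest observation, and the objective is the log of their geometric mean. The function name `mps_objective` and the exponential stand-in cdf are assumptions made for illustration; tied observations give zero spacings and are not handled here.

```python
import math

def mps_objective(values, cdf):
    """Log geometric mean of the uniform spacings (to be maximized)."""
    x = sorted(values)
    u = [0.0] + [cdf(v) for v in x] + [1.0]
    spacings = [u[i] - u[i - 1] for i in range(1, len(u))]
    return sum(math.log(d) for d in spacings) / len(spacings)

def exp_cdf(rate):
    """Stand-in cdf (assumption): exponential distribution with the given rate."""
    return lambda x: -math.expm1(-rate * x)

toy_values = [0.1, 0.5, 1.2, 0.2, 0.7, 2.0]
mps_rate_1 = mps_objective(toy_values, exp_cdf(1.0))
mps_rate_5 = mps_objective(toy_values, exp_cdf(5.0))
```

Because the spacings sum to 1, the objective is bounded above by the log of 1 over the number of spacings, which is attained only when all spacings are equal.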

2.3. Methods of Least Squares

Swain et al. [40] were the first to use the method of least squares to estimate the parameters of the beta distribution, based on a well-known result in probability theory which indicates that where is a cumulative distribution function and is the th order statistic of the random sample . Therefore, in our case, we have

Using the above expectations and variances, we obtain two variants of the least squares methods.

2.3.1. Ordinary Least Squares

Let the ordered units constitute a ranked set sample of size . Then, the ordinary least squares (OLS) estimators, say and of the parameters and , respectively, can be obtained by minimizing the function: with respect to and . Alternatively, these estimates can also be obtained by solving the following nonlinear equations: where and are defined as in (14) and (15), respectively.

2.3.2. Weighted Least Squares

Consider the RSS units that form a ranked set sample of size . Then, the weighted least squares (WLS) estimators of and , say and , respectively, can be obtained by minimizing the following function: with respect to and . Equivalently, the estimates are the solution of the following nonlinear equations: where and are specified as in (14) and (15), respectively.
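The two least squares criteria can be illustrated with the sketch below. It assumes the standard forms that follow from the order-statistic result cited above: for N ordered observations, the cdf at the i-th one has mean i/(N + 1) and variance i(N - i + 1)/((N + 1)^2 (N + 2)), and the WLS weights are the reciprocals of these variances. The function names and the exponential stand-in cdf are hypothetical; the GQLD cdf would replace it.

```python
import math

def ols_objective(values, cdf):
    """Sum of squared differences between cdf values and i / (N + 1)."""
    x = sorted(values)
    n = len(x)
    return sum((cdf(v) - i / (n + 1)) ** 2 for i, v in enumerate(x, start=1))

def wls_objective(values, cdf):
    """Same differences, weighted by the inverse variance of the i-th uniform order statistic."""
    x = sorted(values)
    n = len(x)
    total = 0.0
    for i, v in enumerate(x, start=1):
        weight = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1))
        total += weight * (cdf(v) - i / (n + 1)) ** 2
    return total

def exp_cdf(rate):
    """Stand-in cdf (assumption): exponential distribution with the given rate."""
    return lambda x: -math.expm1(-rate * x)

# Data placed exactly at the Exp(1) quantiles i / (N + 1): both criteria vanish at rate 1.
n_obs = 6
quantile_data = [-math.log1p(-i / (n_obs + 1)) for i in range(1, n_obs + 1)]
ols_at_truth = ols_objective(quantile_data, exp_cdf(1.0))
wls_at_truth = wls_objective(quantile_data, exp_cdf(1.0))
ols_off = ols_objective(quantile_data, exp_cdf(3.0))
```

The weights are largest for the smallest and largest observations, so WLS penalizes misfit in the tails more heavily than OLS does.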

2.4. Methods of Minimum Distances

Methods of estimation based on minimizing some famous goodness of fit statistics are useful in many cases and give good results. Here, two popular methods based on the minimization of test statistics between the theoretical and empirical cumulative distribution functions are considered. The methods are the Cramer–von Mises method and Anderson–Darling method (for more details, see D’Agostino and Stephens [41] and Luceño [42]).

2.4.1. Cramer–von Mises Method

Consider the ordered units that form a ranked set sample of size . Then, the Cramer–von Mises estimators (CV) and of and , respectively, are obtained by minimizing the function with respect to and . Equivalently, the estimates are the solution of the following nonlinear equations: where and are given in (14) and (15), respectively.

2.4.2. Anderson–Darling Method

Suppose that is a ranked set sample of size . Then, the estimates based on the Anderson–Darling (AD) method for the GQLD distribution parameters and , denoted by and , can be obtained by minimizing the function with respect to and , or equivalently by solving the following two equations: where and are as specified in (14) and (15), respectively.
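For illustration, the two minimum distance criteria can be written in their standard computational forms, as in the sketch below. The Cramer–von Mises objective is 1/(12N) plus the sum of squared differences between the cdf at the i-th ordered observation and (2i - 1)/(2N); the Anderson–Darling objective is the usual weighted log form. The function names and the exponential stand-in cdf are assumptions, not part of the original study.

```python
import math

def cramer_von_mises_objective(values, cdf):
    """1/(12N) + sum over i of (F(x_(i)) - (2i - 1)/(2N))^2."""
    x = sorted(values)
    n = len(x)
    return 1.0 / (12 * n) + sum(
        (cdf(v) - (2 * i - 1) / (2 * n)) ** 2 for i, v in enumerate(x, start=1)
    )

def anderson_darling_objective(values, cdf):
    """-N - (1/N) * sum over i of (2i - 1) [log F(x_(i)) + log(1 - F(x_(N+1-i)))]."""
    x = sorted(values)
    n = len(x)
    total = 0.0
    for i in range(1, n + 1):
        total += (2 * i - 1) * (
            math.log(cdf(x[i - 1])) + math.log(1.0 - cdf(x[n - i]))
        )
    return -n - total / n

def exp_cdf(rate):
    """Stand-in cdf (assumption): exponential distribution with the given rate."""
    return lambda x: -math.expm1(-rate * x)

toy_values = [0.1, 0.5, 1.2, 0.2, 0.7, 2.0]
ad_rate_1 = anderson_darling_objective(toy_values, exp_cdf(1.0))
ad_rate_5 = anderson_darling_objective(toy_values, exp_cdf(5.0))

# Data at the Exp(1) quantiles (2i - 1)/(2N) attain the CvM lower bound 1/(12N).
n_obs = 4
mid_quantiles = [-math.log1p(-(2 * i - 1) / (2 * n_obs)) for i in range(1, n_obs + 1)]
cvm_at_truth = cramer_von_mises_objective(mid_quantiles, exp_cdf(1.0))
```

Both objectives would be minimized numerically over the model parameters; the Anderson–Darling form puts more weight on the tails of the distribution.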

3. Simulation

To evaluate the performance of the estimation methods under RSS, a simulation study is conducted using R. A total of 1000 samples are generated from the GQLD with different parameter values as , and in different sizes for both RSS and SRS. For the RSS design, the number of cycles is selected to be = 3, 4, and 5, while the set size is taken as 5, 10, and 15. In each case, we combine with all values of to study the effect of the set size and the number of cycles. For the SRS design, the size of SRS is , which is required to have the same sample size in each design. Perfect ranking is assumed. The mean squared error (MSE) is calculated for each estimator in order to compare SRS and RSS. The MSE and the efficiency (Eff) are calculated by
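The comparison described above can be sketched as a small Monte Carlo experiment. The sketch assumes the usual definitions: the MSE is the average squared deviation of the replicated estimates from the true parameter value, and Eff is the ratio of the SRS MSE to the RSS MSE, so that values above 1 favour RSS. To stay self-contained it uses Python rather than R, an exponential population as a stand-in for the GQLD, and the sample mean as a stand-in estimator; all function names are hypothetical.

```python
import random

def mse(estimates, true_value):
    """Mean squared error of replicated estimates around the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

def efficiency(srs_estimates, rss_estimates, true_value):
    """Eff = MSE(SRS) / MSE(RSS); values above 1 favour RSS."""
    return mse(srs_estimates, true_value) / mse(rss_estimates, true_value)

def rss_values(rng, set_size, cycles):
    """Flat ranked set sample from a stand-in Exp(1) population, perfect ranking."""
    out = []
    for _ in range(cycles):
        for i in range(set_size):
            ranked = sorted(rng.expovariate(1.0) for _ in range(set_size))
            out.append(ranked[i])
    return out

def mean(values):
    return sum(values) / len(values)

rng = random.Random(2021)
set_size, cycles, replications = 5, 3, 1000
true_mean = 1.0  # mean of the stand-in Exp(1) population

# Both designs measure set_size * cycles units per replication.
srs_est = [mean([rng.expovariate(1.0) for _ in range(set_size * cycles)])
           for _ in range(replications)]
rss_est = [mean(rss_values(rng, set_size, cycles)) for _ in range(replications)]
eff = efficiency(srs_est, rss_est, true_mean)
```

The same loop structure applies to any of the six estimators: only the line that turns a sample into an estimate changes.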

The results are reported in Tables 1–8 for the parameter estimates (Es), the MSE, and the Eff.

Based on the simulation results presented in Tables 1–8, one can observe the following:
(i) The values of the efficiency given in the tables are larger than 1. Hence, we can say that the suggested RSS estimators perform better than their SRS counterparts based on all methods considered in this study.
(ii) Under the RSS design, the MSE values decrease as the number of cycles increases. For example, when = 5, the MSEs of the estimators of using the MLE method are 0.015, 0.011, and 0.009 for = 3, 4, and 5, respectively.
(iii) Under the RSS design, the MSE decreases as the set size increases. For illustration, when = 5, the MSEs of estimators of using AD method are 1.148, 0.189, and 0.082 for = 5, 10, and 15, respectively.
(iv) The MSE of the SRS estimators is decreasing when is increasing.
(v) In most cases, the efficiency is increasing when is increasing. For instance, from Table 6, the values of efficiency of using the CV method when = 3 are 2.707, 5.114, and 8.812 for = 5, 10, and 15, respectively.
(vi) The MLE estimators perform better than the other methods under RSS and SRS designs for most cases presented in the tables.
(vii) The values of the estimated parameters indicate that the bias values are negligible and go to zero as the sample size increases.

4. Application to Real Datasets

In this section, we illustrate the performance of the suggested estimators based on RSS design for two well-known real datasets. The first dataset presents the survival times (in days) of 72 guinea pigs infected with virulent tubercle bacilli, observed and reported by Bjerkedal [43]. The data were previously studied by Afify et al. [44]. The data observations are as follows.

Dataset 1. 0.1, 0.33, 0.44, 0.56, 0.59, 0.72, 0.74, 0.77, 0.92, 0.93, 0.96, 1, 1, 1.02, 1.05, 1.07, 1.07, 1.08, 1.08, 1.08, 1.09, 1.12, 1.13, 1.15, 1.16, 1.2, 1.21, 1.22, 1.22, 1.24, 1.3, 1.34, 1.36, 1.39, 1.44, 1.46, 1.53, 1.59, 1.6, 1.63, 1.63, 1.68, 1.71, 1.72, 1.76, 1.83, 1.95, 1.96, 1.97, 2.02, 2.13, 2.15, 2.16, 2.22, 2.3, 2.31, 2.4, 2.45, 2.51, 2.53, 2.54, 2.54, 2.78, 2.93, 3.27, 3.42, 3.47, 3.61, 4.02, 4.32, 4.58, 5.55.

The second dataset represents the number of million revolutions before failure for each of the 23 ball bearings in the life tests. It was presented by Lawless [45], and the data observations are as follows.

Dataset 2. 17.88, 28.92, 33.0, 41.52, 42.12, 45.6, 48.8, 51.84, 51.96, 54.12, 55.56, 67.8, 68.44, 68.88, 84.12, 93.12, 98.64, 105.12, 105.84, 105.84, 127.92, 128.04, 173.4.

First, we fitted the GQLD model to both datasets. Then, we considered the Kolmogorov–Smirnov (KS) test with its p value for quantifying the distance between the empirical distribution function of the real data and the cumulative distribution function using the estimated parameters in each dataset. The results are summarized in Table 9. As shown in this table, the p value in each dataset is greater than 5%, which indicates that the GQLD model fits both datasets well.

To show the superiority of RSS over SRS using the different estimation methods, we again considered the Kolmogorov–Smirnov (KS) test, now for quantifying the distance between the empirical distribution function of the real data and the cumulative distribution function using the estimated parameters in each design, based on the choice of n and k. Note that we used the KS here as an alternative to the mean squared error and relative bias, and it is defined in our case as where is the sample size. Of course, estimators with lower KS values and higher p values (greater than 5%) are better than their competitors. Recall that the MLE estimators based on SRS for all datasets are considered the real population parameters.
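As an illustration of the KS distance used here, the sketch below computes the largest gap between the empirical distribution function of the data and a fitted cdf, checking both sides of each jump of the empirical distribution function. The function name `ks_distance` is hypothetical, and the uniform and exponential cdfs are stand-ins for a fitted GQLD cdf; the p value is not computed in this sketch.

```python
import math

def ks_distance(values, cdf):
    """Kolmogorov-Smirnov distance: sup over x of |F_N(x) - F(x)|."""
    x = sorted(values)
    n = len(x)
    return max(
        max(i / n - cdf(v), cdf(v) - (i - 1) / n)
        for i, v in enumerate(x, start=1)
    )

# Check against a case that can be worked by hand: uniform cdf on [0, 1].
uniform_cdf = lambda x: min(max(x, 0.0), 1.0)
d_uniform = ks_distance([0.25, 0.75], uniform_cdf)

# Stand-in fitted model (assumption): exponential with two candidate rates.
toy_values = [0.1, 0.5, 1.2, 0.2, 0.7, 2.0]
d_rate_1 = ks_distance(toy_values, lambda x: -math.expm1(-1.0 * x))
d_rate_5 = ks_distance(toy_values, lambda x: -math.expm1(-5.0 * x))
```

A smaller distance means the fitted cdf tracks the data more closely, which is how the RSS and SRS parameter estimates are compared on the real datasets.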

For the first dataset, we considered the SRS design, and for each estimation method, we calculated the estimators using a sample of size . Then, we used a sample of sizes = 2 and = 6 for calculating estimators using the RSS design, and based on the cycles shown in Table 10, we compared SRS and RSS designs in terms of the KS distance value and p value. The results are given in Table 11, and the corresponding fittings are displayed in Figure 1.

For the second dataset, the sample size is selected as in the SRS design, while we used and for calculating estimators based on the RSS design using the cycles in Table 12. The estimators’ values, KS distances, and p values are computed and summarized in Table 13, and the corresponding fittings are displayed in Figure 2.

The results in Tables 11 and 13 indicate that, for the estimates based on the RSS design, the KS distance values are smaller than their counterparts under the SRS design, and the corresponding p values based on the RSS estimators are greater than those under the SRS design. Figures 1 and 2 support this claim.

5. Conclusion

In this paper, RSS-based estimation is presented for the GQLD. Six estimation methods are considered: maximum likelihood, maximum product of spacings, ordinary least squares, weighted least squares, Cramer–von Mises, and Anderson–Darling. The performance of the proposed estimators is compared with that of their SRS counterparts using a simulation study and two real data applications. The numerical simulation results demonstrate that the proposed RSS estimators are better than their SRS counterparts in terms of the MSE for all results presented in the tables based on the same number of measured units. The results of the real data also confirm the superiority of the RSS design over the SRS design.

In future work, the authors are interested in modifying the GQLD to the transmuted GQLD (see, for example, [46]) and estimating its parameters using the modified robust extreme ranked set sampling [47].

Data Availability

The real datasets related to guinea pigs and ball bearings used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors thank the Deanship of Scientific Research at King Khalid University for supporting and funding this work through the research group program under grant no. R.G.P. 2/82/42.