Abstract
In this paper, we propose a generalized class of exponential-type estimators for the finite population distribution function using dual auxiliary variables under stratified sampling. The biases and mean squared errors (MSEs) of the proposed class of estimators are derived up to the first order of approximation. The proposed class is compared with existing estimators both theoretically and empirically. Four populations are taken for the support of the theoretical findings. The proposed class of estimators is observed to perform better than all other estimators considered under stratified sampling.
1. Introduction
In survey sampling, auxiliary information is often used to increase the precision of estimators of population parameters such as the mean, median, distribution function, quantiles, and standard deviation. Many such estimators exist in the literature, and they require information on one or two auxiliary variables.
Our primary goal is to enhance the precision of the estimator; for this reason, we use stratified random sampling. If the population of interest is homogeneous, simple random sampling (SRS) performs well. When the population is heterogeneous, however, it is advisable to use stratified random sampling instead. In stratified random sampling, the whole population is split into a number of nonoverlapping groups called strata. Each stratum is internally homogeneous and is handled as a separate population, so a sample is drawn independently from each stratum. To obtain the maximum benefit from stratification, the stratum sizes $N_h$ must be known.
In other words, if SRS is used in each stratum to select the sample, the resulting sample is called a stratified random sample. Good stratification requires each stratum to be internally homogeneous while the strata differ from one another. Stratification often produces gains in the precision of estimates. The size of the simple random sample selected from each stratum depends on the size of that stratum. Estimates are first computed within each stratum and then combined into a single, more precise estimate of the population parameter.
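The sampling scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names `stratified_sample` and `stratified_mean`, the toy population, and the sample sizes are all hypothetical, and the stratum weights are assumed to be the usual proportions of the population size.

```python
import random

def stratified_sample(strata, sizes, seed=0):
    """Draw a simple random sample without replacement, independently per stratum."""
    rng = random.Random(seed)
    return [rng.sample(units, n_h) for units, n_h in zip(strata, sizes)]

def stratified_mean(strata, samples):
    """Combine the stratum sample means using weights N_h / N (assumed weights)."""
    N = sum(len(units) for units in strata)
    return sum(len(units) / N * (sum(s) / len(s))
               for units, s in zip(strata, samples))

# Hypothetical toy population: two internally homogeneous strata.
strata = [[10, 11, 12, 13, 14, 15], [50, 52, 54, 56]]
samples = stratified_sample(strata, sizes=[3, 2])
estimate = stratified_mean(strata, samples)
```

Because each stratum is sampled separately, the within-stratum variation is small here even though the two strata differ widely, which is the situation in which stratification is expected to help.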
Much work has been done on the estimation of the population mean. Some important references on population mean estimation using auxiliary information include Diana [1], Kadilar and Cingi [2, 3], Shabbir and Gupta [4, 5], Haq and Shabbir [6], Aladag and Cingi [7], Singh and Khalid [8], Malik and Singh [9], Muneer et al. [10], Shabbir and Gupta [11], Haq et al. [12], Kaur et al. [13], Ahmad and Shabbir [14], Singh and Khalid [15], Al-Marzouki et al. [16], Ahmad et al. [17], Ahmad et al. [18], and Ahmad et al. [19]. In these works, the authors have suggested improved ratio, product, and regression-type estimators for estimating the finite population mean. They introduced estimators that use auxiliary information to estimate the population mean and total.
In the sampling literature, several authors have estimated the distribution function (DF) using information on one or more auxiliary variables. Chambers and Dunstan [20] suggested an estimator of the DF that requires information on both the study and auxiliary variables. Similarly, Rao et al. [21] and Rao [22] suggested ratio and difference/regression estimators of the DF under a general sampling design. Kuk [23] suggested a kernel method for estimating the DF using auxiliary information. Ahmed and Abu-Dayyeh [24] estimated the DF using information on multiple auxiliary variables. Rueda et al. [25] used a calibration approach to devise an estimator of the DF. Singh et al. [26] considered the problem of estimating the DF and quantiles with the use of auxiliary information at the estimation stage of a survey. Moreover, Yaqub and Shabbir [27], Hussain et al. [28], and Hussain et al. [29] considered a generalized class of estimators of the DF in the presence of non-response, while Hussain et al. [30] proposed two new families of estimators using dual auxiliary information under simple and stratified random sampling. Furthermore, Ahmad et al. [31] suggested a new estimator of the DF using auxiliary information.
In this paper, we propose a new estimator of the DF that uses information on both the distribution function and the mean of the auxiliary variable. The biases and mean squared errors (MSEs) of the existing and proposed estimators of the DF are derived up to the first order of approximation. The theoretical and numerical comparisons indicate that the proposed estimator is more precise than the existing adapted estimators of the DF.
The rest of the paper is organized as follows. In Section 2, the notation is given. In Section 3, some existing estimators of the finite population mean, adapted for estimating the finite population DF, are reviewed. The proposed estimator is given in Section 4. In Sections 5 and 6, theoretical and numerical comparisons are made, respectively. In Section 7, the results in the tables are interpreted. Finally, conclusions are drawn in Section 8.
2. Notation
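Consider a finite population of $N$ distinct units, which is divided into $L$ homogeneous strata, where the size of the $h$th stratum is $N_h$, for $h = 1, 2, \ldots, L$, such that $\sum_{h=1}^{L} N_h = N$. Let $Y$ and $X$ be the study and auxiliary variables, which take values $y_{hi}$ and $x_{hi}$, respectively, where $i = 1, 2, \ldots, N_h$ and $h = 1, 2, \ldots, L$. For estimating the finite population distribution function, assume that a sample of size $n_h$ is drawn from the $h$th stratum using simple random sampling without replacement, such that $\sum_{h=1}^{L} n_h = n$, where $n$ is the sample size.
$Y$: the study variable.
$X$: the auxiliary variable.
Let $t_y$ and $t_x$ be the values at which the distribution functions of $Y$ and $X$ are evaluated, and
$I(Y \le t_y)$: indicator variable based on $Y$,
$I(X \le t_x)$: indicator variable based on $X$,
$F_h(t_y) = \sum_{i=1}^{N_h} I(y_{hi} \le t_y)/N_h$: the population distribution function of $Y$ for the $h$th stratum,
$F(t_y) = \sum_{h=1}^{L} W_h F_h(t_y)$: the population distribution function of $Y$,
$F_h(t_x) = \sum_{i=1}^{N_h} I(x_{hi} \le t_x)/N_h$: the population distribution function of $X$ for the $h$th stratum,
$F(t_x) = \sum_{h=1}^{L} W_h F_h(t_x)$: the population distribution function of $X$,
$\bar{X}_h = \sum_{i=1}^{N_h} x_{hi}/N_h$: the population mean of $X$ for the $h$th stratum,
$\bar{X} = \sum_{h=1}^{L} W_h \bar{X}_h$: the population mean of $X$,
$\hat{F}_h(t_y) = \sum_{i=1}^{n_h} I(y_{hi} \le t_y)/n_h$: the sample distribution function of $Y$ for the $h$th stratum,
$\hat{F}_{st}(t_y) = \sum_{h=1}^{L} W_h \hat{F}_h(t_y)$: the sample distribution function of $Y$,
$\hat{F}_h(t_x) = \sum_{i=1}^{n_h} I(x_{hi} \le t_x)/n_h$: the sample distribution function of $X$ for the $h$th stratum,
$\hat{F}_{st}(t_x) = \sum_{h=1}^{L} W_h \hat{F}_h(t_x)$: the sample distribution function of $X$,
$\bar{x}_h = \sum_{i=1}^{n_h} x_{hi}/n_h$: the sample mean of $X$ for the $h$th stratum,
$\bar{x}_{st} = \sum_{h=1}^{L} W_h \bar{x}_h$: the sample mean of $X$,
$S_{1h}^2$: the population variance of $I(Y \le t_y)$ for the $h$th stratum,
$S_{2h}^2$: the population variance of $I(X \le t_x)$ for the $h$th stratum,
$S_{3h}^2$: the population variance of $X$ for the $h$th stratum,
$C_{1h}$: the population coefficient of variation of $I(Y \le t_y)$ for the $h$th stratum,
$C_{2h}$: the population coefficient of variation of $I(X \le t_x)$ for the $h$th stratum,
$C_{3h}$: the population coefficient of variation of $X$ for the $h$th stratum,
$S_{12h}$: the population covariance between $I(Y \le t_y)$ and $I(X \le t_x)$, for the $h$th stratum,
$S_{13h}$: the population covariance between $I(Y \le t_y)$ and $X$, for the $h$th stratum,
$S_{23h}$: the population covariance between $I(X \le t_x)$ and $X$, for the $h$th stratum,
$\rho_{12h}$: the population correlation coefficient between $I(Y \le t_y)$ and $I(X \le t_x)$ for the $h$th stratum,
$\rho_{13h}$: the population correlation coefficient between $I(Y \le t_y)$ and $X$ for the $h$th stratum,
$\rho_{23h}$: the population correlation coefficient between $I(X \le t_x)$ and $X$ for the $h$th stratum,
$\rho_{12}$: the population correlation coefficient between $I(Y \le t_y)$ and $I(X \le t_x)$,
$\rho_{13}$: the population correlation coefficient between $I(Y \le t_y)$ and $X$,
$\rho_{23}$: the population correlation coefficient between $I(X \le t_x)$ and $X$, where $W_h = N_h/N$,
$R_{1.23}$: multiple correlation coefficient of $I(Y \le t_y)$ on $I(X \le t_x)$ and $X$.
In order to obtain the biases and mean squared errors (MSEs) of the adapted and proposed estimators of $F(t_y)$, we consider the following relative error terms.
Let $\xi_0 = (\hat{F}_{st}(t_y) - F(t_y))/F(t_y)$, $\xi_1 = (\hat{F}_{st}(t_x) - F(t_x))/F(t_x)$, and $\xi_2 = (\bar{x}_{st} - \bar{X})/\bar{X}$, such that $E(\xi_i) = 0$ for $i = 0, 1, 2$, where $E(\cdot)$ is the mathematical expectation of $(\cdot)$.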
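As a hedged illustration of the notation, the following Python sketch computes a stratified distribution function estimate as a weighted sum of within-stratum proportions, together with its variance under simple random sampling without replacement in each stratum. The function names, the toy data, and the threshold are hypothetical; the variance uses the standard textbook formula with the indicator variance taken with divisor (stratum size minus one), which is an assumption about the intended definition.

```python
def indicator_mean(values, t):
    """Proportion of values not exceeding t, i.e. the mean of the indicator I(value <= t)."""
    return sum(v <= t for v in values) / len(values)

def df_stratified(samples, stratum_sizes, t):
    """Weighted sum over strata of the within-stratum proportions, weights N_h / N."""
    N = sum(stratum_sizes)
    return sum(N_h / N * indicator_mean(s, t)
               for s, N_h in zip(samples, stratum_sizes))

def var_df_stratified(strata, sample_sizes, t):
    """Sum over strata of W_h^2 * (1/n_h - 1/N_h) * S_h^2 for the indicator variable."""
    N = sum(len(units) for units in strata)
    total = 0.0
    for units, n_h in zip(strata, sample_sizes):
        N_h = len(units)
        p = indicator_mean(units, t)
        s2 = N_h * p * (1 - p) / (N_h - 1)  # variance of a 0/1 variable, divisor N_h - 1
        total += (N_h / N) ** 2 * (1 / n_h - 1 / N_h) * s2
    return total

# Hypothetical population values of the study variable in two strata.
strata_y = [[1, 2, 3, 4], [5, 6, 7, 8, 9, 10]]
F_pop = df_stratified(strata_y, [4, 6], t=6)          # census: the population DF at t = 6
V = var_df_stratified(strata_y, sample_sizes=[2, 3], t=6)
```

Passing the full strata as the "samples" returns the population distribution function itself, which gives a quick check that the weights are applied as intended.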
3. Existing Estimators
In this section, we briefly review some existing estimators of the DF.
(1) The conventional unbiased mean per unit estimator of the DF is as follows: the reference of this estimator is not included because this is a conventional unbiased estimator under simple random sampling. The variance of is
(2) Cochran [32] suggested the traditional ratio estimator of the DF, which is given by The bias and MSE of , to first order of approximation, respectively, are The ratio estimator is better than , in terms of MSE, if .
(3) Murthy [33] suggested the usual product estimator of the DF, which is given by The bias and MSE of , to first order of approximation, are given by The product estimator is better than , in terms of MSE, if .
(4) The conventional difference estimator of the DF is where is an unknown constant. is an unbiased estimator of . The simplified minimum variance of at the optimum value of is
(5) Rao [37] suggested an improved difference-type estimator of the DF, which is given by where and are unknown constants. The bias and MSE of , to the first order of approximation, respectively, are The optimum values of and are respectively. The simplified minimum MSE of at the optimum values of and is given by
(6) Bahl and Tuteja’s exponential ratio-type and product-type estimators [34] are given by The biases and MSEs of and , to first order of approximation, respectively, are
(7) Grover and Kaur [35] suggested a generalized class of ratio-type exponential estimators, which is given by where and are unknown constants. The bias and MSE of , to the first order of approximation, respectively, are
The optimum values of and determined by minimizing (24) are, respectively, The minimum MSE of at the optimum values of and is given by
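For orientation, the classical ratio, product, difference, and exponential ratio-type and product-type forms can be sketched in Python as below. This is a sketch under stated assumptions, not the paper's exact expressions: each function takes the stratified sample estimates of the study and auxiliary distribution functions (`fy_hat`, `fx_hat`) and the known population value of the auxiliary distribution function (`fx`), and applies the textbook form of the corresponding mean estimator. All names are hypothetical, and the constant `d` in the difference form is left to the caller.

```python
import math

def ratio_estimator(fy_hat, fx_hat, fx):
    """Classical ratio form: rescale by the known-to-estimated auxiliary ratio."""
    return fy_hat * fx / fx_hat

def product_estimator(fy_hat, fx_hat, fx):
    """Classical product form: rescale by the estimated-to-known auxiliary ratio."""
    return fy_hat * fx_hat / fx

def difference_estimator(fy_hat, fx_hat, fx, d):
    """Difference form with a user-supplied constant d."""
    return fy_hat + d * (fx - fx_hat)

def exp_ratio_estimator(fy_hat, fx_hat, fx):
    """Exponential ratio-type form."""
    return fy_hat * math.exp((fx - fx_hat) / (fx + fx_hat))

def exp_product_estimator(fy_hat, fx_hat, fx):
    """Exponential product-type form."""
    return fy_hat * math.exp((fx_hat - fx) / (fx_hat + fx))
```

When the sample value of the auxiliary distribution function equals its population value, every form above reduces to the unadjusted estimate, which is a convenient sanity check for any implementation.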
4. Proposed Class of Estimators
The precision of an estimator improves when appropriate auxiliary information is used at the estimation stage. In previous studies, the sample distribution function of the auxiliary variable was used to improve the efficiency of existing distribution function estimators. In a recent study, Hussain et al. [30] recommended using the ranks of the auxiliary variable as an additional auxiliary variable to increase the precision of an estimator of the population distribution function. Similarly, we use additional auxiliary information on the sample mean and sample distribution function of the auxiliary variable, along with the sample distribution function of the study variable, to estimate the finite population DF.
Using the above idea on the lines of Shukla et al. [36], we suggest a general class of exponential factor-type estimators which contains many stable and efficient estimators. By combining the ideas of Bahl and Tuteja [34] and Shukla et al. [36], the first estimator is given by where
Substituting different values of (i = 1, 2, 3, 4) in (28), we can generate many different types of estimators from our general proposed class of estimators, which are given in Table 1.
Solving given in (28) in terms of errors, we have where
To first-order approximation, we have
Squaring (33) and taking expectations, to first order of approximation, we get the bias and MSE:
Differentiating (35) with respect to and , we get the optimum values of and , i.e.,
Substituting the optimum values of and in (35), we get minimum of which is given by where is the multiple correlation coefficient of on and . Now by putting different values of in (28), some members of the proposed class of estimators can be obtained as
(1) For and , The bias and MSE of are given by
(2) For and , The bias and MSE of are given by
(3) For and , The bias and MSE of are given by
(4) For and , The bias and MSE of are given by
(5) For and , The bias and MSE of are given by
(6) For and , The bias and MSE of are given by
(7) For and , The bias and MSE of are given by
(8) For and , The bias and MSE of are given by
(9) For and , The bias and MSE of are given by
(10) For and , The bias and MSE of are given by
(11) For and , The bias and MSE of are given by
(12) For and , The bias and MSE of are given by
(13) For and , The bias and MSE of are given by
5. Theoretical Comparison
(i) From (5) and (37),
(ii) From (8) and (37),
(iii) From (11) and (37),
(iv) From (13) and (37),
(v) From (17) and (37),
(vi) From (21) and (37),
(vii) From (23) and (37),
(viii) From (27) and (37),
6. Empirical Study
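In this section, we conduct a numerical study to assess the performance of the existing and proposed DF estimators. For this purpose, two datasets are used. The summary statistics of these datasets are reported in Tables 2 and 3. The percentage relative efficiency (PRE) of an estimator with respect to the conventional estimator is the ratio of the variance of the conventional estimator to the MSE of that estimator, multiplied by 100.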
The PREs of the DF estimators, computed for these populations, are given in Tables 4 and 5.
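The efficiency comparison can be sketched in Python as follows. This is a minimal sketch that assumes the usual definition of PRE as 100 times the ratio of the reference estimator's MSE (or variance) to the MSE of the estimator under study; the helper names and the replicate values are hypothetical and are not taken from the paper's tables.

```python
def empirical_mse(estimates, true_value):
    """Average squared deviation of replicate estimates from the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

def pre(mse_reference, mse_estimator):
    """Percentage relative efficiency with respect to the reference estimator."""
    return 100.0 * mse_reference / mse_estimator

# Hypothetical replicate estimates of a distribution function whose true value is 0.5.
reference_mse = empirical_mse([0.40, 0.60, 0.55, 0.45], true_value=0.5)
candidate_mse = empirical_mse([0.45, 0.55, 0.52, 0.48], true_value=0.5)
gain = pre(reference_mse, candidate_mse)
```

A PRE above 100 indicates that the estimator under study has a smaller MSE than the reference estimator; a value below 100 indicates the opposite.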
7. Interpretation of Results
As mentioned above, we used two datasets for numerical illustration. The proposed estimator and the existing estimators were compared with one another with respect to their MSE and PRE values. The PREs are presented in Tables 4 and 5, and Tables 2 and 3 give the summary statistics of the populations. The proposed estimator is more precise than the existing distribution function estimators of Cochran [32], Murthy [33], Rao [37], and Grover and Kaur [38], in terms of MSEs and PREs.
8. Conclusion
In this paper, we proposed an improved class of estimators of the finite population DF that uses dual auxiliary variables under the stratified random sampling (StRS) scheme, and we illustrated it with real-life datasets. Bias and MSE expressions of the proposed class of estimators are derived up to the first order of approximation. Based on the theoretical and numerical results, the proposed class of estimators performs better than the existing estimators considered under stratified random sampling. From these findings, we suggest using the proposed estimators for efficient estimation of the population distribution function in the presence of auxiliary information under stratified random sampling.
Data Availability
All the data used for this study can be found inside the manuscript.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This study was fully supported by the School of Statistics, Shanxi University of Finance and Economics. The second author will pay the publication fee for this paper.