Estimation of Population Median under Robust Measures of an Auxiliary Variable

Irfan, Muhammad; Javed, Maria; Shongwe, Sandile C.; Zohaib, Muhammad; Haider Bhatti, Sajjad

doi:https://doi.org/10.1155/2021/4839077

Mathematical Problems in Engineering


Special Issue

Robust Estimation Methods in the Presence of Extreme Observations


Research Article | Open Access

Volume 2021 | Article ID 4839077 | https://doi.org/10.1155/2021/4839077

Muhammad Irfan,¹ Maria Javed,¹ Sandile C. Shongwe,² Muhammad Zohaib,¹ and Sajjad Haider Bhatti¹

Academic Editor: Ishfaq Ahmad

Received 30 Jul 2021

Accepted 01 Sept 2021

Published 18 Sept 2021

Abstract

In this paper, a generalized class of estimators for the population median is proposed under simple random sampling without replacement (SRSWOR) using robust measures of the auxiliary variable. Three robust measures of an auxiliary variable are used: the decile mean, the Hodges–Lehmann estimator, and the trimean. Mathematical properties of the proposed estimators, such as bias, mean squared error (MSE), and minimum MSE, are derived up to the first order of approximation. Various real-life datasets and a simulation study are used to compare the performance of the proposed estimators with that of their competitors. Robustness is also examined through a real dataset. Based on these results, researchers are encouraged to use the proposed estimators for the population median under SRSWOR.

1. Introduction

Extensive work has been done on the estimation of the population mean, proportion, variance, regression coefficient, and so forth, but very little attention has been paid to proposing efficient estimators of the median. Researchers are often interested in variables such as income, expenditure, taxes, consumption, and production, and these variables have highly skewed distributions. In such situations, the median is a considerably more appropriate measure of location than the mean. The problem of estimating the median under a simple random sampling scheme has been discussed by Gross [1], Sedransk and Meyer [2], and Smith and Sedransk [3]. Kuk and Mak [4] were the first authors to investigate the estimation of the median using auxiliary information. Following Kuk and Mak’s [4] estimator, Singh et al. [5], Aladag and Cingi [6], Solanki and Singh [7], Shabbir and Gupta [8], Baig et al. [9], and Shabbir et al. [10] developed different estimators of the finite population median based on known conventional measures of the auxiliary variable under different sampling schemes. A brief explanation of Kuk and Mak’s [4] estimator is as follows.

Let $Y$ and $X$ be the study and the auxiliary variables, and let a sample of size $n$ be selected from a finite population of size $N$ under simple random sampling without replacement (SRSWOR) subject to the constraint $n < N$. Further let $(Y_i, X_i)$ and $(y_i, x_i)$ be the values of the $i$th units of the population and sample, respectively. Let $M_y$ and $M_x$ be the population medians of the study and auxiliary variables with the probability density functions given by $f_y(\cdot)$ and $f_x(\cdot)$, respectively. We further assume that $f_y(M_y)$ and $f_x(M_x)$ are positive.

Suppose that $y_{(1)}, y_{(2)}, \ldots, y_{(n)}$ are the $y$ values of the sample units in ascending order; furthermore, let $t$ be the integer such that $y_{(t)} \le M_y \le y_{(t+1)}$, and let $p = t/n$ be the proportion of $y$ values in the sample which are less than or equal to $M_y$. Kuk and Mak [4] considered a two-way classification as given in Table 1.

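Suppose that $\hat{M}_y$ and $\hat{M}_x$ are the sample estimators of $M_y$ and $M_x$; then the correlation coefficient between $\hat{M}_y$ and $\hat{M}_x$ is $\rho_c = 4P_{11}(x, y) - 1$, ranging from −1 to +1 as $P_{11}(x, y)$ increases from 0 to 0.5, where $P_{11}(x, y)$ is the proportion of units in the population with $X \le M_x$ and $Y \le M_y$. Gross [1] proved that $\hat{M}_y$ is consistent and asymptotically normally distributed with mean $M_y$ and variance

$$\operatorname{Var}(\hat{M}_y) = \frac{1 - f}{4n\{f_y(M_y)\}^2},$$

where $f = n/N$ is the sampling fraction.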
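The dependence between the two sample medians can be illustrated numerically. The sketch below assumes the Kuk and Mak [4] relation ρ_c = 4·P11 − 1, where P11 is the proportion of population units lying at or below both medians, so that P11 values between 0 and 0.5 map onto correlations between −1 and +1. The function name `median_correlation` is illustrative only and does not come from the article.

```python
import numpy as np

def median_correlation(x, y):
    """Return (P11, rho_c) for paired population values x and y.

    Assumes rho_c = 4 * P11 - 1, with P11 the proportion of units
    that satisfy X <= M_x and Y <= M_y simultaneously.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m_x, m_y = np.median(x), np.median(y)
    # Share of units at or below both population medians
    p11 = float(np.mean((x <= m_x) & (y <= m_y)))
    return p11, 4.0 * p11 - 1.0
```

For a perfectly concordant pair of variables half of the units fall in the lower-left cell, which gives the upper limit of the correlation; reversing one variable empties that cell and gives the lower limit.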

The efficiency of ratio, product, and regression-type estimators is uncertain in the presence of extreme values or outliers in the dataset. In the present study, the problem under consideration is to estimate the median of a finite population and to suggest some generalized classes of estimators that use known robust measures of an auxiliary variable under SRSWOR. The novelty of this work is as follows:
(i) Robust measures (i.e., decile mean, Hodges–Lehmann estimator, and trimean) of an auxiliary variable are utilized for the first time to investigate the estimation of the population median
(ii) A variety of estimators can be generated through the proposed generalized estimator
(iii) A robustness study is carried out to check the performance of the proposed generalized estimator in the presence of outliers

The following relative error terms and notations are used to obtain the mathematical properties such as bias, mean squared error (MSE), and minimum MSE of various estimators: such that .where

The rest of the article is organized as follows. Section 2 gives comprehensive details of existing estimators of the population median. Section 3 proposes generalized classes of estimators of the population median using robust measures of an auxiliary variable; the bias, mean squared error (MSE), and minimum MSE of these classes are derived up to the first degree of approximation in the same section. Four real-life datasets and a simulation study are used in Section 4 to compare the new estimators with the existing ones. The robustness of the proposed estimators is evaluated using a real-life dataset in Section 5. Section 6 contains the concluding remarks and some recommendations.

2. Existing Median Estimators

The major drawback of the existing estimators of the population median is that they are based on conventional measures of an auxiliary variable. In this section, we discuss the well-known estimators of the population median under SRSWOR suggested by different authors.

Kuk and Mak [4] suggested a ratio-type estimator that assumes the median of the auxiliary variable is known.
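As a minimal sketch, the ratio-type median estimator is commonly written as the sample median of the study variable multiplied by the ratio of the known population median of the auxiliary variable to its sample median. The function name and argument names below are illustrative assumptions, not notation taken from the article.

```python
import numpy as np

def ratio_median_estimator(y_sample, x_sample, m_x):
    """Ratio-type estimate of the population median of y.

    y_sample, x_sample: paired sample values drawn by SRSWOR.
    m_x: known population median of the auxiliary variable
         (assumed positive, as is the sample median of x).
    """
    m_y_hat = np.median(y_sample)   # sample median of the study variable
    m_x_hat = np.median(x_sample)   # sample median of the auxiliary variable
    return float(m_y_hat * m_x / m_x_hat)
```

When the sample median of the auxiliary variable falls below its known population value, the estimate of the study median is scaled upward, and vice versa.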

The expression for mean square error of estimator is given as

The exponential ratio-type estimator for estimating median is given as

The MSE of up to the first degree of approximation is given by

Singh [11] developed an unbiased difference estimator, which is given by

where is an unknown constant whose value needs to be determined.

Minimum MSE of up to the first degree of approximation is as follows:

Remark 1. The MSE of is always smaller than the MSE of if respectively.

Rao [12] and Gupta et al. [13], respectively, suggested three difference-type estimators for estimating the median as

where are unknown constants.

The minimum MSE of at optimum values of is given by

The minimum MSE of at optimum values of is given by

The minimum MSE of at optimum values of is given by

Shabbir and Gupta [8] suggested a generalized difference-type estimator for the estimation of the median as

where are unknown constants whose values need to be determined, are the known population parameters, and are the scalar quantities.

Remark 2. By substitution of the scalar quantities as , equation (14) becomes

The minimum MSE of at optimum values of is given by

3. Proposed Generalized Estimator

One notable disadvantage of the existing estimators and classes of estimators is that they are typically based on conventional measures, and their efficiency is uncertain when extreme values occur in the dataset. In this section, we define a generalized class of estimators of the population median using robust measures of an auxiliary variable in linear combination with nonconventional measures: quartile deviation, midrange, interquartile range, and quartile average. We include three robust measures: the decile mean suggested by Rana et al. [14], the Hodges–Lehmann estimator suggested by Hettmansperger and McKean [15], and the trimean suggested by Wang et al. [16]. For more details of these robust measures, see the works of Irfan et al. [17, 18].

A generalized estimator for the estimation of the population median is

where are suitably chosen constants, and takes on the values for designing new estimators. Note that and may be any constant values or functions of the known robust measures as well as nonconventional measures associated with variable.

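Remark 3. Robust measures related to $X$ are the following:
Trimean: $TM = (Q_1 + 2Q_2 + Q_3)/4$
Hodges–Lehmann: $HL = \operatorname{median}\{(X_j + X_k)/2,\ 1 \le j \le k \le N\}$
Decile mean: $DM = (D_1 + D_2 + \cdots + D_9)/9$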
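The three robust measures can be computed directly from the auxiliary values. The sketch below uses their usual definitions: the trimean as a weighted average of the three quartiles with weights 1, 2, 1; the Hodges–Lehmann estimator as the median of all pairwise (Walsh) averages with j ≤ k; and the decile mean as the average of the nine deciles. The quantile convention (NumPy's default linear interpolation) is an assumption, since the article does not state which one it uses, and the function names are illustrative.

```python
import numpy as np

def trimean(x):
    """Trimean: (Q1 + 2*Q2 + Q3) / 4."""
    q1, q2, q3 = np.percentile(x, [25, 50, 75])
    return float((q1 + 2.0 * q2 + q3) / 4.0)

def hodges_lehmann(x):
    """Median of the Walsh averages (x_j + x_k) / 2 over all j <= k."""
    x = np.asarray(x, dtype=float)
    j, k = np.triu_indices(len(x))   # index pairs with j <= k, diagonal included
    return float(np.median((x[j] + x[k]) / 2.0))

def decile_mean(x):
    """Average of the nine deciles D1, ..., D9."""
    deciles = np.percentile(x, np.arange(10, 100, 10))
    return float(deciles.mean())
```

Note that `hodges_lehmann` builds all N(N + 1)/2 pairwise averages, so its memory use grows quadratically with the population size; this is fine for populations of a few thousand units but would need a different approach for much larger ones.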

Remark 4. The nonconventional measures (i.e., interquartile range, midrange, quartile average, and quartile deviation) of an auxiliary variable can be defined as follows:
Interquartile range: $QR = Q_3 - Q_1$
Midrange: $MR = (X_{(1)} + X_{(N)})/2$
Quartile average: $QA = (Q_1 + Q_3)/2$
Quartile deviation: $QD = (Q_3 - Q_1)/2$
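A short sketch of the four nonconventional measures follows, using their usual definitions: interquartile range as the difference of the third and first quartiles, midrange as the average of the smallest and largest values, quartile average as the mean of the first and third quartiles, and quartile deviation as half the interquartile range. The function name, the dictionary keys, and the quantile convention (NumPy's default) are assumptions made for illustration.

```python
import numpy as np

def nonconventional_measures(x):
    """Return the four nonconventional measures of an auxiliary variable."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    return {
        "interquartile_range": float(q3 - q1),
        "midrange": float((x.min() + x.max()) / 2.0),
        "quartile_average": float((q1 + q3) / 2.0),
        "quartile_deviation": float((q3 - q1) / 2.0),
    }
```

Of these, only the midrange depends on the extreme order statistics, so it is the one most affected by a single outlying value.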

Remark 5. By putting different values of in equation (17), we get the following families of estimators:
(i) Put ; proposed family of estimators reduces to
(ii) Put ; proposed family of estimators reduces to
(iii) Put ; proposed family of estimators reduces to
(iv) Put ; proposed family of estimators reduces to
(v) Put ; proposed family of estimators reduces to

Remark 6. When we put robust measures of auxiliary variable with the linear combination of median, quartile deviation, midrange, interquartile range, and quartile average of an auxiliary variable in equation (17), we obtain different series of estimators such as . Some members of the class of estimator are presented in Table 2. Placing the same values of and in , we obtain a number of estimators.

Remark 7. Putting appropriate constants or known conventional parameters of the auxiliary variable in place of and in equation (17), we can get many optimal estimators. Conventional parameters associated with auxiliary variable are variance, standard deviation, coefficient of variation, coefficient of skewness, coefficient of kurtosis, coefficient of correlation, and so forth.

3.1. Bias, MSE, and Minimum MSE of

The suggested generalized class of estimators in terms of is expressed as follows:

After some simplification of equation (23), we have

where .

Subtracting from both sides of equation (24), we get

The bias of the proposed estimators, , is defined as

Taking expectations on both sides of equation (25), we get the bias of generalized class of estimators :

The MSE of the proposed estimators, , is defined as

Squaring both sides of equation (25), we have

Taking expectations on both sides of equation (29), we get the MSE of the proposed estimators up to the first order of approximation as

where

Partially differentiating equation (30) with respect to and equating them to zero, we get the optimal values of as follows:

Placing these optimal values in equation (30), we obtain the minimum MSE as given by

4. Application

In this section, the proposed estimators are compared with the existing estimators under study using real-life and simulated datasets.

4.1. Real-Life Application

We evaluated the performance of the proposed class of estimators against the competing estimators in terms of the MSE. For this purpose, we selected four real-life datasets:
Population 1: source: Singh [11]. X = number of fish caught by marine recreational fishermen in the previous year, 1994.
Population 2: source: Koyuncu and Kadilar [19].
Population 3: source: Singh [11]. X = number of fish caught by marine recreational fishermen in the previous year, 1993.
Population 4: source: Murthy [20].

Table 3 presents the detailed descriptions of each of the abovementioned populations.

We calculated the MSE and minimum MSE of all the estimators, that is, , for populations 1–4. Expressions for the MSE of all the existing and proposed estimators are given in detail in Sections 2 and 3. All empirical results are summarized in Tables 4–9, and the important deductions are as follows:
(i) performs better than all existing estimators, that is,
(ii) All the proposed estimators, that is, , have the smallest MSE compared with all existing estimators
(iii) A closer inspection of the columns of provides the least MSE among all other classes of proposed estimators

4.2. Simulation Study

A Monte Carlo simulation study is conducted to assess the performance of the proposed generalized estimators through a real population. We consider a real-life application of primary and secondary schools for 923 districts of Turkey in 2007, considering number of teachers as study variable and number of enrolled students as auxiliary variable (source: [19]). The following are some important measures of the dataset:

The following steps are used to carry out the simulation study:
Step 1: select a SRSWOR of size from the population of size
Step 2: use sample data from step 1 to find the MSE of all the existing and proposed estimators
Step 3: perform 20,000 iterations to conduct step 1 and step 2
Step 4: get 20,000 values for MSE of all existing and proposed estimators
Step 5: take the average of 20,000 values obtained in step 4 to get the simulated MSE of each estimator
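The simulation steps above can be sketched as a short Monte Carlo loop. This is a simplified illustration under stated assumptions, not the authors' code: it averages the squared error of each estimate about the known population median instead of evaluating the article's MSE expressions on each sample, it covers only the sample median and the ratio-type estimator, and it uses a synthetic skewed population of 923 units generated by the hypothetical helper `make_toy_population` in place of the Turkish schools data.

```python
import numpy as np

def make_toy_population(size=923, seed=1):
    """Hypothetical skewed population (not the dataset used in the article)."""
    rng = np.random.default_rng(seed)
    x_pop = rng.lognormal(mean=5.0, sigma=1.0, size=size)
    y_pop = 0.05 * x_pop * rng.lognormal(mean=0.0, sigma=0.2, size=size)
    return y_pop, x_pop

def simulate_mse(y_pop, x_pop, n, iterations=20_000, seed=0):
    """Simulated MSE of the sample median and the ratio-type estimator."""
    rng = np.random.default_rng(seed)
    y_pop = np.asarray(y_pop, dtype=float)
    x_pop = np.asarray(x_pop, dtype=float)
    m_y, m_x = np.median(y_pop), np.median(x_pop)
    totals = {"sample_median": 0.0, "ratio": 0.0}
    for _ in range(iterations):                      # Step 3: repeat steps 1-2
        # Step 1: draw a simple random sample without replacement
        idx = rng.choice(len(y_pop), size=n, replace=False)
        m_y_hat = np.median(y_pop[idx])
        m_x_hat = np.median(x_pop[idx])
        # Steps 2 and 4: squared error of each estimate in this iteration
        totals["sample_median"] += (m_y_hat - m_y) ** 2
        totals["ratio"] += (m_y_hat * m_x / m_x_hat - m_y) ** 2
    # Step 5: average over all iterations
    return {name: total / iterations for name, total in totals.items()}
```

Calling `simulate_mse(*make_toy_population(), n=50)` returns one simulated MSE per estimator; additional estimators could be added to the loop in the same way once their formulas are available.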

The following is revealed from Table 10:
(i) performs better than all existing estimators, that is,
(ii) The minimum MSE of all the proposed estimators is smaller than that of all the existing estimators under study
(iii) As the sample size increases, the minimum MSE of all the proposed estimators decreases

It is concluded that our generalized estimator performs best in the presence of extreme value(s).

5. Robustness of

In this section, robustness is examined by comparing the performance of the proposed generalized estimator with that of the other existing estimators under study. An estimator that performs efficiently in the presence of extreme values is called a robust estimator. For this purpose, we consider a real-life application taken from Punjab Development Statistics (PDS) for the year 2012 [21]. For a detailed study of robustness, different sample sizes are taken . The following are the important statistics of the data:

The scatter plot in Figure 1 confirms the presence of an extreme value in the dataset. Therefore, we can assess the robustness of the generalized estimator for this dataset. Numerical results of the robustness study are reported in Table 11. Table 11 shows that the minimum MSE of all the proposed estimators is smaller than that of all the existing estimators under study. Moreover, as the sample size increases, the minimum MSE of all the proposed estimators decreases. Therefore, it is concluded that our proposed estimator performs well in the presence of extreme value(s).

6. Concluding Remarks and Recommendations

We proposed generalized classes of estimators for estimating the population median under simple random sampling using robust measures of an auxiliary variable. The bias, mean squared error, and minimum mean squared error of the proposed generalized classes are derived up to the first degree of approximation. Four real-life datasets are used to check the numerical performance of the new estimators. A simulation study based on a real dataset is also conducted to assess the potential of the suggested classes of estimators. Robustness is also examined through a real dataset. On the basis of the numerical findings, it is concluded that the new generalized classes can generate optimum estimators. Therefore, use of the proposed generalized class is recommended for future applications.

Possible extensions of this work are to estimate the following: (1) the finite population median under other sampling designs such as stratified random sampling, double sampling, ranked set sampling, and so forth; (2) other unknown finite population parameters including the mean, variance, and proportions; and (3) the population median in the presence of nonsampling errors.

Data Availability

The data used to support the findings of this study are available within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

1. S. Gross, “Median estimation in sample surveys,” in Proceedings of the Section on Survey Research Methods, American Statistical Association Ithaca, Alexandria, VA, USA, 1980.
2. J. Sedransk and J. Meyer, “Confidence intervals for the quantiles of a finite population: simple random and stratified simple random sampling,” Journal of the Royal Statistical Society: Series B, vol. 40, no. 2, pp. 239–252, 1978.
3. P. Smith and J. Sedransk, “Lower bounds for confidence coefficients for confidence intervals for finite population quantiles,” Communications in Statistics—Theory and Methods, vol. 12, no. 12, pp. 1329–1344, 1983.
4. A. Y. C. Kuk and T. K. Mak, “Median estimation in the presence of auxiliary information,” Journal of the Royal Statistical Society: Series B, vol. 51, no. 2, pp. 261–269, 1989.
5. S. Singh, A. H. Joarder, and D. S. Tracy, “Median estimation using double sampling,” Australian & New Zealand Journal of Statistics, vol. 43, no. 1, pp. 33–46, 2001.
6. S. Aladag and H. Cingi, “Improvement in estimating the population median in simple random sampling and stratified random sampling using auxiliary information,” Communications in Statistics—Theory and Methods, vol. 44, no. 5, pp. 1013–1032, 2015.
7. R. S. Solanki and H. P. Singh, “Some classes of estimators for median estimation in survey sampling,” Communications in Statistics—Theory and Methods, vol. 44, no. 7, pp. 1450–1465, 2015.
8. J. Shabbir and S. Gupta, “A generalized class of difference type estimators for population median in survey sampling,” Hacettepe Journal of Mathematics and Statistics, vol. 46, no. 5, pp. 1015–1028, 2017.
9. A. Baig, S. Masood, and T. Ahmed Tarray, “Improved class of difference-type estimators for population median in survey sampling,” Communications in Statistics—Theory and Methods, vol. 49, no. 23, pp. 5778–5793, 2020.
10. J. Shabbir, S. Gupta, and G. Narjis, “On improved class of difference type estimators for population median in survey sampling,” Communications in Statistics—Theory and Methods, 2021, In press.
11. S. Singh, Advanced Sampling Theory with Applications: How Michael ‘Selected’ Amy, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2003.
12. T. J. Rao, “On certain methods of improving ratio and regression estimators,” Communications in Statistics—Theory and Methods, vol. 20, no. 10, pp. 3325–3340, 1991.
13. S. Gupta, J. Shabbir, and S. Ahmad, “Estimation of median in two-phase sampling using two auxiliary variables,” Communications in Statistics—Theory and Methods, vol. 37, no. 11, pp. 1815–1822, 2008.
14. S. Rana, M. Siraj-Ud-Dulah, and H. Midi, “Decile mean: a new robust measure of central tendency,” Chiang Mai Journal of Science, vol. 39, no. 3, pp. 478–485, 2012.
15. T. Hettmansperger and J. W. McKean, Robust Nonparametric Statistical Methods, Chapman & Hall/CRC Press, Boca Raton, FL, USA, 2nd edition, 2011.
16. T. Wang, Y. Li, and H. Cui, “On weighted randomly trimmed means,” Journal of Systems Science and Complexity, vol. 20, no. 1, pp. 47–65, 2007.
17. M. Irfan, M. Javed, and Z. Lin, “Optimized estimation for population mean using conventional and non-conventional measures under the joint influence of measurement error and non-response,” Journal of Statistical Computation and Simulation, vol. 88, no. 12, pp. 2385–2403, 2018.
18. M. Irfan, M. Javed, and Z. Lin, “Improved estimation of population mean through known conventional and non-conventional measures of auxiliary variable,” Iranian Journal of Science and Technology Transaction A-Science, vol. 43, no. 4, pp. 1851–1862, 2019.
19. N. Koyuncu and C. Kadilar, “Family of estimators of population mean using two auxiliary variables in stratified random sampling,” Communications in Statistics—Theory and Methods, vol. 38, no. 14, pp. 2398–2417, 2009.
20. M. N. Murthy, Sampling Theory and Methods, Statistical Publishing Society, Calcutta, India, 1967.
21. PDS, Punjab Development Statistics, Bureau of Statistics, Government of the Punjab, Lahore, Pakistan, 2012.

Copyright

Copyright © 2021 Muhammad Irfan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
