Poisson Regression-Based Mean Estimator

Shahzad, Usman; Shahzadi, Shabnam; Afshan, Noureen; Al-Noor, Nadia H.; Alilah, David Anekeya; Hanif, Muhammad; Anas, Malik Muhammad

doi:https://doi.org/10.1155/2021/9769029

Mathematical Problems in Engineering


Special Issue: Robust Estimation Methods in the Presence of Extreme Observations

Research Article | Open Access

Volume 2021 | Article ID 9769029 | https://doi.org/10.1155/2021/9769029


Academic Editor: Amer Al-Omari

Received 15 Jul 2021

Accepted 07 Oct 2021

Published 18 Oct 2021

Abstract

The Poisson regression model is the most frequently used method for modeling count responses in many investigations. Under simple random sampling, this paper proposes a Poisson regression-based mean estimator and derives its mean square error (MSE). The MSE of the proposed estimator is compared theoretically with the MSE of traditional ratio estimators. These comparisons show that the proposed estimator is more efficient than the traditional estimators. Furthermore, the numerical results corroborate the theoretical findings.

1. Introduction

The emphasis on using supplemental/auxiliary data to improve the precision of estimates might be the criterion that distinguishes sample survey theory from other statistical theories. Auxiliary data are used in almost every important phase of a sample survey, including stratification, the construction of selection probabilities, and the formula for the estimator of the population parameter of interest. Estimating the population total or mean is a common goal of most surveys. The use of auxiliary data to improve the estimation of the population mean has been considered by numerous authors, such as Upadhyaya et al. [1], Upadhyaya and Singh [2], Koyuncu [3], Shahzad [4], Abid et al. [5], Shahzad et al. [6–8], Zaman and Bulut [9, 10], Ali et al. [11], and Zaman [12].

When the auxiliary variable is positively correlated with the study variable, ratio-type and regression estimators are good choices for mean estimation. In sampling theory, population information on the auxiliary variable, such as the coefficient of variation or kurtosis, is frequently employed to improve the efficiency of ratio estimators of a population mean. On the other hand, the presence of outliers or extreme values in the data degrades the efficiency of traditional estimation methods. To minimize the impact of outliers on ratio-type mean estimators, Kadilar et al. [13] presented a robust regression strategy based on the Huber-M approach. For mean estimation under simple random sampling (SRS), Oral and Kadilar [14, 15] used two approaches: modified maximum likelihood and its integrated method. By combining the ratio estimators given in Zaman and Bulut [9], Zaman [12] created a new class of robust ratio-type estimators. Using robust regression estimates as well as robust covariance matrices under stratified random sampling, Zaman and Bulut [10] proposed novel regression-type estimators. More recently, for sensitive research under SRS, Ali et al. [11] proposed a class of robust regression-type estimators.

Furthermore, it is inappropriate to apply a linear regression model to count data, even when the mean is large enough for the Poisson distribution to approach the normal distribution. Negative predicted values are possible because the linear model relates the predicted value linearly to the auxiliary (or explanatory or independent) variables. Also, in linear regression, the validity of hypothesis tests depends on the assumption of constant variance of the study (or response or dependent) variable. For count data, these assumptions do not hold. As a result, the Poisson regression model is the most commonly employed approach for modeling count data in the applied sciences.
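The risk of negative predictions is easy to see on a toy example. The sketch below uses made-up counts (not data from this study) and an ordinary least squares line fitted with NumPy; the variable names are illustrative only. The point is simply that a fitted straight line can fall below zero within or just beyond the observed range, whereas an exponential mean function, as used in Poisson regression, cannot.

```python
import numpy as np

# Hypothetical count data: the counts fall quickly as x grows.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([20, 11, 6, 3, 1, 0])

# Ordinary least squares fit of y = intercept + slope * x.
# np.polyfit returns the coefficients from the highest power down.
slope, intercept = np.polyfit(x, y, deg=1)

# Linear predictions at the largest observed x and one step beyond it.
x_new = np.array([5.0, 6.0])
linear_pred = intercept + slope * x_new

print(linear_pred)  # both predicted "counts" are negative
```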

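In Poisson regression, the study variable $y_i$ is the number of events that occur in a given period, with a Poisson distribution given by

$$P(y_i) = \frac{e^{-\mu_i}\,\mu_i^{y_i}}{y_i!}, \qquad y_i = 0, 1, 2, \ldots,$$

and its mean and variance are equal, $E(y_i) = \operatorname{Var}(y_i) = \mu_i$.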

The natural log-likelihood function is defined as follows:

$$\ln L(\mu; y) = \sum_{i=1}^{n} \left[ y_i \ln \mu_i - \mu_i - \ln(y_i!) \right].$$

Let $X$ be the matrix of explanatory variables. Then, the relationship between $y_i$ and the $i$th row $x_i$ of $X$, through $\mu_i$, is

$$\mu_i = \exp\left(x_i^{\top}\beta\right),$$

where $\beta$ is the vector of regression coefficients. Such a model is well known as the Poisson regression model. $\hat{\beta}$ denotes the maximum likelihood estimator of $\beta$, and it is obtained by differentiating the log-likelihood function with respect to $\beta$ and setting the derivatives equal to zero.

Iterative approaches such as the Fisher scoring and Newton–Raphson algorithms are employed to solve these equations (see Cameron and Trivedi [16], Montgomery et al. [17], and Koç [18]).
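As an illustration of such an iterative scheme, the following is a minimal NumPy sketch of Newton–Raphson for a Poisson model with a log link (for this link, Newton–Raphson and Fisher scoring coincide). The function name `fit_poisson_newton`, the starting-value heuristic, and the simulated data with coefficients 0.5 and 0.8 are assumptions made for the example; they are not taken from the article, and established routines in statistical packages would normally be preferred in practice.

```python
import numpy as np

def fit_poisson_newton(X, y, max_iter=50, tol=1e-10):
    """Poisson regression with a log link, fitted by Newton-Raphson."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Starting values: least squares on log(y + 0.5), a common heuristic.
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
    for _ in range(max_iter):
        mu = np.exp(X @ beta)                   # fitted means
        score = X.T @ (y - mu)                  # gradient of the log-likelihood
        information = X.T @ (X * mu[:, None])   # negative Hessian (Fisher information)
        step = np.linalg.solve(information, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:          # stop once the update is negligible
            break
    return beta

# Simulated data with hypothetical true coefficients (intercept 0.5, slope 0.8).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=2000)
X = np.column_stack([np.ones_like(x), x])
y = rng.poisson(np.exp(0.5 + 0.8 * x))

beta_hat = fit_poisson_newton(X, y)
print(beta_hat)  # estimates should lie close to the simulating values
```

At convergence the score vector `X.T @ (y - mu)` is numerically zero, which is the likelihood equation that the iterative algorithms solve.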

This article focuses on using Poisson regression to estimate the mean of the study variable under SRS. The rest of the article is organized as follows. First, we introduce the notation and review some existing ratio-type mean estimators with their MSE. Second, we define a novel mean estimator based on Poisson regression and give its MSE. The efficiency condition of the proposed estimator is then examined theoretically. Next, using numerical illustrations based on two real datasets, we examine the relative efficiency of the proposed estimator over the reviewed estimators. Finally, some concluding remarks are given.

2. Some Existing Ratio-Type Estimators of Mean with Their MSE

This section outlines some existing population mean estimators that employ SRS and rely on known conventional parameters of the auxiliary variable to improve the efficiency of the mean estimators. Before turning to the specifics of these estimators, the notation used is as follows. $N$: population size; $n$: sample size. $f$: sampling ratio, $f = n/N$. $\bar{Y}, \bar{X}$: population means of the study and auxiliary variables $(y, x)$, respectively. $\bar{y}, \bar{x}$: sample means of $y$ and $x$, respectively. $C_y, C_x$: population coefficients of variation of $y$ and $x$, respectively. $\beta_2(x)$: population coefficient of kurtosis of $x$. $S_y^2, S_x^2$: population variances of $y$ and $x$, respectively. $s_y^2, s_x^2$: sample variances of $y$ and $x$, respectively. $S_{xy}, s_{xy}$: population and sample covariance between $y$ and $x$, respectively. $\rho$: correlation coefficient between $y$ and $x$. $b$: slope coefficient obtained by the least squares method, $b = s_{xy}/s_x^2$, with population counterpart $B = S_{xy}/S_x^2$.

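For estimating the population mean under SRS, with the population parameters of the auxiliary variable assumed known, Kadilar and Cingi [19] proposed the following ratio estimators, inspired by Sisodia and Dwivedi [20] and Upadhyaya et al. [1]:

$$\begin{aligned}
\bar{y}_{KC1} &= \frac{\bar{y} + b(\bar{X} - \bar{x})}{\bar{x}}\,\bar{X}, \\
\bar{y}_{KC2} &= \frac{\bar{y} + b(\bar{X} - \bar{x})}{\bar{x} + C_x}\,(\bar{X} + C_x), \\
\bar{y}_{KC3} &= \frac{\bar{y} + b(\bar{X} - \bar{x})}{\bar{x} + \beta_2(x)}\,(\bar{X} + \beta_2(x)), \\
\bar{y}_{KC4} &= \frac{\bar{y} + b(\bar{X} - \bar{x})}{\bar{x}\,\beta_2(x) + C_x}\,(\bar{X}\,\beta_2(x) + C_x), \\
\bar{y}_{KC5} &= \frac{\bar{y} + b(\bar{X} - \bar{x})}{\bar{x}\,C_x + \beta_2(x)}\,(\bar{X}\,C_x + \beta_2(x)).
\end{aligned}$$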

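Kadilar and Cingi [19] also provided the following formula for the MSE associated with their estimators:

$$\mathrm{MSE}(\bar{y}_{KCi}) \cong \frac{1-f}{n}\left[R_{KCi}^2 S_x^2 + 2BR_{KCi}S_x^2 + B^2 S_x^2 - 2R_{KCi}S_{xy} - 2BS_{xy} + S_y^2\right], \qquad i = 1, \ldots, 5,$$

where $B = S_{xy}/S_x^2$ is the least squares slope and the population ratios are obtained as follows:

$$R_{KC1} = \frac{\bar{Y}}{\bar{X}}, \quad R_{KC2} = \frac{\bar{Y}}{\bar{X} + C_x}, \quad R_{KC3} = \frac{\bar{Y}}{\bar{X} + \beta_2(x)}, \quad R_{KC4} = \frac{\bar{Y}\,\beta_2(x)}{\bar{X}\,\beta_2(x) + C_x}, \quad R_{KC5} = \frac{\bar{Y}\,C_x}{\bar{X}\,C_x + \beta_2(x)}.$$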
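To make the structure of these ratio estimators concrete, here is a small Python sketch. It assumes the standard forms published by Kadilar and Cingi (2004), in which a regression-adjusted sample mean is rescaled by a ratio built from the auxiliary mean, the coefficient of variation, and the kurtosis. The function name `kc_ratio_estimators` and all input values are hypothetical.

```python
def kc_ratio_estimators(y_bar, x_bar, slope, X_bar, C_x, beta2_x):
    """Return five ratio-type estimates for one sample (assumed standard forms)."""
    # Regression-adjusted sample mean shared by all five estimators.
    adjusted = y_bar + slope * (X_bar - x_bar)
    # Each estimator rescales by (a * X_bar + c) / (a * x_bar + c)
    # for a different pair (a, c) of known auxiliary parameters.
    pairs = [
        (1.0, 0.0),        # plain ratio
        (1.0, C_x),        # coefficient of variation added
        (1.0, beta2_x),    # kurtosis added
        (beta2_x, C_x),    # kurtosis as multiplier, CV added
        (C_x, beta2_x),    # CV as multiplier, kurtosis added
    ]
    return [adjusted * (a * X_bar + c) / (a * x_bar + c) for a, c in pairs]

# Hypothetical sample summary values, for illustration only.
estimates = kc_ratio_estimators(
    y_bar=10.0, x_bar=4.0, slope=2.0, X_bar=5.0, C_x=0.5, beta2_x=3.0
)
print(estimates)
```

When the sample mean of the auxiliary variable equals its population mean, every estimator reduces to the sample mean of the study variable, which is a quick sanity check on the implementation.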

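Recently, Koç [18] sought to improve the aforementioned estimators by introducing new ratio estimators based on Poisson regression as follows:

$$\begin{aligned}
\bar{y}_{K1} &= \frac{\bar{y} + b_{pois}(\bar{X} - \bar{x})}{\bar{x}}\,\bar{X}, \\
\bar{y}_{K2} &= \frac{\bar{y} + b_{pois}(\bar{X} - \bar{x})}{\bar{x} + C_x}\,(\bar{X} + C_x), \\
\bar{y}_{K3} &= \frac{\bar{y} + b_{pois}(\bar{X} - \bar{x})}{\bar{x} + \beta_2(x)}\,(\bar{X} + \beta_2(x)), \\
\bar{y}_{K4} &= \frac{\bar{y} + b_{pois}(\bar{X} - \bar{x})}{\bar{x}\,\beta_2(x) + C_x}\,(\bar{X}\,\beta_2(x) + C_x), \\
\bar{y}_{K5} &= \frac{\bar{y} + b_{pois}(\bar{X} - \bar{x})}{\bar{x}\,C_x + \beta_2(x)}\,(\bar{X}\,C_x + \beta_2(x)),
\end{aligned}$$

where $b_{pois}$ represents the slope coefficient obtained by Poisson regression.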

Koç [18] also provided the MSE of his suggested estimators. It has the same form as the MSE of the Kadilar and Cingi [19] estimators in (6), with the least squares slope replaced by the slope $b_{pois}$, whose value is obtained from the Poisson regression model.
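The substitution of one slope for another can be sketched in Python. The function below uses the expanded first-order MSE approximation commonly given for this family of ratio-type estimators; whether equation (6) is written in exactly this expanded form is an assumption of the sketch, and `ratio_type_mse` together with every numerical value is hypothetical. Passing the least squares slope gives the Kadilar and Cingi type of MSE, while passing a slope from a Poisson fit gives the corresponding MSE for the Poisson-based ratio estimators.

```python
def ratio_type_mse(R, slope, S_x2, S_y2, S_xy, n, N):
    """First-order MSE of a ratio-type estimator with a given slope (assumed form)."""
    lam = (1.0 - n / N) / n  # finite population factor (1 - f) / n
    return lam * (
        R**2 * S_x2
        + 2.0 * slope * R * S_x2
        + slope**2 * S_x2
        - 2.0 * R * S_xy
        - 2.0 * slope * S_xy
        + S_y2
    )

# Hypothetical population quantities, for illustration only.
S_x2, S_y2, S_xy = 4.0, 9.0, 3.0
R, n, N = 2.0, 20, 100

# With the least squares slope B = S_xy / S_x2 the expression collapses to
# lam * (R^2 * S_x2 + S_y2 * (1 - rho^2)).
B = S_xy / S_x2
mse_ls = ratio_type_mse(R, B, S_x2, S_y2, S_xy, n, N)

# Any other slope, such as one taken from a Poisson fit, is plugged in the same way.
b_pois = 0.4  # hypothetical value
mse_pois = ratio_type_mse(R, b_pois, S_x2, S_y2, S_xy, n, N)
print(mse_ls, mse_pois)
```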

In addition, Koç [18] demonstrated that his estimators are more efficient than the Kadilar and Cingi [19] estimators if any of the following conditions are satisfied:

2.1. Proposed Estimator Based on Poisson Regression and Its MSE

The proposed estimator of the population mean follows the frameworks of Zaman and Bulut [10] and Ali et al. [11], implemented here on the basis of Poisson regression, as

Furthermore, using established results and some basic algebra, and omitting the tedious intermediate calculations, we state the MSE of the proposed mean estimator as

2.2. Efficiency Comparisons

The efficiency condition for the proposed mean estimator is found by comparing the MSE of the proposed estimator in (12) with the MSE of Koç's [18] estimators in (9):

Now

The proposed Poisson mean estimator is more efficient than Koç's [18] estimators if this condition is satisfied.
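The logic of such a comparison can be illustrated under a simplifying assumption that the text here does not confirm: suppose the Poisson regression-based estimator takes the classical regression form $\bar{y} + b_{pois}(\bar{X} - \bar{x})$, and that the ratio-type estimators have the usual first-order MSE with the same slope. The symbols $\lambda$, $\bar{y}_{reg}$, and $\bar{y}_{ratio,i}$ below are introduced only for this sketch, and the resulting inequality should be read as an illustration of the technique rather than as the article's stated condition.

```latex
% Assumed forms (illustrative only), with \lambda = (1 - f)/n:
\mathrm{MSE}(\bar{y}_{reg}) \cong \lambda\left[S_y^2 - 2\,b_{pois}\,S_{xy} + b_{pois}^2\,S_x^2\right],
\qquad
\mathrm{MSE}(\bar{y}_{ratio,i}) \cong \lambda\left[R_i^2 S_x^2 + 2\,b_{pois} R_i S_x^2 + b_{pois}^2 S_x^2
  - 2 R_i S_{xy} - 2\,b_{pois} S_{xy} + S_y^2\right].

% Subtracting, the terms that do not involve R_i cancel:
\mathrm{MSE}(\bar{y}_{ratio,i}) - \mathrm{MSE}(\bar{y}_{reg})
  \cong \lambda\,R_i\left[(R_i + 2\,b_{pois})\,S_x^2 - 2\,S_{xy}\right].

% Hence, under these assumptions, the regression-type estimator is more efficient when
R_i\left[(R_i + 2\,b_{pois})\,S_x^2 - 2\,S_{xy}\right] > 0 .
```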

2.3. Numerical Illustrations

Here, we use two real datasets to evaluate the performance of proposed and existing estimators.

Population I (Pop-I). We consider the dataset collected between 2006 and 2010 from the Afyon Respiratory Disease Hospital and the Afyon Environmental Department Air Pollution Unit, which was used in [18]. The weekly number of patients admitted to the hospital was taken as the dependent variable $y$, and PM10 was taken as the explanatory variable $x$.

Population II (Pop-II). We consider the 2019 dataset for 81 provinces obtained from TUIK, which was used in [18]. The number of people who died in traffic accidents was taken as the dependent variable $y$, and the number of motor vehicles was taken as the explanatory variable $x$.

For these datasets, using SRS, we consider and . The characteristics of the two populations, as well as the values of population ratios, are given in Tables 1 and 2.

Based on the MSE values of the proposed and reviewed estimators, we calculate the relative efficiency (RE) of the proposed estimator with respect to each reviewed estimator as follows:

$$\mathrm{RE} = \frac{\mathrm{MSE}(\text{reviewed estimator})}{\mathrm{MSE}(\text{proposed estimator})} \times 100.$$

Tables 3 and 4 give the outcomes for the relative efficiencies.
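The relative efficiency calculation can be sketched in a few lines of Python. The MSE values and estimator labels below are invented for illustration and are not the values reported in Tables 3 and 4; the convention assumed is that RE is the ratio of a reviewed estimator's MSE to the proposed estimator's MSE, expressed as a percentage, so that values above 100 favor the proposed estimator.

```python
def relative_efficiency(mse_reviewed, mse_proposed):
    """Relative efficiency (in percent) of the proposed estimator."""
    return 100.0 * mse_reviewed / mse_proposed

# Hypothetical MSE values, for illustration only.
mse_proposed = 1.0
mse_reviewed = {"reviewed_1": 2.0, "reviewed_2": 1.5}

re_table = {
    name: relative_efficiency(mse, mse_proposed)
    for name, mse in mse_reviewed.items()
}
print(re_table)  # values above 100 indicate the proposed estimator is more efficient
```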

The relative efficiency values in Tables 3 and 4 all exceed 100. As a result, the proposed Poisson regression estimator is more efficient than the reviewed estimators. Moreover, all relative efficiency values of the proposed estimator with respect to the Kadilar and Cingi [19] estimators are greater than the corresponding values with respect to Koç's [18] estimators, and thus the MSE of the proposed estimator is smaller than the MSE of Koç's estimators, which in turn is smaller than the MSE of the Kadilar and Cingi estimators.

The proposed estimator and Koç's [18] estimators outperform the Kadilar and Cingi [19] estimators, which is expected because count data are used in this analysis, and the proposed estimator performs best. The relative efficiency values are also shown graphically in Figures 1–3.

Furthermore, we investigated the efficiency condition of the proposed estimator as follows.

For Pop-I:

Therefore, the condition is fulfilled.

For Pop-II:

Therefore, the condition is fulfilled.

3. Concluding Remarks

In this article, a mean estimator based on the Poisson regression model has been proposed under simple random sampling. The proposed estimator produced more efficient results than the ratio estimators proposed in Koç [18] and Kadilar and Cingi [19]. In the numerical investigation, the performances of these estimators were compared on two real populations, and the proposed estimator was more efficient than the reviewed estimators in terms of relative efficiency (see Figures 1–3), as all values exceed 100. We therefore recommend the proposed mean estimator over the other estimators considered in this study for the analysis of such count data. Furthermore, the estimator developed here can be used to construct new estimators in other count models. The proposed estimator can also be extended in future studies using the concepts of Zaman and Bulut [10] and Ali et al. [11].

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

References

[1] L. N. Upadhyaya, H. P. Singh, and J. W. E. Vos, “On the estimation of population means and ratios using supplementary information,” Statistica Neerlandica, vol. 39, no. 3, pp. 309–318, 1985.
[2] L. N. Upadhyaya and H. P. Singh, “Use of transformed auxiliary variable in estimating the finite population mean,” Biometrical Journal, vol. 41, no. 5, pp. 627–636, 1999.
[3] N. Koyuncu, “Efficient estimators of population mean using auxiliary attributes,” Applied Mathematics and Computation, vol. 218, no. 22, pp. 10900–10905, 2012.
[4] U. Shahzad, “On the estimation of population mean under systematic sampling using auxiliary attributes,” Oriental Journal Physical Sciences, vol. 1, no. 1–2, pp. 21–25, 2016.
[5] M. Abid, N. Abbas, H. Z. Nazir, and Z. Lin, “Enhancing the mean ratio estimators for estimating population mean using non-conventional location parameters,” Revista Colombiana de Estadística, vol. 39, no. 1, pp. 63–79, 2016.
[6] U. Shahzad, M. Hanif, N. Koyuncu, and A. V. Garcia Luengo, “A new family of estimators for mean estimation alongside the sensitivity issue,” Journal of Reliability and Statistical Studies, vol. 10, pp. 43–63, 2017.
[7] U. Shahzad, M. Hanif, N. Koyuncu, and A. V. G. Luengo, “A regression type estimator for mean estimation under ranked set sampling alongside the sensitivity issue,” Communications Faculty of Science University of Ankara Series A1: Mathematics and Statistics, vol. 65, no. 2, pp. 2037–2049, 2019.
[8] U. Shahzad, P. F. Perri, and M. Hanif, “A new class of ratio-type estimators for improving mean estimation of nonsensitive and sensitive variables by using supplementary information,” Communications in Statistics - Simulation and Computation, vol. 48, no. 9, pp. 2566–2585, 2019.
[9] T. Zaman and H. Bulut, “Modified ratio estimators using robust regression methods,” Communications in Statistics - Theory and Methods, vol. 48, no. 8, pp. 2039–2048, 2019.
[10] T. Zaman and H. Bulut, “Modified regression estimators using robust regression methods and covariance matrices in stratified random sampling,” Communications in Statistics - Theory and Methods, vol. 49, no. 14, pp. 3407–3420, 2020.
[11] N. Ali, I. Ahmad, M. Hanif, and U. Shahzad, “Robust-regression-type estimators for improving mean estimation of sensitive variables by using auxiliary information,” Communications in Statistics - Theory and Methods, vol. 50, no. 4, pp. 979–992, 2021.
[12] T. Zaman, “Improvement of modified ratio estimators using robust regression methods,” Applied Mathematics and Computation, vol. 348, pp. 627–631, 2019.
[13] C. Kadilar, M. Candan, and H. Cingi, “Ratio estimators using robust regression,” Hacettepe Journal of Mathematics and Statistics, vol. 36, no. 2, pp. 181–188, 2007.
[14] E. Oral and C. Kadilar, “Improved ratio estimators via modified maximum likelihood,” Pakistan Journal of Statistics, vol. 27, no. 3, pp. 269–282, 2011.
[15] E. Oral and C. Kadilar, “Robust ratio-type estimators in simple random sampling,” Journal of the Korean Statistical Society, vol. 40, no. 4, pp. 457–467, 2011.
[16] A. C. Cameron and P. K. Trivedi, Regression Analysis of Count Data, Cambridge University Press, Cambridge, 1998.
[17] D. C. Montgomery, E. A. Peck, and G. G. Vining, Introduction to Linear Regression Analysis, John Wiley and Sons, Hoboken, 4th edition, 2006.
[18] H. Koç, “Ratio-type estimators for improving mean estimation using Poisson regression method,” Communications in Statistics - Theory and Methods, vol. 50, no. 20, pp. 4685–4691, 2021.
[19] C. Kadilar and H. Cingi, “Ratio estimators in simple random sampling,” Applied Mathematics and Computation, vol. 151, no. 3, pp. 893–902, 2004.
[20] B. V. S. Sisodia and V. K. Dwivedi, “A modified ratio estimator using coefficient of variation of auxiliary variable,” Journal of the Indian Society of Agricultural Statistics, vol. 33, no. 2, pp. 13–18, 1981.

Copyright

Copyright © 2021 Usman Shahzad et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
