Abstract
Twin support vector regression (TSVR) generates two nonparallel hyperplanes by solving a pair of smaller-sized problems instead of the single larger-sized problem in the standard SVR. Due to its efficiency, TSVR is frequently applied in various areas. In this paper, we propose a new variant of TSVR, named Linear Twin Quadratic Surface Support Vector Regression (LTQSSVR), which directly uses two quadratic surfaces in the original space for regression. It is worth noting that our new approach not only avoids the difficult and time-consuming task of searching for a suitable kernel function and its corresponding parameters in the traditional SVR-based methods but also achieves better generalization performance. Besides, in order to further improve the efficiency and robustness of the model, we introduce the 1-norm to measure the error. The linear programming structure of the new model avoids the matrix inverse operation and makes the model tractable even for very large problems; the capability of handling large-scale problems is particularly important in the big-data era. In addition, to verify the effectiveness and efficiency of our model, we compare it with several well-known methods. The numerical experiments on 2 artificial data sets and 12 benchmark data sets demonstrate the validity and applicability of the proposed method.
1. Introduction
Support Vector Machine (SVM) was first introduced by Vapnik in [1, 2]. It is a classical classification method based on the principle of structural risk minimization. Due to its great performance, it has attracted much attention in the field of machine learning and has been widely applied in many areas [3, 4]. Similarly, the SVM-based method Support Vector Regression (SVR) is an efficient model for regression problems and guarantees strong generalization ability. In the literature, traditional regression methods include the General Regression Neural Network [5], Multiple Linear Regression [6], and Ridge Regression [7]. Compared with these models, SVR achieves better robustness and generalization, since it considers not only the empirical risk but also the structural risk. At present, SVM-based regression models mainly comprise Support Vector Regression (SVR) [8], Least Squares Support Vector Regression (LSSVR) [9], and Twin Support Vector Regression (TSVR) [10]. These models have been widely utilized in various areas, such as stock market forecasting [11, 12], image understanding [13, 14], and pattern recognition [15].
The classical SVR [8] obtains the final regression function by solving a quadratic programming problem in which the training error and the complexity of the model are minimized in the objective function. Similar to SVM, SVR enjoys good statistical properties. Based on this work, LSSVR makes an improvement by introducing a series of equality constraints; in this way, it only needs to solve a linear system of equations, which leads to a much lower computational complexity. Inspired by the twin support vector classification machine [16], Peng [10] proposed a method called TSVR. Unlike the traditional SVR with two parallel hyperplanes, this model uses two nonparallel hyperplanes to construct the final regression function. Due to this flexibility, TSVR has better generalization ability than the traditional SVR. Moreover, another advantage of TSVR is that the final regression function is obtained by solving two small-sized quadratic programming problems, which leads to a lower computational burden.
It is worth pointing out that, in order to deal with the nonlinear structures in most data sets, these three models first have to use mapping functions to project the original points into some high-dimensional spaces. Since these mapping functions cannot be handled directly, researchers introduce kernel functions to solve the dual problems. However, there is no universal rule for automatically choosing a suitable kernel function for a given data set. Since the kernel function largely affects the final performance of these kernel-based models, users have to spend a lot of time and effort selecting a proper kernel function and its corresponding parameters. This is a tedious and demanding task that damages the applicability of those models. Recently, Luo et al. [17] proposed a kernel-free fuzzy quadratic surface support vector machine (FQSSVM) model which directly generates a quadratic surface for classification. In this way, it skips the notorious search for a proper kernel function in the classical kernel-based SVM. Hence, this new model can save much effort for the user and greatly improve the total efficiency. Based on this leading work, some extensions of kernel-free SVM have been developed and applied [18–20]. The good results of these models demonstrate that the traditional kernel method can be replaced by the quadratic surface method.
In this paper, we propose a completely new linear twin quadratic surface support vector regression (LTQSSVR) model. The main contributions of our work can be summarized in three aspects. First, inspired by the cutting-edge idea of the FQSSVM model, we apply two nonparallel quadratic surfaces to build the regression model. In this way, we successfully avoid the kernel function selection process; hence, the applicability and efficiency of our model are significantly improved. It is worth noting that, to the best of our knowledge, this is the first kernel-free SVM-based regression model. Second, in order to increase the robustness of the model, we incorporate the 1-norm into our model to measure the error between the predicted and actual values. The numerical experiments demonstrate the effectiveness and robustness of this extended model. Third, we equivalently transform the above 1-norm regression model into a linear programming problem. Hence, we can further relieve the computational burden and improve the efficiency. Compared with the benchmark SVM-based regression models, our new model not only achieves slightly better accuracy but also builds and solves the model far more efficiently.
The remaining paper is arranged as follows. Section 2 briefly introduces those benchmark works. Then, the TQSSVR and LTQSSVR are proposed in Sections 3 and 4, respectively. After that, we conduct a comprehensive numerical experiment to compare the performances of those models in Section 5. Finally, we summarize this paper in Section 6.
2. Related Works
2.1. Support Vector Regression
First, we introduce some notations used in this paper. e denotes the vector of ones with an appropriate dimension, ℝ denotes the set of real numbers, ℝ^n denotes the n-dimensional real vector space, and I denotes the identity matrix with an appropriate dimension. For a matrix A ∈ ℝ^{m×n}, A_{ij} denotes the element in the ith row and jth column of A.
Consider a given data set T = {(x_1, y_1), (x_2, y_2), …, (x_m, y_m)}, where x_i ∈ ℝ^n are input vectors and y_i ∈ {−1, 1} are outputs. Let A ∈ ℝ^{m×n} be the matrix whose ith row is A_i = x_i^T, and let Y = (y_1, y_2, …, y_m)^T denote the output vector. The basic idea of the SVM model is to find a hyperplane f(x) = w^T x + b that separates the training points into two classes with a maximum level of separation [2].
Comparatively, the elements of the output vector Y are real numbers in the SVR model. Hence, the goals of SVR and SVM differ: instead of constructing a proper classifier as in SVM, SVR aims to find a suitable regressor. In other words, SVR tries to keep all the training points as close to the regression plane as possible, while SVM tries to keep them as far away from the separating plane as possible. Specifically, SVR can be transformed into a classification-type problem by introducing an ϵ-insensitive tube, which allows a small error in fitting the training data; any error smaller than ϵ is ignored. Let ξ_1 and ξ_2 be the slack vectors that measure the errors of the samples lying outside the ϵ-tube. Then SVR can be formulated as the following quadratic programming problem:
\[
\begin{aligned}
\min_{w,b,\xi_1,\xi_2} \quad & \frac{1}{2}\|w\|^2 + C e^T(\xi_1 + \xi_2) \\
\text{s.t.} \quad & Y - (Aw + eb) \le e\epsilon + \xi_1, \\
& (Aw + eb) - Y \le e\epsilon + \xi_2, \\
& \xi_1 \ge 0, \ \xi_2 \ge 0.
\end{aligned}
\]
The interested readers can refer to [8] for more details.
Specifically, the first term of the objective function maximizes the margin of this classification-type problem, while the constraints indicate that all training points should be contained in the soft ϵ-tube. Here, C is a tradeoff parameter between the margin and the training errors.
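For illustration, the roles of C and ϵ can be seen in the following minimal Python sketch, which fits an ϵ-SVR with a Gaussian kernel to noisy sinc data using scikit-learn (not the MATLAB setup of our experiments); the parameter values are placeholders:

    import numpy as np
    from sklearn.svm import SVR

    # Noisy one-dimensional training data: y = sin(x)/x plus Gaussian noise.
    rng = np.random.default_rng(0)
    X = np.linspace(-4 * np.pi, 4 * np.pi, 200).reshape(-1, 1)
    y = np.sinc(X / np.pi).ravel() + rng.normal(0.0, 0.1, size=200)

    # C trades off flatness against training errors; epsilon is the tube width.
    model = SVR(kernel="rbf", C=4.0, epsilon=0.1, gamma=0.5)
    model.fit(X, y)
    y_pred = model.predict(X)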
2.2. Least Squares Support Vector Regression
LSSVR was first introduced in [9, 21]. Like SVR, the main idea of LSSVR is to seek a decision function of the form f(x) = w^T x + b and keep all the points within a small region. The regression function is obtained by solving the following quadratic programming problem [9, 22, 23]:
\[
\begin{aligned}
\min_{w,b,\xi} \quad & \frac{1}{2}\|w\|^2 + \frac{C}{2}\xi^T\xi \\
\text{s.t.} \quad & Y = Aw + eb + \xi,
\end{aligned}
\]
where ξ is the slack vector. Similarly, its objective function maximizes the margin, while its constraints require all training points to be close to the regression plane. One significant difference between the above model and the SVR model is that solving LSSVR is equivalent to solving a linear system of equations; therefore, the computational process is relatively simple. On the other hand, it should be pointed out that almost all training points contribute to the decision function; hence, LSSVR no longer enjoys the sparsity of SVR.
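In the linear case, this amounts to a single linear system; the following numpy sketch (a helper of our own, not part of the original toolbox) solves the corresponding normal equations:

    import numpy as np

    def lssvr_linear(A, Y, C):
        # Minimize 0.5*||w||^2 + (C/2)*||Y - A w - e b||^2 over (w, b).
        m, n = A.shape
        e = np.ones(m)
        top = np.hstack([A.T @ A + np.eye(n) / C, (A.T @ e).reshape(-1, 1)])
        bottom = np.hstack([(e @ A).reshape(1, -1), np.array([[m]])])
        lhs = np.vstack([top, bottom])          # block system from the zero-gradient conditions
        rhs = np.concatenate([A.T @ Y, [e @ Y]])
        sol = np.linalg.solve(lhs, rhs)
        return sol[:n], sol[n]                  # (w, b)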
2.3. Twin Support Vector Regression
TSVR is similar to TSVM in that it also derives the following pair of nonparallel planes around the data points:
\[
f_1(x) = w_1^T x + b_1, \qquad f_2(x) = w_2^T x + b_2.
\]
However, there are some differences between TSVR and TSVM. First, TSVM considers only one class of data points in each quadratic programming problem, while TSVR uses all data points in both of its quadratic programming problems. Second, TSVM finds two hyperplanes such that each plane is close to one class and as far as possible from the other class, whereas TSVR determines the up- or down-bound function by using only one group of constraints in each quadratic programming problem. Specifically, TSVR is obtained by solving the following pair of quadratic programming problems [10]:
\[
\begin{aligned}
\min_{w_1,b_1,\xi} \quad & \frac{1}{2}\|Y - e\epsilon_1 - (Aw_1 + eb_1)\|^2 + C_1 e^T\xi \\
\text{s.t.} \quad & Y - (Aw_1 + eb_1) \ge e\epsilon_1 - \xi, \quad \xi \ge 0,
\end{aligned}
\]
and
\[
\begin{aligned}
\min_{w_2,b_2,\eta} \quad & \frac{1}{2}\|Y + e\epsilon_2 - (Aw_2 + eb_2)\|^2 + C_2 e^T\eta \\
\text{s.t.} \quad & (Aw_2 + eb_2) - Y \ge e\epsilon_2 - \eta, \quad \eta \ge 0,
\end{aligned}
\]
where C_1, C_2 > 0 and ϵ_1, ϵ_2 ≥ 0 are parameters and ξ and η are the slack vectors. The final regressor is generated by f(x) = (1/2)(f_1(x) + f_2(x)). TSVR is approximately four times faster than the standard SVR in theory due to its formulation.
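To make the structure of this pair concrete, the following Python sketch formulates the linear TSVR primal pair with cvxpy (the variable names are ours; in practice the kernelized version of Section 2.4 is solved):

    import cvxpy as cp
    import numpy as np

    def tsvr_linear(A, Y, C1, C2, eps1, eps2):
        m, n = A.shape
        G = np.hstack([A, np.ones((m, 1))])     # augmented data matrix [A e]

        # Down-bound regressor, u1 = (w1, b1).
        u1, xi = cp.Variable(n + 1), cp.Variable(m)
        f = Y - eps1                             # shifted targets Y - e*eps1
        cp.Problem(cp.Minimize(0.5 * cp.sum_squares(f - G @ u1) + C1 * cp.sum(xi)),
                   [f - G @ u1 >= -xi, xi >= 0]).solve()

        # Up-bound regressor, u2 = (w2, b2).
        u2, eta = cp.Variable(n + 1), cp.Variable(m)
        h = Y + eps2                             # shifted targets Y + e*eps2
        cp.Problem(cp.Minimize(0.5 * cp.sum_squares(h - G @ u2) + C2 * cp.sum(eta)),
                   [G @ u2 - h >= -eta, eta >= 0]).solve()

        return u1.value, u2.value                # average the two bound functions to predict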
2.4. Nonlinear Case and Dual Problems
For a nonlinear case, these three models (SVR, LSSVR, and TSVR) first need to project the data points into a higher dimensional space via a mapping function ϕ(x): ℝ^n → ℝ^d, d > n, and then conduct the linear regression f(x) = w^T ϕ(x) + b in this new space [24]. Since the mapping function is difficult to handle directly, researchers introduce a kernel function K(x_i, x_j) = ϕ(x_i) ⋅ ϕ(x_j) into the dual problems [25].
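For reference, the Gaussian kernel matrix used later in Section 5.1 can be computed with the following small numpy helper of our own:

    import numpy as np

    def gaussian_kernel(X1, X2, sigma):
        # K[i, j] = exp(-||x1_i - x2_j||^2 / sigma^2)
        sq_dists = (np.sum(X1 ** 2, axis=1)[:, None]
                    + np.sum(X2 ** 2, axis=1)[None, :]
                    - 2.0 * X1 @ X2.T)
        return np.exp(-np.maximum(sq_dists, 0.0) / sigma ** 2)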
Following this way, the dual problem of SVR can be written as follows:
\[
\begin{aligned}
\max_{\alpha,\alpha^*} \quad & -\frac{1}{2}\sum_{i,j=1}^{m}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)K(x_i, x_j) - \epsilon\sum_{i=1}^{m}(\alpha_i + \alpha_i^*) + \sum_{i=1}^{m}y_i(\alpha_i - \alpha_i^*) \\
\text{s.t.} \quad & \sum_{i=1}^{m}(\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i, \alpha_i^* \le C, \ i = 1, \ldots, m,
\end{aligned}
\]
where α and α* are the vectors of Lagrange multipliers [8]. Then, the decision function of SVR is of the following form:
\[
f(x) = \sum_{i=1}^{m}(\alpha_i - \alpha_i^*)K(x_i, x) + b.
\]
Similarly, by introducing the Lagrange multipliers α for the equality constraints of LSSVR, its dual problem reduces to the following linear system:
\[
\begin{pmatrix} 0 & e^T \\ e & K(A, A^T) + I/C \end{pmatrix}
\begin{pmatrix} b \\ \alpha \end{pmatrix} =
\begin{pmatrix} 0 \\ Y \end{pmatrix},
\]
where K(A, A^T) denotes the kernel matrix with entries K(x_i, x_j). The decision function of LSSVR is
\[
f(x) = \sum_{i=1}^{m}\alpha_i K(x_i, x) + b.
\]
Moreover, the dual problems of TSVR take the following forms:
\[
\begin{aligned}
\max_{\alpha} \quad & -\frac{1}{2}\alpha^T G(G^TG)^{-1}G^T\alpha + f^T\bigl(G(G^TG)^{-1}G^T - I\bigr)\alpha \\
\text{s.t.} \quad & 0 \le \alpha \le C_1 e,
\end{aligned}
\]
and
\[
\begin{aligned}
\max_{\gamma} \quad & -\frac{1}{2}\gamma^T G(G^TG)^{-1}G^T\gamma - h^T\bigl(G(G^TG)^{-1}G^T - I\bigr)\gamma \\
\text{s.t.} \quad & 0 \le \gamma \le C_2 e,
\end{aligned}
\]
where G = [K(A, A^T) \ \ e], f = Y - e\epsilon_1, and h = Y + e\epsilon_2. The parameters of the two bound functions f_1(x) = K(x^T, A^T)w_1 + b_1 and f_2(x) = K(x^T, A^T)w_2 + b_2 are recovered from the dual solutions by
\[
\begin{pmatrix} w_1 \\ b_1 \end{pmatrix} = (G^TG)^{-1}G^T(f - \alpha), \qquad
\begin{pmatrix} w_2 \\ b_2 \end{pmatrix} = (G^TG)^{-1}G^T(h + \gamma).
\]
The final regression result of TSVR is determined by the average value of f_1(x) and f_2(x) as follows:
\[
f(x) = \frac{1}{2}\bigl(f_1(x) + f_2(x)\bigr) = \frac{1}{2}\bigl(K(x^T, A^T)(w_1 + w_2) + b_1 + b_2\bigr).
\]
It is worth noting that there are many kinds of kernel functions, such as the linear kernel, the Gaussian kernel, and the polynomial kernel. However, there is no general guideline to help the user choose a suitable kernel for a given data set, and the choice of the kernel function largely affects the regression performance. Hence, the time-consuming search for a proper kernel function and its corresponding parameters significantly drags down the total efficiency.
3. Twin Quadratic Surface Support Vector Regression
In this paper, we propose a totally new kernel-free TQSSVR model, which directly generates quadratic surfaces for regression in the original space instead of projecting the data points into a higher dimensional space. Our new model overcomes the main drawback of the traditional SVM-based regression models by avoiding the kernel selection process. Hence, it has much higher efficiency and applicability.
For a given training data set, we want to find two quadratic surfaces
\[
f_1(x) = \frac{1}{2}x^T W_1 x + b_1^T x + c_1, \qquad
f_2(x) = \frac{1}{2}x^T W_2 x + b_2^T x + c_2,
\]
with symmetric matrices W_1, W_2 ∈ ℝ^{n×n}, vectors b_1, b_2 ∈ ℝ^n, and scalars c_1, c_2, which determine the ϵ-insensitive down- and up-bound regressors. Following the basic scheme of TSVR, TQSSVR can be formulated as the following two quadratic programming problems:
\[
\begin{aligned}
\min_{W_1,b_1,c_1,\xi} \quad & \frac{1}{2}\|Y - e\epsilon_1 - F_1(A)\|^2 + C_1 e^T\xi \\
\text{s.t.} \quad & Y - F_1(A) \ge e\epsilon_1 - \xi, \quad \xi \ge 0,
\end{aligned}
\]
and
\[
\begin{aligned}
\min_{W_2,b_2,c_2,\eta} \quad & \frac{1}{2}\|Y + e\epsilon_2 - F_2(A)\|^2 + C_2 e^T\eta \\
\text{s.t.} \quad & F_2(A) - Y \ge e\epsilon_2 - \eta, \quad \eta \ge 0,
\end{aligned}
\]
where F_k(A) = (f_k(x_1), …, f_k(x_m))^T for k = 1, 2, the parameters C_1, C_2 > 0 and ϵ_1, ϵ_2 ≥ 0 are chosen a priori, and ξ and η are the slack vectors. f_1(x) generates the ϵ_1-insensitive down-bound regressor, while f_2(x) generates the ϵ_2-insensitive up-bound regressor.
Note that the first term in the objective function of (14) is the sum of squared deviations of the (ϵ_1-shifted) training points from the quadratic surface f_1(x); therefore, minimizing it amounts to minimizing the regression error, and this term vanishes only when f_1(x) matches the shifted targets exactly. Moreover, the constraints indicate that the estimated function f_1(x) lies at least ϵ_1 below the training points, up to the slack ξ. The second term of the objective function minimizes the sum of the slack errors. Like SVR, we assume that we can tolerate at most an ϵ_1 deviation between f_1(x) and y. Similar explanations apply to problem (15).
It is worth pointing out that the above formulations can be further simplified by the following steps. Note that the matrices W_1 and W_2 are symmetric. First, let w̄_1 and w̄_2 be the vectors formed by taking the elements of the upper triangular part of W_1 and W_2, respectively:
\[
\bar{w}_k = (W_{k,11}, W_{k,12}, \ldots, W_{k,1n}, W_{k,22}, W_{k,23}, \ldots, W_{k,nn})^T \in \mathbb{R}^{n(n+1)/2}, \quad k = 1, 2.
\]
Then, for each point x_i, we construct a vector s_i ∈ ℝ^{n(n+1)/2+n} as follows:
\[
s_i = \Bigl(\tfrac{1}{2}x_{i1}^2,\ x_{i1}x_{i2},\ \ldots,\ x_{i1}x_{in},\ \tfrac{1}{2}x_{i2}^2,\ x_{i2}x_{i3},\ \ldots,\ \tfrac{1}{2}x_{in}^2,\ x_{i1},\ \ldots,\ x_{in}\Bigr)^T,
\]
so that \tfrac{1}{2}x_i^T W_k x_i + b_k^T x_i = s_i^T z_k. Let S = (s_1, s_2, \ldots, s_m)^T. Finally, we define two vectors of variables in the following form:
\[
z_1 = \begin{pmatrix} \bar{w}_1 \\ b_1 \end{pmatrix}, \qquad
z_2 = \begin{pmatrix} \bar{w}_2 \\ b_2 \end{pmatrix}.
\]
It is easy to check that problem (14) and problem (15) can be equivalently reformulated as follows:
\[
\begin{aligned}
\min_{z_1,c_1,\xi} \quad & \frac{1}{2}\|Y - e\epsilon_1 - (Sz_1 + ec_1)\|^2 + C_1 e^T\xi \\
\text{s.t.} \quad & Y - (Sz_1 + ec_1) \ge e\epsilon_1 - \xi, \quad \xi \ge 0,
\end{aligned}
\]
and
\[
\begin{aligned}
\min_{z_2,c_2,\eta} \quad & \frac{1}{2}\|Y + e\epsilon_2 - (Sz_2 + ec_2)\|^2 + C_2 e^T\eta \\
\text{s.t.} \quad & (Sz_2 + ec_2) - Y \ge e\epsilon_2 - \eta, \quad \eta \ge 0.
\end{aligned}
\]
For a point x, the final regression result of TQSSVR is the mean value of f_1(x) and f_2(x):
\[
f(x) = \frac{1}{2}\bigl(f_1(x) + f_2(x)\bigr) = \frac{1}{2}\bigl(s^T(z_1 + z_2) + c_1 + c_2\bigr),
\]
where s is constructed from x in the same way as s_i.
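A small Python helper of our own, shown below, builds the matrix S with rows s_i under the convention assumed above (a factor 1/2 on the squared terms). With S in hand, the reformulated problems have exactly the same structure as the linear TSVR pair sketched in Section 2.3, with the augmented matrix [S e] in place of [A e].

    import numpy as np

    def quad_features(X):
        # Map each row x of X to the vector s used in the TQSSVR reformulation,
        # assuming the surface 0.5*x^T W x + b^T x + c with symmetric W.
        m, n = X.shape
        cols = []
        for j in range(n):
            for k in range(j, n):
                if j == k:
                    cols.append(0.5 * X[:, j] ** 2)    # squared terms carry 1/2
                else:
                    cols.append(X[:, j] * X[:, k])     # cross terms appear once
        for j in range(n):
            cols.append(X[:, j])                       # linear terms
        return np.column_stack(cols)   # S with rows s_i, shape (m, n(n+1)/2 + n)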
4. Linear Twin Quadratic Surface Support Vector Regression
In the above model, the error between the predicted value and the actual value is measured by the 2-norm. It is worth noting that the 2-norm is sensitive to points that are far away from the regressor and may amplify the influence of such error points, especially in the presence of outliers or mislabeled information. Compared with the 2-norm, the 1-norm is more robust and less sensitive to such errors [26, 27].
In this section, in order to increase the robustness, we introduce the 1-norm and transform problem (19) and problem (20) into two linear programming problems. This not only enhances the performance but also greatly improves the computational efficiency. Hence, we extend our model by replacing the squared 2-norm error terms with 1-norm terms as follows:
\[
\begin{aligned}
\min_{z_1,c_1,\xi} \quad & \|Y - e\epsilon_1 - (Sz_1 + ec_1)\|_1 + C_1 e^T\xi \\
\text{s.t.} \quad & Y - (Sz_1 + ec_1) \ge e\epsilon_1 - \xi, \quad \xi \ge 0,
\end{aligned}
\]
and
\[
\begin{aligned}
\min_{z_2,c_2,\eta} \quad & \|Y + e\epsilon_2 - (Sz_2 + ec_2)\|_1 + C_2 e^T\eta \\
\text{s.t.} \quad & (Sz_2 + ec_2) - Y \ge e\epsilon_2 - \eta, \quad \eta \ge 0.
\end{aligned}
\]
For a vector a, ‖a‖_1 is the sum of the absolute values of all its elements. Let s_1 = ‖Y − eϵ_1 − (Sz_1 + ec_1)‖_1 and s_2 = ‖Y + eϵ_2 − (Sz_2 + ec_2)‖_1. Then, for any nonnegative vectors t_1 and t_2 satisfying
\[
-t_1 \le Y - e\epsilon_1 - (Sz_1 + ec_1) \le t_1, \qquad
-t_2 \le Y + e\epsilon_2 - (Sz_2 + ec_2) \le t_2,
\]
we have s_1 ≤ e^T t_1 and s_2 ≤ e^T t_2, with equality when t_1 and t_2 equal the componentwise absolute values of the residuals.
Based on these inequalities, we can reformulate the two twin regression problems as the following two linear programming problems:
\[
\begin{aligned}
\min_{z_1,c_1,t_1,\xi} \quad & e^T t_1 + C_1 e^T\xi \\
\text{s.t.} \quad & -t_1 \le Y - e\epsilon_1 - (Sz_1 + ec_1) \le t_1, \\
& Y - (Sz_1 + ec_1) \ge e\epsilon_1 - \xi, \quad \xi \ge 0, \ t_1 \ge 0,
\end{aligned}
\]
and
\[
\begin{aligned}
\min_{z_2,c_2,t_2,\eta} \quad & e^T t_2 + C_2 e^T\eta \\
\text{s.t.} \quad & -t_2 \le Y + e\epsilon_2 - (Sz_2 + ec_2) \le t_2, \\
& (Sz_2 + ec_2) - Y \ge e\epsilon_2 - \eta, \quad \eta \ge 0, \ t_2 \ge 0.
\end{aligned}
\]
It is worth pointing out that the linear structure of these two reformulations greatly improves the computational efficiency. Moreover, using the 1-norm distance instead of the 2-norm distance of TQSSVR makes the new model less sensitive to noise or large errors.
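To make the reformulation concrete, the following sketch assembles and solves the first subproblem with scipy.optimize.linprog, mirroring the role played by linprog.m in our experiments; the variable layout and helper name are our own, S can be built by the quad_features helper above, and the second subproblem is assembled analogously with Y + eϵ_2 and the reversed bound constraint.

    import numpy as np
    from scipy.optimize import linprog

    def ltqssvr_lp1(S, Y, C1, eps1):
        # Variable layout: x = [z1 (d), c1 (1), t (m), xi (m)], where t bounds the
        # absolute residuals |Y - e*eps1 - (S z1 + e c1)| componentwise.
        m, d = S.shape
        e = np.ones((m, 1))
        Im, Zm = np.eye(m), np.zeros((m, m))
        rhs = Y - eps1                           # Y - e*eps1

        # Objective: sum(t) + C1 * sum(xi); z1 and c1 do not appear.
        c_vec = np.concatenate([np.zeros(d + 1), np.ones(m), C1 * np.ones(m)])

        # Inequality constraints written as A_ub @ x <= b_ub.
        A_ub = np.vstack([
            np.hstack([ S,  e, -Im, Zm]),   #  S z1 + c1 - t  <=  Y - eps1
            np.hstack([-S, -e, -Im, Zm]),   # -S z1 - c1 - t  <= -(Y - eps1)
            np.hstack([ S,  e,  Zm, -Im]),  #  S z1 + c1 - xi <=  Y - eps1
        ])
        b_ub = np.concatenate([rhs, -rhs, rhs])

        bounds = [(None, None)] * (d + 1) + [(0, None)] * (2 * m)
        res = linprog(c_vec, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
        return res.x[:d], res.x[d]               # (z1, c1)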
To further illustrate the robustness of the 1-norm, we give an example here. The points (x_i, y_i) were generated with x_i uniformly distributed on [0, 1] and additive Gaussian noise ξ_i following N(0, 0.15²). For simplicity, we set ϵ_1 = ϵ_2 = 0, and the final regressors were derived by LTQSSVR and TQSSVR. Then, to test the robustness of TQSSVR and LTQSSVR, we added some outliers to the data. As shown in Figure 1, the two regressors almost overlap on the data without outliers. However, on the data with outliers, TQSSVR is obviously biased while LTQSSVR remains quite robust.
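A data-generation sketch for this kind of robustness check is given below; the underlying function g(x) = x² and the outlier magnitudes are placeholder choices of ours, not necessarily those used for Figure 1.

    import numpy as np

    def make_robustness_data(m=100, n_outliers=5, seed=0):
        rng = np.random.default_rng(seed)
        x = rng.uniform(0.0, 1.0, size=m)
        y = x ** 2 + rng.normal(0.0, 0.15, size=m)           # clean targets plus N(0, 0.15^2) noise
        idx = rng.choice(m, size=n_outliers, replace=False)
        y_out = y.copy()
        y_out[idx] += rng.uniform(2.0, 3.0, size=n_outliers)  # inject a few large outliers
        return x.reshape(-1, 1), y, y_out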

Figure 1: Regression results of TQSSVR and LTQSSVR: (a) data without outliers; (b) data with outliers.
5. Numerical Experiment and Discussion
To investigate the performance of the proposed LTQSSVR and compare it with other benchmark SVM-based regression methods, we conducted a comprehensive numerical experiment. The comparison list includes Ridge Regression, Linear Regression, nu-SVR, ϵ-SVR, LSSVR, TSVR, LTSVR, and our LTQSSVR, where LTSVR, proposed by Xu [28], introduces the 1-norm into TSVR. Besides, two different types of artificial data sets and twelve benchmark real-world data sets were used in the experiment.
All computational tests were executed in MATLAB (R2016a) on a personal laptop with an Intel P4 processor (1.8 GHz) and 4 GB usable RAM. All the corresponding models were solved by the function "quadprog.m" or "linprog.m" in the MATLAB toolbox (available from https://pan.baidu.com/s/1QRX40tVcnO–f4bz8c0–58Q.password:itiz).
5.1. Kernel Function and Parameters Selection
For the kernel-based models, the Gaussian kernel function K(x_i, x_j) = exp(−∥x_i − x_j∥²/σ²) was adopted, since it is the most commonly used one. The corresponding optimal kernel parameter σ in nu-SVR, ϵ-SVR, LSSVR, TSVR, and LTSVR was selected from the set {2^i ∣ i = −4, −3, …, 4, 5}. It is worth pointing out that we only considered the Gaussian kernel for the kernel-based models in the experiment; had other kernel functions also been compared in the searching process, the total running times of those models would have increased tremendously.
Moreover, we set C_1 = C_2 = C and ϵ_1 = ϵ_2 = ϵ in our experiment. Specifically, the optimal penalty parameter C for these models was selected from the set {2^i ∣ i = −4, −3, …, 4, 5}, and the insensitive parameter ϵ was chosen from the set {0.001, 0.01, 0.1, 0.2}. Note that all these parameters were determined by fivefold cross validation: the data set was randomly split into five subsets, and each time one subset was reserved as the testing set while the other four subsets were used together as the training set. For every data set, this process was repeated ten times, and all reported results are the average values of these ten tests.
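The tuning procedure can be summarized by the following Python sketch, in which fit_predict is a hypothetical wrapper around any of the compared regressors and the grids match those listed above:

    import itertools
    import numpy as np
    from sklearn.model_selection import KFold

    def tune(X, y, fit_predict, n_repeats=10, seed=0):
        # Select (C, eps) by repeated five-fold cross validation on RMSE.
        C_grid = [2.0 ** i for i in range(-4, 6)]
        eps_grid = [0.001, 0.01, 0.1, 0.2]
        best, best_score = None, np.inf
        for C, eps in itertools.product(C_grid, eps_grid):
            scores = []
            for rep in range(n_repeats):
                kf = KFold(n_splits=5, shuffle=True, random_state=seed + rep)
                for tr, te in kf.split(X):
                    pred = fit_predict(X[tr], y[tr], X[te], C, eps)
                    scores.append(np.sqrt(np.mean((y[te] - pred) ** 2)))  # RMSE
            if np.mean(scores) < best_score:
                best, best_score = (C, eps), np.mean(scores)
        return best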
In each cross validation, we use the following four indicators from [10, 28] as the criteria for selecting the optimal parameters and verifying the performance of the eight models (a small code sketch of these measures follows the list):
(1) RMSE: Root Mean Squared Error, defined as RMSE = ((1/m) Σ_{i=1}^{m} (y_i − ŷ_i)²)^{1/2}, where ŷ_i denotes the predicted value of y_i. It measures the deviation between the predicted value and the true value. Because RMSE averages squared errors, it is sensitive to abnormal points: if the regression value of a certain point is unreasonable, its error is relatively large and has a great impact on the RMSE. In general, the smaller the RMSE, the more accurate the prediction results.
(2) MAE: Mean Absolute Error, defined as MAE = (1/m) Σ_{i=1}^{m} |y_i − ŷ_i|. It is the mean of the absolute errors and reflects the actual magnitude of the prediction error.
(3) SSE/SST: SSE (Sum of Squares due to Error), SSE = Σ_{i=1}^{m} (y_i − ŷ_i)², measures the difference between the actual values and the predicted values, which represents the part not explained by the regression equation. SST (Sum of Squares Total), SST = Σ_{i=1}^{m} (y_i − ȳ)², measures the deviation of y from its mean ȳ, which represents the total variation of the data. In most cases, a smaller SSE/SST means a better consistency between estimates and actual values.
(4) SSR/SST: SSR (Sum of Squares due to Regression), SSR = Σ_{i=1}^{m} (ŷ_i − ȳ)², measures the difference between the estimated values and the mean value of y, which represents the part explained by the regression equation. The larger the SSR, the more statistical information the model captures from the test sample. A small SSE/SST is usually accompanied by an increase in SSR/SST.
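These four indicators can be computed directly from their definitions, as in the following short helper:

    import numpy as np

    def regression_metrics(y_true, y_pred):
        resid = y_true - y_pred
        rmse = np.sqrt(np.mean(resid ** 2))
        mae = np.mean(np.abs(resid))
        sse = np.sum(resid ** 2)                      # unexplained variation
        sst = np.sum((y_true - y_true.mean()) ** 2)   # total variation
        ssr = np.sum((y_pred - y_true.mean()) ** 2)   # explained variation
        return rmse, mae, sse / sst, ssr / sst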
5.2. Artificial Data Sets
First, we followed the traditional way in [10, 28] to generate a classical type of artificial data set in the experiment. This is a 2-D artificial data set obtained from the sinc function y = sinc(x) = sin(x)/x. In reality, data sets always contain noise points; in order to check the performance of our model in this situation, two kinds of noises, Gaussian noise and uniformly distributed noise, were added. Specifically, four types of training points were generated by corrupting the sinc outputs with different noise settings, where U[a, b] represents the uniform distribution on [a, b] and N[0, d²] represents the normal distribution with mean 0 and variance d². To avoid a biased comparison, we randomly generated 10 groups of independent noise samples for each type of noise. Following the traditional way in [10, 28], each group contains 600 training samples and 400 test samples, and the test points were generated by the function sinc(x) without noise. The corresponding results of the eight algorithms Ridge Regression, Linear Regression, nu-SVR, ϵ-SVR, LSSVR, TSVR, LTSVR, and LTQSSVR are shown in Table 1.
From Table 1, we can see that Ridge Regression achieves the best results, obtaining the minimum RMSE and MAE values, for noises of Type 1 and Type 2, while ϵ-SVR obtains the best results for noises of Type 3 and Type 4. Besides, LTQSSVR shows a stable performance on most data sets and has better robustness to noise. It can also be seen from Table 1 that Linear Regression is the fastest among all algorithms; however, it is only suitable for cases where the data have a linear structure.
In order to verify the generalization ability of the eight models, we used another type of artificial data set from [20] to test their performances. We now describe how this type of artificial data set is obtained. First, we used different matrices U and vectors c to generate various quadratic surfaces, where each element of U and c was randomly selected from the interval [−10, 10]. Then, we randomly generated points on both sides of each quadratic surface. Similarly, we generated 10 independent groups of samples, where each group includes 60% training samples with noise and 40% test points without noise. The corresponding results are shown in Table 2.
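A sketch of this generator is given below; the sampling domain of the inputs, the symmetrization of U, and the noise scale are our own assumptions, since the exact settings follow [20]:

    import numpy as np

    def quadratic_surface_data(m=500, n=3, noise=0.1, seed=0):
        rng = np.random.default_rng(seed)
        U = rng.uniform(-10.0, 10.0, size=(n, n))
        U = 0.5 * (U + U.T)                               # symmetrize the quadratic term
        c = rng.uniform(-10.0, 10.0, size=n)
        X = rng.uniform(-1.0, 1.0, size=(m, n))
        y = np.einsum("ij,jk,ik->i", X, U, X) + X @ c     # points on the surface
        y += rng.normal(0.0, noise * np.std(y), size=m)   # scatter them around it
        return X, y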
From Table 2, we can see that LTQSSVR achieves the best performance in seven of the eight cases, obtaining the smallest RMSE, MAE, and SSE/SST values. Compared with TSVR, LTSVR improves the training results by introducing the 1-norm instead of the original 2-norm, but for large-scale data sets its advantage is not significant. Among these methods, Ridge Regression has the shortest processing time, followed by Linear Regression, nu-SVR, and ϵ-SVR, which lack robustness to noise. Moreover, the experimental results on data sets of different scales show that LTQSSVR performs particularly well on large-scale but low-dimensional data sets. Besides, we also show the box plots of the RMSE values in Figure 2. It is easy to check that the RMSE values of LTQSSVR are concentrated and have the smallest average; hence, LTQSSVR is robust in most cases.

Figure 2: Box plots of the RMSE values of the eight methods on the second type of artificial data sets ((a)–(c)).
5.3. Benchmark Data Sets
In this part, we use 12 benchmark data sets from the UCI Repository to test the eight models (available from https://archive.ics.uci.edu/ml/datasets.php). The list includes Auto MPG, Oring, Wis, Slump, Haberman, and so on. The detailed information of these data sets is summarized in Table 3.
To prevent features with large magnitudes from dominating, we normalized the features of all data sets to [0, 1] before training. The experimental results of the eight models are summarized in Table 4, where for each result the first term denotes the average value over ten runs and the second term denotes the standard deviation. From Table 4, we can see that LTQSSVR obtains good results in most of the cases. Compared with LSSVR and TSVR, the new model greatly improves the computational efficiency while maintaining the accuracy.
The performances of the eight methods on the 12 data sets are summarized in Table 4. We can see that the introduction of the 1-norm in LTSVR improves the efficiency in some cases, but this advantage does not hold for large-sized data sets. In contrast, LTQSSVR greatly improves the training efficiency in most of the cases. It can be seen from Figures 3 and 4 that the processing time of LTQSSVR is much shorter than those of the other methods; hence, LTQSSVR has a large advantage in terms of efficiency. This is because our kernel-free model avoids the time-consuming task of selecting a proper kernel and its parameters. Besides, the structure of two small-scale linear programming problems further accelerates the computation [29–32].


From Table 4, we can easily see that our proposed method generates stable regressions on most of the data sets. It achieves the best accuracy on two data sets (Auto mpg, Computer hardware), the second best accuracy on one data set (Hayes roth), the third best accuracy on two data sets (Body fat, Haberman), and the fourth best accuracy on five data sets (Oring, Wis, Real estate valuation, Slump, Ozone). We summarize the average rank information of these eight methods on the 12 data sets in Table 5.
Then, we employ the Friedman test to check for differences among all methods under the null hypothesis that all the algorithms perform equally well, i.e., that their mean ranks are equal. The Friedman statistic is
\[
\chi_F^2 = \frac{12N}{k(k+1)}\left[\sum_{j=1}^{k} R_j^2 - \frac{k(k+1)^2}{4}\right],
\]
where R_j = \frac{1}{N}\sum_{i=1}^{N} r_i^j and r_i^j denotes the rank of the jth of k methods on the ith of N data sets. Then, the statistic F_F is obtained by
\[
F_F = \frac{(N-1)\chi_F^2}{N(k-1) - \chi_F^2}.
\]
With eight methods and 12 data sets, F_F is distributed according to the F distribution with (7, 77) degrees of freedom. According to (26), (27), and Table 5, we obtain F_F = 2.3593. The critical value of F(7, 77) for α = 0.05 is 2.131 and, similarly, 1.796 for α = 0.1, so we reject the null hypothesis at both levels. Hence, there is a significant difference among the eight methods. Note that LTQSSVR obtains the smallest average rank, 3.25, which means that the accuracy of our approach is also very close to the best one.
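The test can be reproduced with a few lines of Python; the average ranks below are placeholders rather than the actual values from Table 5 (only the value 3.25 for LTQSSVR is reported above):

    import numpy as np
    from scipy.stats import f as f_dist

    k, N = 8, 12
    # Hypothetical average ranks (they must sum to k*(k+1)/2 = 36).
    avg_ranks = np.array([3.25, 4.10, 4.40, 4.60, 4.70, 4.80, 5.00, 5.15])

    chi2_F = 12 * N / (k * (k + 1)) * (np.sum(avg_ranks ** 2) - k * (k + 1) ** 2 / 4)
    F_F = (N - 1) * chi2_F / (N * (k - 1) - chi2_F)
    crit = f_dist.ppf(0.95, k - 1, (k - 1) * (N - 1))   # critical value of F(7, 77) at alpha = 0.05
    print(chi2_F, F_F, crit, F_F > crit)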
As for the processing time, Ridge Regression and Linear Regression take the shortest time among these methods, but they lack robustness. In addition, LTQSSVR takes much shorter computational time than TSVR and LTSVR on all data sets. For the large data sets, TSVR and LTSVR require a lot of time, since they have to tune the parameters in the kernel function. In particular, we only used the Gaussian kernel function in this paper; hence, if the selection of the kernel function were added to those kernel-based models, their efficiency would be dragged down even further. Therefore, these methods are not suitable for dealing with huge-sized cases, and their applicability is strictly limited in the big-data era. In contrast, due to its kernel-free and linear programming structure, our new approach has a strong ability to handle huge-sized data sets quickly. Overall, our proposed method not only has good generalization ability but also achieves great efficiency.
6. Conclusion
In this paper, we proposed a new variant of TSVR, which directly uses two quadratic surfaces in the original space to fit the data. It is worth noting that our new approach avoids the difficult and time-consuming task of searching for a suitable kernel function and its corresponding parameters in the traditional kernel-based SVR methods. Besides, in order to further improve the efficiency and robustness, our model incorporates the 1-norm to measure the regression error. The corresponding linear programming structure avoids the matrix inverse operation and leads to high efficiency even for huge-sized problems. Finally, the experimental results on different data sets demonstrate the validity and applicability of our method. Specifically, compared with those benchmark nonlinear regression models, our model is superior in both accuracy and efficiency.
For future work, since the efficiency advantage of our method is less obvious on high-dimensional data sets, we hope to incorporate a feature selection method to improve it.
Data Availability
The data used in this paper are available from https://archive.ics.uci.edu/ml/datasets.php.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
Tian’s research has been supported by the Fundamental Research Funds for the Central Universities (Nos. JBK2002001, JBK1805005, and JBK190504).