Abstract
With the improvement of living standards and the development of science and technology, the Internet of Vehicles (IOV) will play an important part in industrial transportation as a main research field of the Internet of Things. As a result, it is necessary to know the location of each vehicle. However, a traditional stand-alone global positioning system is easily affected by the external environment, so an accessorial locating approach based on wideband direction of arrival (DOA) estimation in intelligent transportation is proposed. First, the array received signal on the road infrastructure is modelled. Then, by means of random forest regression (RFR), a supervised learning method, the upper triangular elements of the covariance matrix at each frequency and the actual DOA are extracted as the input features and output parameters, respectively; the corresponding prediction coefficients are then solved by training. After that, the trained RFR model is used to calculate the final direction from test samples. Finally, the vehicles are located according to the geometrical relation between the vehicle and the infrastructure. The proposed algorithm is suitable not only for uncorrelated signals but also for mixtures of uncorrelated and correlated signals, without wideband focusing. Simulations show that, compared with some sparse recovery algorithms, the prediction accuracy and resolution are effectively improved.
1. Introduction
Nowadays, automobile traffic has become an indispensable part of modern industry, and vehicle-related technologies and industries are becoming increasingly mature [1–4]. At the same time, the development of technologies related to the Internet of Vehicles is promoting the evolution of the traditional automobile industry towards new intelligent vehicles. Market demand and technological exploration mutually reinforce each other in areas including industrial transportation, advanced traffic information systems, and travel technology, which can provide navigation and positioning, road status, parking instructions, safe driving assistance, and other all-round services [5–8]. Industry consists of two important aspects: production and transportation. Large enterprises and logistics distribution centres often handle large amounts of cargo, such as steel and cement among industrial products and oil and coal among mineral raw materials. The freight volume of these goods is generally very large, and sometimes multiple full-load trips are required. After a vehicle completes a freight task, it drives empty to a city where a task has not yet been performed in order to load again. If vehicle locations are not known, reasonable scheduling cannot be realized, which reduces the utilization rate of vehicles and increases transportation cost. In addition, electronic commerce and customer requirements are changing rapidly and also require real-time vehicle locating, so it is urgent to study vehicle positioning methods for industrial transportation.
The rapid development and wide application of computer technology, intelligent transportation systems, and especially Internet of Things technology provide comprehensive technical support for industrial transportation. Among them, the Internet of Vehicles (IOV) is a network for wireless communication and information exchange based on big data and agreed communication protocols, which integrates a variety of key technologies such as on-board positioning, wireless transmission, and cloud computing. According to the application objects, IOV technology can be divided into two broad categories: vehicle to infrastructure (V2I) [9–11] and vehicle to vehicle (V2V) [12–14]. Of course, the global positioning system (GPS) can obtain vehicle locations and surroundings. However, GPS is extremely limited when a vehicle enters subways, tunnels, and other obstructed environments. Fortunately, big data technology in IOV ensures that vehicles can communicate with each other in real time and fuse various kinds of information, which has a significant impact on road safety, for example through intersection mobility assistance and left turn assistance. Big data has the features of scale, diversity, high speed, and authenticity. By analysing massive data, we can calculate when and where traffic jams will occur, and even track and trace the trajectories of vehicles and persons. The V2I architecture allows the vehicle to transmit its position and speed data to the central server through the infrastructure, so as to realize data exchange during driving and integrate overall road and vehicle information to deal with traffic congestion, route planning, and accidents; it can also provide a high-precision radio locating service for users through the cloud computing platform and other advanced means [15, 16].
In general, radio locating is classified into ranging and nonranging methods. The former need to measure the actual Euclidean distance or angle, for example through received signal strength indication (RSSI) [17, 18], time of arrival (TOA) [19, 20], time difference of arrival (TDOA) [21–24], and their fusion algorithms [25, 26], while the latter are based on the topology [27], connectivity [28], multihop [29], and fingerprint information [30] of the network itself. Accessorial vehicle positioning based on RSSI needs to know the spatial attenuation characteristics of the signals, which are difficult to obtain accurately because of the increasing complexity of wireless channels, and the TDOA and TOA algorithms are both very sensitive to time measurement, which makes high precision difficult to achieve; hence, direction of arrival (DOA) estimation has become a good choice. Compared with algorithms based on RSSI, TDOA, and TOA, the locating accuracy of a DOA-based algorithm depends only on the DOA estimate, which can be obtained by the corresponding superresolution direction finding techniques.
The research on narrowband DOA estimation has developed rapidly over the past half-century. The most famous algorithm is multiple signal classification (MUSIC) [31], which is based on the orthogonality between the signal steering vector and the noise subspace and finds the target by searching the whole angle space. Another is estimation of signal parameters via rotational invariance techniques (ESPRIT) [32], which uses the rotational invariance of the signal subspace to calculate the DOA and so avoids spectrum peak searching. Subsequently, the appearance of maximum likelihood estimation [33] was another important step; although it needs very complex optimization and iteration, it has higher precision. All three methods can break the Rayleigh limit. After that, more and more improvements with additional functions have sprung up, such as the propagator method [34] and decorrelation methods [35, 36]. Among these algorithms, compressed sensing (CS) is a signal processing theory that has emerged in recent years, and the related research has been carried out continuously since it appeared. CS methods can be classified into grid-division and grid-less methods: the former need to partition the airspace into multiple grids before signal recovery [37–39], while the grid-less idea originally presented by Candes and Fernandez [40] can solve the DOA in the continuous domain. As a result, it has aroused great interest and attention [41–43], but a positive semidefinite problem must inevitably be solved, so the computation is heavy.
In recent years, wideband signals have been promoted in both civilian and military fields. Compared with narrowband signals, wideband echoes carry more information and have strong anti-interference ability, which is conducive to detection, parameter estimation, and feature extraction for multiple targets [44–46], and narrowband DOA estimation technology can be extended to them. The well-known coherent signal subspace method (CSSM) [47] focuses the data of every frequency onto a single reference point, after which processing suitable for narrowband signals is employed; it adapts to both correlated and uncorrelated sources, but preestimation of the DOA is needed. In order to improve the performance of wideband signal processing, not only compressed sensing but also the discrete cosine transform, Kalman filters, higher-order statistics, spatial-temporal analysis, the fractional Fourier transform, and especially machine learning methods such as support vector regression [48, 49] and neural networks [50] have been introduced into DOA estimation one after another, with good results.
As shown in Figure 1, this paper considers an accessorial method for IOV locating based on a cloud platform: the DOA of the vehicle is calculated by random forest regression (RFR), and its position is then determined according to the geometrical relation between the vehicle and the infrastructure. The paper has the following three main contributions:
(1) At present, the mainstream global positioning system (GPS) for outdoor positioning cannot effectively locate vehicles in the presence of obstructions. Because of the high resolution of wideband signals, this paper proposes a scheme that uses wideband DOA estimation to assist GPS in IOV locating. With the help of the cloud platform, the DOAs are estimated and the vehicles are located according to the geometrical relationship between the signals and the infrastructure. This not only improves the accuracy but also enhances the resolution for the target, which makes it very suitable for multiple vehicles
(2) In this paper, signal information at different frequencies is extracted as the input feature, while the DOA is taken as the output for training; the DOA of the vehicle is then estimated by RFR, and the prediction model can be adjusted through parameter optimization. This process is more intuitive, needs no complex parameter settings, and has good robustness, scalability, and flexibility. Moreover, RFR is especially suitable for a small number of snapshots
(3) In practical applications, multiple vehicles may be close to one another on the road, and multipath transmission often exists, leading to a large number of correlated or even coherent signals in the locating process. In such an environment, the performance of traditional DOA estimation algorithms declines sharply, or the algorithms even fail. The algorithm proposed in this paper is suitable for uncorrelated signals, as well as for scenes in which uncorrelated and coherent signals coexist, without wideband focusing

2. Signal Model
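The array signal model for IOV locating is shown in Figure 2. $K$ far-field wideband signals arrive at the $M$-element uniform linear array (ULA) on the infrastructure from directions $\theta_1, \theta_2, \ldots, \theta_K$. Owing to multipath propagation, the signals consist of two parts: $K_u$ and $K_c$ are the numbers of uncorrelated and coherent signals, respectively. The former are emitted by the Ku antennas on the vehicles, and the latter are composed of $D$ groups of coherent sources produced by reflections of the original signals $s_{c,1}(t), \ldots, s_{c,D}(t)$; the $d$-th group contains $P_d$ multipath reflections, so these numbers satisfy
$$K = K_u + K_c, \qquad K_c = \sum_{d=1}^{D} P_d.$$

The first sensor is taken as the reference, and the background noise is Gaussian white noise with zero mean and variance $\sigma_n^2$. The received data on the $m$-th sensor is then
$$x_m(t) = \sum_{k=1}^{K_u} s_k\bigl(t - \tau_m(\theta_k)\bigr) + \sum_{d=1}^{D}\sum_{p=1}^{P_d} \rho_{dp}\, s_{c,d}\bigl(t - \tau_m(\theta_{dp})\bigr) + n_m(t), \qquad m = 1, \ldots, M,$$
where $\tau_m(\theta_k)$ represents the time delay between the $m$-th array element and the reference for the $k$-th uncorrelated source, $\tau_m(\theta_{dp})$ is that between the $m$-th array element and the reference for the $p$-th coherent source in the $d$-th group, $\rho_{dp}$ is the attenuation coefficient of the corresponding signal, and $n_m(t)$ denotes the Gaussian white noise on the $m$-th sensor.
Because the wideband steering vector depends on both the DOA and the signal frequency, the model must be given in the frequency domain. During the observation time $T_0$, the signal is sampled $Z$ times at uniform intervals in the time domain, and a discrete Fourier transform is applied to each interval; the signal at frequency $f_q$ ($q = 1, \ldots, Q$) can then be modelled as
$$\mathbf{X}(f_q, z) = \mathbf{A}_u(f_q)\,\mathbf{S}_u(f_q, z) + \mathbf{A}_c(f_q)\,\boldsymbol{\Gamma}\,\mathbf{S}_c(f_q, z) + \mathbf{N}(f_q, z), \qquad z = 1, \ldots, Z,$$
where $\mathbf{X}(f_q, z) = [X_1(f_q, z), \ldots, X_M(f_q, z)]^T$ and $\mathbf{N}(f_q, z) = [N_1(f_q, z), \ldots, N_M(f_q, z)]^T$, and $X_m(f_q, z)$, $N_m(f_q, z)$, $S_k(f_q, z)$, and $S_{c,d}(f_q, z)$ are, respectively, the received data and the noise on the $m$-th sensor, the $k$-th uncorrelated source, and the $d$-th group of coherent sources at the $z$-th snapshot. Here the uncorrelated signal is
$$\mathbf{S}_u(f_q, z) = [S_1(f_q, z), \ldots, S_{K_u}(f_q, z)]^T,$$
the coherent signal is
$$\mathbf{S}_c(f_q, z) = [S_{c,1}(f_q, z), \ldots, S_{c,D}(f_q, z)]^T,$$
and the array manifold of the uncorrelated signals is
$$\mathbf{A}_u(f_q) = [\mathbf{a}(f_q, \theta_1), \ldots, \mathbf{a}(f_q, \theta_{K_u})],$$
where $\mathbf{a}(f_q, \theta_k)$ is the corresponding steering vector, whose $m$-th element is $e^{-\mathrm{j}2\pi f_q \tau_m(\theta_k)}$. The array manifold of the coherent signals is
$$\mathbf{A}_c(f_q) = [\mathbf{A}_{c,1}(f_q), \ldots, \mathbf{A}_{c,D}(f_q)],$$
where $\mathbf{A}_{c,d}(f_q) = [\mathbf{a}(f_q, \theta_{d1}), \ldots, \mathbf{a}(f_q, \theta_{dP_d})]$, and the attenuation coefficient matrix is
$$\boldsymbol{\Gamma} = \operatorname{blkdiag}(\boldsymbol{\rho}_1, \ldots, \boldsymbol{\rho}_D), \qquad \boldsymbol{\rho}_d = [\rho_{d1}, \ldots, \rho_{dP_d}]^T.$$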
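The dependence of the steering vector on both direction and frequency can be illustrated with a minimal Python sketch. It is not taken from the paper: the delay formula (angle measured from the array axis, so sensor m lags the reference by m · spacing · cos θ / c), the function names, and all numeric values are assumptions chosen for illustration only.

```python
import cmath
import math

C = 3.0e8  # propagation speed in m/s (assumed: free-space radio wave)


def steering_vector(freq_hz, theta_deg, num_sensors, spacing_m):
    """Steering vector of a uniform linear array at one frequency.

    Assumption: theta is measured from the array axis, so sensor m
    (m = 0 is the reference) sees a delay of m * spacing * cos(theta) / C.
    """
    theta = math.radians(theta_deg)
    delay_step = spacing_m * math.cos(theta) / C
    return [cmath.exp(-2j * math.pi * freq_hz * m * delay_step)
            for m in range(num_sensors)]


def array_manifold(freq_hz, thetas_deg, num_sensors, spacing_m):
    """Matrix whose column k is the steering vector of direction k."""
    cols = [steering_vector(freq_hz, t, num_sensors, spacing_m)
            for t in thetas_deg]
    return [[col[m] for col in cols] for m in range(num_sensors)]


def coherent_group_vector(freq_hz, thetas_deg, attenuations, num_sensors, spacing_m):
    """Effective steering vector of one coherent group.

    All paths in a group carry the same waveform, so their steering
    vectors add up, each weighted by its attenuation coefficient.
    """
    vecs = [steering_vector(freq_hz, t, num_sensors, spacing_m)
            for t in thetas_deg]
    return [sum(rho * v[m] for rho, v in zip(attenuations, vecs))
            for m in range(num_sensors)]


if __name__ == "__main__":
    # Hypothetical values: 8 sensors, 1.5 m spacing, five frequency bins
    # spread over a 40% relative bandwidth around 100 MHz.
    for f in (0.8e8, 0.9e8, 1.0e8, 1.1e8, 1.2e8):
        a = steering_vector(f, 20.0, 8, 1.5)
        print(f, round(cmath.phase(a[1]), 4))
```

The same direction produces a different phase progression at each frequency bin, which is why the features are later collected per frequency rather than from a single narrowband covariance matrix.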
3. Vehicle Locating Based on DOA Estimation
The general idea of vehicle locating based on DOA estimation is summed up as follows. First, the feature data samples related to DOA estimation are acquired. Then, the RFR model is built. After that, the DOA is predicted by the trained RFR. Finally, the vehicle is located according to the obtained DOA and the geometrical relation between the vehicle and the infrastructure.
3.1. Data Preprocessing and Feature Selection
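The covariance matrix in the frequency domain is estimated from the $Z$ snapshots $\mathbf{X}(f_q, z)$ received by the $M$-element array at frequency $f_q$ as
$$\mathbf{R}(f_q) = \frac{1}{Z}\sum_{z=1}^{Z}\mathbf{X}(f_q, z)\,\mathbf{X}^{H}(f_q, z).$$
Since RFR can only handle real-valued matrices, whereas the array receives complex data, the covariance matrix must be converted into a real matrix. Consequently, we introduce the following two matrices:
$$\mathbf{U}_{e} = \frac{1}{\sqrt{2}}\begin{bmatrix}\mathbf{I}_n & \mathrm{j}\mathbf{I}_n\\ \mathbf{J}_n & -\mathrm{j}\mathbf{J}_n\end{bmatrix}, \qquad \mathbf{U}_{o} = \frac{1}{\sqrt{2}}\begin{bmatrix}\mathbf{I}_n & \mathbf{0} & \mathrm{j}\mathbf{I}_n\\ \mathbf{0}^T & \sqrt{2} & \mathbf{0}^T\\ \mathbf{J}_n & \mathbf{0} & -\mathrm{j}\mathbf{J}_n\end{bmatrix},$$
where $\mathbf{I}_n$ is the $n \times n$ identity matrix, $\mathbf{0}$ is a zero vector, and $\mathbf{J}_n$ is the matrix whose elements on the back-diagonal equal one and whose other elements are all zero; $\mathbf{U}_e$ applies to an even number of sensors $M = 2n$ and $\mathbf{U}_o$ to an odd number $M = 2n + 1$. Both matrices are unitary; thus, $\mathbf{U}_e$ or $\mathbf{U}_o$ (denoted $\mathbf{U}$) can be used to change $\mathbf{R}(f_q)$ into a real matrix as
$$\mathbf{R}_r(f_q) = \operatorname{Re}\bigl\{\mathbf{U}^{H}\mathbf{R}(f_q)\,\mathbf{U}\bigr\}.$$
As $\mathbf{R}_r(f_q)$ is symmetric, we select its upper triangular elements at the different frequencies as input features, that is,
$$\mathbf{r}(f_q) = \bigl[R_{r,11}(f_q), R_{r,12}(f_q), \ldots, R_{r,1M}(f_q), R_{r,22}(f_q), \ldots, R_{r,MM}(f_q)\bigr].$$
The feature vector is then acquired by combining the vectors of the $Q$ frequencies:
$$\mathbf{r} = \bigl[\mathbf{r}(f_1), \mathbf{r}(f_2), \ldots, \mathbf{r}(f_Q)\bigr].$$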
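The feature extraction described in this subsection can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the sparse unitary matrix built from identity and exchange blocks is the form commonly used for real-valued array processing, the real part is taken explicitly after the transform, and all function names are hypothetical.

```python
import math


def sample_covariance(snapshots):
    """Sample covariance of one frequency bin.

    snapshots: list of Z snapshots, each a list of M complex values.
    """
    m = len(snapshots[0])
    z = len(snapshots)
    return [[sum(x[i] * x[j].conjugate() for x in snapshots) / z
             for j in range(m)] for i in range(m)]


def unitary_matrix(m):
    """Unitary matrix made of identity (I) and exchange (J) blocks."""
    n = m // 2
    s = 1.0 / math.sqrt(2.0)
    u = [[0j] * m for _ in range(m)]
    off = n + (m % 2)  # first column of the right-hand block
    for i in range(n):
        u[i][i] = s                      # I block
        u[i][off + i] = 1j * s           # jI block
        u[m - 1 - i][i] = s              # J block
        u[m - 1 - i][off + i] = -1j * s  # -jJ block
    if m % 2:
        u[n][n] = 1.0 + 0j               # centre element when m is odd
    return u


def real_features(cov):
    """Upper-triangular elements of Re(U^H R U), row by row."""
    m = len(cov)
    u = unitary_matrix(m)
    # t = R U
    t = [[sum(cov[i][k] * u[k][j] for k in range(m)) for j in range(m)]
         for i in range(m)]
    # r = U^H t
    r = [[sum(u[k][i].conjugate() * t[k][j] for k in range(m))
          for j in range(m)] for i in range(m)]
    return [r[i][j].real for i in range(m) for j in range(i, m)]


def feature_vector(cov_per_frequency):
    """Concatenate the features of all frequency bins into one input vector."""
    feats = []
    for cov in cov_per_frequency:
        feats.extend(real_features(cov))
    return feats
```

For an M-sensor array and Q frequency bins this yields Q · M(M + 1)/2 real features per training sample, for example 5 · 36 = 180 features for a hypothetical 8-sensor array with five bins.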
3.2. Random Forest Regression
As shown in Figure 3, random forest [51] is an ensemble learning method that uses Bootstrap aggregating (Bagging) to assemble multiple unrelated decision trees and obtains the final result by voting or averaging. In other words, random forest is a strong learner that combines classification trees, regression trees, and the Bagging algorithm for ensemble learning. The basic unit of a random forest is the decision tree, which can be used for both classification and regression. Drawing samples randomly with replacement and selecting features randomly when constructing each decision tree ensure randomness, so that the random forest is not prone to overfitting and has good generalization ability. Furthermore, the forest integrates the outputs of multiple decision trees.

The design flowchart of RFR is given in Figure 4. In the course of building each regression tree, RFR employs Bootstrap resampling to randomly extract part of the observations from the original data set with replacement, then selects a specified number of variables to determine the nodes of the tree.

Thus, RFR usually generates hundreds of trees, and the average of the outputs of all the trees is regarded as the final result. The process of establishing RFR is as follows:
(1) Bootstrap is used to repeatedly extract training samples from the original data set with replacement; then, the decision trees are established. The samples that are not extracted in a given draw are deemed testing samples and are called out-of-bag (OOB) data
(2) When constructing a decision tree, a subset of the independent variables is randomly selected as the candidate branch variables at the subnodes of each tree; the optimal branch is then determined according to the Gini coefficient [52]
(3) Each tree grows continuously from top to bottom by recursive splitting; a preset threshold is the termination condition of tree growth
(4) The established decision trees form the RFR model, the average of the outputs of all the trees is regarded as the final result, and the performance is evaluated by the estimation accuracy on the OOB data, that is, the mean square error on the testing set. Assuming that the number of OOB samples is $N$ and the number of signals is $K$, then
$$\mathrm{MSE} = \frac{1}{NK}\sum_{n=1}^{N}\sum_{k=1}^{K}\bigl(\theta_k^{(n)} - \hat{\theta}_k^{(n)}\bigr)^2, \qquad \sigma^2 = \frac{1}{NK}\sum_{n=1}^{N}\sum_{k=1}^{K}\bigl(\theta_k^{(n)} - \bar{\theta}_k\bigr)^2, \qquad R^2 = 1 - \frac{\mathrm{MSE}}{\sigma^2},$$
where $\theta_k^{(n)}$ is the actual value of the $k$-th signal in the $n$-th OOB sample, $\bar{\theta}_k$ represents the DOA average of the $k$-th signal over all OOB samples, $\hat{\theta}_k^{(n)}$ is the predicted value obtained by the RFR, $\sigma^2$ denotes the corresponding variance, and $R^2$ is the coefficient of determination from mathematical statistics. It indicates how well the predicted values fit the truth; the closer its value is to 1, the better the model.
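The four steps above can be sketched as a small bagged regression forest in plain Python. This is a toy illustration under stated assumptions, not the configuration used in the paper: the split criterion is the squared error usually applied to regression trees (rather than the Gini coefficient named in step 2), the target is a single scalar instead of several DOAs, and the function names and default parameters are hypothetical.

```python
import random


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def build_tree(X, y, depth, max_depth, n_try, rng):
    """Grow one regression tree; each node tries n_try random features."""
    if depth >= max_depth or len(set(y)) == 1:
        return sum(y) / len(y)  # leaf: mean of the targets
    best = None
    for f in rng.sample(range(len(X[0])), n_try):
        for thr in sorted(set(row[f] for row in X))[:-1]:
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, f, thr)
    if best is None:  # every candidate feature was constant
        return sum(y) / len(y)
    _, f, thr = best
    li = [i for i, row in enumerate(X) if row[f] <= thr]
    ri = [i for i, row in enumerate(X) if row[f] > thr]
    return (f, thr,
            build_tree([X[i] for i in li], [y[i] for i in li],
                       depth + 1, max_depth, n_try, rng),
            build_tree([X[i] for i in ri], [y[i] for i in ri],
                       depth + 1, max_depth, n_try, rng))


def predict_tree(node, row):
    """Follow the splits until a leaf value is reached."""
    while isinstance(node, tuple):
        f, thr, left, right = node
        node = left if row[f] <= thr else right
    return node


def fit_forest(X, y, n_trees=50, max_depth=6, n_try=1, seed=0):
    """Return (trees, OOB mean square error, OOB coefficient of determination)."""
    rng = random.Random(seed)
    n = len(X)
    trees, oob_sum, oob_cnt = [], [0.0] * n, [0] * n
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap with replacement
        tree = build_tree([X[i] for i in idx], [y[i] for i in idx],
                          0, max_depth, n_try, rng)
        trees.append(tree)
        for i in set(range(n)) - set(idx):  # out-of-bag samples of this tree
            oob_sum[i] += predict_tree(tree, X[i])
            oob_cnt[i] += 1
    used = [i for i in range(n) if oob_cnt[i] > 0]
    mse = sum((y[i] - oob_sum[i] / oob_cnt[i]) ** 2 for i in used) / len(used)
    mean = sum(y[i] for i in used) / len(used)
    var = sum((y[i] - mean) ** 2 for i in used) / len(used)
    return trees, mse, 1.0 - mse / var


def predict_forest(trees, row):
    """Average the outputs of all trees."""
    return sum(predict_tree(t, row) for t in trees) / len(trees)
```

Each sample is scored only by the trees that did not see it during training, so the OOB error acts as a built-in validation estimate without a separate hold-out set.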
3.3. IOV Locating
The relative position between the two infrastructures and the target vehicle is shown in Figure 5. For simplicity, both infrastructures are deployed on the same coordinate axis: the first is set at the origin, and the other is located at a known distance from it along that axis. The array structure of each infrastructure is the same and is given in Figure 2. The two infrastructures each receive the radio signal generated by the vehicle; the cloud platform then calls the proposed RFR-based algorithm, so that the DOA of the vehicle at each infrastructure can be calculated. In addition, an equation relating the two DOAs to the vehicle position can be set up according to their geometrical relationship.

Thus, the position of the vehicle is obtained. Obviously, it is also appropriate for evaluating positions of multiple vehicles.
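One way to turn two DOAs into a position is to intersect the two bearing lines, as in the following minimal Python sketch. The geometry here is an assumption made for illustration: both infrastructures lie on the x-axis at (0, 0) and (baseline_m, 0), and each angle is measured counter-clockwise from the positive x-axis at its own infrastructure. The function name and the convention are hypothetical and may differ from those in Figure 5.

```python
import math


def locate_vehicle(theta1_deg, theta2_deg, baseline_m):
    """Intersect two bearing lines measured at (0, 0) and (baseline_m, 0).

    Assumption: each angle is measured counter-clockwise from the positive
    x-axis at its own infrastructure, and the vehicle lies off the axis.
    """
    t1 = math.radians(theta1_deg)
    t2 = math.radians(theta2_deg)
    denom = math.sin(t2 - t1)
    if abs(denom) < 1e-12:
        raise ValueError("bearing lines are parallel; position is undefined")
    # Derived from tan(t1) = y / x and tan(t2) = y / (x - baseline_m).
    x = baseline_m * math.cos(t1) * math.sin(t2) / denom
    y = baseline_m * math.sin(t1) * math.sin(t2) / denom
    return x, y


if __name__ == "__main__":
    # Hypothetical vehicle at (3, 4) with the infrastructures 10 m apart.
    th1 = math.degrees(math.atan2(4.0, 3.0))
    th2 = math.degrees(math.atan2(4.0, 3.0 - 10.0))
    print(locate_vehicle(th1, th2, 10.0))
```

The sine form avoids evaluating a tangent at 90°, and the position error grows as the two bearings become nearly parallel, that is, when the vehicle is far away relative to the baseline.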
3.4. Computation
Next, wideband DOA estimation based on linear interpolation sparse Bayesian learning (LI-SBL) [53] and on grid interpolation sparse Bayesian learning (GI-SBL) [54] is compared with the proposed RFR; for simplicity, only the main steps are counted. LI-SBL, GI-SBL, and RFR all calculate the covariance matrix at each frequency. LI-SBL requires focusing and grid division, then searches for spectrum peaks by iteration and estimates the quantization errors, so its complexity is governed by the number of iterations and the number of grid divisions. GI-SBL needs signal power preestimation and focusing; a coarse search then determines the range of the off-grid value, after which the DOA is evaluated by a search with a small step size, so its computation is governed by the number of iterations in the signal power preestimation and the number of coarse searches. By comparison, RFR is modelled by multiple decision trees with randomly selected features, whose outputs are averaged to obtain the final result, so its complexity is governed by the number of samples, the number of features, and the depth of the decision trees.
4. Simulations
In the simulations, number of the array sensors , dB, the centre frequency of the sources , the relative bandwidth is 40%, and five frequency points are selected. Four wideband signals are assumed to be incident on this array: the first two are uncorrelated, and the other two are coherent; the number of sources is assumed to be known. Adjacent signals are separated by a fixed angle difference, and the direction of the first signal varies from 0° to 165° with a step size of 1°. In each scenario, 100 snapshots are sampled; 80% of them are randomly selected as the training set, and the rest are used for prediction. Both the CART decision tree [52] and RFR use the same data set for model training, number of trees , and the maximum depth of the decision tree is 11.
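The training scenarios and the 80/20 split described above can be enumerated with a short Python sketch. The 5° spacing between adjacent signals is an assumption taken from the first example (10°, 15°, 20°, 25°), the 180° upper limit is inferred from the 0° to 165° range of the first direction, and the function names and the random seed are hypothetical.

```python
import random


def doa_training_grid(num_signals=4, spacing_deg=5, max_deg=180, step_deg=1):
    """All DOA label tuples whose last direction does not exceed max_deg."""
    last_start = max_deg - (num_signals - 1) * spacing_deg
    return [tuple(start + k * spacing_deg for k in range(num_signals))
            for start in range(0, last_start + 1, step_deg)]


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Randomly split sample indices into training and prediction sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_fraction * n_samples))
    return idx[:cut], idx[cut:]


if __name__ == "__main__":
    grid = doa_training_grid()
    train, test = split_indices(100)
    print(len(grid), grid[0], grid[-1], len(train), len(test))
```

Under these assumptions the grid contains 166 scenarios, from (0°, 5°, 10°, 15°) to (165°, 170°, 175°, 180°), and each scenario contributes 80 training and 20 prediction samples.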
In the first example, signals arrive from (10°, 15°, 20°, 25°), where 10° and 15° are coherent signals and the other two are uncorrelated. The RFR and CART decision tree algorithms are used for DOA estimation over 200 Monte-Carlo experiments; the results are shown in Table 1.
Table 1 shows that both algorithms can estimate these DOAs; although RFR takes longer than CART, its precision is significantly higher. For IOV locating, improving precision is more important, and the efficiency of RFR is acceptable.
The second example examines the DOA estimation precision for uncorrelated sources versus SNR and the number of snapshots. In the first scenario, the signals from (10°, 15°, 20°, 25°) are independent of one another; the estimation error versus SNR for a fixed number of snapshots is given in Figure 6, while that versus the number of snapshots when the SNR is 10 dB is shown in Figure 7. RFR makes full use of the information at every frequency without further transformation, so it is more accurate than LI-SBL and GI-SBL. By contrast, both of the other algorithms must handle wideband signals through DOA preestimation and focusing, which introduces some error, especially with a small number of snapshots. GI-SBL estimates the hyperparameters twice to enhance its sparse recovery ability, so it is more precise than LI-SBL.


Then, in the second scenario, 10° and 15° are assumed to be coherent and the other two independent; Figures 8 and 9 present the corresponding results. Compared with the first scenario, the performance of all three algorithms is not significantly reduced. This demonstrates that the focusing procedures of LI-SBL and GI-SBL greatly weaken the coherence between the signals; meanwhile, the correlation has little effect on the proposed algorithm.


In the final example, four closely spaced wideband signals are considered, and the angle interval between adjacent signals is varied from 1° to 10°; the data of the four signals at each interval are trained separately. The SNR and the number of snapshots are held fixed, and 200 Monte-Carlo experiments are performed under each condition. Figures 10 and 11 show the estimation performance versus DOA interval when the signals are independent and mixed, respectively. Because LI-SBL and GI-SBL both introduce small errors through DOA preestimation and wideband focusing, RFR has a better DOA resolution on the whole. As the DOA interval increases, the errors of all three algorithms are reduced, but because of the small number of snapshots they cannot estimate the DOA precisely, and the performances in the two scenarios remain nearly the same. Consequently, RFR also has a better performance for IOV locating.


5. Conclusions
The emergence of supervised learning provides a new idea for direction finding; therefore, this paper proposes a new locating method based on DOA estimation. The upper triangular elements of the covariance matrices at different frequencies are taken as input features, while the directions are taken as outputs; the signal DOA is then estimated through the RFR model, and the vehicles are located according to the geometrical relationship between the vehicle and the infrastructure. The proposed algorithm remains effective with a small number of snapshots, and it is suitable for scenes in which coherent and uncorrelated signals exist at the same time, without wideband focusing.
Data Availability
All data generated or analysed during this study are included in this published article.
Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Acknowledgments
This work was supported by the basic scientific research projects of Heilongjiang Provincial University (2020-KYYWF-1005).