Abstract

Trajectory tracking control based on waypoint behavior is a promising way for an unmanned surface vehicle (USV) to achieve autonomous navigation. This study addresses the guidance process in the kinematics domain and adopts deep learning to improve the trajectory tracking performance of a USV. First, two deep neural network (DNN) models are constructed to evaluate navigation effects and to estimate guidance law parameters in real time, respectively. We then pretrain the DNNs using a Gaussian–Bernoulli restricted Boltzmann machine to further improve the accuracy of predicting the navigation effect. Finally, the two DNNs are connected in parallel with the control loop of the USV to provide predictive supervision and auxiliary decision making for traditional control methods. This parallel arrangement conforms to customary ship-handling practice. Furthermore, we develop a new application named “pDeepLearning” on the basis of Mission Oriented Operating Suite Interval Programming (MOOS-IvP). It predicts the navigation effect online with the DNN and adjusts the guidance law parameters according to the effect level. The experimental results show that, compared with the original waypoint behavior of the USV, the prediction model proposed in this study reduces the trajectory tracking error by 19.0% and raises the waypoint behavior effect level.

1. Introduction

Unmanned surface vehicles (USVs) are mainly preferred for missions that are dull, dangerous, or ill-suited to manned ships. In the future, they will be developed for ocean mapping, hydrographic and meteorological monitoring, maritime search and rescue, etc. Autonomous control is the core technology of USV navigation. It belongs to motion control technology, which includes set-point regulation control [1], path following control [2], and trajectory tracking control.

Trajectory tracking is defined as controlling the actual track of a vehicle so that it follows a reference position in Cartesian coordinates as a function of time [3]. The optimal time-varying path for trajectory tracking is derived from the dynamic model of the vehicle and a predefined target [4]. The main difficulty is the impact of the uncertain sea environment. Trajectory tracking is a fundamental capability for a USV to perform missions such as automatic collision avoidance and cooperative formation. Therefore, we choose trajectory tracking control as the research topic.

This study follows the framework of the Guidance-Navigation-Control (GNC) system to solve the trajectory tracking problem of USV. The GNC system framework is a two-stage process consisting of guidance and control [4]. The guidance process refers to the transformation of a vehicle position to its heading and speed in the kinematics domain by means of guidance law. The control process refers to the transformation of heading to rudder angle and speed to throttle in the kinetics domain by means of control law. The essence of trajectory tracking control is to minimize the position error between the actual track and the reference track. Therefore, the performance of trajectory tracking is mostly dependent on the guidance process [5].

The object of this study is the guidance process of trajectory tracking control for USV. The modeling approach and experimental platform are built on Mission Oriented Operating Suite Interval Programming (MOOS-IvP) [6]. MOOS-IvP is a research platform for autonomous maritime vehicles, open-sourced by MIT. It conforms to the framework of the GNC system and separates the guidance and control processes. The waypoint behavior in MOOS-IvP, which is a Line-of-Sight (LOS) guidance method, models the process of trajectory tracking. The feedback effect generated by the control law, the vehicle, and its interaction with the environment is regarded as a whole. Trajectory tracking control of USV is accomplished by adjusting the LOS guidance parameters. MOOS-IvP can be deployed on real ships, so our work can also be applied to real-world scenarios.

On the other hand, the rapid evolution of deep learning has delivered new insights and approaches for the study of USV autonomous navigation. A deep neural network (DNN) recognizes and extracts the relationships between combined features, which makes it suitable for the uncertain sea environment and the complex vehicle motion control process [7]. In this paper, a dual-DNN model is established and trained by back propagation on navigation data samples collected in a simulated environment under different parameters such as speed and steering angle. The trained DNN model is capable of predicting the tracking effect and estimating better guidance law parameters, so as to improve the trajectory tracking control process of USV.

During the last decade, a large number of methods for trajectory tracking control have been developed, but in practice there are still many difficulties. Velagic et al. proposed an adaptive fuzzy controller [8] and built a ship dynamics model, a steering equipment model, and a wind-flow disturbance model. All models were simplified by reducing dimensions, adding constraints, and removing higher order interference terms. Moreover, the model parameters were adjusted according to 49 predefined fuzzy rules. If the ship encountered conditions that did not appear in the rules, it would be difficult to control it precisely. Aguiar et al. presented a nonlinear control algorithm according to Lyapunov theory and proved the global convergence of the model without constraints [9]. Lv et al. proposed a hybrid cooperative signal energy control law, which dealt with the speed and course control of a ship. However, they did not consider the uncertainty of resistance and disturbance [10]. Xia et al. developed a dynamic model and an adaptive controller to promote tracking performance and convergence speed, but they did not consider the parametric perturbations caused by power devices [11]. Huang et al. decomposed the trajectory tracking problem into guidance law and control law loops, which are easier to solve than direct control methods, but they did not take into account external disturbances from the marine environment [12]. The traditional methods adopted in these studies aimed to build complex models with many parameters, and most of them did not consider the impact of the real environment and equipment.

In recent years, many studies have used artificial neural network (ANN) in the control issues of USV, especially in trajectory tracking control problems. ANN and data-driven machine learning method have shown success in the analysis of uncertain and environmentally sensitive ship motion control problems [13, 14]. Cheng et al. used ANN as a replacement model for the traditional model to solve the problem of ship berthing [15]. Shuai et al. constructed two ANNs to extract features for controlling the ship propeller and rudder, respectively, to achieve automatic berthing under different environmental disturbances [16]. Zhang et al. proposed an adaptive robust ANN method for modeling uncertain ship dynamics and external influences and achieved good results in automatic ship berthing [17]. The ANN proposed by Wang et al. was used to control the course of USV and solve the problem of uncertainty in ship motion [18]. In the above literature works, the influence of sea environment has been considered, and the ANN method has been improved over traditional methods. However, since the ANN is shallow, it is difficult to learn the characteristic relationships of the parameters, so they selected a simple scenario of berthing or simplified the problem.

In recent years, DNN has demonstrated unprecedented ability in the field of traffic and control [19]. Kim et al. exploited DNN-based feedback controllers to compensate for the disturbance of curved road and reduced tracking error in lane keeping [20]. Xu et al. used DNN to learn the complex manipulation characteristics of USV based on the visual system [21]. Chen et al. proposed a DNN-based data-driven control method that greatly improved the capability and accuracy of control systems [22]. Deep belief network was proposed by Tan et al. to address the navigational safety of unmanned aerial vehicles [23]. It can be seen that DNN has shown stronger feature learning ability than shallow ANN [24]. However, it is not widely used in the trajectory tracking control problem of USV.

DNN training suffers from the gradient dispersion (vanishing gradient) problem, which is generally addressed by the “pretraining and fine-tuning” method. Hinton et al. proposed pretraining a restricted Boltzmann machine (RBM) in each layer, followed by fine-tuning the DNN [25]. Goudarzi et al. proposed two stacked RBMs for predicting short-term traffic flow by using the “pretraining and fine-tuning” method [26]. Zhao et al. developed a deep belief network consisting of several stacked RBMs to reduce the risk of vehicle collision in snow and ice conditions on highways [27]. Pretraining DNNs with RBMs is thus a common method [25]. However, the RBM was originally developed for binary vector modeling, where both visible and hidden layer variables are binary. In this study, the visible layer variables are continuous-valued, so a binary RBM is not adequate. Yamashita [28] gave a method for continuous-valued vectors, i.e., the Gaussian–Bernoulli restricted Boltzmann machine (GB-RBM), so we use GB-RBM to pretrain the DNN, which improves the performance and makes the method applicable to our research.

In the implementation of intelligent navigation systems for USV, MOOS-IvP has been very popular in academic and industrial research fields nowadays, and it provides good support for autonomous navigation [29]. Firstly, MOOS-IvP provides a simulated experimental environment for USV. For example, Dong et al. used the MOOS-IvP platform to experiment with different test items for distributed remote control of USVs [30]. Secondly, MOOS-IvP provides integrated interfaces for software development and algorithm implementation. For example, the instruction filter control module developed by Djapic et al. was integrated into MOOS-IvP [31]. In addition, a set of algorithms for generating waypoints developed by Benjamin et al. have been used for path planning in MOOS-IvP [32]. This study is inspired by them. On the one hand, MOOS-IvP is used to acquire data and perform experiments. On the other hand, the developed DNN model is integrated into MOOS-IvP. From the view of artificial intelligence computing, this is also an upgrade of MOOS-IvP platform.

In this study, a deep learning methodology is utilized to predict the parameters of the waypoint behavior in MOOS-IvP, and the DNN prediction model is implemented in parallel with the control loop to achieve the trajectory tracking of USV. The whole project is implemented in two stages: In the first stage, a classification model based on DNN was constructed to provide assistance and reference for maneuvering decisions of USV [7]. In the second stage, we regard the feedback effect generated by the control law, vehicle, and its interaction with the environment as a whole and connect DNN in parallel with the control loop of USV, so as to predict LOS guidance law parameters in real time during voyage.

To predict the LOS guidance law parameters accurately in the second stage, we add a pretraining process using GB-RBM, which improves the classification accuracy of the model to 89.9%, an increase of 5 percentage points over the previous stage.

On the basis of the above works, the “waypoint behavior effect evaluation model” and the “real-time LOS parameter valuation model” are constructed based on DNN, denoted as DNN-1 and DNN-2, respectively. They are connected in parallel with the trajectory tracking control loop of USV. During the voyage, DNN-1 first predicts the navigation effect, and DNN-2 then provides the LOS parameters when the predicted effect is not good. The new LOS parameters are configured to adjust the waypoint behavior of USV. In this way, intelligent computing is introduced into the control loop while the reliability of the traditional control process is maintained as far as possible. This arrangement also respects the traditional ship-handling habit of steering infrequently. In addition, we develop a new MOOS-IvP application to perform the computing of the DNN model and establish an interface between the trained DNN and the MOOS-IvP platform.

The contribution herein mainly includes the following three aspects:
(1) With the deep learning training method of “pretraining and fine-tuning” and the GB-RBM model, the prediction accuracy of the classification model is improved. GB-RBM can fit the numerical data and helps prevent the model from falling into a local optimum.
(2) A new prediction-based trajectory tracking control model is constructed. The model consists of two DNNs, i.e., “DNN-1: waypoint behavior effect evaluation model” and “DNN-2: real-time LOS parameter valuation model.” We connect the DNNs in parallel with the control loop of USV, which improves the trajectory tracking effect.
(3) Intelligent trajectory tracking of USV is achieved by dynamically connecting the DNN model to a new application developed in MOOS-IvP. MOOS-IvP can be plugged into a real vehicle, so the application developed in this study can be employed in real maritime scenarios as well.

The rest of the paper is organized as follows: Section 2 starts with a description of the general process of trajectory tracking control based on waypoint behavior. Then, the method of connecting DNN in parallel with the control loop of USV is given. Finally, the implementation in the MOOS-IvP system architecture is illustrated and the waypoint behavior dataset is described. Section 3 first presents the overall implementation framework for training and accessing the DNN model into the control loop. Then the principle, construction, and training of the DNN model are given, and the implementation method and process of how to access the trained model into the control loop of USV are explained. In Section 4, the experimental simulation results are presented and analyzed in respect of model training, optimization effect of GB-RBM, and trajectory tracking control effect. Section 5 summarizes the paper.

2. Problem Formulation and System Architecture

2.1. General Process of Trajectory Tracking

The general process for a USV to perform trajectory tracking is to generate a set of waypoints based on the mission of the voyage. Then, the USV moves towards the waypoints in sequence and sails along the planned route. The process can be divided into 5 phases, as shown in Figure 1:
(1) Output speed and steering orders by the guidance algorithm to guide the USV towards the next waypoint
(2) Transform speed and steering orders to throttle and rudder actions
(3) Under wind, waves, and current conditions, the USV navigates in the sea
(4) Send the feedback of USV speed and heading to the control module, which outputs new throttle and rudder angle actions
(5) Send the feedback of the actual position of the USV to the guidance module, which outputs new speed and steering orders

According to the general process above, we consider the feedback effects produced by phases (2)-(4) as a whole and model the trajectory tracking control problem of USV as a waypoint behavior based on guidance algorithm.

This study is an optimization of the lookahead-based guidance algorithm by deep artificial neural networks. As shown in Figure 2(a), the lookahead-based guidance algorithm can be formulated as a geometric relationship between the vehicle, the previous waypoint, and the next waypoint [33].

First, the angle of the track line formed by the two waypoints is computed. Then, the lead distance and the lead damper are derived from this angle and the position of the vehicle. Two distances enter this geometry: the distance between the LOS point and the foot of the perpendicular from the vehicle to the planned route, which is generally taken as 1.5–2.5 times the length of the vehicle, and the distance from the vehicle to the foot of the perpendicular. Trajectory tracking control drives the track error toward zero by regulating the lead distance and the lead damper.
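The geometry described above can be illustrated with a minimal sketch. It assumes a flat x-y plane, angles measured counter-clockwise from the x-axis, and a LOS point placed one lead distance ahead of the foot of the perpendicular on the track line. The function name `los_heading` and the sign convention of the cross-track error are illustrative choices, not taken from the MOOS-IvP implementation.

```python
import math


def los_heading(prev_wpt, next_wpt, vehicle, lead_distance):
    """Return (track_angle, cross_track_error, desired_heading).

    Assumed convention: angles in radians, counter-clockwise from the
    x-axis (a mathematical convention, not a compass bearing).
    """
    (x0, y0), (x1, y1), (x, y) = prev_wpt, next_wpt, vehicle
    # Angle of the track line through the previous and next waypoints.
    track_angle = math.atan2(y1 - y0, x1 - x0)
    ux, uy = math.cos(track_angle), math.sin(track_angle)
    # Along-track position of the foot of the perpendicular from the vehicle.
    along = (x - x0) * ux + (y - y0) * uy
    # Signed cross-track error: positive when the vehicle is left of the track.
    cross = -(x - x0) * uy + (y - y0) * ux
    # LOS point: lead_distance ahead of the foot, on the track line.
    los_x = x0 + (along + lead_distance) * ux
    los_y = y0 + (along + lead_distance) * uy
    desired_heading = math.atan2(los_y - y, los_x - x)
    return track_angle, cross, desired_heading
```

A shorter lead distance turns the vehicle more sharply back toward the route, whereas a longer one yields a gentler approach. This sketch does not clamp the LOS point at the next waypoint and does not model the lead damper.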

In addition, there are two circles associated with trajectory tracking during the planned route involving several waypoints. The inner circle is called the capture circle, which signifies the arrival of the vehicle to a waypoint while sailing in still water. The outer circle, known as the slip circle, marks the arrival when it is affected by wind, waves, and current, as illustrated in Figure 2(b).
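The two-circle arrival test can be sketched as follows. The rule that the slip circle only counts once the distance to the waypoint starts to grow is an assumption based on how the MOOS-IvP waypoint behavior is commonly described. It is not stated in this text, and the function name `has_arrived` is hypothetical.

```python
import math


def has_arrived(position, waypoint, prev_distance, capture_radius, slip_radius):
    """Return (arrived, distance) for one navigation update.

    prev_distance is the distance to the waypoint at the previous update,
    or None on the first update.
    """
    distance = math.hypot(waypoint[0] - position[0], waypoint[1] - position[1])
    # Inside the inner (capture) circle: the waypoint is reached.
    if distance <= capture_radius:
        return True, distance
    # Assumed rule: inside the outer (slip) circle and no longer closing in,
    # e.g. because wind, waves, or current push the vehicle past the waypoint.
    moving_away = prev_distance is not None and distance > prev_distance
    if distance <= slip_radius and moving_away:
        return True, distance
    return False, distance
```

The caller keeps the returned distance and passes it back as `prev_distance` on the next update.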

2.2. DNN-Based Parallel Process of Trajectory Tracking

The DNN model proposed in this paper runs in parallel with the general control process described above, as shown in Figure 3. We send the data from the navigation system of the vehicle to the LOS guidance module and the DNN prediction model simultaneously and send the steering angle of the waypoint to the prediction model in advance. The prediction model consists of two submodels. DNN-1 first predicts the navigation effect. If the predicted effect is good, no parameters are adjusted. Otherwise, DNN-2 is used to predict the relevant parameters of the LOS algorithm, and the lead distance and the lead damper are adjusted to indirectly control the ship navigation and achieve a better waypoint behavior effect.

This dual-DNN predictive model of trajectory tracking control, which uses a parallel access approach, differs from previous intelligent control models in that it is not directly connected to the control loop. This has two advantages:
(1) The predictive model is based on deep learning and acts like a ship officer who supervises but does not directly change the guidance law and control law of the vehicle. If the original algorithm performs well, the model does not affect the ship navigation. Only when the navigation effect is predicted to be bad are the guidance law parameters adjusted to improve the navigation effect of the vehicle in the kinematics domain.
(2) After long-term application and verification, the traditional control model is relatively reliable in engineering. The new model runs in parallel with the traditional model, which greatly enhances its practical value. In particular, when the navigation effect is good, DNN-2 is not involved in the control. This also prevents frequent rudder manipulation, which is more in line with the regular mode of ship maneuvering.
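The supervisory logic of the parallel model can be summarized in a short sketch. The two predictors are passed in as plain callables standing in for DNN-1 and DNN-2. The function name `supervise`, the level threshold, and the convention that a lower level means a better effect are assumptions made for illustration.

```python
from typing import Callable, Dict, Optional, Sequence


def supervise(
    level_features: Sequence[float],
    param_features: Sequence[float],
    predict_level: Callable[[Sequence[float]], int],
    predict_params: Callable[[Sequence[float]], Dict[str, float]],
    worst_acceptable_level: int,
) -> Optional[Dict[str, float]]:
    """Return new LOS parameters, or None to leave the guidance law untouched."""
    # DNN-1: predict the effect level of the current waypoint behavior.
    level = predict_level(level_features)
    # Assumed convention: a lower level number means a better effect.
    if level <= worst_acceptable_level:
        return None  # traditional guidance keeps running unchanged
    # DNN-2: only consulted when the predicted effect is not good.
    return predict_params(param_features)
```

Returning `None` in the good case is what keeps the traditional loop unchanged and avoids unnecessary parameter updates.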

2.3. System Architecture Based on MOOS-IvP

The autonomous navigation system of USV in this study adopts the system architecture of MOOS-IvP. MOOS-IvP was initially used on the Bluefin Odyssey III vehicle of MIT. The main motivation is to build high-performance autonomous systems [29]. Mission Oriented Operating Suite (MOOS) is a set of software components that provide a framework for the coordinated operation of multiple individual processes. Interval Programming (IvP) is a solution to the problems of multiobjective optimization, which is used for organizing various behaviors to achieve autonomous navigation of USV.

The system architecture of MOOS incorporates publish-subscribe middleware. Each MOOS application (MOOS app) exchanges information by connecting to a MOOS database (MOOSDB), and together they form a star topology. As shown in Figure 4, the general process for trajectory tracking control of USV is implemented by a set of MOOS apps. The “pHelmIvP” app is a guidance module, the “pMarinePID” app is a control module, the “pNodeReport” app is a navigation module, and the “uSimMarine” app is a simulation module of wind, waves, currents, and hull effects. All of them are connected to MOOSDB, constituting the autonomous navigation simulation system of USV. The DNN-based parallel control process is realized by adding a MOOS app called “pDeepLearning.” “pDeepLearning” is a new MOOS app developed in this study, which subscribes to the vehicle speed and course and publishes the predicted values given by the DNN model.

The guidance algorithm of USV is implemented by the instance of waypoint behavior contained in “pHelmIvP.” The configuration parameters of behavior instance correspond to the variables of guidance algorithm. These parameters are stored in a “.bhv” file and invoked during the initialization of the mission. Once the mission is launched, the “.bhv” file can no longer be modified.

In this study, it is necessary to dynamically configure USV waypoint behavior with the predictions given by DNN model. This allows the USV to adjust the guidance parameters according to the navigation state. We use the “updates” parameters to publish the variables to the MOOSDB. To do this, it is necessary to configure a WPT_UPDATE variable in the “.bhv” file, as shown in Table 1.

In the process of voyage, the WPT_UPDATE variable is used to publish specific content for changing parameters in the configuration file. For example, if WPT_UPDATE = “lead_distance = 6.90,” it would immediately change the lead distance value of the USV from 8.0 to 6.9 in the original configuration file.
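As an illustration, the content published to WPT_UPDATE can be assembled from predicted values as sketched below. The helper name `format_wpt_update` is hypothetical, and joining several parameter=value pairs with `#` is an assumption about the update syntax that should be checked against the MOOS-IvP documentation.

```python
def format_wpt_update(lead_distance=None, lead_damper=None):
    """Build the string posted to WPT_UPDATE; parameters left as None are omitted."""
    pairs = []
    if lead_distance is not None:
        pairs.append(f"lead_distance={lead_distance:.2f}")
    if lead_damper is not None:
        pairs.append(f"lead_damper={lead_damper:.2f}")
    # Assumption: multiple parameter=value pairs are separated by '#'.
    return " # ".join(pairs)
```

Calling `format_wpt_update(lead_distance=6.9)` yields the single pair used in the example above, without the spaces around the equals sign.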

2.4. Dataset and Statistical Analysis

In previous work, we built the waypoint behavior dataset [7]. The training samples comprise six features: speed, steering angle, lead distance, lead damper, capture radius, and slip radius. These features fall into 3 categories, related to the definition of the waypoint, the guidance algorithm, and the navigation process, respectively, as shown in Table 2. The training labels are the effect levels of the USV, belonging to levels I∼III; the corresponding multiples of the vehicle length are shown in Figure 5.

In this study, a preliminary statistical analysis of the dataset is conducted. From the statistical analysis in Figure 6, the categorical items are evenly distributed, and no special statistical patterns can be seen. However, these parameters do affect the effectiveness of USV. It is necessary to mine them with deep learning methods.

3. Methods

3.1. Dual-DNN Prediction Model

A predictive DNN model for trajectory tracking control is established. It consists of two submodels. The first submodel is DNN-1, which is used to predict the influence of waypoint behavior parameters on the drift effect. It is a “6-input-5-output” classification model, using waypoint behavior dataset for training. The second submodel is DNN-2, which is used to estimate the values of two guidance parameters: lead distance and lead damper. It is a “4-input-2-output” regression model, using the part of waypoint behavior dataset with better navigation performance for training.

The DNN model is designed to identify, in a high-dimensional feature space, the complex effects of the various factors in the waypoint behavior of USV and of their combinations. The nonlinear effects caused by the power unit installed in the vehicle and by wind and waves acting on the vehicle are already reflected in the acquired waypoint behavior data.

3.1.1. DNN-1: Waypoint Behavior Effect Evaluation Model

DNN-1 is a feedforward network with the structure N6-6-7-7-8-7-6-5, as shown in Figure 7. The input layer corresponds to the six features of waypoint behavior. The activation function in the hidden layers is the ReLU function. The output layer corresponds to the five levels of effect. The loss function is cross-entropy.
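To make the structure concrete, the forward pass of such a classifier can be sketched in NumPy. This is an illustration of the layer layout only, not the model used in the study. The random initialization, the helper names, and the use of integer class labels in the loss are all assumptions.

```python
import numpy as np

LAYER_SIZES = [6, 6, 7, 7, 8, 7, 6, 5]  # N6-6-7-7-8-7-6-5


def init_params(layer_sizes, seed=0):
    """Small random weights and zero biases for each fully connected layer."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward(params, x):
    """ReLU on the hidden layers, softmax on the output layer."""
    a = np.asarray(x, dtype=float)
    for w, b in params[:-1]:
        a = np.maximum(0.0, a @ w + b)
    w, b = params[-1]
    logits = a @ w + b
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=-1, keepdims=True)


def cross_entropy(probs, labels):
    """Mean cross-entropy for integer class labels (0-based)."""
    probs = np.atleast_2d(probs)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked + 1e-12).mean())
```

Each row of the output is a probability distribution over the five effect levels, and the predicted level is the index of the largest entry.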

3.1.2. DNN-2: Real-Time LOS Parameter Valuation Model

DNN-2 is a four-layer feedforward network N4-4-3-2, as shown in Figure 8. The hidden layer also uses the ReLU function, and the output layer uses softmax function, so that the model can predict lead distance and lead damper value. As a regression model, the “mean square” is used as a loss function to determine the deviation between the predicted value and the desired value.

3.1.3. Model Training and Saving

The two models are built and trained separately in Keras, and we train the DNNs using a gradient descent optimization algorithm with learning rate of 0.001 and batch size of 100. All experiments are performed on a 2.5 GHz Xeon 4215 CPU and two NVIDIA TITAN RTX GPUs.

The two submodels are trained by using the waypoint behavior dataset and its subsets, respectively. There are 16200 samples in the dataset. We perform 12000 iterations to train the classification model. The training set, validation set, and test set are randomly picked as 9720, 3240, and 3240 samples, respectively. After training, the DNN model with the highest accuracy on the validation set is stored.

For the regression model, 8000 training iterations are performed. The dataset is randomly divided into three parts: 3927 samples are used as the training set, 1309 samples form the validation set, and the remainder form the test set. The DNN model with the lowest mean square error is saved.
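A random three-way partition of this kind can be sketched as follows. The 60/20/20 proportions are an assumption. They are consistent with the 9720 training samples out of 16200 and with the 3:1 ratio between the 3927 training and 1309 validation samples of the regression model, but the text does not state them. The function name and the seed are illustrative.

```python
import random


def split_indices(n_samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle sample indices and split them into train / validation / test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains
    return train, val, test
```

Fixing the seed keeps the partition reproducible, so that the model selected on the validation set can be compared fairly across training runs.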

In respect of the number of training parameters, the two submodels have 330 and 63 parameters to be trained, respectively. From the training results, the network size is well suited to the extraction of the waypoint behavior features, and the generalization capability satisfies practical applications.

3.2. Optimization by Using GB-RBM

The trained fully connected network model achieves an accuracy of 84.9% in classifying the waypoint behavior effect, which is the result of the previous stage [7]. We find that the initialization parameters of the first layer of the DNN greatly affect the accuracy, because a DNN is prone to falling into local optima. Therefore, we further optimize the DNN by tuning the initialization parameters.

3.2.1. The Training Method of DNN with GB-RBM

The method of “pretraining and fine-tuning” can effectively ease the difficulty of training neural networks [25]. It is considered both a training method for deep learning and a method for tuning initialization parameters. Therefore, the training is separated into two steps: “pretraining for first layer” and “fine-tuning.” The first step is the pretraining step, in which GB-RBM is used for reconstruction training of the first layer by the contrastive divergence (CD) algorithm. This step trains 42 and 20 parameters as fixed values for the two submodels, respectively. It reduces the computational complexity while ensuring that the mapping of the feature vector to the feature space is optimal.
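The two parameter counts agree with the size of a fully connected first layer, as the short calculation below shows. It assumes first hidden layers of 6 and 4 units (from N6-6-7-7-8-7-6-5 and N4-4-3-2) and that weights and hidden biases are counted while the visible biases of the GB-RBM are not.

```latex
% parameters of a fully connected layer = inputs x units (weights) + units (biases)
\begin{aligned}
\text{DNN-1, first layer } (6 \to 6):&\quad 6 \times 6 + 6 = 42,\\
\text{DNN-2, first layer } (4 \to 4):&\quad 4 \times 4 + 4 = 20.
\end{aligned}
```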

The second step is the fine-tuning of the whole network. This is done by taking the pretrained parameters as initial values and applying the back propagation (BP) algorithm in the DNN to learn further on the training set. The well-trained dual-DNN is then used for prediction. Figure 9 summarizes the proposed method.

3.2.2. The Construction of GB-RBM

The RBM is a neural network based on energy. It consists of two layers that are the visible layer and the hidden layer. The visible layer is generally used to describe the observation data, while the hidden layer can be regarded as the feature extraction layer. A six-dimension feature vector as visible layer neurons is utilized to build GB-RBM. Through reconstruction training, the RBM can learn the inner relationship between the 6-dimensional features. However, the RBM was originally developed for binary vector coding and decoding, so both the visible and the hidden layer variables are binary. In this study, the type of visible layer vector is numerical, so RBM cannot be used. According to the paper [28], the Gaussian–Bernoulli restricted Boltzmann machine (GB-RBM) can solve this problem. As shown in Figure 10, the GB-RBM constructed is of numerical type for its visible layer vector and Boolean type for its hidden layer variable, which conforms to the requirements of numerical type feature vectors.

The GB-RBM is a neural network based on energy. The joint energy function of the visible and hidden variables is given in (4), where $\mathbf{v}$ is the visible layer random vector; $\mathbf{h}$ is the hidden layer random vector; $W$ is the weight matrix, and each element $w_{ij}$ is the weight of the connection between the visible layer variable $v_i$ and the hidden layer variable $h_j$; $b$ and $c$ are the biases; and $\sigma_i$ is the standard deviation associated with the Gaussian visible variable $v_i$.

With the joint energy function of $\mathbf{v}$ and $\mathbf{h}$ defined, their joint probability is obtained as shown in (5). $Z$ in (6) is the normalization factor, also known as the partition function, which sums over all states of the system.
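For reference, one widely used form of the GB-RBM energy, joint probability, and partition function is written out below. It is a sketch under stated assumptions rather than a transcription of the equations of this paper: $b$ is taken as the visible bias, $c$ as the hidden bias, and the interaction term is scaled by $1/\sigma_i$; variants with $1/\sigma_i^2$ also appear in the literature.

```latex
% Assumed form: b = visible biases, c = hidden biases, 1/sigma_i scaling.
\begin{align}
E(\mathbf{v}, \mathbf{h}) &= \sum_{i} \frac{(v_i - b_i)^2}{2\sigma_i^2}
  - \sum_{i}\sum_{j} \frac{v_i}{\sigma_i}\, w_{ij}\, h_j
  - \sum_{j} c_j h_j, \\
P(\mathbf{v}, \mathbf{h}) &= \frac{1}{Z} \exp\bigl(-E(\mathbf{v}, \mathbf{h})\bigr), \\
Z &= \sum_{\mathbf{h}} \int \exp\bigl(-E(\mathbf{v}, \mathbf{h})\bigr)\, d\mathbf{v}.
\end{align}
```

Because the visible units are continuous, the sum over visible states becomes an integral in $Z$, while the hidden units are still summed over their binary configurations.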

3.2.3. The Training Process of GB-RBM

The training of GB-RBM is divided into two processes: (1) the coding process, also known as forward propagation; (2) the decoding process, also called back propagation or reconstruction process.

In the coding process, given the features in the visible layer, the probability that a neuron in the hidden layer is activated is calculated by the sigmoid function, as shown in (7). Then, a random number between 0 and 1 is generated. If the number is less than the calculated probability, the hidden layer node takes 1; otherwise, it takes 0.

In the decoding process, given the current state of all neurons in the hidden layer, we calculate the probability that a neuron in the visible layer will be activated, as shown in (8). Unlike the binary RBM, the GB-RBM requires the mean and variance of the Gaussian distribution to be included, where $\mathcal{N}(\cdot \mid \mu, \sigma)$ denotes the Gaussian probability density function with mean $\mu$ and standard deviation $\sigma$.

Then, the randomizer generates a number from 0 to 1. If the number is less than the calculated , then the visible layer node is ; otherwise, it takes that random number.
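The coding and decoding conditionals can be written out under a common GB-RBM parameterization. The form below assumes the energy $E(\mathbf{v},\mathbf{h}) = \sum_i (v_i - b_i)^2 / (2\sigma_i^2) - \sum_{i,j} (v_i/\sigma_i) w_{ij} h_j - \sum_j c_j h_j$, with $b$ as visible bias and $c$ as hidden bias; these choices are assumptions, and other scalings exist.

```latex
% Conditionals that follow from the assumed energy by completing the square in v_i.
\begin{align}
P(h_j = 1 \mid \mathbf{v}) &= \operatorname{sigmoid}\Bigl(c_j + \sum_{i} w_{ij}\, \frac{v_i}{\sigma_i}\Bigr), \\
p(v_i \mid \mathbf{h}) &= \mathcal{N}\Bigl(v_i \,\Big|\, b_i + \sigma_i \sum_{j} w_{ij} h_j,\; \sigma_i\Bigr).
\end{align}
```

The hidden units are sampled as Bernoulli variables from the first expression, whereas the visible units are real-valued and follow a Gaussian whose mean is shifted by the active hidden units.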

After training by performing the coding and decoding processes alternately, the reconstruction error between the visible vector and its reconstruction becomes very small, which indicates that the GB-RBM tends to stabilize.

In the training process of GB-RBM, an efficient CD algorithm which is common in deep learning has been applied, and it is illustrated in Algorithm 1. In this way, GB-RBM can be trained in the same way as a normal RBM. In Section 4.1, the experimental results show the improvement of classification accuracy after using GB-RBM.

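Algorithm 1: Training of GB-RBM by the CD algorithm.
Input: Dataset x(n), n = 1, …, N;
Output: W, c, b
(1) Set the learning rate and the number of epochs T;
(2) Initialize: W ⟵ 0, c ⟵ 0, b ⟵ 0;
(3) Calculate the mean value and variance of the vectors in the dataset;
(4) for t = 1, …, T do
(5)   for n = 1, …, N do
(6)     choose an input vector, calculate the hidden activation probability by using Equation (7), and randomly choose a hidden vector according to that distribution;
(7)     calculate the positive gradient;
(8)     according to the hidden vector, calculate the visible distribution by using Equation (8), and obtain the reconstructed visible vector;
(9)     according to the reconstructed visible vector, calculate the hidden activation probability by using Equation (7), and obtain the reconstructed hidden vector;
(10)    calculate the reverse gradient;
(11)    W ⟵ W + learning rate × (positive gradient − reverse gradient);
(12)    c ⟵ c + learning rate × (difference of the corresponding bias terms between the two phases);
(13)    b ⟵ b + learning rate × (difference of the corresponding bias terms between the two phases);
(14)  end
(15) end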
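
The steps of the algorithm can be sketched in NumPy as a one-step CD (CD-1) loop. This is a minimal illustration under several assumptions, not the implementation used in the study: the features are standardized so that a unit standard deviation can be used, `b` is taken as the visible bias and `c` as the hidden bias, the Gaussian mean serves as the reconstruction instead of a sample, and the function name, learning rate, and epoch count are illustrative.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_gb_rbm(data, n_hidden, learning_rate=0.01, epochs=20, seed=0):
    """CD-1 training of a Gaussian-Bernoulli RBM on an (N, n_visible) array.

    Returns W (n_visible, n_hidden), c (hidden biases), b (visible biases)
    and the mean squared reconstruction error of each epoch.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Standardize each feature so that sigma_i = 1 can be assumed below.
    mean, std = data.mean(axis=0), data.std(axis=0) + 1e-8
    x = (data - mean) / std
    n_visible = x.shape[1]
    W = np.zeros((n_visible, n_hidden))
    c = np.zeros(n_hidden)
    b = np.zeros(n_visible)
    errors = []
    for _ in range(epochs):
        sq_err = 0.0
        for v0 in x[rng.permutation(len(x))]:
            # Coding: hidden activation probabilities and a binary sample.
            p_h0 = sigmoid(c + v0 @ W)
            h0 = (rng.random(n_hidden) < p_h0).astype(float)
            # Decoding: Gaussian visible units; the mean is the reconstruction.
            v1 = b + W @ h0
            p_h1 = sigmoid(c + v1 @ W)
            # Positive gradient minus reverse gradient.
            W += learning_rate * (np.outer(v0, h0) - np.outer(v1, p_h1))
            c += learning_rate * (h0 - p_h1)
            b += learning_rate * (v0 - v1)
            sq_err += float(np.mean((v0 - v1) ** 2))
        errors.append(sq_err / len(x))
    return W, c, b, errors
```

The sampled hidden vector in the positive gradient breaks the symmetry of the all-zero initialization. The returned `W` and `c` are the quantities that would initialize the first layer of the DNN before fine-tuning.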

3.3. Model Prediction and Invocation

3.3.1. Development of Prediction Scripts

The two DNNs are optimized by GB-RBM, followed by fine-tuning. After training, the saved model structure and parameters are used to regenerate the prediction script. The essence of the prediction script is a function that calls the neural network model and gives the prediction results according to the input variables.

LevelPredictor.py is a Python script used to evaluate the effect of waypoint behavior; its main function calls the saved .h5 classification model file to make the prediction. LDPredictor.py is a Python script used to predict the waypoint behavior parameters; its main function estimates the LOS parameters.
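The core of such a prediction function might look like the sketch below. It is not the content of LevelPredictor.py. The function name `predict_level`, the file name mentioned in the docstring, and the mapping from class index to a 1-based effect level are assumptions for illustration.

```python
import numpy as np


def predict_level(model, features):
    """Return the 1-based effect level with the highest predicted probability.

    `model` is any object with a Keras-style predict(batch) method, for
    example the result of keras.models.load_model("level_model.h5"),
    where the file name is hypothetical.
    """
    batch = np.asarray([features], dtype="float32")  # shape (1, n_features)
    probabilities = np.asarray(model.predict(batch))[0]
    # Assumption: class index 0 corresponds to level I.
    return int(np.argmax(probabilities)) + 1
```

Passing the model in as an argument means the .h5 file is loaded once by the caller rather than on every prediction.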

3.3.2. Invocation of DNN Model by Using “pDeepLearning”

A new MOOS app is developed to perform the computing of the trajectory tracking control prediction model, so as to establish the interface between the trained deep neural network and the MOOS-IvP platform. “pDeepLearning” is the key module for all types of information interaction. It is a C++ program inherited from the CMOOSApp class in MOOS. On the one hand, pDeepLearning publishes data to and subscribes to data from the MOOSDB. On the other hand, pDeepLearning calls the DNN, which is implemented as Python scripts.

We deployed a set of MOOS apps and MOOSDB, for the USV, that perform waypoint behavior. The DNN is then integrated into the MOOS-IvP by loading “pDeepLearning.” Figure 11 shows the system structure of USV with DNN model.

During the voyage, “pDeepLearning” first receives the speed and planned steering angle of the USV from MOOS-IvP. Then, “pDeepLearning” calls the LevelPredictor.py and LDPredictor.py scripts to predict the waypoint behavior effect and the LOS parameters. Finally, the WPT_UPDATE variable described in Section 2.3 is used to publish the lead distance and lead damper parameters into MOOS-IvP. Section 4.3 reports the performance of pDeepLearning.
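The three steps of this loop can be sketched in Python as follows. This is only an illustration under stated assumptions, not the pDeepLearning source, which is a C++ MOOS app. The function names, the predictor signatures, and the publish callback are hypothetical. The update string assumes the “parameter=value” pairs separated by “#” that IvP Helm behaviors accept, with the waypoint behavior parameters lead and lead_damper; the exact string used by the authors is not given in the text.

```python
def build_wpt_update(lead_distance, lead_damper):
    """Format the LOS parameters as an update string for the waypoint behavior.

    Assumed format: '#'-separated parameter=value pairs.
    """
    if lead_distance < 0 or lead_damper < 0:
        raise ValueError("lead parameters must be non-negative")
    return f"lead={lead_distance:.2f} # lead_damper={lead_damper:.2f}"


def guidance_step(speed, steering_angle, predict_level, predict_los, publish):
    """One prediction cycle: evaluate, estimate LOS parameters, publish.

    predict_level and predict_los stand in for LevelPredictor.py and
    LDPredictor.py; their signatures here are hypothetical.
    publish(name, value) stands in for posting a variable to the MOOSDB.
    """
    level = predict_level(speed, steering_angle)
    lead_distance, lead_damper = predict_los(speed, steering_angle)
    publish("WPT_UPDATE", build_wpt_update(lead_distance, lead_damper))
    return level
```

Passing the predictors and the publish function as arguments keeps the sketch independent of any particular MOOS binding or model file.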

4. Results and Discussion

The experimental results are presented in two parts. First, taking our previous research as the benchmark, we show that the classification accuracy improves after the DNN model adopts the “pretraining and fine-tuning” method with GB-RBM. Second, an experimental platform is built on MOOS-IvP to show the effect of running the deep learning application pDeepLearning in MOOS-IvP for trajectory tracking control.

4.1. The Effect of Training GB-RBM

Figure 12 shows the pretraining process of the first layer of the classification model and of the regression model using GB-RBM. The horizontal axis shows the number of epochs, and the vertical axis shows the reconstruction error, i.e., the difference between the input at the visible layer and its reconstruction from the hidden layer. The reconstruction error is measured by the mean squared error, which is commonly used in deep learning training. The experimental results show that the reconstruction errors of the two GB-RBMs decrease during training: the reconstruction error of the classification model converges to about 0.09, as shown in Figure 12(a), and that of the regression model converges to about 0.03, as shown in Figure 12(b). This suggests that the network is able to restore the original data after the transformation between the visible layer and the hidden layer, and that the proposed GB-RBM has learned the features of the waypoint behavior.

4.2. The Effect of Classification Accuracy

The accuracy varies with the depth and width of the DNN structure. We compare the accuracy obtained when the parameters are initialized by GB-RBM pretraining with the accuracy obtained without a pretraining phase. The results are shown in Tables 3 and 4, respectively.

The experiments show that, after pretraining, the DNN structure with 6-6-7-7-8-7-6-5 nodes in its successive layers achieves the maximum accuracy of 91.3% on the verification set and 88.9% on the test set.

Then, we employ the well-trained DNN to predict the effect of USV waypoint behavior at different speeds and steering angles. The results of the experiments are presented below.

Figures 13 and 14 show the predictions of the DNN for different steering angles and speeds, respectively, with all other parameters unchanged. Figures 13(a) and 14(a) present the reference values, Figures 13(b) and 14(b) show the predicted values without pretraining, and Figures 13(c) and 14(c) display the predicted values with pretraining.

It can be seen that the predictions follow the same trends as the ground truth, and that the unsupervised learning process using GB-RBM improves the prediction accuracy.

4.3. The Effect of Trajectory Tracking
4.3.1. Simulation Preparation

As mentioned above, the goal of this study is to predict the waypoint behavior of the USV through DNNs in order to optimize its trajectory tracking effect. A comparative navigation simulation experiment is conducted on the MOOS-IvP platform. Two USVs of the same type and length (7 m) are deployed. The first USV, named alpha, does not use a DNN, and the second one, named Alder, uses the dual-DNN prediction model. Apart from that, the configuration parameters of the two USVs are identical. The two USVs start from the same initial point and track a planned route consisting of five waypoints.

The dual-DNN prediction model adopted by the USV Alder is implemented by running the pDeepLearning application, which predicts the waypoint behavior in real time. The experimental results are recorded by the pLogger application in MOOS and then extracted and analyzed with the alogview toolbox.

The MOOS-IvP simulation platform is shown in Figure 15, in which the small window shows USV Alder running pDeepLearning for prediction. The main configuration parameters of the two USVs and their waypoint behavior configurations are given in Table 5.

After the experiment, we compare the tracking effect and performance indices of alpha and Alder to give an overall evaluation of the waypoint behavior effect. We then analyze the predictions of the DNN model and the influence of the model on speed and course stability.

4.3.2. Trajectory Tracking

Figures 16 and 17 show the trajectories of the USVs sailing from the initial position through the five waypoints. Figure 16 gives the overall picture, and Figure 17 gives an enlarged view of each waypoint.

As shown in Figure 16, the centers of the five sets of red circles are the five waypoints of the planned route. Around each waypoint, the evaluation circles for the waypoint behavior effect are drawn from the inside out with radii of 1 to 5 times the length of the USV. The black dashed line is the planned route connecting the five waypoints. The green line is the trajectory of USV Alder, which uses the DNN for prediction. The blue line is the trajectory of USV alpha, which does not use the prediction model. It can be seen that, with the model prediction, the USV deviates from the waypoint by a smaller distance at each turn.

Figure 17 gives an enlarged view of the USVs sailing to each waypoint. At every waypoint, the green track line is closer to the waypoint than the blue track line, which improves the waypoint behavior effect, in some cases by a full level. This indicates that the USV with the prediction model has learned the relationship between the parameters and the effect level of the waypoint behavior. However, in Figure 17(d), neither USV performs well: the steering angle at this waypoint is an acute angle, which falls outside the range covered by the dataset, so the overall deviation is large for both.
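The evaluation circles described for Figure 16 suggest a simple way to grade how closely a track passes a waypoint. The following Python sketch returns the index of the innermost circle that contains the closest point of approach, counted from the inside out. It is an illustration only: the function name is hypothetical, and the mapping from this ring index to the effect levels used in this study is deliberately not assumed.

```python
import math


def ring_index(miss_distance, vessel_length, n_rings=5):
    """Return the innermost evaluation circle (1 = innermost) containing
    the closest point of approach, or None if it lies outside all circles.

    Circle k has radius k * vessel_length, for k = 1 .. n_rings.
    """
    if vessel_length <= 0:
        raise ValueError("vessel_length must be positive")
    if miss_distance < 0:
        raise ValueError("miss_distance must be non-negative")
    ring = max(1, math.ceil(miss_distance / vessel_length))
    return ring if ring <= n_rings else None
```

For the 7 m USVs used here, a miss distance of 10 m would fall in the second circle, since it is larger than 7 m but not larger than 14 m.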

4.3.3. The Analysis of Trajectory Tracking Error

The error between the actual track and the planned track of the two USVs is depicted in Figure 18, where the horizontal axis represents the time in seconds and the vertical axis represents the tracking error in meters. The green line is the trajectory tracking error of USV Alder, which uses the DNN for prediction. The blue line is the trajectory tracking error of USV alpha, which does not use the prediction model. The black dashed line is the course of advance. It can be observed that the tracking error is smaller after adjustment by the prediction model, particularly when the course changes.
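The text does not state how the tracking error is computed. One common choice, shown here purely as an assumption, is the shortest distance from the vehicle position to the planned route, treated as a polyline through the waypoints. The function names and the local x-y coordinates in meters are hypothetical.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = min(1.0, max(0.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def tracking_error(position, route):
    """Distance from the vehicle position to the nearest leg of the route."""
    if len(route) < 2:
        raise ValueError("route needs at least two waypoints")
    return min(distance_to_segment(position, a, b)
               for a, b in zip(route, route[1:]))
```

Evaluating this function at every logged position gives an error time series of the kind plotted against time in a figure such as Figure 18.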

Table 6 lists the quantitative performance indicators. In terms of the mean error and variance, the values for the USV using the prediction model are smaller. In terms of the integral of absolute error (IAE) and the integral of time-weighted absolute error (ITAE), the USV with the prediction model has better transient and steady-state performance.
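For reference, these indicators can be computed from a logged error series as in the following Python sketch. It uses the standard definitions, IAE as the integral of |e(t)| and ITAE as the integral of t·|e(t)|, approximated with the trapezoidal rule. Taking the mean and variance over the absolute error is an assumption, and the function names are hypothetical.

```python
def trapezoid(ts, ys):
    """Trapezoidal-rule integral of samples ys taken at times ts."""
    return sum(0.5 * (y0 + y1) * (t1 - t0)
               for t0, t1, y0, y1 in zip(ts, ts[1:], ys, ys[1:]))


def tracking_metrics(times, errors):
    """Mean, variance, IAE and ITAE of a tracking-error time series."""
    if len(times) != len(errors) or len(times) < 2:
        raise ValueError("need two or more samples of equal length")
    abs_err = [abs(e) for e in errors]
    n = len(abs_err)
    mean = sum(abs_err) / n
    variance = sum((e - mean) ** 2 for e in abs_err) / n
    return {
        "mean": mean,
        "variance": variance,
        "iae": trapezoid(times, abs_err),
        "itae": trapezoid(times, [t * e for t, e in zip(times, abs_err)]),
    }
```

Because ITAE weights each error by the elapsed time, it penalizes errors that persist late in the voyage more heavily than IAE does, which is why the two are usually reported together.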

4.3.4. The Effect of Waypoint Behavior

Figure 19 compares the waypoint behavior effect with and without parameter modification by the prediction model. The green line is the waypoint behavior effect after the LOS parameters are optimized by the prediction model, and the blue line is the waypoint behavior effect without modification by the prediction model. The waypoint behavior effect is clearly better after optimization. The black line shows the changes of the steering angle as a reference: whenever the steering angle changes, the model predicts a new waypoint behavior effect level.

The effect of waypoint behavior is shown in Table 7; the effect was improved by 1 level after LOS parameters were adjusted by the prediction model.

4.3.5. The Effect of Prediction for LOS Parameters

Figure 20 shows the lead distance and lead damper values of the guidance law predicted by DNN-2. The green line and the blue line are the predicted values of lead distance and lead damper, respectively. The black line and the red line show the changes of steering angle and speed, respectively, and serve as references: when the steering angle changes, the model predicts a new lead distance and lead damper. In addition, the predicted lead damper values are affected by changes of speed during the voyage.

4.3.6. The Influence on Speed and Course

Figure 21 shows the changes of speed and course during the voyage. Figure 21(a) covers the entire navigation process, and Figure 21(b) is an enlarged view of the second steering process. The green line is the speed and course of USV Alder with the prediction model, the blue line is the speed and course of USV alpha without the prediction model, and the black dashed line is the course of advance. There is little difference between the speed and course changes of the two vehicles, which indicates that intervening in the waypoint behavior with the DNN does not cause abrupt increases or decreases in speed or course. The method therefore has little influence on the control loop and is reliable.

The results in Section 4 can be summarized as follows:
(1) The ability of the DNN model for trajectory tracking control, optimized by GB-RBM, to evaluate the waypoint behavior effect has improved markedly compared with the previous model. The accuracy on the test set has improved by 5%, reaching 88.9%.
(2) After correction, the average trajectory tracking error of the USV is reduced by 19.0%, and the waypoint behavior effect level is raised by one level.
(3) The DNN prediction model applied to the trajectory tracking control of the USV can evaluate the waypoint behavior effect before steering and adjust the parameters of the guidance law in real time.

5. Conclusions

In this study, two prediction models based on DNN have been constructed, i.e., “DNN-1: waypoint behavior effect evaluation model” and “DNN-2: real-time LOS parameter valuation model.” The models are connected in parallel with the LOS guidance process for trajectory tracking of the USV, which clearly improves the trajectory tracking effect. The experimental results demonstrate the positive effect of the deep learning method on the autonomous navigation of the USV. Through training on the dataset, the DNN has learned the mapping between the different features and the effect levels of the waypoint behavior. Through the real-time prediction of LOS parameters, the trajectory tracking error is reduced by about one vehicle length.

We have developed a new MOOS application. To our knowledge, this is the first time that a DNN has been integrated dynamically into MOOS-IvP, a well-known marine autonomy platform. Although it only predicts the guidance process of the USV, the enhancement is significant. In the future, it can further improve the overall autonomous navigation capability of the USV.

In addition, a dual-DNN model has been used cooperatively for prediction in the trajectory tracking control process, which is also our first attempt of this kind. The experimental results have proved its feasibility. We believe that first evaluating the waypoint behavior effect and then executing the maneuver according to the prediction is more in line with the regular practice of ship maneuvering and control in the marine domain, and that it will help improve reliability in real-world scenarios.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.