Abstract

With the continual enhancement of onboard avionics, the minimum flight crew has been downsized from five members to a two-person crew, and reduced-crew operation has drawn extensive attention from aviation experts. The single-pilot operation (SPO) mode therefore warrants careful consideration and research. This study investigated intention modeling for the commercial aviation single pilot based on the bidirectional long short-term memory (BiLSTM) network, mining the intention tendency of pilot behavior through artificial intelligence technology. The goal is to avoid safety hazards caused by diverging intents and inconsistent operations between the single pilot and the cockpit automation system. The classification of a single pilot's behavior is the core of intention recognition, and different operation items contribute differently to the classification. To construct the interaction dataset and encode it into time series features, a single-pilot experiment was specifically performed, and expert experience was summarized into single-pilot intent labels. The deep information in the feature vector of a single-pilot operation item is captured by the BiLSTM network, and the neural weights are adaptively assigned by the training mechanism. The operation sequence with the feature data is finally loaded into the softmax layer for intention classification. The proposed method is evaluated against long short-term memory (LSTM), term frequency-inverse document frequency (TF-IDF), convolutional neural network (CNN), Naive Bayesian (NB), and distributed representation intention modeling techniques. Because the proposed method achieves higher F1 scores, the model can effectively share real-time information about the single-pilot intention with the cockpit automation system.

1. Introduction

Single-pilot operation (SPO) is a key research concept for next-generation aviation technology [1]. For the SPO mode to be as effective and of high quality as the dual-pilot operating mode, the capabilities of the avionics system must be enhanced [2]. With the continuous development of the commercial SPO concept, the collaborative flight organization structure with automation at its core inevitably increases the overall complexity [3], which increases safety risks.

The National Aeronautics and Space Administration (NASA) is studying the relevant aspects of SPO [4], which relies on developing and integrating a higher level of automation than the two-person crew mode of the past. In recent years, crew numbers in the cockpit of commercial aircraft have declined and air traffic density has increased, yet aviation accident rates have decreased [5]. Automation is the main factor addressing these challenges [6]. Although all stakeholders agree on the effectiveness of high automation for safety enhancement, they are also concerned that extreme automation may cause profound changes in human-machine cooperation or introduce new human errors [7]. Automation has in particular shown many forms of problematic behavior, such as automation bias [8], automation surprise, mode confusion [9], unexplained decisions, and decision errors [10], which have intensified stakeholders' concerns. Since SPO will necessitate increasingly autonomous systems to carry out missions previously performed by the copilot [11], aircraft automation is regarded as a cooperator that acts with intention. A key finding is that conflicting intentions may arise from insufficient support of the partnership between the pilot and aircraft automation [12]. The transfer and execution of control authority between the single pilot and the cockpit automation systems will inevitably conflict when their intended goals diverge [13]. Consequently, automation must be user-centered [14] and adapt to the single pilot's intentions, which requires establishing passive or implicit perception capabilities to foster the pilot-automation partnership.

Human factor research has shown that, because there is only one flight control operator under the SPO mode, the piloting skills and physical condition of the sole pilot directly affect flight safety. With the evolution of SPO mode technology, automation pushes the single pilot into the role of a supervisor rather than removing them from control. Generally, human pilots are required to monitor the automation partner to ensure its efficient assistance during flight. However, bidirectional monitoring is a crucial constituent of team collaboration, enabling the pilot and the automation partner to examine each other, similar to interpersonal cooperation. In particular, SPO human-machine interaction can be facilitated when real-time monitoring objects are dynamically migrated between the single pilot and the automation partner. The overall objective of the European Commission-funded project A-PiMod (applying pilot models for safer aircraft) [15] is to develop a pilot model that enables an advanced cockpit assistance system to learn about the single pilot's cognitive state. The automation system in charge of supervision must continuously provide feedback regarding pilot cognition or intention. It must extract the implicit intention data from the time-varying behavioral characteristics of the single pilot, enabling active recognition and complete understanding of the single-pilot intention. This will support free flight oriented toward the single-pilot intention.

The intention recognition process directly drives the generation of situation awareness, and situational unawareness may result from poor intention recognition. Statistics show that human error accounts for about 70% of aviation accidents [16]. Avoiding the potential threats caused by action errors and perceiving the unsafe decision intentions or manipulation behaviors of a single pilot has therefore become one of the research goals in developing SPO mode technology [17].

For pilot intention modeling, existing work differs primarily in the choice of data type and study approach. McRuer [18] examined a pilot transfer function model based on classical and modern control theories; the model was used to describe actual patterns of pilot behavior. The pilot intent and error recognition (PIER) module of the crew assistant military aircraft (CAMA) is another method described in [19]. The method was based on fuzzy logic algorithms, and intent recognition was first defined as a classification problem. Hayashi [20] used instrument scanning to simulate the aircraft pilot's attention-switching behavior and employed a hidden Markov model (HMM) to analyze the pilots' eye-movement statistics. However, our focus is on physical interactions rather than scanning behavior. In [21], NASA launched a human performance modeling project that analyzed pilot performance using five cognitive modeling tools; the project supports the prediction of pilot errors and behavior. The cognitive architecture for safety critical task simulation (CASCaS) was proposed in [22]. CASCaS was defined as an operator model that perceives associated intentions through discrete action sequences rather than classifying intentions probabilistically. The A-PiMod project designed a pilot model based on the adaptive automation concept in [15]; A-PiMod assessed the pilot state through multimodal interactions between the pilot and the cockpit. The literature suggests that modeling single-pilot behavior requires a fully adaptive automated approach. The HMM-based intent recognition module described in [12] can be viewed as a rule-based knowledge engineering method, in which annotated interaction data were used to learn the HMM parameters. Owing to the model's performance limitations, the module only inferred eight basic flight tasks, including "set flaps," "set approach," and "change heading."

Based on the domestic and foreign research reviewed above, current work on pilot intention modeling focuses primarily on system modeling and analysis verification with classical algorithms, while few studies address pilot intent recognition based on intelligent systems. The single-pilot intention is implemented through a series of pilot manipulation behaviors that exhibit typical time-sequence characteristics. Nevertheless, most of the above approaches are designed to recognize static characteristics; consequently, they show no apparent superiority in handling process events with timing features, such as aircraft piloting. With the rapid advancement of deep learning, this study proposes an intention modeling method for the commercial aviation single pilot based on bidirectional long short-term memory (BiLSTM), which treats intention recognition as a timing classification problem. The proposed method, whether evaluated from the perspective of human-machine interaction design or of model capability, can handle the increasingly complex challenges of aircraft pilot intent recognition.

This study is structured as follows: Section 2 describes the relevant theoretical basis; Section 3 provides an overview of the single-pilot intention model; Section 4 defines the intention space and characteristics and designs the SPO test; Section 5 details the single-pilot intention modeling method based on the BiLSTM network; Section 6 presents a case study that demonstrates the interaction dataset and performs model evaluation and comparative analysis; and finally, the discussion and conclusion are presented.

2. Preliminaries

The classification of time series information is the core of single-pilot intent recognition, and different data preprocessing and feature capture techniques contribute differently to the classification. Learning a vector representation for an operation item is crucial. The one-hot vectors created by one-hot encoding are the most widely used representation. Because such vectors are sparse, the BiLSTM network from deep learning is used to better capture the sequence information.

2.1. Long Short-Term Memory (LSTM)

The traditional neural network model is inefficient in processing sequence learning because it cannot explain the correlation between the front and back of the sequence [23]. Recurrent neural networks (RNNs) are a type of sequence learning model that can dynamically learn sequence features. However, owing to the gradient disappearance or the explosion problem [24], traditional RNNs are difficult to train. In 1997, Hochreiter and Schmidhuber proposed LSTM [25], which introduced a gating structure and solved the aforementioned problems. The LSTM unit diagram is shown in Figure 1.

The cell state $c_t$ at time step $t$ contains the information that the LSTM unit has learned from the previous time steps, and the hidden state $h_t$ contains the output of the LSTM unit for time step $t$. The gate structure does not provide information itself but limits the amount of information passed, thereby controlling the update of the cell state. The gate structure is a type of multilevel feature selection technique.

The input gate $i_t$ regulates the degree to which the cell state is updated, the forget gate $f_t$ regulates the degree to which the cell state is forgotten, the output gate $o_t$ controls how much of the cell state is added to the hidden state, and the cell candidate $g_t$ is responsible for adding information to the cell state. The formulas are as follows:

$$i_t = \sigma\left(W_i x_t + R_i h_{t-1} + b_i\right)$$
$$f_t = \sigma\left(W_f x_t + R_f h_{t-1} + b_f\right)$$
$$o_t = \sigma\left(W_o x_t + R_o h_{t-1} + b_o\right)$$
$$g_t = \tanh\left(W_g x_t + R_g h_{t-1} + b_g\right)$$

In the above formulas, $x_t$ is the input at time step $t$; $W_i$, $W_f$, $W_o$, and $W_g$ are the components of the input weights $W$; $R_i$, $R_f$, $R_o$, and $R_g$ are the components of the recurrent weights $R$; and $b_i$, $b_f$, $b_o$, and $b_g$ are the components of the bias $b$. $\sigma$ denotes the sigmoid function, and $\tanh$ denotes the hyperbolic tangent function.

At time step $t$, the cell state $c_t$ and the hidden state $h_t$ are given by

$$c_t = f_t \odot c_{t-1} + i_t \odot g_t$$
$$h_t = o_t \odot \tanh\left(c_t\right)$$

where $\odot$ denotes the Hadamard product.
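For concreteness, the following MATLAB sketch (MATLAB R2022a is the environment named later in Section 6.1) implements a single LSTM time step according to these equations; the parameter struct p and its field names are illustrative assumptions, not the exact implementation used in the simulations.

% Minimal sketch of one LSTM time step, following the gate equations above.
% p holds the per-gate input weights (Wi, Wf, Wo, Wg), recurrent weights
% (Ri, Rf, Ro, Rg), and biases (bi, bf, bo, bg); the field names are illustrative.
function [h, c] = lstmStep(x, hPrev, cPrev, p)
    sig = @(z) 1 ./ (1 + exp(-z));              % sigmoid function
    i = sig(p.Wi*x + p.Ri*hPrev + p.bi);        % input gate
    f = sig(p.Wf*x + p.Rf*hPrev + p.bf);        % forget gate
    o = sig(p.Wo*x + p.Ro*hPrev + p.bo);        % output gate
    g = tanh(p.Wg*x + p.Rg*hPrev + p.bg);       % cell candidate
    c = f .* cPrev + i .* g;                    % cell state update (Hadamard products)
    h = o .* tanh(c);                           % hidden state output
end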

2.2. Bidirectional Long Short-Term Memory (BiLSTM)

In the traditional RNN and LSTM models, information can only be propagated forward, so the state at time step $t$ depends only on the data before time step $t$ [23]. The BiLSTM network is proposed so that each instant incorporates information from both the preceding and the following time steps.

The BiLSTM network uses both forward and backward observations to enhance performance. Thus, the BiLSTM network concatenates the forward and backward hidden states before passing them to the next network layer [26]. The corresponding calculation formulas are [27]

$$\overrightarrow{h}_t = \mathrm{LSTM}\left(x_t, \overrightarrow{h}_{t-1}\right)$$
$$\overleftarrow{h}_t = \mathrm{LSTM}\left(x_t, \overleftarrow{h}_{t+1}\right)$$

where the arrows $\rightarrow$ and $\leftarrow$ represent the forward and backward processes, respectively.

When the input is $x_t$, the output of the forward hidden state of the BiLSTM network is defined as $\overrightarrow{h}_t$ and the output of the backward hidden state is defined as $\overleftarrow{h}_t$; the combined hidden state at time step $t$ is then $h_t = \left[\overrightarrow{h}_t; \overleftarrow{h}_t\right]$, and the hidden state matrix is

$$H = \left[h_1, h_2, \ldots, h_T\right].$$

A graphical illustration of the BiLSTM network is shown in Figure 2.
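The bidirectional computation can be viewed as two passes of the LSTM step sketched in Section 2.1, run in opposite directions over the same sequence and then concatenated. The sketch below assumes a sequence X with one column per time step, a hidden size H, and forward/backward parameter structs pF and pB.

% Sketch: a BiLSTM runs one LSTM pass forward and one backward over the same
% sequence X and concatenates the hidden states; lstmStep is the function
% sketched in Section 2.1, and pF/pB are assumed parameter structs.
T = size(X, 2);
hF = zeros(H, T);                      % forward hidden states
hB = zeros(H, T);                      % backward hidden states
[h, c] = deal(zeros(H, 1));
for t = 1:T                            % forward pass
    [h, c] = lstmStep(X(:, t), h, c, pF);
    hF(:, t) = h;
end
[h, c] = deal(zeros(H, 1));
for t = T:-1:1                         % backward pass
    [h, c] = lstmStep(X(:, t), h, c, pB);
    hB(:, t) = h;
end
Hmat = [hF; hB];                       % 2H-by-T bidirectional hidden state matrix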

2.3. One-Hot Encoding

One-hot encoding [28] encodes $N$ states using an $N$-digit status register, and the result is an $N$-dimensional binary vector.

Assuming that the number of states of the status register is $N$ and the positive integer state index is $k$ with $1 \le k \le N$, the one-hot vector $e_k$ is defined as an $N$-dimensional real column vector with

$$e_k(j) = \begin{cases} 1, & j = k \\ 0, & j \neq k \end{cases}$$

where $e_k(j)$ is the $j$-th element of the vector $e_k$.

3. Single-Pilot Intention Model

The theoretical foundation for addressing thinking deviation and intention inconsistency in the collaborative process between a single pilot and the cockpit automation system is the effective modeling of single-pilot intent. To enhance the single-pilot-automation partnership, this model was developed with reference to the European project A-PiMod. Figure 3 shows the technical scheme block diagram of the single-pilot intention model. The human-machine multimodal interface (HMMI) collects real-time data on interactions with the test cockpit systems. The intention inference module provides the cockpit automation system with an adequate understanding of the flight intents that the single pilot is performing. The situation awareness assessment can detect unsafe operation actions and develop explanatory mechanisms for observed and perceived single-pilot behavior. The cockpit risk assessment continuously provides information about the risk associated with all current single-pilot behaviors. The results of these two assessments feed a proactive risk or safety tool, supporting functions such as cross-checking of safety-critical tasks, decision assistance, and dynamic assignment of flight tasks. By applying the intention recognition results to the piloting assistance tool, the cockpit automation system can properly alert the single pilot in a risky state, in light of a given flight intent, to ensure that the aviator is able to maneuver the airplane precisely.

4. Single-Pilot Operation Experiment

4.1. Airworthiness Certification Requirements

According to an in-depth analysis of the airworthiness requirements of FAR 25.1523 "minimum flight crew," when approving the minimum flight crew, the airworthiness authority shall consider the operational complexity and corresponding workload level of a single pilot under the architecture in Figure 3. FAA TC-13/44 (flight deck controls and displays), CAAP 5.59-1(0) (single-pilot human factors), and CAP 737 (flight crew human factors) address cockpit human factors from the perspectives of human-machine interface design and multimodal input equipment. According to AC 25.1523, the airworthiness authority must pay special attention to the case in which one pilot becomes incapacitated for some reason and the other pilot must continue to fly and land the aircraft safely; this situation implicitly approximates SPO. Additionally, the flight manual of a particular commercial aircraft shows that the authority and the manufacturer have established standard operating procedures for various flight tasks in two-person crew operation. Flight monitoring, inspection, navigation, communication, and other functions assigned to the copilot in the standard operating procedures are appropriately simplified and reassigned to the main pilot, and a flight simulation system is used to conduct the SPO test. According to AC 25.1523, although the task allocation method of the SPO test will increase the workload of the single pilot, it can still ensure safety.

4.2. Single-Pilot Operation Test

To gather the required data to train and validate the single-pilot intention model, we conducted SPO tests in the digital experimental cockpit (DECO) of the Civil Aviation University of China in Tianjin. The DECO is a modular flight simulation system that can simulate the most advanced cockpits that aircraft manufacturers are currently producing. Figure 4 shows an image of the DECO. DECO can fulfill the collection needs of the SPO test interactive data stream after secondary development. We invited three cadet pilots to fly simulations in the DECO. All these participants were experienced in flying a B737-800. A series of experimental flight scenarios have been chosen to evaluate the single-pilot intention modeling concept. The scenarios integrated a “theatre approach” and incorporated a variety of task elements. Each participant flew multiple times in each flight scenario and completed the given flight task. Each scenario lasted approximately one hour, commencing with the cockpit preparation phase and ending with the landing phase. Before each SPO test, the participant was randomly assigned a flight plan and required to perform several mission events (take-off, climb, descent, etc.) in order. During the test, the participant received random emergency alerts and was expected to execute emergency operating procedures in response. Both the flight path and each mission event of the single-pilot test are indicated in Figure 5.

4.3. Single-Pilot Intention Space

The single-pilot intention space varies with application scenarios and mission forms and must be defined together with the unique safety requirements of the SPO mode. For the normal and emergency operation processes of the SPO mode, this study established a single-pilot intention space comprising 11 types of flight intent labels. As shown in Figure 5, these labels include engine starting on the ground, take-off and climb, descend approach and land, go-around, cabin altitude high and make an emergency descent, make a forced landing, engine fire extinguishing, engine starting in the air, pitch trim fault recovery, aircraft hijacking, and single-pilot incapacitation. The issue of inferring the single-pilot intentions must be modeled and resolved in conjunction with these flight intent labels. The labels can be modified to fit a particular situation, and their quantity can be increased to meet specific requirements. In the current modeling, however, even if the pilot operates according to an intention not listed here, the model will still output one of the 11 flight intents, which is incorrect. To enable the single-pilot intention model to reject examples from unknown intentions while performing $K$-class classification tasks [29], class $(K+1)$ is defined as the unknown intent label. Despite the lack of prior knowledge about the unknown intention, we can still use the known flight intent labels to detect the unknown one [30]. An uncertainty threshold is defined on the probability distribution of the softmax layer output, which further simplifies the issue of unknown intention recognition.
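One way to realize this rejection rule, sketched below under the assumption that the softmax scores of the trained network are available, is to compare the maximum class probability with the uncertainty threshold; the threshold value shown is purely illustrative.

% Sketch of softmax-threshold rejection for unknown intents.
% scores is the 11-by-1 softmax output of the trained model for one operation
% sequence; tau is an illustrative uncertainty threshold.
tau = 0.5;
[pMax, intentIdx] = max(scores);
if pMax < tau
    intentIdx = 12;                    % class K + 1: unknown intent
end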

4.4. Single-Pilot Intention Recognition Feature

Reviewing a single-pilot intention necessitates using a function that enables automation to continuously perform surveillance actions with low costs but high benefits. This includes whether the information source is accessible or difficult to locate, whether the information provided is easy or hard to understand, and whether the sampled information source is current or not [31]. DECO’s main goal is to provide data collectors with maximum flexibility, to meet the aforementioned specifications.

To control the flight of the aircraft, the sole pilot must display specific behaviors, and the cockpit must support the operation of the flight crew. A single pilot must interact with the flight simulation system in certain ways, such as "start right engine," "autothrottle off," "select course," "press TOGA," and "landing gear DN." These interactions are called operation items. This study focused on 92 of these interactions because they are essential for triggering flight mission events and are therefore closely related to task intent data. The operation items collected during the SPO test form an operation sequence that can accurately reflect the single pilot's decision-making intent, denoted as $S = \left(s_1, s_2, \ldots, s_T\right)$, where $T$ is the length of the sequence.
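By way of illustration, an annotated operation sequence can be stored as an ordered string array together with its flight intent label; the items below are taken from the examples above, while the pairing with this particular label is hypothetical.

% Hypothetical example of one annotated operation sequence.
operationSequence = ["start right engine", "autothrottle off", ...
                     "select course", "press TOGA", "landing gear DN"];
flightIntentLabel = categorical("take-off and climb");   % illustrative pairing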

5. Single-Pilot Intention Modeling Method Based on BiLSTM

5.1. Vectorization of Single-Pilot Operation Item

Vectorizing the single-pilot operation items, that is, preprocessing these data into a format that the BiLSTM layer can directly accept and understand, is the first step of single-pilot intention modeling. One-hot encoding is a classical vectorization technique.

Let $k_t$ be the positive integer index of the operation item $s_t$, with $1 \le k_t \le 92$ and $t = 1, 2, \ldots, T$. Suppose there is a one-hot vector $x_t$; according to formula (5),

$$x_t(j) = \begin{cases} 1, & j = k_t \\ 0, & j \neq k_t. \end{cases}$$

According to the order of the operation items in the operation sequence, the one-hot vectors are combined to generate the one-hot matrix $X$:

$$X = \left[x_1, x_2, \ldots, x_T\right] \in \mathbb{R}^{92 \times T}.$$

Natural language processing (NLP) theory states that the correlation between two words increases as the Euclidean distance between their word vectors decreases and decreases as that distance increases. Therefore, the one-hot vector of a single-pilot operation item also carries an implicit semantic correlation. Given that the Euclidean distance between any two different one-hot vectors is always equal, there is

$$\left\| e_p - e_q \right\|_2 = \sqrt{2},$$

where $p \neq q$.

The semantic correlation between one-hot encoded single-pilot operation items is therefore identical, meaning that one-hot encoding deliberately ignores the semantic information while primarily preserving the time series relationship. Compared with the distributed representation method [32], one-hot encoding yields a more concise expression of the operation item and effectively creates a highly representative one-hot matrix from the operation sequence, which is more conducive to feature capture. Figure 6 shows the visualization of the 92 single-pilot operation item vectors in a low-dimensional space using T-distributed stochastic neighbor embedding (T-SNE). With the one-hot encoding technique, the distance from the circle's center to each single-pilot operation item is the same, as shown in Figure 6(a).
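A minimal MATLAB sketch of this vectorization step is given below; the 92-item dictionary and the example operation sequence are assumed to be available as string arrays.

% Sketch: map each operation item to its index in the 92-item dictionary and
% stack the resulting one-hot column vectors into a 92-by-T matrix.
% itemDictionary (1-by-92 string array) and operationSequence are assumed given.
T = numel(operationSequence);
X = zeros(92, T);
for t = 1:T
    k = find(itemDictionary == operationSequence(t), 1);   % index of item t
    X(k, t) = 1;                                            % one-hot column
end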

5.2. Single-Pilot Intention Model Based on BiLSTM

As a crucial means of understanding a single pilot's behavior, intention recognition can be abstracted as intention classification. This study uses a sequence input layer, a BiLSTM layer, a fully connected layer, a softmax layer, and a classification output layer to build the BiLSTM network architecture as a classifier.

The application flow of the BiLSTM network architecture is as follows: the one-hot matrix $X$ is first fed in through the sequence input layer; the BiLSTM layer then captures the timing properties of the one-hot matrix in both directions and obtains a hidden state matrix $H$ with bidirectional long-term dependencies between operation sequence time steps; $H$ is mapped to the 11 neurons of the fully connected layer, each corresponding to a flight intent label; the softmax layer applies the normalized exponential function to compute normalized probability scores for the 11 flight intent labels; finally, the classification output layer outputs the flight intent label of the operation sequence. Figure 7 shows the BiLSTM network architecture diagram of the single-pilot intention model, and Table 1 presents an analysis of the network architecture.
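Because the simulations were later run in MATLAB R2022a (Section 6.1), the layer stack described above can be sketched with the Deep Learning Toolbox roughly as follows; the hyperparameter values follow Section 6.4, and the sketch is not claimed to be the exact implementation.

% Sketch of the five-layer classifier: sequence input -> BiLSTM -> fully
% connected -> softmax -> classification output.
numFeatures = 92;       % dimension of the one-hot operation item vectors
numHiddenUnits = 32;    % NumHiddenUnits (Section 6.4.4)
numClasses = 11;        % flight intent labels
layers = [
    sequenceInputLayer(numFeatures)
    bilstmLayer(numHiddenUnits, 'OutputMode', 'last')   % 'sequence' would label every time step, as in Section 6.5
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];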

The model training algorithm is shown in Algorithm 1.

Input: Dataset: dataset
Output: BiLSTM Model: bilstm_model
flight_intent_labels, operation_sequences ← load dataset
intent_dictionary ← operation_sequences
for operation_sequence in operation_sequences do
  X ← wordEmbedding(intent_dictionary, operation_sequence)
  y ← categorical(flight_intent_label)
end for
bilstm_model ← build_bilstm(MiniBatchSize, NumHiddenUnits, layer_size)
loss, out ← build_output(bilstm_output, in_size, out_size)
solver ← AdamOptimizer(loss, LearnRate)
for epoch in epochs do
  for MiniBatch_X, MiniBatch_y in get_MiniBatch(X, y, MiniBatchSize) do
    bilstm_model.run(loss, feed = {input: MiniBatch_X, target: MiniBatch_y})
    bilstm_model.run(solver, feed = {input: MiniBatch_X, target: MiniBatch_y})
  end for
end for
return bilstm_model
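For reference, an equivalent training call in MATLAB R2022a (the environment used in Section 6.1) might look like the sketch below; XTrain/YTrain and XVal/YVal are assumed to hold the encoded sequences and categorical intent labels, layers is the stack sketched in Section 5.2, and the hyperparameter values follow Table 3.

% Sketch of model training with trainNetwork; MaxEpochs 35, MiniBatchSize 8,
% and an initial learning rate of 0.002 follow the tuning in Section 6.4.
options = trainingOptions('adam', ...
    'MaxEpochs', 35, ...
    'MiniBatchSize', 8, ...
    'InitialLearnRate', 0.002, ...
    'ValidationData', {XVal, YVal}, ...
    'Plots', 'training-progress', ...
    'Verbose', false);
bilstm_model = trainNetwork(XTrain, YTrain, layers, options);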

6. Case Analysis

6.1. Simulation Environment

The experimental hardware platform is an Intel Xeon Silver 4214R processor (2.4 GHz) with 64 GB of RAM. The experimental software platform is the Windows 10 operating system, and MATLAB R2022a is the programming language and development environment.

6.2. Dataset of Single-Pilot Intention Modeling

A data sample is obtained by the DECO interactive sensor for every action and entered into the database. Specific time steps can effectively present the trend of the intentions underlying the pilot's driving behaviors. Therefore, for every U time steps, one operation sequence sample is annotated from the acquired time series data based on expert experience. Data annotation was guided by the following three principles: (1) field experts were interviewed, and operation sequences that might trigger flight intents were noted; (2) to determine the flight intent label of an operation sequence, the field experts referred to the standard operating procedures written by the CAAC or the manufacturer; and (3) a multiexpert cross-check was used to reduce the impact of errors.

To implicitly establish the mapping relationship between the single-pilot operation sequence and the single-pilot intention space, this study trains the BiLSTM-based single-pilot intention model using the interaction dataset recorded by DECO. First, the intent label of each operation sequence is calibrated using the knowledge engineering method to obtain the entire dataset. The preprocessed dataset is then input into the BiLSTM network architecture for training to obtain the mapping relationship. In real time, the sensor assembles one sequence sample from the gathered single-pilot operation items every U continuous time steps, encodes the collected sequence, and feeds it into the trained single-pilot intention model. The outcome of the single-pilot intention recognition is finally obtained.
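A hedged sketch of this on-line use is given below; encodeSequence is a hypothetical helper wrapping the vectorization step of Section 5.1, bilstm_model is the trained network, and recentItems holds the most recently gathered operation items.

% Sketch: encode the latest operation items and query the trained model.
X = encodeSequence(recentItems, itemDictionary);        % 92-by-U one-hot matrix
[predictedIntent, scores] = classify(bilstm_model, X);  % intent label and softmax scores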

The dataset contains 173 operation sequences, of which 61 form the training set, 94 form the test set, and 18 form the validation set. The training set was used to fit the data features, the test set was used to assess the model performance, and the validation set was used to assess overfitting [33]. An example of the dataset is shown in Table 2.

6.3. Evaluation Indicators of Single-Pilot Intent Model Performance

It is essential to define the evaluation indicators to assess the generalization performance of this model. Precision, recall, and F1 are the three selected evaluation indicators.

TP denotes the number of true positive examples, FP the number of false positive examples, and FN the number of false negative examples.
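With these counts, the standard definitions used here are

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.$$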

6.4. Hyperparameter Setting of Single-Pilot Intent Model

The hyperparameters are tuned by controlling variables to find the most appropriate values and obtain the best single-pilot intent classification. MaxEpochs, the learning rate, MiniBatchSize, NumHiddenUnits, the optimization function, and the loss function are the hyperparameters that must be set.

6.4.1. MaxEpochs

MaxEpochs denotes the maximum number of training epochs. The generalization performance of the model gradually improves, as the epochs increase. However, if the number of epochs is too large, it may cause an overfitting issue. Figure 8 shows the corresponding F1 at various MaxEpochs. According to the figure, the classification performance F1 gradually improves as MaxEpochs increases. F1 is relatively stable when MaxEpochs is above 25 and reaches its maximum value when MaxEpochs equals 35. Hence, the MaxEpochs is set to 35.

6.4.2. Learning Rate

The learning rate is a key parameter that affects the speed at which the training algorithm updates the weight. If the learning rate is too high, training will only produce suboptimal results or diverge; if it is too low, training time will be long. Figure 9 shows the corresponding F1 at various learning rates. According to this figure, 0.002 is the best learning rate.

6.4.3. MiniBatchSize

MiniBatchSize represents the minibatch size. Setting MiniBatchSize determines how the sequence data will be padded and how the F1 will be impacted. Figure 10 shows the corresponding F1 at various MiniBatchSizes. Figure 10 demonstrates that eight is the best MiniBatchSize.

6.4.4. NumHiddenUnits

NumHiddenUnits represents the number of hidden units, which corresponds to the amount of information remembered from the sequence and directly affects the classification performance F1. If NumHiddenUnits is too small, the network's learning ability will be impaired; if it is too large, the layer is prone to overfitting the training data. The corresponding F1 at various NumHiddenUnits is shown in Figure 11. From the figure, 32 is the ideal value of NumHiddenUnits.

Cross entropy is chosen as the loss function in this model. Cross entropy, a common loss function for assessing classification performance, lowers the risk of gradient disappearance during stochastic gradient descent [34], which is often better than the classification error rate or the mean square error. To optimize the loss function, the Adam optimizer is also selected in this model.
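For a $K$-class problem with one-hot target $y$ and softmax output $p$, the cross-entropy loss takes the standard form

$$L = -\sum_{k=1}^{K} y_k \log p_k.$$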

The optimal hyperparameter combinations are shown in Table 3.

6.5. Simulation Results and Model Evaluation

Generally, maneuvering-behavior changes occur in real time and are discrete during the operation of commercial aircraft. The one-hot encoding technique is used to preprocess the operation sequences before probabilistic classification, which is a prerequisite for single-pilot intention prediction. Figure 12 presents the visualization result of one-hot encoding for the second operation sequence listed in Table 2.

The BiLSTM-based single-pilot intention model is trained for 35 epochs, and we obtain an accuracy of 96.5% on the training sets and 94.4% on the validation sets. The training progress diagram is shown in Figure 13. According to the simulation results, the F1 score of the proposed model in this study is 95.60% on the test set. A confusion matrix with the diagonal line representing the number of correctly identified samples was made to further observe the relationship between various intentions, as shown in Figure 14.

Based on the proposed single-pilot intent model's network architecture, deep learning training is conducted in a sequence-to-sequence manner, and the flight intent labels of each time step of the interaction data are determined using the newly trained model. Because BiLSTM also uses input propagated from the future to the past, all inputs are required in advance to determine the flight intents at each time step. To avoid this real-time application issue, the interaction data are forcibly cut; specifically, every U operation items sampled constitute an entire operation sequence and are identified, which significantly reduces the waiting time for sampling subsequent operation items. For comparison, a stairstep graph is drawn, as shown in Figure 15. Calculations show that the model has a good classification effect, because the classification performance F1 is 90.79%.

Figures 14 and 15 demonstrate that the model effectively expresses the data, owing to the high relevance between the current flight intent and interaction data. This corresponds with the real situation. Figure 16 shows the label meanings of the numbers in Figures 14 and 15.

6.6. Analysis and Comparison of the Different Modeling Methods on Single-Pilot Intent

The long short-term memory (LSTM), term frequency-inverse document frequency (TF-IDF), convolutional neural network (CNN), Naive Bayesian (NB), and distributed representation intent modeling techniques were compared with the proposed modeling technique to demonstrate its superiority. The hyperparameters of LSTM are shown in Table 3. TF-IDF refers to distributed word vectors weighted with TF-IDF, which embodies the contribution of the various single-pilot operation items to the classification task [23]. The CNN technique employs three sliding convolution filters of size five. Based on known prior probabilities, the Naive Bayesian method uses the Bayesian formula to calculate posterior probabilities. VEC refers to the distributed vector representation created by word2vec, which converts a single-pilot operation item into a 9-dimensional real vector.

Figure 17 shows the F1 scores from 10 repeated simulations using the various modeling techniques. The F1 score of each method is relatively stable across the simulations. CNN, Naive Bayesian, and VEC show poor classification performance, with F1 scores below 65%. Under BiLSTM or LSTM, which are suitable for sequence modeling, the F1 score rises above 90%.

Figure 18 shows the average precision, recall, and F1 of 10 repeated simulations with the various modeling techniques. The proposed method has a precision of 95.82%, a recall of 95.28%, and an F1 of 95.55%. Additionally, the F1 scores of BiLSTM and LSTM are superior to those of TF-IDF, Naive Bayesian, and CNN, and VEC shows the worst classification performance. A specific analysis is as follows: (1) The LSTM model, which is marginally worse than the BiLSTM model, can only transmit information from front to back and cannot capture bidirectional information in the operation sequence. (2) TF-IDF performs worse than the LSTM model because it can only learn the term frequency and inverse document frequency features of the operation sequences and cannot learn time series features. (3) The one-hot matrix is a two-dimensional matrix composed of 0s and 1s, from which CNN can automatically extract features. However, the classification performance of CNN is lower than that of LSTM because CNN extracts only local information, which cannot meet the requirement of learning the association between preceding and following subsequences. (4) Naive Bayesian uses the Bayes formula to calculate the posterior probability from the known prior probability of the research object. Its classification performance is poor because the dataset does not satisfy the independence assumption. (5) Word2vec converts an operation item into a low-dimensional real vector. Although VEC avoids the dimensionality curse caused by the input of a high-dimensional one-hot vector, it primarily contains the semantic information of operation items and not the behavioral information. Therefore, this method is completely unsuitable for modeling flight intent.

7. Discussion of Single-Pilot Intention Modeling

The proposed intention modeling concept can support single-pilot operation. A fully adaptive approach to human-machine coordination requires robust modeling and detailed analysis of the single-pilot intention. Confirming the single-pilot cognitive status may require integration between the flight simulation system DECO and crew monitoring. This model's simulation output enables a thorough examination of single-pilot task intents. The analysis of task intents demonstrates differences in task complexity and automation assistance requirements under different scenarios. Thus, the results of intention recognition could serve as the basis for a cockpit automation system to optimize human-machine coordination and enhance cooperation based on the pilot's intention, particularly when a single pilot's behavior or intention is abnormal. However, the real scenario of single-pilot intent modeling is often more complex than the ideal one, and the impact of real-time model performance and multimodal input on system development remains to be discussed. The real-time dynamic recognition of the single pilot's intent is instructive for predicting and correcting aviator action errors. The cockpit automation system must stand ready to offer additional safety recommendations with reference to risks/hazards and an effective action plan. Because of the time delay of sampling an entire sequence, the real-time intention perception of the BiLSTM model lags slightly behind the single-pilot action/sequence change. Nonetheless, even in certain emergency operating procedures or under difficult operating conditions, a minor delay in intention information does not typically cause severe consequences. Thus, the proposed model can still be applied to rapidly check and refresh the potential intents, thereby satisfying the real-time and effectiveness requirements of intent recognition. Further, careful attention must be given to the techniques and measures applicable to identifying interaction, which involves assessing the crew cognitive status via multimodal input such as touch, gesture, voice, and eye tracking. Gaze behavior can directly reflect the pilot's attention allocation and plays a vital role in improving the intention inference capability. In the future, we plan to append eye-tracker data to the operation sequence input, which may contribute to an increase in the classification rate but will require extra development.

8. Conclusion

This study proposes a method for modeling a single pilot's intention in commercial aviation based on BiLSTM. To identify single-pilot intentions, we designed a set of experiments to extract single-pilot operation sequences from the records of the flight simulation system DECO. More unsafe decision intentions or manipulation behaviors can be discovered through the mining and analysis of these single-pilot behavior data, which is significant for enhancing the pilot-automation partnership. Additionally, the BiLSTM model considers the time-dependent data and can determine the intention classification using a feedforward neural network and the softmax transformation. The above research findings reveal, to a limited degree, the coupling mechanism between a single pilot's behavior and intention. Compared with the other five single-pilot intention modeling methods, the proposed model achieves a better single-pilot intention identification effect. Theoretically, however, flight intents can also proceed in an alternating manner, in which the single pilot interrupts one flight intent, executes another, and then resumes the previously performed flight intent. When the intent within an operation sequence migrates in this way, it will not only affect the reasoning logic of the model itself but also cause identification confusion. Addressing these issues will be important future work.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This paper was sponsored by the National Key Research and Development Program (2021YFB1600600), the Tianjin Education Commission Scientific Research Project (2022KJ058), and the Fundamental Research Funds for the Central Universities (3122022044).