Abstract

Electroencephalography (EEG) is a reliable method for identifying the onset of sleepiness behind the wheel. However, using EEG for driving fatigue detection still presents challenges in extracting informative features from noisy signals. Because their massively parallel computation resembles how the brain processes information, neural networks have been explored as a way to extract relevant information from EEG data. Existing machine learning frameworks suffer from high computing costs and slow convergence, both of which reduce classification accuracy and efficiency because of the large number of hyperparameters that need to be tuned. Micronap detection must be automated before it can be used in real-time scenarios. To distinguish between micronap and non-micronap states, this research develops a deep neural network (DNN) framework that takes different EEG representations as input: cleaned EEG as a time series, the log-power spectrum, a 2D spatial map of the log-power spectrum, and raw EEG. Traditional machine learning algorithms are also evaluated for their effectiveness in detecting micronaps from these EEG inputs. The findings suggest that micronap detection can be greatly improved by combining cleaned EEG with a DNN.

1. Introduction

In many professions and everyday activities, the ability to sustain concentration is essential. Its importance increases when a process or system is semiautomated, and automation has permeated all professions in today’s world. Even though maximum blood alcohol levels and the enforcement of speed limits have recently decreased, traffic accidents continue to be a major source of fatalities and injuries [1]. Drivers involved in these incidents are frequently unaware, at the time of the accident, of the brief period during which they lost awareness or concentration. Postcrash adrenaline can also mask their fatigue. A micronap is an unintentional, brief spell of unconsciousness that can last up to 30 seconds. Micronaps can be distinguished from tiredness by a complete lack of visuomotor responsiveness together with partial eye closure. Micronaps can occur without warning, and a person might not be aware of having experienced these light sleep stages [2].

Micronaps become more likely with physical exhaustion, mental exhaustion, circadian rhythm problems, and boredom from repetitive work. However, such lapses are possible even in subjects who are not sleep deprived and who perform repetitive activity without warning signs such as fatigue [3]. Drivers involved in such collisions are often unaware of the short period during which they lost consciousness or concentration, or of how postcrash adrenaline masked their fatigue. This raises major safety issues, especially for those in high-risk occupations such as driving, flying, navigating, traveling by boat, and process control, all of which call for consistent, uninterrupted visuomotor function. Accurate detection of approaching micronaps therefore has the potential to prevent fatal mishaps and save lives. Micronaps are challenging to detect because they occur suddenly and unintentionally [4].

Machine learning has been used in several studies to identify micronaps in EEG data. These studies were primarily concerned with developing a model for identifying micronaps and testing different feature extraction, selection, and reduction strategies. It is challenging to preserve temporal relationships when such algorithms take the electroencephalogram (EEG), a multivariate and dynamic time series signal, as input. Additionally, since the features provided to machine learning algorithms are hand-picked by algorithm or system designers, these algorithms have difficulty capturing the dynamics of micronaps due to selectivity invariance. Recent innovations such as deep learning (DL) algorithms, which can extract, analyze, and capture information from unstructured data, offer a promising answer. With DL, greater focus can be put on model design to enhance the effectiveness of micronap detection [5, 6].

The performance of the model is significantly influenced by feature selection. Features that are derived from data using a specific manual process based on expert knowledge are referred to as handcrafted features. This can limit how well the characteristics and signals present in the data are represented. In contrast, learned features are derived from a dataset using a training procedure to achieve a certain goal (e.g., gender recognition). The objective of this article was to create a dependable technique for detecting micronaps from brain activity that could be used as the foundation for a real-time system that would warn the subject of his or her condition and avert a deadly accident.

The main contributions of this article are as follows:
(1) To design a deep neural network (DNN) to classify the micronap and nonmicronap classes
(2) To use various EEG representations as input to the designed DNN and analyze its performance

The contribution of real-time data to the identification of micronaps is another important finding in this article. After this introduction, Section 1.1 provides a survey of the literature. The data, an illustration of the micronap detection system, the validation procedures, and the performance metrics are described in Section 2. Results for the various EEG input formats are presented in Section 3 and compared with traditional machine learning approaches in Section 4, followed by the conclusion in Section 5.

1.1. Related Study

This section reviews key concepts in EEG-based research on sleep, exhaustion, and sleepiness, together with the forms of lapsing and several micronap detection studies. Doing so requires an understanding of the nature and properties of the EEG information associated with brain activity in various contexts [7], which is also needed to justify the choice of EEG for detecting micronaps. An electroencephalogram (EEG) is a measurement of electrical potentials that reflects the activity of the human brain. It is a simple test that shows how brain activity changes over time [8]. EEG is frequently used by medical professionals and researchers to study brain activity and identify neurological problems, and modern research in many fields depends on it. In medicine, it can be used to assess brain death, the severity of a stroke or head injury, epileptic activity, sleep issues, and many other conditions. It is also useful in studies of cognitive processes such as memory or attention, as well as in linguistic and clinical studies such as those on aphasia [9, 10].

Experts have to spend a lot of time visually inspecting recordings in order to score the stages of sleep. Automatic classification of sleep stages is therefore preferred for diagnosing and treating sleep-related diseases. A response lapse is a period of time during which a person is unable to respond to a continuous task. Depending on the underlying cognitive systems, there are several types of lapses. Some lapses delay rapid responses, while others lead to incorrect responses. Some may result in complete sensory-motor failure. A brief interruption that causes a delayed or absent response in the main task without making the person unconscious is called an attention lapse [11]. In certain situations, a person can unconsciously continue a secondary task such as walking, looking, or driving. In contrast, micronaps involve an unwanted, sleep-related loss of consciousness. The individual enters a stage of light sleep during this brief (up to 30 s) period. The behavioral signs of micronaps include head nodding, lack of facial expression, and partial eye closure. Sleep is defined as a period of inactivity lasting more than 30 seconds [12, 13].

The current benchmark for micronap state identification, which reflects the top outcomes for micronap state detection on unpruned data, was attained using a range of classifiers and an ensemble of features. Despite numerous studies on the subject, there is still no technique that performs well enough for use in practical situations [14]. Classification accuracy on EEG data has been tested using a variety of convolutional neural network (CNN) topologies. Deep recurrent convolutional neural networks have been used to learn representations that are efficient and robust to intrinsic EEG noise as well as to inter- and intrasubject variation [15]. To avoid the need for handcrafted features, deep belief networks (DBNs), an unsupervised feature learning architecture, were applied to sleep data [16]. Compared with handcrafted features, the DBN technique improved sleep classification accuracy. Other work identifies seizures using a range of input formats and machine learning algorithms. Numerous EEG-based applications have shown the efficacy of CNNs and recurrent neural networks (RNNs), including the detection of epilepsy, seizures, and different stages of sleep. EEG has been successfully compressed using a convolutional autoencoder (CAE) with the best subspace reduction [17]. The architecture of DNNs permits signal processing techniques to be combined with conventional statistical data. DNNs are also highly extendable and flexible: they can have different numbers of convolutional layers, from single-layer shallow networks to numerous successive convolutional layers [18, 19]. This article uses deep learning to analyze scalp EEG signals in order to identify the states and onsets of micronaps.

2. System Methodology

2.1. Dataset Description

This is the first behavioral and EEG dataset ever collected using deep learning. The main motivation for selecting the deep dataset (https://www.kagggle.com/) for this work’s initial experimentation is that it has been widely used in previous research projects that have reported a number of findings on, and characteristics of, micronaps [20]. As a result, a baseline has been established, especially for the detection of micronaps. Ten healthy individuals between the ages of 20 and 45 were examined. None of the participants had any past or present neurological or sleep issues, and all of them had enhanced visual acuity. In addition, all subjects reported a sound sleep the previous night (mean = 7.35 h, with a minimum of 6 h) and were, therefore, not considered sleep-deprived. Sixteen electrodes (international 10–20 system) were placed on the subjects’ scalps at the following locations: F7, F3, F4, F9, T7, C3, Cz, C4, T8, P7, Pz, P4, P8, and O1 during the EEG capture task at a sampling frequency of 240 Hz [21].

Facial video was captured at a frame rate of 15 frames per second (fps) alongside the EEG. Based on visual indications including head nodding, head jerking, and protracted eyelid closure, the facial video helped identify lapses brought on by micronaps. The gold standard was established from these visual cues together with performance tracking, and this labeling made it easier to validate the training and test data [22, 23]. All subjects took part in the experiment, which took place in a 4 m² space equipped with a PC screen and all necessary devices for EEG data collection. The distance of the subject from the screen was between 75 and 115 cm. All subjects were fitted with an EEG headset, and pretests were performed to ensure the integrity of the entire recording pipeline. An experimenter oversaw the session from a nearby room with a one-way glass window. Before the study began, subjects were instructed to refrain from excessive head movements, eye blinks, and facial muscle contractions. Subjects were also instructed to silently count the number of target flashes to help maintain focus.

2.2. EEG Preprocessing

The EEG signals recorded from each subject’s scalp underwent the preprocessing illustrated in Figure 1, as follows:
(i) The reference was reset to a common average reference in order to improve the signal-to-noise ratio. The EEG data were then band-pass filtered from 0.5 to 1 Hz using a high pass filter.
(ii) The independent component analysis (ICA) components were projected to the calibration data space using its own covariance matrix.
(iii) To take into account the nonstationarity of the EEG, it was divided into 2 min epochs with 50% overlap.
(iv) Calibration data were identified for each epoch and used to clean that same epoch. The cleaned EEG was then obtained by concatenating the cleaned epochs. To prevent discontinuities, the overlapping portions of successive epochs were averaged.
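Steps (i), (iii), and (iv) can be illustrated with a minimal NumPy sketch. It is not the pipeline used in this study: the function names are hypothetical, the ICA-based cleaning of step (ii) is represented only by a placeholder callable `clean_fn`, and filtering is omitted.

```python
import numpy as np

def common_average_reference(eeg):
    """Re-reference every sample to the mean across channels.
    eeg: array of shape (n_channels, n_samples)."""
    return eeg - eeg.mean(axis=0, keepdims=True)

def epoch_starts(n_samples, epoch_len, overlap=0.5):
    """Start indices of epochs of epoch_len samples with fractional overlap."""
    step = int(epoch_len * (1 - overlap))
    starts = list(range(0, max(n_samples - epoch_len, 0) + 1, step))
    # Add a final epoch so that the tail of the recording is covered.
    if starts[-1] + epoch_len < n_samples:
        starts.append(n_samples - epoch_len)
    return starts

def clean_by_epochs(eeg, epoch_len, clean_fn, overlap=0.5):
    """Clean each epoch with clean_fn (placeholder for the per-epoch
    cleaning step), then average the samples covered by several epochs."""
    n_samples = eeg.shape[1]
    total = np.zeros_like(eeg, dtype=float)
    count = np.zeros(n_samples)
    for start in epoch_starts(n_samples, epoch_len, overlap):
        stop = start + epoch_len
        total[:, start:stop] += clean_fn(eeg[:, start:stop])
        count[start:stop] += 1
    return total / count
```

With the 240 Hz sampling frequency reported above, a 2 min epoch corresponds to `epoch_len = 240 * 120 = 28800` samples. Averaging the overlapping portions, rather than simply concatenating, is what avoids jumps at epoch boundaries.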

A behavioral expert assigned each lapse episode to one of four groups, as follows. Type 0: drooping lids and periods of noticeably increased tracking imprecision but a response rate above zero (complete eye closure). Type 1: episodes that persist longer than 500 ms, with drooping eyes and a flat or jumbled response (complete or partial slow eye closure). Micronaps are defined as episodes lasting less than 30 seconds, while naps are classified as episodes lasting over 15 seconds. Type 2: unusual events and forced eye closure. Type 3: no drooping eyes and a flat or incoherent response. Micronap events were transformed into labels at 250 Hz (gold standard). Only Type 1 was employed to learn, recognize, and detect the onsets of micronaps in this investigation, and only the responsive states were used as the gold standard [24–26].

2.3. Deep Neural Network Micronap Detection Framework

Figure 2 depicts the overall micronap detection system, and Figure 3 depicts the proposed DNN framework. As previously noted, the EEG data obtained from the scalp are preprocessed to reduce artifacts and then serve as input to the DNN, which learns to extract suitable features on its own. The first layer specifies the input dimensions. The intermediate layers consist of a series of convolutional layers with ReLU and max-pooling layers inserted between them. The pooling layer subsamples small rectangular blocks of the convolutional layer's output, producing a single output per block. The final stage uses fully connected layers and a softmax layer to classify the patterns.
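To make the layer sequence concrete, the following is a minimal forward-pass sketch of one convolution, ReLU, max-pooling, fully connected, and softmax stage in plain NumPy. It is an illustration only: the filter counts, kernel sizes, and pooling size are assumptions, the weights would normally be learned, and the actual network in Figure 3 may stack several such stages.

```python
import numpy as np

def conv2d(x, kernels, bias):
    """Valid 2D cross-correlation. x: (H, W); kernels: (K, kh, kw); bias: (K,)."""
    K, kh, kw = kernels.shape
    H, W = x.shape
    out = np.empty((K, H - kh + 1, W - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[i:i + kh, j:j + kw]
            # Apply all K filters to the same patch at once.
            out[:, i, j] = np.tensordot(kernels, patch, axes=([1, 2], [0, 1])) + bias
    return out

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, size):
    """Non-overlapping max-pooling: one output per (size x size) block. x: (K, H, W)."""
    K, H, W = x.shape
    H2, W2 = H // size, W // size
    x = x[:, :H2 * size, :W2 * size]
    return x.reshape(K, H2, size, W2, size).max(axis=(2, 4))

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def forward(x, params):
    """Return class probabilities (e.g., responsive vs. micronap) for one input."""
    h = conv2d(x, params["kernels"], params["bias"])
    h = max_pool(relu(h), params["pool"])
    logits = params["W_fc"] @ h.ravel() + params["b_fc"]
    return softmax(logits)
```

Here the input `x` is treated as a single 2D array (for example, channels by time samples), which matches the way a series network receives every EEG representation as a 2D input.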

2.4. Micronap State Detection

Both intrasubject (within and between sessions) and intersubject variance have an impact on the number of micronap occurrences and their corresponding durations. Classification usually requires discrete labels, so the gold standard in this work was discretized, or sampled, at 2.5 Hz. We used a sliding window of duration W to segment the entire 30 minutes of EEG data. Figures 4 and 5 show the EEG during sleep and the typical EEG, respectively. Figure 6 shows how the EEG segments are used as inputs to detect the micronap states at r, where r is the distance between the state under consideration and the end of the window. The window is advanced every 0.5 s until all states of the gold standard have been classified. For r = 0, the classifier decides whether the behavioral state at the end of the EEG window is a micronap or a responsive state. The same holds for r > 0.
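The windowing scheme can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and argument names are hypothetical, label k is assumed to describe the state at time k times the label step, and r is interpreted as a time offset in seconds between the end of the window and the state being classified.

```python
import numpy as np

def segment(eeg, labels, fs, step_s, window_s, r_s=0.0):
    """Pair each gold-standard state with the EEG window ending r_s seconds before it.

    eeg:      array (n_channels, n_samples)
    labels:   one state per step_s seconds; label k is taken at time k * step_s
    fs:       EEG sampling frequency in Hz
    window_s: window duration W in seconds
    r_s:      offset between the window's end and the state under consideration
    """
    win = int(round(window_s * fs))
    X, y = [], []
    for k, state in enumerate(labels):
        end = int(round((k * step_s - r_s) * fs))
        start = end - win
        if start < 0 or end > eeg.shape[1]:
            continue  # not enough EEG available for this state
        X.append(eeg[:, start:end])
        y.append(state)
    if not X:
        return np.empty((0, eeg.shape[0], win)), np.empty((0,), dtype=int)
    return np.stack(X), np.asarray(y)
```

With `r_s = 0` each window ends exactly at the state it is labeled with; a positive `r_s` asks the classifier to decide about a state that lies after the data it has seen.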

2.5. Micronap Onset Detection

Because of ambiguous labels, it is impossible to pinpoint and mark the precise onsets of micronaps. In this article, the onset of a micronap is therefore defined as the first definite micronap state that follows a responsive state. All responsive states and micronap onsets formed the gold standard for onset detection. EEG segments of duration W in the time domain or frequency domain, with or without spatial information, served as the inputs for detecting the micronap onsets at r.
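The onset definition above can be expressed in a few lines. The sketch assumes a label sequence in which 0 denotes a responsive state and 1 a definite micronap state; this encoding and the function names are illustrative assumptions, and any other value (for instance an ambiguous label) is simply never treated as responsive or as an onset.

```python
import numpy as np

def micronap_onsets(states):
    """Indices where a micronap state (1) directly follows a responsive state (0)."""
    s = np.asarray(states)
    return np.flatnonzero((s[1:] == 1) & (s[:-1] == 0)) + 1

def onset_dataset(states):
    """Build the onset gold standard: all responsive states (label 0) plus
    the onsets (label 1). Later states of each micronap are dropped."""
    s = np.asarray(states)
    onsets = micronap_onsets(s)
    keep = np.sort(np.concatenate([np.flatnonzero(s == 0), onsets]))
    y = np.isin(keep, onsets).astype(int)
    return keep, y
```

Because each micronap contributes only one onset regardless of its duration, the positive class shrinks much more than the negative class, which is why onset detection is more imbalanced than state detection.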

2.6. Classification

The classification process used a 0.5-second temporal resolution. Each individual experienced micronaps at a distinct frequency and length, which produced quite diverse imbalance ratios. Table 1 displays the frequency and length of micronaps by subject as well as the imbalance ratios of states and onsets. According to Table 1, all subjects except subjects 5 and 6 show considerable imbalance ratios between micronap states and responsive states, with subject 9 having the largest. The imbalance ratio between the two classes worsens significantly when it comes to onsets.

Table 1: Number of states = 4 Time between events (temporal resolution of 0.5 s).
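As an illustration of how such ratios can be computed from a label sequence, consider the sketch below. The paper does not state its exact formula, so the majority-to-minority definition used here is an assumption, as are the function names. The second helper shows one common way to turn class frequencies into per-class costs for a cost-based error function.

```python
import numpy as np

def imbalance_ratio(labels):
    """Majority-to-minority ratio for binary labels (0 = responsive, 1 = micronap)."""
    y = np.asarray(labels)
    n_pos = int((y == 1).sum())
    n_neg = int((y == 0).sum())
    if min(n_pos, n_neg) == 0:
        return float("inf")
    return max(n_pos, n_neg) / min(n_pos, n_neg)

def class_weights(labels):
    """Costs inversely proportional to class frequency.
    Assumes both classes occur at least once in labels."""
    y = np.asarray(labels)
    counts = np.bincount(y, minlength=2).astype(float)
    return len(y) / (2.0 * counts)
```

For example, a subject with 90 responsive states and 10 micronap states has an imbalance ratio of 9, and errors on the micronap class would be weighted nine times as heavily as errors on the responsive class.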

2.7. Validation and Performance Measures

To evaluate the model’s real performance in classifying the micronap and responsive classes, data from a test subject must be completely hidden from the validation and training processes. In other words, one estimates the model’s behavior on an entirely unseen subject. For this purpose, the planning and experimentation phases of this article employed the following strategy:
(i) Keep one subject out of the total of ten for independent testing.
(ii) Train the deep learning model with the remaining 9 subjects.
(iii) Use leave-one-subject-out cross-validation (LOSO-CV) on these subjects. The deep learning model’s regularization and its hyperparameters (number of filters, size of filters, number of layers, and types of layers) are varied in order to obtain the optimal performance.
(iv) Sweep each layer’s hyperparameters successively through a set of values. The ideal values are selected automatically according to the best area under the curve (AUC).
(v) Repeat steps (i) through (iv) until each of the 10 subjects has served as the test subject.
(vi) Calculate the averages of the performance measures.
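The nested structure of steps (i) to (vi) can be sketched as below. This is a schematic outline rather than the authors' code: `fit` and `auc` are hypothetical callables supplied by the user (train a model on a list of subjects with given hyperparameters, and score a model on one subject), and `candidates` is a hypothetical list of hyperparameter settings.

```python
def loso_splits(subjects):
    """Yield (test_subject, inner_folds); each inner fold is (val_subject, train_subjects)."""
    for test in subjects:
        remaining = [s for s in subjects if s != test]
        inner = [(val, [s for s in remaining if s != val]) for val in remaining]
        yield test, inner

def select_and_test(subjects, candidates, fit, auc):
    """Nested LOSO-CV: pick hyperparameters by mean validation AUC on the
    inner folds, then score once on the held-out test subject."""
    test_scores = []
    for test, inner in loso_splits(subjects):
        def mean_val_auc(hp):
            scores = [auc(fit(train, hp), val) for val, train in inner]
            return sum(scores) / len(scores)
        best_hp = max(candidates, key=mean_val_auc)
        remaining = [s for s in subjects if s != test]
        # The test subject is only touched here, after selection is finished.
        test_scores.append(auc(fit(remaining, best_hp), test))
    return sum(test_scores) / len(test_scores)
```

Keeping the hyperparameter search inside the loop over test subjects is what guarantees that the test subject never influences model selection.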

Evaluation measures serve a dual purpose: they guide the design of classifiers and they assess classifier performance. Most threshold metrics can be understood in terms of the confusion matrix of a binary (two-class) classification problem. Three crucial parameters, sensitivity (Sn), specificity (Sp), and precision (P), can be estimated from this matrix. Beyond these, the geometric mean and phi are among the additional threshold metrics used in this article to assess the model’s test performance. The area under the receiver operating characteristic (ROC) curve and the area under the precision-recall (PR) curve are two further, curve-based measures that are independent of the threshold. The performance of the models is compared using the paired nonparametric Mann–Whitney U test. Every combination of EEG input format and DL model was compared. To identify the best input modalities and network architectures, the performances of various combinations of input modalities and network designs were examined for a given window size and dataset.
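For reference, the threshold metrics can be computed from the four confusion-matrix counts as in the sketch below. The definitions are the standard ones (phi is the Matthews correlation coefficient); taking the geometric mean as the square root of Sn times Sp is an assumption, since the text does not spell out its formula, and the function name is illustrative.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Threshold metrics from confusion-matrix counts (micronap = positive class)."""
    sn = tp / (tp + fn) if tp + fn else 0.0   # sensitivity (recall)
    sp = tn / (tn + fp) if tn + fp else 0.0   # specificity
    p = tp / (tp + fp) if tp + fp else 0.0    # precision
    gm = math.sqrt(sn * sp)                   # geometric mean of Sn and Sp
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    phi = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "P": p, "GM": gm, "phi": phi}
```

With heavily imbalanced classes, a model can reach high Sn and Sp while P and phi remain low, because even a small false-positive rate on the large responsive class produces many false detections.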

3. Experimentation Results

A comparison of performance metrics between various input combinations and DNN was also conducted. The outcomes of detecting the micronap state utilizing the following EEG inputs are shown in this section.

3.1. Cleaned EEG as a Time Series

Different EEG window lengths are utilized to find the best EEG window length and matching model, including 0.5 s, 1.5 s, 2.5 s, 3.5 s, 4.5 s, and 5 s. The cross-validation ROC was used to determine which model was best for each subject. Table 2 indicates that the cross-validation ROC was the greatest for window durations of 3.5 s and 4.5 s. The 3.5 s window was found to be the ideal window length because the 4.5 s window requires more processing for the same performance.

3.2. Log-Power Spectrum

For several subjects, the Maxpool layer performed best in cross validation when the log-spectrum was used as the input, and its size and stride were both 11. This is what would have happened if the layer had not been there. Similar to this, the dropout layer had an effect on the performance of certain subjects. For state detection, the average values for Sn, Sp, P, Phi, ROC, and PR were 0.62, 0.70, 0.15, 0.22, and 0.76, respectively. Figure 7 displays a comparison of performance metrics when the log-power spectrum is the input.
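A log-power spectrum input of this kind can be obtained from each EEG window roughly as follows. The sketch uses a Hann-windowed periodogram, which is an assumption; the paper does not specify its spectral estimator, and the function names and the band limits passed to `band_log_power` are likewise illustrative.

```python
import numpy as np

def log_power_spectrum(window, fs, eps=1e-12):
    """Log10 power spectral density per channel.
    window: array (n_channels, n_samples); fs: sampling frequency in Hz."""
    n = window.shape[1]
    taper = np.hanning(n)
    spec = np.fft.rfft(window * taper, axis=1)
    power = (np.abs(spec) ** 2) / (fs * np.sum(taper ** 2))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # eps avoids log(0) for silent channels.
    return freqs, np.log10(power + eps)

def band_log_power(freqs, log_power, lo, hi):
    """Log10 of the mean power in the band [lo, hi) Hz, per channel."""
    mask = (freqs >= lo) & (freqs < hi)
    return np.log10(np.mean(10.0 ** log_power[:, mask], axis=1))
```

For a 3.5 s window at 240 Hz (840 samples) the frequency resolution is about 0.29 Hz, so a narrow band such as delta is described by only a handful of frequency bins.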

3.3. 2D-Spatial Map of Log-Power Spectrum and Raw EEG

The input dimension was 56×56, where the number four represented a mixture of the frequency bands for delta (δ), theta (θ), alpha (α), and beta (β). There were 4 possible band combinations that could be entered as input. By feeding the individual bands to the DNN, the data contained in the individual bands to help with micronap state detection are also evaluated. The dual combination of bands, which consisted of the bands δ and θ, θ and α, and δ and α, was also investigated in addition to the examination of each individual band. Table 3 shows the average performance metrics.

Because it would include all artifacts, including EOG, raw-EEG was chosen as the time series input to the DNN. Figure 8 illustrates how well state detection works.

4. Machine Learning Approaches

The performance metrics are derived when several types of EEG representations are given as input, trained with SVM, KNN, and LSTM classifiers, and then tested. Table 4 lists the average state detection performance metrics phi, ROC, and PR. There were no appreciable improvements in phi and ROC scores between the classifiers. DNN was used as an end-to-end solution along with EEG, although the increase in phi was just marginal (from 0.59 to 0.62). However, DNN has a superior performance as shown in Figure 9 in terms of sensitivity (Sn) (0.84), compared to SVM (0.78) and LSTM (0.72). When it comes to state detection, the DNN model surpassed the LSTM model (0.52) in terms of PR (0.68). The SVM has outperformed the LSTM in terms of PR (0.62). With EEG as the input, the average performance metric for state detection had the highest values in terms of phi, ROC, and PR. EEG waves may, therefore, be a valuable source of data for identifying onset or status.

Compared with the other EEG input transformations, state detection performance was best when the DNN and cleaned EEG were combined. When identifying micronap states, the δ and θ band combination held important information. A DNN’s filter structure is essential for extracting information from the data: what the DNN represents in the data during its learning phase depends on the filter size. The DNN was employed in this work as a series network, which imposed limitations on the design. Furthermore, regardless of whether the input was a representation in the time domain or the frequency domain, the DNN always interpreted it as a 2D input. To understand the characteristics associated with micronaps and the aspects that matter in the decision-making process, a layer-wise feature analysis must be conducted. In addition, a parallel DNN framework can be created to apply several filter sizes simultaneously to the same input (EEG alone). The network would then be able to learn features from a variety of perspectives.

5. Conclusion

This article summarized the various modalities of EEG inputs and demonstrated the performance of a DNN-based micronap classification system. The representations of EEG employed as input to the DNN model include cleaned EEG time series, log-power spectrum, 2D-map of log-power spectrum, and raw EEG. The hyperparameters and DNN model architectures were tuned. A cost-based error function was used to address the imbalance between the classes during training. LOSO-CV was used to test each model in order to estimate its generalized performance. Sensitivity, specificity, precision, phi, ROC, and PR were the performance metrics used to assess the model. The best average performance for state detection (r = 0) was a phi of 0.24, ROC of 0.64, and PR of 0.68, attained using DNN. Despite reasonably good sensitivity and specificity for onset detection, the average phi was extremely poor. This was caused by a significant number of false detections and a large imbalance ratio between the onsets of micronaps and responsive states. The DNN learned the changes in activity over time between the prefrontal, central, and occipital parts of the brain, as well as in the delta, theta, and alpha bands. To determine whether the raw-EEG signal contained more specific information that the DNN could extract for identifying micronaps, experiments were also conducted with raw-EEG signals without any preprocessing. The purpose of the micronap state detection system is to continuously monitor the level of attentiveness and forecast impending micronaps before they happen. If a micronap state prediction is incorrect, an attempt will be made to identify the subsequent micronap state. The micronap onset detection system, in contrast, works continuously to forecast only the beginning of an impending micronap. However, the entire micronap event is lost if an onset detection is missed; addressing this is left for future work.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.