Abstract

The ionospheric state is becoming increasingly important to forecast for the reliable operation of terrestrial and space-based radio-communication systems which are influenced by ionospheric space weather. In this study, we have investigated and tested a multivariate long short-term memory (LSTM) deep learning model for its forecasting accuracy over different latitudinal regions during the solar quiet and solar active years. We also tested its prediction capability during the occurrence of a geomagnetic storm. Four stations qaq1 (60.7°N, 46.04°W), baie (49.18°N, 68.26°W), mas1 (27.76°N, 15.63°W), and bogt (4.64°N, 74.08°W) in the northern hemisphere were used in this study. To optimize the feature extraction process, we used heat map to find the correlation between TEC and the various exogenous parameters and finally nine correlated parameters were used as inputs to train the LSTM model. The performance of the LSTM model was validated by comparing it with the multilayer perceptron (MLP) machine learning algorithm using root mean square error (RMSE) and mean absolute error (MAE) as evaluation indices. The results showed an accuracy improvement of 70% and 64% over MLP during the solar quiet and active years, respectively. The prediction accuracy of our LSTM model was also 74% better than MLP during the geomagnetic storm event. These findings demonstrate the effectiveness of the developed LSTM model and the right selection of the exogenous parameters in estimating TEC, and suggest that this LSTM model can be used for short-term TEC forecasting.

1. Introduction

The ionosphere, found in the Earth’s upper atmosphere, plays a crucial role in the propagation of radio signals. Variations in the density of electrons in the ionosphere can impact the speed and delay the radio signals traveling from Global Navigation Satellite System (GNSS) satellites to receivers, ultimately decreasing positioning accuracy. The Total Electron Content (TEC) parameter is directly proportional to the influence of the ionosphere on satellite signals. The TEC causes delay in the radio waves traversing through the ionosphere. Thus, it is a major source of error in navigation systems like GNSS. Hence, prediction of ionospheric TEC is important in radio communications, radar systems, navigation and positioning systems. Forecasting the TEC successfully will help in correction of positioning errors caused by the ionosphere [1]. Various physical processes, such as solar radiation, geomagnetic storms, and atmospheric tides, can significantly impact the ionosphere, leading to substantial variations in TEC values. So, there is a need to comprehend the spatiotemporal variations in TEC and develop robust global and regional TEC models [2]. This requirement has been emphasized by experts in this field and has become a critical research topic in ionospheric studies. Thus, this research paper aims to investigate the parameters which cause significant variations in TEC and develop accurate model to improve the prediction of ionospheric conditions for the reliable functioning of radio-communication systems.

Ionospheric TEC forecasting methods can be classified into empirical methods, statistical methods and machine learning methods. Empirical models describe the state of the ionosphere as a function of latitude, altitude, solar cycle, day of the year, season, geomagnetic activity etc. The traditional ionospheric error mitigation approaches like the International Reference Ionosphere (IRI) model [3], the Klobuchar model [4], the NeQuick model [5] show limitations during complex ionospheric dynamics. Statistical methods like autoregressive [6], autoregressive moving average (ARMA) [7], autoregressive distributed lag (ARDL) model [8] have been developed in the past for forecasting regional short-term ionospheric TEC. Over the years, several neural network models have been developed for prediction of TEC and various related parameters at regional levels. These machine learning models use various algorithms to map nonlinear and complex relationship between input and output. Neural network [911], wavelet-based ANN [12], support vector machines [13], nonlinear radial basis function [14], and genetic algorithm-based neural network [15] are a few among them. However, these models fail to consider the time sequential feature of TEC. In deep learning, recurrent neural networks (RNNs) are very effective in modeling sequential data but vanishing and exploding gradient problems are common issues in these deep neural networks. LSTMs are a special type of RNN’s which can model long range temporal dependencies due to the presence of memory cells. The ability to learn long-term dependency in time series prediction makes LSTM models more powerful than any other type of neural networks. Several studies have analyzed and compared the performance of LSTM prediction models for Total Electron Content (TEC). LSTM was shown to outperform ARIMA and seq2seq models during a magnetic storm [16]. It was found that LSTM had the highest accuracy in predicting ionospheric delay, compared to NN and IRI models [17]. LSTM had better prediction accuracy than MLP under both quiet and stormy geomagnetic conditions [18]. The prediction of TEC using LSTM was better than back propagation (BP) at midlatitude station during different solar conditions [19]. These studies collectively provide evidence that LSTM performs better in TEC prediction compared to other neural networks and existing empirical models. However, the performance of the TEC prediction models at different latitudes and during different solar conditions needs further study. The current paper aims to investigate the performance of TEC prediction model in high latitude, midlatitude, low latitude, and equatorial region. Although literature suggests different exogenous parameters that could be used for TEC prediction, none of the literature studies have found out the correlation between TEC and other exogenous parameters which could be used for better prediction accuracy. This study focuses on determining the exogenous features which correlate well with TEC using heat map and Pearson correlation coefficients and using them for TEC prediction. Our model is tested for prediction accuracy during low solar activity year 2008, high solar activity 2014 and during the occurrence of geomagnetic storms. Additionally, this study also compares the prediction results of LSTM with MLP at different latitudinal regions during different solar and geomagnetic conditions.

The contributions of the present study are as follows:(i)A multivariate deep learning LSTM model for prediction of ionospheric TEC is developed and its performance is compared with Multilayer Perceptron (MLP).(ii)Using heatmap and Pearson correlation coefficients, the parameters that correlate well with TEC are determined and used as exogenous parameters for the proposed model to improve the accuracy of TEC prediction.(iii)Performance of the model is examined for high latitude, midlatitude, low-latitude, and equatorial region and during different solar conditions such as solar quiet year (2008) and active year (2014) and during the occurrence of geomagnetic storm event (2011).(iv)The simulation results prove that the proposed LSTM model achieved better performance than multilayer perceptron at all times and for all regions.

2. Materials and Methods

2.1. Input Parameters for the LSTM Model

In the present study, ionospheric TEC data along with other exogenous parameters are used. The TEC data sampled at every one hour was downloaded from IONOLAB site [20]. The major parameters that cause ionospheric TEC variations are the local time, season, solar and geomagnetic activities [21, 22]. The solar parameters that affect TEC are solar radio flux F10.7 and sunspot number (SSN). Planetary index (Kp, Ap) and Disturbance storm time index (Dst) are used as geomagnetic parameters in the study of ionospheric TEC variations [2326]. Also, the interplanetary magnetic field (IMF) data and along with plasma speed (Vp) and proton density (Np) have shown to yield better results in prediction of TEC maps [27]. Thus, the literature suggests various exogenous parameters that can be used for better prediction of TEC. To select the right exogenous parameters which could be used for TEC prediction, we plotted the correlation heat map which shows the correlation of the exogenous parameters with TEC (see Figure 1). Correlation coefficients quantify the relationship between variables. A heat map provides a visual representation of these correlations, making it easier to identify patterns and relationships between variables. Strong correlations may indicate a potential relationship or dependency between variables. This can help identify which variables are most influential or redundant in the dataset, allowing us to make informed decisions about feature selection. The Pearson correlation coefficients are plotted in Figure 2. Based on the correlation values we selected the input parameters for the LSTM model. The exogenous parameters, which show strong positive correlation with TEC are Np, Ap, and Kp with correlation coefficient greater than 0.6. and F10.7 show moderate positive correlation with correlation coefficient greater than 0.35. Dst, Vp, and show a moderate negative correlation with correlation coefficients in the range −0.63 to −0.41 whereas time shows a mild negative correlation with correlation coefficient −0.24. Although, time showed a mild negative correlation still it was used as one of the exogenous parameters since TEC exhibits diurnal characteristics. Auroral Electrojet (AE) shows very low negative correlation (−0.075) and hence wasn’t used in this study (see Figures 1 and 2).

Table 1 shows the exogenous parameters used along with their units and symbols. All these parameters were obtained from NASA OMNI website and were sampled after every 1 hour to maintain consistency with TEC data [28].

A brief description of these exogenous parameters is given as follows:Solar flux F10.7: TEC tends to be lower during periods of low solar activity and higher during times of high solar activity [25]. This is because solar activity affects the ionosphere, which in turn affects TEC. One commonly used measure of solar activity is the F10.7 solar flux, which is directly linked to the number of sunspots, as well as the levels of ultraviolet and visible solar radiation. Therefore, we use F10.7 since it is a reliable predictor of solar activity levels.Disturbed storm index Dst: The intensity of geomagnetic storms is often measured using the Disturbed storm index, also known as the Dst index. This index quantifies the deviation of the earth’s magnetic field from its normal variation during a quiet day. Under normal conditions, the Dst index typically ranges from 0 to −50 nT. However, during intense geomagnetic storms, the index can drop to below −200 nT [26].The Interplanetary Magnetic Field has two components that are particularly relevant for understanding its impact on the ionosphere: The east-west (By) component and the north-south (Bz) component [26, 27].IMF By: The By component of the magnetic field is oriented perpendicular to the plane of the Earth’s magnetic equator and is responsible for the formation of the equatorial ionization anomaly (EIA). The EIA is a region of enhanced ionization that forms over the magnetic equator and is responsible for the peak in ionospheric total electron content (TEC) observed in the daytime. During periods of strong fields, the EIA can extend to higher latitudes, leading to an increase in TEC at these locations as well.IMF Bz: The Bz component of the magnetic field is oriented parallel to the earth’s magnetic axis and is responsible for the formation and behavior of high-latitude ionospheric structures, such as the auroral oval. During periods of southward (negative) Bz, the solar wind can penetrate deeper into the Earth’s magnetosphere, leading to enhanced ionization and TEC at high latitudes.Plasma speed, Vp: This refers to the speed at which plasma, a gas consisting of ions and free electrons, moves in the ionosphere. The plasma speed in the ionosphere is affected by a variety of factors, including the earth’s magnetic field, solar activity, and the density of the plasma. When the plasma speed is high, it can cause irregularities in the ionosphere, which can lead to variations in TEC [27].Proton density, Np: Np refers to the number of positively charged hydrogen ions in the ionosphere. Proton density is also affected by various factors, such as solar activity and the earth’s magnetic field. When the proton density is high, it can increase the ionization of the atmosphere, which can cause an increase in TEC [27].Planetary 3-hour range index Kp: This index is a measure of the geomagnetic activity that is obtained by averaging the standardized K-index from 13 observatories located between 44° and 60° latitudes in both the hemispheres. This index serves as an indicator of the overall level of geomagnetic activity, with higher values indicating more severe activity [25].Planetary index Ap: The Ap index is a measure of the strength of geomagnetic activity that ranges from 0 to 400. Higher values of Ap index correspond to more intense geomagnetic activity [25].Time of the day: The ionospheric TEC exhibits a daily pattern where the electron density increases gradually to reach its peak at noon, then decreases to its lowest point at midnight. The sine and cos functions, and , where local time, , ranges from 0 to 23, help in normalizing the time input in the range (−1, 1) [25].

To study the impact of latitude on TEC prediction, we used four stations in the northern hemisphere each corresponding to a different latitudinal region. The stations used along with their latitudes and longitudes are shown in Table 2. We used TEC data from 1st–31st July, 2008 corresponding to the low solar activity year, 1st–31st July, 2014, corresponding to the high solar activity year and 1st July to 31st October, 2011, for the geomagnetic storm event.

2.2. Analysis of the LSTM Model

The LSTM model is made of multiple LSTM cells. Each LSTM cell has a forget gate, input gate, memory cell and output gate. The flow of information at any timestep in the LSTM cell is controlled by three gates, namely, the forget gate, input gate, and the output gate [1619]. The internal structure of each LSTM cell is shown in Figure 3.Forget gate: The previous hidden state, , and the current input data, , are fed into a forget gate, which is a neural network. This network (which uses a sigmoid function) is trained such that for irrelevant input it outputs close to 0 and closer to 1 when the input is relevant. The output of this gate is . These values are then point wise multiplied with the previous cell state to yield . By performing a pointwise multiplication, the cell state components that the forget gate network has identified as irrelevant will be multiplied by a value close to zero. As a result, these components will have minimal influence on the following stages of the process. The forget gate is represented mathematically by (1) and (2).Store/Update the current cell state: The input gate decides which new information should be added to the network’s long-term memory (cell state). The process involves two neural networks: the memory cell and the input gate. The memory cell uses the activation function to generate a new candidate vector that determines how much each component of the long-term memory should be updated based on the new input data. The input gate utilizes the sigmoid activation function to selectively determine which information from the input should be stored in the new memory vector. The outputs and are pointwise multiplied to retain only the relevant information. The resulting combined vector, , is then added to the cell state, , to update the long-term memory of the LSTM network, . This way, the network can selectively store and update relevant information while discarding irrelevant information.Output gate: The output gate decides the new hidden state . We first pass the cell state through to force the values in the interval [−1, 1]. The output gate with the sigmoid function determines what goes from this LSTM cell to the output.where W denotes the different weight matrices with the connections of each weight matrix indicated by its indices and b denotes the bias terms of each of the fully connected layers. Equations (1) to (8) mathematically represent the internal operations occurring in each of the LSTM cell.

2.3. Implementation of the LSTM Model

We built the LSTM model using Keras and TensorFlow deep learning libraries in Python. The flowchart for the development and implementation of proposed methodology is shown in Figure 4.

The various steps performed to build, train, and test the model are as follows:Step 1: Data PreparationFirst, we read the .csv file and find if there are any missing values. Next, to eliminate the influence of different measurement scales and range differences among the variables, the entire dataset is standardized using the minmax scaler between (0, 1). We then convert the time series into a supervised learning problem by using a sliding window algorithm [2]. In the sliding window approach, we have used a fixed-size window that slides over the time series data, and the model is trained on each window sequentially. At each time step, the input sequence to the LSTM model is the past observations within the window, and the output is the next value in the time series. The window is then moved by a fixed step size and the process is repeated until the end of the time series. Consider the time series data of length , window size , and the step size 1 (see Figure 4). We frame instances of length containing input and output datasets to train the LSTM model (see Figure 5). The first W samples of each instance is considered as training input data and the sample is the target data. In our algorithm, we have used past 24 values in each window as the input and the next value as the output. The step size is taken as 1. For predicting the future TEC values, the model is trained to predict for one next time step and then the state of the network is updated after every prediction. For training, validation, and testing the model for solar quiet and active years, we have used 1 month data each. The TEC data is sampled after every one hour, thus the total number of TEC samples is 744. The total number of exogenous parameters used are 10 and these are also sampled after every one hour. Thus, the total number of samples used for exogenous data equals 7440. For training, validation, and testing the model for the geomagnetic storm event, we have used 4 months data. Total number of TEC samples is 2952. The total exogenous parameters used are 10 and total number of samples used for exogenous data = 29520. This data is partitioned into 70-14-16 split. 70% of the data is used for training the model, 14% is used for validation, and then the model is tested on 16% of the data.Step 2: Design and fit the modelThe LSTM model consists of an input layer, one LSTM layer, dense (fully connected layer), and an output layer (see Figure 6). The input layer inputs the past TEC data along with the selected exogenous parameters to the LSTM layer. The model consists of a single layer LSTM with the 100 LSTM cells. The fully connected dense layer with one output unit produces a single TEC value as the final prediction. The model is trained such that it learns to predict the next time step for each iteration of the input time step. The network optimizer is chosen as Adam and the loss function is MSE. The number of epochs is set to 100, and validation loss value is monitored by the early stopping method, where patience is set to 10. The dense layer (fully connected layer) uses the Leaky ReLU activation function to avoid the dead neuron problem.Some important hyperparameters for LSTM models include the number of LSTM layers, the number of LSTM cells per layer, the learning rate, the batch size, and the dropout rate [29]. In this work, the LSTM network with 100 LSTM cells has been used to achieve an adequate LSTM model’s performance. We have used the hyperparameter grid search method for hyperparameter tuning and selecting the optimal values of batch size, dropout rate, and number of epochs. In addition, the Adam solver, a step-descent algorithm with a variable learning rate of 0.001 and the drop rate of 0.2 for 100 epochs, is considered (see Table 3).Step 3: Forecast TECThe model predicts one TEC value at a particular time step, and then the network state is updated. Thus, for each future prediction, it uses the past predicted value as an input. Since, the entire data were standardized to [0, 1] before training the model, we do the inverse normalization to recover the actual data. We calculate the MAE and RMSE. We also plot the observed and predicted TEC waveforms for the test data. The plot of training and validation loss v/s number of epochs shows that the model fit is exact and the model can be used for forecasting (see Figure 7).

The validity of our model’s results could not be compared to the existing literature in a meaningful way due to the significant variations in methodologies used across studies, including differences in station selection, amount of training data used, and the solar quiet and active years used. So, we simulated an MLP NN model for the same stations during the same solar quiet and active years, and we compare our results with the MLP NN model.

2.4. Implementation of MLP NN Model

A multilayer perceptron (MLP) is an artificial neural network that consists of an input layer, one or more hidden layers, and an output layer. Input features are normalized and passed through the network where each hidden layer applies an activation function to the weighted sum of inputs. The output of each hidden layer is passed to the next layer until the final output is produced. The weights of an MLP are updated during training using an optimization algorithm such as the stochastic gradient descent, based on the difference between the predicted and actual outputs. The objective of training is to minimize the error or loss function while avoiding overfitting or underfitting. MLP can handle nonlinear relationships between inputs and output, but the number of hidden layers and neurons can significantly affect performance. Choosing the optimal hyperparameters, such as hidden layer size and , is critical. In our algorithm, hls can take four different values, (50), (100), (50, 50), and (100, 100), while fa can take three different values, (“relu,” “tanh,” and “logistic”). The optimal hyperparameters are determined by performing a grid search to evaluate all possible combinations of hyperparameters and selecting those with the best performance metrics, such as RMSE, MAE, and R2. Table 4 shows the optimized hyperparameters selected for both the solar quiet year 2008 and the solar active year 2014.

Using these hyperparameters, we train and later test the MLP NN model.

2.5. Evaluation Metrics

To evaluate the robustness and effectiveness in predicting TEC, we use RMSE and MAE as the metrics. RMSE helps in assessing the accuracy of forecasting models, as it measures the square root of the average squared differences between the predicted and actual values. On the other hand, MAE provides the average of absolute errors, which can better indicate the magnitude of the errors in the predicted values. Therefore, using both RMSE and MAE provide a more comprehensive evaluation of the forecasting models.wheret and are the actual and the predicted TEC values and is the total number of observations.

3. Results and Discussion

3.1. For the Solar Quiet Year 2008

The original and predicted TEC waveforms for the period from 28th July to 31st July, 2008, are plotted in Figure 8. Figures 8(a)8(d) show the original and the predicted TEC for the qaq1, baie, mas1, and bogt stations in the high-latitude, mid-latitude, low-latitude region, and the equatorial region, respectively.

Figures 8(a)8(d) clearly show that that the predicted TEC is very close to the original TEC most of the times except for small deviations when TEC reaches the maximum or minimum values. The predicted TEC waveform, most of the times, exactly overlaps the original TEC for all the stations during the solar quiet year 2008. We tested the LSTM model performance by computing RMSE and MAE (see Table 5).

The average RMSE and MAE values are 1.038 and 0.719 TECU, respectively. The results indicate an accuracy improvement of 70% (using RMSE metric) and 76% (using MAE metric) over MLP.

3.2. For the Solar Active Year 2014

The original and predicted TEC waveforms are plotted in Figure 9 for the period from 28th July to 31st July, 2014. Figures 9(a)9(d) show the original and the predicted TEC for the qaq1, baie, mas1, and bogt stations in the high-latitude, midlatitude, low-latitude regions and the equatorial region, respectively.

Figures 9(a)9(d) show that that the predicted TEC is very close to the original TEC for high latitude (qaq1) and midlatitude (baie) stations compared to the low-latitude (mas1) and equatorial (bogt) regions. This is because the ionosphere is more intensely ionized near the equator during solar active years. We also tested the model performance for the solar active year (2014) by computing RMSE and MAE (see Table 6).

The average RMSE and MAE are 2.656 and 2.111 TECU, respectively, which is much lower than that predicted by MLP. The results indicate an accuracy improvement of 64% (using RMSE metric) and 66% (using MAE metric) over MLP. The comparison of the MLP and LSTM models using RMSE and MAE evaluation metrics is shown in Figures 10 and 11, respectively.

As expected and as shown in Tables 5 and 6 and Figures 10 and 11, the RMSE increases with decreasing latitude, indicating better prediction at higher and midlatitude regions compared to lower and equatorial regions for both the solar quiet and active years. The ionospheric TEC variations over the low-latitude regions are difficult to model and predict due to the equatorial ionization anomaly [17]. The presence of more sunspots on the surface of the sun during an active solar year causes an increase in the amount of solar radiation that reaches the earth, resulting in a higher TEC in the ionosphere. This phenomenon is particularly noticeable at the equator, where the ionosphere is most intensely ionized. During years of high solar activity, such as those with more solar flares and coronal mass ejections, there may be temporary disruptions in the ionosphere, leading to short-term increase in TEC. As a consequence, the RMSE and MAE also tend to be higher during such high solar activity periods (see Figures 10 and 11). This finding is consistent with the research in this field.

3.3. During Geomagnetic Storm Events

The year 2011 experienced 3 geomagnetic storms with a minimum Dst of less than −100 nT [30]. The date of occurrence of these storms along with the corresponding Dst values are shown in Table 7. We tested our LSTM model for prediction of TEC during the geomagnetic storm at the midlatitude baie station.

The geomagnetic storm with Dst values −115 nT, −118 nT, and −147 nT on days of the year (DOY) 218, 269, and 298 caused sudden fluctuations in TEC (see Figure 12) [30]. We used the data from July 1st, 2011, to October 31st, 2011, which included these three storm events. The data from July 1st, 2011, to October 12th, 2011, was used to train the LSTM model, and then it was tested for unseen data from October 13th to October 31st, 2011, which included a strong geomagnetic storm with Dst = −147 nT.

Figure 13 shows the original and predicted TEC by our LSTM model during the occurrence of the geomagnetic storm during the period from 13th October to 31st October, 2011. This includes a very strong storm with Dst = −147 occurring between 24th and 25th October, 2011. Before the occurrence of the storm, TEC increases steadily, but during the storm event, there is a sudden decrease in TEC, which is correctly and accurately predicted by our LSTM model (see Figure 13). The periodic variation is also accurately captured by the LSTM model.

Table 8 shows the comparison of the MLP and LSTM model during the occurrence of the storm event. The results indicate that the accuracy of our LSTM model is 74% better (using RMSE metric) and 77% better (using MAE metric) than MLP.

Compared to MLP, LSTM has predicted TEC with more accuracy (see Tables 5, 6, and 8). This is because LSTM has a memory cell, which allows the LSTM model to learn and maintain information about past inputs over longer periods of time. This is especially important for time series data, such as TEC, since it exhibits diurnal characteristics. MLPs do not utilize history information, so they may struggle to capture long-term dependencies in the data. As a result, MLPs fail to predict accurately in cases where the data is noisy or turbulent, such as in the case of TEC data, particularly during solar active years and during the occurrence of geomagnetic storms.

4. Conclusion

We propose a multivariate deep learning LSTM model for the prediction of TEC data. We determined the correlation of TEC with other exogenous parameters and used the ones that correlated well with TEC. We validated the performance of our model at different latitudinal regions during the quiet and active solar years. We also tested our model during the occurrence of a geomagnetic storm event. LSTM is a powerful tool for TEC prediction because it is capable of capturing long-term dependencies, handling nonlinear patterns, dealing with variable-length sequences, and mitigating the vanishing gradient problem, thus maintaining accuracy and performance while dealing with time series TEC prediction. The proposed LSTM model has better performance accuracy than the classic MLP during the solar quiet year, the solar active year, and also during the occurrence of the geomagnetic storm. The RMSE and the MAE values obtained using the LSTM model are found to be very low, suggesting that this method could be well-suited for forecasting the ionospheric TEC at all times and for all latitudinal regions. In the future, we wish to develop a global deep learning model that can capture both the spatial and temporal features of TEC data and can help in the prediction of TEC data covering a larger latitudinal and longitudinal range.

Data Availability

The TEC data used in this paper were downloaded from IONOLAB (https://www.ionolab.org) site. All the exogenous parameters were obtained from NASA OMNI website: https://omniweb.gsfc.nasa.gov/form/dx1.html.

Conflicts of Interest

The authors declare that they have no conflicts of interest.