Abstract

XOR is a special nonlinear problem in artificial intelligence (AI) that resembles many real-world nonlinear data distributions. A multiplicative neuron model can solve such problems. However, the multiplicative model has an inherent backpropagation problem for densely distributed XOR problems and higher-dimensional parity problems. To overcome this issue, we have proposed an enhanced translated multiplicative single neuron model that can provide the desired tessellation surfaces. We have considered an adaptable scaling factor associated with each input in our proposed model. It helps in achieving the optimal scaling factor value for higher-dimensional input. The efficacy of the proposed model has been tested by randomly increasing input dimensions for XOR-type data distributions. The proposed model crisply classifies even higher-dimensional inputs into their respective classes. Also, its computational complexity is the same as that of the previous multiplicative neuron model. It has shown more than an 80% reduction in absolute loss compared to the previous neuron model under similar experimental conditions. Therefore, it can be considered a generalized artificial model (single neuron) with the capability of solving XOR-like real problems.

1. Introduction

Minsky and Papert deduced that the XOR problem requires more than one hyperplane [1]. They provided a more generalized artificial neuron model by introducing the concept of weights and proved the inability of a single perceptron to solve the ‘Exclusive-OR (XOR)’ problem [2]. The XOR problem is analogous to other popular real-world problems, such as XOR-type nonlinear data distributions in two classes and the N-bit parity problem [3]. Therefore, many researchers have tried to find a suitable way to solve the XOR problem [4–15]. However, most of these solutions address only the classical XOR problem; they either use more than one layer or provide a complex solution for the two-bit logical XOR alone. A few of them used the complex-valued neuron model, which effectively creates one more layer (i.e., a hidden layer) because the complex-valued neuron model requires representing the real input in a complex domain. Another approach is based on the multiplicative neuron model: the translated multiplicative neuron (πt-neuron) approach [16, 17]. It modifies the π-neuron model (which generates decision surfaces centered at the origin of the input space) into an extended multiplicative neuron, i.e., a πt-neuron model, which solves the N-bit parity problems by creating tessellation surfaces. However, it has limitations for higher-dimensional N-bit parity problems. It is suitable for up to six dimensions; for seven and higher-dimensional inputs, it has reported poor accuracy [17]. In other words, it has a convergence problem for higher-dimensional inputs. This is simply because of the multiplicative nature of the model: the infinitesimal errors in the model become even smaller after being multiplied across higher-dimensional inputs, and the gradient consequently vanishes. Therefore, a convergence problem occurs in this model for higher-dimensional inputs.

To overcome the issue of the πt-neuron model, we have proposed an enhanced translated multiplicative neuron (πt-neuron) model in this paper. It helps in achieving mutually orthogonal separation in the case of the two-bit classical XOR data distribution. The proposed model has also shown the capability of solving higher-order N-bit parity problems. Therefore, it is a generalized artificial model for solving real XOR problems. To examine this claim, we have tested our model on different XOR data distributions and N-bit parity problems. For parity problems, we have varied the input dimension to obtain higher-dimensional datasets. Our proposed model has neither vanishing gradient nor convergence issues for higher-dimensional inputs, and it has accurately classified the considered datasets. Table 1 presents the list of variables used in this article along with their meanings.

2. Understanding the XOR Problem

XOR is a classical problem in artificial neural networks (ANNs) [18]. The digital two-input XOR problem is represented in Figure 1. By considering each input as one dimension and mapping the digital digit ‘0’ to the negative axis and ‘1’ to the positive axis, the same two-digit XOR problem becomes an XOR-type nonlinear data distribution in two-dimensional space. It is evident here that the classes in the two-dimensional XOR data distribution are the quadrant areas formed by the two axes ‘X1’ and ‘X2’ (here, X1 is input 1, and X2 is input 2). Furthermore, these areas represent their respective classes simply by sign (i.e., the negative areas correspond to class 1 and the positive areas correspond to class 2).

There are many other nonlinear data distributions resembling XOR; the N-bit parity problem is one such typical example. Both of these problems are popular in the AI research domain and call for a generalized single-neuron model to solve them. We have seen that these problems require a model that can distinguish between positive and negative quantities. Interestingly, addition cannot easily separate positive and negative quantities, whereas multiplication has the basic property of distinguishing between them. Therefore, previous researchers suggested using a multiplicative neuron model for solving XOR and similar problems.
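As a toy illustration of this property (our example, not taken from the paper), mapping bit ‘0’ to ‒1 and bit ‘1’ to +1 lets the sign of the product of the two coordinates separate the XOR classes, something a single weighted sum with one threshold cannot do:

```python
# Toy illustration (not from the paper): the sign of a product separates XOR classes.
def xor_class_by_product(x1, x2):
    s1, s2 = (1 if x1 else -1), (1 if x2 else -1)   # map 0 -> -1, 1 -> +1
    return 1 if s1 * s2 < 0 else 0                  # negative quadrants -> class '1'

for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pair, "->", xor_class_by_product(*pair))  # prints 0, 1, 1, 0

# For more than two inputs, the sign of the product alternates with the input
# dimension; this class-reversal issue is addressed in Section 5.2.
```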

3. Translated Multiplicative Neuron (πt-Neuron) Model

The idea of the multiplicative neuron model was initiated by Durbin et al. in 1989 [19]. They named this model ‘Product Units (PUs)’ and used it to deal with generalized polynomial terms in the input. It can learn higher-order inputs more easily than additive units because of its increased information capacity [19]. Though the PU has shown the capability of solving N-bit parity problems, it has issues in training with the standard backpropagation (BP) algorithm, especially for higher-order inputs (more than three-dimensional input) [20]. According to Leerink et al., this is because of trapping in nonglobal minima in the case of higher-dimensional inputs [20]. Later, in 2004, Iyoda et al. proposed a single neuron based on the multiplicative neuron model, known as the πt-neuron model, to solve the XOR and parity bit problems [16, 17]. They modified the earlier multiplicative π-neuron model to find a suitable tessellation decision surface. They incorporated a scaling factor and a threshold value and used the sigmoid as the activation function to solve the N-bit parity problems using a single translated multiplicative neuron (the model is defined by equations (1) and (2)) [16].
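A plausible reconstruction of equations (1) and (2), based on the symbol definitions below and on the gradient expressions quoted in Section 3.2, is

y_{π‒t} = f(z),  with  z = b_{π‒t} · ∏_{i=1}^{N} (x_i + t_i)  and  f(z) = 1 / (1 + e^{‒z}).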

Here, ‘π‒t’ represents the πt-neuron model mathematically, ‘y’ is the final output through the activation function ‘f’, ‘bπ‒t’ is the scaling factor, and ‘ti’ represents the coordinates of the center of the decision surfaces [16]. Mathematically, Iyoda et al. have shown the capability of the model for solving the logical XOR and N-bit parity problems for all N ≥ 1. However, this model also has a similar issue in training for higher-order inputs.

3.1. Limitations of Translated Multiplicative Neuron

The πt-neuron model has shown the appropriate research direction for solving the logical XOR and N-bit parity problems [16]. The reported success ratio is ‘1’ for two-bit to six-bit inputs in [17]. However, in the case of seven-bit input, the reported success ratio is only ‘0.6’. The success ratio has been calculated as the average over ten simulations [17]. Also, successful training in the seven-bit case requires adjusting the trainable parameter (the scaling factor bπ‒t) [17]. This also indicates a training issue in the case of higher-dimensional inputs. Moreover, Iyoda et al. have suggested increasing the range of initialization of the scaling factor in the case of the seven-bit parity problem [17]. Even with the suggested increase, however, the reported success ratio is only ‘0.6’ [17]. This indicates the training problem of the πt-neuron model for higher-dimensional input.

3.2. Causes of Failure in the πt-Neuron Model

In the backpropagation algorithm, the local gradient ‘δ(n)’ accounts for the required changes in the trainable parameters at the ‘nth’ iteration to obtain the desired output [21]. It is equal to the product of the corresponding error signal for that neuron and the derivative of the associated activation function [21]. Backpropagation requires that the activation function be bounded, continuous, and monotonic. It should also be continuously differentiable over the entire domain of the input for optimization [22]. The sigmoid activation function ‘ϕ(x)’ is preferred in classification problems because it meets all of the aforementioned requirements [23]. It is also an appropriate activation function for training multiplicative neuron models [23]. Iyoda et al. have derived the error gradient (∇Ɛ) associated with the πt-neuron model [17]. Here, ‘Ɛ(n)’ is the error energy, i.e., the instantaneous sum of the squared errors at the ‘nth’ iteration. The error gradient has two components: one due to the scaling factor ‘(bπ‒t)’, given by equation (3), and the other due to the thresholds ‘ti’, given by equation (4) [17].
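A plausible reconstruction of these gradient components, based on the model form sketched in Section 3 above (with e(n) the error signal and δ(n) = e(n)·ϕ′(z(n)) the local gradient referred to as equation (5) below), is

∂Ɛ(n)/∂bπ‒t = ‒δ(n) · ∏_{k=1}^{N} (x_k(n) + t_k(n))   (cf. equation (3)),
∂Ɛ(n)/∂t_i = ‒δ(n) · bπ‒t · ∏_{k≠i} (x_k(n) + t_k(n))   (cf. equation (4)).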

Here, ‘n’ represents the ‘nth’ iteration, for all (k = 1, 2, 3, …, N); ‘xk(n)’ is the ‘kth’ input at the ‘nth’ iteration, and ‘π‒t(n)’ represents the πt-neuron model. The error gradient therefore attains a much smaller value after the multiplication for higher-dimensional inputs and becomes infinitesimally small; consequently, the gradient vanishes, and a convergence problem occurs in this model.

It is inferred from Figure 1 and equation (1) that the inputs to the πt-neuron model range over [‒1, 1] for the XOR and N-bit parity problems. Here, ‘‒1’ corresponds to digit ‘0’, and ‘+1’ corresponds to digit ‘1’. The sigmoid function has a basic issue of vanishing gradients near the extremes, as shown in Figure 2(a). However, for the XOR and N-bit parity problems, the input varies only between [‒1, 1], as explained earlier. Therefore, the main region of interest is marked by a rectangular box on the sigmoid activation function in Figure 2. Here, it is important to notice that the margin between the two points has been reduced by the sigmoid activation function (as shown in Figure 2(a), ϕ(‒1) = 0.2689 and ϕ(1) = 0.7311). This leads to a smaller local gradient ‘δ(n)’ value (given by equation (5)), which in turn results in smaller error gradients (given by equations (3)–(5)), eventually leading to the gradient vanishing problem.
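As a simple numerical check (standard sigmoid values, not results from the paper): ϕ(‒1) = 0.2689 and ϕ(+1) = 0.7311 leave a margin of only about 0.46 of the available [0, 1] output range, whereas pre-scaling the argument by b = 4 gives ϕ(‒4) ≈ 0.018 and ϕ(+4) ≈ 0.982, a margin of about 0.96. This is the effect that the scaling factor discussed next is intended to exploit.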

For higher-dimensional input, the error gradient (∇Ɛ) attains even smaller values because of the factor (∏ (xk(n) + tk(n))) in the expression of the error gradients (as given by equations (3)–(5)). Therefore, nonconvergence/nonglobal-minima problems can occur in the previous πt-neuron model. To overcome this issue, the model should have a larger margin for the extreme values. This is possible by introducing a compensatory scaling factor in the model, which effectively scales the sigmoid activation function, as depicted in Figure 2(b). Therefore, in [17], the authors suggested using a scaling factor ‘bπ‒t’. However, an optimized value of the scaling factor is required to mitigate the effect of the multiplication and the sigmoid function in higher-dimensional problems. Because this effect is severe for higher-order input, Iyoda et al. recommended initializing only the scaling factor (not the threshold parameters) with larger values for the seven-bit parity problem [17]. Convergence is not possible with a smaller scaling factor for the higher-dimensional problem (the results given in ‘Table 2’ of [17] support this statement). Though the idea of increasing the learning rate for the scaling factor is worthwhile for overcoming the vanishing gradient problem in higher-dimensional input, an optimized value of the learning rate is not suggested for the previous πt-neuron model. It is also difficult to adjust the appropriate learning rate or range of initialization of the scaling factor for variable input dimensions. Therefore, a generalized solution is still required to solve these issues of the previous model. In this paper, we suggest a generalized model for solving the XOR and higher-order parity problems by enhancing the πt-neuron model.

4. Applications of Multiplicative Neuron Models

Robotics, parity problems, and nonlinear time-series prediction are some of the significant problems suggested by previous researchers where multiplicative neurons are applied. Forecasting of time series has been performed using multiplicative neuron models [24–26]. Yildirim et al. have proposed a threshold single multiplicative neuron model for time-series prediction [24]. They utilized a threshold value and used particle swarm optimization (PSO) and the harmony search algorithm (HSA) to obtain the optimum weight, bias, and threshold values. In [25], Yolcu et al. used autoregressive coefficients to predict the weights and biases for time-series modeling. A recurrent multiplicative neuron model was presented in [26] for forecasting time series.

Yadav et al. have also used a single multiplicative neuron model for time-series prediction problems [27]. In [28], the authors used the multiplicative neuron model for the prediction of terrain profiles for both air and ground vehicles. Egrioglu et al. addressed classical time-series forecasting using a single multiplicative neuron model in [29]. In [30], Gao et al. proposed a dendritic neuron model to overcome the limitations of traditional ANNs; it utilizes the nonlinearity of synapses to improve the capability of artificial neurons. A few other recent works are suggested in [31–35].

5. Enhanced Translated Multiplicative Neuron

We have seen the problems associated with the πt-neuron model. It has an issue with BP training in the case of highly dense XOR data distributions and higher-dimensional parity problems. In this paper, we propose an enhanced translated multiplicative single neuron model that can easily learn nonlinear problems such as XOR and N-bit parity without any training limitations. We have modified the existing πt-neuron model to overcome its limitations. The proposed enhanced translated multiplicative neuron model is represented in Figure 3 and described as follows:

Therefore, the final output through the proposed model for an N-input neuron is obtained by equation (8) as follows:

Further simplifying the proposed model (as given by equation (7)), we have the following:
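A plausible reconstruction of the proposed model described by equations (6)–(8), based on the adaptable per-input scaling factor b_i introduced in Section 5.1, the sign-manipulation factor of Section 5.2, and the effective scaling b^N noted below, is

y(n) = ϕ(z(n)),  with  z(n) = (‒1)^{N+1} · ∏_{i=1}^{N} b_i · (x_i(n) + t_i(n))  and  ϕ(z) = 1 / (1 + e^{‒z}),

so that, when all b_i share a common value b, the product contributes an effective scaling factor of b^N.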

5.1. Scaling Factor in Proposed Model

The issues of vanishing gradients and nonconvergence in the previous πt-neuron model have been resolved by our proposed neuron model. This is because of the input dimension-dependent adaptable scaling factor (given in equation (6)). The effect of the scaling factor has already been discussed in the previous section (as depicted in Figure 2(b)). We have seen that a larger scaling factor supports BP and results in proper convergence in the case of higher-dimensional input. The significance of scaling has already been demonstrated in Figure 2(b). Figure 4 demonstrates the optimal value of the scaling factor ‘b’.

To illustrate the significance of the optimized value of the scaling factor ‘b’, we have plotted the gradient of the sigmoid function ‘ϕʹ(x)’ for different values of ‘b’ in Figure 4. It is observed from the plot that the scaling factor b = 1 has poor sensitivity to any change in the input. The sensitivity of ‘ϕʹ(x)’ increases with the value of the scaling factor ‘b’. However, as we increase the scaling factor ‘b’ beyond 6, we again obtain poor-sensitivity regions, causing gradient vanishing problems. The vanishing gradient regions are shown by the encircled areas in the plot. This shows that the optimal value lies between 3 and 6: for values less than three, the sensitivity is smaller, and for values greater than six, the gradient vanishing problem reappears. In our experiments, we have empirically found that initializing the scaling factor ‘b’ with the value ‘4’ for each input results in successful training. However, the scaling factor needs to be fine-tuned according to the input and its dimension.
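A small numerical check of this behavior (our illustration, not the paper's code), evaluating the derivative of the scaled sigmoid ϕ(b·x) at the central and extreme inputs:

```python
# Gradient of the scaled sigmoid phi(b*x) for different scaling factors b
# (our illustration of the sensitivity argument, not code from the paper).
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def scaled_sigmoid_grad(x, b):
    s = sigmoid(b * x)
    return b * s * (1.0 - s)   # d/dx sigmoid(b*x)

for b in (1, 4, 8):
    grads = [round(scaled_sigmoid_grad(x, b), 4) for x in (-1.0, 0.0, 1.0)]
    print(f"b = {b}: gradient at x = -1, 0, +1 -> {grads}")

# b = 1 gives an almost flat gradient everywhere (poor sensitivity),
# b = 4 gives a sharp transition with usable gradients over [-1, 1],
# b = 8 collapses the gradient again near the extremes (vanishing regions).
```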

Therefore, in our model, we optimize the scaling factor depending on the dimension and the value of the input: an adaptable scaling factor (bi) is associated with each input (xi) in our proposed model (as given by equation (6)). A further advantage is that it helps in rapidly achieving the optimized value of the scaling factor without changing the learning rate while training the model, which eventually helps in achieving convergence with the BP algorithm. Mathematically, the error gradient (∇Ɛ) associated with our proposed neuron model (obtained as in equations (3)–(5)) is defined as follows:
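Under the reconstruction of the proposed model sketched after equation (7) above (an assumed form, not the authors' original expression), the threshold component of this gradient becomes

∂Ɛ(n)/∂t_i = ‒δ(n) · (‒1)^{N+1} · (∏_{k=1}^{N} b_k) · ∏_{k≠i} (x_k(n) + t_k(n)),

so that, for a common value b_k = b, the factor ∏ b_k = b^N multiplies the shrinking product of the (x_k + t_k) terms.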

Here, the larger effective scaling factor ‘b^N’ compensates for the infinitesimally small gradient problem. The larger scaling factor thus enforces a sharper transition in the sigmoid function and supports easier learning in the case of higher-dimensional parity problems. In the proposed model, the scaling factor is trainable and depends upon the number of input bits: its exponent is the number of input bits, which means that for higher-dimensional input we obtain a sharper transition that compensates for the infinitesimally small gradients. Therefore, the proposed enhanced πt-neuron model has no limitation for higher-dimensional inputs.

5.2. Sign-Manipulation in the Proposed Model

The enhanced πt-neuron is based on the multiplicative neuron model. The multiplicative model suffers from a class-reversal problem, i.e., the reversal of the class depending upon the number of input bits. It arises from the sign-change property of the multiplicative model for even and odd input dimensions and leads to severe confusion in classification. To mitigate this issue, we have multiplied by a sign-manipulation factor ‘(‒1)^(N+1)’. It introduces an extra negative sign for an even number of input bits so that the same input combinations keep belonging to the same class. These two modifications (the scaling factor and the sign manipulation) of the existing πt-neuron model have enhanced its performance for highly dense XOR data distributions and higher-order N-bit parity problems.
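A minimal NumPy sketch of the enhanced neuron as we read it from the descriptions above (per-input scaling factors b_i, thresholds t_i, and the sign-manipulation factor); the parameter values below are illustrative, not the trained values reported later:

```python
# Minimal sketch of the enhanced translated multiplicative neuron (illustrative only).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def enhanced_pi_t_neuron(x, b, t):
    """x, b, t: arrays of length N (inputs, scaling factors, thresholds)."""
    N = len(x)
    z = ((-1) ** (N + 1)) * np.prod(b * (x + t))   # sign manipulation + scaled product
    return sigmoid(z)

# Quick check on the two-bit XOR with illustrative parameters
# (b_i = 4 as suggested by Figure 4, t_i = -0.5 to center the surface on [0, 1]^2).
b = np.array([4.0, 4.0])
t = np.array([-0.5, -0.5])
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    y = enhanced_pi_t_neuron(np.array(x, dtype=float), b, t)
    print(x, "->", round(float(y), 3))
# Outputs close to 0 for (0, 0) and (1, 1) and close to 1 for (0, 1) and (1, 0).
```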

6. Results and Discussion

We have used the gradient-descent algorithm for training the proposed neuron model. The binary cross-entropy loss function is used for estimating the loss between the target and the trained (predicted) values, and training has been performed on a single NVIDIA GeForce GTX 1080 graphics card. The efficacy of the proposed neuron has been evaluated for generalized XOR problems. We have considered a typical, highly dense two-input XOR data distribution, as shown in Figure 5. It is applied to both models (i.e., the πt-neuron model and the proposed model) to compare their efficacy. There are many popular loss functions to visualize the deviation between desired and predicted values, such as the L1 loss, the L2 loss, and the L∞ loss. However, in our situation, the data points vary between [0, 1], and the L1 loss renders the best visualization in such cases. Therefore, we have considered the L1 loss function, which is the least absolute deviation, and used it to estimate the error. The L1 loss is defined as follows:
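The least absolute deviation over p samples has the standard form

L1 = ∑_{j=1}^{p} | d_j ‒ y_j |,

where d_j denotes the desired (target) value and y_j the value predicted by the model (notation ours).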

Since the random initialization of weights and biases is important in training the model, we have considered He-initialization [36] in our approach; it is a variant of Xavier-initialization [37]. In He-initialization, the biases are initialized to 0.0 (zero value), and the weights of a layer are initialized using a zero-mean Gaussian probability distribution with variance 2/n, where ‘n’ denotes the number of incoming connections to that layer. Further, to assess the applicability and generalization of our proposed single-neuron model, we have varied the input dimension and the number of input samples in training the proposed model. We have considered three different cases having 10^3, 10^4, and 10^6 samples in the dataset, respectively. The results (in all three cases) are summarized in Table 2. They show that the loss depends upon the number of samples in the dataset and decreases as the number of samples increases.
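The following is a hedged sketch (our reconstruction, not the authors' code) of how such a single neuron could be trained with gradient descent and the binary cross-entropy loss on an N-bit parity dataset; the class and helper names are ours, and the hyperparameters are illustrative:

```python
# Sketch of training the enhanced neuron on N-bit parity data (illustrative only).
import itertools
import torch

def make_parity_dataset(n_bits):
    """All 2^N bit combinations with odd-parity labels."""
    xs = torch.tensor(list(itertools.product([0.0, 1.0], repeat=n_bits)))
    ys = (xs.sum(dim=1) % 2).unsqueeze(1)            # 1 if the number of ones is odd
    return xs, ys

class EnhancedPiTNeuron(torch.nn.Module):
    def __init__(self, n_bits):
        super().__init__()
        self.b = torch.nn.Parameter(4.0 * torch.ones(n_bits))   # per-input scaling factors
        self.t = torch.nn.Parameter(torch.zeros(n_bits))        # translation thresholds
        self.sign = (-1.0) ** (n_bits + 1)                      # sign-manipulation factor

    def forward(self, x):
        z = self.sign * torch.prod(self.b * (x + self.t), dim=1, keepdim=True)
        return torch.sigmoid(z)

n_bits = 4
x, y = make_parity_dataset(n_bits)
model = EnhancedPiTNeuron(n_bits)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = torch.nn.BCELoss()

for epoch in range(3000):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

print("final BCE loss:", loss.item())
print("predicted parities:", model(x).detach().round().squeeze().tolist())
```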

The number of samples required in the XOR dataset for appropriate training depends upon the input dimension. It is given by p = 2^N.

Here, ‘p’ is the number of required samples for ‘N’-dimensional input. To understand this relation, consider a two-dimensional dataset (i.e., N = 2). The number of required samples is then p = 2^2 = 4, which is the classical exclusive OR (XOR) dataset, represented as {(0, 0), (0, 1), (1, 0), (1, 1)}. Similarly, if N = 3, then p = 2^3 = 8, which indicates a three-input XOR dataset, and so on. Fewer samples in the training dataset cause nonconvergence and inaccuracy.

This relation gives the number of samples required in the training dataset. Therefore, for ten-dimensional input, the number of samples required for training should be p = 2^10 = 1024; hence, approximately 1,000 samples are sufficient for a ten-dimensional training dataset. However, if we increase the dimension further, more samples are required to train the model appropriately; otherwise, the model fails to converge. The same is shown in Table 3. To assess the accuracy of our proposed model, we repeated each experiment 25 times and report the results. Here, the success rate signifies the ratio of successful simulations to total simulations for each case. In the case of ten-dimensional input with 1,000 training samples, the success rate is 0.96, whereas it reduces to 0.76 for thirteen-dimensional input because of insufficient training samples. However, if we increase the number of training samples to 10,000, the model reports a success ratio of 1.0. Similarly, for 20-bit input, p = 2^20 = 1,048,576 samples are required; therefore, with 1,000 training samples the success ratio is 0.0, while for 10,000 samples it is 0.32, and it increases further to 0.64 for one million samples. These results demonstrate the importance of the number of training samples for solving XOR-type nonlinear problems. Also, by observing the results, we can easily understand the capability of the proposed model for generalized XOR-type real problems.

Further, the proposed algorithm has been repeated 30 times to assess its training performance. Standard statistical indicators such as the mean (μ) and standard deviation (σ) are considered as the assessment parameters of the predicted values. Table 4 provides the prediction results (in terms of the threshold values (t1, t2) and the scaling factor (b)) obtained by the proposed model. It also shows the means and standard deviations of the predicted threshold and bias values.

Table 5 provides the threshold values obtained by both the πt-neuron model and the proposed model. In experiment #2 and experiment #3, the πt-neuron model has predicted threshold values beyond the range of the inputs, i.e., [0, 1]. This is because we have not placed any limit on the values of the trainable parameters. It simply reflects that the πt-neuron model has been unable to obtain the desired values in these experiments.

The L1 loss obtained in these three experiments for the πt-neuron model and the proposed model is provided in Table 6. This loss function is only used to visualize the comparison between the models; as mentioned earlier, we have used the binary cross-entropy loss function to train our model.

It is observed from the results in Tables 5 and 6 that the πt-neuron model has a problem in learning highly dense XOR data distributions, whereas the proposed neuron model has shown accurate classification results in each of these cases. Also, the loss function reveals heavy deviation between the predicted and desired values for the πt-neuron model.

Further, we have monitored the training process for both models by measuring the binary cross-entropy (BCE) loss versus the number of iterations (as shown in Figure 6). We should remember that this is the cross-entropy loss on a logarithmic scale and not the absolute loss. It supports backpropagation error calculation, which is otherwise an issue with smaller errors, and it is generally considered an appropriate loss metric for classification problems. Therefore, we have used the BCE as a measure to observe the trend of training and to compare the πt-neuron model with our proposed model. As observed, the proposed model has achieved convergence, which is not obtained by the πt-neuron model. We have also examined the performance of our proposed model on N-bit parity problems, considering a similar data distribution (as that in Figure 5) for the parity problems as well. Further, we have compared the training performance of the πt-neuron model with our proposed model for the 10-bit parity problem. The training results of both models are represented in Figure 7 (by plotting the binary cross-entropy loss versus the number of iterations).

We have examined the performance of our proposed model for higher-dimensional parity problems in order to assess the applicability and generalization of our model. We have randomly varied the input dimension from 2 to 25 and compared the performance of our model with the πt-neuron model. The results are tabulated below. Table 7 provides the scaling factor and the loss obtained by both the πt-neuron and the proposed neuron models.

As mentioned earlier, we have measured the performance for the N-bit parity problem by randomly varying the input dimension from 2 to 25. The L1 loss function has been considered to visualize the deviations between the predicted and desired values in each case. The proposed model has shown much smaller loss values than the πt-neuron model. Also, the proposed model has easily obtained the optimized value of the scaling factor in each case. The tessellation surfaces formed by the πt-neuron model and the proposed model are compared in Figure 8 to assess the effectiveness of the models (considering two-dimensional input).

It is observed here that the proposed model has formed a better tessellation surface than the πt-neuron model. This is simply because of the optimal scaling. In the case of the πt-neuron model, the scaling factor is bπ‒t = ‒1.7045, whereas our model has obtained the scaling factor b = 4.6900. As discussed earlier, the value of the scaling factor associated with each input should be around 4 (as described in Figure 4). Further, because this is a two-dimensional problem, the effective scaling factor in our case is b^N = 4.6900^2 ≈ 21.9961. We have plotted the effective values of the scaling factor of our proposed model and of the πt-neuron model on a logarithmic scale in Figure 9 to visualize the effect of scaling with increasing input dimension.

The trend of variation of the effective scaling factor with increasing input dimension shows that the proposed model can rapidly grow the scaling factor to the required value to compensate for the shrinking of errors in higher-dimensional input, whereas the previous πt-neuron model has no such ability. This is possible in our model because compensation is provided to each input (as given in our proposed enhanced πt-neuron model by equation (6)). We have considered an input distribution similar to Figure 5 (i.e., the input varies between [0, 1]) for each dimension. The results show that the effective scaling factor depends upon the dimension of the input as well as its magnitude. Therefore, our proposed model has overcome the limitations of the previous πt-neuron model.

Further, the computational complexity of the proposed model follows from the investigation of Schmitt in [38], who studied the computational complexity of multiplicative neuron models using the Vapnik-Chervonenkis (VC) dimension and the pseudo dimension. The VC dimension is a theoretical tool that quantifies the computational complexity of neuron models. According to that investigation, the VC dimension of a single product unit with N input variables is equal to N.

7. Discussion and Conclusions

The translated multiplicative (πt) neuron model has been suggested by past researchers to solve the XOR and N-bit parity problems. However, it has an issue in backpropagation for densely distributed XOR and higher-dimensional parity problems. This is an inherent problem associated with multiplicative neuron models. Though the πt-neuron model uses a scaling factor to subdue this problem, without suitable initialization it is unable to obtain the appropriate scaling factor for higher-dimensional input. Therefore, a generalized solution is still required to overcome these issues. In this paper, an enhanced translated multiplicative neuron model has been proposed to enhance the performance of the πt-neuron model. The proposed model can obtain the optimized value of the scaling factor for any input dimension, and it has solved the existing backpropagation issue of the πt-neuron model. We have considered an adaptable scaling factor associated with each input in our proposed model, which helps in achieving the optimal scaling factor value for higher-dimensional input. We have assessed the efficacy of our model by randomly increasing the input dimension and considering a magnitude variation between [0, 1] for each input. The proposed model has outperformed the πt-neuron model in each case, showing more than an 80% reduction in absolute loss compared to the previous neuron model under similar experimental conditions. Also, the proposed model has formed a more accurate tessellation surface than the previous model for two-dimensional input. Further, there are multiple real-world applications involving time-series forecasting and classification, such as trend analysis, seasonal (weather) prediction, and cycle and irregularity prediction. These real-world problems are associated with the forecasting and classification of time-series data, for which multiplicative neuron models are commonly employed and render superior results. Our proposed single multiplicative neuron model has overcome the limitations of dimensionality; therefore, it can easily be employed in such prediction tasks as well.

Data Availability

The data used to support the findings of this study are included in the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Authors’ Contributions

Ashutosh Mishra conceptualized the study. Ashutosh Mishra and Jaekwang Cha developed the methodology. Ashutosh Mishra performed the formal analysis and investigation. Ashutosh Mishra wrote, reviewed, and edited the paper. Ashutosh Mishra and Jaekwang Cha provided the software and performed validation, visualization, and data curation. Ashutosh Mishra provided the resources and prepared the manuscript. Shiho Kim supervised the study and was responsible for project administration and funding acquisition. All authors have read and agreed to the published version of the paper. For correspondence, any of the authors can be addressed (Ashutosh Mishra: ashutoshmishra@yonsei.ac.kr; Jaekwang Cha: chajae42@yonsei.ac.kr; Shiho Kim: shiho@yonsei.ac.kr).

Acknowledgments

This work was partially supported by Brain Pool Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2019H1D3A1A01071115) and by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2022-0-00966, Development of AI processor with a Deep Reinforcement Learning Accelerator adaptable to Dynamic Environment).