Abstract

Big Data, the Internet of Things (IoT), cloud computing (CC), and related ideas and technologies are increasingly combined in social interactions, and big data technologies improve the way businesses process financial data. At present, such tools can be used to forecast the financial failures and crises of small and medium-sized enterprises. Financial crisis prediction (FCP) plays a major role in a country's economy: accurate forecasting of the number and probability of business failures is an indicator of the development and strength of the national economy. Distinct approaches have been proposed for effective FCP. However, their classifier efficiency, predictive accuracy, and data validity are often not adequate for practical application. In this view, this study develops an oppositional ant lion optimizer-based feature selection with a machine learning-enabled classification (OALOFS-MLC) model for FCP in a big data environment. For big data management in the financial sector, the Hadoop MapReduce tool is used. In addition, the presented OALOFS-MLC model designs a new OALOFS algorithm to choose an optimal subset of features, which helps to achieve improved classification results. The deep random vector functional links network (DRVFLN) model is then used to perform the classification process. Experimental validation of the OALOFS-MLC approach was conducted using benchmark datasets, and the results demonstrated that the OALOFS-MLC algorithm outperforms recent approaches.

1. Introduction

With the dynamic expansion of the financial marketplace, enterprises can raise lower-cost capital from the market to accelerate their development, and investors use the market to finance ventures and earn high returns [1]. However, companies today face increasingly unfavorable market environments, and risk is a constant problem for operators. The current enterprise environment is characterized mainly by the rapid development of information technology, economic globalization, and changes in business models, management methods, and customer orientation. These factors are influenced by technology, society, the economy, and politics. The operation of a modern company is a process in which different types of risk are continually produced and resolved [2]. Over the last few years, deploying effective tools for predicting financial failure and business loss has become especially pertinent to small and medium-sized enterprises (SMEs). SMEs need business management to observe their mode of operation and to inspect whether it is suited to attaining the set objectives [3]. Such a model is expressed through a series of firm rules and procedures, called “controls,” which safeguard the structure of the enterprise organization, and it creates a requirement for periodic assessment [4]. Because corporate entities are highly dynamic, detecting and estimating their development is a complicated process, yet it is important as a check on the efficiency of the economy [5].

Over the last few years, as businesses all over the world have gone through economic crises, enterprises have paid more attention to the field of FCP [6]. For a financial institution or company, it is vital to build early and reliable predictive models for forecasting the risk of financial failure. FCP is usually treated as a binary classification problem [7, 8]: the classification model labels an enterprise as either failure or nonfailure [9]. Several classification methods with different areas of focus have been designed for FCP. In general, the proposed predictive methods are classified as either artificial intelligence (AI) or statistical methods [10].

El-Kenawy et al. [11] presented a modified binary grey wolf optimization (MbGWO) based on stochastic fractal search (SFS) for identifying essential features while balancing exploration and exploitation. The SFS diffusion procedure refines the best solution of the modified GWO by using a Gaussian distribution for random walks in a growth process. Sankhwar et al. [12] established a new predictive framework for FCP based on a hybrid of IGWO and a fuzzy neural classifier (FNC). The proposed IGWO-based FS approach was used to discover an optimal feature subset in the financial data, and the FNC was used for classification.

The authors in [13] presented Bolasso (Bootstrap-Lasso), which chooses consistent and relevant features from a pool of features. Consistent feature selection (FS) is defined as the robustness of the selected features to alterations in the dataset. The feature shortlist created by Bolasso is then passed to several classifiers, such as K-NN, SVM, RF, and NB, to test their prediction accuracy. Kim et al. [14] presented a globally optimized SVM, denoted GOSVM, a new hybrid SVM approach designed to optimize FS, sample selection, and kernel parameters. That study uses a GA to optimize several heterogeneous design factors of SVMs simultaneously. Ghosh et al. [15] presented a wrapper-filter combination of ACO in which subset evaluation uses a filter approach instead of a wrapper approach to reduce computational complexity. A memory for keeping the best ants and a feature dimension-dependent pheromone update are also used to perform FS in a multiobjective manner. The method is evaluated on several real-life datasets from the UCI-ML repository and the NIPS2003 FS challenge, using KNN and MLP classifiers.

This study develops an oppositional ant lion optimizer-based feature selection with a machine learning-enabled classification (OALOFS-MLC) model for FCP in a big data environment.
(i) To handle the big data in the financial sector, the Hadoop MapReduce tool is employed
(ii) The proposed OALOFS-MLC model designs a novel OALOFS technique to choose an optimal subset of features, which helps in attaining improved classification results
(iii) The deep random vector functional links network (DRVFLN) model is exploited to perform the classification process
(iv) The experimental validation of the OALOFS-MLC algorithm was performed using benchmark datasets

The remainder of this paper is organized as follows: Section 2 describes the proposed model, Section 3 presents the results and discussion, and Section 4 concludes the paper.

2. The Proposed Model

In this study, a novel OALOFS-MLC model is established for FCP in a big data environment. The presented OALOFS-MLC model designs a novel OALOFS technique to choose an optimal subset of features, which helps in attaining improved classification results. Furthermore, the DRVFLN model is exploited to perform the classification process. Figure 1 depicts the block diagram of the OALOFS-MLC approach.

2.1. Hadoop MapReduce

Hadoop is a collection of tools and technologies that has seen considerable development, and Hadoop-based solutions are prominent among open-source projects [16]. MapReduce is the building block of Hadoop. It is a parallel programming framework used to solve problems of parallel operation and analysis on large-scale datasets. The name MapReduce comes from its two fundamental procedures: the mapping process, Map, and the aggregation process, Reduce. MapReduce runs the process simultaneously on a set of worker nodes. Every node uses the same code to process the data assigned to it, without exchanging data with other nodes. With MapReduce, developers no longer need to consider the low-level details when designing large-scale data processing applications; they implement a uniform interface for the operation, which considerably decreases development complexity and improves development efficiency.
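The Map and Reduce procedures described above can be illustrated with a minimal in-process sketch. It is not the Hadoop API and not the pipeline used in this study: the record layout (a dictionary with a hypothetical `label` field), the function names, and the label-counting job are assumptions chosen only to show how independent splits are mapped, grouped by key, and reduced.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: emit (key, value) pairs for one input record.

    The "label" field is a hypothetical column name used for illustration.
    """
    yield (record["label"], 1)


def shuffle(pairs):
    """Group all mapped values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: aggregate every value that shares the same key."""
    return key, sum(values)


def run_job(splits):
    """Run map over every split independently, then shuffle and reduce."""
    mapped = chain.from_iterable(
        map_phase(record) for split in splits for record in split
    )
    return dict(reduce_phase(k, vs) for k, vs in shuffle(mapped).items())
```

Each inner list passed to `run_job` stands in for the data held by one worker node; in a real cluster the map calls would run in parallel on separate machines.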

2.2. Design of OALOFS Technique

In this study, the presented OALOFS-MLC model designs a novel OALOFS technique to choose an optimal subset of features, which helps in attaining improved classification results. Reference [17] proposed the ant lion optimizer (ALO), a nature-inspired metaheuristic approach that simulates the hunting mechanism of antlions catching their prey. The main operations of the ALO are the random walk (RW) of ants, constructing traps, entrapment of ants in traps, catching ants, and reconstructing traps. Antlions are commonly known as doodlebugs, and their life cycle has two phases, larva and adult. The steps of the ALO are given below:

Step 1. Initialization:
An initial population of ants, $\mathrm{Ant}_i$, and antlions, $\mathrm{Antlion}_j$, is produced randomly within the search region of the parameters, for $i, j = 1, \ldots, N$, where $N$ represents the size of the population. Evaluate the fitness value of each ant and antlion, find the fittest antlion, and designate it as the elite. Fix the maximal number of iterations as max_iter.

Step 2. Constructing the trap:
For each ant, an antlion is selected by Roulette wheel selection according to the fitness of the antlions, and the trap for that ant is constructed around the selected antlion [18].
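Roulette wheel selection in Step 2 can be sketched as follows. This is a minimal illustration, assuming nonnegative selection weights in which a larger weight means a fitter antlion; when fitness is a cost to be minimized, the weights would first have to be inverted (for example, `1 / (1 + cost)`), which is an assumption and not something the text specifies. The function name and the `rng` parameter are hypothetical.

```python
import random


def roulette_wheel(weights, rng=random):
    """Return an index chosen with probability proportional to its weight.

    Assumes every weight is nonnegative; larger weight = fitter antlion.
    """
    total = sum(weights)
    pick = rng.random() * total  # a point on the wheel in [0, total)
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if pick < cumulative:
            return index
    # Reached only when all weights are zero or by floating-point rounding.
    return len(weights) - 1
```

Fitter antlions occupy a larger arc of the wheel, so they are chosen more often, while weaker antlions still keep a nonzero chance of being selected.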

Step 3. RW of ant:
The ant moves randomly to search for food, which can be formulated as follows:

$$X(t) = \left[0, \operatorname{cumsum}(2r(t_1) - 1), \operatorname{cumsum}(2r(t_2) - 1), \ldots, \operatorname{cumsum}(2r(t_n) - 1)\right] \tag{1}$$

In (1), cumsum denotes the cumulative sum, $n$ refers to the maximal number of iterations, $t$ indicates the step of the RW, and $r(t)$ denotes a stochastic function defined as follows:

$$r(t) = \begin{cases} 1, & \text{if } rand > 0.5 \\ 0, & \text{if } rand \le 0.5 \end{cases} \tag{2}$$

In (2), $rand$ is a randomly produced number that lies within the range $[0, 1]$. The following normalization of the RW from (1) is used to keep the location of the ant inside the search region:

$$X_i^t = \frac{(X_i^t - a_i)(d_i^t - c_i^t)}{b_i - a_i} + c_i^t \tag{3}$$

In (3), $a_i$ and $b_i$ are the minimum and maximum of the RW, and $c_i^t$ and $d_i^t$ are the lower and upper bounds of the parameter at iteration $t$, respectively. The RW is normalized for every parameter. For each ant, two RWs are generated and normalized: $R_A^t$, around the antlion designated by the Roulette wheel, and $R_E^t$, around the elite antlion.
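The random walk and its normalization in (1) to (3) can be sketched as below. This is an illustrative reading of the standard ALO formulation rather than the authors' code: the function names, the `rng` argument, and the scalar bounds `c` and `d` are assumptions made for the example.

```python
import random
from itertools import accumulate


def random_walk(n_steps, rng):
    """Cumulative sum of +1/-1 steps, starting at 0, as in (1) and (2)."""
    # 2*r(t) - 1 is +1 when the random number exceeds 0.5 and -1 otherwise.
    steps = [1 if rng.random() > 0.5 else -1 for _ in range(n_steps)]
    return [0] + list(accumulate(steps))


def normalize_walk(walk, c, d):
    """Min-max rescale a walk into the interval [c, d], as in (3)."""
    a, b = min(walk), max(walk)
    if a == b:  # degenerate walk with no steps: stay at the lower bound
        return [c for _ in walk]
    return [(x - a) * (d - c) / (b - a) + c for x in walk]
```

The raw walk is unbounded, so the rescaling is what keeps every ant inside the current trap boundaries of one parameter.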

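Step 4. Trapping of ants:
The trapping of ants can be expressed mathematically as follows:

$$c_i^t = \mathrm{Antlion}_j^t + c^t, \qquad d_i^t = \mathrm{Antlion}_j^t + d^t \tag{4}$$

In (4), $i$ and $j$ indicate the indices of the designated ant and antlion, respectively, and $c^t$ and $d^t$ are the lower and upper bounds of the variables at iteration $t$.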

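Step 5. Sliding of ant toward antlion: the antlion throws sand at the edge of the trap to slide the ant toward the trap once the ant tries to escape. It is formulated by

$$c^t = \frac{c^t}{I}, \qquad d^t = \frac{d^t}{I} \tag{5}$$

where $I = 10^{w} \, t / T$, $t$ indicates the present iteration, and $T$ denotes the maximal number of iterations. $w$ indicates a constant that depends on the iteration as follows:

$$w = \begin{cases} 2, & 0.1T < t \le 0.5T \\ 3, & 0.5T < t \le 0.75T \\ 4, & 0.75T < t \le 0.9T \\ 5, & 0.9T < t \le 0.95T \\ 6, & t > 0.95T \end{cases} \tag{6}$$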
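Steps 4 and 5 can be sketched together: the trap boundaries are centered on the selected antlion and then shrink as the iterations progress. The schedule for the constant follows the commonly cited ALO formulation (exponent 2 after 10% of the iterations, rising to 6 after 95%); returning a ratio of 1 during the first 10% of iterations is an assumption of this sketch, as are all function names.

```python
def shrink_ratio(t, T):
    """Ratio I = 10**w * t / T, with w stepping up as t approaches T."""
    schedule = [(0.95, 6), (0.9, 5), (0.75, 4), (0.5, 3), (0.1, 2)]
    for fraction, w in schedule:
        if t > fraction * T:
            return (10 ** w) * t / T
    return 1.0  # assumed: no shrinking during the first 10% of iterations


def shrink_bounds(c, d, t, T):
    """Divide the lower and upper bounds by the ratio (sliding toward the pit)."""
    ratio = shrink_ratio(t, T)
    return c / ratio, d / ratio


def trap_bounds(antlion_position, c_t, d_t):
    """Center the shrunken bounds on the selected antlion (trapping)."""
    return antlion_position + c_t, antlion_position + d_t
```

For example, `shrink_bounds(-10.0, 10.0, 20, 100)` narrows the interval to `(-0.5, 0.5)`, so late iterations search only a small neighborhood of each antlion.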

Step 6. Catching prey and reconstructing the pit:
The fitness of the new location of the ant is evaluated. When the ant becomes fitter than the respective antlion, the ant is considered caught by the antlion, and the antlion moves to the location of the ant and reconstructs the trap there for the following hunt:

$$\mathrm{Antlion}_j^t = \mathrm{Ant}_i^t \quad \text{if } \mathrm{Ant}_i^t \text{ is fitter than } \mathrm{Antlion}_j^t \tag{7}$$

In (7), $\mathrm{Antlion}_j^t$ represents the location of the $j$th antlion and $\mathrm{Ant}_i^t$ the location of the $i$th ant at iteration $t$. This procedure is regarded as catching prey and reconstructing the pit at the location where there is a higher probability of catching ants in the following iterations.

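Step 7. Elitism:
Elitism is the procedure that preserves the location of the best antlion (the elite) throughout the optimization. It is performed by the following equation:

$$\mathrm{Ant}_i^t = \frac{R_A^t + R_E^t}{2} \tag{8}$$

In (8), $t$ denotes the present iteration, $\mathrm{Ant}_i^t$ indicates the location of the $i$th ant, and $R_A^t$ and $R_E^t$ are the RWs around the antlion selected by the Roulette wheel and around the elite, respectively.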

Step 8. Update the elite when an antlion becomes fitter than the elite.

Step 9. End when the stopping condition is met; otherwise, return to Step 3 to start the following iteration.
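Steps 6 to 8 can be sketched as three small helpers. The sketch assumes that fitness is a cost to be minimized (lower is fitter), which matches a fitness built from the classifier error rate but is stated here as an assumption; the function names and the list-based position format are hypothetical.

```python
def ant_position(walk_around_antlion, walk_around_elite):
    """New ant location: element-wise average of the two random walks."""
    return [(ra + re) / 2 for ra, re in zip(walk_around_antlion, walk_around_elite)]


def catch_prey(ant, antlion, cost):
    """Move the antlion to the ant's location when the ant is fitter."""
    return list(ant) if cost(ant) < cost(antlion) else antlion


def update_elite(elite, antlions, cost):
    """Replace the elite when some antlion has become fitter than it."""
    best = min(antlions, key=cost)
    return list(best) if cost(best) < cost(elite) else elite
```

Because the elite takes part in every ant's movement through the averaged walk, the best solution found so far steers the whole population without being lost.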
To improve the efficiency and performance of the ALO, this study presents a revised version of the technique based on the concept of opposition-based learning (OBL). As described above, the ALO, as a population-based optimization algorithm, starts from a set of initial solutions and tries to improve them toward the optimal solution. In the absence of prior knowledge about the solution, random initialization is applied to generate the candidate solutions (the initial positions). The convergence speed and performance are strongly associated with the distance of the initial solutions from the best solution. In other words, the process performs better when the randomly created solutions lie close to the optimum of the objective function. Based on this concept, and to increase the chance of finding the global optimum and the convergence speed of the standard ALO, this study presents a revised version of the approach named OALO. In the OALO, after the initial random solutions are produced in the first iteration, the opposite position of every solution is produced according to the concept of the opposite number. To determine the new initial population, it is necessary to define the opposite number. A $D$-dimensional vector $x$ is defined by the following equation:

$$x = (x_1, x_2, \ldots, x_D) \tag{9}$$

In (9), $x_i \in [l_i, u_i]$ for $i = 1, 2, \ldots, D$, where $l_i$ and $u_i$ are the lower and upper bounds of the $i$th dimension. Then, the opposite point of $x$, represented as $\breve{x}$, is defined as follows:

$$\breve{x}_i = l_i + u_i - x_i \tag{10}$$

To employ the concept of the opposite number in the initial population of the OALO, assume $x$ is a randomly produced solution in the $D$-dimensional problem space (that is, a candidate solution). For that random solution, its opposite is produced by (10) and represented as $\breve{x}$. Next, these two solutions are evaluated by the objective function $f(\cdot)$. Hence, when $f(\breve{x})$ is superior to $f(x)$, the agent $x$ is substituted with $\breve{x}$; otherwise, the process continues with $x$.
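The opposition-based initialization can be sketched as follows. The opposite point is taken as lower bound plus upper bound minus the current value in each dimension, which is the standard OBL definition; treating the objective as a cost to be minimized, the continuous position encoding, and the function names are assumptions of this sketch.

```python
import random


def opposite(x, lower, upper):
    """Opposite point: reflect each coordinate inside its [lower, upper] range."""
    return [l + u - xi for xi, l, u in zip(x, lower, upper)]


def opposition_init(pop_size, lower, upper, cost, rng):
    """Draw random candidates and keep the better of each candidate/opposite pair."""
    population = []
    for _ in range(pop_size):
        x = [rng.uniform(l, u) for l, u in zip(lower, upper)]
        x_opp = opposite(x, lower, upper)
        # Lower cost is assumed to mean a superior solution.
        population.append(x_opp if cost(x_opp) < cost(x) else x)
    return population
```

For example, with bounds 0 and 10 in both dimensions, the opposite of `[1, 4]` is `[9, 6]`. Evaluating both points doubles the cost of initialization but gives each agent two chances of starting near the optimum.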
The fitness function (FF) of the OALO approach takes into account both the classifier accuracy and the number of chosen features. It maximizes the classifier accuracy and minimizes the size of the chosen feature set. Thus, the following FF is used to evaluate individual solutions:

$$Fitness = \alpha \cdot ErrorRate + (1 - \alpha) \frac{|S|}{|F|} \tag{11}$$

where $ErrorRate$ signifies the classifier error rate obtained using the chosen features, $|S|$ is the number of chosen features, and $|F|$ is the total number of features. $\alpha$ controls the relative importance of classifier quality and subset length. During the experiments, $\alpha$ is fixed to 0.9.
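A small worked example makes the trade-off between error rate and subset size concrete. The weighted-sum form used here (weight times error rate, plus one minus the weight times the fraction of features kept) is the form commonly used in wrapper feature selection and is an assumption, since the equation itself is not shown in the text; the function and argument names are hypothetical, and the 10% error rate is an invented illustrative value.

```python
def fs_fitness(error_rate, n_selected, n_total, alpha=0.9):
    """Weighted sum of classifier error and relative subset size (lower is better).

    alpha = 0.9 follows the value stated in the text; the functional form
    is an assumed, commonly used one.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return alpha * error_rate + (1 - alpha) * n_selected / n_total
```

With 12 of 24 features selected and an illustrative error rate of 0.10, the fitness is 0.9 × 0.10 + 0.1 × 0.5 = 0.14. Because the weight on the error term is 0.9, accuracy dominates, and the subset-size term mainly breaks ties between subsets of similar accuracy.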

2.3. Data Classification Process

Finally, the DRVFLN model is exploited to perform the classification process. The DRVFLN extends the shallow RVFL network to deep, or representation, learning. The input to every layer is the output of the preceding layer in the stack, so that every layer constructs an internal representation of the input data [19]. Consider a stack of $L$ hidden layers (HLs), each with the same number $N$ of hidden nodes. To simplify the notation, the bias term is omitted from the equations. Figure 2 depicts the framework of the RVFLN technique.

The output of the first HL is defined as follows:

$$H^{(1)} = g\left(X W^{(1)}\right) \tag{12}$$

The output of every subsequent layer $l > 1$ is defined by (13):

$$H^{(l)} = g\left(H^{(l-1)} W^{(l)}\right) \tag{13}$$

where $X$ is the input data with $d$ features, and $W^{(1)} \in \mathbb{R}^{d \times N}$ and $W^{(l)} \in \mathbb{R}^{N \times N}$ denote the weight matrices between the input and the first HL and between consecutive HLs, respectively. The parameters (weights and biases) of the hidden neurons are generated randomly within a suitable range and kept fixed during the training stage. $g(\cdot)$ signifies the nonlinear activation function. The input to the output layer is then defined as follows:

$$D = \left[H^{(1)} \; H^{(2)} \; \cdots \; H^{(L)} \; X\right] \tag{14}$$

This structure corresponds to that of the RVFL network: the input to the output layer contains the nonlinear features from the stacked HLs together with the original features. The output is then defined as follows:

$$Y = D\beta \tag{15}$$

The output weight matrix $\beta \in \mathbb{R}^{(NL + d) \times K}$ ($K$: the number of classes) is then solved for. In (14) and (15), the DRVFLN forms a linear combination of the features through the output-layer weight matrix, whose number of rows equals the number of features from the HLs plus the input layer.
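The structure described above (fixed random hidden layers, a feature matrix that concatenates every hidden layer with the original input, and a linear output layer) can be sketched in plain Python. Several choices here are assumptions rather than details given in the text: the tanh activation, the uniform weight range of -1 to 1, and solving the output weights by ridge regression with a small regularization constant `lam`. All function names are hypothetical, and biases are omitted as in the text.

```python
import math
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def random_weights(d, N, L, rng):
    """One d x N matrix for the first layer, then L - 1 matrices of size N x N."""
    sizes = [d] + [N] * (L - 1)
    return [[[rng.uniform(-1, 1) for _ in range(N)] for _ in range(rows)]
            for rows in sizes]


def drvfl_features(X, weights):
    """Build the matrix that concatenates every hidden layer with the input."""
    H, blocks = X, []
    for W in weights:  # hidden weights stay fixed; they are never trained
        H = [[math.tanh(v) for v in row] for row in matmul(H, W)]
        blocks.append(H)
    return [sum((blk[i] for blk in blocks), []) + list(X[i]) for i in range(len(X))]


def fit_output_weights(D, Y, lam=1e-3):
    """Ridge solution (D^T D + lam I)^-1 D^T Y; the solver is an assumption."""
    Dt = [list(col) for col in zip(*D)]
    A = matmul(Dt, D)
    for i in range(len(A)):
        A[i][i] += lam
    return solve(A, matmul(Dt, Y))
```

Given one-hot labels `Y`, class scores for new samples are obtained as `matmul(drvfl_features(X_new, weights), beta)`, and the predicted class is the column with the largest score. Only the output weights are fitted, which is why training reduces to a single linear solve.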

3. Experimental Validation

The OALOFS-MLC model is validated experimentally using two datasets, namely, the German Credit [20] and Australian Credit [21] datasets. The former dataset includes 1,000 samples and 24 features. The latter dataset holds 690 instances with 14 features.

Table 1 lists the number of features selected by the OALOFS-MLC model on the applied datasets. The table values indicate that the OALOFS-MLC model selected a total of 12 features for the German Credit dataset and 9 features for the Australian Credit dataset.

Table 2 and Figure 3 compare the best cost (BC) incurred by the OALOFS-MLC model with that of other FS models on the German Credit dataset. The experimental values imply that the OALOFS-MLC algorithm performs better, with the lowest BC values at all iterations. For instance, at iteration 1, the OALOFS-MLC approach obtained a lower BC of 0.114, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model resulted in higher BC values of 0.148, 0.162, 0.173, and 0.185, respectively.

In addition, at iteration 5, the OALOFS-MLC approach obtained a lower BC of 0.129, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model resulted in higher BC values of 0.153, 0.165, 0.184, and 0.181, respectively. At iteration 10, the OALOFS-MLC algorithm obtained a lower BC of 0.129, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model resulted in higher BC values of 0.154, 0.168, 0.174, and 0.184, respectively.

Table 3 and Figure 4 compare the BC obtained by the OALOFS-MLC method with that of other FS models on the Australian Credit dataset. The experimental values show that the OALOFS-MLC system performs better, with the lowest BC values at all iterations. For instance, at iteration 1, the OALOFS-MLC methodology obtained a lower BC of 0.053, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model resulted in higher BC values of 0.082, 0.085, 0.096, and 0.105, respectively. Besides, at iteration 5, the OALOFS-MLC system obtained a lower BC of 0.050, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model resulted in higher BC values of 0.069, 0.089, 0.093, and 0.102, respectively.

Likewise, at iteration 10, the OALOFS-MLC methodology obtained a lower BC of 0.059, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model resulted in higher BC values of 0.071, 0.081, 0.097, and 0.106, respectively.

Table 4 offers a detailed comparative examination of the FCP outcomes of the OALOFS-MLC model with recent models on the German Credit dataset [22].

Figure 5 provides a comparative study of the OALOFS-MLC model on three performance measures. The figure indicates that the OALOFS-MLC model reached the maximum classification performance. On the first measure, the OALOFS-MLC model achieved a higher value of 97.36%, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model obtained lower values of 95.43%, 90.12%, 85.73%, and 81.28%, respectively. On the second measure, the OALOFS-MLC approach achieved a higher value of 97.06%, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model obtained lower values of 95.06%, 90.82%, 89.48%, and 83.02%, respectively. On the third measure, the OALOFS-MLC system achieved a higher value of 97.31%, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model obtained lower values of 94.88%, 92.93%, 89.31%, and 79.17%, respectively.

Figure 6 illustrates a comparison of the OALOFS-MLC model with recent techniques on three further measures, including MCC and kappa. The figure shows that the OALOFS-MLC approach obtained the maximum classification performance. On the remaining measure, the OALOFS-MLC algorithm achieved a higher value of 98.75%, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model obtained lower values of 95.23%, 90.81%, 89.31%, and 79.42%, respectively. Moreover, concerning MCC, the OALOFS-MLC algorithm achieved a higher MCC of 96.13%, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model obtained lower MCC values of 95.47%, 92.12%, 87.97%, and 80.22%, respectively. In addition, in terms of kappa, the OALOFS-MLC system achieved a higher kappa of 96.19%, whereas the PIOFS system, ACOFS approach, GWOFS algorithm, and PSOFS methodology obtained lower kappa values of 94.24%, 91.94%, 85.98%, and 80.36%, respectively.
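For readers unfamiliar with MCC and kappa, both can be computed from the four cells of a binary confusion matrix, as sketched below using the standard definitions of the Matthews correlation coefficient and Cohen's kappa. The function name and the example counts are illustrative only and are not taken from the reported experiments.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, MCC, and Cohen's kappa from binary confusion-matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    # MCC: correlation between predicted and true labels, in [-1, 1].
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    # Kappa: agreement beyond what is expected by chance (p_e).
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0
    return {"accuracy": accuracy, "mcc": mcc, "kappa": kappa}
```

Unlike accuracy, both measures discount agreement that a trivial classifier would reach on an imbalanced dataset, which is why they are reported alongside it. Both lie between -1 and 1, so the percentages quoted in the text correspond to these values multiplied by 100.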

The training accuracy (TA) and validation accuracy (VA) attained by the OALOFS-MLC approach on the German Credit dataset are shown in Figure 7. The experimental outcome implies that the OALOFS-MLC system attained high values of TA and VA. In particular, the VA appears to be higher than the TA.

The training loss (TL) and validation loss (VL) achieved by the OALOFS-MLC algorithm on the German Credit dataset are shown in Figure 8. The experimental outcome indicates that the OALOFS-MLC methodology attained low values of TL and VL. In particular, the VL appears to be lower than the TL.

Table 5 offers a detailed comparative investigation of the FCP outcomes of the OALOFS-MLC algorithm with recent systems on the Australian Credit dataset.

Figure 9 provides a comparative study, based on Table 5, of the OALOFS-MLC system with recent methodologies on three performance measures. The figure indicates that the OALOFS-MLC model reached higher classification performance. On the first measure, the OALOFS-MLC system achieved a higher value of 97.41%, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model obtained lower values of 95.36%, 91.18%, 90.43%, and 83.51%, respectively. On the second measure, the OALOFS-MLC system achieved a higher value of 96.53%, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model obtained lower values of 94.71%, 91.28%, 85.52%, and 79.93%, respectively. On the third measure, the OALOFS-MLC methodology achieved a higher value of 97.92%, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model obtained lower values of 94.61%, 90.65%, 89.07%, and 79.06%, respectively.

Figure 10 depicts a comparison of the OALOFS-MLC model with recent methodologies on three further measures, including MCC and kappa. The figure shows that the OALOFS-MLC model attained the maximum classification performance. On the remaining measure, the OALOFS-MLC system achieved a higher value of 98.50%, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model obtained lower values of 95.06%, 93.18%, 86.65%, and 79.95%, respectively. Also, concerning MCC, the OALOFS-MLC approach achieved a higher MCC of 97.53%, whereas the PIOFS system, ACOFS approach, GWOFS methodology, and PSOFS model obtained lower MCC values of 95.50%, 93.08%, 86.41%, and 79.19%, respectively. Finally, concerning kappa, the OALOFS-MLC approach achieved a higher kappa of 96.22%, whereas the PIOFS system, ACOFS approach, GWOFS algorithm, and PSOFS methodology obtained lower kappa values of 94.87%, 91.64%, 87%, and 82.40%, respectively.

The TA and VA obtained by the OALOFS-MLC model on the Australian Credit dataset are shown in Figure 11. The experimental outcome indicates that the OALOFS-MLC methodology attained high values of TA and VA. In particular, the VA appears to be higher than the TA.

The TL and VL attained by the OALOFS-MLC approach on the Australian Credit dataset are shown in Figure 12. The experimental outcome indicates that the OALOFS-MLC system attained low values of TL and VL. In particular, the VL appears to be lower than the TL.

From the detailed results and discussion, it can be stated that the OALOFS-MLC model is effective for FCP.

4. Conclusion

In this study, a novel OALOFS-MLC model was established for FCP in a big data environment. To handle the big data in the financial sector, the Hadoop MapReduce tool is employed. The presented OALOFS-MLC model designs a novel OALOFS algorithm for choosing an optimal subset of features, which helps in attaining improved classification results. Furthermore, the DRVFLN model is exploited to perform the classification process. The experimental validation of the OALOFS-MLC approach was performed using benchmark datasets, and the outcomes showed that the OALOFS-MLC model outperforms recent approaches. Thus, the presented OALOFS-MLC model can be used as an effective tool for FCP in the big data environment. In the future, outlier detection and data clustering approaches can be applied to FCP.

Data Availability

All data are available in the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.