Abstract

The convergence of artificial intelligence (AI) and the Internet of Things (IoT) makes it urgent to push AI to the network edge and release the potential of edge big data. This work aims to improve model accuracy in data acquisition and music genre classification (MGC), using theater music data acquisition as the application setting. First, machine learning and AI algorithms are used to collect data on various devices and automatically identify music genres. Collecting and processing data on edge devices keeps the data secure and private and shortens the delay of data processing and response. In addition, the deep belief network (DBN)-based MGC algorithm recognizes and classifies music genres better overall. The MGC accuracy of the proposed improved DBN algorithm is 76.0%, compared with 33.5%–47.2% for the traditional classical algorithms. The research has important reference value for developing Internet technology and establishing music recognition models.

1. Introduction

Human-computer interaction (HCI) is becoming increasingly frequent with the spread of the Internet of Things (IoT). The combined demand for high bandwidth, reliability, and data security has given rise to intelligent edge computing (IEC) technology in the IoT field [1, 2]. IoT promotes the integration of geographical space and information space. More people, machines, and things are connected to the information space, generating massive amounts of data and placing higher requirements on bandwidth and timely transmission [3, 4]. In particular, edge computing (EC) performs computing tasks close to the data source to shorten the processing time. It speeds up data transmission, improves system reliability, and protects data security and privacy [5]. Meanwhile, IoT devices’ demands for data collection are growing exponentially [6, 7]. The data transmission procedures and means of intelligent devices vary dramatically. Thus, data acquisition for each kind of device needs targeted adaptation, and the acquisition devices cannot be reused, which complicates secondary development. The growth of data acquisition volume also increases the required transmission bandwidth, adding pressure on the cloud. Therefore, an adaptive data acquisition IoT gateway based on EC is proposed, enabling the edge IoT gateway to carry out data acquisition, analysis, and conversion on devices. As a result, the adaptability and universality of IoT gateway acquisition are enhanced [8].

In recent years, music information retrieval (MIR) and music genre classification (MGC) have attracted attention with the development of the Internet and digital audio technology. So far, MIR and MGC systems have mainly extracted music features manually and then trained a classifier to build a model that recognizes and classifies the test music samples. However, manual feature extraction is problematic. Different recognition and classification tasks require distinct music features, and sometimes the required features cannot even be named. As a new feature extraction technology, the deep belief network (DBN) has achieved strong results in image processing, natural language understanding (NLU), and other fields and has become increasingly mature.

The popularity of multimedia technology generates more online music works than ever. Classifying and managing them becomes a challenge. Many music users are only interested in specific music genres. MGC systems can divide music into different types according to style for users to choose, retrieve, and manage their favorite music. Indeed, MGC plays a crucial role in MIR. The main contribution of this work is to explore a new MGC approach through the ensemble classification method: the DBN-integrated edge data collection algorithm. The innovation is that the MGC effect of the DBN is compared and analyzed through experiments. In addition, studying MGC can promote the accuracy of theater music data acquisition, which is very necessary for modelling. This work will provide a reference for applying MGC algorithms in the future.

Some relevant methods of EC and heterogeneous data integration are reviewed here. Salama et al. [9] studied a heterogeneous data integration method with active learning and evaluated the proposed model experimentally. Five heterogeneous datasets from different fields were used: a health reform dataset, the Sander Frandsen dataset, a financial phrase bank dataset, a spam collection dataset, and a textbook sales dataset. According to the results, the new data analysis method performed better than the traditional method. Anthony et al. [10] studied the digital development of virtual enterprises during the coronavirus disease 2019 (COVID-19) pandemic. The results showed that during the pandemic, enterprises used virtual platforms to carry out business. Wewerka and Reichert [11] studied robot process automation and introduced its latest technology through a systematic mapping study. Reine et al. [12] examined the return on investment (ROI) of software test automation (STA). The study surveyed industry test professionals from product and service organizations and analyzed their responses to understand the ROI of test automation. The results showed that the most commonly used method was graphical user interface (GUI) test automation of functional scenarios, adopted to reduce manual testing and increase repeatability. To sum up, the accuracy of model recognition could be improved by using a heterogeneous data integration method with active learning for music data acquisition and model training.

3. Methods

3.1. Data Acquisition of EC

EC has been a widely discussed term in recent years. It refers to performing computation at terminals close to the data source, so that tasks traditionally handled by the cloud are completed at the edge. Processing with edge computing and storage resources can alleviate network bandwidth overload and network delay [13, 14]. EC can be completed locally and can run on large, medium, and small equipment. EC devices are widely used, including computers, mobile phones, IoT intermediate nodes, smart homes, gateways, and even municipal terminals, such as automatic teller machines (ATMs) and cameras [15]. EC provides local services for intelligent control, intelligent data collection, data analysis, and intelligent industrial manufacturing, where delays would otherwise accumulate. Even without cloud services, data collection based on EC and IoT can still meet its research and design goals [16, 17]. Traditional data processing is mainly carried out through cloud computing; carrying it out at the edge instead greatly reduces the pressure on the cloud. Of course, cooperative processing between the edge and the cloud is also possible. Data gathered for collection or equipment monitoring can be analyzed where it is produced: if sufficient computing power is available close to the data source, the analysis can be carried out at the edge to reduce the pressure on the cloud and draw conclusions quickly and efficiently. EC has become an important part of the current information infrastructure, and its future development is promising [18].

In the EC system, multiple data acquisition nodes carry out data acquisition for the edge devices. Each node is composed of an edge device and several external devices [19]. For physical data, collection is done by the external devices. Various devices collect data and transmit them to the edge devices in real time according to a predetermined format. As shown in Figure 1, edge devices provide various hardware interfaces, including a universal asynchronous receiver-transmitter (UART), a universal serial bus (USB), and a wireless transmission interface. These interfaces increase compatibility with more external devices. When multiple sensors send data to an edge device, each sensor uploads its data through parallel transmission. For network data, the edge device accesses the target web page through the network module, downloads the required data, and thereby collects the network data [20]. The data exchange between edge devices and the cloud computing centre is completed through wireless transmission, with no need for other devices to collect data.
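As a rough illustration of several sensors uploading to one edge device in parallel, the sketch below runs one thread per sensor and gathers the readings in a shared queue. The sensor names, the reading format, and the `read_sensor` stub are hypothetical; a real node would read from a UART, USB, or wireless interface instead.

```python
import queue
import threading


def read_sensor(sensor_id, n_readings):
    """Hypothetical stand-in for a hardware read; returns fake readings."""
    return [{"sensor": sensor_id, "seq": i, "value": float(i)}
            for i in range(n_readings)]


def upload(sensor_id, n_readings, buffer):
    # Each sensor pushes its readings into the edge device's shared buffer.
    for reading in read_sensor(sensor_id, n_readings):
        buffer.put(reading)


def collect(sensor_ids, n_readings=3):
    """Run one uploader thread per sensor and return everything received."""
    buffer = queue.Queue()  # thread-safe, so uploads may interleave freely
    threads = [threading.Thread(target=upload, args=(s, n_readings, buffer))
               for s in sensor_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    received = []
    while not buffer.empty():
        received.append(buffer.get())
    return received
```

Calling `collect(["mic", "temp"])` returns six readings in whatever order the threads delivered them, which is why each reading carries its own sensor identifier and sequence number.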

The traditional physical data acquisition module usually has no data analysis function; it only integrates and packages the data and leaves the analysis to other related modules. Doing so increases system computation and delay, affecting the timely response of data analysis results. Therefore, sensors are connected to the edge device through its reserved sensor interfaces so that the edge device completes the sensor acquisition task itself. Software is used to preprocess the collected data, forming the underlying data acquisition module. This can expand the application of the sensor, ensure the collected data are uploaded to the cloud in time, and shorten the data processing delay [21]. The process is shown in Figure 2.

Generally, network data are acquired from web pages. At the application layer, the Hypertext Transfer Protocol (HTTP) requires users to send requests to the Internet through terminal modules. Here, the edge devices use crawler technology to connect to the Internet and collect the required data, complementing the physical data collection method. Python is used to download the full source of these web pages and complete the data collection. Network data collection is usually divided into two ways according to the order in which pages are collected. In one of them, regular expressions are used to extract the web addresses linked from the currently queried web page. Each new web page is then visited iteratively to extract network data until no hyperlink is available. The data extraction process is shown in Figure 3.
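The iterative link-following procedure described above can be sketched as follows. This is a minimal illustration, not the authors' crawler: the breadth-first visiting order, the `href` regular expression, the `max_pages` limit, and the injectable `fetch` callable are all assumptions made so that the sketch runs without network access.

```python
import re
from collections import deque

# Matches absolute http(s) links in double-quoted href attributes only.
HREF_PATTERN = re.compile(r'href="(https?://[^"]+)"')


def crawl(start_url, fetch, max_pages=50):
    """Fetch a page, extract its links with a regex, and repeat.

    `fetch` is any callable mapping a URL to its HTML text. Crawling stops
    when no unvisited hyperlink remains or `max_pages` pages are stored.
    """
    visited = set()
    pages = {}
    frontier = deque([start_url])
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()  # FIFO queue gives breadth-first order
        if url in visited:
            continue
        visited.add(url)
        html = fetch(url)
        pages[url] = html
        for link in HREF_PATTERN.findall(html):
            if link not in visited:
                frontier.append(link)
    return pages
```

In practice `fetch` could wrap `urllib.request.urlopen`, with error handling and rate limiting added. Replacing `popleft()` with `pop()` would turn the traversal into a depth-first one.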

3.2. MGC Based on DBN

With the rapid development of Internet technology, online music services have gradually become the most convenient and main means for people to listen to music works. For massive music works, the performance of the MIR system is related to the quality of music services. Automatic MGC technology is an important part of content-based MIR, which has recently attracted much attention. Music signal has complex frequency composition and rich semantic information. The key to MGC lies in an effective music feature expression method. Here, EC technology is used to collect music-related information. The data are preprocessed and analyzed to get relevant information. Then, the music genre is identified and distinguished by DBN technology.

MGC aims to identify the music information and genre of unknown songs or music. For example, George divides music into ten genres according to content: blues, classical, country, disco, hip-hop, jazz, metal, pop, reggae, and rock. This classification plays a very important role in MIR [22]. Many music users are only interested in specific kinds of music, and the MGC system can classify music into different types according to style [23]. In this case, music can be recommended to users according to their interests, which is convenient for quick music information retrieval and efficient management. Most music works are sung by people, accompanied by various musical instruments. In addition, the structural features of music genres vary significantly. Even the same person can produce different sounds in different ranges when performing different music genres. These factors make it difficult to extract the features of music signals. Thus, improving MGC accuracy is tricky. Recently, MGC has attracted wide attention and developed rapidly. In particular, how to further improve the accuracy and efficiency of MGC has become the focus of the relevant research.

MGC often follows three steps: music signal preprocessing, music feature extraction, and music genre discrimination. Music signals must be preprocessed to facilitate feature extraction. Music features are another form of expression of the music signal, with many redundancies removed. Working directly on the original music signals requires a substantial amount of calculation [24]. Thus, features are extracted first, and the classifier is then trained to obtain the optimal model parameters for identifying and classifying music samples [25]. The traditional feature extraction method is very complex. Fortunately, DBN has autonomous learning characteristics and can be used to extract more abstract music features [26].
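The source does not state which preprocessing operations are applied. As a hedged illustration of what the preprocessing step might involve, the sketch below shows two operations that are common in audio work, pre-emphasis and framing, followed by a very simple per-frame feature. The coefficient 0.97 and the frame and hop sizes are conventional example values, not values taken from this study.

```python
def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1], which boosts high frequencies."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frames):
    """Sum of squared samples per frame, one of the simplest frame-level features."""
    return [sum(x * x for x in frame) for frame in frames]
```

Richer features would be computed per frame in the same way and then passed to the feature-learning and classification stages.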

After preprocessing, as many features as possible must be extracted from the music samples according to the specific recognition and classification task. The extracted features are used to train the classifier until the optimal model parameters are obtained, so that music samples of different genres or musical instruments can be identified and classified. The classification flow is shown in Figure 4.

According to Figure 4, the key to MGC research is feature extraction and classifier design [27]. There is no unified standard for selecting features. To improve the accuracy of recognition and classification, some researchers start from the principle of signal generation to find new and effective features, while others fuse existing individual features [28]. These methods do improve MGC accuracy to some extent. However, there is no universal MGC method and no agreement on which music features to extract. Under these circumstances, manual feature extraction can hardly complete MGC tasks, and the advantage of deep learning (DL) algorithms becomes important [29]. DBN can simulate the structure of the human brain, store and process large amounts of information, and mine the internal correlation of data. That is, it extracts more essential data features to improve the performance of recognition and classification.
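The source does not describe the internal structure of its DBN. As general background, a DBN is usually built by stacking restricted Boltzmann machines (RBMs) and pretraining them layer by layer. The sketch below shows one such building block with a single contrastive-divergence (CD-1) update on binary inputs. The class name, layer sizes, learning rate, and initialization are illustrative assumptions, and the pure-Python loops are for readability rather than speed.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Tiny Bernoulli RBM; a DBN stacks several of these layer by layer."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.W = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.c))]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b))]

    def sample(self, probs):
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def cd1_step(self, v0, lr=0.1):
        """One CD-1 update on a single binary vector; returns reconstruction error."""
        h0 = self.hidden_probs(v0)
        v1 = self.visible_probs(self.sample(h0))  # one Gibbs step back to visible
        h1 = self.hidden_probs(v1)
        for i in range(len(self.b)):
            for j in range(len(self.c)):
                # Positive phase minus negative phase.
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(len(self.c)):
            self.c[j] += lr * (h0[j] - h1[j])
        return sum((a - b) ** 2 for a, b in zip(v0, v1))
```

After one RBM is trained, its hidden probabilities serve as the input to the next RBM in the stack, and the top-level features are finally passed to a supervised classifier.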

The specific steps of MGC are as follows. First, music features are extracted and each music segment is labeled. Then, a machine learning (ML) model learns the relationship between the music features and the corresponding music genre labels [30]. Afterward, a classifier is generated through supervised training on the music samples and applied to the MGC task. The classical classification methods include decision tree, K-nearest neighbor (KNN), support vector machine (SVM), and logistic regression [31].
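To make the supervised step concrete, here is a minimal sketch of one of the classical methods listed above, K-nearest neighbor. The two-dimensional feature vectors and the labels in the usage note are toy values for illustration only; real music features would have far more dimensions.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Euclidean distance from the query to every labeled training vector.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

For example, with three training points near the origin labeled "classical" and three near (5, 5) labeled "rock", a query at (0.2, 0.1) is assigned "classical".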

3.3. Case Analysis and Testing

The traditional manual music feature extraction approach has encountered difficulties. When manually extracted music features are fed directly into the classical shallow classification methods above for recognition and classification, performance does not improve. DL has made great progress as a new feature extraction technology in the MIR field. Therefore, DL is used to identify and classify western music genres. Here, DBN is chosen from among DL methods to further learn the essential features of different music genres, and SoftMax regression predicts the genre of each music sample. This section takes genre music clips as samples. The training set contains 1,600 samples of each genre, totaling 16,000 music genre samples. The validation and test sets each contain 800 samples per genre, totaling 8,000 samples per set. First, all music genre samples are labeled, and the model is trained on the training set. In addition, the validation set is used for cross-validation. The trained model is then used to predict the genres of the test samples. Finally, the predicted music genre labels are compared with the actual music genre labels to obtain the average MGC accuracy.
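The final stage described above, SoftMax prediction followed by comparison of predicted and actual labels, can be sketched as follows. The function names are illustrative; only the per-genre sample counts and the number of genres come from the text.

```python
import math


def softmax(scores):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return the index of the genre with the highest softmax probability."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)


def accuracy(predicted, actual):
    """Fraction of samples whose predicted label equals the actual label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Dataset sizes implied by the text: 10 genres with equal samples per genre.
GENRES = 10
train_total = 1600 * GENRES  # 16,000 training samples
eval_total = 800 * GENRES    # 8,000 samples in each of the validation and test sets
```

Averaging `accuracy` over the test set gives the average MGC accuracy that the results section reports for each method.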

4. Results and Discussion

4.1. Comparison of Traditional Classification and DBN Algorithm

The classification algorithm based on DBN with SoftMax and the traditional classification methods are each trained, validated, and tested on 10 music genres. The average accuracy is shown in Table 1.

Table 1 shows that the MGC accuracy of different methods varies even with the same music genre training set, the same validation set for cross-validation, and the same test set. When the same MPC features are input, the MGC accuracy of the traditional classification methods is very low. The decision tree algorithm has the lowest accuracy, only 33.5%. The highest accuracy among them goes to SVM, at only 47.2%. By comparison, the MGC accuracy is significantly improved using the DBN algorithm without loss and momentum, reaching 70.9%. The MGC accuracy of the upgraded DBN algorithm reaches 76.0%, more than 5 percentage points higher than the DBN algorithm. The first five groups of experimental results show that, compared with traditional classification methods, DBN performs better in MGC tasks. This is because DBN can independently learn and extract music features that are more suitable for MGC. The experimental results of the last two groups show that the upgraded DBN has a stronger feature extraction ability.
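The gaps quoted above are differences in percentage points, which is easy to confuse with relative improvement. The short sketch below works through both using only the four accuracies stated in the text; the dictionary and function names are illustrative.

```python
# Average MGC accuracies (%) as stated in the text for four of the methods.
accuracy_pct = {
    "decision tree": 33.5,
    "SVM": 47.2,
    "DBN": 70.9,
    "upgraded DBN": 76.0,
}


def point_gain(new, old):
    """Absolute gain in percentage points."""
    return round(accuracy_pct[new] - accuracy_pct[old], 1)


def relative_gain(new, old):
    """Gain relative to the old method's accuracy, in percent."""
    diff = accuracy_pct[new] - accuracy_pct[old]
    return round(100.0 * diff / accuracy_pct[old], 1)
```

Here `point_gain("upgraded DBN", "DBN")` is 5.1 points and `point_gain("upgraded DBN", "SVM")` is 28.8 points, while the relative gain of the upgraded DBN over the plain DBN is roughly 7%.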

4.2. Recognition and Classification of Different Music Genres

The 480-dimensional MPC features of each music genre sample are input into various classification methods. The MGC accuracy of different algorithms on ten music genres is shown in Figure 5.

Figure 5 shows that the DBN algorithm achieves good MGC accuracy on all music genres (over 50%) except rock (just over 40%). These results are better than those of the decision tree (DT) classification. Meanwhile, the upgraded DBN algorithm is better than the traditional DBN algorithm for most music genres.

To sum up, comparing the recognition matrices of the six classification methods shows that different classification methods have distinct recognition effects on the same music genre. Across the classification methods, the MGC accuracy on classical music is always the highest, probably because the musical features of classical music differ obviously from those of other types of music. The MGC accuracy for rock music is always very low, possibly because rock music shares similar features with other music genres, so rock samples are misjudged as other types. Knowledge of music genres also shows that rock music overlaps with other genres, because heavy metal and reggae music also belong to rock music. Moreover, rock music belongs to pop music and is influenced by blues music, so it is easily recognized as other types of music. In addition, after using DBN for feature extraction, the MGC accuracy for each music genre improves, further showing DBN’s superiority in MGC. Ebel et al. [32] used data flow models and digital twins to design automation projects. They determined the degree of completion by applying ML algorithms to historical data and currently uncompleted artifacts, integrating information from previous projects. The research helped realize project progress measurement in automation engineering. Flechsig et al. [33] studied robot process automation in supply management. The results showed that robot process automation had attracted more attention in digital transformation. This cutting-edge technology automated robot behavior and had great potential. Therefore, the proposed MGC model can achieve a better music recognition effect.
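The per-genre comparison above rests on recognition (confusion) matrices. A minimal sketch of how such a matrix and the per-genre accuracy can be computed from actual and predicted labels is given below. The function names are illustrative, and no data from the study is reproduced.

```python
def confusion_matrix(actual, predicted, labels):
    """Rows are actual genres, columns are predicted genres."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def per_genre_accuracy(matrix, labels):
    """Diagonal count divided by the row total, for each genre."""
    result = {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        result[label] = matrix[i][i] / row_total if row_total else 0.0
    return result
```

Off-diagonal entries in the rock row show which genres rock samples are mistaken for, which is the kind of overlap discussed above.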

5. Conclusion

The originality of this work is stated as follows. In the proposed algorithm and its practical application, the data required by the system are collected by the edge device itself and its related sensors, so the data can be used for subsequent analysis and processing. This approach has been successfully applied in many fields, such as music data acquisition. DBN is widely used in image processing but less so in MIR. Unlike the classical algorithms, the proposed DBN with SoftMax algorithm directly extracts the acoustic or musical features of music. The main contribution of this work is to train the classifier on these features to get the classification results. While reducing the workload of manually extracting and recognizing classification features, the proposed algorithm also achieves better MGC accuracy than the classical algorithms. The proposed upgraded DBN algorithm improves the MGC accuracy and removes the need to decide which features to extract in manual classification and recognition. The experimental results further verify the great value of DBN in music genre recognition and classification.

The shortcomings and prospects are stated as follows. Compared with the classical recognition and classification methods, the DBN algorithm improves the average MGC accuracy by more than 20 percentage points. However, the overall accuracy is still low because of the music quality in the music database. Follow-up experiments can try to separate the singing from its accompaniment and use pure song data for MGC. The DBN-based MGC system involves many internal network parameters, such as the number of network layers, the number of neuron nodes, and the network optimization strategy. These parameters can be further optimized in subsequent work. The experimental database is also small relative to the massive amounts of online music data. Future work is expected to expand the training samples and use higher-performance hardware to handle MGC tasks.

Data Availability

The data used to support the conclusions of this study are available from the corresponding author upon request.

Informed consent was obtained from all individual participants included in the study.

Disclosure

This study was presented at the 2020 Chinese Control and Decision Conference (CCDC).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.