Abstract
In today’s society, technologies such as social networks, the mobile Internet, and the Internet of Things are developing by leaps and bounds, and data are growing at an unprecedented rate. How to manage and make reasonable use of such data sources has become a popular research topic, and cloud computing can provide powerful computing and storage capabilities for data sources, further enhance the comprehensive processing capacity of big data, and realize the real value of data sources. On this basis, this paper focuses on the nature of data network delivery and its requirements for real-time accuracy and performance monitoring, and studies key stream computing technology for network data monitoring. The main achievement is a data processing and analysis platform based on stream computing, together with the design and implementation of the platform’s overall structure and a data anomaly monitoring model. Finally, this paper uses big data technology to design and build an integrated auxiliary legal service system. The system can help judges and prosecutors fully understand the progress and research results of guiding cases and provide them with the applicable laws and regulations, so as to maximize work efficiency. It can also support investment in legal education, so as to provide basic legal services for the public, integrate legal evidence, and guide public opinion in a timely manner when media discuss legal topics and cases, so as to minimize social risks. This paper combines cloud computing technology with network data detection and applies it to the field of legal auxiliary services, which greatly promotes the development of legal evidence integration technology.
1. Introduction
Through the comprehensive application of big data and cloud computing processing technology, this paper analyzes and studies the data monitoring, data processing, and data diagnosis processes of various devices. The results show that the necessary factors for building a powerful intelligent network are data integration and information interaction technology [1]. As network transmission centers gradually evolve toward integrated supervision, more and more data of various kinds will be transmitted to the data monitoring center for processing, so the monitoring and processing of historical and real-time data will face great challenges. For wide-ranging equipment data processing scenarios, this paper plans a comprehensive cloud computing platform architecture [2]. The platform is based on a shared multi-structure IT computing framework. This design not only helps the computing platform save investment and maintenance costs but also makes data integration and information interaction more convenient [3]. The system can provide computing services within the optimal resource framework according to the user’s network data detection and processing requirements. On this basis, this paper proposes a centralized computing method for the parallel processing of massive digital signal components based on Hadoop MapReduce [4]. Adaptive discharge amplitude and flow interval thresholds are used to doubly filter local excess points in the discharge signal phase, so as to improve the computational efficiency of parallel big data processing [5]. With the progress of e-commerce and e-government, the application of electronic data is increasing, and electronic evidence has gradually attracted attention [6, 7]. At the same time, network information crime and e-commerce crime emerge one after another, so the integration of electronic legal data has become the best way to resolve such conflicts [8]. However, the definition of the legal form of electronic evidence remains controversial, which hinders its practical application to a certain extent and makes it difficult to resolve disputes in time. This paper holds that the legal form of electronic evidence raises two questions [9]. The first is whether electronic evidence qualifies as evidence at all, in particular whether its manifestation is consistent with the legal categories in China; the second is the specific use of electronic evidence, that is, whether investigation, evidence collection law, and court evidence collection procedures can accommodate electronic evidence [10, 11]. Based on these two points, this paper discusses the practical and theoretical basis of the legal form of electronic evidence in judicial proceedings. On this basis, this paper designs a complete comprehensive legal service assistant system, which can effectively provide legal assistance to law enforcement personnel and the public, thereby effectively reducing social risks [12].
2. Related Work
The literature has constructed a data analysis and processing platform based on stream computing technology, classified and explained its design and implementation process, and designed a data network anomaly monitoring model [13]. Spark Streaming, built on the Apache Spark platform, provides an open-source computing framework, and components such as Kafka message queues and the Redis in-memory database are introduced to provide efficient data sources and data service interfaces for the data analysis platform, enabling real-time analysis and processing of various data [14]. This completes the network monitoring application. The literature gives a data aggregation scheme that adapts to model changes and achieves the aggregation of state data based on the CIM model described in the SCL model; because efficient data integration and storage must be achieved at the same time, cloud computing technology is introduced to resolve this difficulty [15]. The literature analyzes the connotation and nature of MCCA and its data sources, and uses evolutionary theory to explain the resource integration mechanism of MCCA data sources. On this basis, the overall structure of the MCCA data aggregation mechanism is designed, including the data recovery and storage mechanism and the data source grouping and merging mechanism [16]. The literature proposes a parallel ensemble empirical mode decomposition (EEMD) algorithm compatible with Spark in-memory computing, which overcomes the deficiencies of Hadoop MapReduce in complex data processing. At the same time, the parallelism of the EEMD algorithm in processing time series signals is analyzed, and two parallel EEMD algorithms with different structures, based on running the EMD process in parallel over segmented bands, are designed [17]. The proposed parallel algorithm is used to extract the components of the discharge signal waveform, and the computational performance of the serial EEMD algorithm and the parallel EEMD algorithm based on Hadoop MapReduce is compared [18]. The literature describes the focus of disputes over the legal form of electronic evidence in judicial practice and theory; it argues that electronic evidence does not fit China’s closed evidence classification system and supports reconstructing the national evidence classification system [19].
3. Design and Application of Network Data Monitoring Platform Based on Cloud Computing
3.1. Data Resource Clustering of the Mobile Cloud Computing Alliance
The cloud computing alliance data source clustering can effectively classify the alliance data sources. In this process, an efficient and appropriate clustering method must be selected so that the resources in the alliance data can be shared. Using the FCM clustering method to group alliance data sources in cloud computing can be described as follows: assume the data set is $X = \{x_1, x_2, \ldots, x_n\}$, where each $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$ and $x_{ip}$ is the $p$th attribute of the data resource $x_i$. In the cloud computing alliance, the objective function of the FCM algorithm takes the standard form

$$J_m(U, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \left\| x_j - v_i \right\|^{2}, \tag{1}$$

where $u_{ij}$ is the degree of membership of sample $x_j$ in cluster $i$, $v_i$ is the $i$th cluster center, and $m > 1$ is the fuzziness exponent. The minimum of equation (1) is obtained under the constraint

$$\sum_{i=1}^{c} u_{ij} = 1, \qquad u_{ij} \in [0, 1], \qquad j = 1, 2, \ldots, n. \tag{2}$$

Using the Lagrange multiplier method, combining the constraint of equation (2) with equation (1) and differentiating, we get

$$u_{ij} = \left[ \sum_{t=1}^{c} \left( \frac{\| x_j - v_i \|}{\| x_j - v_t \|} \right)^{\frac{2}{m-1}} \right]^{-1}, \qquad v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} x_j}{\sum_{j=1}^{n} u_{ij}^{m}}. \tag{3}$$
Breadth-first search (BFS) is a method of traversing graphs. It is a hierarchical search process that can visit all nodes in the graph and therefore has global search capability. To this end, this paper adopts the breadth-first search idea and proposes an improved breadth-first clustering algorithm, which not only has global search capability but can also exclude noisy data.
The specific procedure of the improved clustering algorithm is as follows: compute the weight between any two nodes in the weighted network, that is, the equivalence degree; record $r_{ij}^{(k)}$ as the equivalence degree of measured objects $x_i$ and $x_j$ with respect to the $k$th attribute factor, and then take all attribute factors into account. The similarity of the measured objects $x_i$ and $x_j$ is then

$$S_{ij} = \sum_{k=1}^{d} w_k\, r_{ij}^{(k)}, \tag{5}$$

where $w_k$ is the weight of the $k$th attribute factor; $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, n$; $d$ is the number of attribute factors; and $x_{ik}$ is the corresponding attribute value. After constructing the undirected weighted graph, the equivalence matrix is created. From equation (5), it can be seen that $S_{ij} = S_{ji}$ and $S_{ii} = 1$, so the equivalence matrix is symmetric about the main diagonal.
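To make the construction of the equivalence matrix concrete, the following Python sketch builds $S$ from a data matrix. The per-attribute equivalence degree is taken here as 1 minus the range-normalized absolute difference; this is one common choice and an assumption of the sketch, since the paper does not fix the equivalence function.

```python
import numpy as np

def similarity_matrix(X, w):
    """Build the weighted equivalence (similarity) matrix S of equation (5).

    X : (n, d) array of measured objects, one row per object.
    w : (d,) array of attribute weights summing to 1.
    The per-attribute equivalence degree r_ij^(k) is assumed to be
    1 - |x_ik - x_jk| / range_k, which yields S_ii = 1 and S_ij = S_ji.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = X.max(axis=0) - X.min(axis=0)
    rng[rng == 0] = 1.0                           # constant attributes: avoid division by zero
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            r = 1.0 - np.abs(X[i] - X[j]) / rng   # equivalence degree per attribute
            S[i, j] = S[j, i] = np.dot(w, r)      # weighted combination over all d factors
    return S
```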
Based on the above, in order to obtain more accurate cluster centers and cluster numbers, this paper constructs a cluster validity function and selects the optimal threshold to determine the optimal cluster centers and number of clusters. The center of the $s$th class, containing $n_s$ samples, is

$$v_s = \frac{1}{n_s} \sum_{x_i \in X_s} x_i,$$

and the overall sample center is

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$$

Compactness (within-class scatter) is defined as

$$\mathrm{Comp} = \sum_{s=1}^{C_S} \sum_{x_i \in X_s} \left\| x_i - v_s \right\|^2,$$

and separation (between-class scatter) is defined as

$$\mathrm{Sep} = \sum_{s=1}^{C_S} n_s \left\| v_s - \bar{x} \right\|^2.$$

The cluster validity evaluation model then takes the $F$-statistic form

$$F = \frac{\mathrm{Sep} / (C_S - 1)}{\mathrm{Comp} / (n - C_S)},$$

and the initial cluster centers can be taken as the class centers $v_s$ obtained at the selected threshold. The greater the $F$ value, the greater the between-class distance; that is, the greater the difference between classes, the better the classification effect. The threshold $S$ that maximizes $F$ is the optimal threshold, and the corresponding classification is the optimal classification result, where $1 < C_S < n$.
In order to improve the anti-noise ability of the FCM algorithm and minimize the influence of noisy data on the grouping results, the coefficient-of-variation weighting method is used to determine the contribution rate of each attribute: attributes with strong separating power are emphasized, while noisy attributes are down-weighted. Assuming a data set $X = \{x_1, x_2, \ldots, x_n\}$, the coefficient of variation of the $k$th attribute is

$$v_k = \frac{\sigma_k}{\left| \bar{x}_k \right|},$$

where $\bar{x}_k$ and $\sigma_k$ are the mean and standard deviation of the $k$th attribute over all samples. The weight of each attribute factor is then

$$w_k = \frac{v_k}{\sum_{j=1}^{p} v_j},$$

where $w_k$ is the weight of the $k$th attribute factor and $p$ is the number of sample attributes. The objective function of the improved FCM algorithm is

$$J_m(U, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m}\, d_w^2(x_j, v_i), \qquad d_w^2(x_j, v_i) = \sum_{k=1}^{p} w_k \left( x_{jk} - v_{ik} \right)^2.$$

Introducing a Lagrange multiplier for the membership constraint gives the objective

$$L = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m}\, d_w^2(x_j, v_i) + \sum_{j=1}^{n} \lambda_j \left( \sum_{i=1}^{c} u_{ij} - 1 \right).$$

Setting $\partial L / \partial u_{ij} = 0$ gives the membership update formula

$$u_{ij} = \left[ \sum_{t=1}^{c} \left( \frac{d_w(x_j, v_i)}{d_w(x_j, v_t)} \right)^{\frac{2}{m-1}} \right]^{-1},$$

and setting $\partial L / \partial v_i = 0$ gives the cluster center update formula

$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} x_j}{\sum_{j=1}^{n} u_{ij}^{m}}.$$
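The derivation above maps directly onto an alternating-update loop. Below is a minimal Python sketch of the improved (attribute-weighted) FCM, assuming random membership initialization and a simple convergence tolerance; neither is fixed by the paper.

```python
import numpy as np

def cv_weights(X):
    """Coefficient-of-variation attribute weights (emphasize separating attributes)."""
    mean = np.abs(X.mean(axis=0))
    mean[mean == 0] = 1e-12                     # guard against zero means
    v = X.std(axis=0) / mean
    return v / v.sum()

def weighted_fcm(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Improved FCM with attribute-weighted distances, per the derivation above.

    X : (n, p) data matrix; c : number of clusters; m > 1 : fuzziness exponent.
    Returns (U, V): memberships of shape (c, n) and cluster centers of shape (c, p).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = cv_weights(X)
    U = rng.random((c, n))
    U /= U.sum(axis=0)                          # enforce sum_i u_ij = 1
    V = np.zeros((c, p))
    for _ in range(max_iter):
        um = U ** m
        V = um @ X / um.sum(axis=1, keepdims=True)      # center update
        # weighted squared distances d_w^2(x_j, v_i), shape (c, n)
        d2 = np.array([(X - v) ** 2 @ w for v in V])
        d2 = np.fmax(d2, 1e-12)
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)                   # membership update
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return U, V
```

Note that because the attribute weights cancel attribute by attribute, the center update is unchanged from standard FCM, exactly as the derivation shows.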
3.2. Design of Mobile Cloud Computing Network Data Integration Scheme
The data resource combination process of the mobile cloud computing alliance is roughly as follows (a sketch of this matching-and-fallback logic follows the list):

(1) The user agent automatically executes the user agent request activity according to the application parameters input by the user, then queries whether the DR service request is registered in the ontology database service and whether a public data source service of the user agent exists, so as to screen out the optimal DR service. The service-providing agent interface is called, interacts with the DR service agent, and then runs the specific user service.

(2) If the DR service request entered by the user cannot be found in the ontology database of the DR service, it is determined whether the task is a combined multi-task situation, and CDRSMA must decompose the required task. First, a combination plan that meets the needs of the combined service is sought in DR. If the matching succeeds, the best combination plan is screened out and returned to the DR combination service.

(3) If the matching fails, the DR agent that provides the service automatically supplies data resources that meet the requirements and returns them to the CDRSMA end, personalizes the DR service combination according to the user's needs, and then updates the DR combination service library. Once the best service combination plan has been filtered out, the DR service agent provides the invocation interface and executes the user-specific service plan. The process is shown in Figure 1.
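As an illustration only, the following Python sketch mirrors the three-step fallback flow described above. All object and method names (`ontology_db`, `combo_library`, `dr_agent`, `decompose_task`, and so on) are hypothetical stand-ins, since the paper specifies the flow but not an API.

```python
def resolve_dr_service(request, ontology_db, combo_library, dr_agent, decompose_task):
    """Hypothetical sketch of the DR service matching-and-fallback flow."""
    # Step 1: direct match against DR services registered in the ontology database.
    service = ontology_db.lookup(request)
    if service is not None:
        return service.invoke(request)

    # Step 2: combined multi-task case: CDRSMA decomposes the task and
    # searches the combination library for a plan that meets the needs.
    subtasks = decompose_task(request)
    plan = combo_library.best_plan(subtasks)
    if plan is not None:
        return plan.invoke(request)

    # Step 3: matching failed: build a personalized combination from the
    # DR agent's resources, register it for reuse, then execute it.
    plan = dr_agent.personalize(subtasks, request.user)
    combo_library.register(plan)
    return plan.invoke(request)
```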

In the process of threshold selection, the determination of the vertical threshold T1 and the horizontal threshold T2 is very important, as it directly affects the detection result. However, as in related work, the values of T1 and T2 are usually set manually based on experience. A manually set threshold can yield accurate detection results when all signals come from the same monitoring source, since a single threshold then suffices. With multiple monitoring sources, however, manual tuning inevitably entails a large amount of work, making automatic processing of the output signal components difficult. Therefore, how to determine these two thresholds adaptively is a problem worth attention.

In statistics, variance measures the dispersion of a data set: the higher the variance, the greater the dispersion. If a threshold T splits the original data into the two clusters that differ from each other the most, the between-class variance is maximized at that threshold, and that T is the one sought. The theoretical basis of this method is explained in detail below.
Suppose there is a set of $N$ discrete values $\{x_i \mid i = 1, 2, \ldots, N\}$ with maximum $x_{\max}$ and minimum $x_{\min}$. First, the discrete values are converted to gray values: with $L$ gray levels, the interval width is

$$dx = \frac{x_{\max} - x_{\min}}{L},$$

and each shifted value $x_i - x_{\min}$ falls into one of the $L$ grayscale intervals.

Second, the number of discrete values falling within each grayscale interval is counted, where the interval corresponding to gray value $l$ is

$$\left[ (l-1) \cdot dx,\; l \cdot dx \right], \qquad l = 1, 2, \ldots, L.$$

If the number of discrete values in the grayscale range $[(l-1) \cdot dx,\; l \cdot dx]$ is denoted $n_l$, then $n_l$ is called the number of pixels with gray value $l$, and the total number of pixels must equal the total number of discrete values:

$$N = \sum_{l=1}^{L} n_l.$$

Then, the probability of occurrence of gray value $l$, denoted $p_l$, is obtained by

$$p_l = \frac{n_l}{N}, \qquad \sum_{l=1}^{L} p_l = 1.$$

Now set the threshold to $k \cdot dx$ and use it to divide $\{x_i \mid i = 1, 2, \ldots, N\}$ into two clusters $C_0$ and $C_1$, where $C_0$ contains the discrete values in the interval $[0, k \cdot dx]$ and $C_1$ those in $[(k+1) \cdot dx,\; L \cdot dx]$. The probabilities of the two clusters and their respective average gray values are

$$\omega_0 = \sum_{l=1}^{k} p_l, \qquad \omega_1 = \sum_{l=k+1}^{L} p_l, \qquad \mu_0 = \frac{1}{\omega_0} \sum_{l=1}^{k} l\, p_l, \qquad \mu_1 = \frac{1}{\omega_1} \sum_{l=k+1}^{L} l\, p_l.$$

For any value of $k$, it is easy to verify that

$$\omega_0 + \omega_1 = 1, \qquad \omega_0 \mu_0 + \omega_1 \mu_1 = \mu_T, \qquad \text{where } \mu_T = \sum_{l=1}^{L} l\, p_l.$$

Thus, the threshold calculation is transformed into an optimization problem: choose the $k$ that maximizes the between-class variance,

$$k^{*} = \arg\max_{1 \le k < L} \; \omega_0(k)\, \omega_1(k) \left( \mu_0(k) - \mu_1(k) \right)^2,$$

and take the adaptive threshold as $T = k^{*} \cdot dx$.
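This is the classical Otsu criterion applied to one-dimensional signal amplitudes rather than image pixels. Below is a minimal NumPy sketch; the choice of $L = 256$ gray levels is an assumption of the sketch, not a value fixed by the paper.

```python
import numpy as np

def adaptive_threshold(x, levels=256):
    """Otsu-style adaptive threshold for a 1-D array of discrete values.

    Maps values into `levels` gray bins, then picks the bin boundary k*
    that maximizes the between-class variance w0*w1*(mu0 - mu1)^2.
    Returns the threshold on the original value scale.
    """
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    dx = (x_max - x_min) / levels
    hist, _ = np.histogram(x, bins=levels, range=(x_min, x_max))
    p = hist / hist.sum()                        # p_l for l = 1..L

    l = np.arange(1, levels + 1)
    w0 = np.cumsum(p)                            # omega_0(k)
    mu_cum = np.cumsum(l * p)                    # running sum of l * p_l
    mu_T = mu_cum[-1]                            # overall mean gray value
    w1 = 1.0 - w0                                # omega_1(k)

    valid = (w0 > 0) & (w1 > 0)
    mu0 = np.where(valid, mu_cum / np.where(w0 > 0, w0, 1), 0)
    mu1 = np.where(valid, (mu_T - mu_cum) / np.where(w1 > 0, w1, 1), 0)
    sigma_b2 = np.where(valid, w0 * w1 * (mu0 - mu1) ** 2, 0)

    k_star = int(np.argmax(sigma_b2)) + 1        # best bin boundary k*
    return x_min + k_star * dx
```

Applying this independently to each monitoring source removes the manual tuning of T1 and T2 noted above.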
3.3. Network Data Monitoring Performance Test and Analysis
In this paper, based on cloud computing technology, a variety of signals are collected through simulated radiation models and experiments, covering four discharge types: corona discharge, suspension discharge, bubble discharge, and oil discharge. The sampling frequency is set to 5 MHz, and the data are stored in binary format; each sampling point occupies 2 bytes, and each file continuously stores the signals in units of power cycles. The data sets used in the experiment are described in Table 1. The data set size increases with the set number, and data sets 2 to 8 are integer multiples of data set 1. The 4 monitoring sources correspond to the 4 discharge types and contain equal amounts of data. Each file stores 335 power-frequency cycles of partial discharge signals and is about 64 MB in size.
Parallel granularity refers to the size of each small load after partitioning; here it can be taken as the size of a file or block, which usually affects the degree of parallelism and the performance of parallel programs. Discharge signals with the same total size as data set 8 were reorganized into different file sizes to obtain the data sets shown in Table 2.

As the number of files increases, the longest and shortest run times gradually converge, because finer granularity leads to a more balanced load, so the time difference across repeated runs shrinks. From this analysis, a coarse parallel granularity leaves some computing nodes idle part of the time, while an overly fine granularity balances the load at the cost of increased task-scheduling and switching overhead. Therefore, the parallel granularity should be chosen according to the Hadoop cluster size.
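A rough back-of-the-envelope rule, an assumption of this sketch rather than a formula from the paper, is to size files so that the job runs in a small number of full waves of map tasks:

```python
def suggest_file_size_mb(total_mb, nodes, slots_per_node, waves=2):
    """Pick a file size so the job runs in about `waves` full waves of map tasks.

    total_mb       : total data volume to process, in MB.
    nodes          : number of worker nodes in the Hadoop cluster.
    slots_per_node : concurrent map tasks per node.
    waves          : target number of task waves (2-3 trades load balance
                     against task-scheduling overhead).
    """
    slots = nodes * slots_per_node
    n_tasks = slots * waves              # enough tasks to keep every slot busy
    return max(1, total_mb // n_tasks)

# Example: 3.5 GB of discharge signals on a 4-node cluster with 2 slots each.
print(suggest_file_size_mb(3584, nodes=4, slots_per_node=2))   # -> 224 (MB)
```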
In order to analyze the performance of the data mode under combined storage and query conditions, two kinds of experiments, data import and data query, were carried out on the established cluster platform and compared with the integration scheme in single-machine mode. In stand-alone mode, the SQL Server 2005 database management system is used to store the latest real-time detection data. The import experiment compares writing to the HBase-based database against a traditional relational database, and the query experiment compares access to status data on the stand-alone system and on the Spark cluster system. The import and query experiments were each carried out three times, and the average value was taken. To display the results more intuitively, line graphs are used: the ordinate represents execution time in seconds, and the abscissa represents the amount of data.
Experiment 1. The amounts of data imported into HBase and SQL Server 2005 are 50,000; 100,000; 200,000; 500,000; 1 million; 3 million; and 7 million records. The experimental results are shown in Figure 2.

Experiment 2. The query times of the stand-alone system and the cluster platform are compared for query sizes of 1,000; 5,000; 10,000; 30,000; 100,000; 1 million; and 3 million records. The results are shown in Figure 3.
From the analysis of Figures 2 and 3, the following results can be obtained:

(1) For data import, when the amount of data is less than or equal to about 300,000 records, the SQL Server import time is less than that of the HBase database. As the amount of data grows, the import time of the HBase database increases gradually, while that of SQL Server rises sharply; beyond a certain point (7 million records), the SQL Server import time far exceeds that of HBase.

(2) For data query, when the amount of data is small, that is, fewer than 30,000 queried records, the query time of the single-machine system is shorter than that of the cluster system. As the amount of data increases, the time of the cluster system grows gradually, while that of the stand-alone system rises significantly.

Table 3 shows the results of cluster analysis using Spark combined with parallel DBSCAN. After grouping analysis, it can be seen from Table 3 that the data fall into 5 categories: the normal state and 4 different types of data attack states. The calculated overall accuracy P is 92.48%, which is highly accurate.
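The paper does not give its parallel DBSCAN implementation. The following PySpark sketch shows one common pattern, running DBSCAN independently on each partition; it assumes scikit-learn is available on the workers, and it omits the merging of clusters across partition boundaries for brevity.

```python
import numpy as np
from pyspark.sql import SparkSession
from sklearn.cluster import DBSCAN

spark = SparkSession.builder.appName("parallel-dbscan-sketch").getOrCreate()

def cluster_partition(rows):
    """Run DBSCAN on one partition's records; eps/min_samples are illustrative."""
    pts = np.array([r for r in rows])
    if len(pts) == 0:
        return iter([])
    labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(pts)
    return iter(zip(labels.tolist(), pts.tolist()))

# `features` stands in for the normalized monitoring-record feature vectors.
features = spark.sparkContext.parallelize(np.random.rand(10000, 4).tolist(), 8)
local_clusters = features.mapPartitions(cluster_partition)
print(local_clusters.take(3))
```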
Five traffic monitoring states are trained, and the classifier is used for classification training; the accuracy of the learned classes is checked against the test set. The training set for each state is used to train the corresponding classifier. The training results of the classifiers and the accuracy on the corresponding test sets are shown in Table 4.
The classification results for all the different data points were recorded statistically. The results show that the combined classifier achieves an overall accuracy of 96.93% across all data points, which indicates that the classifier combination for multi-state recognition has high classification accuracy and good generalization ability.
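The paper does not name the base classifier. The sketch below illustrates one plausible setup, training one binary classifier per monitoring state and combining them by highest score (a one-vs-rest scheme); the logistic regression base model is an assumption of the sketch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

STATES = ["normal", "attack_1", "attack_2", "attack_3", "attack_4"]

def train_state_classifiers(X, y):
    """One binary classifier per monitoring state (one-vs-rest)."""
    models = {}
    for s in STATES:
        clf = LogisticRegression(max_iter=1000)
        clf.fit(X, (y == s).astype(int))         # 1 if sample is in state s
        models[s] = clf
    return models

def predict_state(models, X):
    """Combine the per-state classifiers by picking the highest probability."""
    scores = np.column_stack([models[s].predict_proba(X)[:, 1] for s in STATES])
    return np.array(STATES)[scores.argmax(axis=1)]

# Accuracy check against a held-out test set:
# acc = (predict_state(models, X_test) == y_test).mean()
```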

4. Research on Legal Evidence Integration Strategy under Big Data Environment
4.1. Design of Legal Evidence Integration Service Data Platform
With the development of national politics, society, and the economy, citizens' legal awareness is increasing, and the demand for all kinds of legal services is rising with it. Most legal services are repetitive, shareable, and general-purpose in character. In today's real life, a legal service information system has important and unique social significance and is fully feasible technically. This paper uses big data technology to design and build an integrated auxiliary legal service system, which can provide basic legal services or auxiliary services for the public and for legal workers. The purpose of the system is to provide a new means of access for ordinary people and legal professionals, so as to save time for the workers concerned, improve the efficiency of legal services, and reduce the cost of litigation. The main modules of the whole system are shown in Figure 4.

The database conceptual model is the key link of conceptual design; it provides technical support for database design and can be used for legal case retrieval. The technical core of the system is data information modeling, and the most commonly used conceptual model is the E-R model. In program design, the E-R diagram is produced in the requirement analysis stage. An E-R (entity-relationship) diagram comprises entities, attributes, and relationships: an entity refers to a data object in the data model; an attribute is a property of the data object itself; and a relationship describes how the data objects are connected to each other. The structure of the case retrieval E-R diagram is shown in Figure 5.

The main task of logical database structure design is to use the E-R diagram to create the corresponding two-dimensional database tables. Generally, the logical structure of a database takes one of four basic forms: set structure, linear structure, tree structure, and graph structure. This article uses a MySQL database containing 6 associated data tables; their names and functions are shown in Table 5. A hypothetical sketch of one such table follows.
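As a purely hypothetical illustration of the logical design, a case retrieval table in MySQL might be created as follows; all column names are assumptions for illustration, not the paper's actual schema from Table 5.

```python
import mysql.connector  # assumes the mysql-connector-python package is installed

# Hypothetical DDL for a case retrieval table; columns are illustrative only.
CASE_TABLE_DDL = """
CREATE TABLE IF NOT EXISTS legal_case (
    case_id     INT AUTO_INCREMENT PRIMARY KEY,  -- unique case identifier
    case_title  VARCHAR(255) NOT NULL,           -- case name used for retrieval
    case_type   VARCHAR(64),                     -- e.g. civil, criminal, administrative
    statutes    TEXT,                            -- applicable laws and regulations
    judgment    TEXT,                            -- judgment summary
    filed_date  DATE                             -- filing date
)
"""

conn = mysql.connector.connect(host="localhost", user="root",
                               password="change_me", database="legal_service")
conn.cursor().execute(CASE_TABLE_DDL)
conn.commit()
```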
4.2. Judicial Big Data Integration and Optimization Strategy under Big Data Environment
4.2.1. Pay Attention to the Cultivation of Judicial Personnel’s Data Ability
In the process of training legal talents, more attention needs to be paid to improving their data ability. Owing to cognitive gaps and preconceptions, the public is often more willing to believe sensational news than a so-called syllogistic demonstration; statistical data, however, are intuitive and empirical and can often be more persuasive. In today's society, groups that control algorithms hold more power in the ethical dimension, but the initial purpose of cultivating data ability is not to obtain power, but to bear more responsibility for judicial justice in the new era. To make legal argument better fit the background and requirements of the times, legal proceedings should be improved and evidence made more convincing through the ability to find, analyze, organize, and apply data.
4.2.2. Development Suggestions for Judicial Big Data Development
First, we should gradually improve the judicial data research model, develop and strengthen team cooperation, and increase the proportion of legal data research talents. Problems exposed by judicial big data at the technical level will greatly affect the accuracy and reliability of the technical services ultimately provided by the judicial big data auxiliary system. The deficiencies of big data technology in the judicial field therefore reflect current social expectations for compound talents and judicial data technology talents.
Second, we should establish a judicial database and develop it in an integrated way. Because the quality of the sample determines the accuracy of the data model's final predictions, defects such as disputed arbitral decisions and arbitrators' own shortcomings should be excluded; in curating the data, appealed and nonrepresentative cases should be minimized.
Third, we should pay attention to cultivating compound talents. In the past, research on judicial big data was mainly carried out by technical researchers, but the obstacles such researchers encounter with legal theories and methods have become difficulties in promoting judicial big data technology. Therefore, it is necessary to cultivate more compound talents and increase the proportion of legal talents in the field of judicial big data.
The core of judicial data technology is the algorithm itself, and the core of disputes over legal behavior is also the algorithm. The judiciary has authority because it has rules that cannot be ignored or despised and conducts fair trials in accordance with existing rules. Even though an algorithm can extract some regularities through a thorough understanding of judicial data, as the jurist Kirchmann said, "the change of legislators can turn all documents into waste paper"; that is, the rule of law is the rule of rules rather than the rule of regularities. Therefore, different arbitrators may reach completely different arbitration results.
Today, China's judicial big data technology exists only as an auxiliary technology, and judgment and sentencing suggestions produced by the above methods may directly affect a judge's subjective judgment. In addition, because of the discriminatory and opaque characteristics of algorithms themselves, it is particularly important to finalize laws and regulations governing algorithmic procedures as soon as possible. Therefore, the author suggests the following points:

(1) Improve the legal regulation system against discrimination in judicial big data algorithms: in order to avoid otherwise inevitable discrimination in judicial data sets, general legislative principles must be established first. The first is the principle of equality: all citizens are equal before the law, including the principles of gender equality and ethnic equality, which are basic principles established in China's constitution. These principles must also be enforced through fair judicial data rules. The second is the principle of openness: to avoid the opacity caused by black-box operation of judicial data and the unexplainability of algorithmic judgment conclusions, the public procedure of justice must be observed to the greatest extent.

(2) Formulate development restrictions and an accountability mechanism for judicial big data algorithms: judicial data producers and judicial data users should bear the same civil liability for harm caused by algorithmic discrimination. Where an injured party requires the algorithm producer or the judicial user of the data algorithm (including the court) to interpret the relevant algorithms, explaining the corresponding algorithmic rules and responsibility mechanism is essential to protect the victim's right to relief.

(3) Establish a regulatory mechanism for judicial big data applications: before a large amount of judicial data is applied to trials, the algorithm should be tested. Testing and predictive evaluation of judicial big data algorithms should be carried out jointly by courts, citizens, and development companies. Through such multi-dimensional monitoring, the accuracy of the algorithms can be improved and the fairness and openness of trials ensured to the greatest extent.
Although new technologies improve production efficiency, they are accompanied by new security challenges; this is inevitable in the era of big data. The urgent demand for judicial data in China's judicial field confirms the importance of judicial big data. In-depth research on judicial big data is imminent, and what it needs is the disclosure of large amounts of judicial data. The disclosure of judicial data is the basis of judicial research and judicial data development, since it provides data sources and research samples. The danger is that, through the exchange of large amounts of judicial data, state secrets, personal privacy, data sovereignty, and even the release of judicial information may be implicated, affecting the fairness and credibility of trials. Therefore, it is necessary to establish a self-disclosure management system for judicial data, restrict private data, anonymize some judicial data, improve anti-crawling countermeasures and protection mechanisms, and protect the data.
5. Conclusion
This paper summarizes the research status of mobile cloud computing, cloud computing alliances, and big data at home and abroad. Through analysis of the causes, connotation, and characteristics of the emergence and development of the mobile cloud computing alliance, and of the organizational forms, characteristics, and differences of its data sources, this paper sets out the evolution theory underlying the data-source-based integration mechanism of the mobile cloud computing alliance. On this basis, the data monitoring and analysis integration mechanism of the mobile cloud computing alliance is designed, including the cloud acquisition and storage mechanism for alliance data resources and the parallel grouping mechanism. Then, taking a mobile cloud computing alliance as an example, the data integration mechanism is verified. Today, China's judicial big data technology is still at the entry stage, and legal artificial intelligence is still at the weak-AI stage. Nevertheless, it is very necessary to prepare countermeasures in advance for problems that will arise in the future development of judicial data technology. In the future, the judicial field, and indeed the whole legal field, will face more challenges brought by scientific and technological progress, and the correct way is to meet these challenges and actively find strategies to deal with them. What the judiciary and academia need to do is to jointly introduce industry restrictions, maintain industry rules in the process of program design through the joint participation of academic and judicial personnel, and complete industry self-discipline norms for judicial big data, so as to improve the management and development of the legal auxiliary service system.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.