Abstract

Cloud computing has played a strong role in promoting the accounting informatization of small- and medium-sized enterprises (SMEs) and helps accelerate its construction. To improve the effect of accounting informatization construction in SMEs, this paper proposes an improved ISCA accounting informatization model based on cloud computing, together with a semisupervised clustering method based on the minimum prototype cluster to classify accounting information. The method uses labeled samples to measure the compactness and purity of clusters and to guide cluster splitting, and a corresponding intelligent model structure is constructed. The research shows that the improved ISCA model has a clear positive effect on the application of accounting informatization in small- and medium-sized enterprises in the cloud computing environment.

1. Introduction

With the rapid development of information technology, information management plays an increasingly important role in modern enterprise management and is irreplaceable in improving management and strengthening core competitiveness. As an important part of informatization management, accounting informatization must be improved as well. Given the proportion of SMEs in the total number of enterprises and their role in promoting economic and social development, improving their accounting informatization level is of great practical significance for realizing informatization management. As a new form of business service, cloud computing provides customers with IT-related resources and services on a leasing basis, which enhances the availability of resources. Its advantages include lower cost, comprehensive utilization of resources, convenient deployment, and flexible use. Given the company size and business characteristics of small- and medium-sized enterprises, cloud computing can provide convenient and efficient financial information management and overcome the inefficiency of traditional financial management. With the rapid development of the Internet, the construction of accounting informatization needs to be combined organically with the development of cloud computing and to make full use of its advantages, so as to further promote the healthy and sustainable development of accounting informatization in small- and medium-sized enterprises.

With the in-depth development of the Internet industry, cloud computing offers a new service model for the problems faced by small- and medium-sized enterprises. It better meets current requirements on daily maintenance costs as well as customers' personalized and differentiated requirements for accounting information construction. As an innovative form of information construction, cloud computing helps connect the departments of an enterprise during the processing of massive amounts of information and is particularly effective against information islands. Realizing cloud-based accounting informatization in small- and medium-sized enterprises can therefore improve enterprise value. Relying on the rapid development of cloud computing, enterprises can carry out accounting information management easily and effectively, which helps them improve accounting work, collaborate online, and resolve existing problems. Through in-depth research on the development trends of accounting informatization and cloud computing, this paper proposes feasible development measures from a multidisciplinary perspective, provides impetus and ideas for the construction of corporate accounting informatization, and offers a reference for enterprises establishing new information management systems. This has important practical significance for enterprise accounting informatization under the new situation.

Owing to high system construction and maintenance costs, a lack of attention from management, and a shortage of professional talent, accounting informatization in small- and medium-sized enterprises has progressed relatively slowly. Most small- and medium-sized enterprises are in the start-up or growth stage, and many face shortages of funds and talent. In addition, management pays more attention to the input-output ratio and prefers to invest limited funds in actual business. Therefore, this paper takes the cloud accounting model of accounting informatization for small- and medium-sized enterprises as its core and aims to help these enterprises construct and apply cloud accounting, thereby reducing their accounting informatization construction costs, improving management efficiency and market competitiveness, and promoting their economic development.

This paper combines the improved ISCA model to study the application and promotion effect of accounting informatization in small- and medium-sized enterprises in the cloud computing environment and promote the construction process of accounting informatization in small- and medium-sized enterprises.

Since cloud computing is the basis of cloud accounting and is a relatively unfamiliar field, we first review its concept, classification, characteristics, open issues, and proposed solutions. Literature [1] introduces the concepts, 5 characteristics, 3 service types, and 4 deployment modes of cloud computing, and describes in detail the key technologies used in cloud computing and the current status of domestic applications. Its coverage of the basics is relatively comprehensive, which makes it a good introduction to cloud computing. Literature [2] analyzes the strategic significance and development status of cloud computing and puts forward countermeasures to promote its development. Literature [3] analyzes the security problems that arise when applying cloud computing and the corresponding technical countermeasures; its limitation is that it proposes countermeasures only at the technical level, which is not comprehensive enough. Literature [4] expounds the security risks of cloud computing and puts forward security strategies covering technology, management, law, and security standards, which is a more comprehensive perspective. At present, the lack of unified standards greatly hinders the development of cloud computing. Literature [5] analyzes the research status of international and domestic cloud computing standards and the progress of standardization and identifies three major problems in existing cloud computing standards research, but proposes no corresponding solutions.

The concept of cloud accounting has appeared only in recent years, but many scholars have published papers on it. Literature [6] gives a comprehensive and accurate description of the meaning of cloud accounting, introduces its origin, basic architecture, advantages, operation mechanism, and security mechanism, and summarizes the problems that cloud accounting still needs to solve. This book provides a systematic understanding of cloud accounting. Literature [7] discusses the data security and standards problems of cloud accounting and their solutions, and its treatment is more in-depth. Literature [8] analyzes the problems encountered in the development of cloud accounting from five aspects and proposes corresponding solutions, but the content is relatively simple. Literature [9] expounds the advantages and dilemmas of cloud accounting but does not propose solutions. Literature [10] studies the environment of accounting informatization from multiple perspectives, argues from an environmental perspective that cloud computing is an inevitable trend of accounting informatization, and proposes development strategies. Literature [11] introduces and discusses the SaaS, PaaS, and IaaS models of cloud accounting, but owing to space limitations its study of the SaaS model, which is the most widely used by small- and medium-sized enterprises, is not deep enough, and its comparison of the cloud computing model with traditional informatization is not comprehensive enough. Literature [12] analyzes the strategies and safeguards for small- and medium-sized enterprises building cloud accounting, based on cases of such enterprises applying cloud accounting. The case enterprises were still at the computerized accounting stage before implementing cloud accounting, and the study argues that cloud accounting is more advantageous than computerized accounting in order to justify the case company's direct conversion from computerized accounting to cloud accounting. Literature [13] analyzes four problems in the current informatization development of small- and medium-sized enterprises and puts forward corresponding countermeasures. Although the analysis is simple, it generalizes well and is comprehensive. Literature [14] also analyzes the problems and countermeasures of accounting informatization in small- and medium-sized enterprises, but it does so across the five stages of planning, construction, operation, maintenance, and evaluation, and its analysis is more in-depth.

Literature [15] expounds the concept of cloud computing and regards it as an Internet-based service model that provides users with infrastructure resources such as software and hardware, for which users only need to pay a small amount of rent. Literature [16] holds that privacy protection is an important issue in cloud computing, mainly involving legal compliance and user trust, and proposes methods to protect user privacy. Literature [17] argues that the nature of cloud computing gives rise to security problems and proposes corresponding solutions. It also analyzes the security vulnerabilities of mobile cloud computing and discusses future research directions. Literature [18] regards cloud computing as an advanced concept and technology that promotes the development of enterprises through high-quality services, and studies its definition, advantages and disadvantages, and cloud decision-making models. Literature [19] summarizes the advantages of cloud accounting by comparing it with traditional accounting informatization. Literature [20] compares cloud accounting with traditional accounting informatization to highlight its technical and development advantages and mentions new problems that may arise in applying cloud accounting, but does not propose corresponding solutions.

3. Semisupervised Clustering Method for Image Scenes Based on Minimal Prototype Clusters

In the classification of accounting information image scenes, the scenes are complex and the constituent elements of different scenes differ greatly. Moreover, each scene category exhibits intraclass differences and interclass similarities. In practice, data samples assigned to the same cluster may carry different true labels; that is, the clusters are not pure enough. The main reason is that the sample data are poorly separated by low-density regions and the clusters are not compact enough, so the decision boundary cannot cleanly divide the dense and sparse areas of the sample space. We therefore split these impure, loosely packed clusters so that the resulting clusters are purer and more compact, which makes the results of semisupervised clustering more accurate. This chapter proposes the concept of the “minimum prototype cluster”: clusters that are not pure and compact are split until the labeled data in every cluster share the same label, so that the final clustering result is pure. The pure and compact clusters finally obtained are called “minimum prototype clusters.”

The basic steps of splitting impure clusters to obtain minimum prototype clusters are shown in Figure 1. We assume that a set of sample data contains two different accounting information image scenes. In Figure 1(a), there are two classes, class 1 (blue) and class 2 (yellow), and each class contains both labeled and unlabeled data. Traditional clustering yields the result in Figure 1(b), in which the data samples are simply divided into two circles, cluster 1 and cluster 2. According to the labeled sample data, both cluster 1 and cluster 2 contain more than one kind of label; that is, both are impure clusters, and the resulting classification is inconsistent with the correct division. In Figure 1(c), we therefore use the labeled sample data to split cluster 1 and cluster 2 further into cluster 1, cluster 2, cluster 3, and cluster 4. In other words, we split the two original impure clusters into four minimum prototype clusters and then assign these four clusters to class 1 and class 2 according to the labeled samples. The final classification result is fully consistent with the correct classification.
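The purity check behind Figure 1 can be illustrated with a minimal sketch. It assumes that a cluster counts as impure as soon as its labeled members carry more than one label; the function names and the toy data are hypothetical and are not taken from the paper. Clusters without any labeled sample are simply skipped.

```python
from collections import Counter

def cluster_purity(assignments, labels):
    """Purity of each cluster, computed from its labeled members only.

    assignments: cluster id of every sample
    labels: class label of every sample, or None if the sample is unlabeled
    """
    counts = {}
    for cid, lab in zip(assignments, labels):
        if lab is not None:
            counts.setdefault(cid, Counter())[lab] += 1
    # purity = share of the most frequent label among the labeled members
    return {cid: max(c.values()) / sum(c.values()) for cid, c in counts.items()}

def impure_clusters(assignments, labels):
    """Ids of clusters whose labeled members carry more than one label."""
    purity = cluster_purity(assignments, labels)
    return sorted(cid for cid, p in purity.items() if p < 1.0)

# Toy data in the spirit of Figure 1: two classes, partly labeled.
labels = [1, None, 2, None, 1, 2, None, None]
before = [0, 0, 0, 0, 1, 1, 1, 1]   # two clusters, as in Figure 1(b)
after = [0, 0, 2, 2, 1, 3, 3, 1]    # four clusters, as in Figure 1(c)
print(impure_clusters(before, labels))  # [0, 1]
print(impure_clusters(after, labels))   # []
```

Both original clusters mix labels 1 and 2, so both are reported as impure; after the split every cluster holds a single label.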

We assume that is the sample space defined on an unknown distribution, is the labeled data sample set, is the unlabeled data sample set, is the class set, is the cluster set, and is the label set of the cluster set, and all clusters in have label . is the set of class label vectors for , and is the set of label vectors for . In order to combine labeled and unlabeled datasets, the objective function for defining semisupervised clustering is shown in where is a weight parameter and is the model of the objective function. The first item is used to measure the density of the cluster , the definition of is shown in formula (2), and the second item is the unsupervised clustering loss function. where and , respectively, represent the total number of labeled data samples in cluster and the number of correctly predicted labeled data samples, and obviously . is the number of labeled samples; is the number of samples in cluster with labels different from those in , obviously . For each cluster, is correlated with and negatively correlated with . Since and , it can be deduced that when , is the largest, and when , is the smallest; that is, is a convex function with a maximum value. At the same time, and can also be derived. Because of , formula (1) converges when approaches infinitely; that is, .

Considering that the two terms in formula (1) may belong to different scales, we normalize the two terms in the th iteration to the range [0,1], as in where and represent the values of and in the th iteration, and the final objective function of the th iteration is

The flowchart of the semisupervised clustering method of accounting information image scenes based on the minimum prototype cluster is shown in Figure 2. First, the algorithm inputs the dataset and initializes the parameters. Next, the algorithm performs clustering and determines whether there are impure clusters in the clustering results. If there are impure clusters, they are split. After splitting, the algorithm updates the model parameters, generates new clusters, and determines whether each newly generated cluster is a bad cluster. If it is, the algorithm deletes the bad cluster and reclusters. If it is not, the algorithm continues clustering until there are no impure clusters or the value of the objective function decreases. At this point, the set of minimum prototype clusters, that is, the optimal cluster set, is obtained, and the process ends. The splitting method and the criterion for judging bad clusters are explained in detail in this section.

Some newly generated clusters may degrade the performance of the entire semisupervised clustering algorithm because of noise, outliers, and similar factors. Therefore, the newly generated clusters need to be evaluated against a criterion. This paper uses the prediction accuracy on labeled data as that criterion: within each newly generated cluster, the labeled data samples are used to calculate the classification accuracy of the cluster. When the accuracy of a new cluster is lower than the average accuracy of all clusters, the new cluster is called a “bad cluster” and must be deleted. Deletion here does not remove the sample data; instead, a new cluster centroid is reassigned for the bad cluster.
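The bad-cluster test described above can be sketched as follows. The data layout, the function names, and the choice to give a cluster without labeled members an accuracy of 0 are assumptions made for illustration only.

```python
def cluster_accuracy(cluster_label, member_labels):
    """Share of labeled members whose label matches the cluster label."""
    if not member_labels:
        return 0.0  # assumption: no labeled evidence counts as accuracy 0
    hits = sum(1 for y in member_labels if y == cluster_label)
    return hits / len(member_labels)

def find_bad_clusters(clusters, new_ids):
    """Return the new clusters whose accuracy is below the mean accuracy.

    clusters: {cluster_id: (cluster_label, labels of its labeled members)}
    new_ids: ids of the clusters created by the latest split
    """
    acc = {cid: cluster_accuracy(lab, ys) for cid, (lab, ys) in clusters.items()}
    mean_acc = sum(acc.values()) / len(acc)
    return [cid for cid in new_ids if acc[cid] < mean_acc]

clusters = {
    0: (1, [1, 1, 1]),
    1: (2, [2, 2]),
    2: (1, [1, 2, 2, 1]),  # new cluster, accuracy 0.5
    3: (2, [2]),           # new cluster, accuracy 1.0
}
print(find_bad_clusters(clusters, new_ids=[2, 3]))  # [2]
```

In the example the mean accuracy is 0.875, so only cluster 2 falls below it and would receive a new centroid.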

Semisupervised clustering algorithms based on minimal prototype clusters can be extended to most clustering algorithms, but they are especially suitable for partition-based clustering algorithms such as k-means. It is well known that the number of clusters k is a key parameter in k-means. However, existing clustering methods do not fully use label information to find the optimal value of k. Therefore, we use a semisupervised clustering algorithm based on minimal prototype clusters to address this problem.

We set as the centroid, where each centroid is defined in where is the number of in cluster . The original loss function definition of is shown in where . The objective function of the semisupervised -means algorithm based on the smallest prototype cluster is redefined using the semisupervised clustering algorithm based on the smallest prototype cluster, as shown in
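For orientation, the two standard k-means quantities named in this paragraph, the cluster centroid (the mean of the cluster members) and the sum of squared errors (SSE), can be computed as in the sketch below. This is the textbook k-means definition, not the paper's semisupervised objective, and the helper names are hypothetical.

```python
def centroid(points):
    """Mean of the points in one cluster, coordinate by coordinate."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))

def sse(clusters):
    """Sum of squared Euclidean distances of all points to their centroid."""
    total = 0.0
    for points in clusters:
        c = centroid(points)
        for p in points:
            total += sum((pi - ci) ** 2 for pi, ci in zip(p, c))
    return total

clusters = [
    [(0.0, 0.0), (2.0, 0.0)],  # centroid (1.0, 0.0)
    [(5.0, 5.0), (5.0, 7.0)],  # centroid (5.0, 6.0)
]
print(centroid(clusters[0]))  # (1.0, 0.0)
print(sse(clusters))          # 4.0
```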

Considering that the two terms in formula (8) may belong to different scales, we normalize the two terms in the th iteration to the range [0,1], as in where and represent the values of I and SSE in the th iteration, and the final objective function of the th iteration is rewritten as

See Algorithm 3 for the pseudocode of the semisupervised k-means algorithm based on minimal prototype clusters. First, the algorithm inputs the relevant data and parameters. In step 1, the algorithm initializes the empty set with the labeled data sample set. In step 2, the algorithm uses the labeled sample set to split impure clusters based on the objective function, delete bad clusters, and find the optimal number of clusters. Finally, in step 3, the algorithm iteratively updates the cluster centroids according to the optimal number of clusters until the centroids no longer change or the value of the objective function decreases, and the final optimal cluster set is obtained.
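The splitting in step 2 can be pictured with the sketch below. The splitting rule used here is an assumption, since the paper's exact rule is not reproduced in the text: one sub-cluster is created per label found among the labeled members, seeded at the mean of those labeled points, and each unlabeled member joins the nearest seed. All names are hypothetical.

```python
import math

def mean_point(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def split_impure_cluster(points, labels):
    """Split one cluster into one sub-cluster per label seen in it.

    Assumed rule: labeled points stay with their own label; unlabeled
    points (label None) go to the nearest seed centroid.
    """
    labeled = {}
    for p, y in zip(points, labels):
        if y is not None:
            labeled.setdefault(y, []).append(p)
    seeds = {y: mean_point(ps) for y, ps in labeled.items()}
    parts = {y: [] for y in seeds}
    for p, y in zip(points, labels):
        if y is None:
            y = min(seeds, key=lambda c: math.dist(p, seeds[c]))
        parts[y].append(p)
    return parts

points = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.2), (9.0, 9.0), (10.0, 9.0), (9.5, 9.1)]
labels = [1, None, None, 2, None, None]
for label, members in split_impure_cluster(points, labels).items():
    print(label, members)
```

With this toy input the impure cluster is divided into two sub-clusters of three points each, one around the sample labeled 1 and one around the sample labeled 2.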

Figure 3 shows the framework of the method. First, the algorithm trains a semisupervised clustering model on the original dataset and obtains the confidence that each unlabeled data sample belongs to each class. Then, taking the distribution of the original data into account, the algorithm selects data samples with high confidence and adds them to the pseudolabel dataset. If the sample set is still unbalanced at this point, that is, the minority classes still have too few samples, the algorithm applies oversampling to obtain an oversampled dataset. The algorithm then combines the pseudolabel dataset, the original dataset, and the oversampled dataset into a new training dataset and uses it to train a classifier that outputs the predicted probability of each unlabeled data sample belonging to each class. Finally, the algorithm combines the confidence predictions of the semisupervised clustering with the predicted probabilities of the classification method to obtain the final classification result.

The objective function of the semisupervised k-means algorithm based on the minimum prototype cluster is shown in formula (11).

The minimal prototype cluster-based semisupervised k-means algorithm splits impure clusters using the labeled sample data, with the proposed objective function guiding the clustering and splitting process. This method not only respects the label information of the labeled data as far as possible, but also alleviates the difficulty of choosing the number of clusters k in k-means and steers the clustering toward the optimum.

In order to filter out the pseudolabel data samples from the unlabeled samples, the confidence that the sample belongs to the class is defined, and the definition is shown in where is the centroid of , is the label of cluster , is the class label, , and represent the distance from the sample x to the centroid of cluster . In this chapter, this confidence formula is used as one of the reference standards for short selection of pseudolabel samples. The higher the confidence, the more likely the sample is selected to be added to the pseudolabel sample set. At the same time, according to the cluster centroids obtained in the semisupervised clustering method, each sample in is assigned to the nearest cluster using the distance formula (6), and the proportion of unlabeled samples in in each cluster is obtained. For each cluster i, we select data samples with the highest confidence in and put them into the pseudolabel dataset , where the value of depends on the specific conditions of the dataset, thus completing the construction of the pseudolabel dataset.
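A possible shape of this selection step is sketched below. The confidence used here is a stand-in, because the paper's confidence formula is not reproduced in the text: each cluster receives a weight exp(-distance) from the sample to its centroid, and the normalized weights are summed per class label. The sketch also picks the top samples globally rather than per cluster, and every name in it is hypothetical.

```python
import math

def class_confidence(x, centroids, cluster_labels):
    """Assumed confidence: normalized exp(-distance), summed per class."""
    weights = [math.exp(-math.dist(x, c)) for c in centroids]
    total = sum(weights)
    conf = {}
    for w, lab in zip(weights, cluster_labels):
        conf[lab] = conf.get(lab, 0.0) + w / total
    return conf

def select_pseudolabels(unlabeled, centroids, cluster_labels, top_m):
    """Keep the top_m unlabeled samples with the highest confidence."""
    scored = []
    for x in unlabeled:
        conf = class_confidence(x, centroids, cluster_labels)
        best = max(conf, key=conf.get)
        scored.append((conf[best], x, best))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(x, lab) for _, x, lab in scored[:top_m]]

centroids = [(0.0, 0.0), (10.0, 10.0)]
cluster_labels = [1, 2]
unlabeled = [(0.1, 0.0), (5.0, 5.0), (9.9, 10.0)]
print(select_pseudolabels(unlabeled, centroids, cluster_labels, top_m=2))
```

The sample at (5.0, 5.0) is equally far from both centroids, so its confidence is 0.5 for either class and it is left out of the pseudolabel set.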

Before constructing an oversampling dataset, it is first necessary to determine whether the samples of each category in the new dataset composed of the original dataset and the pseudolabel dataset are balanced, that is, whether there are minority categories. Therefore, it is necessary to define a criterion for judging whether the number of samples in the th class is too small. The definition of this criterion is shown in where is the number of samples in the th class in the dataset and is the number of classes. When satisfies formula (13), that is, when is much smaller than the average number of samples in each category in the dataset , the class needs to be oversampled to supplement the training set, until the number of samples in any class does not satisfy formula (13).
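The minority-class check can be sketched as below. The text only states that a class qualifies when its size is much smaller than the average class size, so the concrete rule used here (size below `ratio` times the average, with `ratio` = 0.5) is an assumption, as are the names.

```python
from collections import Counter

def minority_classes(labels, ratio=0.5):
    """Classes whose sample count is below ratio * average class size.

    ratio is a hypothetical threshold, not a value taken from the paper.
    """
    counts = Counter(labels)
    mean_size = len(labels) / len(counts)
    return sorted(c for c, n in counts.items() if n < ratio * mean_size)

labels = [0] * 50 + [1] * 40 + [2] * 6
print(minority_classes(labels))  # [2]
```

Here the average class size is 32, so the threshold is 16 and only class 2 with 6 samples would be oversampled.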

This paper uses the synthetic minority oversampling technique (SMOTE) to artificially synthesize some new minority samples and then completes the construction of the oversampling dataset.

The SMOTE algorithm is defined in formula (14). For each minority class sample $x_i$, its synthesis strategy is to randomly select a sample $\hat{x}_i$ from the nearest neighbors of $x_i$ and then randomly select a point on the line segment between $x_i$ and $\hat{x}_i$ as the newly synthesized minority class sample $x_{new}$:

$$x_{new} = x_i + \mathrm{rand}(0,1) \times (\hat{x}_i - x_i) \tag{14}$$

The principle of the SMOTE algorithm is shown in Figure 4. The blue dots are the majority class samples, the green solid stars are the minority class samples, and the green hollow star is the newly synthesized minority class sample $x_{new}$.
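The interpolation step of SMOTE can be written in a few lines. The sketch below follows the standard SMOTE recipe (pick one of the k nearest minority neighbors, then a random point on the connecting segment); the function name, the parameter `k`, and the toy data are illustrative assumptions.

```python
import math
import random

def smote_sample(x, minority, k=5, rng=random):
    """Synthesize one new minority sample from x.

    x: a minority sample (tuple of floats)
    minority: all minority samples (list of tuples)
    k: number of nearest neighbors to choose from
    rng: the random module or a random.Random instance
    """
    others = [p for p in minority if p != x]
    neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
    x_hat = rng.choice(neighbours)
    gap = rng.random()  # uniform in [0, 1)
    return tuple(a + gap * (b - a) for a, b in zip(x, x_hat))

minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
new_point = smote_sample((0.0, 0.0), minority, k=2, rng=random.Random(0))
print(new_point)
```

With k = 2 the distant point (10.0, 10.0) is never chosen, so the synthesized sample lies on one of the two short segments starting at the origin.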

The basic form of the SVM is a linear classifier with the maximum margin in the feature space. Its basic idea is to find the separating hyperplane that divides the samples correctly and has the largest geometric margin. The separating hyperplane is given by formula (15):

$$w^{T}x + b = 0 \tag{15}$$

where $w$ is the normal vector, which determines the direction of the hyperplane, and $b$ is the displacement term, which determines the distance between the hyperplane and the origin. The separating hyperplane is therefore determined by $w$ and $b$. The distance from any point $x$ in the sample space to the hyperplane $(w, b)$ is given by formula (16):

$$r = \frac{|w^{T}x + b|}{\|w\|} \tag{16}$$
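As a small numerical illustration, the point-to-hyperplane distance |w·x + b| / ||w|| and the resulting hard margin 2 / ||w|| can be evaluated as follows. These are the standard SVM expressions; the function names and the example values are chosen here only for illustration.

```python
import math

def hyperplane_distance(w, b, x):
    """Distance from point x to the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(score) / math.sqrt(sum(wi * wi for wi in w))

def margin_width(w):
    """Width of the hard margin, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

w, b = (3.0, 4.0), -5.0
print(hyperplane_distance(w, b, (3.0, 4.0)))  # 4.0
print(margin_width(w))                        # 0.4
```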

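If the hyperplane $(w, b)$ classifies the training samples correctly, then every training sample $(x_i, y_i)$ with $y_i \in \{+1, -1\}$ satisfies

$$\begin{cases} w^{T}x_i + b \ge +1, & y_i = +1 \\ w^{T}x_i + b \le -1, & y_i = -1 \end{cases} \tag{17}$$

As shown in Figure 5, the sample points circled in the figure are the sample points closest to the hyperplane. These sample points are called support vectors, and equality holds for them in formula (17).

The sum of the distances from two support vectors of different classes to the hyperplane is the margin, given by formula (18):

$$\gamma = \frac{2}{\|w\|} \tag{18}$$

Finding the separating hyperplane with the maximum margin means finding the parameters $w$ and $b$ that satisfy the constraints in formula (19) such that $\gamma$ is the largest:

$$\max_{w,b} \frac{2}{\|w\|} \quad \text{s.t.} \quad y_i(w^{T}x_i + b) \ge 1, \; i = 1, 2, \ldots, m \tag{19}$$

Clearly, to maximize the margin it suffices to maximize $\|w\|^{-1}$, which is equivalent to minimizing $\|w\|^{2}$. Therefore, formula (19) can be rewritten as

$$\min_{w,b} \frac{1}{2}\|w\|^{2} \quad \text{s.t.} \quad y_i(w^{T}x_i + b) \ge 1, \; i = 1, 2, \ldots, m \tag{20}$$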

The basic form of the SVM requires all samples to be classified correctly. However, this requirement is often difficult to meet in practice, and it is also difficult to determine an appropriate kernel function that makes the training samples linearly separable in the feature space. Even if such a kernel function is found, it is hard to tell whether the result is caused by overfitting. Therefore, to relax the strict requirement on classification accuracy and allow some samples to be misclassified, the concept of the soft margin is introduced. The principle is shown in Figure 6.

In Figure 6, the red circles are the samples that do not satisfy the constraints. Using the concept of the soft margin, with the slack variable serving as a surrogate loss, the SVM objective function is given by formula (21):

$$\min_{w,b,\xi} \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{m}\xi_i \quad \text{s.t.} \quad y_i(w^{T}x_i + b) \ge 1 - \xi_i, \; \xi_i \ge 0, \; i = 1, 2, \ldots, m \tag{21}$$

Here, $w$ and $b$ are the parameters of the SVM decision hyperplane, $\xi_i$ is the slack variable, and $C$ is the given penalty parameter.
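To make the role of the slack variables concrete, the sketch below evaluates the standard soft-margin objective for a fixed hyperplane, taking each slack as the smallest value that satisfies its constraint, max(0, 1 - y(w·x + b)). It only evaluates the objective and does not train an SVM; the names and data are illustrative assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def soft_margin_objective(w, b, X, y, C):
    """Return (objective value, slack list) for a fixed hyperplane (w, b).

    X: list of feature tuples; y: list of labels in {+1, -1}
    C: penalty parameter weighting the total slack
    """
    slack = [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]
    return 0.5 * dot(w, w) + C * sum(slack), slack

X = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
y = [1, -1, 1]
value, slack = soft_margin_objective((1.0, 0.0), 0.0, X, y, C=1.0)
print(slack)  # [0.0, 0.0, 0.5]
print(value)  # 1.0
```

Only the third sample lies inside the margin, so it is the only one with nonzero slack; a larger C would penalize it more heavily.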

The SVM is designed for binary classification tasks. For multiclass tasks, the one-versus-one method is used here. For a $k$-class problem, the SVM multiclassification method trains $k(k-1)/2$ classifiers; that is, one classifier is trained for every pair of categories. After that, the voting method is used to obtain the predicted probability that the sample belongs to the category , is a positive integer, and the definition of is shown in where is a set of classifiers with a number of . In the classifier , if a certain sample belongs to class a, then, , and if a certain sample belongs to class b, then, .
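One-versus-one voting can be sketched as below. Each pairwise classifier casts one vote, and a class's score is its share of the k(k-1)/2 votes; treating that share as the predicted probability is an assumption here, since the paper's exact formula is not reproduced in the text. The toy pairwise classifiers simply pick the nearer of two one-dimensional class centers and stand in for trained SVMs.

```python
from collections import Counter
from itertools import combinations

def ovo_vote(x, classes, pairwise):
    """Vote share per class from one-versus-one classifiers.

    pairwise: {(a, b): f}, where f(x) returns either a or b
    """
    pairs = list(combinations(classes, 2))
    votes = Counter(pairwise[pair](x) for pair in pairs)
    return {c: votes[c] / len(pairs) for c in classes}

# Hypothetical stand-ins for trained binary SVMs.
centers = {0: 0.0, 1: 5.0, 2: 10.0}
classes = [0, 1, 2]
pairwise = {
    (a, b): (lambda x, a=a, b=b: a if abs(x - centers[a]) <= abs(x - centers[b]) else b)
    for a, b in combinations(classes, 2)
}
shares = ovo_vote(4.0, classes, pairwise)
print(shares)
print(max(shares, key=shares.get))  # 1
```

For x = 4.0, class 1 wins two of the three pairwise contests and class 0 wins one, so class 1 receives the largest share.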

On the basis of obtaining the prediction results of semisupervised clustering and SVM classification, the final prediction results are obtained by combining the prediction probability of semisupervised clustering and SVM classification. The combination method is shown in where is the predicted probability result of the semisupervised clustering method, which is calculated according to formula (12). is the predicted probability result of the SVM multi-classification method, calculated according to formula (22). In formula (23), the weight parameter is used to adjust the influence of semisupervised clustering method and SVM classification method on the final prediction, and the value of is determined according to experience or cross-validation.
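The combination step can be illustrated with the sketch below. It assumes a convex combination, alpha times the clustering probability plus (1 - alpha) times the SVM probability, which is one plausible reading of a single weight parameter; the paper's exact formula (23) is not reproduced in the text, and the names are hypothetical.

```python
def combine_predictions(p_cluster, p_svm, alpha):
    """Blend two per-class probability dicts and return (label, scores).

    alpha weights the semisupervised clustering result and (1 - alpha)
    weights the SVM result (assumed form of the combination).
    """
    classes = set(p_cluster) | set(p_svm)
    scores = {
        c: alpha * p_cluster.get(c, 0.0) + (1.0 - alpha) * p_svm.get(c, 0.0)
        for c in classes
    }
    return max(scores, key=scores.get), scores

p_cluster = {1: 0.7, 2: 0.3}
p_svm = {1: 0.4, 2: 0.6}
print(combine_predictions(p_cluster, p_svm, alpha=0.5)[0])  # 1
print(combine_predictions(p_cluster, p_svm, alpha=0.2)[0])  # 2
```

The example shows why the weight matters: with equal weights the clustering result decides the label, whereas a small alpha lets the SVM prediction dominate.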

4. The Improvement Effect of the Improved ISCA Model on the Application of Accounting Informatization in Small- and Medium-Sized Enterprises in the Cloud Computing Environment

The connotation of accounting computerization needs to be refined and clarified. To this end, the concept of accounting management informatization (accounting informatization for short) is proposed, and its connotation is captured by the ISCA model, which comprises three aspects. The first is to establish and implement the accounting information system in the environment of modern information technology or computer technology. The second is to establish an effective and sound internal control system for the information system. The third is to audit the information system. The organic combination of the three constitutes the ISCA model of the AIS (accounting information system), as shown in Figure 7(a). The internal control system under the information system is shown in Figure 7(b).

Cloud computing is a model that enables on-demand network access to a shared pool of configurable computing resources, such as networks, servers, storage, applications, and services. These resources can be rapidly provisioned and released with minimal management effort or service provider interaction. Cloud computing has five essential characteristics, three service models, and four deployment models. The NIST cloud computing architecture is shown in Figure 8.

The construction mode of cloud accounting for SMEs is shown in Figure 9.

On the basis of the above research, the improvement effect of the improved ISCA model proposed in this paper on the application of accounting informatization in small- and medium-sized enterprises in the cloud computing environment is evaluated, and the results shown in Table 1 and Figure 10 are obtained.

From the above research, it can be seen that the improved ISCA model proposed in this paper has a very obvious effect on the improvement of application of accounting informatization in small- and medium-sized enterprises in the cloud computing environment.

5. Conclusion

This paper compares cloud accounting with traditional accounting informatization to highlight the advantages of cloud accounting. It analyzes cloud accounting from the perspectives of economy, technology, and policy and argues that cloud accounting is the development direction of accounting informatization for SMEs, since it can help SMEs reduce capital investment, improve management efficiency, and enhance market competitiveness. The paper also analyzes the existing problems of cloud accounting and the corresponding solutions, in order to promote the healthy development of cloud accounting for small- and medium-sized enterprises. In addition, it provides a typical cloud accounting construction scheme for small- and medium-sized enterprises as a case, to offer support and reference for the construction and application of accounting informatization in these enterprises. The research shows that the improved ISCA model proposed in this paper has a clear positive effect on the application of accounting informatization in small- and medium-sized enterprises in the cloud computing environment.

Data Availability

The labeled dataset used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest

The author declares no competing interests.

Acknowledgments

This work was supported by the Zhengzhou University of Industrial Technology.