Abstract

Blockchain technology can give data asset management high security, privacy, and traceability. A comprehensive investigation of existing blockchain-based data asset management mechanisms shows that each applies to only a portion of the blockchain system design. To address this issue, a new model of data asset management based on blockchain technology is proposed, which incorporates applications at all levels of the blockchain system. The model implements a node authority control mechanism at the network layer, a consensus mechanism with customizable attributes at the consensus layer, improved data query efficiency at the data layer through structure optimization and index building, and intelligent data management at the smart contract layer. Furthermore, at the transaction layer, shared information is encrypted using customizable encryption algorithms. The experimental results show that, compared with the traditional blockchain approach, the new blockchain-based data asset management model improves the efficiency of on-chain data queries by 2.33 times.

1. Introduction

Data assets are data resources that are owned or controlled by an enterprise, can bring future benefits to the enterprise, and are recorded physically or electronically, such as documents and electronic data [1–3]. Now that we have entered the era of big data, the value of data assets is self-evident, so data asset management methods are particularly important. In 2012, the National Science Foundation and the National Institutes of Health jointly launched the “Developing Big Data Science and Engineering Core Technologies” project, which aims to extract, manage, and analyze useful information from massive data sets. This useful information is, in essence, a data asset. In 2016, Google relied on its unique advantages to collect a large amount of user information and used the data assets extracted from this information to promote the company’s revenue growth. China has long recognized the practical application value of data assets. At the “World’s First Data Asset Evaluation Model Release and Zhongguancun Data Asset Double Innovation Platform Establishment Ceremony” in April 2016, Guizhou Oriental Century Technology Co., Ltd. used data assets as a “mortgage” and received the first “data lending” payment from the Bank of Guiyang. This lending has enabled major companies to appreciate the actual value of data assets. In June 2019, the “Data Asset Management Practice White Paper (Version 4.0)”, released at the 2019 Big Data Industry Summit, pointed out that although the concept of data as an asset has been fully recognized by the industry, the management and application of data assets are still in the exploratory stage. In response to this situation, the white paper introduces the key activity steps of data asset management, aiming to better guide enterprises [4, 5].

2. Data Asset Management

2.1. Data Asset Concept Traceability

The concept of data assets was first proposed by Richard E. Peters in 1974. In that definition, data assets include government bonds, corporate bonds, and physical bonds held by individuals, enterprises, or institutions [6]. At that time, the scope of data assets was still relatively limited. In 2009, the International Data Management Association (DAMA International) [7] pointed out in the “DAMA Data Management Knowledge System Guide” that, in the information age, data is considered an essential corporate asset and every company needs to manage it effectively. By then, people had gradually recognized the wider connotations of the concept of data assets. Then, in 2013, the United States Army [8] pointed out that data assets include any entity composed of data as well as services provided by applications to read data: files, databases, documents, or web pages output by systems or applications; services that return a single record from a database; and websites that produce specific query data. People, systems, or applications can create data assets. Compared with the past, data assets now have a broader definition. More recently, information assets, digital assets, and data assets have been unified under the single term data assets, which are clearly defined as data in cyberspace that has data ownership and is a valuable, measurable, and readable resource. Data assets have the characteristics of both intangible and tangible assets, and of both current and long-term assets, and they form a new asset class [9]. At present, this concept is generally accepted by the industry [10, 11].

2.2. Data Resource Management

With the advent of the big data era, data management has received more and more attention. Current data management can be divided into three levels: data management, data resource management, and data asset management. Data is defined as symbols that record and identify objective occurrences, such as physical signs, or combinations of such signs, that record the type, status, and connection of real objects. Data management refers to the effective and efficient storage, processing, and use of data by means of computer technology [12, 13]. Traditional data management focuses on the physical management of data and pays most attention to the stored data structure and the correlation between data. This level of management regards data only as the form and carrier of information, and managing the links between data amounts to managing only the shallow information that the data embodies. The International Data Management Association defines data management as data resource management. The two concepts are similar, but data resource management focuses more on the application of data, whereas data management focuses on the data processing process. Data resource management builds on data management and is mainly used for decision support [14, 15]. Data asset management extends both data management and data resource management. Data assets must first be owned and controlled by the enterprise. In addition, not all data are data assets: only data resources that can bring future economic benefits to the enterprise can be called data assets [16].

Data asset management treats data resources as a particular form of asset and manages them according to asset management standards.

Figure 1 shows the relationship between data management, data resource management, and data asset management. Data management focuses only on the storage layer, mainly on data storage and data caching. Data resource management looks beyond the storage layer: because data is regarded as a resource, the data resource management process is more concerned with the application layer, that is, with how to use data resources to help enterprises make decisions and optimize production processes. By comparison, the connotation of data asset management is richer. Data management is transformed into asset management, and data is managed as a unique asset: it can assist enterprises or institutions in internal optimization, and it can be traded to generate income. Therefore, in addition to the storage and application layers, data asset management must also consider how to manage and share data securely as a unique asset. One current method is to use blockchain technology to give data assets the characteristics of safe sharing, tamper resistance, and traceability [17, 18].

2.3. Data Capitalization Process

Not all data resources are data assets. The process of converting data resources into data assets is called data resource capitalization. The specific method of data capitalization is shown in Figure 2. During the production process of an enterprise, product-related data resources are generated. These data resources can then be converted into data assets in two ways. In the first case, the data itself can generate value [19, 20]. For example, medical data can help doctors study a disease in depth, thereby bringing benefits to the hospital. Trading this type of data resource, provided that the trade is reasonable and lawful, is the most direct way of realizing the value of data resources. In the second case, the data itself does not produce value, but it can empower the current business. For example, an application can use data mining technology to analyze user behavior and understand users’ needs in depth. Data resources are explored to optimize production and operation methods and thus indirectly increase the income of existing products. This is an indirect way of realizing the value of data resources.

Blockchain is a new type of database in which multiple nodes that do not trust each other jointly maintain a global state. Blockchain [21–25] has the advantages of data traceability, high data security, and data tamper resistance, which can address the problems of secure data storage and of data sharing among parties that do not trust each other. It is therefore well suited to data asset management [26–30].

3.1. Safe Sharing of Data Assets

In data asset management, blockchain is currently used mainly for secure data sharing. Ahmed et al. [23] proposed a method to confirm the rights of transferred economic data assets based on blockchain technology [31–33]. The method allows the sharing platform to distribute different services across different blocks of the blockchain [34, 35] and empowers users of the sharing platform to dispose of data effectively, which breaks down “data islands” and resolves the difficulties faced in collecting and applying shared data. It thereby supports the orderly and stable development of the sharing economy and offers solutions to many of its disputes. To address the inadequate circulation of e-government data assets, which hinders the sharing and efficient use of e-government data, Chen and Xue [24] proposed a series of measures to strengthen the application of blockchain technology to government big data, such as using blockchain technology to create big data collection channels for government affairs, combining blockchain technology with government services to protect data security, and using blockchain technology to promote the progress of China’s information systems. Combining blockchain technology with government big data can effectively resolve the problem of independent, uncorrelated data and “data islands” in e-government data and help government service platforms carry out their informatization transformation.

3.2. Tamper-Proofing of Data Assets

There are also many studies on using blockchain technology to make data assets tamper-proof. Owing to the particular nature of their work, human resources and social security departments hold a large number of data assets and critical information about citizens in their systems. This requires that no link in the flow of information be lost. In human resources and social security projects, some third-party system developers or internal staff of the department use their positions to steal and tamper with information. Shahzad et al. [25] proposed the builder community blockchain system to realize the safe sharing and credible circulation of human resources and social security project data. By combining controllable blockchain technology, secure digital services, a digital resource management platform, and other elements, the system realizes the credible circulation of social security cards, personnel information, and labour contracts, as well as the safe and barrier-free sharing of human resources and social security information resources. In addition, when ocean data assets are shared across departments, data sharing methods based on traditional centralized systems face security risks: the data is easily tampered with, and it can be illegally copied and used. Ding et al. [26] developed an ocean data sharing platform based on a consortium blockchain, which realizes data sharing between ocean departments and better protects the rights and interests of owners when ocean data is used across departments. The platform prevents data from being tampered with or copied in violation of regulations, reduces the data transaction costs caused by third parties, increases the motivation of departments to share and open data on the data asset value network, and helps form a healthy ecosystem [36, 37].

3.3. Traceability of Data Assets

Some studies also use the traceability of blockchain to optimize data asset management. For example, Wang et al. [27] designed a blockchain-based food traceability scheme, which uses radio frequency identification (RFID) technology to obtain food-related data and stores the acquired data assets in the blockchain.

3.4. Intelligence of Data Assets

Intelligent management of data assets is a future focus for combining blockchain technology with data asset management. The verification, storage, and synchronization of medical data assets have always been difficult. Patients, doctors, and researchers are subject to strict restrictions when accessing and sharing medical data assets, and permission review and data verification in this process require a lot of resources and time. Zheng et al. [28] designed a blockchain-based medical data asset management model that uses smart contracts to realize safe, reliable, and automatic sharing and management of medical data assets. In addition, Gao et al. [29] used blockchain smart contracts to implement a crowdsourcing system, which can effectively distribute task assets and significantly improve work efficiency.

3.5. Privacy Protection of Data Assets

The problem of data asset privacy protection can also be addressed using blockchain technology. In traditional centralized data transaction systems, sensitive data such as power data can be copied and is easily abused. In response, Jain et al. [30] designed a platform for secure, private, and mutually trusted power data transactions, in which sharing is limited to power grid companies, power production companies, and scientific research entities. The platform uses Hyperledger Fabric, combined with the Golang-based microframework Gin, to access the underlying blockchain services. It implements the back-end business logic of the power data association and transaction system, and it lets users easily register, recharge, and cancel accounts. Zheng et al. [31] used blockchain technology and distributed hash table (DHT) storage to build a user data asset authority management system that realizes privacy protection for data assets.

4. Data Asset Management

Each existing data asset management technique applies to only one layer of the blockchain system architecture and does not fully integrate the blockchain system with data asset management. This paper therefore analyses the aforementioned approaches and proposes a new model of blockchain-based data asset management. The new model does not employ only one or a few layers of the blockchain system; instead, the network layer, consensus layer, data layer, smart contract layer, and application layer are combined, and each is optimized separately. At the network layer and the consensus layer, secure data sharing is achieved through node authority classification and a custom consensus mechanism. At the data layer, data query efficiency is improved by optimizing the storage structure and query methods (such as establishing effective indexes), traceability is guaranteed by logging data on the chain, and privacy is ensured by encrypting data. At the smart contract layer, part of the data is managed automatically by smart contract program segments. At the application layer, the security of users’ sensitive information is improved by encrypting transaction information.

The role of the network and consensus layers of the traditional blockchain is to ensure, through the consensus algorithm, that every node in the blockchain network maintains a consistent and complete copy of the data. Data stored in the blockchain can be modified only when malicious nodes control more than half of the total computing power. Hence, the blockchain supports secure data sharing. In the traditional blockchain, the network layer uses a peer-to-peer (P2P) network. The participants in the blockchain network, however, do not all have the same authority. The blockchain manager is responsible for managing the entire blockchain and is suitable as a super node with the highest authority. The data asset owner and the party with whom a data asset is shared only have access rights to the data assets they participate in, and they can control these access rights through information encryption. Therefore, the data asset owner and the party with whom a data asset is shared do not need additional privileges at the network layer.

The remaining blockchain network participants are nodes that have not yet taken any action in the blockchain network. Therefore, these three kinds of nodes (data asset owners, parties with whom data assets are shared, and other participants) can all be regarded as ordinary nodes of the blockchain network. In the new model of blockchain-based data asset management, the network layer sets up the P2P network and, in imitation of a traditional database, classifies the rights of nodes. It defines two types of nodes: super nodes and ordinary nodes. The super node is controlled by the government and is responsible for managing the blockchain network to ensure the security and lawfulness of data asset sharing. Users who need to share data assets, including data asset owners, recipients of transferred data assets, and other blockchain network participants, can join the network as ordinary nodes. Ordinary nodes are managed by super nodes, and all ordinary nodes are equal.

In traditional blockchains, public chains generally adopt consensus mechanisms based on proof of work or proof of stake, while private chains or consortium chains generally adopt consensus mechanisms such as practical Byzantine fault tolerance (PBFT). In essence, these mechanisms ensure the consistency of all nodes and, to a certain extent, prevent malicious nodes from damaging the blockchain network. Although these consensus mechanisms safeguard the consistency of nodes, they largely sacrifice the efficiency of putting data on the chain and are not suitable for data asset management scenarios. Therefore, at the consensus layer of the new model, the managing authority can design a custom consensus mechanism when the blockchain platform is first created, so as to meet data asset management needs. For example, because the super node is controlled by the blockchain manager, which guarantees the security of its identity, the super node verifies the uploaded data during the on-chain process, packages the verified data into blocks, and broadcasts a notification to every ordinary node. Each ordinary node synchronizes its local data copy after receiving the message.
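The node classification described above can be illustrated with a minimal Python sketch. The names `Role`, `Node`, `Network`, `join`, `share`, and `can_access` are hypothetical and do not come from the experimental platform; the rule that a super node may access every asset is an assumption made here for simplicity, since the text only states that the super node has the highest authority.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    SUPER = "super"        # controlled by the blockchain manager
    ORDINARY = "ordinary"  # owners, recipients, and other participants


@dataclass
class Node:
    node_id: str
    role: Role


@dataclass
class Network:
    """Hypothetical registry used to manage ordinary nodes."""
    nodes: dict = field(default_factory=dict)
    # asset_id -> ids of the nodes that participate in that asset
    participants: dict = field(default_factory=dict)

    def join(self, node: Node, approved_by: Node) -> None:
        # Only a super node may admit a new node to the network.
        if approved_by.role is not Role.SUPER:
            raise PermissionError("only a super node can admit nodes")
        self.nodes[node.node_id] = node

    def register_asset(self, asset_id: str, owner: Node) -> None:
        self.participants[asset_id] = {owner.node_id}

    def share(self, asset_id: str, sharer: Node, recipient: Node) -> None:
        # Assumed rule: only a node that already participates may share.
        if sharer.node_id not in self.participants.get(asset_id, set()):
            raise PermissionError("node does not participate in this asset")
        self.participants[asset_id].add(recipient.node_id)

    def can_access(self, node: Node, asset_id: str) -> bool:
        # Assumption: the super node may access every asset; ordinary
        # nodes only reach the assets they participate in.
        if node.role is Role.SUPER:
            return True
        return node.node_id in self.participants.get(asset_id, set())


gov = Node("gov", Role.SUPER)
alice, bob, carol = (Node(n, Role.ORDINARY) for n in ("alice", "bob", "carol"))
net = Network()
for n in (alice, bob, carol):
    net.join(n, approved_by=gov)
net.register_asset("asset-1", owner=alice)
net.share("asset-1", sharer=alice, recipient=bob)
print(net.can_access(bob, "asset-1"), net.can_access(carol, "asset-1"))
```

In this sketch all ordinary nodes are treated equally: what distinguishes them is only the set of assets they take part in, not a per-node privilege level.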

In the data layer of the traditional blockchain, a linked list structure connects all blocks end to end, and a Merkle tree structure stores the data inside each block, so the blockchain is tamper-proof. In addition, the timestamp field in the data layer ensures that the stored data is ordered in time, and all historical operations can be restored and traced on the basis of data that cannot be tampered with. However, the data structure of the traditional blockchain is uniform and its query method is simple, so the efficiency of on-chain data queries is not high. In data asset management scenarios, data interaction is frequent, and the demand for efficient on-chain data queries is very high. Therefore, in the data layer of the new model of data asset management, blockchain technology is used to ensure that data assets are tamper-proof, a Merkle-B tree is used to optimize the existing data structure, and structures such as the skip list are used to establish an effective query index and improve the efficiency of on-chain data queries. The Merkle tree structure inside the block can be modified and combined with other balanced tree structures to improve query efficiency while keeping the data difficult to tamper with. For the chain structure between blocks, query indexes such as skip lists can be built on top of the chain to improve query efficiency. In addition, once log-type data has been uploaded in a form that is difficult to tamper with, all historical operations can easily be traced to their source, which makes it easier to assign responsibility for unexpected situations during data asset management and circulation.

The smart contract layer of the traditional blockchain can ensure reliable operations without human involvement. These operations are traceable and irreversible, which guarantees the automation properties of the blockchain. In the new model of data asset management, the managing authority can use smart contracts composed of automated script code to formulate rules for sharing transactions, and it can conduct identity reviews and data verification through super nodes in the network to ensure the legitimacy of data asset sharing transactions. Users can use smart contracts to automate data asset management through ordinary nodes, and they can also use smart contracts to share data assets automatically with other users. In the traditional blockchain, a smart contract is deployed as a special transaction on the chain. Similarly, in the new blockchain-based data asset management model, super nodes can package the smart contracts they release into blocks and upload them to the blockchain for deployment. In contrast, the contracts of ordinary nodes must pass the super node’s verification before they can be deployed on the chain.
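The deployment rule above (direct deployment for super nodes, verification first for ordinary nodes) can be sketched as follows. This is a minimal illustration under stated assumptions: `Contract`, `ContractRegistry`, and `no_forbidden_calls` are hypothetical names, and the verification criterion is invented for the example because the text does not specify what the super node checks.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Contract:
    name: str
    source: str   # automated script code
    author: str   # id of the node that released the contract


@dataclass
class ContractRegistry:
    super_nodes: set
    deployed: list = field(default_factory=list)  # contracts packaged on-chain
    pending: list = field(default_factory=list)   # awaiting super-node review

    def submit(self, contract: Contract) -> str:
        # Super nodes may package their own contracts into a block directly.
        if contract.author in self.super_nodes:
            self.deployed.append(contract)
            return "deployed"
        # Contracts from ordinary nodes wait for verification.
        self.pending.append(contract)
        return "pending"

    def review(self, reviewer: str, check: Callable[[Contract], bool]) -> list:
        if reviewer not in self.super_nodes:
            raise PermissionError("only a super node can verify contracts")
        approved = [c for c in self.pending if check(c)]
        self.deployed.extend(approved)
        self.pending.clear()  # rejected contracts are simply dropped here
        return approved


def no_forbidden_calls(contract: Contract) -> bool:
    # Hypothetical rule: reject scripts that call a forbidden operation.
    return "transfer_all(" not in contract.source


registry = ContractRegistry(super_nodes={"gov"})
print(registry.submit(Contract("rules", "share(asset, user)", "gov")))
print(registry.submit(Contract("auto", "share(asset, bob)", "alice")))
print(registry.submit(Contract("bad", "transfer_all(assets)", "carol")))
approved = registry.review("gov", no_forbidden_calls)
print([c.name for c in approved], [c.name for c in registry.deployed])
```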

5. Experiment and Result Analysis

The experiment uses the Python language to build a blockchain platform for the new model of data asset management. Using multithreading, one super node is created together with 2, 4, 6, and 8 ordinary nodes in successive runs, and the nodes can communicate and interact with each other.

First, this experiment uses a self-defined encryption scheme to manage data assets. It uses a hybrid of the Advanced Encryption Standard (AES) and RSA to encrypt the data assets exchanged between nodes. The AES algorithm encrypts quickly, so it is responsible for encrypting the main body of the transmitted file, which improves transmission efficiency. The RSA algorithm is responsible for encrypting the AES key, which improves transmission security. When data assets are exchanged between nodes, the data sender first creates an AES key, encrypts the data to be exchanged, and generates an encrypted data file. The data receiver uses the RSA algorithm to create an RSA key pair, consisting of an RSA public key and an RSA private key, and broadcasts the RSA public key to the entire network while retaining the private key. After the data sender receives the RSA public key, it uses that key to encrypt the AES key. The data sender then transmits the encrypted data file and the encrypted AES key to the receiver. Finally, the data receiver decrypts the AES key with the RSA private key it retained and uses the AES key to decrypt the encrypted data file, obtaining the original data. In this way, the risk of data leakage in the network is effectively reduced, and data transmission efficiency is also improved.
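The message flow of this hybrid scheme can be sketched with the Python standard library alone. The sketch below is deliberately a toy and is not the code used in the experiment: the symmetric step is a SHA-256 counter keystream standing in for AES, and the asymmetric step is unpadded textbook RSA over two small Mersenne primes. Both stand-ins are insecure and are used only so that the example runs without third-party packages; a real system would use vetted AES and RSA implementations with proper padding. The function names `rsa_keypair`, `keystream_xor`, `send`, and `receive` are hypothetical.

```python
import hashlib
import secrets

# Toy RSA parameters: the Mersenne primes 2**61 - 1 and 2**89 - 1.
# Far too small for real use; illustration only.
P, Q = 2**61 - 1, 2**89 - 1
E = 65537


def rsa_keypair():
    """Receiver side: create (public key, private key)."""
    n = P * Q
    d = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse (Python 3.8+)
    return (n, E), (n, d)


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Stand-in for AES: XOR the data with a SHA-256 counter keystream.

    Applying the function twice with the same key restores the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def send(data: bytes, receiver_public):
    """Sender side: encrypt the data, then wrap the symmetric key."""
    n, e = receiver_public
    session_key = secrets.token_bytes(16)          # plays the AES key
    ciphertext = keystream_xor(session_key, data)  # bulk encryption
    wrapped_key = pow(int.from_bytes(session_key, "big"), e, n)
    return ciphertext, wrapped_key


def receive(ciphertext: bytes, wrapped_key: int, private):
    """Receiver side: unwrap the symmetric key, then decrypt the data."""
    n, d = private
    session_key = pow(wrapped_key, d, n).to_bytes(16, "big")
    return keystream_xor(session_key, ciphertext)


public_key, private_key = rsa_keypair()   # receiver publishes public_key
message = b"data asset: quarterly power usage records"
ciphertext, wrapped_key = send(message, public_key)
assert receive(ciphertext, wrapped_key, private_key) == message
```

The point of the split is that only the short symmetric key passes through the slow public-key operation, while the bulk of the file is handled by the fast symmetric cipher.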

Second, the experiment uses a custom consensus mechanism. Under this mechanism, ordinary nodes upload data to the blockchain network, and the super node collects the scattered uploads. After a certain amount of uploaded data has been collected, the super node verifies its credibility to ensure that the data is true and valid and that the data source is safe. The super node then packages the verified data into a block and broadcasts the block across the network. After the ordinary nodes receive the block, they append it to their local copy of the blockchain [38] to complete the data update. The specific experiment is as follows: data is uploaded to the blockchain network within a certain period through 2, 4, 6, and 8 ordinary nodes, and each ordinary node uploads the same amount of data. The moment at which the ordinary node issues the upload command is taken as the start of the upload, and the moment at which the generated block is added to the blockchain copy of every ordinary node is taken as the end of the upload. The time for uploading the data to the blockchain platform is shown in Figure 3.
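A single-threaded sketch of this collect, verify, package, and broadcast cycle is given below. It is an illustration under stated assumptions rather than the experimental code: the class names, the `batch_size` threshold, and the credibility check in `verify` are hypothetical, and the multithreaded node communication of the actual platform is replaced by direct method calls.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field


@dataclass
class OrdinaryNode:
    name: str
    chain: list = field(default_factory=list)  # local copy of the blockchain

    def on_block(self, block: dict) -> None:
        self.chain.append(block)


@dataclass
class SuperNode:
    peers: list
    batch_size: int = 4                       # hypothetical threshold
    pool: list = field(default_factory=list)  # collected, unverified uploads
    prev_hash: str = "0" * 64

    def verify(self, record: dict) -> bool:
        # Placeholder credibility check; the real rule is not given.
        return bool(record.get("source")) and record.get("payload") is not None

    def upload(self, record: dict) -> None:
        self.pool.append(record)
        if len(self.pool) >= self.batch_size:
            self.package()

    def package(self) -> None:
        valid = [r for r in self.pool if self.verify(r)]
        self.pool.clear()
        if not valid:
            return
        body = json.dumps(valid, sort_keys=True)
        block = {
            "prev_hash": self.prev_hash,
            "timestamp": time.time(),
            "records": valid,
            "hash": hashlib.sha256((self.prev_hash + body).encode()).hexdigest(),
        }
        self.prev_hash = block["hash"]
        for peer in self.peers:               # broadcast to ordinary nodes
            peer.on_block(block)


def run_upload(n_nodes: int, uploads_per_node: int) -> float:
    """Return the elapsed time from the first upload to the last block."""
    nodes = [OrdinaryNode(f"node-{i}") for i in range(n_nodes)]
    super_node = SuperNode(peers=nodes)
    start = time.perf_counter()
    for i in range(uploads_per_node):
        for node in nodes:
            super_node.upload({"source": node.name, "payload": i})
    super_node.package()                      # flush any remaining uploads
    return time.perf_counter() - start


for n in (2, 4, 6, 8):
    print(n, "ordinary nodes:", run_upload(n, uploads_per_node=100), "s")
```

Because only the super node validates and packages blocks, ordinary nodes do no consensus work beyond appending what they receive, which is the source of the efficiency gain the mechanism aims for.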

Figure 3 shows that, as the number of nodes increases, the data upload time grows steadily and approximately linearly, in line with expectations. This indicates that the new model of data asset management based on blockchain technology proposed in this paper is stable. Furthermore, the data upload times measured for multiple nodes show that the data upload time of the new model is short, which indicates the high efficiency of this model.

Four nodes are chosen for the data throughput test based on local hardware and testing methods. On the server side, four terminals are enabled and four sets of experimental data are picked at random to form Figure 4.

Transaction latency is another important performance parameter. It is the period between the submission of a transaction and its writing to the blockchain. In the same network setting, the Merkle-B tree algorithm shortens the consensus process. Figure 5 shows a histogram comparing the transaction latency of the Merkle-B tree method, the skip list method, and the mix method.

Finally, this paper designs a new optimized query algorithm that uses the Merkle-B tree to optimize the existing data structure. At the same time, it uses the skip list structure to establish an effective query index that improves the efficiency of data queries on the blockchain.
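For orientation, the in-block structure that the Merkle-B tree modifies can be sketched as a plain binary Merkle tree. The code below is not the Merkle-B tree of this paper, whose node layout is not given in the text; it only shows how a Merkle root commits to the records in a block and how a short proof verifies a single record. The function names are hypothetical, and duplicating the last hash on odd-sized levels is one common convention chosen here as an assumption.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves):
    """Return every tree level, from the hashed leaves up to the root.

    `leaves` must be a non-empty list of byte strings.
    """
    level = [h(x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # odd size: duplicate last hash
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_root(leaves) -> bytes:
    return merkle_levels(leaves)[-1][0]


def merkle_proof(leaves, index: int):
    """Collect the sibling hashes on the path from one leaf to the root."""
    proof = []
    for level in merkle_levels(leaves)[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1                # neighbour within the pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf: bytes, proof, root: bytes) -> bool:
    acc = h(leaf)
    for sibling, sibling_is_left in proof:
        acc = h(sibling + acc) if sibling_is_left else h(acc + sibling)
    return acc == root


records = [b"asset-1", b"asset-2", b"asset-3"]
root = merkle_root(records)
assert verify(records[2], merkle_proof(records, 2), root)
assert merkle_root([b"asset-1", b"asset-2", b"tampered"]) != root
```

A proof contains one hash per tree level, so checking a record requires a number of hash operations logarithmic in the number of records, while any change to a record changes the root.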

To evaluate the query performance of the new method, a comparative experiment was designed. The experiment compares four methods. The first method queries with the traditional blockchain method (origin method). The second method queries after optimizing the existing data structure with the Merkle-B tree (Merkle-B tree method). The third method queries with a skip list structure (skip list method). The fourth method queries with the optimized algorithm proposed in this paper, which combines the Merkle-B tree and the skip list (mix method). Data asset information of different capacities is queried with each of the four methods, and the query time is recorded.
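The difference between the origin method and the skip list method can be made concrete with a small sketch. The code below is an assumption-laden illustration, not the experimental implementation: blocks are plain dictionaries, each holding one record, the skip list maps a record key to the height of the block that stores it, and `SkipList` and `linear_scan` are hypothetical names. A fixed random seed is used only to make the example repeatable.

```python
import random


class _Node:
    __slots__ = ("key", "value", "forward")

    def __init__(self, key, value, level):
        self.key, self.value = key, value
        self.forward = [None] * level       # one next-pointer per level


class SkipList:
    """Ordered index: record key -> height of the block that stores it."""
    MAX_LEVEL = 16

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._head = _Node(None, None, self.MAX_LEVEL)
        self._level = 1

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < 0.5:
            level += 1
        return level

    def insert(self, key, value):
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for i in reversed(range(self._level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        nxt = node.forward[0]
        if nxt is not None and nxt.key == key:
            nxt.value = value               # key already indexed: overwrite
            return
        level = self._random_level()
        self._level = max(self._level, level)
        new = _Node(key, value, level)
        for i in range(level):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def search(self, key):
        node = self._head
        for i in reversed(range(self._level)):
            # Skip ahead on the highest level that does not overshoot.
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node.value if node is not None and node.key == key else None


def linear_scan(chain, key):
    """Baseline in the spirit of the origin method: visit block by block."""
    for block in chain:
        if key in block["records"]:
            return block["height"]
    return None


chain = [{"height": i, "records": {f"asset-{i}": f"payload-{i}"}}
         for i in range(1000)]
index = SkipList()
for block in chain:
    for key in block["records"]:
        index.insert(key, block["height"])

assert index.search("asset-742") == linear_scan(chain, "asset-742") == 742
```

The linear scan visits blocks one by one, whereas a skip list lookup takes an expected logarithmic number of steps in the number of indexed keys, which is consistent with the larger gain reported for the index between blocks.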

The query time results are shown in Table 1 and Figure 6. Comparing the query times of the four methods shows that all three methods that use optimized structures (the Merkle-B tree method, the skip list method, and the mix method) query faster than the origin method. Among them, the mix method is the most efficient, the skip list method is second, and the Merkle-B tree method is the least efficient. The reason is that, although the Merkle-B tree method optimizes the data structure inside the block, the amount of data inside a block is small, so the speed increases only slightly, by 17%. The skip list method builds an index between blocks, and its query speed improves greatly, by 1.24 times. The mix method combines the advantages of the Merkle-B tree method and the skip list method, so its data query speed is the fastest, with an improvement of 2.33 times.

6. Conclusion

Data asset management is crucial in today’s big data era. Emerging blockchain technology can be applied to data asset management to assure data asset security, privacy, and traceability. Many data asset management systems are based on blockchain technology, but each is tailored to a certain layer of the blockchain system architecture. By summarizing and analyzing these mechanisms, this article provides a novel model of data asset management based on blockchain, which incorporates applications at all levels of the blockchain system. It applies them at the network, consensus, data, and smart contract layers. Furthermore, the transaction layer has been optimized by encrypting shared information.

Data Availability

The data used to support the findings of this study are available from the author upon request (k.aldahlan@uoh.edu.sa).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Kawther A. Al-Dhlan wrote the paper, Hamad A. Alreshidi simulated the work, Shahbaz Pervez proofread the paper, Zahida Paraveen validated the method, Akram M. Zeki and Nada M. O. Sid Ahmed added the results, and Eid J. Alshammari and Velmurugan Lingamuthu suggested the changes.

Acknowledgments

The authors would like to extend their sincere appreciation to the Deanship of Scientific Research at the University of Ha’il for funding this research under research group number RG–191308.