Abstract

In recent years, innovative positioning and mobile communication techniques have been developed to support Location-Based Services (LBSs). With the help of sensors, an LBS can detect and sense information from the outside world to provide location-related services. Implementing an intelligent LBS requires the Semantic Sensor Web (SSW), which uses sensor ontologies to achieve sensor data interoperability, information sharing, and knowledge fusion among intelligent systems. Because sensor ontology engineers work subjectively, the resulting ontologies are heterogeneous, which hampers communication among them. Sensor ontology matching addresses this problem by establishing correspondences between different sensor terms. Among ontology matching technologies, Particle Swarm Optimization (PSO) is an effective method for dealing with the low-quality ontology alignment problem. To further enhance the quality of the matching results, this work first models sensor ontology matching as a meta-matching problem and then, based on this model, proposes a Simulated Annealing PSO (SAPSO) that optimizes the aggregation weights of several similarity measures together with the threshold. In particular, approximate evaluation metrics are proposed for evaluating the quality of an alignment without a reference alignment, and a Simulated Annealing (SA) strategy is applied to PSO’s evolving process, which helps the algorithm avoid local optima and enhances the quality of the solution. The well-known Ontology Alignment Evaluation Initiative’s benchmark (OAEI’s benchmark) and three real sensor ontologies are used to verify the effectiveness of SAPSO. The experimental results show that SAPSO is able to effectively match the sensor ontologies.

1. Introduction

In recent years, innovative positioning and mobile communication techniques have been developed to support Location-Based Services (LBSs) [1, 2]. With the help of sensors, an LBS can detect and sense information from the outside world to provide location-related services. Implementing an intelligent LBS requires the Semantic Sensor Web (SSW) [3, 4]. As the kernel technique of the SSW, a sensor ontology is a standard information exchange model that serves as the basis for different machines to understand semantics and to achieve sensor data interoperability, information sharing, and knowledge fusion among intelligent systems.

Because sensor ontology engineers work subjectively, they might use different concepts to mean the same thing, or one concept might have more than one meaning, which yields the heterogeneity problem that affects semantic interoperability between ontologies. Ontology matching [5–7] is a powerful tool for facing this challenge and has been widely applied in different application domains, such as Artificial Internet of Things (AIoT) [8, 9] and the biomedical domain [10]. Sensor ontology matching discovers the semantic relationships between different sensor ontologies by determining the correspondences between the concepts of heterogeneous sensor ontologies. The similarity measure is critical for a sensor ontology matching technique. Because the semantic relationships among sensor data are complicated, no single similarity measure can distinguish all the semantically identical entities in every matching context. Thus, several different similarity measures are usually aggregated to enhance the confidence of the result. Ontology matching is therefore generally cast as the problem of finding a set of appropriate weights and a threshold that yield high-quality ontology alignments.

Particle Swarm Optimization (PSO) [11] is an effective methodology for determining high-quality ontology alignments [12]. Although PSO converges fast, it is apt to fall into local optima, which prevents it from finding the global optimal solution. To overcome this drawback, this work proposes a Simulated Annealing PSO (SAPSO) that optimizes the aggregation weights of various similarity measures together with the threshold. In particular, SAPSO introduces a Simulated Annealing (SA) strategy into the evolving process to further enhance the quality of the solution. The contributions of this work are as follows:
(1) An approximate evaluation metric on ontology alignment is proposed, and an optimization model for the sensor ontology meta-matching problem is constructed.
(2) To effectively solve the problem of sensor ontology meta-matching, an ontology meta-matching framework and a SAPSO algorithm are proposed.

This paper is organized as follows. Section 2 presents the related work. Section 3 gives the formal definitions on the sensor ontology and similarity measure. Section 4 constructs the optimization model for sensor meta-matching problem. Section 5 presents the SAPSO. Section 6 shows the experimental results and the corresponding analysis. Finally, Section 7 draws the conclusions and puts forward the future research directions.

2. Swarm Intelligence Algorithm-Based Ontology Matching Technique

In different sensor ontologies, because of the subjectivity of the designers, the same concept in a sensor system may be named and defined in different ways, which hinders communication between different sensor ontologies [13, 14]. Because matching two ontologies is intrinsically complex, swarm intelligence algorithms, such as PSO, Parallel Compact Cuckoo Search Algorithm (PCCSA) [15], Artificial Bee Colony (ABC) algorithm [16], Firefly Algorithm (FA) [10, 17], and Evolutionary Algorithm (EA) [18, 19], have become effective methods to determine ontology alignments.

Bock et al. [20] used a discrete PSO algorithm to optimize the results of ontology entity matching, which does not require the computation of large similarity matrices. He et al. [16] used an ABC-based matcher to solve the ontology meta-matching problem and reported more effective results. Xue et al. [17] proposed a Compact Cooperative Firefly Algorithm- (CCFA-) based ontology matching system, which improves the search efficiency by using a new mechanism. Xue et al. [12] also proposed a compact multiobjective PSO to solve the matching problem of large-scale biomedical ontologies. In addition, they [10] proposed a Compact Firefly Algorithm (CFA), which greatly reduces the running time and memory consumption through two compact movement operators. Chu et al. [21] first built an ontology model in vector space and proposed a Compact Evolutionary Algorithm (CEA) to solve the ontology matching problem. In this work, we further introduce SA into PSO’s evolving process to trade off its exploration and exploitation, which effectively helps the algorithm jump out of local optima.

3. Sensor Ontology and Similarity Measure

3.1. Sensor Ontology

In the computer and information science field, an ontology is a formal list of all the concepts in a particular domain and their relationships [18]. With respect to the SSW, a sensor ontology is used as the most important and extensive model for describing the concepts related to sensors and the IoT [22, 23], such as the sensor’s output, observations, observation characteristics, and so on. For ease of description, a triple $O = (C, P, I)$ [24] is used to represent a sensor ontology, where $C$, $P$, and $I$ represent the sets of classes (concepts), properties, and instances, respectively. An example of a sensor ontology is shown in Figure 1, where an ellipse represents a class and the arrows between the ellipses represent the properties of the classes. A class is a collection of instances, and each element in $I$ is an instance of a class. Generally, classes, properties, and instances are collectively called entities.

The goal of sensor ontology matching [25] is to establish correspondences between heterogeneous entities and find the set of entity correspondences, the so-called sensor ontology alignment [26]. Here, an entity correspondence is a five-tuple $\langle id, e_1, e_2, \mathit{conf}, \mathit{rel} \rangle$, where $id$ refers to the identifier of the entity correspondence; $e_1$ and $e_2$ are the entities of the two ontologies, respectively; $\mathit{conf}$ is the degree of confidence that $e_1$ and $e_2$ can be matched, usually in [0, 1]; and $\mathit{rel}$ is the equivalence relationship between $e_1$ and $e_2$. The process of matching two sensor ontologies is shown in Figure 2, where $O_1$ and $O_2$, respectively, represent the two sensor ontologies to be aligned, $A$ is the input alignment, $p$ is a set of parameters, $r$ represents some external resources, and $A'$ is the obtained alignment.

A similarity measure uses particular information to calculate to what extent two entities are similar. Generally, the similarity measures can be composed of three types, which are described in detail in Section 3.2.

3.2. Similarity Measure
3.2.1. Syntax-Based Similarity Measure

A syntactic measure calculates the string distance between entities of different ontologies. In our work, we use the N-Gram distance, which is an effective syntactic metric in the ontology matching domain. N-Gram has an obvious advantage in comparing the similarity between two strings [27, 28]. Given two strings, their N-Gram distance is calculated by measuring the number of common substrings they have. To be specific, the N-Gram distance is computed from the following quantities: the two strings $s_1$ and $s_2$ to be compared; the length $N$ of each substring after splitting the original string, which is generally set to 2 or 3 (the lower the value, the higher their similarity; the value of $N$ in this work is 3); the number of common substrings of $s_1$ and $s_2$; and the lengths of $s_1$ and $s_2$.
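As an illustration of an N-Gram measure of this kind, the following minimal sketch assumes a Dice-style ratio, i.e., twice the number of common substrings divided by the total number of substrings of both strings. This normalization, the lowercasing step, and the function names `ngrams` and `ngram_similarity` are assumptions made for illustration and may differ in detail from the definition used in this work.

```python
from collections import Counter


def ngrams(s: str, n: int = 3) -> Counter:
    """Return the multiset of character n-grams of s (case-insensitive)."""
    s = s.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def ngram_similarity(s1: str, s2: str, n: int = 3) -> float:
    """Dice-style n-gram similarity in [0, 1] (assumed normalization)."""
    g1, g2 = ngrams(s1, n), ngrams(s2, n)
    total = sum(g1.values()) + sum(g2.values())
    if total == 0:
        # Both strings are shorter than n: fall back to exact comparison.
        return 1.0 if s1.lower() == s2.lower() else 0.0
    common = sum((g1 & g2).values())  # multiset intersection
    return 2.0 * common / total
```

With trigrams, identical labels score 1.0 and labels that share no trigram score 0.0, so the value can be used directly as an entry of a syntactic similarity matrix.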

3.2.2. Linguistic-Based Similarity Measure

Semantic similarity calculates the similarity between entities according to the semantic context. In our approach, we use the Wu–Palmer [29] similarity measure, which returns a fraction that indicates the degree of similarity between two words. We use WordNet [30], an English dictionary based on cognitive linguistics, to calculate the variables in the Wu–Palmer measure. We choose Wu–Palmer because it is the most popular WordNet-based similarity measure, which calculates the semantic similarity between two strings by considering not only the conceptual depth in the hierarchical semantic structure of WordNet but also their context information. To be specific, it is defined as follows:

$$\mathrm{WuPalmer}(w_1, w_2) = \frac{2 \times \mathrm{depth}(\mathrm{LCA}(w_1, w_2))}{\mathrm{depth}(w_1) + \mathrm{depth}(w_2)}$$

where $\mathrm{depth}(w)$ denotes the depth of the word $w$ in WordNet’s hierarchical semantic structure and $\mathrm{LCA}(w_1, w_2)$ is the closest common parent concept of $w_1$ and $w_2$.
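The Wu–Palmer computation can be illustrated on a small hand-made hierarchy instead of WordNet. The sketch below is only an illustration: the toy taxonomy `PARENT`, the convention that the root has depth 1, and all function names are assumptions, not part of the system described here.

```python
# Toy "is-a" hierarchy (hypothetical); each concept maps to its parent.
PARENT = {
    "entity": None,
    "device": "entity",
    "sensor": "device",
    "actuator": "device",
    "thermometer": "sensor",
    "observation": "entity",
}


def path_to_root(concept):
    """Return the list [concept, parent, ..., root]."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth in the hierarchy, counting the root as depth 1."""
    return len(path_to_root(concept))


def closest_common_parent(c1, c2):
    """Deepest concept that is an ancestor of (or equal to) both inputs."""
    ancestors = set(path_to_root(c1))
    for candidate in path_to_root(c2):  # walks upward from c2
        if candidate in ancestors:
            return candidate
    raise ValueError("concepts do not share a root")


def wu_palmer(c1, c2):
    """Wu-Palmer similarity: 2 * depth(LCA) / (depth(c1) + depth(c2))."""
    lca = closest_common_parent(c1, c2)
    return 2.0 * depth(lca) / (depth(c1) + depth(c2))
```

For example, `thermometer` (depth 4) and `actuator` (depth 3) meet at `device` (depth 2), so their similarity is 4/7.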

3.2.3. Structure-Based Similarity Measure

The main idea of the structure-based similarity measure is to determine the similarity of two entities through their neighborhood entities (superclass and subclass relationships). In general, matched entities have similar structures, that is, they have the same number of superclasses and subclasses; conversely, if two entities have the same number of superclasses and subclasses, they are considered similar. In our work, the structure-based similarity measure that we use is called Out-In degree, which calculates the similarity according to the number of superclasses and subclasses of entities in different ontologies, and which is defined as follows:

Based on the three similarity measures, we can get three similarity matrices, respectively. A similarity matrix is defined as a matrix of size $|O_1| \times |O_2|$, where $|O_1|$ and $|O_2|$ are, respectively, the number of entities in the source ontology and the target ontology. Each element of the matrix is the similarity value of the two corresponding entities determined by the similarity measure. After that, by assigning an aggregating weight to each similarity matrix, we obtain an aggregated matrix, which is filtered by using a similarity threshold to determine the final matrix. The ontology meta-matching problem can be defined as determining the optimal aggregating weights and the threshold to get a high-quality ontology alignment, which will be formally defined in the following.
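The aggregation and filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions: matrices are plain nested lists, the aggregation is a weighted sum, a cell is kept when its aggregated value is at least the threshold, and the names `aggregate` and `extract_alignment` are hypothetical.

```python
def aggregate(matrices, weights):
    """Weighted sum of equally sized similarity matrices (nested lists)."""
    if len(matrices) != len(weights):
        raise ValueError("one weight per similarity matrix is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("aggregating weights must sum to 1")
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(cols)]
        for i in range(rows)
    ]


def extract_alignment(aggregated, threshold):
    """Keep (row, column, similarity) triples at or above the threshold."""
    return [
        (i, j, value)
        for i, row in enumerate(aggregated)
        for j, value in enumerate(row)
        if value >= threshold
    ]
```

Given three 2 x 2 matrices from the syntactic, linguistic, and structural measures, `extract_alignment(aggregate(matrices, [0.5, 0.3, 0.2]), 0.7)` returns only the entity pairs whose combined similarity reaches 0.7.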

4. Sensor Ontology Meta-Matching Problem

In general, optimization problems can be divided into unconstrained and constrained optimization problems, depending on whether constraints are present. In this paper, the problem of sensor ontology meta-matching is modeled as a constrained continuous optimization problem, whose constraints concern the sum of the weights and the threshold of the similarity measures, as explained in more detail in the following. There are three points to consider when building an optimization model: constraint conditions, decision variables, and objective function.

4.1. Constraint Conditions and Decision Variables

For convenience, the process of sensor ontology meta-matching can be described as a seven-tuple $(O_1, O_2, n, M, W, t, A)$, where $O_1$ and $O_2$ represent the source ontology and target ontology, respectively; $n$ represents the number of similarity measures; $M$ represents a set of similarity matrices; $W$ is the set of aggregating weights; $t$ is the similarity threshold; and $A$ is the obtained sensor ontology alignment. In particular, $M$ and $W$ are, respectively, defined as follows:

$$M = \{M_1, M_2, \ldots, M_n\}$$

$$W = \{w_1, w_2, \ldots, w_n\}, \quad \sum_{i=1}^{n} w_i = 1, \quad w_i \in [0, 1] \tag{4}$$

The framework of sensor ontology meta-matching is shown in Figure 3. In this framework, each similarity measure produces a similarity matrix, an aggregating weight is assigned to each similarity matrix, the weighted matrices are combined into the aggregated matrix, and the alignment is determined from the aggregated matrix by applying the threshold. As can be seen from the figure, the ultimate goal of sensor ontology meta-matching is to find a suitable weight for each similarity matrix and a suitable threshold value for the aggregated similarity matrix, so as to ensure the quality of the alignment.

4.2. Objective Function

The quality of the results of sensor ontology meta-matching is usually measured by f-measure, whose value is related to both recall and precision. Traditional recall, precision, and f-measure [9] are defined in equations (5)–(7):

$$\mathrm{recall} = \frac{|R \cap A|}{|R|} \tag{5}$$

$$\mathrm{precision} = \frac{|R \cap A|}{|A|} \tag{6}$$

$$\text{f-measure} = \frac{2 \times \mathrm{recall} \times \mathrm{precision}}{\mathrm{recall} + \mathrm{precision}} \tag{7}$$

where $R$ is the standard alignment; $A$ is the alignment determined by some matching technique; recall divides the number of true positive correspondences found by the number of all correct matching pairs, which indicates whether the matching results are complete; and precision divides the number of true positive correspondences found by the cardinality of the found alignment, which indicates whether the matching results are accurate. Their values are between 0 and 1, and the quality of the results is judged by these values, but neither recall nor precision alone can evaluate the alignment effectively, because a high recall value does not mean that the results are accurate and a high precision value does not mean that the results are complete. Therefore, f-measure is used to combine these two indicators. However, the traditional evaluation metrics need a reference alignment, which in most cases cannot be obtained in advance. To overcome this drawback, in the following, we propose three new quality evaluation metrics [31] on sensor ontology alignment to approximate traditional recall, precision, and f-measure, where $M$ represents the composite similarity matrix and $m_{ij}$ is the value in row $i$, column $j$ of $M$. Finally, the objective function we need to optimize is defined as follows:

5. Sensor Ontology Meta-Matching with Simulated Annealing Particle Swarm Optimization

5.1. Particle Swarm Optimization

PSO is an algorithm based on swarm cooperation, which is developed by simulating the foraging behavior of birds [32]. PSO initializes a set of random particles (stochastic solutions) and iteratively searches for the optimal solution; in each iteration, the particles update themselves by tracking two extremes. The formulas for updating the speed and position in PSO are as follows:

$$v_i^{t+1} = v_i^{t} + c_1 \cdot \mathrm{rand} \cdot (pbest_i - x_i^{t}) + c_2 \cdot \mathrm{rand} \cdot (gbest - x_i^{t}) \tag{12}$$

$$x_i^{t+1} = x_i^{t} + v_i^{t+1} \tag{13}$$

where $i$ means the $i$th particle, $i = 1, 2, \ldots, N$, $N$ is the size of the population, $t$ is the number of iterations, $v$ is the speed of particles, $c_1$ and $c_2$ are learning factors, $\mathrm{rand}$ is a random number in [0, 1], $pbest_i$ is the extremum of an individual, i.e., the best solution found by the particle itself, $gbest$ is the global extremum, and $x_i^{t}$ is the current position of the particle. Compared with other swarm intelligence algorithms, PSO has the advantage of a one-way information flow, so that all the particles are able to converge quickly, but it tends to fall into local optima. To solve this problem, the SA strategy is introduced into the evolutionary process of PSO.
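A single velocity and position update of one particle can be sketched as below. The sketch follows the update without an inertia weight, as described above; clamping positions to [0, 1] is an added assumption (the decision variables here are weights' cut points and a threshold), and the function name `pso_step` is hypothetical.

```python
import random


def pso_step(x, v, pbest, gbest, c1=2.0, c2=2.0, rng=random):
    """Return the updated (position, velocity) of one particle.

    x, v, pbest and gbest are equal-length lists of floats.
    """
    new_x, new_v = [], []
    for xd, vd, pd, gd in zip(x, v, pbest, gbest):
        # Velocity: previous velocity plus cognitive and social terms.
        vd = vd + c1 * rng.random() * (pd - xd) + c2 * rng.random() * (gd - xd)
        # Position: move, then clamp to [0, 1] (assumption, not from the text).
        xd = min(1.0, max(0.0, xd + vd))
        new_v.append(vd)
        new_x.append(xd)
    return new_x, new_v
```

A particle that already sits on both its personal best and the global best with zero velocity does not move, which is the fixed point that makes plain PSO prone to stagnation.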

5.2. Encoding Mechanism

A decimal encoding method is used in this work to encode a solution, which encodes a set of weights and a threshold into each particle. To encode $n$ aggregating weights and one threshold, first, $n$ real numbers are generated randomly in [0, 1], denoted as $r_1, r_2, \ldots, r_n$, which represent the encoding information of a particle. Then, the first $n - 1$ numbers are sorted in ascending order, and we get $r'_1 \le r'_2 \le \cdots \le r'_{n-1}$. In particular, the final number $r_n$ is the threshold for filtering the final alignment. Finally, the aggregating weights are obtained as follows:

$$w_k = \begin{cases} r'_1, & k = 1 \\ r'_k - r'_{k-1}, & 1 < k < n \\ 1 - r'_{n-1}, & k = n \end{cases}$$

Each particle in the population contains a set of weights and a threshold. An example of encoding process on aggregating weights is shown in Figure 4.

This encoding mechanism on the aggregating weights meets their constraints defined in equation (4), and it is also of help to reduce the solution’s dimension, and it ensures that different groups of numbers correspond to different aggregating weights.
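The decoding of a particle into aggregating weights and a threshold can be sketched as follows, assuming the cut-point interpretation described in this section: the first n − 1 numbers split the interval [0, 1] into n segments whose lengths are the weights, and the last number is the threshold. The function name `decode_particle` is hypothetical.

```python
def decode_particle(numbers):
    """Decode n numbers in [0, 1] into (n weights, threshold).

    The first n - 1 numbers are cut points on [0, 1]; the last is the threshold.
    """
    *cuts, threshold = numbers
    bounds = [0.0] + sorted(cuts) + [1.0]
    # Each weight is the length of one segment between neighbouring cut points.
    weights = [hi - lo for lo, hi in zip(bounds, bounds[1:])]
    return weights, threshold
```

For a particle `[0.6, 0.25, 0.8]`, the sorted cut points are 0.25 and 0.6, giving weights of about 0.25, 0.35, and 0.4 (which sum to 1 by construction) and a threshold of 0.8.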

5.3. Simulated Annealing

The simulated annealing algorithm introduces random factors into the search process. It does not completely reject a worse solution, which greatly improves the probability of escaping from a local optimum. Generally, SA contains two parts, the Metropolis algorithm and the annealing process. The Metropolis algorithm aims at helping the solution jump out of local optima by accepting new solutions with a certain probability. Annealing is a process in which $T$, the parameter that controls the probability of accepting a worse solution, decreases with the iterations, so that as the iteration proceeds, the probability of accepting a worse solution gradually decreases. Assuming that a system’s previous solution is denoted as $x_{t-1}$ and the current solution is denoted as $x_t$, where $t$ is the current iteration number, the acceptance probability of the system on changing from $x_{t-1}$ to $x_t$ is

$$P(x_{t-1} \to x_t) = \begin{cases} 1, & f(x_t) \ge f(x_{t-1}) \\ \exp\left(-\dfrac{f(x_{t-1}) - f(x_t)}{T}\right), & \text{otherwise} \end{cases} \tag{15}$$

where $f(x_{t-1})$ and $f(x_t)$ are the fitness of the previous solution and the current solution, respectively, and $T$ is a parameter that represents the annealing temperature. Here, the initial temperature should be large, and as the iteration goes on, the temperature is gradually reduced, so that the probability of state transition is gradually reduced from 1. In this situation, any solution can be accepted at the beginning of the iteration, and the current solution stays unchanged at the end of the iteration. Therefore, SA not only prevents the algorithm from falling into a local optimum too quickly but also guarantees the algorithm’s convergence.
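The Metropolis acceptance rule for a maximization problem can be sketched as below. The function names are hypothetical, and the temperature is assumed to be strictly positive.

```python
import math
import random


def acceptance_probability(f_prev, f_new, temperature):
    """Probability of moving from the previous to the new solution (maximization)."""
    if f_new >= f_prev:
        return 1.0  # an improvement (or tie) is always accepted
    # A worse solution is accepted with a probability that shrinks as the
    # fitness loss grows or as the temperature drops.
    return math.exp(-(f_prev - f_new) / temperature)


def accept(f_prev, f_new, temperature, rng=random):
    """Draw a uniform number and apply the Metropolis rule."""
    return rng.random() < acceptance_probability(f_prev, f_new, temperature)
```

At a high temperature a fitness loss of 0.1 is accepted almost always, whereas at a low temperature the same loss is almost always rejected, which is the behavior the annealing process relies on.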

For the sake of clarity, the pseudocode of SAPSO is presented in Algorithm 1.

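(1) Input: source and target ontologies O1 and O2, number of iterations maxIter, initial temperature T0,
(2) population size N
(3) for (i = 0; i < N; i++)
(4)   for (j = 0; j < 3; j++)
(5)     x_j[i] = rand(0, 1)
(6)     v_j[i] = rand(0, 1)
(7)     pbest_j[i] = x_j[i]
(8)     calculate fitness[i]
(9)     f[i] = fitness[i]
(10)   end for
(11) end for
(12) gbest_0 = best{pbest_0[i]}
(13) gbest_1 = best{pbest_1[i]}
(14) gbest_2 = best{pbest_2[i]}
(15) while t < maxIter do
(16)   T = annealing temperature updated according to equation (16)
(17)   for (i = 0; i < N; i++)
(18)     for (j = 0; j < 3; j++)
(19)       v_j[i] = v_j[i] + c1 · rand · (pbest_j[i] − x_j[i]) + c2 · rand · (gbest_j − x_j[i])
(20)       x_j[i] = x_j[i] + v_j[i]
(21)       update fitness[i]
(22)       if (fitness[i] > f[i])
(23)         P = 1
(24)         pbest_j[i] = x_j[i]
(25)       else
(26)         P = acceptance probability according to equation (15)
(27)         if (P > rand(0, 1))
(28)           pbest_j[i] = x_j[i]
(29)         end if
(30)       end if
(31)       f[i] = fitness[i]
(32)     end for
(33)   end for
(34)   gbest_0 = best{pbest_0[i]}
(35)   gbest_1 = best{pbest_1[i]}
(36)   gbest_2 = best{pbest_2[i]}
(37)   Gbest = max{f[i]}
(38) end while
(39) Output Gbest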

First, the particles are initialized: each particle generates three random numbers on the [0, 1] interval, representing the two cut points of the weights and a threshold, and also generates three initial velocities. The two cut points and the threshold are the three dimensions of the particle. Initially, the cut points and threshold contained in each particle are taken as its best individual history (pbest), and the fitness value of each particle is calculated (line 8). The best value of each dimension over all particles is then recorded as the global best (gbest). Next, the temperature is initialized, and at the beginning of each iteration the annealing temperature is updated based on equation (16) (line 16). The cut points and threshold of each particle are updated with the PSO formulas (lines 19 and 20), and the updated fitness values are obtained with the formulas in Section 4.2 (line 21). The key simulated annealing step follows: if the updated particle has greater fitness than its predecessor, the transition probability is set to 1 (line 23) and the new particle is considered as the pbest; otherwise, it is accepted with a certain probability according to equation (15). If the probability condition is satisfied, the new particle is used as the pbest in the next generation when the velocity and position are updated with PSO (lines 19 and 20). Finally, the pbest particle whose fitness value is the largest is treated as the global best particle of the population for the next generation of PSO updating. The loop repeats until the end condition is met, and then the globally optimal fitness value, i.e., the f-measure, is output.
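The overall loop can be sketched in a few dozen lines. The following is a simplified illustration under several assumptions that are not taken from this work: the temperature follows a geometric cooling schedule (`cooling`), positions are clamped to [0, 1], the best solution ever evaluated is archived and returned, and the fitness is a user-supplied function (here a toy function rather than the approximate f-measure). All names are hypothetical.

```python
import math
import random


def sapso(fitness, dim=3, pop_size=20, max_iter=100,
          c1=2.0, c2=2.0, t0=10.0, cooling=0.95, seed=0):
    """Maximize `fitness` over [0, 1]^dim with PSO plus Metropolis acceptance."""
    rng = random.Random(seed)
    xs = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    vs = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    prev_fit = [fitness(x) for x in xs]          # fitness of each predecessor
    pbest = [x[:] for x in xs]
    pbest_fit = prev_fit[:]
    g = max(range(pop_size), key=pbest_fit.__getitem__)
    gbest = pbest[g][:]
    best_x, best_fit = gbest[:], pbest_fit[g]    # archive of the best ever seen
    temperature = t0
    for _ in range(max_iter):
        # Geometric cooling (assumed schedule), kept strictly positive.
        temperature = max(temperature * cooling, 1e-12)
        for i in range(pop_size):
            for d in range(dim):
                vs[i][d] += (c1 * rng.random() * (pbest[i][d] - xs[i][d])
                             + c2 * rng.random() * (gbest[d] - xs[i][d]))
                xs[i][d] = min(1.0, max(0.0, xs[i][d] + vs[i][d]))
            fit = fitness(xs[i])
            # Metropolis rule: better states always, worse states sometimes.
            if fit >= prev_fit[i]:
                p = 1.0
            else:
                p = math.exp(-(prev_fit[i] - fit) / temperature)
            if rng.random() < p:
                pbest[i], pbest_fit[i] = xs[i][:], fit
            prev_fit[i] = fit
            if fit > best_fit:
                best_x, best_fit = xs[i][:], fit
        # The pbest with the largest fitness guides the next generation.
        g = max(range(pop_size), key=pbest_fit.__getitem__)
        gbest = pbest[g][:]
    return best_x, best_fit


def toy_fitness(x, target=(0.3, 0.6, 0.8)):
    """Toy objective with its maximum (1.0) at `target`."""
    return 1.0 - sum((a - b) ** 2 for a, b in zip(x, target)) / len(target)
```

Calling `sapso(toy_fitness)` returns a position in the unit cube together with its fitness. In a meta-matching setting, the position would be decoded into aggregating weights and a threshold, and the fitness would be the approximate f-measure of the resulting alignment.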

5.4. The Flowchart of SAPSO

SAPSO is a method using annealing strategy to avoid the local optimal solution of PSO algorithm. The flowchart of SAPSO is shown in Figure 5.

First, we initialize the entire population, including the parameters of each particle. Second, we obtain the fitness value of each particle and check whether the maximum number of iterations has been reached; if so, the iteration process ends and the results are output. Otherwise, the entire population is updated using the PSO algorithm according to equations (12) and (13), and each particle’s new fitness is obtained. Here, we use “state” to represent all the information that a particle contains, including its fitness value and encoding information, and the particle’s fitness is used to indicate the particle’s state; the pbest state and the gbest state are, respectively, the information of an individual’s local best and of the population’s global best during the evolutionary process. Next, the new state is compared with that of the previous generation. If the new state is better than the previous one, it is accepted; otherwise, it is accepted with the probability given by equation (15), the simulated annealing formula. If the particle accepts the new state, it treats the new state as its pbest state; otherwise, it keeps the original state as its pbest state. The gbest state is then obtained by comparing the pbest states of all particles. The annealing temperature is recalculated according to equation (16) before the next iteration, and the process loops until the end condition is met.

6. Experiment Results and Analysis

In this experiment, to verify the effectiveness of SAPSO, we use the OAEI’s benchmark and three real sensor ontologies, i.e., SOSA [33] and the new and old SSN ontologies [22]. The test results of SAPSO and PSO shown in Tables 1 and 2 are the mean values of 30 independent runs.

6.1. Configuration

Similarity measures used in this experiment:
(1) Syntactic-based measure: N-Gram
(2) Linguistic-based measure: Wu–Palmer
(3) Structure-based measure: Out-In Degree

The configuration of SAPSO and PSO is as follows:
(1) Population size: 60
(2) Maximum number of iterations: 300
(3) Learning factors $c_1$, $c_2$: 2
(4) Initial temperature: 10.0

These parameters are determined in an empirical way, which is able to ensure the quality of the alignments in all testing cases.

6.2. Results and Analysis
6.2.1. OAEI Benchmark

A brief description of OAEI’s benchmark is presented in Table 3. The first column in Table 3 is the ID of the testing cases, each corresponding to a testing ontology. We divide these test ontologies into five groups according to their specific characteristics, which are described in the second column of the table. We compare SAPSO with the PSO-based ontology matching technique and OAEI’s participants, i.e., edna, AML [34], LogMap [35], LogMapLt [35], XMap [36], and LogMapBio [35].

In Table 1, SAPSO outperforms all the competitors except on testing cases 221–247, where XMap obtains a better result. The reason is that on testing cases 221–247, the source ontology and the target ontology are identical in terms of lexical and semantic features but differ in terms of structural features, and our structure-based similarity measure is not effective there, which reduces the f-measure. On all testing cases, SAPSO’s results are equal to or better than those of PSO, which shows that the introduction of SA improves PSO’s searching ability and the quality of the solution. In terms of average f-measure, SAPSO performs better than the other matchers, which shows that SAPSO is effective in improving the quality of ontology matching.

6.2.2. Real Sensor Ontologies

SOSA (http://www.w3.org/ns/sosa/), the basic class and property of SSN (http://www.w3.org/ns/ssn/) ontology, represents the lightweight core of new SSN ontology. These sensor ontologies describe the function and performance of the sensor. They support many applications and use cases, such as signal detection in large-scale scientific exploration, home infrastructure monitoring, livelihood services, observation-driven ontology engineering, the World Wide Web, sensor data service system, and more [37]. The new SSN differs from the original SSN in that it simplifies the relationship between the device, platform, and system classes on the old SSN. We tested SAPSO on three real sensor ontologies with our sensor ontology meta-matching system and got their f-measure, recall, and precision values. Table 2 shows the matching results of SAPSO.

In Table 2, the first column refers to two matched sensor ontologies, and second, third, and fourth columns are, respectively, f-measure, recall, and precision of the alignments. It can be seen from the table that on the task of matching new SSN and SOSA, SAPSO is able to determine the perfect alignment. With respect to the other two matching tasks, SAPSO’s f-measure is also close to 1.0. Since there exist some complex correspondences in the reference alignment, i.e., one source concept corresponds to several target concepts, SAPSO fails to find them, which reduces its f-measure. In general, SAPSO is able to effectively match various sensor ontologies.
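For reference, the traditional recall, precision, and f-measure of Section 4.2 can be computed from a found alignment and a reference alignment as sketched below. Representing a correspondence as a pair of entity names and the function name `evaluate_alignment` are illustrative assumptions; the entity names in the usage note are made up.

```python
def evaluate_alignment(found, reference):
    """Return (recall, precision, f_measure) of `found` against `reference`.

    Both arguments are iterables of hashable correspondences,
    e.g. (source_entity, target_entity) pairs.
    """
    found, reference = set(found), set(reference)
    correct = len(found & reference)  # true positive correspondences
    recall = correct / len(reference) if reference else 0.0
    precision = correct / len(found) if found else 0.0
    if recall + precision == 0.0:
        return recall, precision, 0.0
    f_measure = 2.0 * recall * precision / (recall + precision)
    return recall, precision, f_measure
```

For example, if three correspondences are found and two of them appear among four reference correspondences, the recall is 0.5, the precision is 2/3, and the f-measure is 4/7. A complex correspondence that is missed lowers the recall and hence the f-measure in the same way.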

7. Conclusion and Future Work

LBS’s architecture is widely used in the fields of vehicle speed estimation [38], vehicle travel time prediction [39], and bus arrival time prediction [40]. Technologies and applications of LBS cannot be separated from sensors. To implement an intelligent LBS, different sensor ontologies need to be integrated on the SSW. To this end, in this work, new quality evaluation metrics are proposed to approximate the three traditional evaluation metrics, a mathematical model of the sensor ontology meta-matching problem is constructed, and a SAPSO is presented to address the problem, which uses SA to help the algorithm avoid local optima. To verify the effectiveness of SAPSO, we use the OAEI’s benchmark and three real sensor ontologies. The experimental results show that SAPSO is an effective method.

In future work, the quality of the sensor ontology matching results will be further enhanced by taking complex correspondences into consideration. At present, SAPSO still has some defects in determining entity mappings with heterogeneous characteristics, which makes its f-measure relatively low in the testing cases with heterogeneous structure. SAPSO also has some limitations; for example, its performance depends on the initial values, and it is sensitive to its parameters. Last but not least, it is necessary to improve the approximate evaluation metrics on ontology alignment to better guide the algorithm in searching for the global optimum.

Data Availability

The data used to support this study can be found in http://oaei.ontologymatching.org.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (nos. 61801527 and 61103143) and the Natural Science Foundation of Fujian Province (no. 2020J01875).