Abstract
Developing sensor ontologies and using them to annotate sensor data is a feasible way to address the data heterogeneity issue on the Internet of Things (IoT). However, the heterogeneity that exists between different sensor ontologies hampers their communication. Sensor ontology matching aims at finding all the heterogeneous entities in two ontologies, which is a feasible solution for aggregating heterogeneous sensor ontologies. This work investigates swarm intelligence (SI)-based sensor ontology matching techniques and further proposes a competitive binary particle swarm optimization algorithm (CBPSO)-based sensor ontology matching technique. In particular, a guiding matrix (GM) is proposed to ensure the population's diversity, and a competitive evolutionary framework is presented. The experiment uses the ontology alignment evaluation initiative (OAEI)'s benchmark and three real sensor ontologies to test CBPSO's performance. The experimental results show that the competitive evolutionary framework helps CBPSO effectively optimize the alignment's quality and that CBPSO significantly outperforms other SIs at the 5% significance level.
1. Introduction
Since most Internet of Things (IoT) [1] devices work in diverse environments, the heterogeneity of the underlying devices makes it difficult to provide a uniform way of representing sensor data. Developing sensor ontologies [2] and using them to annotate the sensor data is a feasible way to address this issue. Currently, various sensor ontologies, such as the CSIRO sensor ontology (CSIRO) [3], semantic sensor network ontology (SSN) [4], and MMI device ontology (MMI) [5], have been widely used in the IoT domain. These sensor ontologies share a great deal of overlapping information, but heterogeneity also exists between them, since one sensor concept might be represented with different terminologies, granularities, or contexts. Sensor ontology matching is able to map identical ontology entity pairs, which enables the processing, interpretation, and sharing of sensor data that are organized with different ontological schemes [6].
Inspired by the success of metaheuristics-based matching techniques in the ontology matching domain, this work further proposes a competitive binary particle swarm optimization algorithm (CBPSO) and uses it to address the sensor ontology matching problem. To better trade off the algorithm's search performance, CBPSO uses a guiding matrix (GM) to describe the solutions' distribution and guide the initialization of newly generated particles; then, two competitive subpopulations are introduced to focus on exploitation and exploration, respectively. The contributions made in this work are as follows: (1) we present the mathematical formulation of the sensor ontology matching problem; (2) we propose a CBPSO-based sensor ontology matching technique, which uses the GM to ensure the population's diversity and two competitive subpopulations to trade off the algorithm's exploitation and exploration; (3) we apply CBPSO to the ontology alignment evaluation initiative (OAEI)'s benchmark and to the matching tasks of three real sensor ontologies; the results reveal that CBPSO is able to determine high-quality sensor alignments.
The rest of the paper is organized as follows. Related work is reviewed in Section 2; the sensor ontology matching problem is defined in Section 3; CBPSO is presented in detail in Section 4; the experimental results are shown in Section 5; finally, Section 6 draws the conclusion.
2. Related Work
Determining a high-quality sensor ontology alignment is a challenge since there exist rich semantic relationships among sensor concepts [7]. To face this challenge, researchers have proposed various intelligent matching techniques in recent years. FuzzyAlign [8] integrates the semantic sensor web with a fuzzy theory-based method, and it uses an evolutionary algorithm (EA) to find the optimal threshold for filtering the final alignment. Differential evolution-based ontology matching (DEOM) [9] first uses a neural network to train the ontology matcher, and then DE is used to find the correct correspondences. Co-evolutionary algorithm-based ontology matching (cEAOM) [10] uses multiple populations to trade off the algorithm's performance and improve the searching efficiency. Evolutionary tabu search algorithm (ETSA) [11] improves the algorithm's convergence speed by combining the tabu search algorithm [12], used as the local search strategy, with EA. Due to the limitation of f-measure [13], the classic EA might optimize the solution's quality by improving recall (or precision) while sacrificing the other [14]. To overcome this drawback, multiobjective evolutionary algorithm-based ontology matching (MOEAOM) [15] is proposed to simultaneously optimize the two objectives. Compared with single-objective EA, MOEAOM is able to provide the decision makers with more nondominated options.
Besides EA, another category of metaheuristics, i.e., swarm intelligence (SI), has also been applied to matching sensor ontologies. Borrowing the idea from cEA, a co-Firefly Algorithm (cFA) [16] is proposed to optimize the sensor ontology alignment's quality. It uses two subpopulations, i.e., the better subpopulation, which has the higher-quality elite, and the worse subpopulation, whose elite's fitness value is lower. These two subpopulations focus on exploitation and exploration, respectively, and when the worse subpopulation's elite outperforms that of the better subpopulation, they adaptively switch searching strategies. More recently, particle swarm optimization (PSO) [17] has also been used to aggregate different sensor ontology matchers; it borrows the idea from ETSA and introduces simulated annealing (SA) [18] to execute the local search process.
3. Sensor Ontology Matching Problem
A sensor ontology is a 3-tuple $O = (C, P, R)$, where $C$, $P$, and $R$, respectively, represent the sets of concepts, data-type properties, and relationships between two concepts [19]. Matching sensor ontologies aims at finding the sensor ontology alignment, which is a set of sensor entity correspondences. Each correspondence is a 4-tuple $(e, e', r, conf)$, where $e$ and $e'$ are, respectively, entities of the two sensor ontologies, $r$ is the relationship between the two entities, and $conf$ is a real number in [0, 1] that denotes to what extent the correspondence holds [20]. Often, $conf$ is measured by the similarity value between the two entities. If $conf = 1$, the two entities are the same, and if $conf = 0$, they are different.
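The 4-tuple structure of a correspondence can be sketched as a small data type. This is only an illustration: the class name, field names, and the entity identifiers in the example are hypothetical and do not come from the ontologies themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    """One sensor entity correspondence (a 4-tuple); field names are illustrative."""
    source: str        # entity from the first sensor ontology
    target: str        # entity from the second sensor ontology
    relation: str      # relationship between the two entities, e.g. "="
    confidence: float  # degree in [0, 1] to which the correspondence holds

    def __post_init__(self):
        # Reject confidence values outside the interval stated in the text.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")


# An alignment is a set of correspondences (entity names are made up).
alignment = {
    Correspondence("ssn:Sensor", "mmi:Device", "=", 0.82),
    Correspondence("ssn:Observation", "csiro:Measurement", "=", 0.67),
}
```

Making the type immutable and hashable lets an alignment be stored as a plain set, which matches its definition as a set of correspondences.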
There are three categories of ontology matchers for measuring the entities' similarity, which are based on string, linguistic, and ontology structure information, respectively [21]. This work uses the hybrid similarity measure proposed by Xue and Wang [22], which combines the three kinds of ontology matchers to improve the result's confidence. Given an alignment $A$, the more correspondences it has (i.e., the larger $|A|$) and the higher its mean similarity value, the better its quality. On this basis, we use the following metric to evaluate the quality of $A$ [23]:
To optimize the alignment's quality, the sensor ontology matching problem is defined as follows:
where $X$ is the matching matrix, $x_{ij} = 1$ means that the $i$th entity in ontology $O_1$ is mapped to the $j$th entity in ontology $O_2$, and the function $f(\cdot)$ measures the quality of the alignment corresponding to $X$.
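How a binary matching matrix encodes an alignment can be sketched as follows. The sketch only decodes the matrix and reports the two quantities the text ties to quality (the number of correspondences and their mean similarity); it does not reproduce the quality metric of [23]. The function names and the toy similarity values are assumptions made for illustration.

```python
def decode_alignment(matrix, similarity):
    """Return (i, j, sim) for every cell set to 1 in the binary matching matrix."""
    return [
        (i, j, similarity[i][j])
        for i, row in enumerate(matrix)
        for j, bit in enumerate(row)
        if bit == 1
    ]


def alignment_summary(matrix, similarity):
    """Return the alignment's size and the mean similarity of its correspondences."""
    pairs = decode_alignment(matrix, similarity)
    if not pairs:
        return 0, 0.0
    mean_sim = sum(sim for _, _, sim in pairs) / len(pairs)
    return len(pairs), mean_sim


# Toy example: 2 entities in the first ontology, 3 in the second (values made up).
similarity = [[0.9, 0.1, 0.2],
              [0.3, 0.8, 0.4]]
matrix = [[1, 0, 0],
          [0, 1, 0]]
size, mean_sim = alignment_summary(matrix, similarity)
```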
4. Competitive Binary Particle Swarm Optimization Algorithm
To trade off the algorithm's exploitation and exploration, CBPSO uses two subpopulations, where the one with the better (worse) global elite particle is called the better (worse) subpopulation. In this work, the better subpopulation mainly focuses on exploitation, and the worse one tends to explore the unknown domain.
4.1. Initialize Particle
CBPSO uses a binary encoding mechanism to encode a particle, which is denoted by a 0-1 matrix whose rows and columns correspond, respectively, to the two entity sets; an element equal to 1 means that the two corresponding entities are mapped, while 0 means that they are not. When the number of entities is huge, this matrix is sparse, i.e., most of its elements are 0. Therefore, the elements with value 1 need to be distributed uniformly across the matrices of each subpopulation as a whole. To this end, we introduce a guiding matrix, which helps to ensure the diversity of each subpopulation. For the sake of clarity, given a guiding matrix $GM$ and a matching matrix $X$ [24], we show the pseudocode of initializing a particle in the subpopulation in Algorithm 1.
Algorithm 1: Initializing a particle in a subpopulation.
In Algorithm 1, we initialize a new particle according to $GM$, whose elements represent the distribution of the current subpopulation's particle positions. The larger the value of an element of $GM$, the more likely it is that this dimension has already been explored by the existing particles, and so the smaller its probability of being 1 in the newly generated particle should be. In this way, we are able to ensure the diversity of the subpopulation to the maximum extent.
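The idea of guided initialization can be sketched as below. Since the exact probability rule of Algorithm 1 is not given in the text, the sketch assumes one simple choice: the probability of a 1 shrinks linearly with the share of existing particles that already set that cell. The function names, the `density` parameter, and this linear rule are all assumptions.

```python
import random


def build_guiding_matrix(particles):
    """Sum the subpopulation's binary matching matrices cell by cell."""
    rows, cols = len(particles[0]), len(particles[0][0])
    gm = [[0] * cols for _ in range(rows)]
    for particle in particles:
        for i in range(rows):
            for j in range(cols):
                gm[i][j] += particle[i][j]
    return gm


def init_particle(gm, n_particles, density=0.2, rng=random):
    """Create a particle whose 1s avoid cells that are already heavily explored.

    Assumed rule: P(cell = 1) = density * (1 - count / n_particles).
    """
    particle = []
    for row in gm:
        new_row = []
        for count in row:
            explored = count / n_particles if n_particles else 0.0
            prob_one = density * (1.0 - explored)
            new_row.append(1 if rng.random() < prob_one else 0)
        particle.append(new_row)
    return particle


# Two existing particles that both map the diagonal cells.
existing = [[[1, 0], [0, 1]], [[1, 0], [0, 1]]]
gm = build_guiding_matrix(existing)
fresh = init_particle(gm, n_particles=len(existing), density=1.0)
```

With `density=1.0`, the new particle never repeats the fully explored diagonal cells and always sets the unexplored ones, which makes the diversity effect easy to see on a toy case.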
4.2. Update Particle
In each iteration, each particle is updated according to the following equations:
where $p_{l}$ and $p_{g}$ are, respectively, the particle's local optimum and its subpopulation's global optimum, and $\omega_t$ is the inertia weight in the $t$th generation, which is determined by the following equation:
$c_1$ and $c_2$ are two learning factors, whose values are, respectively, calculated by the following equations:
where the remaining symbols are preset constants and $T$ is the maximum generation. Here, we use different annealing strategies on $\omega$, $c_1$, and $c_2$ to trade off the algorithm's exploration and exploitation. All these parameters are configured empirically so that the algorithm achieves the highest average f-measure over all testing cases.
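As a rough illustration of this kind of update, the sketch below uses the standard binary PSO scheme (velocity update followed by a sigmoid transfer function) with a linearly decreasing inertia weight. It is not a reconstruction of the paper's equations: the linear schedule, the default values 0.9 and 0.4, the velocity bound `v_max`, and all function names are assumptions. The particle is treated as a flattened list of bits.

```python
import math
import random


def inertia_weight(t, max_gen, w_max=0.9, w_min=0.4):
    """Linearly anneal the inertia weight from w_max to w_min (assumed schedule)."""
    return w_max - (w_max - w_min) * t / max_gen


def sigmoid(v):
    """Map a velocity to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def update_particle(x, v, p_local, p_global, w, c1, c2, v_max=4.0, rng=random):
    """One standard binary PSO step on a flattened 0-1 position vector."""
    new_x, new_v = [], []
    for xi, vi, pl, pg in zip(x, v, p_local, p_global):
        vel = (w * vi
               + c1 * rng.random() * (pl - xi)   # pull toward the local optimum
               + c2 * rng.random() * (pg - xi))  # pull toward the global optimum
        vel = max(-v_max, min(v_max, vel))       # keep the sigmoid from saturating
        new_v.append(vel)
        new_x.append(1 if rng.random() < sigmoid(vel) else 0)
    return new_x, new_v


rng = random.Random(0)
x, v = [0, 1, 0, 1], [0.0] * 4
w = inertia_weight(t=10, max_gen=100)
x, v = update_particle(x, v, [1, 1, 0, 0], [1, 0, 0, 0], w, c1=2.0, c2=2.0, rng=rng)
```

A large inertia weight early on favors exploration, while a small one late in the run favors exploitation, which is the trade-off the annealing strategies are meant to control.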
4.3. The Pseudocode of Competitive Binary Particle Swarm Optimization Algorithm
The pseudocode of CBPSO is presented in Algorithm 2. The initialization of CBPSO consists of three steps: (1) the particles of the two subpopulations $P_{better}$ and $P_{worse}$ are initialized according to Algorithm 1, (2) the global optimal particles of $P_{better}$ and $P_{worse}$ are initialized by selecting the particles with the best objective function values, and (3) each particle's local optimum is initialized with the newly generated particle. During each iteration, each particle is first updated according to equations (3)–(8); then, the corresponding local optima and each subpopulation's global optimum are updated accordingly. After that, $P_{better}$ and $P_{worse}$ compete through their local optimal particles, and if $P_{worse}$ wins, we switch $P_{better}$ and $P_{worse}$. After the competition, the two subpopulations' guiding matrices $GM_{better}$ and $GM_{worse}$ are calculated by summing the matching matrices of all the particles in the respective subpopulation. Finally, we use $GM_{better}$ and $GM_{worse}$ to reinitialize, respectively, 10% of the particles in $P_{better}$ and 50% of the particles in $P_{worse}$ according to Algorithm 1. When the maximum iteration number $T$ is reached, the algorithm terminates and returns the global optimal particle of $P_{better}$.
Algorithm 2: Competitive binary particle swarm optimization algorithm (CBPSO).
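The overall loop described above can be sketched compactly as follows. This is a simplified illustration rather than the authors' implementation: particles are flattened bit lists, the learning factors are fixed instead of annealed, the subpopulations compete through their best particles, the best particle of each subpopulation is protected from reinitialization, and the name `cbpso` with its parameters is hypothetical. The fitness function is supplied by the caller.

```python
import math
import random


def cbpso(fitness, n_bits, pop_size=10, max_gen=30, density=0.2, seed=0):
    """Simplified two-subpopulation binary PSO with guided reinitialization."""
    rng = random.Random(seed)

    def guiding(pop):
        # Per-bit sum of all positions in the subpopulation.
        return [sum(p["x"][k] for p in pop) for k in range(n_bits)]

    def new_particle(gm, size):
        # Assumed rule: bits that are often 1 already are less likely to be 1.
        x = [1 if rng.random() < density * (1 - g / size) else 0 for g in gm]
        return {"x": x, "v": [0.0] * n_bits, "best": x[:], "best_fit": fitness(x)}

    def best_of(pop):
        return max(pop, key=lambda p: p["best_fit"])

    def step(pop, leader, w, c1, c2):
        for p in pop:
            for k in range(n_bits):
                v = (w * p["v"][k]
                     + c1 * rng.random() * (p["best"][k] - p["x"][k])
                     + c2 * rng.random() * (leader["best"][k] - p["x"][k]))
                p["v"][k] = max(-4.0, min(4.0, v))
                p["x"][k] = 1 if rng.random() < 1 / (1 + math.exp(-p["v"][k])) else 0
            fit = fitness(p["x"])
            if fit > p["best_fit"]:
                p["best"], p["best_fit"] = p["x"][:], fit

    def reinit(pop, fraction):
        gm, keep = guiding(pop), best_of(pop)
        others = [p for p in pop if p is not keep]  # never discard the elite
        for p in rng.sample(others, min(len(others), int(fraction * len(pop)))):
            p.update(new_particle(gm, len(pop)))

    empty = [0] * n_bits
    better = [new_particle(empty, pop_size) for _ in range(pop_size)]
    worse = [new_particle(empty, pop_size) for _ in range(pop_size)]
    for t in range(max_gen):
        w = 0.9 - 0.5 * t / max_gen                        # assumed linear annealing
        step(better, best_of(better), w, c1=1.0, c2=2.0)   # leans toward exploitation
        step(worse, best_of(worse), w, c1=2.0, c2=1.0)     # leans toward exploration
        if best_of(worse)["best_fit"] > best_of(better)["best_fit"]:
            better, worse = worse, better                  # competition: swap roles
        reinit(better, 0.10)
        reinit(worse, 0.50)
    champion = best_of(better)
    return champion["best"], champion["best_fit"]


# Toy objective: number of bits that agree with a made-up target vector.
target = [1, 0, 1, 1, 0, 0, 1, 0]
best, best_fit = cbpso(lambda x: sum(a == b for a, b in zip(x, target)), len(target))
```

Reinitializing a larger share of the worse subpopulation keeps it exploring, while the better subpopulation changes little and refines what it has already found.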
5. Experiment
5.1. Experimental Setup
In this work, ontology alignment evaluation initiative (OAEI)’s benchmark and three real sensor ontologies are used to test CBPSO’s performance. In Tables 1–3, we compare CBPSO with five SI-based sensor ontology matching techniques, i.e., differential evolution algorithm (DE) [9], co-evolutionary algorithm (cEA) [10], evolutionary tabu search algorithm (ETSA) [11], co-firefly algorithm (cFA) [16], and simulated annealing particle swarm optimization (SA-PSO) [18] in terms of recall, precision, and f-measure [25], respectively. Their results are the average of thirty independent runs. Table 4 briefly describes OAEI’s benchmark and three real sensor ontologies.
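For reference, recall, precision, and f-measure of an alignment against a reference alignment can be computed as in the sketch below, using the standard definitions with f-measure as the harmonic mean of precision and recall. The function name and the toy entity pairs are illustrative only.

```python
def evaluate(found, reference):
    """Return (precision, recall, f_measure) of a found alignment."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Made-up example: 3 correspondences found, 2 of them in a reference of size 4.
found = {("a1", "b1"), ("a2", "b2"), ("a3", "b9")}
reference = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
precision, recall, f_measure = evaluate(found, reference)
```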
5.2. Statistical Experiment
We utilize the statistical testing method T-test [26] to compare the different competitors' performance in terms of recall, precision, and f-measure, respectively. Tables 1 and 2 show the mean recall, precision, and f-measure of the six SI-based sensor ontology matching techniques, together with the corresponding standard deviations, on all the testing cases, and Table 3 presents the $p$-values on recall, precision, and f-measure.
In Tables 1 and 2, the alignments determined by CBPSO are much better than those of the other sensor ontology matching techniques. Due to the subpopulations' competitive mechanism, CBPSO is able to better trade off the algorithm's exploitation and exploration, which ensures not only the solution's quality but also the algorithm's stability. In Table 3, the degree of freedom is 61 (the two samples' scales are both 33) and the significance level is 5%; except for those results that are the same, the $p$-values of the rest are all smaller than 0.005, which shows that CBPSO significantly outperforms the other competitors at the 5% significance level.
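A two-sample t statistic and its degrees of freedom can be computed with the standard library as sketched below. The text does not state which variant of the T-test was used, so the sketch assumes Welch's unequal-variance form; the function name and the sample values are made up and are not experimental results.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    se_a = statistics.variance(sample_a) / n_a  # squared standard error of mean A
    se_b = statistics.variance(sample_b) / n_b  # squared standard error of mean B
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical scores of two matchers on five test cases.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
```

The resulting statistic is then compared against the t distribution with the computed degrees of freedom to obtain a $p$-value.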
6. Conclusion
Building sensor ontologies and using them to annotate the sensor data is a feasible way to address the IoT data heterogeneity issue. However, before using these ontologies, we need to address the heterogeneity problem among them. Recently, SI-based matching techniques have emerged as a popular method of matching heterogeneous sensor ontologies, and this work proposes a CBPSO-based sensor ontology matching technique. In particular, CBPSO uses the GM to ensure the diversity of the population and two competitive subpopulations to trade off the algorithm's exploitation and exploration. The experiment compares CBPSO with five SI-based sensor ontology matching techniques, and the experimental results show that CBPSO outperforms its competitors.
In the future, we will further improve CBPSO by introducing more subpopulations with different searching strategies. In addition, the competing mechanism can be further improved by referring to the binary fish migration optimization algorithm (BFMO) [27]. We are also interested in further improving CBPSO to match biomedical ontologies [28], which is a large-scale matching task. Some efficiency-improving strategies, such as compact encoding, will be used, and an efficient compact evolutionary paradigm will be one of our future studies.
Data Availability
The data used to support this study can be found at http://oaei.ontologymatching.org.
Conflicts of Interest
The authors declare that they have no conflicts of interest in the work.
Acknowledgments
This work was supported in part by the Scientific Research and Technology Development Project of Yulin City (no. 20204012), Natural Science Foundation of Guangxi Province (no. 2021GXNSFAA220076), and Science Research Foundation for High-level talents of Yulin Normal University (no. G2021ZK17).