Mathematical Problems in Engineering

Research Article

Matching Large Scale Ontologies Based on Filter and Verification

Algorithm 1

Pseudocode description of the framework.

Inputs: Source ontology (SO) and Target ontology (TO)
Outputs: Alignments (A)
Variables:
hashmap<(length, prefix), ID> // stores the length and prefix of entity labels in SO and TO
typicalList<ID_eso, ID_eto> // stores pairs of typical entities from SO and TO
blockSetSO // stores the set of blocks of SO
blockSetTO // stores the set of blocks of TO
subOntoSetSO // stores the set of sub-ontologies of SO
subOntoSetTO // stores the set of sub-ontologies of TO
(1)	//Step 1: generate typical entities
(2)	for entity ei in SO, TO:
(3)	length = number of letters in label of ei
(4)	prefix = prefix of label of ei
(5)	hashmap.put(length, prefix, ID_ei)
(6)	end for
(7)	for key (length, prefix) in hashmap:
(8)	for IDs in values (ID_e1, ID_e2, …, ID_en):
(9)	if simT(ID_ei, ID_ej) > threshold and ID_ei ∈ SO and ID_ej ∈ TO: // simT is given in formula (2)
(10)	typicalList.add(ID_ei, ID_ej)
(11)	end if
(12)	end for
(13)	end for
(14)	//Step 2: clustering typical entities
(15)	for every pair (ID_eso, ID_eto) in typicalList:
(16)	setSO.add(ID_eso), setTO.add(ID_eto)
(17)	end for
(18)	blockSetSO = partition(setSO), blockSetTO = partition(setTO) // partition is described in [15]
(19)	//Step 3: partitioning/merging blocks
(20)	for every entity set Si in blockSetSO:
(21)	for every entity ID_eso[k] in Si:
(22)	tempEntity = typicalList.get(indexOf(ID_eso[k])).get(1)
(23)	if k = 0:
(24)	//correSet records the corresponding setTO in blockSetTO
(25)	correSet = setTO which contains tempEntity in blockSetTO
(26)	else if tempEntity is not in correSet:
(27)	//sourceSet records the corresponding setSO in blockSetSO
(28)	sourceSet = setSO which contains tempEntity in blockSetSO
(29)	correSet.add(tempEntity)
(30)	sourceSet.delete(tempEntity)
(31)	end if
(32)	end for
(33)	end for
(34)	//Step 4: extracting sub-ontologies
(35)	for every entity set Si in blockSetSO, blockSetTO:
(36)	do:
(37)	n = Si.size
(38)	tempSet = empty set
(39)	//rH, rC, and ek are given in formulas (3)–(8)
(40)	for every entity ID_eso[k] in Si:
(41)	candidateEntity = ID_eso[k].get(rdfs:subClassOf)
(42)	if rH(candidateEntity, ID_eso) > threshold:
(43)	tempSet.add(candidateEntity)
(44)	candidateEntity = ID_eso[k].get(rdfs:hasSomeValueFrom)
(45)	if rC(candidateEntity, ID_eso) > threshold:
(46)	tempSet.add(candidateEntity)
(47)	update extension factor ek
(48)	end for
(49)	Si.add(tempSet)
(50)	while (Si.size != n)
(51)	end for
(52)	subOntoSetSO = blockSetSO
(53)	subOntoSetTO = blockSetTO
(54)	//Step 5: matching sub-ontologies
(55)	for every pair of sub-ontologies (subSO[i], subTO[i]) in subOntoSetSO, subOntoSetTO:
(56)	//matchStruc by V-DOC in [18].
(57)	structureAlignment = matchStruc(subSO[i], subTO[i])
(58)	//matchSema by GMO in [19].
(59)	semanticAlignment = matchSema(subSO[i], subTO[i], structureAlignment)
(60)	A.add(structureAlignment), A.add(semanticAlignment)
(61)	end for
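
As an illustration only, Step 1 (the filter) and the fixed-point loop of Step 4 can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: formulas (2) to (8) are not reproduced in this listing, so `sim_t` uses `difflib.SequenceMatcher` as a stand-in for simT, `relevance` is a caller-supplied stand-in for rH and rC, and the update of the extension factor ek is omitted. The input shapes (dicts from entity ID to label, a `neighbours` dict) and the parameters `prefix_len` and `threshold` are hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def sim_t(a: str, b: str) -> float:
    # Stand-in for simT; the paper's formula (2) is not shown in this listing.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def typical_pairs(so, to, prefix_len=3, threshold=0.9):
    """Step 1 sketch: bucket labels by (length, prefix), compare within buckets.

    so, to: dicts mapping entity ID -> label (assumed input shape).
    prefix_len is an assumption; the listing does not fix the prefix size.
    """
    # Each bucket holds two lists: IDs from SO and IDs from TO.
    buckets = defaultdict(lambda: ([], []))
    for side, onto in enumerate((so, to)):
        for ent_id, label in onto.items():
            n_letters = sum(ch.isalpha() for ch in label)
            key = (n_letters, label[:prefix_len].lower())
            buckets[key][side].append(ent_id)

    pairs = []
    for src_ids, tgt_ids in buckets.values():
        # Only cross-ontology pairs that share a bucket are compared.
        for s in src_ids:
            for t in tgt_ids:
                if sim_t(so[s], to[t]) > threshold:
                    pairs.append((s, t))
    return pairs


def expand_block(block, neighbours, relevance, threshold=0.5):
    """Step 4 sketch: grow a block until an iteration adds nothing new.

    neighbours: dict entity -> related entities (e.g. via subClassOf).
    relevance: stand-in for rH / rC, called as relevance(candidate, entity).
    """
    block = set(block)
    while True:
        size_before = len(block)
        added = set()
        for ent in block:
            for cand in neighbours.get(ent, ()):
                if relevance(cand, ent) > threshold:
                    added.add(cand)
        block |= added
        if len(block) == size_before:  # mirrors while (Si.size != n)
            return block
```

Bucketing keeps the number of simT evaluations well below the full cross product, at the cost of never comparing labels that differ in length or prefix (for example "Kidney" and "Kidneys" fall into different buckets in this sketch).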