Ontology Mapping and Rule-Based Inference for Learning Resource Integration

Jetinai Kotchakorn; Arch-int Ngamnij; Arch-int Somjit

doi:10.6109/jicce.2016.14.2.097

Open-access journal
Journal of information and communication convergence engineering

Published in: Journal of information and communication convergence engineering, Volume 14, Issue 2, pp. 97-105, 30 June 2016

KEYWORDS

Conflict resolution, Learning resources, Ontology mapping, Rule-based inference

I. INTRODUCTION

Owing to the rapid growth of e-learning communities, several educational institutions have developed a variety of learning management systems (LMSs) independently, which they have constructed based on their own functional requirements. Therefore, there is a rapidly increasing number of duplicate learning resources (LRs) published on different sites. Metadata plays a crucial role in describing the content of such LRs and in facilitating their integration, to enable the reusability and exchangeability of existing LRs in different LMSs. The use of different types of metadata still results in a problem of interoperability across heterogeneous LRs. Furthermore, most of these metadata standards lack formal semantics and a common standard between heterogeneous metadata descriptions across domains.

In recent years, a number of researchers [1,2] have used metadata standards and ontologies to annotate LRs semantically, which improves their discovery and reuse and facilitates sharing of LRs among LMSs. Several studies of LR sharing using ontology mapping have also been performed [3,4]. Despite the wide acceptance of these approaches, semantic interoperability and semantic discovery across heterogeneous LMSs remain obstacles to sharing available LRs. Differences in the meaning of the information described and in the design of information systems lead to information heterogeneity problems (or semantic conflicts). In this paper, we classify these problems into two main levels, described below [5,6].

1) Semantic heterogeneity occurs when there is disagreement about the meaning, interpretation, or intended use of the same or related data.

2) Structural heterogeneity occurs when the same concepts are modelled with different logical structures in different systems. In most cases, no direct concept-to-concept mapping is possible.

To address the problems above, we propose a method for resolving heterogeneity using ontology mapping, extended from our previous work [7-9]. We apply the Semantic Web Rule Language (SWRL) to the ontology mapping problem, especially in the case of structural conflicts, for which the Web Ontology Language (OWL) has limited capability.

The remainder of the paper is structured as follows: Section II illustrates heterogeneity problems with a motivating example. Section III describes the design of the common ontology that was used in most of this work to overcome the problems outlined above. The ontology mapping for integration is presented in Section IV. Section V presents experimental results obtained via the system implementation and evaluation of the proposed mapping technique. Finally, our conclusions and suggestions for future work are summarized in Section VI.

II. MOTIVATING EXAMPLE

To illustrate the information heterogeneity problems that exist in most e-learning systems, we demonstrate here how two different LR ontologies can extract only a portion of the learning-content resources from distinct LMS repositories. The ontologies are referred to as LR1 and LR2, and are shown in Figs. 1 and 2, respectively.

[Fig. 1.] The source ontology LR1.

[Fig. 2.] The source ontology LR2.

  > A. Semantic Heterogeneity

Semantic heterogeneity occurs when there is a disagreement about the meaning, interpretation, or intended use of the same or related data [10,11]. Semantic heterogeneity is classified into three types: naming conflicts, scaling conflicts, and property value conflicts. These types are described below.

1) Naming conflicts encompass two different kinds of conflict, namely synonyms and homonyms. Synonyms are semantically equivalent concepts or properties defined by different names. For example, the concept (or class) LR1:Teacher and the concept LR2:Lecturer are synonymous concepts, since they both refer to the same fact. Homonyms, on the other hand, are semantically unrelated concepts or properties defined by the same name. For example, the property LR1:name refers to the name of a Learning Resource, whereas the property LR2:name signifies the name of a Person.

2) Scaling conflicts concern semantically equivalent properties defined using different scales (or units of measurement). For example, the properties LR1:salary and LR2:salary have different units of measurement, ‘EUR’ and ‘USD’, respectively.

3) Property value conflicts concern semantically equivalent properties defined with different property values. For example, the property LR1:gender defines ‘M’ and ‘F’ to refer to ‘male’ and ‘female’, whereas the property LR2:sex uses ‘0’ and ‘1’ to represent ‘male’ and ‘female’, respectively.

  > B. Structural Heterogeneity

Structural heterogeneity occurs when the same concepts are modelled with different logical structures in different systems. Although there are several publications that classify structural heterogeneity into various types of conflicts [5,6,10,11], this paper focuses on four main kinds of such conflicts, as described below.

1) Generalization conflicts concern semantically related concepts that are defined in different systems, where the concepts in one system subsume the concepts in another system. For example, the concept LR1:Members subsumes the concept LR2:Guest since the concept LR2:Guest is a subconcept of LR1:Members.

2) Aggregation conflicts arise when a property or a concept in one system maps to a group of properties or concepts, respectively, in another system. For example, the property LR2:name of the concept LR2:Person is equivalent to a group of properties, LR1:title, LR1:firstName, and LR1:lastName, of the concept LR1:Members.

3) Property discrepancies concern semantically equivalent properties defined with different property types. For example, the properties LR1:author and dc:creator in LR2 are semantically equivalent, but the property LR1:author is a datatype property, whereas dc:creator is an object property.

4) Concept discrepancies occur when the logical structure of a set of properties and their values belonging to a concept in one system are organized to form a different structure in another system. For example, the concept LR1:Subject is equivalent to the concept LR2:LearningResource, whose property LR2:hasStructure has the concept LR2:Course as its range.

III. A COMMON ONTOLOGY

To overcome heterogeneity problems, we designed the Common Ontology (CO) as a standard mediating ontology to support the integration. The CO is defined using basic terms drawn from standard metadata for e-learning. Two common standards, namely Dublin Core (DC) and IEEE Learning Object Metadata (LOM), are included. These standards aim to enable usability, aid discoverability, and facilitate interoperability, usually in the context of online LMSs. Terminology outside this common vocabulary must be translated into the terminology of the metadata standard; otherwise, a comparison of data semantics is not possible. In this paper, the common terms that cannot be described with the DC and LOM vocabularies are prefixed with the namespace CO, as depicted in Fig. 3.

[Fig. 3.] The common ontology.

The CO structure consists of standard concepts and standard properties. The standard concepts include, for example, CO:LearningResource, CO:Course, CO:Unit, CO:LearningResourceFile, dcterms:IMT, and so on.

The standard properties are of two types: datatype properties and object properties. A datatype property, such as dc:description, has a literal as its range, whose datatype is one of the built-in XML Schema datatypes. A property that has a concept as its range, such as dc:format, is called an object property. The arrows in Fig. 3 labelled ‘is-a’ (subClassOf) establish a relationship between concepts in the form of a subsumption hierarchy. The CO is extensible, so other concepts and properties can be added to it. The standard CO provides an abstract view through which users access information about each local ontology.

IV. ONTOLOGY MAPPING

  > A. Mapping Rules

This section describes the method of conflict detection and resolution for overcoming the problems described in Section II. The conflict detection method is presented in the form of rules and algorithms, whereas the resolution method is presented as mapping rules, which are used to map between any local ontology and the CO. In our approach, the entities of LR1 and LR2 must each be mapped to the related standard entity in the CO, so that both ontologies share their LRs through the CO. Once LR1 and LR2 have been mapped to the CO, LR1 can be mapped to the semantic entities of LR2 automatically, because both mappings refer to the common schema of the CO. This becomes increasingly beneficial as more educational content providers use the CO to share a standard mapping to their repositories globally.

The following definitions are general notations for the CO components. They are represented on the basis of object-oriented programming and set theory.

DEFINITION 1. CO is a set of learning-resource metadata based on the Common Ontology, defined as a tuple CO = <C_co, P_co, I_co, O_co, RC_co, RP_co, σ_co>, where C_co, P_co, I_co, O_co, RC_co, RP_co, and σ_co are defined as below.

DEFINITION 2. C_co is a finite set of common concepts, C_co = {c_ci |∀i = 1…n}.

DEFINITION 3. P_co is a finite set of common properties, P_co = {p_ci |∀i = 1…n}.

DEFINITION 4. I_co is a set of individuals (or instances) of the common concepts,

DEFINITION 5. O_co is a set of range concepts or literal values of the common properties, called objects, O_co = {o_ci ∈ I_co or o_ci = Literal value | ∀i = 1…n}.

DEFINITION 6. RC_co is a set of mappings from a concept to a concept and is defined as a set of axioms, where

DEFINITION 7. RP_co is a set of mappings from a property to a property and is defined as a set of axioms, where

DEFINITION 8. σ_co is a set of atoms conforming to the SWRL axioms. Atoms can be formed as σ_co= {C_co(x), P_co(x, y), sameAs(x, y), differentFrom(x, y), builtIn(r, x,…)}, where x, y are either variables, individuals or literal values. An atom C_co(x) holds if x is an instance of the concept C_co; an atom P_co(x, y) holds if x is related to y by the property P_co; an atom sameAs(x, y) holds if x is interpreted as the same object as y; an atom differentFrom(x, y) holds if x and y are interpreted as different objects; and builtIn(r, x,…) holds if the built-in relation r holds on the interpretations of the arguments.

The following functions define the domain, range, and type of the common properties.

Γ: P_co → T_p gives the set of property types (T_p) of a property p_ck ∈ P_co.

If the range of p_ck is a concept in C_co, then Γ(p_ck) = ObjectProperty. If the range of p_ck is a literal value, then Γ(p_ck) = DatatypeProperty.
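The property-type function Γ can be pictured with a small Python sketch. This is only an illustration under stated assumptions: each property is modelled by its range alone, the marker `LITERAL` and the function name `property_type` are hypothetical, and the range given for dc:format is assumed for the example.

```python
# Hypothetical marker for a literal-valued range.
LITERAL = "Literal"

def property_type(range_of_property, concepts):
    """Return the type of a property, decided only by its range."""
    if range_of_property in concepts:
        return "ObjectProperty"      # range is a concept in C_co
    if range_of_property == LITERAL:
        return "DatatypeProperty"    # range is a literal value
    raise ValueError(f"unknown range: {range_of_property!r}")

# A few standard concepts named in Section III.
C_co = {"CO:LearningResource", "CO:Course", "CO:Unit", "dcterms:IMT"}

# Assumed ranges, for illustration only.
print(property_type("dcterms:IMT", C_co))  # e.g. dc:format
print(property_type(LITERAL, C_co))        # e.g. dc:description
```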

DEFINITION 9. An ontology mapping, denoted by OM, is defined as a tuple OM = <Τ_co, Τ_lo, Φ, Ř>, where

Τ_co = {C_co ∪ P_co} is a set of CO terms consisting of common concepts and common properties.

Τ_lo = {C_lo ∪ P_lo} is a set of local ontology terms consisting of local concepts and local properties.

Φ is a set of rules defined to detect the conflicts of semantics and structure.

Ř is a set of mapping rules to enable semantic mapping for each kind of conflict.

The conflict detection rules and the mapping rules are defined in the following sections.

  > B. Conflict Detection and Resolution

1) Resolution for Semantic Heterogeneity

(a) Naming-conflict resolution: To resolve synonym conflicts, the resolution procedure applies the Wu and Palmer similarity measure [13] over WordNet [12] to compute the degree of similarity between two terms and to suggest identical terms in the two ontologies, based on an accepted threshold specified by the system.

Rule 1: If the similarity score of two terms is equal to 1, then the two terms are equivalent, i.e.

Sim: T₁ × T₂ → S, where

T₁ = {t_{lk_i} ∈ T_lo | ∀i = 1…n}, T₂ = {t_{ck_j} ∈ T_co | ∀j = 1…m}, and

S = {s_i | ∀i = 1…n and 0 ≤ s_i ≤ 1, with s_i being the similarity score}.

A term t_lk ∈ T_lo is mapped onto t_ck ∈ T_co if and only if both t_lk and t_ck are semantically equivalent terms, denoted as t_lk ≅ t_ck, and then s_k ∈ S is equal to 1, i.e. Sim_wup(t_lk, t_ck) = 1. This means that the terms t_lk and t_ck are in the same synset.
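To make Rule 1 concrete, the following Python sketch computes a Wu and Palmer score, 2·depth(lcs) / (depth(a) + depth(b)), over a tiny hand-made taxonomy. The taxonomy, the word-to-synset table, and the function names are hypothetical stand-ins for WordNet, chosen only so that the example is self-contained; they are not actual WordNet data.

```python
# Toy is-a taxonomy (child -> parent). Hypothetical, NOT real WordNet data.
PARENT = {
    "entity": None,
    "person": "entity",
    "educator": "person",
    "teacher": "educator",
    "student": "person",
}

# Words mapped to synset nodes; synonyms share a node (assumed for illustration).
SYNSET = {
    "teacher": "teacher",
    "instructor": "teacher",
    "educator": "educator",
    "student": "student",
}

def path_to_root(node):
    """Return the nodes from `node` up to the root, node first."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def depth(node):
    """Depth counted from the root, with the root at depth 1."""
    return len(path_to_root(node))

def wup_similarity(word_a, word_b):
    """Wu-Palmer score: 2 * depth(lcs) / (depth(a) + depth(b))."""
    a, b = SYNSET[word_a], SYNSET[word_b]
    ancestors_b = set(path_to_root(b))
    # The first shared node on the way up from a is the deepest common ancestor.
    lcs = next(n for n in path_to_root(a) if n in ancestors_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))

def is_equivalent(word_a, word_b):
    """Rule 1: two terms are equivalent when their score equals 1."""
    return wup_similarity(word_a, word_b) == 1.0
```

With this toy data, two words in the same synset score exactly 1 and are proposed as equivalent, while related but distinct terms score below 1 and would be left to a threshold or a domain expert.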

The mapping rules for resolving the conflict in other cases are as below, where t_lk and t_ck are the terms involved:

Mapping Rule 1:

Case 1: t_lk and t_ck are terms for concepts:

Case 2: t_lk and t_ck are terms for properties:

p_lk ≡ p_ck iff Γ(p_lk) ≡ Γ(p_ck).

(b) Scaling-conflict resolution: The property LR1:salary can be converted to the same currency unit as the standard property in the CO.

Rule 2: If the two scaling units are in conflict, then the local unit needs to be converted to the standard unit as in the CO.

A local literal value o_lk of LR1:salary has a scaling unit (such as EUR), i.e. p_lk(?x, ?y) = LR1:salary(?x, ?unit1), where ?unit1 refers to o_lk in the local unit. The standard literal value o_ck of CO:salary has a different unit (such as USD), i.e. p_ck(?x, ?y) = CO:salary(?x, ?unit2), where ?unit2 refers to o_ck in the standard unit. In this resolution, we apply the multiply built-in function of SWRL to convert o_lk to the unit of o_ck in the mapping rule below.

Mapping Rule 2:

The property p_lk' is generated to receive ?unit2 of the property p_lk executed by the swrlb:multiply function, and ‘?const_rate’ is the variable representing the exchange rate for converting from ?unit1 to ?unit2.

The following mapping rule applies Mapping Rule 2 to resolve the scaling conflict occurring in LR1:salary, where 1.23 is the exchange rate from EUR to USD.
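The effect of this rule can be imitated outside a rule engine with a few lines of Python. The sketch below is not the SWRL rule itself; it only mirrors the multiplication performed by swrlb:multiply, using the rate 1.23 quoted above. The function name and the dictionary layout are hypothetical.

```python
from decimal import Decimal

# Exchange rate from EUR to USD quoted in the text (plays the role of ?const_rate).
EUR_TO_USD = Decimal("1.23")

def to_standard_salary(local_salary_eur, rate=EUR_TO_USD):
    """Convert a local salary value (?unit1, EUR) to the standard unit (?unit2, USD)."""
    return Decimal(str(local_salary_eur)) * rate

# A hypothetical LR1 individual and its CO counterpart.
lr1_member = {"LR1:salary": "1000"}
co_member = {"CO:salary": to_standard_salary(lr1_member["LR1:salary"])}
print(co_member["CO:salary"])
```

Decimal arithmetic is used here so that currency values do not pick up binary floating-point rounding noise.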

(c) Property-value conflict resolution: The values of LR1:gender and LR2:sex can be converted to the same value as the standard property in the CO.

Rule 3: If the values, o_lk and o_ck, of the equivalent properties are in conflict, then the local value o_lk needs to be converted to the standard value o_ck as in the CO.

In this resolution, we apply SWRL to convert the o_lk of LR1:gender = “M” to o_ck = “Male” and to convert o_lk = “F” to o_ck = “Female” in the mapping rule below.

Mapping Rule 3:

where “a” and “b” refer to the literal values of o_lk and o_ck, respectively.

The resolution of the conflicts in the case study is presented below:
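As a rough illustration of this value conversion (a Python sketch, not the SWRL rules themselves), each local property can carry a lookup table from its local values to the standard values. The tables use the values given in Section II; the function name and the table layout are hypothetical.

```python
# Local value -> standard value, per local property (values taken from Section II).
GENDER_TO_STANDARD = {
    "LR1:gender": {"M": "Male", "F": "Female"},
    "LR2:sex": {"0": "Male", "1": "Female"},
}

def to_standard_gender(local_property, local_value):
    """Map a local property value (o_lk) to the standard value (o_ck)."""
    return GENDER_TO_STANDARD[local_property][local_value]

print(to_standard_gender("LR1:gender", "M"))
print(to_standard_gender("LR2:sex", "1"))
```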

2) Resolution for Structural Heterogeneity

(a) Generalization conflict resolution: This solution considers the association between concepts, where each concept is associated with another concept as a superclass (superconcept) or a subclass (subconcept). Two possible cases can be considered, as follows.

Case 1: A standard concept c_ck has no subconcepts (it is a single class), and it subsumes a local concept c_lk.

Rule 4: If a concept c_ck has no subconcepts but it subsumes a concept c_lk, then c_lk can be assigned as a subclass of c_ck.

Mapping Rule 4: The concept c_lk must be assigned as a subclass of the concept c_ck, c_lk ⊑ c_ck, when c_ck has no subconcepts. The conflict resolution is performed by Algorithm 1 below. An image concept c_sm of c_lk is created, denoted by COImageChild(c_lk, c_sm), and is given the namespace CO. The new concept c_sm is thus equivalent to c_lk; c_sm is then declared a subclass (subClassOf) of the standard concept c_ck, so that users can view and invoke this concept from the CO.

[Algorithm 1.]
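The steps described in Mapping Rule 4 might be sketched in Python as follows. This is a loose interpretation of the prose, not a transcription of Algorithm 1: the ontology is reduced to two sets of pairs, and the function name `co_image_child` and the standard concept `CO:Members` are hypothetical.

```python
def co_image_child(ontology, local_concept, standard_concept):
    """Create a CO image of a local concept and place it under a standard concept.

    `ontology` holds two sets of (subject, object) pairs:
    "equivalentClass" and "subClassOf".
    """
    # Case 1 applies only when the standard concept has no subconcepts yet.
    if any(parent == standard_concept for _, parent in ontology["subClassOf"]):
        raise ValueError(f"{standard_concept} already has subconcepts")

    # Copy the local name into the CO namespace to form the image concept c_sm.
    local_name = local_concept.split(":", 1)[1]
    image = f"CO:{local_name}"

    ontology["equivalentClass"].add((image, local_concept))  # c_sm is equivalent to c_lk
    ontology["subClassOf"].add((image, standard_concept))    # c_sm is a subclass of c_ck
    return image

ontology = {"equivalentClass": set(), "subClassOf": set()}
image = co_image_child(ontology, "LR2:Guest", "CO:Members")  # CO:Members is assumed
print(image)
```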

Case 2: A standard concept c_ck has subconcepts and subsumes a local concept c_lk.

Rule 5: If a concept c_lk has a semantic relationship as a subClassOf a concept c_ck that has some subconcepts, then c_lk is defined as a subClassOf c_ck when c_lk is equivalent to a subconcept of that c_ck.

Mapping Rule 5: If a concept c_lk has a semantic relationship as a subClassOf a concept c_ck that has subconcepts, c_lk is defined as a subClassOf that c_ck when there is an equivalence between c_lk and a subconcept of c_ck. The conflict resolution is performed after Algorithm 2 below is executed.

[Algorithm 2.]

After Mapping Rule 5 has been applied, the concept c_lk is automatically assigned as a subconcept of the superconcept c_ck, which is denoted as subClassOf(c_lk, c_ck). This is inferred from equivalentClass(c_j, c_lk), where c_j is a subconcept of c_ck. If the class description c_lk is defined as a subClassOf c_ck, then the set of individuals of c_lk must be a subset of the set of individuals in the class extension of c_ck, i.e. I_lk ⊆ I_ck, where I_lk and I_ck are the sets of individuals of c_lk and c_ck.
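The inference in Case 2 can be pictured with a short Python sketch: if c_lk is equivalent to some subconcept c_j of c_ck, then subClassOf(c_lk, c_ck) follows. This is an illustration of the reasoning only, not Algorithm 2; the function name and the sample concept names under the CO namespace are hypothetical.

```python
def infer_subclass(local_concept, standard_concept, sub_class_of, equivalent_class):
    """Return (c_lk, c_ck) if c_lk is equivalent to a subconcept c_j of c_ck, else None."""
    for child, parent in sub_class_of:
        if parent != standard_concept:
            continue
        # equivalentClass is symmetric, so check both orientations of the pair.
        if (child, local_concept) in equivalent_class or (local_concept, child) in equivalent_class:
            return (local_concept, standard_concept)
    return None

# Hypothetical CO fragment: CO:Guest is a subconcept of CO:Members.
sub_class_of = {("CO:Guest", "CO:Members")}
equivalent_class = {("CO:Guest", "LR2:Guest")}

print(infer_subclass("LR2:Guest", "CO:Members", sub_class_of, equivalent_class))
```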

(b) Aggregation conflict resolution: When these conflicts occur, a vCard format for standard metadata is used to resolve the problem by allowing the system to define semantic equality.

Rule 6: If a property in one ontology is similar to a group of properties in another ontology, then it can be mapped to that group of properties in the other ontology, as follows.

Since a group of properties LR1:title, LR1:firstName, and LR1:lastName are semantically related to vCard:FN in the CO, the group of these properties can be mapped to the vCard:FN property in the CO by the stringConcat function. The resolution is shown in Mapping Rule 6.

Mapping Rule 6:
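The aggregation performed by this rule amounts to string concatenation. The Python sketch below is an analogue of what a stringConcat-based rule would produce, under the assumption that the three parts are joined with single spaces; the function name and the sample values are hypothetical.

```python
def to_vcard_fn(title, first_name, last_name):
    """Join LR1:title, LR1:firstName and LR1:lastName into one vCard:FN value."""
    # Empty parts are skipped so that no doubled or leading spaces appear.
    return " ".join(part for part in (title, first_name, last_name) if part)

lr1_member = {"LR1:title": "Dr.", "LR1:firstName": "Ada", "LR1:lastName": "Lovelace"}
co_member = {
    "vCard:FN": to_vcard_fn(
        lr1_member["LR1:title"], lr1_member["LR1:firstName"], lr1_member["LR1:lastName"]
    )
}
print(co_member["vCard:FN"])
```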

(c) Property discrepancy resolution: To resolve the conflict, the type of the local property needs to be transformed to a standard type.

Rule 7: If two equivalent properties have different property types, then the type of the local property Γ(p_lk) is mapped into a standard type Γ(p_ck) as the equivalent property in the CO.

Mapping Rule 7 resolves the difference between Γ(LR1:author) and Γ(dc:creator). Here, the range of LR1:author is a literal, i.e. Γ(LR1:author) = DatatypeProperty, whereas the range of dc:creator is a concept in C_co, i.e. Γ(dc:creator) = ObjectProperty. Therefore, we need to map LR1:author into the standard type of dc:creator. The resolution of this conflict depends on the designated mapping functions, as illustrated in the following example. The value of LR1:author can be separated into three standard properties, vCard:honorific-prefix, vCard:given-name, and vCard:family-name, using the swrlb:substringBefore and swrlb:substringAfter functions to split the value of LR1:author at its spaces.

Mapping Rule 7:

Note that the concept LR1:TempName is created to receive the instance (?b) that holds the three vCard properties. We have equivalentClass(LR1:TempName, vCard:Name), and equivalentProperty(LR1:author, dc:creator) because Γ(LR1:author) = ObjectProperty.
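The splitting step can be imitated in Python with `str.partition`, which returns the text before and after the first occurrence of a separator, much like a substringBefore/substringAfter pair. The sketch assumes that the author value has the form "prefix given-name family-name" separated by single spaces; the function name and the sample value are hypothetical.

```python
def split_author(author):
    """Split a literal author value into the three vCard name properties."""
    # Text before the first space is the prefix; the rest is split once more.
    prefix, _, rest = author.partition(" ")
    given_name, _, family_name = rest.partition(" ")
    return {
        "vCard:honorific-prefix": prefix,
        "vCard:given-name": given_name,
        "vCard:family-name": family_name,
    }

# The resulting dictionary plays the role of the instance (?b) of LR1:TempName.
print(split_author("Dr. Ada Lovelace"))
```

Values that do not follow the assumed three-part form (for example, a name without a title) would need extra handling, which is one reason such mappings may require a domain expert.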

(d) Concept discrepancy resolution: The resolution procedure applies a value constraint in OWL DL to link a restriction class to either a class description or a data range.

Rule 8: If a set of properties and values of a concept in one system is similar to that of a concept in another system, but they have different structures, then the values of the properties in one ontology can be organized as a set of instances of an identical concept in another ontology using a specific restriction, as follows.

Mapping Rule 8 below is used to map LR2:LearningResource, whose property LR2:hasStructure has the concept LR2:Course as its range, into the related concept CO:Course.

Mapping Rule 8:
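One way to picture this restriction is as a filter over individuals: those individuals of LR2:LearningResource whose LR2:hasStructure value is a course are treated as instances of CO:Course. The Python sketch below is an interpretation under that assumption, not the OWL restriction itself; the individual names, the value LR2:Unit, and the function name are hypothetical.

```python
def classify_as_co_course(individuals):
    """Return the names of individuals that satisfy the assumed restriction."""
    return [
        name
        for name, props in individuals.items()
        if props.get("rdf:type") == "LR2:LearningResource"
        and props.get("LR2:hasStructure") == "LR2:Course"
    ]

# Hypothetical LR2 individuals.
individuals = {
    "lr_a": {"rdf:type": "LR2:LearningResource", "LR2:hasStructure": "LR2:Course"},
    "lr_b": {"rdf:type": "LR2:LearningResource", "LR2:hasStructure": "LR2:Unit"},
}
print(classify_as_co_course(individuals))
```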

V. EXPERIMENTAL RESULTS

  > A. System Implementation

In our experiments, we used LRs from the Moodle LMSs of two different universities as a set of instances for testing. One of these data sources originated from the Department of Computer Science, Khon Kaen University, Thailand, and the other from the Faculty of Computer Science, Ubon Ratchathani Rajabhat University, Thailand. The first source contained six undergraduate and six graduate courses, while the second contained eight undergraduate and three graduate courses. We implemented a system called ‘OWLGenerator’ to extract database tuples from the LMSs into the ontological instances. The first source was mapped to LR1, and the second source was mapped to LR2. The instances of each local ontology were integrated into the CO through the ontology-mapping process.

To demonstrate how the rules resolved problems, a system named the Learning Resource Integration System (LRIS) was developed to evaluate the proposed approach. The system provided basic and advanced search capabilities. It allowed users to search for LRs and retrieved the results based on the inference rules. We used the advanced search module of the LRIS to execute SPARQL commands to query the LRs across heterogeneous ontologies. The query was to select the LRs that contained the keyword ‘program’ in the resource title, as shown in the following:

Fig. 4 depicts an excerpt from the list of LRs and properties retrieved from the LR1 and LR2 sources. The results were inferred by use of Mapping Rules 1 and 3 and represent the LR properties corresponding to the query.

[Fig. 4.] LRs and properties retrieved from the LR1 and LR2 sources.

  > B. System Evaluation

To validate the system, we used three standard information retrieval metrics [14], namely precision, recall, and F-measure, which are widely accepted in the evaluation of ontology mapping [15]. The F-measure combines the precision and recall measures. The formulas are as below:
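In their usual textbook form, precision is the fraction of retrieved results that are relevant, recall is the fraction of relevant results that are retrieved, and the F-measure is their harmonic mean, 2PR / (P + R). The Python sketch below computes these standard definitions; it assumes the balanced F-measure, and the function name and sample data are hypothetical rather than taken from the experiments.

```python
def precision_recall_f(retrieved, relevant):
    """Compute precision, recall and the balanced F-measure for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    correct = len(retrieved & relevant)  # relevant results that were retrieved

    precision = correct / len(retrieved) if retrieved else 0.0
    recall = correct / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure

# Made-up result identifiers for a single query.
p, r, f = precision_recall_f(retrieved={1, 2, 3, 4}, relevant={2, 3, 4, 5, 6})
print(p, r, f)
```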

The LRIS was tested by four service requesters (users), i.e. authors, teachers, learners, and guests, who searched for LRs across different LMSs. The results retrieved were then evaluated using the three metrics. Each user executed 30 queries to retrieve the results. The results are shown in Fig. 5.

[Fig. 5.] Results of the evaluation of integration mapping.

VI. CONCLUSIONS AND FUTURE WORK

In this paper, we have proposed a methodology for the integration of LRs for semantic search using ontology mapping and rule-based inference. The proposed approach focuses mainly on addressing the problem of heterogeneity, including both semantic and structural conflicts. The standard CO was designed for a global shared ontology for overcoming the heterogeneity problems among different LMSs. Reasoning rules to cope with the problems were defined using SWRL. The LRIS application for LR discovery was developed in order to evaluate the proposed approach. Experimental results have shown that our proposed approach performs effectively. Moreover, the proposed rules also allow other approaches to be incorporated, despite the existence of distinct platforms and data heterogeneities. The application of a metadata-based ontology enables us to achieve a higher level of interoperability and greater practicability in e-learning domains.

However, the ability of our approach to resolve more complex problems is still limited. The mapping process is not fully automatic, and some conflicts require a domain expert to detect and resolve them manually. The modelling of the mediator ontology has been investigated mainly in the context of its application to e-learning and does not cover other domains. In future work, we intend to improve the mediator ontology in order to support different educational systems and to enhance the semi-automatic mapping tool that supports the mapping process. Moreover, additional types of conflict remain to be addressed.

REFERENCES

1. Brut M. M., Sedes F., Dumitrescu S. D. 2011 “A semantic-oriented approach for organizing and developing annotation for e-learning,” [IEEE Transactions on Learning Technologies] Vol.4 P.239-248
2. Cuellar M. P., Delgado M., Pegalajar M. C. 2011 “A common framework for information sharing in e-learning management systems,” [Expert Systems with Applications] Vol.38 P.2260-2270
3. Gasevic D., Hatala M. 2006 “Ontology mappings to improve learning resource search,” [British Journal of Educational Technology] Vol.37 P.375-389
4. Bouzeghoub A., Elbyed A. 2006 “Ontology mapping for web-based educational systems interoperability,” [Interoperability in Business Information Systems] Vol.1 P.73-84
5. Ram S., Park J. 2004 “Semantic conflict resolution ontology (SCROL): an ontology for detecting and resolving data and schema-level semantic conflicts,” [IEEE Transactions on Knowledge and Data Engineering] Vol.16 P.189-202
6. Lu H., Li Q. Z. 2004 “Ontology based resolution of semantic conflicts in information integration,” [Wuhan University Journal of Natural Sciences] Vol.9 P.606-610
7. Banlue K., Arch-int N., Arch-int S. 2010 “Ontology-based metadata integration approach for learning resource interoperability,” [in Proceedings of the 6th International Conference on Semantics Knowledge and Grid] P.195-202
8. Arch-int N., Arch-int S. 2013 “Semantic ontology mapping for interoperability of learning resource systems using a rule-based reasoning approach,” [Expert Systems with Applications] Vol.40 P.7428-7443
9. Jetinai K., Arch-int N., Rungworawut W., Arch-int S. 2013 “Ontology reconciliation for learning resource interoperability,” [International Journal of Digital Content Technology & Its Applications] Vol.7 P.191-200
10. Naiman C. F., Ouksel A. M. 1995 “A classification of semantic conflicts in heterogeneous database systems,” [Journal of Organizational Computing] Vol.5 P.167-193
11. Kashyap V., Sheth A. 1996 “Semantic and schematic similarities between database objects: a context-based approach,” [The VLDB Journal] Vol.5 P.276-304
12. Miller G. A., Beckwith R., Fellbaum C., Gross D., Miller K. J. 1990 “Introduction to WordNet: an on-line lexical database,” [International Journal of Lexicography] Vol.3 P.235-244
13. Wu Z., Palmer M. 1994 “Verbs semantics and lexical selection,” [in Proceedings of the 32nd Annual Meeting on Association for Computational Linguistics] P.133-138
14. Euzenat J. 2007 “Semantic precision and recall for ontology alignment evaluation,” [in Proceedings of the 20th International Joint Conference on Artificial Intelligence] P.348-353
15. Kaza S., Chen H. 2008 “Evaluating ontology mapping techniques: an experiment in public safety information sharing,” [Decision Support Systems] Vol.45 P.714-728