The exponential growth of information on the web far exceeds the capacity of present day information retrieval systems and search engines, making information integration on the web difficult. In order to overcome this, semantic web technologies were proposed by the World Wide Web Consortium (W3C) to achieve a higher degree of automation and precision in information retrieval systems. Semantic web, with its promise to deliver machine understanding to the traditional web, has attracted a significant amount of research from academia as well as from industries. Semantic web is an extension of the current web in which data can be shared and reused across the internet. RDF and ontology are two essential components of the semantic web architecture which support a common framework for data storage and representation of data semantics, respectively. Ontologies being the backbone of semantic web applications, it is more relevant to study various approaches in their application, usage, and integration into web services. In this article, an effort has been made to review the research work being undertaken in the area of design and development of ontology supported information systems. This paper also briefly explains the emerging semantic web technologies and standards.
With the advent of high-speed networks for data transmission and the growing use of the internet for scholarly publishing and dissemination, the quantity of information being produced is increasing exponentially. This quantity far exceeds the capacity of present-day information systems (ISs) used for information processing, storage, and retrieval. Information seekers use search engines daily to locate, combine, and aggregate data from the internet. The sheer volume of data available in every subject field leaves users unable to integrate data on the web using traditional information systems and search engines. The reason is that these systems rely on keyword searching, which is an imprecise technique for information retrieval: trivial variations in search statements produce significant differences in results. However, the problem lies not only with the traditional information retrieval mechanism but also with the current web, which was designed primarily for human readers. Even though the web has become much more interactive in recent years through Web 2.0 and social networking platforms, the underlying standards remain unchanged, forcing traditional information retrieval systems (IRS) to rely on keyword matching alone. The challenge lies first in upgrading the current web, in which machines can only present stored data and cannot understand it. Machines must be made to understand both the content of documents and the user's query so that the two can be linked more effectively.
The solution lies in adopting the semantic web technologies and standards introduced by the inventor of the World Wide Web, Tim Berners-Lee. The semantic web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation (Berners-Lee et al., 2001). It is a collection of technologies and standards that allow machines to understand the meaning (semantics) of information on the web. Since 2004, the field has been dominated by the formal languages and technologies that make up the so-called “Semantic Web Stack,” most of which are W3C recommendations. The semantic web stack, the architecture of the semantic web, illustrates a hierarchy of languages in which each layer uses the capabilities of the layers below it. The following are the major semantic web technologies and standards.
The Resource Description Framework (RDF) is the backbone of the W3C’s semantic web activity. It is the standard for encoding metadata and other knowledge on the semantic web. RDF provides the language for expressing the meaning of terms and concepts in a form that machines can readily process. An RDF statement, or triple, contains three parts: a subject, a property (also called the predicate), and an object.
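The three-part structure of an RDF statement can be illustrated with a minimal sketch. The namespace `http://example.org/` and the resource names are hypothetical and chosen only for illustration; the sketch serialises triples whose terms are all IRIs in the N-Triples line format and does not handle literals or blank nodes.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate (the "property"), object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace and resources, for illustration only.
EX = "http://example.org/"

triples = [
    Triple(EX + "Margherita", EX + "hasTopping", EX + "Mozzarella"),
    Triple(EX + "Margherita", EX + "hasBase", EX + "ThinCrust"),
]


def to_ntriples(t: Triple) -> str:
    """Serialise a triple whose three terms are all IRIs as one N-Triples line."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


for t in triples:
    print(to_ntriples(t))
```

In practice an RDF library would normally handle parsing and serialisation; the tuple form is shown only to make the data model concrete.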
Resource Description Framework Schema (RDFS) and the Web Ontology Language (OWL) provide languages for expressing ontologies. RDFS allows the representation of classes, class hierarchies, properties, property hierarchies, and domain and range restrictions on properties. OWL is a declarative knowledge representation language that extends the expressiveness of RDFS and gives ontology constructs a formally defined meaning. OWL has three sublanguages: OWL-Lite, OWL-DL, and OWL-Full. OWL-Lite and OWL-DL are based on description logics.
Ontology makes metadata interoperable and ready for efficient sharing and reuse. It provides shared and common understanding of a domain that can be used both by people and machines. Ontology helps in data integration. It is now widely accepted that ontologies will play an important role for the next generation of information systems (ISs). The use of ontologies for ISs will not only enable “better” and “smarter” retrieval facilities than current ISs based on the predominant relational data model, but will also play a key role in supporting data and information quality checks, IS interoperability, and information integration. Ontologies inherently work on a semantic rather than on a syntactic level and thus support a seamless incorporation of conceptual domain constraints into the mechanism of an information system.
Tool support is necessary both for ontology development and for ontology usage in various applications, and a large number of environments for ontology construction and use are available. The earliest tools still in use were created more than a decade ago. They include Ontolingua, Protégé, WebODE, OntoEdit, WebOnto, OilEd, DUET, KAON, and OntoSaurus. Many other tools exist for different purposes.
On the semantic web, inference is the process of deriving new relationships from existing resources together with additional information in the form of a set of rules. Inference is required for processing the knowledge available on the semantic web. Inference-based techniques are also used to check for data inconsistency at the time of data integration. Inference engines include FaCT, FaCT++, Racer, Pellet, Hoolet, Jess, Jena, and F-OWL.
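How a set of rules derives new relationships can be sketched with a small forward-chaining loop. The two rules used here (transitivity of `rdfs:subClassOf`, and propagation of `rdf:type` to superclasses) are standard RDFS-style entailments; the `ex:` resources are hypothetical, and the sketch is not meant to reflect how any of the engines named above is implemented.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer(triples):
    """Apply two RDFS-style rules repeatedly until no new triple appears."""
    closure = set(triples)
    while True:
        new = set()
        for (s, p, o) in closure:
            for (s2, p2, o2) in closure:
                if o != s2:
                    continue
                # Rule 1: subClassOf is transitive.
                if p == SUBCLASS and p2 == SUBCLASS:
                    new.add((s, SUBCLASS, o2))
                # Rule 2: an instance of a class is also an instance
                # of every superclass of that class.
                if p == TYPE and p2 == SUBCLASS:
                    new.add((s, TYPE, o2))
        if new <= closure:  # fixpoint reached: nothing new was derived
            return closure
        closure |= new


# Hypothetical facts, for illustration only.
facts = {
    ("ex:Mozzarella", SUBCLASS, "ex:Cheese"),
    ("ex:Cheese", SUBCLASS, "ex:Food"),
    ("ex:item1", TYPE, "ex:Mozzarella"),
}

inferred = infer(facts)
for triple in sorted(inferred - facts):
    print(triple)
```

The nested loop is quadratic in the number of triples per pass, which is acceptable for a sketch but not for large graphs.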
The SPARQL query language is used to retrieve information from the web of data. The web of data, typically represented using RDF as a data format, needs its own RDF-specific query language and facilities. SPARQL, a declarative query language similar to SQL, allows queries to be specified against data in RDF.
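The core of a SPARQL query is a set of triple patterns whose variables must be bound consistently. The following sketch evaluates such a conjunction of patterns over an in-memory list of triples. It is a simplified illustration of the matching idea only, not a SPARQL implementation; the function names and `ex:` data are hypothetical.

```python
def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it occurs.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, triples):
    """Evaluate a conjunction of triple patterns (a basic graph pattern)."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


# Hypothetical data, for illustration only.
data = [
    ("ex:Margherita", "ex:hasTopping", "ex:Mozzarella"),
    ("ex:Marinara", "ex:hasTopping", "ex:Tomato"),
    ("ex:Mozzarella", "rdf:type", "ex:Cheese"),
]

# Roughly: SELECT ?pizza WHERE { ?pizza ex:hasTopping ?t . ?t rdf:type ex:Cheese }
results = query(
    [("?pizza", "ex:hasTopping", "?t"), ("?t", "rdf:type", "ex:Cheese")],
    data,
)
print([b["?pizza"] for b in results])
```

The shared variable `?t` is what joins the two patterns, in the same way a join condition links two tables in SQL.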
This paper reviews the various approaches used in the design and development of ontology supported information systems in general, and in the food domain in particular, and highlights important related issues. The remainder of the paper is organised as follows. Section 2 gives a general introduction to context in ontology-based information systems; Section 3 outlines methodologies for building ontologies for information systems; Section 4 focuses on ontology-based information systems in food science & technology (FST); and conclusions are presented in Section 5.
Wand and Weber (1989) first proposed the representational model of an information system in the mid to late 1980s. They explored theories of ontology as a grammar for describing the real world for the purpose of IS modeling. Wand and Weber’s (1990) approach was to model information systems within the context of a theory of ontology that modifies and extends the one developed by Bunge. An interesting approach to understanding how ontology can support modeling was the framework introduced by Wand and Weber in 2002. It comprises a set of constructs and rules that they term a conceptual-modeling grammar, together with a set of procedures, the conceptual-modeling method, that guides the information systems designer in building conceptual schemas, which they call conceptual-modeling scripts. Wand and Weber (2004) hold the view that a good conceptual model is the key to a good information system.
An ontology based on common-sense realism, which has received some attention in information systems, was the work of Chisholm (1996). The ontology is robust, located in the common-sense realism school of thought, and deals with static and dynamic aspects. His view on ontology was that there are only two kinds of entities, attributes and the individual things that have these attributes. Everything else, including propositions, states of affairs, possible worlds, and sets, can be understood in terms of these two categories. He stated that attributes are possible objects of thought—more specifically, what we can attribute, either directly (to ourselves) or indirectly (both to ourselves and other things). He suggests that his ontology may provide an appropriate framework for analysing the reality of information systems research.
The Framework for Information Systems Concepts (FRISCO) ontology produced by the FRISCO task group within Working Group 8.1 of the International Federation of Information Processing (IFIP) is a first step towards a well founded, general theory of Information Systems (IS) (Falkenberg et al., 1998). Four features make the FRISCO approach unique in this field: i) its close relation to philosophy, ii) its semiotic basis, iii) its “world view” (general ontology), and iv) its layered structure.
Guarino (1998) coined the term Ontology-Driven Information Systems (ODIS) whereby he envisioned the use of ontologies in two distinct stages of Information Systems: 1) at development time, and 2) at run time. At development time, an ontology can be used in the conceptual modeling phase of IS, representing the knowledge of a given domain and supporting the creation of IS components. At run time, an ontology can be used as another part of the information system driving all of its aspects and components, that is, the system runs in accordance with the content of the ontology (Uschold, 2008).
Milton and Kazmierczak (2004) developed a qualitative method, the method of conceptual comparison, for conceptually evaluating individual data modelling languages through ontologies. The method compares the ontological meta-model of a data modelling language against a philosophical ontology used as an external reference. Their theory was grounded in ontology in order to further the understanding of the fundamental nature of data modelling languages (conceptual modelling grammars). They applied their method to analyse such grammars, using Chisholm’s ontology as the foundation (Chisholm, 1996). They found a good degree of overlap between all of the data modelling languages analysed and the core concepts of Chisholm’s ontology, and concluded that the data modelling languages investigated reflect an ontology of common-sense realism.
The extension, critique, and discussion of the BWW (Bunge-Wand-Weber) ontology have helped to advance the field and to direct researchers' interest toward ontology in the conceptual modeling of IS (Lyytinen, 2006). The increasing use of ontologies in information systems and the proliferation of conceptual modeling methods that lack theoretical foundations have caught the attention of researchers and led them to investigate theories of ontology in information systems (Rosemann & Wyssusek, 2005).
Fonseca (2007) offers a slightly different but complementary viewpoint of the use of ontologies in Information Systems. The author discusses the distinction between the creation and the use of ontologies in IS in terms of the purpose of the ontologies. In the context of ODIS, ontologies of information systems represent the underpinning theories and structures used to describe the IS domain. These ontologies provide support to the creation of better modeling tools as they become references of how the IS domain is organized.
The literature reviewed below deals with ontology-based information systems (OBIS) that describe a given domain, covering agriculture, agricultural pests, materials science, and general frameworks. Domain-specific ontologies of this kind help to build and support better conceptual schemas and other IS components, as suggested by Guarino (1998) in his concept of ontology-driven information systems.
The objective of the AGROVOC Concept Server (CS) was to provide a framework for sharing common terminology, concept definitions, and relations within the agricultural community (Sini et al., 2008). The CS offers a centralised facility where the agricultural information management community can build and share agricultural knowledge within a collaborative environment by exploiting new semantic technologies, i.e. OWL modelling language. The CS serves as a starting point for the development of specific domain ontologies which are capable of arranging complex multilingualism and terminology information.
Ontology-based information retrieval systems designed to capture semantic relations in Indian Agricultural Research domain ontology for providing value added information services were developed by Angrosh and Urs (2007). They considered the case of Agricultural Electronic Theses and Dissertations (ETDs) present in Vidyanidhi Digital Library. Their aim was to facilitate the derivation of semantic metadata by capturing contextual information pertinent to the domain. The authors also examined the use of description logics and facet relations for developing ontology-based knowledge management systems for digital libraries. The knowledge representation languages identified were web ontology language (OWL) and Karlsruhe Ontology (KAON) language. Their results showed that the use of description logics and facet relations was very promising. Ontologies, specifically when based on knowledge representation formalisms, added value and enhanced the search experience.
Yuanyuan (2010) presented a knowledge representation method for agricultural intelligent information systems, taking the Cotton Planting and Management Expert System as an example. Based on the knowledge characteristics of agricultural intelligent information systems, the method was explored at two levels: the knowledge organization structure and the knowledge description language. The knowledge organization structure was based on a frame knowledge unit and a solving knowledge unit. OWL was adopted as the knowledge description language. The knowledge organization was implemented by mapping knowledge to the ontology level. The author found that this approach provides a high degree of knowledge interoperability and knowledge sharing.
Angrosh and Urs (2006) have worked on creating semantic metadata based information systems for developing an agri-pest domain specific model. The agriculture-pest-disease-pesticide domain specific knowledge was utilized for mapping content related data with domain knowledge. The semantic metadata derived through domain specific ontology would be tapped by information retrieval mechanisms in order to provide search strategies to yield higher precision rates. The metadata facilitates an end-user in finding the required information. Their knowledge base was used to retrieve information through contextually relevant entry points.
The challenge of integrating heterogeneous databases in the materials science domain was highlighted by Cheung et al. (2009). They found that materials scientists and nanotechnologists were struggling to manage the large volumes of multivariate, multidimensional, and mixed-media data sets generated by the experimental, characterization, testing, and post-processing steps associated with their search for new materials. Materials scientists need data management and integration tools that enable them to search across these disparate databases and correlate their experimental data with external, publicly available data, in order to identify fertile new areas for searching. MatOnto, a machine-processable ontology based on OWL, was used to integrate data across the disparate databases. The authors also developed MatSeek to provide a single web-based, federated search interface over the critical materials science databases.
To overcome the poor retrieval performance and low retrieval quality of traditional retrieval methods, Rong and Jun (2012) designed a non-metallic pipe retrieval system based on ontology technology. The ontology files, in html format, were described in the OWL language and then saved in a MySQL database using Jena. The authors introduced the retrieval algorithm and retrieval process of the system, and realized semantic analysis of the query condition through two functions: keyword query and query expansion. They found that the ontology technology improved recall and precision compared to traditional retrieval systems.
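The effect of ontology-driven query expansion on recall and precision can be illustrated with a toy example. The concept hierarchy, documents, and relevance judgements below are hypothetical and are not taken from Rong and Jun (2012); the sketch simply expands a query term with its narrower concepts and then computes precision (relevant retrieved / retrieved) and recall (relevant retrieved / relevant).

```python
# Hypothetical mini-ontology: each concept maps to its narrower concepts.
NARROWER = {
    "non-metallic pipe": ["plastic pipe", "composite pipe"],
    "plastic pipe": ["PVC pipe", "PE pipe"],
}


def expand(term):
    """Return the term plus all transitively narrower concepts."""
    expanded, stack = [], [term]
    while stack:
        current = stack.pop()
        if current not in expanded:
            expanded.append(current)
            stack.extend(NARROWER.get(current, []))
    return expanded


def search(terms, documents):
    """Return ids of documents whose text contains any of the terms."""
    return {
        doc_id
        for doc_id, text in documents.items()
        if any(t.lower() in text.lower() for t in terms)
    }


def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for a retrieved set against a relevant set."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical documents and relevance judgements.
docs = {
    1: "Installation of PVC pipe in wet soil",
    2: "Non-metallic pipe standards",
    3: "Steel pipe corrosion",
    4: "PE pipe welding guide",
}
relevant = {1, 2, 4}

keyword_hits = search(["non-metallic pipe"], docs)
expanded_hits = search(expand("non-metallic pipe"), docs)
print(precision_recall(keyword_hits, relevant))
print(precision_recall(expanded_hits, relevant))
```

On this toy data the plain keyword query misses documents that mention only narrower terms, while the expanded query retrieves them; real collections would of course show smaller and noisier gains.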
Wessel and Moller (2009) presented a formal and implemented generic framework for building ontology-based information systems (OBISs). Their framework offers means for building (i) the extensional layer, (ii) the intensional layer, and (iii) the query component of such systems. Although the framework is strongly influenced by description logics (DLs), it also supports the integration of reasoning facilities for other formalisms. Using case studies, they claim that their framework can cover regions in the system design space instead of just isolated points. The main insights gained with this framework were presented in the context of ontology-based query answering as part of a geographical information system (GIS). The focus of the case study was the DLMAPS system, which implements ontology-based spatio-thematic query answering in the domain of digital city maps.
In order to understand the research on ontologies in information systems, the major literature on the topic has been reviewed. A frequently used ontology that has been adapted and used in information systems is a scientific ontology by Mario Bunge (Bunge, 1977, 1979). It is characterised by an approach that considers the real world as known to science and proceeds in a clear and systematic way. His ontology introduced the concept of hierarchy of systems. Bunge’s ontology differentiates explicitly a hierarchy of distinct ontological levels: physical, chemical, biological, psychological, and social/technical.
Two pioneering papers described how to build ontologies: one by Noy and McGuinness (2001) and another by Guarino (1997). Noy and Hafner (1997) gave a comparative review of the state of the art in ontology design. Lopez and Gomez-Perez (1999) laid down guidelines for developing a chemical ontology using METHONTOLOGY and the Ontology Design Environment (ODE). METHONTOLOGY provides guidelines for specifying ontologies at the knowledge level, as a specification of a conceptualization. ODE enables ontology construction, covering the entire life cycle and automatically implementing ontologies.
According to Nicola et al. (2009), ontology building exhibits a structural and logical complexity comparable to that of producing software artefacts, and ontology development and software development share some common qualities. On this basis, they proposed an ontology development methodology called UPON (Unified Process for Ontology). The methodology is based on the unified software development process, or unified process (UP), which is a widely used standard in software engineering. The strength of the proposed approach lies in the UP being a highly scalable and customizable methodology. It can be tailored to fit a number of variables: the ontology size, the domain of interest, the complexity of the ontology to be built, the experience and skill of the project experts, and their organization. A comparison with the main ontology building methodologies showed that, within its scope, UPON's features are aligned with, and sometimes outperform, the best solutions.
Clustering-based methods are another strategy useful in ontology construction. A clustering-based method for creating cultural ontologies for community oriented information systems has been described by Srinivasan et al. (2009). This semi-automated method merges distributed annotation techniques, or subjective assessments of similarities between cultural categories, with established clustering methods to produce “cognate” ontologies. They concluded that a semi-automated method was useful in resolving the twin problems of scalability and interoperability of developing ontology.
Another approach to constructing a domain ontology semi-automatically was proposed by Lin and Hanqing (2009). They adopted non-dictionary Chinese word segmentation techniques based on N-grams to acquire candidate domain concepts. To recognise the property relations of domain concepts, the method used natural language processing (NLP) to extract the subject, predicate, and object values of sentences. Each extracted triple was then treated as a triplet of data, object type, and property.
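The idea behind dictionary-free candidate acquisition can be sketched as follows: count every character n-gram in the corpus and keep the frequent ones as candidate terms. This is only a generic illustration of N-gram candidate generation under assumed parameters (`n_values`, `min_count`) and a made-up three-sentence corpus; it is not the algorithm of Lin and Hanqing (2009), which would also need filtering and relation extraction steps.

```python
from collections import Counter


def ngram_candidates(sentences, n_values=(2, 3), min_count=2):
    """Count character n-grams and keep those seen at least `min_count` times."""
    counts = Counter()
    for sentence in sentences:
        for n in n_values:
            for i in range(len(sentence) - n + 1):
                counts[sentence[i:i + n]] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}


# Hypothetical corpus, for illustration only. Rough glosses:
#   "domain ontology construction", "ontology learning method",
#   "constructing a domain ontology".
corpus = ["领域本体构建", "本体学习方法", "构建领域本体"]

candidates = ngram_candidates(corpus)
for gram, count in sorted(candidates.items(), key=lambda kv: -kv[1]):
    print(gram, count)
```

Frequent n-grams such as the two-character word for "ontology" surface without any dictionary, but so do fragments that straddle word boundaries, which is why a further filtering step is needed in practice.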
Another semi-automated system, based on Chinese word partition and data mining, was proposed by Dan et al. (2010). The semi-automatic domain ontology system consisted mainly of three parts: modules for extracting domain concepts, taxonomic relations, and non-taxonomic relations. These three parts adopted, respectively, a statistical analysis method, a generalized suffix tree and clustering method, and an association rule mining method. By using ontologies, they could define a description base for scholarly events to enable software agents to crawl and extract scholarly event data, and to facilitate unified access to this data. The collected data was mined for non-obvious knowledge.
Converting existing vocabulary tools, or applying traditional knowledge organisation techniques such as facet analysis, is another approach adopted in ontology construction, as the literature reviewed below demonstrates. An experiment to convert a controlled vocabulary into an ontology using a traditional knowledge organization tool was reported by Qin and Poling (2001). They used the controlled vocabulary of ERIC descriptors to develop an ontology on education and educational materials using Ontolingua. They discussed the preliminary planning and the steps involved in converting the existing GEM vocabulary to an ontology. By reusing the existing vocabulary, the conversion reduced the duplication of effort involved in building an ontology from scratch. A mechanism was also established for allowing differing vocabularies to be mapped onto the ontology. According to them, the major difference between a thesaurus and an ontology lies in the value added through deeper semantics in describing digital objects, both conceptually and relationally.
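A vocabulary-to-ontology conversion of this general kind can be sketched as a mapping from thesaurus relations to ontology statements. The record layout, the `ex:` prefix, and the sample terms below are hypothetical, and the mapping chosen here (broader term to `rdfs:subClassOf`, related term to `skos:related`, used-for to `skos:altLabel`) is one simplifying assumption, not the procedure reported by the authors. In particular, a broader-term link is not always a true is-a relation, so a real conversion would need human review of each link.

```python
def class_name(term):
    """Turn a thesaurus term into a hypothetical class identifier."""
    return "ex:" + "".join(word.capitalize() for word in term.split())


def to_triples(records):
    """Map thesaurus records to ontology-style triples (a naive mapping)."""
    triples = []
    for rec in records:
        cls = class_name(rec["term"])
        triples.append((cls, "rdf:type", "owl:Class"))
        triples.append((cls, "rdfs:label", rec["term"]))
        for broader in rec.get("BT", []):   # broader term -> superclass
            triples.append((cls, "rdfs:subClassOf", class_name(broader)))
        for related in rec.get("RT", []):   # related term -> associative link
            triples.append((cls, "skos:related", class_name(related)))
        for synonym in rec.get("UF", []):   # used-for -> alternative label
            triples.append((cls, "skos:altLabel", synonym))
    return triples


# Hypothetical thesaurus records, for illustration only.
thesaurus = [
    {"term": "Science education", "BT": ["Education"],
     "RT": ["Laboratories"], "UF": ["Science teaching"]},
    {"term": "Education"},
]

ontology = to_triples(thesaurus)
for triple in ontology:
    print(triple)
```

The value added by an ontology over the source thesaurus comes from the steps this sketch leaves out: refining the generic broader and related links into specific, formally defined properties.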
Park (2008) demonstrated how knowledge organization approaches from library and information science can improve ontology design, and presented faceted classification as an appropriate method for structuring ontologies. Jun and Yuhua (2009) introduced an automatic approach to ontology building that integrates traditional knowledge organization resources, namely classification schemes and thesauri. The method first builds a primary ontology in OWL describing the classes and relationships involved in bibliographic data. The primary ontology is then filled with instances of classes and their relations extracted from catalogue datasets and from the thesauri and classification schemes used in cataloguing. Based on this ontology, they implemented an online system to demonstrate the proposed methods and functions with thousands of bibliographic records and a subdivision of the Chinese Classified Thesaurus.
A prototype ontology using traditional knowledge organisation techniques such as content analysis, facet analysis, and clustering for the Accelerator Driven Systems (ADSs) domain was developed by Deokattey et al. (2010). Descriptors from the INIS thesaurus were used to organize the information on ADSs, and a suitable knowledge organization tool was developed. A mapping/merging system was used for knowledge organization.
Another application of ontology in the library field was presented by Liao et al. (2010). They developed a novel library recommender system, the personal ontology recommender (PORE), for English collections. In the PORE system, a traditional cataloguing scheme, the Classification for Chinese Libraries, is used as the reference ontology. The system offers a friendly user interface and provides several personalised services. It was implemented and tested in the Library of National Chung Hsing University in Taiwan.
Jiang and Tan (2010) argued that traditional ontology construction systems employ shallow natural language processing techniques and focus only on concept and taxonomic relation extraction. To overcome this drawback, the authors proposed a system, known as Concept-Relation-Concept Tuple-based Ontology Learning (CRCTOL), for mining ontologies automatically from domain specific documents. The system i) adopts a full text parsing technique to obtain a more detailed syntactic level of information, ii) employs a combination of statistical and lexico-syntactic methods, including a statistical algorithm that extracts key concepts from a document collection, iii) uses a word sense disambiguation algorithm that disambiguates words in the key concepts, iv) uses a rule-based algorithm that extracts relations between the key concepts, and v) has a modified generalized association rule mining algorithm that prunes unimportant relations for ontology learning. They found that, compared to traditional methods, their system produced ontologies that were more concise and accurate and contained richer semantics.
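Pruning relations by association rule mining rests on two measures: support (how often two concepts co-occur) and confidence (how often the second appears given the first). The sketch below applies plain support and confidence thresholds to concept pairs. It illustrates the basic, unmodified technique under assumed thresholds and made-up data, and should not be read as the modified generalized algorithm used in CRCTOL.

```python
from itertools import permutations


def mine_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Keep rules a -> b whose support and confidence meet the thresholds."""
    n = len(transactions)
    items = set().union(*transactions)
    rules = {}
    for a, b in permutations(sorted(items), 2):
        both = sum(1 for t in transactions if a in t and b in t)
        has_a = sum(1 for t in transactions if a in t)
        support = both / n
        confidence = both / has_a if has_a else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = (support, confidence)
    return rules


# Hypothetical "transactions": the concepts found together in each sentence.
sentences = [
    {"wine", "grape"},
    {"wine", "grape", "region"},
    {"wine", "region"},
    {"grape"},
]

rules = mine_rules(sentences)
for (a, b), (support, confidence) in sorted(rules.items()):
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

Pairs that rarely co-occur, such as `grape` and `region` in this toy data, fall below the support threshold and are pruned, which is the sense in which unimportant relations are removed.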
The domain ontology evolution approach is yet another approach useful in constructing ontology. A study by Poli (2002) highlights ontological sub-theories and uses of domain analysis for developing an ontology. This methodology in the field of Artificial Intelligence utilises domain analysis, an integral part of Library & Information Science. Prieto-Diaz (2003) has also used a similar domain analysis with a faceted approach to build ontologies with a software tool called ‘DARE.’
In this section, relevant literature on the FST domain is reviewed. Ontologies related to cooking and processed foods, beverages, seafood, health, nutrition and diet, and food safety are discussed below.
The taxonomy of food can be defined by classifying on the basis of origin, i.e. plants or animals. In addition, the cooking ontology presented by Batista et al. (2006) and by Ribeiro et al. (2006) and fishery ontology (Gangemi, 2002, 2004) can be useful for the definition of a taxonomy for processed food, since recipe concepts introduced in the ontology interconnect food concepts with each other.
For the detailed definition of ontologies devoted to modeling specific food domains, important considerations are provided by the works of Chifi et al. (2007); Drummond et al. (2007); Graça et al. (2005); Heflin (2000); Noy and McGuinness (2001); Wang et al. (2012); and Yue et al. (2005). While each of these ontologies deals with a smaller area of the food sector, they could be integrated and combined with each other to define a new, complete food taxonomy that includes food and beverages under the same main class.
The food ontologies proposed by Snae and Bruckner (2008), FOODS and the PIPS project (Cantais et al., 2005), diabetes control (Hong & Kim, 2005; Li & Ko, 2007), and a personalized food recommender system for athletes (Suksom, 2010; Tumnark, 2013) may be considered as a guide for the development of ontologies for specific health problems/nutrition.
An ontology for the cooking domain, integrated within a dialog system, was developed by Ribeiro et al. (2006). Ontology building activities such as brainstorming sessions, knowledge validation and disambiguation, conceptualization and formalization, and evaluation were carried out through the coordination of experts. The ontology comprises four main modules covering the key concepts of the cooking domain (actions, food, recipes, and utensils) and three auxiliary modules (units and measures, equivalencies, and plate types). The ontology building process was influenced by METHONTOLOGY and followed the phases of specification, knowledge acquisition, conceptualization, implementation, and evaluation.
Another ontology for the cooking domain was developed by Batista et al. (2006). The ontology building process was influenced by the methodology proposed by Lopez et al. (1999), which consists mainly of specification, knowledge acquisition, conceptualization, implementation, and evaluation. The knowledge model was formalized using Protégé, which was also used to generate the ontology code automatically. The ontology comprises four main modules covering the key concepts of the cooking domain (actions, food, recipes, and utensils) and two auxiliary modules (units and measures, and equivalencies).
Drummond et al. (2007) have focused their attention on the definitions of taxonomy and ontology related to the knowledge of pizza. They divided the domain into pizza topping and pizza bases. Different types of toppings were proposed with the main element of the topping (cheese, meat, seafood, vegetable, pepper). This ontology contains all constructs required for the various forms of pizza made around the globe.
A beer ontology was developed by Heflin (2000). His ontology models brewers and types of beer using the SHOE (Simple HTML Ontology Extension) framework. In this ontology, relationships are declared among one or more arguments, and relationship arguments are either types or categories.
Graça et al. (2005) proposed a wine ontology. They found that winemakers often use their own methods to produce their product. Experts composed a vocabulary for wine characteristics that relied on their senses and descriptive capability, and that was therefore ambiguous and not measurable. Their wine ontology covered maceration, fermentation processes, grape maturity state, wine characteristics, and classification according to the country and region where the wine was produced. They stated that the development process followed four phases: i) knowledge acquisition, ii) conceptualization, iii) formalization, and iv) evaluation. Their inference was based on queries defined using F-logic, taking advantage of OntoEdit’s built-in inference engine. They concluded that the outcome was a useful and comprehensive ontology in the enology field, which they plan to use in a system that allows searching for wines using any combination of features.
In the fisheries domain, an ontology supporting semantic interoperability among existing fishery information systems was presented by Gangemi et al. (2002). As different fishery information systems provide different views of the domain, they considered the paradigm of ontology integration, namely the integration of schemas that are arbitrary logical theories and hence can have multiple models. The thesaurus, topic trees, and reference tables used in the systems to be integrated were considered as informal schemas conceived to query semi-formal or informal databases such as texts and tagged documents. In order to benefit from the ontology integration framework, they transformed the informal schemas into formal ones by applying the techniques of three methodologies: OntoClean, ONIONS, and OnTopic. They showed that integration and merging benefit from the methods and tools of formal ontology.
Gangemi (2004) focused on the Core Ontology of Fishery (COF) and its use for the reengineering, alignment, refinement, and merging of fishery Knowledge Organization Systems (KOSes). They have reengineered and aligned legacy thesauri by using formal ontological methods, and are deploying the resulting ontology library for services dedicated to fish management document repositories and databases. A UML activity diagram was defined that summarized the main steps of the methods that were followed to create the Fishery Ontology Library. The global lifecycle was referred to as ONIONS@FOS, since it is an adaptation of the ONIONS methodology. COF has been designed by using the DOLCE-Lite-Plus ontology, developed within the WonderWeb European project. Preliminary exploitation showed that formal ontologies give smoothness and increase control to some functionalities, such as integrated information retrieval from distributed document systems over the web, and integrated querying of distributed dynamic databases.
Snae and Bruckner (2008) proposed a food ontology related to nutritional concepts, described as Food-Oriented Ontology-Driven Systems (FOODS). FOODS was mainly devoted to assisting customers by suggesting appropriate dishes and meals with the help of individual nutritional profiles. The ontology contains specifications of ingredients, substances, nutrition facts, recommended daily intakes for different regions, dishes, and menus. FOODS comprises a) a food ontology, b) an expert system that uses the ontology together with some knowledge about cooking methods and prices, and c) a user interface suitable for novices in computers and diets as well as for experts. The food ontology was organized around nine main concepts: regional cuisine, dishes, ingredients, availability, nutrients, nutrition-based diseases, preparation methods, utensils, and price. A combination of bottom-up and top-down approaches was chosen to develop the ontology, with Protégé as the tool for setting up the ontology in OWL-DL.
Another food ontology oriented to the nutritional and health care domain was developed by Cantais et al. (2005) to assist in sharing knowledge between the different stakeholders involved in the Personalised Information Platform for Health and Life Services (PIPS). It is mainly aimed at providing nutritional advice to diabetic patients. They also provided a brief description of the food ontology development process and its main features. Their ontology was developed through a collaborative process involving domain experts, database experts, and ontology engineers. They used the Eurocode2 coding system, which provided the backbone of the class hierarchy.
Li and Ko (2007) proposed an automated mechanism that constructs a well-rounded food ontology structure that can be utilized by diabetes educators or patients themselves for various value-added applications or services. Their study was based on the dataset provided by the food nutrition composition database of the Department of Health, Taiwan. The dataset was grouped into 18 major categories. They proposed methods for generating more intuitive ontology concepts, relations, and restrictions. The methods include generating an ontology skeleton with hierarchical clustering algorithms (HCA), class naming by intersection naming, and instance ranking by granular ranking and positioning.
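To make the idea of an ontology skeleton obtained by hierarchical clustering more concrete, the following is a minimal sketch and not the algorithm of Li and Ko. It assumes a simple centroid-linkage agglomerative procedure; the food names, the nutrient vectors, and the helper names `centroid` and `build_skeleton` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist

# Hypothetical nutrient vectors (protein, fat, carbohydrate in g per 100 g).
# The values are illustrative and are not taken from the Taiwanese database.
foods = {
    "rice": (2.7, 0.3, 28.0),
    "noodles": (4.5, 0.9, 25.0),
    "chicken": (27.0, 3.6, 0.0),
    "pork": (26.0, 14.0, 0.0),
}

def centroid(members):
    """Mean nutrient vector of a cluster given as a tuple of food names."""
    vectors = [foods[name] for name in members]
    return tuple(sum(axis) / len(vectors) for axis in zip(*vectors))

def build_skeleton(names):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    Each cluster is a pair (members, tree); a tree is either a food name
    or a pair of subtrees, so the final tree is a binary class skeleton.
    """
    clusters = [((name,), name) for name in names]
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: dist(centroid(pair[0][0]), centroid(pair[1][0])),
        )
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append((a[0] + b[0], (a[1], b[1])))
    return clusters[0][1]

skeleton = build_skeleton(sorted(foods))
print(skeleton)
```

Each internal node of the resulting tree is a candidate ontology class that still needs a name, which is the step the authors address with intersection naming.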
Hong and Kim (2005) implemented a web-based expert system for nutrition counseling and management based on ontologies for diabetes control. Their system uses food, dish, and menu databases, which provide the fundamental data for nutrient analysis. Clients can look up food composition and search for foods by conditions on nutrient name and amount. Their system organizes food according to Korean menus and can read the nutrient composition of each food, dish, and menu.
Fudholi et al. (2009) designed and developed a system for daily menu assistance in the context of a health control system for the population. Their project used an ontology to model the nutrition needs domain and implemented a rule-based inference engine. The system was implemented as a semantic web application in which users enter abstinence foods (foods to be avoided) and personal information, so that the system can calculate several parameters and provide an appropriate menu from the database.
Suksom et al. (2010) implemented a rule-based personalized food recommender system aimed at assisting users in daily diet selection based on nutrition guidelines and an ontology. Their work focused on personalizing recommendation results by adding the user’s health status information, which may affect their nutritional needs. The system uses a knowledge-based framework consisting of two main components: a knowledge base and a recommender engine. A knowledge engineering approach was used to model the relevant user profile as well as food and nutrition knowledge in ontology form. The knowledge base was constructed in collaboration with domain experts and drew on nutrition guides and clinical practice guidelines. It consisted of two forms of knowledge: ontology and rules. The ontology-based knowledge represented the knowledge structure of foods and their relations. The rule-based knowledge represented the decision model used in generating recommendation results.
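The split between ontology-based knowledge and rule-based knowledge can be illustrated with a minimal sketch. It is not the system of Suksom et al.: the foods, nutrient values, health conditions, thresholds, and the function name `recommend` are all hypothetical assumptions introduced here.

```python
# Ontology-style knowledge: each food, its class, and its nutrition facts.
# All names and values are hypothetical.
FOOD_ONTOLOGY = {
    "grilled fish": {"is_a": "protein dish", "sugar_g": 0, "sodium_mg": 300},
    "sweet sticky rice": {"is_a": "dessert", "sugar_g": 30, "sodium_mg": 50},
    "salted pork soup": {"is_a": "soup", "sugar_g": 2, "sodium_mg": 1200},
}

# Rule-based knowledge: each health status maps to a condition that a food
# must satisfy. The limits are illustrative, not clinical guidance.
RULES = {
    "diabetes": lambda facts: facts["sugar_g"] <= 10,
    "hypertension": lambda facts: facts["sodium_mg"] <= 600,
}

def recommend(health_statuses):
    """Return the foods that satisfy every rule triggered by the user profile."""
    return sorted(
        name
        for name, facts in FOOD_ONTOLOGY.items()
        if all(RULES[status](facts) for status in health_statuses)
    )

print(recommend(["diabetes"]))
print(recommend(["diabetes", "hypertension"]))
```

Keeping the facts and the rules in separate structures mirrors the two forms of knowledge described above: either one can be revised by domain experts without touching the other.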
Tumnark et al. (2013) proposed ontology-based personalized dietary recommendations for weightlifting to assist athletes in meeting their requirements. They extended the personalized food ontology defined by Suksom et al. (2010) by adding information related to athletes’ training programs that may affect their nutrition needs, and unified the food and sports ontologies. The development of the ontology involved the specification and definition of its four main elements: classes (concepts), individuals, properties, and relationships. The ontology was modelled following a top-down approach around four main concepts: athlete, food, nutrition, and sports. Their recommendations were based on sport nutrition guidelines, which were transformed into rule-based knowledge.
Salampasis et al. (2008) approached the problem of developing traceability systems from a semantic web (SW) perspective. They presented a traceability solution that treats food traceability as a complex integrated business process problem which demands information sharing. They proposed a generic framework for traceability applications consisting of three basic components: i) an ontology management component based on OWL, ii) an annotation component for “connecting” a traceable unit with traceability information using RDF, and iii) traceability core services and applications.
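As an illustration of how an annotation component might connect a traceable unit with traceability information in RDF, the sketch below represents statements as plain subject, predicate, object tuples and prints them in an N-Triples-like form. The namespace `http://example.org/trace#`, the property names, and the helpers `annotate` and `to_ntriples` are hypothetical; they are not taken from the framework of Salampasis et al.

```python
# Hypothetical namespace for traceability terms.
TRACE = "http://example.org/trace#"

def annotate(unit_id, properties):
    """Build (subject, predicate, object) triples describing one traceable unit."""
    subject = f"{TRACE}unit/{unit_id}"
    return [
        (subject, f"{TRACE}{predicate}", value)
        for predicate, value in properties.items()
    ]

def to_ntriples(triples):
    """Serialize triples in an N-Triples-like form.

    Objects starting with "http" are written as IRIs, everything else as a
    plain literal. Escaping of special characters is deliberately omitted.
    """
    lines = []
    for subject, predicate, obj in triples:
        rendered = f"<{obj}>" if obj.startswith("http") else f'"{obj}"'
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)

triples = annotate(
    "lot-42",
    {"producedBy": f"{TRACE}actor/farm-7", "harvestDate": "2008-05-01"},
)
print(to_ntriples(triples))
```

Because every statement about the unit shares the same subject identifier, information recorded by different actors in the chain can later be merged by simple union of their triples.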
TraceALL, a food traceability system based completely on solid existing standards of the semantic web initiative, was described for the first time by Salampasis et al. (2012). TraceALL is a semantic web, ontology-based, service-oriented framework that aims to provide the necessary infrastructure enabling food industries (particularly SMEs) to implement traceability applications using an innovative generic framework. It provides a formal, ontology-based, general-purpose methodology to support knowledge representation and information modelling in traceability systems. Their system provides a set of core services for storing, processing, and retrieving traceability information in a scalable way. In addition, they uniquely identify a Traceability Resource Unit (TRU) using a Uniform Resource Locator (URL) code. TraceALL facilitates the development of next generation traceability applications from a food safety perspective.
The Food Track & Trace Ontology (FTTO), devoted to managing food traceability, was developed by Pizzuti et al. (2014) to connect with a Global Track & Trace Information System (GTTIS). A traceability system prototype was proposed as part of the general framework to assist the process of information extraction and unification in compliance with legal and quality requirements. The FTTO ontology supports the management of a unique body of knowledge based on natural language and its corresponding synonyms, through the integration of different concepts and terms coming from heterogeneous sources of information. It includes the most representative food concepts involved in the supply chain, held together in a single ordered hierarchy that integrates and connects the main features of the food traceability domain. The sources of information used in the knowledge acquisition phase consisted mainly of thesauri and food databases, books, and the internet. The Codex Alimentarius Classification of Food and Animal Feeds was used as the reference for building the food classification tree. The knowledge model was formalized using Protégé, which was also used to generate the ontology code automatically. The resulting ontology comprises four main modules covering the key concepts of the tracking domain: Actor, Food Product, Process, and Service Product. Their proposed structure solved a few existing problems related to food traceability.
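The role of synonyms in unifying terms from heterogeneous sources can be sketched as a lookup from free-text terms to canonical concepts grouped under the four modules named above. This is only an illustration under stated assumptions: the concepts, their synonyms, their assignment to modules, and the function name `resolve` are hypothetical and do not reproduce the content of FTTO.

```python
# The four FTTO modules as named in the text.
MODULES = ("Actor", "Food Product", "Process", "Service Product")

# Hypothetical concepts: canonical name -> (module, set of lowercase synonyms).
CONCEPTS = {
    "Farmer": ("Actor", {"grower", "producer"}),
    "Tomato": ("Food Product", {"tomatoes", "pomodoro"}),
    "Pasteurization": ("Process", {"pasteurisation"}),
    "Packaging": ("Service Product", {"package", "packing material"}),
}

def resolve(term):
    """Map a free-text term to (module, canonical concept), or None if unknown."""
    key = term.strip().lower()
    for concept, (module, synonyms) in CONCEPTS.items():
        if key == concept.lower() or key in synonyms:
            return module, concept
    return None

print(resolve("Grower"))
print(resolve("pasteurisation"))
```

Terms from two different databases that resolve to the same canonical concept can then be treated as referring to the same node in the hierarchy.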
Vegetable supply chain technology has influenced the development of vegetable industries, and the development of information technology has profoundly changed the traditional mode of vegetable supply chains. Yue et al. (2005) presented an ontology-based metadata organization for vegetable supply chain knowledge searching systems. The authors built three ontologies: a user ontology, a content knowledge ontology, and a vegetable supply chain domain ontology. They formalized the metadata using RDF (Resource Description Framework) and implemented the searching system on the knowledge database of the vegetable supply chains. Through semantic search, users can retrieve more relevant knowledge about vegetable supply chains.
Chifu et al. (2007) proposed an ontological model that allows semantic annotation of web services, aiming at automatic web service composition for food chain traceability. The model was implemented in the framework of the Food-Trace project (Food Trace) for traceability in the meat industry domain. The model consists of a core ontology and two categories of taxonomic trees: business service description trees and business product description trees. The domain-specific concepts of this ontology were organized into a taxonomy that is automatically built from textual descriptions on the web sites of Romanian meat industry companies. Their ontology was used to add semantics to Web Services Description Language (WSDL) descriptions, as the vocabulary in the automated planning of service composition, and as the vocabulary for an ontology-driven user interface.
Wang et al. (2012) proposed an ontology-based quality and safety traceability system for fruit and vegetable products. Their work analysed the path of fruit and vegetable products from farm to sale terminal. The authors first introduced the working principles of a traceability system and the collection of traceability information. They then used ontology theory to build a quality and safety traceability information ontology for fruit and vegetable products. Finally, they defined a semantic model for the traceability of fruit and vegetable products, dividing this domain into a set of sub-systems.
An information system that integrates data from heterogeneous resources in order to strengthen food-borne pathogen risk management, surveillance, and prevention systems was developed by Yan et al. (2011). Their system laid the groundwork for a standard interoperable protocol which serves as a nation-wide food-borne pathogen-related warning system. The technical work of establishing a comprehensive food safety information system consists of the following steps: a) computational collection and compilation of publicly available information, b) development of ontology libraries on food-borne pathogens and design of automatic algorithms with formal inference and fuzzy and probabilistic reasoning to address the consistency and accuracy of distributed information, c) integration of collected pathogen profiling data, and d) development of a computational model in the semantic web for greater adaptability and robustness.
An attempt was made to review the representation of information and semantic retrieval in the context of ontology-based information systems. We identified the trends in the design and development of ontology supported information systems in general and in the food domain in particular.
Research on OBIS has included the creation of ontologies that treat the information system as an object of study in itself, with the objective of creating better modeling tools. In most of the works we reviewed, a distinction was made between research on ontologies for IS and ontologies of IS. The field of ontologies for IS was well represented by the work of Guarino on ontology-driven information systems. The works of FRISCO, Wand and Weber, and Milton and Kazmierczak are exemplars of ontologies of IS, in that they represent models of an information system; they used methods, tools, and theories developed within the philosophical discipline of ontology to find the basic constructs of information systems. The work on AGROVOC, Matseek, and others shows that ontologies can be reused for terminology standardisation and interoperability.
From the literature reviewed, it was evident that a number of ontology development methodologies have evolved and been tested successfully. Some of them describe how to build an ontology from scratch, others how to reuse existing ontologies. It was observed that i) most of the proposed work uses automatic or semi-automatic methods, clustering, faceted classification, extraction, and domain analysis for ontology construction, ii) ontologies can be represented as graphs, description logic, web standards, or a simple hierarchy, and iii) many ontology creation tools, such as Protégé and OilEd, have been created to facilitate ontology generation in a semi-automatic or manual way.
Our review of research in the field of FST covered various approaches used in the design and development of ontology-based information systems in the food domain, whose primary users are food technologists. The reviewed works reveal that either only a small section of the food domain was targeted or the ontology was built for a specific application. From our study, it can be concluded that food taxonomies interconnect with each other and hence food concepts can be shared and reused across various food-related, application-specific scenarios. Ontologies can be integrated and combined with each other with the main aim of defining a new, complete food taxonomy. We have found how a sub-ontology can be identified from an application ontology and can act as a core ontology for the food domain, as it proved to be self-sufficient.
Some research challenges remain. Ontology tools need to offer more expressive power, scale to large knowledge bases, and support reasoning in querying and matching. They also need to support high-level languages, modularity, and visualization.