VUB Leerstoel 2009-2010
Professor Werner Ceusters

State University of New York (SUNY), Buffalo, USA

Biography

Werner Ceusters studied medicine, neuropsychiatry, informatics and knowledge engineering in Belgium. Since 1993, he has been involved in numerous national and European research projects in the areas of Electronic Health Records, Natural Language Understanding and Ontology. From 2004 to 2006 he was Executive Director of the European Centre for Ontological Research at Saarland University, Germany. Since 2006 he has been Director of the Ontology Research Group in the Center of Excellence in Bioinformatics and Life Sciences in Buffalo, New York, as well as Full Professor of Psychiatry and member of the Center for Brain and Behavior Informatics at the University at Buffalo. His research focuses on the application of realism-based Ontology and Referent Tracking to data management, and on the requirements which ontologies and terminologies must satisfy if they are to be useful for annotation under this framework.

Theme: Ontology for Ontologies: theory and applications

Research in intelligent software agents and the Semantic Web spawned an enormous interest in ontologies. The term ‘ontology’, in this context, is most often defined as ‘an explicit specification of a conceptualization’, or ‘a description of the concepts and relationships that can exist for an agent or a community of agents’. This, however, is in contrast to the original meaning of the term in philosophy where it denotes – as a near synonym of the term ‘metaphysics’ – the science pertaining to what sorts of entities exist, and how these different sorts of entities relate to each other. Surprisingly, few approaches to ontology design in the computational sphere take advantage of the contributions to ontology in this latter sense that have been made over more than two millennia. The consequence is that most of the software artifacts designed to give such agents some sort of understanding of the world in which they have to render their services provide a distorted rather than faithful view of reality. This distortion does not matter, perhaps, for the specific purposes for which given software artifacts have been developed, but the prevailing approach falls short in providing representational resources that can cope when software needs to be applied to new purposes, or needs to work in a way which guarantees interoperability with other software.

The goal of this lecture series is to provide an in-depth understanding of what both disciplines, namely realism-based ontology and knowledge representation, have to offer each other and how a combined approach can leverage semantic interoperability in a variety of domains.

Lectures

Please note: participation is free of charge, but registration is required, also for VUB students. The number of available seats in the auditorium is limited!

The quest for semantic interoperability (inaugural oration)

Monday 17 May, 16h30-19h00, Room D2.01 (Promotiezaal)
Semantic interoperability is the ability of two or more computer systems to exchange information and to have the meaning of that information automatically interpreted by the receiving system accurately enough to produce results which are useful as defined by the end users of both systems. Current attempts to achieve semantic interoperability all rely on agreements about the understanding of the so-called ‘concepts’ stored in terminology systems such as nomenclatures, vocabularies, thesauri, or ontologies, the idea being that if all computer systems use the same terminology, or mutually compatible ones, they can understand each other perfectly. The reality, however, is that the number of terminology systems with mutually incompatible definitions or non-resolvable overlap amongst concepts grows exponentially, thereby contributing ever more to the problem of semantic non-interoperability, rather than solving it. This is because these systems leave unspecified what concepts actually are, or to what, if anything, they might correspond in reality. In this lecture, we will use examples from biomedicine, property rights management, (inter)national intelligence, and corporate memories, to demonstrate where prevailing approaches go astray.
The inaugural lecture is followed by a reception.

Prevented abortions, absent nipples and other unicorns: the need for realism-based ontology development

Tuesday 18 May, 17h00-19h00, Room D0.08

The realist orientation in ontology is based on the view that terms are to be aligned not with concepts in people’s minds but rather with entities in reality. Central to this view are three assumptions. The first is that the reality of molecules and organisms and planets exists objectively, i.e. independently of the perceptions or beliefs of cognitive beings. Not only the wide variety of entities existing in reality but also how these entities relate to each other is a matter not of agreements made by scientists or by database modelers but rather of objective fact. The second assumption is that reality, including its structure, is accessible to human beings – for example through experiments – and can be discovered: it is scientific research that allows human beings to find out what entities exist and what relationships obtain between them. The third assumption is that an important aspect of the quality of an ontology or terminology is determined by the degree to which the structure according to which the terms are organized mimics the pre-existing structure of reality.

In the context of information systems, this means that an important aspect of the quality of an information system is determined by the degree to which (1) its individual representational units correspond to entities in reality, and (2) the structure according to which these units are organized mimics the corresponding structure of reality. In this lecture, we provide an overview of the various sorts of entities existing in reality, drawing on Basic Formal Ontology, an upper ontology grounded in philosophical realism and developed and tested through collaborations with many groups of biomedical researchers.

Ontologies in healthcare and the vision of personalized medicine

Wednesday 19 May, 17h00-19h00, Room D2.01 (Promotiezaal)

Biomedical informatics may well be the single fastest-growing specialty in the life sciences today. In its broadest sense, this discipline spans the interface between science and technology, and seeks to develop methods and tools for the acquisition, management, communication and processing of biomedical information. To accomplish these goals, biomedical informatics combines knowledge about the functional, structural and technological aspects of the medical and biological sciences with mathematics, computer science and information technology – all with the aim of helping professionals and consumers to understand, and draw practically relevant conclusions from, an immense and constantly growing variety of biological and clinical data.

As is well known, biomedical information is published using many different sorts of terminologies, classifications and coding systems. This diversity produces silo effects, which reduce the value of annotations created in their terms in multiple ways, making data both difficult to access and resistant to integration. Ontologies such as the Gene Ontology are increasingly being used as tools to overcome these problems in biological domains by providing corridors of semantic interoperability between distinct information resources. The idea is that, if multiple bodies of relevant information can be annotated using common, non-redundant sets of ontology terms with definitions formulated in some common logical language, then the information they contain will be more easily accessible and capable of being integrated computationally.

In this lecture, we will discuss the Open Biomedical Ontologies Foundry initiative, which aims to develop such a coherent set of master and domain ontologies to link together the various resources in a semantically coherent way, in contrast to the merely syntactic approaches that have been used thus far, for example in the context of the Semantic Web initiative.

Ontologies and Natural Language Understanding

Thursday 20 May, 17h00-19h00, Room D2.01 (Promotiezaal)

At least 95% of the life science information currently publicly available resides in journals and research reports in natural language form. Even in hospitals that use an electronic medical record system, the majority of the data resides in electronic reports written in free text. This is because, as stated in the early eighties by Wiederhold, the description of biological variability requires the flexibility of natural language and it is generally desirable not to interfere with the traditional manner of medical recording. On the other hand, it is equally true that, without proper mechanisms in place, free natural language reports are incapable of being understood not only by machines but also, in many cases, by other human beings. Very often, medical statements are written down in a context that is obvious at the time of registration, but difficult to reconstruct later on by third parties, or even by the original source. Moreover, in order for a computer to be able to further process healthcare data, the data must be available in a coded and structured form. Making this happen in a way transparent to healthcare specialists is the ultimate goal, perhaps even the "raison d'être", of natural language understanding (NLU) applications in healthcare.

In this lecture, we will first explore how natural language understanding can benefit from realism-based ontologies. We will then investigate the requirements that NLU applications have to meet if they are to assist in building better ontologies.

Referent Tracking: why Big Brother was just a little baby

Friday 21 May, 17h00-19h00, Room E0.04

A referent tracking system (RTS) is a special kind of digital information system which keeps track of (1) what is the case in reality and (2) what is expressed in more traditional information systems about what is believed to be the case in reality. It does this unambiguously by means of Instance Unique Identifiers (IUIs), using principles and methods that ensure – modulo the occurrence of errors at the point of data entry, the resolution of which is also covered by the RT paradigm – that an IUI is (1) persistent (because once created in an RTS it is never deleted), (2) globally unique (because an IUI denotes only one entity within an RTS), and (3) singular (because within an RTS, there is only one IUI for each specific entity).
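The three IUI properties listed above can be illustrated with a minimal sketch. It is not the actual RTS implementation: the class name `ReferentTrackingSketch`, its methods, the `IUI-` prefix and the use of a string `entity_key` (a stand-in for the judgment, made at data entry, that two descriptions concern the same entity in reality) are all assumptions introduced here for illustration.

```python
import uuid


class ReferentTrackingSketch:
    """Toy in-memory registry illustrating the three IUI properties.

    Hypothetical sketch: not the API of any real referent tracking system.
    """

    def __init__(self):
        self._iui_by_entity = {}  # entity key -> IUI
        self._entity_by_iui = {}  # IUI -> entity key

    def assign(self, entity_key):
        """Return the IUI for an entity, creating one on first sight."""
        # Singular: an entity that already has an IUI keeps that one IUI.
        if entity_key in self._iui_by_entity:
            return self._iui_by_entity[entity_key]
        # Unique: a fresh identifier is never shared with another entity.
        iui = f"IUI-{uuid.uuid4()}"
        while iui in self._entity_by_iui:
            iui = f"IUI-{uuid.uuid4()}"
        self._iui_by_entity[entity_key] = iui
        self._entity_by_iui[iui] = entity_key
        return iui

    def resolve(self, iui):
        """Return the entity key that a given IUI denotes."""
        return self._entity_by_iui[iui]

    # Persistent: the class deliberately offers no method for deleting IUIs.


rts = ReferentTrackingSketch()
# Two distinct fractures in the same leg receive two distinct IUIs.
first = rts.assign("patient-X/left-leg/fracture-A")
second = rts.assign("patient-X/left-leg/fracture-B")
# Referring to the first fracture again yields the IUI assigned earlier.
again = rts.assign("patient-X/left-leg/fracture-A")
```

In this sketch, a record would store `first` or `second` rather than a descriptive phrase, so each statement points at exactly one instance.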

The RT paradigm was originally developed to support the entry and retrieval of data in the Electronic Health Record (EHR), where its purpose is to avoid the problems that arise when statements in an EHR refer to disorders, lesions and other entities on the side of the patient by means of logically complex descriptive phrases such as "the fracture in the leg of patient X" or "the tumor in the lung of patient Y". These problems arise because the phrases in question employ generic terms in ways which may fail to identify the relevant instances unambiguously: John may have multiple fractures in his leg, or he may have fractured his leg twice at different times in his life. In this lecture, we will discuss the theoretical foundations of the paradigm and give examples of how it has been implemented. We will discuss in particular its applicability in the context of the Globally Networked and Integrated Intelligence Enterprise, which aims to integrate foreign, military, and domestic intelligence capabilities through policy, personnel and technology actions, with the goal of providing decision advantage to policy makers, warfighters, homeland security officials and law enforcement personnel.