📙 Building Ontologies With Basic Formal Ontology
Author: Robert Arp, Barry Smith, and Andrew D. Spear
Full Title: Building Ontologies With Basic Formal Ontology
Ontologies represent (or seek to represent) reality, and they do so in such a way that many different persons can understand the terms they contain and so learn about the entities in reality that these terms represent.
The Systematized Nomenclature of Medicine (SNOMED), a leading international clinical terminology, defined a “disorder” in releases up to 2010 as “a concept in which there is an explicit or implicit pathological process causing a state of disease which tends to exist for a significant length of time under ordinary circumstances.” At the same time it defined “concepts” as “unique units of thought.” From this it follows that a disorder is a unit of thought in which there is a pathological process causing a state of disease, so that to eradicate a disorder would involve eradicating a unit of thought.
As Daniel Dennett notes, computer and information scientists are often desensitized to use-mention problems because the objects to which their terms refer are entities that are properly at home inside the computer (or inside the realm of mathematical entities). In this way refrigerators become identified with (are “modeled” as) refrigerator serial numbers; persons are identified with social security numbers. The following definition of “telephone” was proposed within the Health Level 7 (HL7) community in 2007: “Telephone: a telephone is an observation with a value having datatype ‘Telecom.’”
What do all the entities that a term such as “eukaryote cell” refers to have in common that makes them together form a type or universal? Our preferred answer to this question, which we call ontological realism, says that there is some eukaryote cell universal of which all particular eukaryote cells are instances. On this view, universals are entities in reality that are responsible for the structure, order, and regularity—the similarities—that are to be found there.
A class, on our view, is defined as a maximal collection of particulars falling under a given general term.
Each domain ontology consists of a taxonomy (a hierarchy structured by the is_a relation) together with other relations such as part_of, contained_in, adjacent_to, has_agent, preceded_by, and so forth, along with definitions and axioms governing how its terms and relations are to be understood. A domain ontology is thus a taxonomy that has been enhanced to include more information about the universals, classes, and relations that it represents.
What we have called a taxonomy is a representational artifact that is organized hierarchically with nodes representing universals or classes and edges which represent the is_a or subtype relation. Where simple taxonomies are organized in terms of the basic is_a relation only, ontologies are organized also by other relations, such as parthood.
Indeed, all ontologies, as we understand them here, consist of (1) a central backbone taxonomy, in which all the nodes of the ontology are linked together via is_a relations, together with (2) further relations defined between the nodes of the ontology. In addition, each node consists of (3) a term along with, when necessary, (4) synonyms for the term, and crucially (5) a definition of the term that makes use of the Aristotelian genus and differentia structure.
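The five components listed above can be pictured as a small record type. The following is only a minimal sketch: the class name `OntologyNode`, its field names, and the sample differentia are illustrative assumptions, not taken from the book or from any ontology tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class OntologyNode:
    """One ontology node (hypothetical structure, for illustration only)."""
    term: str                      # (3) the preferred term
    genus: Optional[str]           # (1) is_a parent; None only for the root
    differentia: str = ""          # what marks this type out within its genus
    synonyms: List[str] = field(default_factory=list)             # (4)
    relations: Dict[str, List[str]] = field(default_factory=dict)  # (2) e.g. part_of

    def definition(self) -> str:
        """(5) Render a genus-and-differentia definition: 'an A is a B that C'."""
        if self.genus is None:
            return f"{self.term}: root of the taxonomy (no genus)"
        return f"{self.term} =def. a {self.genus} that {self.differentia}"


# Illustrative values only; the differentia is not quoted from the book.
node = OntologyNode(
    term="eukaryote cell",
    genus="cell",
    differentia="has a membrane-bound nucleus",
)
print(node.definition())
```

Keeping the genus as a single optional field, rather than a list, mirrors the idea of one backbone is_a link per node; other relations go in the separate `relations` mapping.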
Ontologies are representations of reality, not of people’s concepts or mental representations or uses of language.
Ontological realism applies equally to all branches of science, taking the view that, for example, collateralized debt obligations are no less real than electrons and planets.
The implications of perspectivalism for ontology are that the irreducibility of different perspectives should be respected also in the design of ontologies. Ontology developers should not seek to represent all portions and features of reality in a single ontology, but should seek, rather, a modular approach, in which each module is maintained as far as possible by experts in the corresponding scientific discipline.
Some specific implications of fallibilism for ontology design in support of scientific research include the following: 3a. That every ontology must have sophisticated strategies for keeping track of successive versions of the ontology.
Adequatism is the opposite tendency, which holds that the entities in any given domain should be taken seriously on their own terms and that room must be made in our set of theories of reality for all of the different sorts of entities that reality contains, at all levels of granularity.
For the adequatist all scientific disciplines are prima facie of equal worth in providing representations of what exists in reality.
A final general principle to keep in mind is the following: when designing a domain ontology, begin by identifying those features of the subject matter that are the easiest and clearest to understand and define.
An outline of the steps to be followed in designing a domain ontology:
1. Demarcate the subject matter of the ontology.
2. Gather information: identify the general terms used in existing ontologies and in standard textbooks; analyze to remove redundancies.
3. Order these terms in a hierarchy of the more and less general ones.
4. Regiment the result in order to ensure:
   a. logical, philosophical, and scientific coherence,
   b. coherence and compatibility with neighboring ontologies, and
   c. human understandability, especially through the formulation of human-readable definitions.
5. Formalize the regimented representational artifact in a computer usable language in such a way that the result can be implemented in some computable framework.
Avoid as far as possible the use of acronyms and abbreviations in formulating ontology terms. The rationale for this is that acronyms and abbreviations are too easy to create locally—often, for example, by designers of databases for no reason other than to enable all column headings to fit on a single screen.
The half-life of acronyms can be very short, and it is not unusual for those who work with databases (even, sometimes, a database’s own creator) to forget what their acronyms originally meant. The goal of ontology, in contrast, is to create standard terminologies that can be employed and relied upon by anyone—in the present and in the future—working in a given discipline.
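One way to apply the advice on acronyms during term review is a rough automated check. The sketch below is a heuristic of my own, not a method from the book: the function name `find_acronyms` is hypothetical, and it only catches all-uppercase tokens, so lowercase abbreviations such as "hdr" would still need a human reviewer.

```python
import re
from typing import List


def find_acronyms(term: str) -> List[str]:
    """Return tokens that look like acronyms (two or more characters, all caps).

    Heuristic only: digits are allowed inside a token, and lowercase
    abbreviations are deliberately not detected.
    """
    tokens = re.findall(r"[A-Za-z0-9]+", term)
    return [t for t in tokens if len(t) >= 2 and t.isupper()]


for candidate in ["DNA replication", "portion of tissue", "HL7 msg hdr"]:
    print(candidate, "->", find_acronyms(candidate))
```

A flagged token is a prompt to spell the term out in full, not proof that the label is wrong.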
And even if only one such use of a mass noun like “tissue” were selected as the preferred label in an ontology, the mentioned ambiguities would still lead to problems of misuse of this term by human beings. It is for this reason that we recommend that mass nouns be avoided entirely when constructing ontologies. Instead phrases beginning with an appropriate prefix (such as “portion of,” “maximal portion of,” and so on) should be adopted.
To achieve this regimentation, we recommend transforming mass nouns such as “chemical substance” into count nouns by attaching “portion of” or some contextually appropriate equivalent operator to the beginning; thus “portion of chemical substance,” “portion of tissue,” and so on. Adopting this strategy makes it possible to treat seeming mass nouns as instances of either fiat parts or object aggregates (see chapter 5). The basic idea though, is that because mass nouns refer to different kinds of entities on different occasions of use, they should be avoided in favor of more ontologically transparent terminology.
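The "portion of" transformation is mechanical enough to sketch in a few lines. The function name `regiment_mass_noun` and the list of recognized operators are assumptions made for this example; deciding which terms are mass nouns in the first place remains a human judgment.

```python
# Operators already recognized as count-noun prefixes (illustrative list,
# based on the two prefixes named in the passage above).
KNOWN_OPERATORS = ("portion of", "maximal portion of")


def regiment_mass_noun(term: str, operator: str = "portion of") -> str:
    """Prefix a mass noun with an operator unless it already carries one."""
    cleaned = " ".join(term.split())  # collapse stray whitespace
    lowered = cleaned.lower()
    if any(lowered.startswith(op + " ") for op in KNOWN_OPERATORS):
        return cleaned
    return f"{operator} {cleaned}"


print(regiment_mass_noun("tissue"))
print(regiment_mass_noun("chemical substance"))
print(regiment_mass_noun("maximal portion of tissue"))
```

The check against `KNOWN_OPERATORS` keeps the function idempotent, so running it twice over a term list does not produce "portion of portion of tissue".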
Beginning with the most general types of entities determined by the specific target domain and working downward from there helps to rule out from the beginning the inclusion in the ontology of content that is not relevant to the chosen domain.
We recommend in particular that when building ontologies negative terms should be avoided entirely. That is, the ontology builder should assume that the universals are in every case positive, and so terms such as “nonrabbit” or “nonheart”—defined in accordance with (*)—should not be used, since there are no corresponding negative universals.
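A crude screen for negative terms such as "nonrabbit" might look like the following. This is a sketch under a strong assumption, namely that negative terms can be spotted by a "non" or "not " prefix. That assumption produces false positives (for example "nonapeptide", which names a positive kind), so flagged terms are candidates for review rather than errors. The name `looks_negative` is hypothetical.

```python
# Prefixes treated as signs of a negative term (an assumption for this sketch).
NEGATIVE_PREFIXES = ("non", "not ")


def looks_negative(term: str) -> bool:
    """Return True if the term begins with a prefix that suggests negation."""
    return term.strip().lower().startswith(NEGATIVE_PREFIXES)


for candidate in ["nonrabbit", "nonheart", "rabbit", "nonapeptide"]:
    print(candidate, "->", looks_negative(candidate))
```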
Each ontology should incorporate an is_a hierarchy having the structure of a directed acyclic graph with a single root.
Our own experience with domain experts who are not ontologists and are building ontologies in a variety of different contexts has taught us repeatedly that, when scientists find it difficult to select between multiple parents for a term needing to be included in an ontology, the discipline imposed by the single inheritance principle is welcomed because it repeatedly leads to greater clarity of thinking on the part of those involved.
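The structural requirements on the is_a hierarchy (a single root, no cycles, and single inheritance) can be checked automatically. The sketch below assumes the hierarchy is available as a plain list of (child, parent) pairs; the function name `check_is_a_hierarchy` and the sample terms are illustrative, not part of BFO or of any particular ontology editor.

```python
from typing import Dict, List, Set, Tuple


def check_is_a_hierarchy(edges: List[Tuple[str, str]]) -> List[str]:
    """Return human-readable problems found in a list of (child, parent) edges."""
    parents: Dict[str, Set[str]] = {}
    nodes: Set[str] = set()
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)
        nodes.update((child, parent))

    problems: List[str] = []

    # Single inheritance: no term may have more than one is_a parent.
    for child in sorted(parents):
        if len(parents[child]) > 1:
            problems.append(f"multiple parents: {child} -> {sorted(parents[child])}")

    # Single root: exactly one term may lack a parent.
    roots = sorted(n for n in nodes if n not in parents)
    if len(roots) != 1:
        problems.append(f"expected exactly one root, found {roots}")

    # Acyclicity: walking upward from a term must never return to it.
    for start in sorted(nodes):
        stack = list(parents.get(start, ()))
        seen: Set[str] = set()
        while stack:
            node = stack.pop()
            if node == start:
                problems.append(f"cycle through: {start}")
                break
            if node not in seen:
                seen.add(node)
                stack.extend(parents.get(node, ()))

    return problems


sample = [("eukaryote cell", "cell"), ("prokaryote cell", "cell")]
print(check_is_a_hierarchy(sample))                       # no problems expected
print(check_is_a_hierarchy(sample + [("eukaryote cell", "organelle")]))
```

Reporting every problem, instead of stopping at the first, suits the review setting described above, where domain experts work through a list of terms that need a single parent.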
Examples of object aggregates are a heap of stones, a group of commuters on the subway, a population of bacteria in your blood, a flock of geese, the collection of patients in a hospital.
Examples of specifically dependent continuants include the color of this tomato, the pain in your left elbow, the mass of this cloud, the smell of this piece of mozzarella, the disposition of this fish to decay, your role of being a doctor, the function of your heart to pump blood, and the quality of a specific pixel array on your screen. The mass of this cloud could not exist without this cloud and the color of this tomato could not exist without this tomato.
Examples of realizable entity include the role of being a doctor, the functions of the reproductive organs, the disposition of a portion of blood to coagulate, the disposition of a portion of metal to conduct electricity. Entities in each of these types are in each case associated with entities of corresponding process types in which they are realized (executed, manifested, actualized). Thus, for example, the role of a doctor is realized when he examines or treats patients; the function of a reproductive organ is realized in copulation or insemination.
A role is an externally grounded realizable entity, that is, it is a realizable entity that is possessed by its bearer because of some external circumstances (for example, the bearer has been assigned the role by some other persons, who have roles of their own which grant them a certain authority). A role is thus always optional; the bearer does not have to be in the given external circumstances.
A disposition is a realizable entity in virtue of which—for example, through appropriate triggers—a process of a certain kind occurs (or can occur or is likely to occur) in the independent continuant in which the disposition inheres.
A disposition can thus be conceived of as an internally grounded realizable entity.
Incorporation of dispositions into the BFO ontology provides a means to deal with those aspects of reality that involve possibility or potentiality without the need for complicated appeals to modal logics or possible worlds.
A function is a special kind of disposition. It is a realizable entity whose realization is an end-directed activity of its bearer that occurs because this bearer is (a) of a specific kind and (b) in the kind or kinds of contexts that it is made or selected for.
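The relationships among these realizable entities form a small is_a chain: function is a kind of disposition, and both disposition and role are kinds of realizable entity. A minimal sketch of that chain follows; the dictionary layout and the helper names `ancestors` and `is_kind_of` are assumptions for illustration, and the mapping covers only the types named in these notes.

```python
from typing import Dict, List

# child -> parent, covering only the types discussed in the notes above.
IS_A: Dict[str, str] = {
    "function": "disposition",
    "disposition": "realizable entity",
    "role": "realizable entity",
}


def ancestors(term: str, is_a: Dict[str, str] = IS_A) -> List[str]:
    """Follow is_a links upward and return the chain of more general types."""
    chain: List[str] = []
    while term in is_a:
        term = is_a[term]
        chain.append(term)
    return chain


def is_kind_of(term: str, ancestor: str) -> bool:
    """True if `ancestor` is reachable from `term` via is_a links."""
    return ancestor in ancestors(term)


print(ancestors("function"))
print(is_kind_of("role", "disposition"))
```

Because is_a is transitive, every function is also a realizable entity, while a role is not a disposition: the two are siblings that differ in being externally versus internally grounded.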
A BFO: process boundary is an occurrent entity that is the instantaneous temporal boundary of a process.