Development of CDO Ontologies

This page describes the practices followed in developing CDO ontologies. It results from needs identified during the early years of the CDO ontologies’ socialization and software review processes. Until the 1.0.0 release of the CDO ontologies, the practices on this page will likely continue to adapt as workflows are refined.

Review checklists

GitHub repositories for CDO ontology development use the following checklist templates to coordinate an issue’s progression with the respective ontology committee. To enable GitHub progress-tracking (based on checkbox counts), the OC Chair or Coordinator inlines these templates into the initial Issue or Pull Request description by editing it.

Due to a display issue with website font colors, the checklists are presented in the CONTRIBUTE.md file in the website’s source repository, here.

Branching

GitHub repositories in CDO each follow one of two branching practices:

“Git-flow” branching

The “Git-flow” branching model used by CDO is based on the description by Vincent Driessen, dated 2010-01-05. Repositories following this branching model generally expect most development to be done in “Feature” branches, which branch off of develop. The “Primary” branch (typically named master or main) designates releases with tags and the GitHub release-list interface.

In this branching model, pull requests should target the develop branch, not the primary branch.

The head of the primary branch is typically the current release. The primary branch may also carry some non-release commits, made when components of GitHub interface elements need to be programmed there.

“Continuous-release” branching

This branching model is used for repositories that do not designate releases. The head of the primary branch (master or main) is the “Current release.”

In this branching model, pull requests should target the primary branch.
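The pull-request targeting rule for the two branching models can be summarized in a few lines. This is a minimal sketch: the function name `pull_request_target` and the model labels `"git-flow"` and `"continuous-release"` are illustrative assumptions, not part of any CDO tooling.

```python
def pull_request_target(branching_model: str, primary_branch: str = "main") -> str:
    """Return the branch a pull request should target.

    The model labels and this function are hypothetical, for illustration only.
    """
    if branching_model == "git-flow":
        # Git-flow: feature work is proposed against develop, not the
        # primary branch, which designates releases.
        return "develop"
    if branching_model == "continuous-release":
        # Continuous-release: the head of the primary branch is the
        # current release, so pull requests target it directly.
        return primary_branch
    raise ValueError(f"Unknown branching model: {branching_model!r}")
```

For a repository whose primary branch is named master, `pull_request_target("continuous-release", "master")` returns `"master"`, while `pull_request_target("git-flow", "master")` returns `"develop"`.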

Testing

Testing prereleases

The CDO ontology Git repositories (including CASE’s ontology repository and UCO’s) follow the “Git-flow” branching model. Additional consideration is put into how the develop and feature branches are processed:

Part of the testing process for the ontology is assessing impact of proposals, across tooling and existing example data. To assist with this review, CASE provides “Prerelease” ontology builds, available here:

These are monolithic and syntax-normalized builds of the CASE and UCO ontologies. Their states are used to review each of the CASE examples, and the resulting SHACL validation results are stored as files alongside the examples’ source materials. For instance, here are the current validation results for the website’s Asgard example:

CASE-develop.ttl is built according to develop branch states of CASE and UCO, and thus incorporates all of the proposals that are committee-approved and staged for the next release. CASE-unstable.ttl follows an implementation practice that is, well, unstable: Most, or all, proposals under consideration are merged into one branch, before committee review or approvals that would see the proposals merged into develop.

CDO ontologies maintain a -Archive Git repository (CASE’s, UCO’s), whose primary functions are these:

  1. Serve an unstable branch that represents every proposal under committee consideration.
  2. Store an archive of every prior state of the unstable branch.

(The -Archive repositories are not “GitHub forks”, in order to prevent interface confusion from Pull Requests. They do, however, share the master and develop branch histories.)

What makes the unstable branch worth its own archive repository is that the branch does not guarantee preservation of its own Git history. Feature branches split off of develop, but do not have a single stable joining point in the ontology development repository where all of their effects can be considered in aggregate. Yet it is important to discover when in-flight proposals might conflict with one another, and sometimes this is only visible when considering all of them at once. Meanwhile, the order in which these branches are tested might not be the order in which they are voted upon and accepted into develop by the committee.

The unstable branch will be reset to develop by the Ontology Committee Chair, Coordinator, or Product Manager, at a frequency that is intentionally left undefined. To ensure access to prior states of the unstable branch, the -Archive repository will maintain a named branch for each reset, e.g. archive/unstable-2022-04-01. (This can benefit users who test by using Git submodules and/or Git Bisect, rather than website downloads. The CASE-Examples repository and CASE website both track with submodules.)
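The archive branch naming shown above can be sketched as follows. The pattern is inferred from the single example archive/unstable-2022-04-01; the function `archive_branch_name` is hypothetical and is not taken from the -Archive repositories.

```python
from datetime import date


def archive_branch_name(reset_date: date) -> str:
    """Build an archive branch name for a reset of the unstable branch.

    Assumption: names follow the pattern of the one documented example,
    archive/unstable-YYYY-MM-DD.
    """
    # date.isoformat() yields YYYY-MM-DD, matching the example in the text.
    return f"archive/unstable-{reset_date.isoformat()}"


print(archive_branch_name(date(2022, 4, 1)))  # archive/unstable-2022-04-01
```

Because ISO-formatted dates sort lexicographically in chronological order, branch names of this shape also list in reset order.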

To summarize, if a developer wishes to test against some “Prerelease” state:

Profiles

The ontologies within CDO, including UCO, are designed as “mid-level” domain ontologies, generally but not entirely scoped within the cyber domain. A “mid-level” ontology is distinct from “top-level” (aka “foundational” or “upper”) ontologies. The rationale for being “mid-level” has been to avoid excluding other potential ontological alignments that exist as independent efforts modeling other domains, such as provenance. Because top-level ontologies are generally not compatible with one another (“Foundational” typically being a distinct status within a knowledge model), adopting one top-level ontology potentially forgoes interoperability with another and with all adopters of the other. Similarly, other ontologies that do not consider themselves “top-level” are not necessarily compatible with any “top-level” ontology that might be adopted.

CDO ontologies need to adopt existing efforts in other domains, especially when there is a demonstrated need for something that is adjacent to the cyber domain, such as photographing physical objects. UCO can provide description of the camera; CASE, the photograph-subject’s relevance to an investigation; but neither CASE nor UCO has, say, the class of Motorcycles as photograph-subject in its scope, nor the photograph’s location being “near” this particular conceptualization of the Washington Monument.

UCO can explore alignment between, say, uco-location:Location and GeoNames’ Feature, but should not do so at the expense of other geospatial representations, such as GeoSPARQL 1.1’s Feature, or BFO 2.0’s spatial region. To explore alignment, CDO ontologies are using “Profile” repositories on GitHub.

Profiles serve three use cases, which have different strategic objectives:

Though the objectives for each of these use cases differ significantly, the overall implementation method remains consistent for the three, except for the mimicking profile declining to relate UCO to the external ontology with subclassing.

Each “Profile” repository follows this pattern:

These repositories can be brought together to review how well current examples adhere to the profiles’ ontological alignments, whether by confirming graph-individuals’ disjointness through RDFS expansion, or by a consistency review through OWL-DL expansion. (A GitHub repository attempting this is currently under development.) Bringing these profiles together is one reason the CDO class is a subclass of the external class, rather than an equivalent class. One of the objectives is to explore whether multiple profiles reveal an inconsistency in unrelated ontologies, when exercised in a CDO example. (The other reason equivalent-class designations are avoided is to avoid inappropriate scope-expansions of CDO rules within adopters’ knowledge graphs, such as individuals under UCO hierarchies generally being urged to end with UUIDs.)
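The one-directional effect of subclassing can be illustrated with a small, dependency-free sketch of the RDFS subclass rule. This is not how the Profile repositories are implemented; the `ext:` and `kb:` prefixes and the individuals are hypothetical placeholders, and only uco-location:Location comes from the text above.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_expand_types(triples):
    """Apply the RDFS subclass rule until no new rdf:type triples appear."""
    graph = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for (s, p, o) in graph if p == SUBCLASS_OF]
        typed = [(s, o) for (s, p, o) in graph if p == RDF_TYPE]
        for individual, cls in typed:
            for sub, sup in subclass_pairs:
                # If x is typed as a subclass, x is also typed as the superclass.
                if cls == sub and (individual, RDF_TYPE, sup) not in graph:
                    graph.add((individual, RDF_TYPE, sup))
                    changed = True
    return graph


# Hypothetical profile statement: the CDO class is a subclass of an external class.
profile = [("uco-location:Location", SUBCLASS_OF, "ext:Feature")]

# Hypothetical data: one CDO individual and one adopter individual.
data = [
    ("kb:location-1", RDF_TYPE, "uco-location:Location"),
    ("kb:feature-1", RDF_TYPE, "ext:Feature"),
]

expanded = rdfs_expand_types(profile + data)

# The CDO individual gains the external type after expansion.
assert ("kb:location-1", RDF_TYPE, "ext:Feature") in expanded
# The adopter's individual is not pulled into the CDO class.
assert ("kb:feature-1", RDF_TYPE, "uco-location:Location") not in expanded
```

Under an equivalent-class designation the second inference would also hold, so rules attached to the CDO class would start applying to the adopter's individual as well. Subclassing avoids that.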

These repositories are each designated as “Exploratory”. Their contents are not official, not versioned beyond Git commit mechanisms, and not subject to Ontology Committee workflows for revisions. They are expected to change as modeling needs are demonstrated through new class, property, and example development. Those wishing to adopt a Profile are encouraged to do so using a Git submodule. Contributions or requests for alignment explorations are welcome.
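Adopting a Profile as a Git submodule amounts to running `git submodule add` in the adopting repository. The sketch below only assembles and prints that command rather than executing it. The repository URL and local path are hypothetical placeholders, not actual Profile locations.

```python
import shlex


def submodule_add_command(repository_url: str, local_path: str) -> list[str]:
    """Assemble the arguments for `git submodule add <url> <path>`."""
    return ["git", "submodule", "add", repository_url, local_path]


# Hypothetical URL and path, for illustration only.
command = submodule_add_command(
    "https://example.org/hypothetical-profile.git",
    "dependencies/hypothetical-profile",
)

# Print a shell-safe rendering; pass `command` to subprocess.run(command, check=True)
# from inside the adopting repository to actually run it.
print(shlex.join(command))
```

Tracking a Profile this way pins the adopter to a specific commit, which suits repositories that are versioned only through Git commits.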