2725 Add CellMark CLM-CL component (#2772)
* Prevent update_repo from overwriting the QC workflow.

CL has a customised QC workflow, in which we additionally check for
taxon constraint violations. Whenever the repository is updated (e.g. to
reflect changes in cl-odk.yaml after adding a new component), our QC
workflow file should _not_ be overwritten by the ODK-generated one.

When upgrading to a newer major version of the ODK, if the newer version
brings changes to the standard QC workflow, it will be up to a CL
"pipeline engineer" to port those changes if necessary.
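
As a rough illustration of that porting step, the sketch below reports
whether an ODK-generated QC workflow differs from the customised one. It
is only a thin wrapper around `diff -u`; the function name
`qc_workflow_drift` is hypothetical and not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of CL or the ODK).
# qc_workflow_drift GENERATED CURRENT
#   returns 0 when both files are identical,
#           1 when they differ (the unified diff is printed),
#           2 when either file is missing.
qc_workflow_drift() {
    generated=$1
    current=$2
    if [ ! -f "$generated" ] || [ ! -f "$current" ]; then
        echo "missing workflow file" >&2
        return 2
    fi
    # diff exits with 0 when there is no difference
    if diff -u "$current" "$generated"; then
        echo "QC workflow matches the ODK-generated one"
        return 0
    fi
    echo "QC workflow differs: review the diff above and port what is relevant"
    return 1
}
```

Assuming the generated tree is still available under `target/$OID` (the
location update_repo.sh copies from), it could be called as
`qc_workflow_drift target/$OID/.github/workflows/qc.yml $ROOTDIR/.github/workflows/qc.yml`.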

* Add the CLM-CL component.

We add a new component obtained from the Cell Markers Ontology
(https://github.com/Cellular-Semantics/CellMark).

For now, it is refreshed unconditionally as part of all workflows, as is
the case for the other similar components (the HRA subset and the CellxGene
subset). Later we will need to uncouple the refresh from at least the QC
workflow.
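
Adding a component like this one touches several files under
src/ontology (in this commit: catalog-v001.xml, cl-edit.owl and
cl-odk.yaml). The sketch below is one possible consistency check, under
the assumption that a plain substring match on the component file name
is good enough; `component_registered` is a hypothetical helper, not
something provided by the ODK.

```shell
#!/bin/sh
# Hypothetical helper (not part of CL or the ODK).
# component_registered NAME ONTOLOGY_DIR
#   checks that NAME (e.g. clm-cl.owl) is mentioned in each of the files
#   this commit had to touch; returns 0 if all mention it, 1 otherwise.
component_registered() {
    name=$1
    dir=$2
    status=0
    for file in catalog-v001.xml cl-edit.owl cl-odk.yaml; do
        # -F: fixed string, -q: quiet; a missing file counts as "not found"
        if grep -q -F -- "$name" "$dir/$file" 2>/dev/null; then
            echo "ok: $name found in $file"
        else
            echo "MISSING: $name not found in $file"
            status=1
        fi
    done
    return $status
}
```

For this commit the call would be `component_registered clm-cl.owl src/ontology`,
run from the repository root.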

* Commit re-generated files.

* Only refresh external resources upon MIR=true.

This commit makes sure that external resources (e.g. mapping sets and
components that are maintained elsewhere, and simply used as they are in
CL) are only refreshed when the MIR Make variable is set to true.

Importantly, when running the QC workflows, MIR is false, which means
that those resources will then _not_ be refreshed as part of the QC.
This is on purpose. Those external resources could at any time introduce
changes that could break CL. QC should test changes that happen entirely
within CL.

With that change, it is up to CL's editors/maintainers to refresh the
external resources _when they want to do so_. For that, all they need to
do is to make sure that MIR is set to true. For example, this is done
automatically when calling `make refresh-imports`.
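
The gating can be pictured in shell terms as follows. This is only a
sketch of the semantics, not the Makefile code itself: the function
names `refresh_allowed` and `maybe_refresh` are hypothetical, and the
whitespace trimming only roughly mimics Make's `$(strip ...)`.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the MIR switch (not part of CL).

# refresh_allowed MIR_VALUE
#   succeeds only when the value, with surrounding whitespace trimmed,
#   is exactly "true" (the comparison is case-sensitive, as in Make).
refresh_allowed() {
    trimmed=$(printf '%s' "${1-}" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
    [ "$trimmed" = "true" ]
}

# maybe_refresh MIR_VALUE COMMAND [ARGS...]
#   runs COMMAND only when refreshing is allowed; otherwise the copy
#   already committed in the repository is left untouched.
maybe_refresh() {
    mir=$1
    shift
    if refresh_allowed "$mir"; then
        "$@"
    else
        echo "MIR is not true: keeping the committed copy"
    fi
}
```

With these helpers, `maybe_refresh false echo downloaded` leaves the
resource alone, which corresponds to what happens during QC, while
`maybe_refresh true echo downloaded` runs the command. In the repository
itself the switch is simply the Make variable, which Make also accepts
on the command line in the form `MIR=true`.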

closes #2644
gouttegd authored Nov 28, 2024
1 parent 9aa526e commit 1ca7174
Showing 9 changed files with 42 additions and 15 deletions.
2 changes: 2 additions & 0 deletions docs/odk-workflows/RepositoryFileStructure.md
@@ -22,6 +22,7 @@ These are the current imports in CL
| ro | http://purl.obolibrary.org/obo/ro.owl | None |
| pato | http://purl.obolibrary.org/obo/pato.owl | None |
| ncbitaxon | http://purl.obolibrary.org/obo/ncbitaxon/subsets/taxslim.owl | None |
+| ncbitaxondisjoints | http://purl.obolibrary.org/obo/ncbitaxon/subsets/taxslim-disjoint-over-in-taxon.owl | None |
| omo | http://purl.obolibrary.org/obo/omo.owl | mirror |

## Components
@@ -42,3 +43,4 @@ These are the components in CL
| general_cell_types_upper_slim.owl | None |
| kidney_upper_slim.owl | None |
| cellxgene_subset.owl | None |
+| clm-cl.owl | None |
5 changes: 3 additions & 2 deletions src/ontology/Makefile
@@ -10,7 +10,7 @@
# More information: https://github.com/INCATools/ontology-development-kit/

# Fingerprint of the configuration file when this Makefile was last generated
-CONFIG_HASH= d4e868f8bbec3ce7b0d7656c824520ab0b96d8901a8c2b9cf400c27c3fb3dac2
+CONFIG_HASH= fe26dd231ab531dab5609c794de598d2a28083168916941a3276fe672cfe8be2


# ----------------------------------------
@@ -53,7 +53,7 @@ OBODATE ?= $(shell date +'%d:%m:%Y %H:%M')
VERSION= $(TODAY)
ANNOTATE_ONTOLOGY_VERSION = annotate -V $(ONTBASE)/releases/$(VERSION)/$@ --annotation owl:versionInfo $(VERSION)
ANNOTATE_CONVERT_FILE = annotate --ontology-iri $(ONTBASE)/$@ $(ANNOTATE_ONTOLOGY_VERSION) convert -f ofn --output $@.tmp.owl && mv $@.tmp.owl $@
-OTHER_SRC = $(PATTERNDIR)/definitions.owl $(COMPONENTSDIR)/hra_subset.owl $(COMPONENTSDIR)/mappings.owl $(COMPONENTSDIR)/blood_and_immune_upper_slim.owl $(COMPONENTSDIR)/eye_upper_slim.owl $(COMPONENTSDIR)/general_cell_types_upper_slim.owl $(COMPONENTSDIR)/kidney_upper_slim.owl $(COMPONENTSDIR)/cellxgene_subset.owl
+OTHER_SRC = $(PATTERNDIR)/definitions.owl $(COMPONENTSDIR)/hra_subset.owl $(COMPONENTSDIR)/mappings.owl $(COMPONENTSDIR)/blood_and_immune_upper_slim.owl $(COMPONENTSDIR)/eye_upper_slim.owl $(COMPONENTSDIR)/general_cell_types_upper_slim.owl $(COMPONENTSDIR)/kidney_upper_slim.owl $(COMPONENTSDIR)/cellxgene_subset.owl $(COMPONENTSDIR)/clm-cl.owl
ONTOLOGYTERMS = $(TMPDIR)/ontologyterms.txt
EDIT_PREPROCESSED = $(TMPDIR)/$(ONT)-preprocess.owl
PATTERNDIR= ../patterns
@@ -513,6 +513,7 @@ $(COMPONENTSDIR)/cellxgene_subset.owl: $(TEMPLATEDIR)/cellxgene_subset.tsv
$(ANNOTATE_CONVERT_FILE); fi

.PRECIOUS: $(COMPONENTSDIR)/cellxgene_subset.owl

# ----------------------------------------
# Mirroring upstream ontologies
# ----------------------------------------
1 change: 1 addition & 0 deletions src/ontology/catalog-v001.xml
@@ -16,6 +16,7 @@
<uri id="User Entered Import Resolution" name="http://ontology.neuinfo.org/NIF/BiomaterialEntities/NIF-Cell.owl" uri="mirror/NIF-Cell.owl"/>
<uri id="User Entered Import Resolution" name="http://purl.obolibrary.org/obo/cl/patterns/definitions.owl" uri="../patterns/definitions.owl"/>
<uri name="http://purl.obolibrary.org/obo/cl/components/hra_subset.owl" uri="components/hra_subset.owl"/>
+<uri name="http://purl.obolibrary.org/obo/cl/components/clm-cl.owl" uri="components/clm-cl.owl"/>
<uri name="http://purl.obolibrary.org/obo/cl/components/mappings.owl" uri="components/mappings.owl"/>
<uri name="http://purl.obolibrary.org/obo/cl/components/blood_and_immune_upper_slim.owl" uri="components/blood_and_immune_upper_slim.owl"/>
<uri name="http://purl.obolibrary.org/obo/cl/components/eye_upper_slim.owl" uri="components/eye_upper_slim.owl"/>
1 change: 1 addition & 0 deletions src/ontology/cl-edit.owl
@@ -24,6 +24,7 @@ Prefix(ncbitaxon:=<http://purl.obolibrary.org/obo/ncbitaxon#>)
Ontology(<http://purl.obolibrary.org/obo/cl.owl>
Import(<http://purl.obolibrary.org/obo/cl/components/blood_and_immune_upper_slim.owl>)
Import(<http://purl.obolibrary.org/obo/cl/components/cellxgene_subset.owl>)
+Import(<http://purl.obolibrary.org/obo/cl/components/clm-cl.owl>)
Import(<http://purl.obolibrary.org/obo/cl/components/eye_upper_slim.owl>)
Import(<http://purl.obolibrary.org/obo/cl/components/general_cell_types_upper_slim.owl>)
Import(<http://purl.obolibrary.org/obo/cl/components/hra_subset.owl>)
3 changes: 3 additions & 0 deletions src/ontology/cl-odk.yaml
@@ -9,6 +9,8 @@ use_dosdps: TRUE
use_mappings: True
use_edit_file_imports: FALSE
release_diff: TRUE
+workflows:
+- docs
export_formats:
- owl
- obo
@@ -154,4 +156,5 @@ components:
use_template: True
templates:
- cellxgene_subset.tsv
+- filename: clm-cl.owl

25 changes: 15 additions & 10 deletions src/ontology/cl.Makefile
@@ -95,8 +95,8 @@ $(MAPPINGDIR)/cl.sssom.tsv: $(MAPPINGDIR)/cl-local.sssom.tsv \
EXTERNAL_SSSOM_PROVIDERS = fbbt zfa
EXTERNAL_SSSOM_SETS = $(foreach provider, $(EXTERNAL_SSSOM_PROVIDERS), $(MAPPINGDIR)/$(provider).sssom.tsv)

-# We only refresh external resources under IMP=true
-ifeq ($(strip $(IMP)),true)
+# We only refresh external resources under MIR=true
+ifeq ($(strip $(MIR)),true)

# FBbt mapping set
$(MAPPINGDIR)/fbbt.sssom.tsv: .FORCE
@@ -344,10 +344,9 @@ add-replacedby:
# EXTERNAL RESOURCES
# ----------------------------------------

+ifeq ($(strip $(MIR)),true)

# Human reference atlas subset
-# FIXME: Refreshing of this resource should be uncoupled from the
-# release/QC pipelines.
-# See <https://github.com/obophenotype/cell-ontology/issues/2644>
HRA_SUBSET_URL="https://raw.githubusercontent.com/hubmapconsortium/ccf-validation-tools/master/owl/CL_ASCTB_subset.owl"
$(TMPDIR)/hra_subset.owl:
wget $(HRA_SUBSET_URL) -O $@
@@ -356,14 +355,20 @@ $(COMPONENTSDIR)/hra_subset.owl: $(TMPDIR)/hra_subset.owl
$(ROBOT) merge -i $< annotate --ontology-iri $(ONTBASE)/$@ --output $@

# CellXGene reference subset
-# FIXME: Never actually downloaded again, unless the
-# $(TEMPLATEDIR)/cellxgene_subset.tsv file is manually removed; probably
-# not what was intended.
-# See <https://github.com/obophenotype/cell-ontology/issues/2644>
CELLXGENE_SUBSET_URL="https://raw.githubusercontent.com/hkir-dev/cellxgene-cell-reporter/main/templates/cellxgene_subset.tsv"
-$(TEMPLATEDIR)/cellxgene_subset.tsv:
+$(TEMPLATEDIR)/cellxgene_subset.tsv: .FORCE
wget $(CELLXGENE_SUBSET_URL) -O $@

+# CellMark reference subset
+CLM_CL_URL="https://raw.githubusercontent.com/Cellular-Semantics/CellMark/main/clm-cl.owl"
+$(TMPDIR)/clm-cl.owl:
+wget $(CLM_CL_URL) -O $@

+$(COMPONENTSDIR)/clm-cl.owl: $(TMPDIR)/clm-cl.owl
+$(ROBOT) merge -i $< annotate --ontology-iri $(ONTBASE)/$@ --output $@

+endif


# ----------------------------------------
# RELEASE DEPLOYMENT
14 changes: 14 additions & 0 deletions src/ontology/components/clm-cl.owl
@@ -0,0 +1,14 @@
<?xml version="1.0"?>
<rdf:RDF
xml:base="http://purl.obolibrary.org/obo/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:oboInOwl="http://www.geneontology.org/formats/oboInOwl#"
xmlns:xml="http://www.w3.org/XML/1998/namespace"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:obo="http://purl.obolibrary.org/obo/">
<owl:Ontology rdf:about="http://purl.obolibrary.org/obo/cl/components/clm-cl.owl"/>

<!-- This is a placeholder, it will be regenerated when makefile is first executed -->
</rdf:RDF>
4 changes: 2 additions & 2 deletions src/scripts/update_repo.sh
@@ -26,7 +26,7 @@ cp target/$OID/src/ontology/run.sh $SRCDIR/ontology/
cp -r target/$OID/src/sparql/* $SRCDIR/sparql/
mkdir -p $ROOTDIR/.github
mkdir -p $ROOTDIR/.github/workflows
-cp target/$OID/.github/workflows/qc.yml $ROOTDIR/.github/workflows/qc.yml




@@ -36,5 +36,5 @@ cp target/$OID/.github/workflows/docs.yml $ROOTDIR/.github/workflows/docs.yml
cp -n target/$OID/mkdocs.yaml $ROOTDIR/

echo "WARNING: These files should be manually migrated: mkdocs.yaml, .gitignore, src/ontology/catalog.xml (if you added a new import or component)"

+echo "WARNING: Your QC workflows have not been updated automatically. Please update the ODK version number(s) in .github/workflows/qc.yml."
echo "Ontology repository update successfully completed."
2 changes: 1 addition & 1 deletion src/sparql/class-count-by-prefix.sparql
@@ -7,4 +7,4 @@ SELECT ?prefix (COUNT(DISTINCT ?cls) AS ?numberOfClasses) WHERE
FILTER (!isBlank(?cls))
BIND( STRBEFORE(STRAFTER(str(?cls),"http://purl.obolibrary.org/obo/"), "_") AS ?prefix)
}
-GROUP BY ?prefix
+GROUP BY ?prefix
