[Typo/Bug]CL:0000163 CL:0000164 #2856

Closed
jingangdidi opened this issue Dec 11, 2024 · 9 comments

@jingangdidi

CL term
CL:0000163 and CL:0000164

Description of typo, bug or error

Why is CL:0000163 is_a CL:0000164, and vice versa?

id: CL:0000163
is_a: CL:0000164 {gci_relation="part_of", gci_filler="UBERON:0000945"} ! enteroendocrine cell
is_a: CL:0000164 {gci_filler="UBERON:0000160", gci_relation="part_of"} ! enteroendocrine cell
is_a: CL:0000164 {gci_filler="UBERON:0001264", gci_relation="part_of"} ! enteroendocrine cell

id: CL:0000164
is_a: CL:0000163 ! endocrine cell

@dosumis
Contributor

dosumis commented Dec 11, 2024

Hi @jingangdidi

This confusion is caused by the way an OWL construct, a general class inclusion axiom (GCI), is expressed in OBO.

Here's what this looks like in OWL

'endocrine cell' and ('part of' some intestine) SubClassOf 'enteroendocrine cell'
'endocrine cell' and ('part of' some stomach) SubClassOf 'enteroendocrine cell'
'endocrine cell' and ('part of' some pancreas) SubClassOf 'enteroendocrine cell'

In English: any endocrine cell that is part of the intestine is an enteroendocrine cell (and likewise for the stomach and the pancreas).
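
For anyone reading cl.obo by hand, here is a minimal Python sketch of how the qualified is_a lines could be kept apart from plain ones. It assumes the usual OBO layout with `[Term]` stanza headers and treats any is_a line carrying a `gci_relation` qualifier as a GCI rather than a direct parent. The function name `parse_is_a` and the returned structures are illustrative only, and this is not a complete OBO parser.

```python
import re

# An is_a line: target ID, then an optional {qualifier block}; any "! comment" is ignored.
IS_A = re.compile(r"^is_a:\s*(\S+)\s*(\{[^}]*\})?")
# key="value" pairs inside the qualifier block.
QUALIFIER = re.compile(r'(\w+)="([^"]*)"')


def parse_is_a(obo_text):
    """Split is_a lines into plain parents and GCI-encoded axioms.

    Returns (plain, gci):
      plain maps a term ID to the set of its direct is_a parents;
      gci lists (term, relation, filler, parent) tuples, each meaning
      "term and (relation some filler) SubClassOf parent".
    """
    plain, gci = {}, []
    current, in_term = None, False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest here.
            in_term = line == "[Term]"
            current = None
        elif in_term and line.startswith("id:"):
            current = line[len("id:"):].strip()
        elif in_term and current and line.startswith("is_a:"):
            match = IS_A.match(line)
            if not match:
                continue
            target = match.group(1)
            qualifiers = dict(QUALIFIER.findall(match.group(2) or ""))
            if "gci_relation" in qualifiers:
                # Not a plain subclass link: record it separately.
                gci.append((current, qualifiers["gci_relation"],
                            qualifiers.get("gci_filler"), target))
            else:
                plain.setdefault(current, set()).add(target)
    return plain, gci
```

With the two stanzas from this issue, only CL:0000164 gets a plain parent (CL:0000163); the three lines under CL:0000163 end up in the GCI list instead of creating a cycle.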

Is there any reason for you to use OBO? I would recommend our JSON serialisation instead:
http://purl.obolibrary.org/obo/cl/cl.json . With this you can use the edges array to find relationships without encountering confusing issues like this.

@cmungall, @gouttegd: could we remove GCIs from our OBO products? This issue comes up very regularly.

@gouttegd
Collaborator

gouttegd commented Dec 11, 2024

> could we remove GCIs from our OBO products?

We certainly could. I am working right now on customising CL’s OBO products to merge rdfs:comment annotations (when there are several of them for one class) into a single one (as discussed in #2847), and it shouldn’t be difficult to include a GCI stripping step as part of this customisation.

As to whether we should do it, and whether we should do it more generally across our ontologies (e.g. by doing it directly in the ODK), I have no opinion.

@dosumis
Contributor

dosumis commented Dec 11, 2024

> As to whether we should do it, and whether we should do it more generally across our ontologies (e.g. by doing it directly in the ODK), I have no opinion.

I think it comes down to whether an ontology decides to support round-tripping between OBO and OWL. Round-tripping is already gone once the non-converted axioms are stripped from the OBO header, and arguably stripping GCIs is just a specialised version of the same thing. @cmungall, do you have a software stack that uses the OBO representation of these axioms, e.g. are they used by OAK?

@jingangdidi
Author

> Is there any reason for you to use OBO? I would recommend our JSON serialisation instead:
> http://purl.obolibrary.org/obo/cl/cl.json . With this you can use the edges array to find relationships without encountering confusing issues like this.

Thank you for your clarification. I came across cl.obo because popV uses it. I want to retrieve all parent and child types for any given id, so I wrote code that reads cl.obo, matches the id:, name: and is_a: lines, and builds the relationships from them. Is cl.json more convenient than cl.obo for this requirement? As I understand it, is_a in cl.obo means "a subclassing relationship between one term and another", which makes it intuitive to use. Could you explain what sub, pred, and obj represent within the edges, or point me to some information about the format of cl.json?

@gouttegd
Collaborator

> Is it more convenient to utilize cl.json for this requirement compared to cl.obo?

I’d say yes, because the OBO format has quite a few quirks, and parsing it is more complicated than it looks. The use of the “qualifier” syntax to encode GCIs is a good example of the not-always-apparent complexity of the format.

The cl.json file follows the OBOGraphs schema, which is defined here. Basically, if you want the parent-child relationships, they are represented in the edges array as:

{
  "sub": "<ID of the subclass>",
  "pred": "is_a",
  "obj": "<ID of the superclass>"
}

For example, 'enteroendocrine cell' is_a 'endocrine cell' is represented as:

{
  "sub": "http://purl.obolibrary.org/obo/CL_0000164",
  "pred": "is_a",
  "obj": "http://purl.obolibrary.org/obo/CL_0000163"
}
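
To answer the original "all parents and children for a given id" question with this format, here is a rough Python sketch. It assumes the file is an OBOGraphs document with a top-level `graphs` array whose entries each carry an `edges` array, and that subclass edges use the `is_a` predicate as shown above. The helper names (`to_curie`, `build_hierarchy`, `closure`, `load_hierarchy`) are made up for this example and are not part of any library.

```python
import json

OBO_PREFIX = "http://purl.obolibrary.org/obo/"


def to_curie(iri):
    """Turn .../obo/CL_0000164 into CL:0000164; leave other strings untouched."""
    if iri.startswith(OBO_PREFIX):
        return iri[len(OBO_PREFIX):].replace("_", ":", 1)
    return iri


def build_hierarchy(doc):
    """Return (parents, children) dicts built from the is_a edges only."""
    parents, children = {}, {}
    for graph in doc.get("graphs", []):
        for edge in graph.get("edges", []):
            if edge.get("pred") != "is_a":
                continue  # skip part_of and every other relationship
            sub, obj = to_curie(edge["sub"]), to_curie(edge["obj"])
            parents.setdefault(sub, set()).add(obj)
            children.setdefault(obj, set()).add(sub)
    return parents, children


def closure(start, links):
    """Every term reachable from start by repeatedly following links.

    Pass the parents dict to get all ancestors, or the children dict
    to get all descendants. The seen set keeps the walk finite even
    if the input happens to contain a cycle.
    """
    seen, stack = set(), [start]
    while stack:
        for nxt in links.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def load_hierarchy(path):
    """Read an OBOGraphs JSON file from disk and build both lookup tables."""
    with open(path, encoding="utf-8") as handle:
        return build_hierarchy(json.load(handle))
```

For example, after `parents, children = load_hierarchy("cl.json")`, calling `closure("CL:0000164", parents)` would give every ancestor of 'enteroendocrine cell', and `closure("CL:0000163", children)` every descendant of 'endocrine cell'.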

@gouttegd
Collaborator

(You may also want to have a look at the Ontology Access Kit, which aims to provide access to this kind of data without having to bother about formats and parsing.)

gouttegd added a commit that referenced this issue Dec 12, 2024
Use a new command in the Uberon ROBOT plugin to produce "customized" OBO
artefacts in which:

* all "untranslatable" OWL axioms (owl-axioms tag) are stripped (they
  were already stripped before, this just changes the way we do it);
* all GCI axioms are stripped (#2856);
* when a class has several rdfs:comment annotations (which is not
  allowed in OBO), they are merged into a single annotation (#2847).
@canergen

Thanks for raising this issue, @jingangdidi. The code in popV has been fully rewritten to use JSON. I still have to merge it, but before that all reference models need to be retrained. @gouttegd, it would be helpful to list the JSON as a product on https://obofoundry.org/ontology/cl.html.

@cmungall
Member

I think these GCIs should be removed from CL altogether; they are trying to do what is better done using an equivalence axiom.

For the general case, I'll answer on the PR, #2860.

gouttegd added a commit that referenced this issue Dec 17, 2024
Use a new command in the Uberon ROBOT plugin to produce "customized" OBO
artefacts in which:

* all "untranslatable" OWL axioms (owl-axioms tag) are stripped (they
  were already stripped before, this just changes the way we do it);
* all GCI axioms are stripped (#2856);
* when a class has several rdfs:comment annotations (which is not
  allowed in OBO), they are merged into a single annotation (#2847).
@gouttegd
Collaborator

I assume we can close here. The original poster got their answers, the confusing GCI axioms will no longer be included in OBO products, and CL’s page on the OBO Foundry will be updated to explicitly mention the JSON products (OBOFoundry/OBOFoundry.github.io#2665).

Feel free to re-open if an unanswered question has been missed.
