Update ISOCat reference to DatCatInfo #2227
I can try to take this one, because it requires a bit of crafting. The new standard defines a slightly different entity with a slightly different role, so a direct substitution might not be the best way to handle that. But I've just updated an ISO standard wrt the references to the dead ISOCat (even Heisenberg wouldn't be uncertain, here, sadly), so I think I can handle this one as well. Not sure if I don't actually have that assignment on a very old plate, somewhere among the grey TEI issues -- gonna have a look now. |
Hm, I thought I had already implemented #1866. I will investigate why I did not. |
Oops, the middle issue that I listed above is actually closed. Well, @martinascholger, please ping me if you decide that I can be of use. My instinct would be to look at all the text fragments that mention the mechanism, to remove the recommendation to use ISOCat and definitely not replace it with a recommendation of the privately-owned datcat, but rather generalise. The original 'mistake' was to recommend ISOCat as if it were the only such service, while it should simply have been treated as an example of an external reference taxonomy. Its use goes beyond language-related applications, too. |
@bansp: Can you recommend some wording that would allow us to indicate the need to refer to a standard without mentioning datcat? Here are the sections of the Guidelines that currently contain references to ISOcat:
Attribute class: att.datcat
Element spec:
Text of 18.3: "Whether at the level of feature-system declarations, feature- and feature-value libraries, or individual features, it is possible to align both feature names and their values with standardized external data category repositories such as ISOcat. In the following example, both the feature part_of_speech and its value #commonNoun are aligned with the respective definitions provided by [ISO DCR (Data Category Registry)], as implemented by ISOcat."
Note 82 in 18.3
Text of 9.5.2: "The TEI provides means to align grammatical categories as well as their content with the ISOcat reference, which is a Web implementation of [ISO 12620]. / In the example below, a fragment of the entry for isotope cited in section 9.3.2 Grammatical Information is adorned by references to ISOcat definitions for "part of speech" (dcr:datcat) and "adjective" (dcr:valueDatcat). Depending on the status and extent of the dictionary, various strategies may be used to reduce the redundancy of the repeated ISOcat references." |
I can submit a PR some time after this week. Will try to keep this in sights. Cheers! |
Thanks @bansp! We refrigerate this weekend. We can sneak these changes into the upcoming release if you are able to do them early next week. |
I'd love to say "challenge accepted", but I am unable to make promises at this point. Will try my best though, knowing the stakes. :-) |
Council F2F: @bansp We're approaching another release in October, so we're hoping perhaps to fix this by then. Can you help? |
Heck, yes, @ebeshero and thanks for the ping -- I'll handle this after my Wednesday presentation and before Saturday morning. |
@bansp Can we discuss this at the Ling SIG this afternoon? Seems like an appropriate venue since this is a matter for linguists. |
At Ling SIG, @bansp gave us a really helpful overview of the situation here and we think every current reference in the Guidelines to ISOCat should be replaced by a generic recommendation to point to a data category repository which ideally conforms with ISO 12620 if appropriate. |
Following discussion with @peterstadler and @bansp: @bansp will revise the spec page for att.datcat to serve as an example, and make a pull request; then Council can track down and revise all other mentions and invocations of ISOCat throughout the Guidelines and fix them following @bansp's example. |
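The "track down all other mentions" step described above could be approached with a small search script. The following is only a minimal sketch in Python: the function name `find_isocat_mentions` and the sample text are hypothetical, and the sample is not actual Guidelines source. It matches the spellings that occur in this thread ("ISOCat", "ISOcat", "isocat"), including inside URLs.

```python
import re

# Case-insensitive match for the spellings seen in this thread
# ("ISOCat", "ISOcat", "isocat"), including occurrences inside URLs.
ISOCAT_PATTERN = re.compile(r"isocat", re.IGNORECASE)


def find_isocat_mentions(text):
    """Return (line_number, stripped_line) pairs for lines mentioning ISOcat."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if ISOCAT_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical sample for illustration only.
sample = """<p>Align features with repositories such as ISOcat.</p>
<p>This paragraph is unrelated.</p>
<ref target="http://www.isocat.org/">ISOCat</ref>"""

for number, line in find_isocat_mentions(sample):
    print(f"{number}: {line}")
```

Each hit would still need manual review, since the thread agrees on rewording towards a generic recommendation rather than a mechanical substitution.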
While I'm nibbling on this, a sub-issue struck me, namely the matter of the dcr namespace, which is "http://www.isocat.org/ns/dcr" (recall that it's
One solution could be to ask the CLARIN Standards Committee to assign a "clarin.eu"-based namespace URI for the
Another solution, which at this point seems to me pretty optimal (under the circumstances), is to deprecate the
"Oh no," you will go on, "the
I see the deprecation of the prefix as "nativizing" the DCR mechanism by the TEI. If that move were approved by the Council, I would be happy to use the unprefixed attributes in the version of ISO MAF that is about to be submitted for the committee ballot. Perhaps there is a chance for the Council to address this in the time remaining in Newcastle? |
I agree with removing the prefix and the namespace. That will make the attributes seem more generally applicable for people who want to point at their own data categories. @laurentromary do you have any thoughts on this? |
Yes, that's a good option! |
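To make the prefix-removal idea concrete, here is a hedged sketch in Python of how existing documents might be migrated. It assumes the unprefixed attributes keep the same local names (`datcat`, `valueDatcat`) as the `dcr:`-prefixed ones mentioned earlier in the thread; the function name `unprefix_dcr_attributes`, the sample fragment, and the `example.org` URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URI quoted in the thread for the dcr prefix.
DCR_NS = "http://www.isocat.org/ns/dcr"


def unprefix_dcr_attributes(root):
    """Rename dcr-namespaced attributes to unprefixed ones, in place.

    Returns the number of attributes renamed. If an unprefixed attribute
    of the same name already exists, the prefixed one is left untouched
    so that the clash can be reviewed by hand (an assumption of this sketch).
    """
    prefix = "{" + DCR_NS + "}"
    renamed = 0
    for element in root.iter():
        for name in list(element.attrib):
            if name.startswith(prefix):
                local = name[len(prefix):]
                if local in element.attrib:
                    continue
                element.set(local, element.attrib.pop(name))
                renamed += 1
    return renamed


# Hypothetical fragment; the target URLs are placeholders.
sample = (
    '<gramGrp xmlns:dcr="http://www.isocat.org/ns/dcr">'
    '<pos dcr:datcat="http://example.org/dc/partOfSpeech" '
    'dcr:valueDatcat="http://example.org/dc/adjective">adj</pos>'
    '</gramGrp>'
)
root = ET.fromstring(sample)
count = unprefix_dcr_attributes(root)
print(count, root.find("pos").attrib)
```

Whether the attributes end up in no namespace, as assumed here, is a decision for Council and the eventual PR.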
@rettinghaus @bansp @sydb @martindholmes @martinascholger I was working on #2340 about inconsistent ISO referencing, and stumbled into the datcat question on my own. Can we do one of the following here?
|
Reviewing the ticket and recalling conversations, that link update seems precisely what we want to avoid. Sorry for barging in on the back of another ticket! Anyway, I’ll concentrate on the easier updates in #2340. And await the PR. |
But I wonder if we can, for the moment, just remove the references to http://isocat.org/ as preparation for the coming PR, since we know we shouldn’t be pointing to it at all. We seem agreed here not to be pointing to a standard, and the idea of “nativizing” data categories (and perhaps other things for which we used to rely on ISO) seems the path for TEI. |
If there is no rush to remove the references to isocat today as opposed to a week ago, may I ask for them to be left in place for now, simply because I'm going over them all (I know, more than I was asked for, but it's hard to leave them if I can handle them), and I already anticipate some conflicts with my version even before I submit the PR. Not increasing the amount of extra work spent on resolving conflicts would be very welcome, because I'm racing against several clocks (but this item is my priority now). |
Got it--thank you, @bansp. I won't touch the isocat links and will leave this to you. I'd like Council to review my table of proposed ISO citation updates anyway before I do anything more on #2340. Let us know if we can help with anything on your end! |
The result is in PR #2359, spread mostly across Specs/att.datcat.xml and the FS and DI chapters, with some little extras.
Note: the three above documents invoke the Paderborn version of the TEI schema, with the
I hope the result is acceptable (I still need to check if I have introduced any layout mess in FS; but will catch some sleep first). I am of course willing to work on improving/rearranging the info even if something close to its current version gets merged for the upcoming release. Cheers! |
This is firstly to register my thanks to @sydb for the wonderful lot of work he has put into his review of the PR. I prefer to do this here, rather than within the PR itself, which is now lengthy and which I'm not sure will persist. Extremely helpful, and I think I'm even learning some small things, like whether I really want to use a solidus, or whether using it is just admitting my laziness... |
From Roberto Rosselli Del Turco via TEI-L (2022-02-02): "in the Guidelines section devoted to Dictionaries there's one reference to the ISOCat standard, but the latter has been superseded by DatCatInfo: http://www.datcatinfo.net/ (with a more precise URL of course) instead of http://www.isocat.org/ in https://tei-c.org/release/doc/tei-p5-doc/en/html/DI.html#index-egXML-d52e79215."