-
-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
What are the rules for putting spaces around prefixes and suffixes? #13
Comments
Yes, this is beyond the CSL specification. The |
For what it's worth, I think any behavior that is "beyond the specification" should cause the test to fail. The specification should demand a trailing space or forbid a trailing space. If the specification is ambiguous, that's a bug in the spec. |
I agree. If the spec is unclear now, but it's straightforward to fix, we should do that. If there's some debate, then we should remove from the test suite, or maybe put it in some "pending" directory or something. Flagging this for a few people, since it bears on the clarity of the spec. Would be nice if we could plug these holes are Norm is identifying them, and we're working on resolving lingering issues on the schema. |
@fbennett I know that citeproc-js has a fair amount of these sorts of behaviors. Do you have a suggestion for how we might go through and identify them systematically? |
For the general issue: we're well aware that there are many such cases. Given the amount of attention to detail and fixes for specific issues have gone into citeproc-js and thus the test suite, I'd be surprised if there were fewer than 50 such tests that cover issues not made explicit in the specifications and I think we actually have some language somewhere that says the tests are right where the specification doesn't cover something. Translating these into specifications would definitely be welcome (and something Rintze and I have talked about for a while), but it's a massive effort and is probably best be done piecemeal, lest it delays a spec update by another couple of years. For the specific issue, though, I don't believe the specification even addresses prefix/suffix on cites. We could decide to change that (although for this particular case I'm not certain we have to*), but unless we do, the correct interpretation is that the CSL spec sees this as a downstream design implementation and citeproc-js made a specific call.
|
Indeed, there's been recent discussion of just this. |
If CSL is defined by the I want to generate citations and bibliographic entries from tagged data. Hand coding every possible format is impractical. Having some kind of abstraction seems like a much better approach. CSL was suggested so I started doing a little implementation work, just to get a sense for how much effort was involved and whether any sort of incremental approach would be useful. If the challenge is "implement the spec", that might be hard work and might require clarifications in the spec (there's nothing like a new implementor to reveal where the spec isn't clear). If the challenge is "implement a processor that's compatible with the output of If the spec doesn't completely and independently define the language, then it's not a spec, it's a collection of notes that go with the reference implementation. I'm not saying that's bad or wrong or not what you intended, but it's not a solid foundation for new implementations. |
The specification is a complete specification of the syntax of CSL styles, i.e. (mostly) exhaustive for people interesting in coding styles. To facilitate implementation we have assumed what was indeed originally the citeproc-js test-suite into the CSL set of repositories, since citeproc-js is the oldest citeproc and the vast majority of CSL styles have been tested against it; so passing the test suite will guarantee expected outcome for existing CSL styles. For now, that's the best we can offer. As per the above, it'd be great to have the spec be complete, but the amount of work that would entail is massive and I'd be (pleasantly) surprised if it could happen in the next 12 month. At a minimum, it would require sustained effort by multiple people beyond labor that's been available to CSL over the last ~5 years. |
Maybe, then, we need explicitly say this in the right places; at the top of the README for the test suite, for example? It does seems like a strategy to address this over time is to first identify those tests that are inconsistent with the spec, so we at least know where the current gaps are, can note them so implementers are aware, and then fix them as we can? |
Yes, that seems right to me. We say something along those lines on the website already: https://citationstyles.org/developers/ and I think similar or even more explicit text in the introduction of the test-suite makes a lot of sense.
That also seems right, though I think "inconsistent" really isn't the right term for a lot of them. "Aren't current addressed in..." makes more sense. I would also at least consider whether this should all go in the same document. A lot of these tests are descriptions of the kind "in situation XYZ, a citeproc should..." -- the current specifications are a document that principally describes "to code a citation style, you should...". As I suggest above, the two aren't designed for the same user group and I'm somewhat concerned that, by making all processor behavior explicit in the specs, they become virtually unusable for style authors (they're intimidating enough for that purpose already). |
Good point. Maybe we need two spec docs.
|
Two observations that I hope will not further complicate the discussion.
|
I completely understand the necessity of making pragmatic choices. However, at the risk of being perceived as argumentative, the behavior of I don't care (I do in fact care, but my personal opinions aren't relevant to this discussion) how this is resolved, but I think there are two choices:
If the corpus is too large to refactor, put a version number on the style element and say that removing adjacent punctuation is manditory on "old" styles and forbidden on "new" ones. |
I think that ultimately producing style-authoring and citeproc-implementing spec documents will be valuable. In the interim, explicitly stating that the test suite results should considered canonical in cases where the spec is ambiguous would address much of this issue. (For other cases where Norman notes CSL-M-specific content in tests, these should be explicitly labeled as such or removed.) |
Yes. Concerning one or two docs: Having two distinct documents is a good idea, but I guess there will be some overlap though. Where the document for style authors will say "do X if you want Y output" the document for citeproc implementers will say "citeprocs are expected to process X in a Y way". I don't know if there is an elegant way to solve this and to keep both documents in sync. |
I think the citeproc spec would be supplemental to the main spec. |
Just for the record: Biblatex has a very nice mechanism for this problem: you add |
Numbers:
We already have a version number on the style element. All current styles are "1.0". So it would be feasible to do a "1.1" release that did this; cleaned up and simplified the currently implicit processing expectations, and so also made clear new styles need to account for duplicate punctuation, etc. We've been discussing new releases recently elsewhere, and we can include this. But if we did, we'd obviously want to identify any other similar examples. I really think the best immediate path on the test-suite is to create a separate subdirectory and start to move tests like this to that, and while doing it, add some metadata to these that tag them as such, maybe with some description. We could start with this one. With 871 tests, this will take time. |
To simplify things for the present, I've added explicit trailing space on all cite prefixes, and explicit leading space on all suffixes that require it. (264f3bf), retaining a copy of the original fixtures in the |
@ndw - does this indeed help, so I can close this? I'll leave the broader #17 issue for addressing your broader point longer term, though at this point most of the work needs to happen in documentation. |
We should discuss spacing behavior, perhaps in a different issue. I think that citeproc-js's behavior here is correct, especially for GUI clients like Zotero or Mendeley, where whether a space if provided at the end of a prefix is often not clear to the user. |
So maybe open an issue on the documentation repo to clarify the spec on this, link to this issue, and then revise the test to confirm to the new spec language? |
I opened a new linked issue. But CSL is not for GUI clients only, so we can't base decisions on that. I have a hunch they can tweak their code to help users here. Whatever we settle on, though, we need to make it explicit in the spec, per the point of this issue. |
I can't find any discussion in the specification of when spaces are added before and/or after prefixes and/or suffixes. Indeed some of the examples in the spec appear to explicitly make spaces part of the prefix and/or suffix. (Apologies if I've overlooked the relevant part of the spec.)
In
affix_PrefixFullCitationTextOnly.txt
, the expected result isBy my naive interpretation of the test, I think the output should be
There is no trailing space in the item prefix.
The text was updated successfully, but these errors were encountered: