Add adjacent punctuation modification rules to the spec #81
Comments
I could be wrong, but I don't think a move from affix to delimiter joins can be easily automated. Macros often express different joins depending on the result of conditional branching, and so would need to be disaggregated. Locators are also a thorny problem, since the joining punctuation for them can vary. It would probably need to be done gradually over time, and it would be good to back up the more popular styles with bespoke test suites to protect against regressions. |
I hope you're wrong (but you're probably not!), but we could at least automate simple conversions, and flag more complex examples of styles that need manual work? If so, we should do this sooner rather than later so we can evaluate the results. Are there reasonably simple rules we could look for to identify the issues? |
A generic set of input data and test fixtures that exercise a style through
the most common permutations might be used to turn up punctuation and space
pairs. If duplicates are not that common, they might be controlled well
enough by adjusting the use of affixes, rather than wholesale refactoring.
It's a partial solution, since potential duplicates might still be lurking
in untested forms, but might be less demanding.
On Fri, Jun 5, 2020 at 10:36 PM bdarcus wrote:
I hope you're wrong (but you're probably not!), but we could at least
automate simple conversions, and flag more complex examples of styles that
need manual work?
|
OK. To illustrate what I was initially thinking ... Perhaps one simple rule to check the style files themselves is to look for elements that have a prefix but no suffix, or vice versa? If we look at this example:

<group delimiter=" ">
  <text variable="URL" prefix=" Available from: "/>
  <date variable="accessed" prefix="[accessed " suffix="]" form="text"/>
</group>

That rule as stated above would skip the date element, but would flag the text element. So maybe only look for punctuation in the affixes, or content outside of group, in which case the above would pass, but this would be flagged:

<text variable="title" suffix=", "/>
<text variable="URL" suffix=", "/>

... and could even be converted automatically to:

<group delimiter=", " suffix=", ">
  <text variable="title"/>
  <text variable="URL"/>
</group>

Do you think an approach like that could work, or must one look at the actual output? The advantage, if we could do it, is that it could be integrated into the styles repo CI for changes going forward. I don't think I'll be able to do either, but just so we know the right path or paths. |
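For discussion, here is a rough Python sketch of the kind of static check described above. It is only one possible reading of the rule: it assumes that an affix "looks like joining punctuation" when it contains nothing but spaces and the characters `. , ; :`, and it only flags elements that are not nested inside a `group`. The function name `flag_affix_joins` and the regular expression are made up for illustration, and the sketch works on un-namespaced fragments; real style files use the CSL XML namespace, so tag matching would need adjusting.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: an affix counts as "joining punctuation" when it holds only
# spaces and . , ; : with at least one of those punctuation characters.
JOINING = re.compile(r"^[\s.,;:]*[.,;:][\s.,;:]*$")


def flag_affix_joins(xml_fragment):
    """Return (tag, variable, attribute, value) for each suspicious affix.

    Only elements that are not inside a <group> are checked; <group>
    elements themselves are never flagged.
    """
    # Wrap the fragment so that several sibling elements parse as one tree.
    root = ET.fromstring(f"<root>{xml_fragment}</root>")
    flagged = []

    def walk(node, in_group):
        for child in node:
            if not in_group and child.tag != "group":
                for attr in ("prefix", "suffix"):
                    value = child.get(attr)
                    if value is not None and JOINING.match(value):
                        flagged.append(
                            (child.tag, child.get("variable"), attr, value)
                        )
            walk(child, in_group or child.tag == "group")

    walk(root, False)
    return flagged


chained = '<text variable="title" suffix=", "/><text variable="URL" suffix=", "/>'
for hit in flag_affix_joins(chained):
    print(hit)
```

With these assumptions the first example in the comment passes (everything sits inside a group) and the two chained `text` elements are flagged. A check like this says nothing about the rendered output, so it could only complement output-based tests.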
I’ve done such conversions several times manually because I try to follow a punctuation-in-delimiter-attribute-only policy and the template styles often use affixes for that. My impression is that you’ll often have to intervene manually to get group nesting right and on some occasions you’d have to introduce if/else-blocks and duplicate elements to account for cases of omitted variables that influence delimiter use. That said, I believe that in the long run the delimiter approach, while more verbose, should be the way to go in the future. IMO it reduces the cases of unexpected punctuation clashes. |
I think the punctuation/spacing correction has fairly little to do with style coding practices. That's one case, but there are numerous others. Some punctuation modification absolutely needs to be permitted. For example, many styles place a period after the title. But this period should not be added when the title already ends in punctuation. That should be handled by the processor. For the question of affixes vs. groups, I'm really not sure that most of the changes that would be required could be automated. I've refactored a bunch of styles over the years to use groups instead of chained affixes, and it is usually a fairly bespoke affair. There is also the issue that style-coded affixes and user-entered affixes in citation data are different beasts. We could avoid the need for affix correction in styles through enforcing best practices there, but there isn't a way to prevent, for example, a user from entering extra spaces or omitting necessary spaces in citation affixes. I don't think it would be a good user experience at all to have ...
I suggest we explicitly go in the opposite direction--add the current citeproc-js/pandoc-citeproc modification behavior to the processor schema. |
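To make the title example concrete, here is a minimal Python sketch of that single case. It is not the citeproc-js or pandoc-citeproc implementation; the function name `append_suffix` and the set of terminal characters (`.`, `!`, `?`) are assumptions chosen for illustration.

```python
# Assumption: these characters count as "the title already ends in punctuation".
TERMINAL = ".!?"


def append_suffix(rendered, suffix):
    """Append a style suffix to rendered text.

    If the suffix starts with a period and the rendered text already ends
    in terminal punctuation, the leading period of the suffix is dropped.
    """
    if rendered and suffix.startswith(".") and rendered[-1] in TERMINAL:
        return rendered + suffix[1:]
    return rendered + suffix


print(append_suffix("Who shot JFK?", ". "))
print(append_suffix("A plain title", ". "))
```

The real processors cover many more combinations than this one, which is why writing the behavior down in the spec would be useful.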
Thanks for the thorough explanation.
To be clear, I don't really care. I just presented an emphatic statement,
with the plea for us to decide.
|
PS - advantage of status quo is we could just amend the 1.0 spec to include
the behavior, and be done.
|
@bwiernik I agree with you that punctuation correction is necessary. When I commented, I had already forgotten that that was the initial proposal and concentrated only on moving punctuation from affixes to group delimiters, which I understood to be the consequence. I think it’s a good idea to add the current behavior to the 1.0 spec. In a later version that could perhaps be made available to style authors, so as to allow for styles that don’t merge title-ending punctuation with the following punctuation character. Citavi presents the user with a substitution table for punctuation character combinations in the style editor. |
Let's do this then. Is there existing documentation of the behavior in citeproc-js and/or pandoc-citeproc that we could use as the basis? Could someone who is comfortable with the feature (not me) and with the specification.rst file (also not me) please put together a PR that we could include in the 1.0.2 release? I'll re-title this issue in the meantime to be more narrow. |
For citeproc-js, there are only the processor-specific tests, unfortunately. In addition to suppression of spaces and duplicate punctuation, there are issues around quotation marks and moving punctuation, where decisions must be made about the distribution of various combinations of punctuation marks (comma, period, semicolon, colon, question mark, exclamation point) on either side of a closing quote. |
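As a very reduced illustration of the closing-quote problem, the Python sketch below handles only the case covered by the CSL `punctuation-in-quote` locale option: a comma or period directly after a closing quotation mark is moved inside when the option is on. The function name `place_punctuation` is hypothetical, only the curly closing double quote is handled, and none of the harder combinations (semicolon, colon, question mark, exclamation point) are attempted.

```python
CLOSING_QUOTE = "\u201d"  # right double quotation mark


def place_punctuation(quoted_text, following, punctuation_in_quote):
    """Join quoted text with what follows it.

    When punctuation_in_quote is true and the following text starts with a
    comma or period, that character is moved inside the closing quote.
    """
    if (
        punctuation_in_quote
        and quoted_text.endswith(CLOSING_QUOTE)
        and following[:1] in (".", ",")
    ):
        return quoted_text[:-1] + following[0] + CLOSING_QUOTE + following[1:]
    return quoted_text + following


title = "\u201cTitle\u201d"
print(place_punctuation(title, ", ", True))
print(place_punctuation(title, ", ", False))
```

Documenting the remaining combinations would mean deciding, for each pair of marks, which one survives and on which side of the quote it lands.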
This appears to be the pandoc-citeproc code responsible for this, but alas there are no docstrings on the functions, and I don't find Haskell the easiest language to understand. The citeproc-rs code seems to be in this directory, but it is also undocumented, and I don't understand the code.
Which tests? I've seen specs, BTW, that embed test content within them. Maybe we could start from the test examples, and build the language around them? But to do that, we need to know which tests. |
Picking up on this comment from @fbennett and this reply from @ndw, I propose we document adjacent punctuation modification rules for inclusion in 1.0.2.
explicitly disallow modification of adjacent punctuation in v1.1 of the spec.
As @fbennett indicated, in existing implementations like citeproc-js and pandoc-citeproc:
Current expectations, however, are unspecified in the documentation.
This change would instead be the latter, and so would force styles to be updated to fix what are bugs in the styles.
I believe, though I am not 100% sure, that @fbennett already removed these tests from the test-suite.
I am also unsure whether we can automate updating of styles to fix these problems, though I think it should be possible, perhaps along with the csl-update.xsl file. It might help if we could have a handful of example styles to test with.
@ndw also suggested we add language to the current spec that required modification, to match existing behavior. I am unsure whether we need this, particularly if we release 1.1 this summer and can get the styles updated quickly and easily.
Closes citation-style-language/test-suite#13