Inappropriate hyphen conversion in number
#239
Comments
I treat |
Yeah, that would be best in my opinion. @fbennett @retorquere do you think one of you could make that change in citeproc-js? |
By my count, there are 1299 repo styles that use <text variable="number">,
and 461 that use <number variable="number">, with one overlapping style
that uses both, and 552 styles that don't render the number variable.
With <text variable="number"> as the dominant form, a change in behavior
will affect some users. The en-dash conversion is assuming a range is
intended. You'd want to be sure that assumption is generally incorrect for
this variable, and be prepared for possible support requests if a change
that dumps raw variable content is pushed through to production.
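A tally like the one above could be reproduced along these lines. This is a minimal sketch, not the script actually used for the quoted counts: `classify_style` and `count_styles` are hypothetical names, and the totals will depend on which files are scanned (dependent styles, for instance, render nothing and fall under "neither").

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# CSL's XML namespace; every element in a style file lives in it.
CSL_NS = "{http://purl.org/net/xbiblio/csl}"


def classify_style(xml_source):
    """Return 'text', 'number', 'both', or 'neither' for one style (str or bytes)."""
    root = ET.fromstring(xml_source)

    def renders_number(tag):
        # True if any <tag variable="number"> element appears anywhere in the style.
        return any(el.get("variable") == "number" for el in root.iter(CSL_NS + tag))

    via_text = renders_number("text")
    via_number = renders_number("number")
    if via_text and via_number:
        return "both"
    if via_text:
        return "text"
    if via_number:
        return "number"
    return "neither"


def count_styles(styles_dir):
    """Tally classifications for every *.csl file directly under styles_dir."""
    return Counter(
        classify_style(path.read_bytes()) for path in Path(styles_dir).glob("*.csl")
    )
```

Calling `count_styles` on a local checkout of the styles repository (path assumed) returns a `Counter` keyed by the four labels.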
Looseness in the schema gives processors a tough task by: (a) specifying
some variables as "numeric" type; but (b) allowing numeric variables to
render via a <text> node; and (c) permitting arbitrary text content in
numeric variables.
Given the range of possible inputs, it can be a challenge to find a path
that doesn't lead to unpleasant surprises.
…On Tue, Feb 13, 2024, 1:00 PM Brenton M. Wiernik ***@***.***> wrote:
Yeah, that would be best in my opinion.
|
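Thanks for analyzing that! I know you've put a lot of thought into this over the years, and I really appreciate the inclusion of the override syntax -- and \- in citeproc-js to force or prevent en dash conversion. The override syntax means that there is a path for users to get a different behavior from the default, so the question is what the default should be for number: conversion or not?

I think this is more a question of data usage of the number variable versus style expectations: what is number used for, and how is it presented in databases that import into CSL tools like Zotero?

The most common usage of number in databases is as a verbatim identifier (e.g., report numbers, preprint repository IDs, URNs, etc.). It is rarely used to reflect sequential numbering in databases (e.g., journal article numbers are usually just stored as page unless a user specifically edits their data after import to a client).

Related to that, the most common type used with number in styles in the repository is report, which I think supports the interpretation that most cases of number are identifiers, not sequences.

(Whether number should ironically be a non-number variable is a separate question.)

In sum, if number is primarily used as an identifier, I think defaulting to not transforming hyphens to en dashes and leaving the override syntax active makes sense. |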
I didn't know about this. Does anyone have a sample of this? "pass on verbatim" in the case of BBT also means that |
It would be ideal to see arbitrary database content feed through smoothly
to formatted citations. If treating <text variable="number"> alone as a
text variable would better approximate that result, it would be a step
forward.
I have a couple of other tasks that have been queued up for quite a while,
and I'll have to put my focus there for now. For this, it would be good
(whoever implements the change) to put team Zotero in the loop, since
support requests will likely follow.
As a note to file, a text render of the content would need to account for
escaped hyphenation values on this particular field in existing records.
Also, I'm curious about special handling required for <text
variable="number"> in patent items. Maybe that should be shoehorned in as
well if a change is made.
…On Tue, Feb 13, 2024, 10:05 PM Brenton M. Wiernik ***@***.***> wrote:
Thanks for analyzing that! I know you've put a lot of thought into this
over the years, and I really appreciate the inclusion of the override
syntax -- and \- in citeproc-js to force or prevent en dash conversion.
The override syntax means that there is a path for users to get a different
behavior from the default, so the question is what the default should be
for number: conversion or not?
I think this is more a question of data usage of the number variable
versus style expectations—what is number used for and how is it presented
in databases that import into CSL tools like Zotero?
The most common usage of number in databases is as a verbatim identifier
(eg, report numbers, preprint repository IDs, URNs, etc). It is rarely used
to reflect sequential numbering in databases (eg, journal article numbers
are usually just stored as page unless a user specifically edits their
data after import to a client).
Related to that, the most common type used with number in styles in the
repository is report, which I think supports the interpretation that most
cases of number are identifiers, not sequences.
(Whether number should ironically be a non-number variable is a separate
question.)
In sum, if number is primarily used as an identifier, I think defaulting
to not transforming hyphens to en dashes and leaving the override syntax
active makes sense.
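The proposed default could look roughly like the sketch below. It is an illustration only, not citeproc-js code: `render_number` and its `convert_plain_hyphens` flag are hypothetical, and it assumes (reading the comment's wording in order) that `--` forces an en dash while `\-` keeps a literal hyphen.

```python
import re

EN_DASH = "\u2013"

# Match the escaped hyphen first, then the double hyphen, then a plain hyphen,
# so the override forms are never mistaken for plain hyphens.
_HYPHEN_TOKEN = re.compile(r"\\-|--|-")


def render_number(value, convert_plain_hyphens=False):
    """Render a `number` value, honoring the assumed override syntax.

    convert_plain_hyphens=False models the proposed default (leave plain
    hyphens alone); True models the range-assuming behavior under discussion.
    """

    def replace(match):
        token = match.group(0)
        if token == "\\-":   # assumed: explicit "keep this hyphen"
            return "-"
        if token == "--":    # assumed: explicit "make this an en dash"
            return EN_DASH
        return EN_DASH if convert_plain_hyphens else "-"

    return _HYPHEN_TOKEN.sub(replace, value)
```

With the flag off, an identifier such as 114-11 passes through unchanged, while 114--11 still yields an en dash; with the flag on, existing records that escape the hyphen as 114\-11 still render with a plain hyphen.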
|
If a report number is something like 114-11 (e.g., for a U.S. congressional report) and I render it with <text variable="number"/>, the hyphen is converted to an en dash (or other range delimiter). This is incorrect behavior. I'm thinking the best approach would be to not apply any punctuation conversion to the number field.

What do you think @fbennett @adam3smith @retorquere?
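
To make the report concrete, here is a deliberately naive stand-in for a range-assuming conversion. It is not the processor's actual logic, which is more involved; `naive_range_conversion` is a hypothetical name used only for illustration.

```python
import re


def naive_range_conversion(value):
    # Treat any hyphen flanked by digits as a range delimiter (U+2013 en dash).
    return re.sub(r"(?<=\d)-(?=\d)", "\u2013", value)


# A page range benefits from the conversion...
print(naive_range_conversion("23-27"))
# ...but a report identifier such as 114-11 is altered even though it is not a range.
print(naive_range_conversion("114-11"))
```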