-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix authorsAlpha #11614
Fix authorsAlpha #11614
Conversation
The Output: LaTeX-Source: https://github.com/gi-ev/biblatex-lni/blob/main/basic-test-en.tex If we change something in the behavior, all things need to be aligned. The "ultimate" truth is the Word template at https://gi.de/service/publikationen/lni. But I remember, that also does not cover all cases. |
The current test cases are: static Stream<Arguments> authorsAlpha() {
return Stream.of(
Arguments.of("A+", "Alexander Artemenko and others"),
Arguments.of("A+", "Aachen and others"),
Arguments.of("AB+", "Aachen and Berlin and others"),
Arguments.of("ABC+", "Aachen and Berlin and Chemnitz and others"),
// FIXME: Arguments.of("Aach", "Aachen"),
Arguments.of("AB", "Aachen and Berlin"),
Arguments.of("ABC", "Aachen and Berlin and Chemnitz"),
Arguments.of("ABCD", "Aachen and Berlin and Chemnitz and Düsseldorf"),
Arguments.of("ABC+", "Aachen and Berlin and Chemnitz and Düsseldorf and others"),
Arguments.of("ABC+", "Aachen and Berlin and Chemnitz and Düsseldorf and Essen"),
Arguments.of("ABC+", "Aachen and Berlin and Chemnitz and Düsseldorf and Essen and others"));
} If the test case do not fail at this PR, you found a simpler implementation. |
ABC+ won't work with this one. I had removed the + for DIn 1505-02. |
Examples from the template:
|
After double check with the LNI guidelines, it is OK to remove the +. For proper software engineering:
|
Sorry, I mis-clicked on edit instead of reply.
Meaning - if there is "and others" in the title, what should be done differently? Should it be treated as if we ignore "and others"? First last name of whom? First author? All authors? And when? (is the two letters thing related to/valid only in the case of "and others"?) That is, till three authors, a total of 4 letters would be filled, depending on how many "extra authors" are there after 1. |
I thought, the rules for "exceeding" are clear. Four letters of the first last name. Treat |
A random dude in the net writing something in a comment without trustable sources. - Moreover, his examples lead to different results as shown at #11614 (comment). I think, the pattern the linuxguy said is supported by JabRef too, but I don't find at https://docs.jabref.org/setup/citationkeypatterns |
Side note: Also required for PR-D of CSL4LibreOffice. |
Can you tell me if the following expected outputs are correct/desired?: Case 1: Arguments.of("Al", "Alexander Artemenko and others"),
Arguments.of("Aa", "Aachen and others"),
Arguments.of("Aa", "Aachen and Berlin and others"),
Arguments.of("Aa", "Aachen and Berlin and Chemnitz and others"),
Arguments.of("AB", "Aachen and Berlin"),
Arguments.of("ABC", "Aachen and Berlin and Chemnitz"),
Arguments.of("ABCD", "Aachen and Berlin and Chemnitz and Düsseldorf"),
Arguments.of("Aa", "Aachen and Berlin and Chemnitz and Düsseldorf and others"),
Arguments.of("ABCD", "Aachen and Berlin and Chemnitz and Düsseldorf and Essen"),
Arguments.of("Aa", "Aachen and Berlin and Chemnitz and Düsseldorf and Essen and others")); Case 2: Arguments.of("Al", "Alexander Artemenko and others"),
Arguments.of("Aa", "Aachen and others"),
Arguments.of("AaBe", "Aachen and Berlin and others"),
Arguments.of("AaBeCh", "Aachen and Berlin and Chemnitz and others"),
Arguments.of("AB", "Aachen and Berlin"),
Arguments.of("ABC", "Aachen and Berlin and Chemnitz"),
Arguments.of("ABCD", "Aachen and Berlin and Chemnitz and Düsseldorf"),
Arguments.of("AaBeChDü", "Aachen and Berlin and Chemnitz and Düsseldorf and others"),
Arguments.of("ABCD", "Aachen and Berlin and Chemnitz and Düsseldorf and Essen"),
Arguments.of("AaBeChDü", "Aachen and Berlin and Chemnitz and Düsseldorf and Essen and others")); Case 3: Arguments.of("Al", "Alexander Artemenko and others"),
Arguments.of("Aa", "Aachen and others"),
Arguments.of("AaB", "Aachen and Berlin and others"),
Arguments.of("AaBC", "Aachen and Berlin and Chemnitz and others"),
Arguments.of("AB", "Aachen and Berlin"),
Arguments.of("ABC", "Aachen and Berlin and Chemnitz"),
Arguments.of("ABCD", "Aachen and Berlin and Chemnitz and Düsseldorf"),
Arguments.of("AaBC", "Aachen and Berlin and Chemnitz and Düsseldorf and others"),
Arguments.of("ABCD", "Aachen and Berlin and Chemnitz and Düsseldorf and Essen"),
Arguments.of("AaBC", "Aachen and Berlin and Chemnitz and Düsseldorf and Essen and others")); or anything else? |
In the comment stating requirements, you mentioned two letters of the first last name instead of four (as mentioned here). I would request more clarification with examples for "and others" case with multiple authors (refer the cases above). |
I take #11614 (comment) as ground truth. They use two letters. We follow that. With "exceeding" I meant, it exceeds maximum number of authors. Thus, "and others" means five authors present. Thus, it is case 1. |
Also, the last two examples from the template have:
[Wa14a] and [Wa14b] imply there has to be an alphabetical counter that gets auto-incremented if two instances of authors are exactly the same but the papers are different. CSL uses itemDataProvider, which automatically supplies numbers, which are then taken care of by reference marks. We would need a similar system for this - but also that detects difference only in papers with the same authors. Furthermore, the letter has to be appended after the "Year" part, which is not handled by authorsAlpha. Is this a compulsory requirement for this PR? Or should we just add the other cases to the test case and keep this as an edge case and try to think of it later? |
This is also slightly contradictory, as it does not exceed the maximum number of authors (=4), yet uses the first author's two characters only instead of the first character from all 4. (At least it is well-established for DNI everywhere that there can be 4 authors maximum represented). |
Okay, after some thought, I am skipping these last two cases. Reason: It would also require "future insight" - if the same authors are going to be cited again in future, we have to append an "a" at the end of the first instance right away. Since this will not be possible, the only way is to have targeted editing of the citations as more such entries are cited, which is a hard problem again (auto-refreshment of old citations). |
One by one. Keep focus. Divide and conquer. "Just" fix authorsAlpha. JabRef has this functionality internally already. Check the preferences of the citation key generator. Maybe, it can be reused. Maybe only if a complete bibdatabasecontext is available. Then, the code needs to be refactored. |
@koppor please see if the changes are fine.
(since it did not have any author name in it, I figured by translating, nor commas that would separate a name like Is this OK for now? Also added one-to-one test cases for an auxiliary function I extracted to get last name, it was common for both single author and "and others" case. |
I highlight the current text regarding the branches. I think, points 2 and 3 need to be merged. And refined with the two git commands.
There is no choice. It is always |
My bad, that was unclear. I meant branching from the fork (origin) vs the main repo (upstream). I also saw Loay mention that builds are generated for branches from upstream. |
Summary of last two commits (resolution of failing unit tests):
|
Arguments.of("ABC+", "Aachen and Berlin and Chemnitz and Düsseldorf and Essen"), | ||
Arguments.of("ABC+", "Aachen and Berlin and Chemnitz and Düsseldorf and Essen and others")); | ||
Arguments.of("Aa", "Aachen and Berlin and Chemnitz and Düsseldorf and others"), | ||
Arguments.of("ABCD", "Aachen and Berlin and Chemnitz and Düsseldorf and Essen"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does exceed 4 authors, why is this not Aa
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per our discussion, we would do that if there is "and others". (Refer Case "1"). Furthermore, this has been established ever since we started talking about DIN. I can do only the first author's two letters of last name in case of exceeding authors but note that everywhere I have looked, in case of exceeding authors (and no "and others"), the abbreviation of first four authors are taken.
You can try citing 10.3945/ajcn.111.023457 using citationmachine as well: https://www.citationmachine.net/bibliographies/a446c586-d900-4583-a249-72d7b9bc0b52
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides the LNI template: #11614 (comment)
We need to setup a minimal latex+bibtex document with authorsalpha with the test cases and see what that returns.
static String getLastName(Author author) { | ||
String[] nameParts = author.getNamePrefixAndFamilyName() | ||
.replaceAll("\\s+", " ") | ||
.trim() | ||
.split(" "); | ||
return nameParts[nameParts.length - 1]; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't you use org.jabref.model.entry.Author#getFamilyName
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
@subhramit It is so great to see that now two persons (you and me) are in the context of Can you add a test for institutions? In BibTeX, they are written as I am curious whether the thing works there, too. |
It is good enough for now. What I understood:
What I miss:
Reason for the last point: We exchange information at multiple different places (which makes it scattered), but the core knowledge is not condensed. What if someone looks into that in one year - and if none of us both is available? --> Documentaiton enables asynchronous work. Follow-up TODOs:
|
static String getLastName(Author author) { | ||
String[] nameParts = author.getNamePrefixAndFamilyName() | ||
.replaceAll("\\s+", " ") | ||
.trim() | ||
.split(" "); | ||
return nameParts[nameParts.length - 1]; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code hint - if you encounter a similar case in the future, there is a more intuitive code IMHO:
https://www.w3schools.com/java/ref_string_lastindexof.asp
int pos = string.lastIndexOf(" ");
String result = string.substr(pos +1);
I have handled the case of institutions now. |
if (lastName.startsWith("{") && lastName.endsWith("}")) { | ||
alphaStyle.append(getOrganizationInitials(lastName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rewrite using org.jabref.logic.formatter.bibtexfields.RemoveEnclosingBracesFormatter
- format the string using that formatter
- if result is not equal -> you found an organization
- Pass that string to
getOrganizationInitials
- the first two lines can be removed then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to be better than before 😅.
Follow-ups as discussed.
Where is the issue? There was no link in the PR description here - and at the issue tracker, I only found JabRef/user-documentation#453 |
The issue is mentioned in the PR description:
I am/was not aware of any other existing issues around this. We started brainstorming when the requirement for adding alphanumeric style support in CSL came up (in the above pull/issue). |
Fix authorsAlpha
[Auxiliary PR, required for the GSoC '24 CSL4LibreOffice project]
Fix for the function
authorsAlpha
inBracketedPattern
which returns author representation for the purpose of alphanumeric styles.Adapted for the purpose of DIN 1505-02.
Changes:
Investigation ref. #11577 and issue subhramit#17
References:
https://forums.zotero.org/discussion/12239/style-request-lecture-notes-in-informatics-lni-from-the-germany-computer-science-society
https://forums.zotero.org/discussion/82660/format-din-1505-2-alphanumeric-german-standard-superseded-by-iso-690
https://forums.zotero.org/discussion/82631/problem-with-citation-style-din-1505-2-alphanumeric-german-and-word-integration
Template attached to first link: ISSS-word-template.docx
Mandatory checks
CHANGELOG.md
described in a way that is understandable for the average user (if applicable)