BIPM requested fixes 10: "Milton J" should be "Milton M" #230
The fetched value is as follows (BIPM Metrologia 56 2 022001):
Ping @andrew2net |
or at most
|
@ronaldtse I've checked the source dataset. It has inconsistencies in person names:
I don't see a straightforward way to parse all the cases correctly. Wouldn't it be a good idea to just concatenate the given name and surname and save them as a full-name string? |
@andrew2net for "source dataset" did you mean the IOP Metrologia XML, which has that article like this? The source XML is: <contrib contrib-type="author" xlink:type="simple">
<contrib-id authenticated="false" contrib-id-type="orcid">0000-0002-8174-2211</contrib-id>
<name name-style="western">
<surname>Milton</surname>
<given-names>Martin J T</given-names>
</name>
<xref ref-type="aff" rid="affiliation01">1</xref>
</contrib> So this means that there is a problem with parsing source XML -- is it this issue? |
@ronaldtse I've updated the parser to solve this issue. This case is not a big problem. To be sure that the update won't cause other problems, I've checked what names there are in the rawdata-bipm-metrologia dataset. It revealed many other problems with name consistency in the dataset. The relaton/relaton-bipm#2 issue is about document duplication. I'll answer your comment in that issue.
|
Can you point out what the inconsistencies are? @MStock78120 and @jmilesBIPM would be interested in finding out the issues with the bibliographic encodings at Metrologia. Thanks. |
Some examples are here in format "given name", "surname"
You can see that name parts can all be in the surname or distributed between surname and given name unpredictably. Affixes may or may not be capitalized. Even if we managed to make rules to parse forename, surname, initials, prefixes, and additions, I doubt that we'd be able to restore the original name from the parts. I think if we use the given-name + surname string as the full name, it will give us the original name. We have FYI since the issue we use branch |
@jmilesBIPM it seems that the Metrologia source data is the culprit -- can we ask them to correct the names in this data? |
The online version of the article in question (https://doi.org/10.1088/1681-7575/ab0013) displays the full name (given name plus surname) of all the authors correctly, e.g. "Martin J T Milton". Isn't it possible for you just to display these two fields as provided in the XML files? I see in the first message from @anermina that the XML gave <name>
<forename language="en" script="Latn" initial="J">Martin</forename>
<forename initial="T"/>
<surname language="en" script="Latn">Milton</surname>
</name> This is admittedly more long-winded than one might expect, but the answer seems to be: forename, plus the initial given with that forename, plus the initial given separately, plus surname = Martin J T Milton |
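The reading described in the comment above can be sketched in a few lines. This is only an illustration, not the actual relaton-bipm code: the function name `display_name` is hypothetical, and the rule (each forename's text, then its `initial` attribute, then the surname) is an assumption taken from that comment rather than from any schema documentation.

```python
import xml.etree.ElementTree as ET

# The <name> element quoted in the comment above.
RELATON_NAME = """<name>
  <forename language="en" script="Latn" initial="J">Martin</forename>
  <forename initial="T"/>
  <surname language="en" script="Latn">Milton</surname>
</name>"""


def display_name(name_xml: str) -> str:
    """Join forename text, forename initials, and surname in document order.

    Hypothetical helper: the joining rule is an assumption based on the
    thread, not a documented behaviour of any library.
    """
    name = ET.fromstring(name_xml)
    parts = []
    for forename in name.findall("forename"):
        text = (forename.text or "").strip()
        if text:
            parts.append(text)  # e.g. "Martin"
        initial = forename.get("initial")
        if initial:
            parts.append(initial)  # e.g. "J", then "T"
    surname = (name.findtext("surname") or "").strip()
    if surname:
        parts.append(surname)
    return " ".join(parts)


print(display_name(RELATON_NAME))
```

For the element quoted above, this yields "Martin J T Milton", matching the name displayed on the article page.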
@ronaldtse why don't we just save "given_name" + "surname" as a "fullname"? I don't think it's possible to parse all the names correctly. |
In that case we'd have "Martin T Milton" here? Close enough! |
@jmilesBIPM in the source we have <name name-style="western">
<surname>Milton</surname>
<given-names>Martin J T</given-names>
</name> So "given-names" + "surname" will be "Martin J T Milton". @ronaldtse I just noticed that there is a "name-style" attribute. Maybe we can use it to parse name parts correctly. |
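A minimal sketch of the concatenation proposed above, applied to the source `<name>` element as quoted. The function name `full_name` is hypothetical, and the whitespace normalisation is an added assumption; it does not try to split initials, prefixes, or additions, which is the point of the proposal.

```python
import xml.etree.ElementTree as ET

# The source <name> element as quoted in the comment above.
JATS_NAME = """<name name-style="western">
  <surname>Milton</surname>
  <given-names>Martin J T</given-names>
</name>"""


def full_name(name_xml: str) -> str:
    """Return "<given-names> <surname>" as one string.

    Hypothetical helper: collapses runs of whitespace and skips whichever
    part is missing, without interpreting the name parts any further.
    """
    name = ET.fromstring(name_xml)
    given = " ".join((name.findtext("given-names") or "").split())
    surname = " ".join((name.findtext("surname") or "").split())
    return " ".join(part for part in (given, surname) if part)


print(full_name(JATS_NAME))
```

Because nothing is parsed out of the two fields, inconsistently split names in the dataset still round-trip to whatever the source XML contained.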
There is only the "western" name style. |
From Michael Stock:
The very last reference (no. 112 in the English text, no. 111 in the French text) has a typo: it should be Milton M not Milton J.