Add interactive mode #5

Open
vsoch opened this issue Jun 30, 2020 · 11 comments

@vsoch
Collaborator

vsoch commented Jun 30, 2020

No description provided.

@vsoch vsoch self-assigned this Jun 30, 2020
@yarikoptic
Member

Crazy idea: if it were a bot, it could present those options in an issue (or just as part of the PR description) with

- [ ]

for choices, so we could just select the one we want to add (or not)...
e.g.

- [x] Important one
  - [ ] [orcid1](https://orcid.org/orcid1)  (name, affiliation etc)
  - [ ] ...
- [x] The one we decide to ignore and just uncheck the [x]
  - [ ] [orcid1](https://orcid.org/orcid1)  (name, affiliation etc)
  - [ ] ...

and then watch for a comment such as "@con/tributors process", which would
make it take that marked-up description and apply all the choices.

@vsoch
Collaborator Author

vsoch commented Jul 30, 2020

This is a cool idea, although it would probably need to be a separate action from the main con/tributors one (I'm not sure how one GitHub repo can serve more than one action). To start, wouldn't it be more reasonable to make an actual interactive mode that runs on the command line and asks the user to choose from a list?

In terms of metadata, it's not trivial to just list all the names/affiliations, because the initial request only returns ORCID iDs, and a follow-up request is needed for each one to get its metadata. So if we find 30 results, that means 30 more calls to get the detailed metadata before we can prompt the user. Is that a reasonable thing to do? I think it's just as easy to look at the list and copy-paste the identifiers into a URL to look carefully at the records. In practice, name and affiliation alone aren't enough: a lot of users just have a name, and you need to use papers, etc. to actually figure out the affiliation.
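
For illustration, a command-line prompt of the kind described above could look roughly like this. It is a minimal sketch: the `choose` helper and its `ask` parameter (injectable so the prompt can be tested without a terminal) are hypothetical names, not the tributors interface.

```python
def choose(candidates, ask=input):
    """Prompt until the user picks a candidate by number; empty answer skips."""
    for number, candidate in enumerate(candidates, start=1):
        print(f"[{number}] {candidate}")
    while True:
        answer = ask("Select a number (empty to skip): ").strip()
        if not answer:
            # The user chose not to pick any of the candidates.
            return None
        if answer.isdecimal() and 1 <= int(answer) <= len(candidates):
            return candidates[int(answer) - 1]
        print(f"Please enter a number between 1 and {len(candidates)}.")
```

Returning `None` on an empty answer leaves room for the "none of these" case, which matters when every search hit is a false positive.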

@vsoch
Collaborator Author

vsoch commented Jul 30, 2020

@yarikoptic this won't work because here is the metadata that we get back:

[{'orcid-identifier': {'uri': 'https://orcid.org/0000-0001-9374-7098',
   'path': '0000-0001-9374-7098',
   'host': 'orcid.org'}},
 {'orcid-identifier': {'uri': 'https://orcid.org/0000-0001-9750-2514',
   'path': '0000-0001-9750-2514',
   'host': 'orcid.org'}},
 {'orcid-identifier': {'uri': 'https://orcid.org/0000-0003-3181-8561',
   'path': '0000-0003-3181-8561',
   'host': 'orcid.org'}},
 {'orcid-identifier': {'uri': 'https://orcid.org/0000-0003-0925-2012',
   'path': '0000-0003-0925-2012',
   'host': 'orcid.org'}}]

And the API works for lookups based on metadata, but you can't just get the full metadata for any ORCID iD (it only works for your own). So we could add an interactive mode that asks the user to select an ORCID iD (and they would still need to open the browser to do it), or we could keep it as is and give them the list, and then assume they are skilled at opening a text file and copy-pasting the entry.
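
As a small sketch of the "give them the list" option, the record URLs can be pulled straight out of a response with the shape shown above. The `record_urls` helper is a hypothetical name, and only two of the entries are copied here for brevity.

```python
# Search response in the shape shown above (two entries copied for brevity).
results = [
    {"orcid-identifier": {"uri": "https://orcid.org/0000-0001-9374-7098",
                          "path": "0000-0001-9374-7098",
                          "host": "orcid.org"}},
    {"orcid-identifier": {"uri": "https://orcid.org/0000-0001-9750-2514",
                          "path": "0000-0001-9750-2514",
                          "host": "orcid.org"}},
]

def record_urls(results):
    """Return the ORCID record URL of every search hit, for manual inspection."""
    return [entry["orcid-identifier"]["uri"] for entry in results]

for url in record_urls(results):
    print(url)
```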

@yarikoptic
Member

Right, according to https://orcid.org/organizations/integrators/API it would require the "Basic Member API" to "Search/retrieve member-subscriber data Subject to permissions granted by iD holders", and according to https://orcid.org/about/membership that is "Standard (single legal entity): US$5,150" (minus some discounts), which is quite obnoxious IMHO. I would probably have paid ~$100 to support/use it ... but hey, Dartmouth is a member, just without any contact information listed :-(. I will inquire. If I can get a token I could use, I will see what it returns!

@vsoch
Collaborator Author

vsoch commented Jul 31, 2020

Awesome! Yes please let me know and I can update here to support it, if it's something we could reasonably do.

@yarikoptic
Member

Uff, OK: I looked at https://pub.orcid.org/v3.0/#!/Development_Public_API_v3.0/viewRecordv3 and it seems you do not even need a token to access public records. So, as long as you already have a (candidate) ORCID iD, it seems to be possible to retrieve the entire public record, e.g.

$> curl --silent -X GET --header 'Accept: application/json' 'https://pub.orcid.org/v3.0/0000-0003-3456-2493' | jq . > /tmp/myorcid.json

$> grep email /tmp/myorcid.json
    "verified-email": true,
    "verified-primary-email": true
    "emails": {
      "email": [
          "email": "[email protected]",
      "path": "/0000-0003-3456-2493/email"

from which you could display name, affiliation(s), etc. It seems no token is even needed for basic search:

$> curl --silent -X GET --header 'Accept: application/json' 'https://pub.orcid.org/v3.0/expanded-search?q=Vanessa+AND+Sochat' | jq . | head -n 20
{
  "expanded-result": [
    {
      "orcid-id": "0000-0002-4387-3819",
      "given-names": "Vanessa",
      "family-names": "Sochat",
      "credit-name": null,
      "other-name": [
        "Vanessasaurus"
      ],
      "email": [],
      "institution-name": [
        "Stanford University School of Medicine"
      ]
    }
  ],
  "num-found": 1
}
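
The same token-free request can be sketched in Python with only the standard library. The endpoint and the `Accept` header are taken from the curl call above; the helper names `search_url` and `expanded_search` are made up for this example, and error handling is left out.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE = "https://pub.orcid.org/v3.0/expanded-search"

def search_url(*terms):
    """Build the expanded-search URL, joining the terms with AND."""
    query = " AND ".join(terms)
    # urlencode percent-encodes the value and turns spaces into '+'.
    return BASE + "?" + urllib.parse.urlencode({"q": query})

def expanded_search(*terms):
    """Run the search without a token and return the decoded JSON response."""
    request = urllib.request.Request(
        search_url(*terms), headers={"Accept": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `expanded_search("Vanessa", "Sochat")` performs the network request; `search_url("Vanessa", "Sochat")` only builds the URL, which makes it easy to check against the curl command.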

@vsoch
Collaborator Author

vsoch commented Aug 3, 2020

okay so let's say that we do this - and that we get a result of N=400 for some other names query. Then we would do 400 other requests just to show the user a list? :/

@yarikoptic
Member

So why did we need a token at all? It seems to be doing quite a good job for me as well:

$> curl --silent -X GET --header 'Accept: application/json' 'https://pub.orcid.org/v3.0/expanded-search?q=Yaroslav+AND+Halchenko' | jq .             
{
  "expanded-result": [
    {
      "orcid-id": "0000-0003-3456-2493",
      "given-names": "Yaroslav",
      "family-names": "Halchenko",
      "credit-name": null,
      "other-name": [
        "Ярослав Олеговіч Гальченко"
      ],
      "email": [
        "[email protected]"
      ],
      "institution-name": [
        "Center for Open Neuroscience",
        "Dartmouth College",
        "Debian Project",
        "New Jersey Institute of Technology",
        "Rutgers University",
        "University of New Mexico",
        "Vinnytsia State Technical University"
      ]
    }
  ],
  "num-found": 1
}

For Michael Hanke it brings up some false positives etc.; I'm actually not even sure the real one is among them. It's a good example of where showing emails etc. would help to disambiguate.

The question is, though: why did we need to do the whole API token dance? ;)

@yarikoptic
Member

Note: it doesn't tolerate unicode well ;)

$> curl --silent -X GET --header 'Accept: application/json' 'https://pub.orcid.org/v3.0/expanded-search?q=Ярослав+AND+Олеговіч+AND+Гальченко'
<!doctype html><html lang="en"><head><title>HTTP Status 400 – Bad Request</title><style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 400 – Bad Request</h1><hr class="line" /><p><b>Type</b> Exception Report</p><p><b>Message</b> Invalid character found in the request target. The valid characters are defined in RFC 7230 and RFC 3986</p><p><b>Description</b> The server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing).</p><p><b>Exception</b></p><pre>java.lang.IllegalArgumentException: Invalid character found in the request target. The valid characters are defined in RFC 7230 and RFC 3986
	org.apache.coyote.http11.Http11InputBuffer.parseRequestLine(Http11InputBuffer.java:483)
	org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:502)
	org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
	org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:810)
	org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1623)
	org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
	java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
	java.lang.Thread.run(Thread.java:748)
</pre><p><b>Note</b> The full stack trace of the root cause is available in the server logs.</p><hr class="line" /><h3>Apache Tomcat/8.5.50</h3></body></html>% 
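
The 400 above complains about characters that are not valid in a request target, which suggests the raw Cyrillic was sent unencoded. A possible workaround, sketched here under the assumption that the server accepts UTF-8 percent-encoding in the query, is to encode the terms before building the URL. Whether the search then returns a match was not checked here, and `encode_terms` is a hypothetical helper.

```python
from urllib.parse import quote_plus, unquote_plus

def encode_terms(*terms):
    """Join terms with AND and percent-encode them (UTF-8) for use after '?q='."""
    return quote_plus(" AND ".join(terms))

encoded = encode_terms("Ярослав", "Гальченко")
# The encoded query is plain ASCII, so it is a valid request target.
print(encoded)
# Decoding gives back the original query string.
print(unquote_plus(encoded))
```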

@vsoch
Collaborator Author

vsoch commented Aug 3, 2020

Good question - I never got it to work without the token! Let me give that a try.

@vsoch
Collaborator Author

vsoch commented Aug 3, 2020

Yep that works! The difference is that you are using expanded-search and not regular search, which I had never tried. I'll update the current PR to do this, and also add interactive mode since we have the metadata available.
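
Since each `expanded-result` entry already carries name, institutions, and emails, an interactive mode could show one line per candidate without any further requests. A minimal sketch, assuming the field names seen in the responses earlier in this thread; the `describe` helper and its output format are hypothetical.

```python
def describe(result):
    """Format one expanded-search entry as a single line for a selection list."""
    parts = (result.get("given-names"), result.get("family-names"))
    name = " ".join(part for part in parts if part)
    # Lists may be empty or missing, so fall back to an empty list.
    institutions = ", ".join(result.get("institution-name") or [])
    emails = ", ".join(result.get("email") or [])
    line = f"{result['orcid-id']}  {name or '(no name)'}"
    if institutions:
        line += f" | {institutions}"
    if emails:
        line += f" | {emails}"
    return line

# Entry copied from the expanded-search response shown earlier.
example = {
    "orcid-id": "0000-0002-4387-3819",
    "given-names": "Vanessa",
    "family-names": "Sochat",
    "credit-name": None,
    "other-name": ["Vanessasaurus"],
    "email": [],
    "institution-name": ["Stanford University School of Medicine"],
}
print(describe(example))
```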
