Add PubMed IDs #227
c4789fa to e4a3db6
Incomplete commit.
Doesn't work because ftp.ncbi.nlm.nih.gov has a robots.txt, ugh.
Even though it's empty.
What is the relationship between PMID and PMC? I'm assuming that the PubMed download doesn't include PMC or DOI?
It does! The PubMed download consists of 1000+ baseline XML files and hundreds of update XML files. Each file conforms to the PubMed DTD and consists of a list of articles, each of which can have multiple …
So does this now allow transformation between PMID and PMCID?
Oh, that's awesome! We should alert Translator to this as well.
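To make the PMID/DOI/PMCID relationship discussed above concrete, here is a minimal sketch of pulling those identifiers out of one PubMed XML document. It is not the code from this PR. The element names (`PubmedArticle`, `MedlineCitation/PMID`, `ArticleIdList/ArticleId` with an `IdType` attribute) reflect my understanding of the PubMed DTD and should be checked against the DTD itself; `SAMPLE` and `extract_ids` are hypothetical names, and the sample record is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample record; real files are gzipped and hold many articles.
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID Version="1">12345</PMID>
      <Article>
        <ArticleTitle>An example title</ArticleTitle>
      </Article>
    </MedlineCitation>
    <PubmedData>
      <ArticleIdList>
        <ArticleId IdType="pubmed">12345</ArticleId>
        <ArticleId IdType="doi">10.1000/example</ArticleId>
        <ArticleId IdType="pmc">PMC67890</ArticleId>
      </ArticleIdList>
    </PubmedData>
  </PubmedArticle>
</PubmedArticleSet>"""


def extract_ids(xml_text):
    """Return one dict per article with its PMID, title, DOIs and PMCIDs."""
    root = ET.fromstring(xml_text)
    records = []
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("MedlineCitation/PMID")
        title_el = article.find("MedlineCitation/Article/ArticleTitle")
        # Titles may contain inline markup, so join all nested text.
        title = "".join(title_el.itertext()) if title_el is not None else None
        # Group identifiers by IdType; an article may carry several of a type.
        ids = {}
        for aid in article.findall("PubmedData/ArticleIdList/ArticleId"):
            ids.setdefault(aid.get("IdType"), []).append((aid.text or "").strip())
        records.append({
            "pmid": pmid,
            "title": title,
            "doi": ids.get("doi", []),
            "pmcid": ids.get("pmc", []),
        })
    return records
```

Calling `extract_ids(SAMPLE)` yields a single record whose `doi` and `pmcid` lists can be emitted as mappings for the PMID. For the full baseline files, a streaming parser such as `ET.iterparse` would likely be a better fit than loading each file whole.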
Adds 36,980,104 PubMed IDs (PMIDs) and their titles into NodeNorm (but not into NameRes, since I don't think putting the titles in there matches our use-case). Also includes DOI and PMCID mappings. Closes #204.
To implement this, I also added a recursive option to our FTP download code.
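For readers unfamiliar with recursive FTP retrieval, here is a rough sketch of the idea using Python's standard `ftplib`. It is not the implementation added in this PR. `walk_ftp` is a hypothetical helper, and the sketch assumes the server supports the `MLSD` command, which I have not verified for ftp.ncbi.nlm.nih.gov.

```python
import posixpath


def walk_ftp(ftp, path):
    """Yield the full path of every file under `path`, descending into subdirectories.

    `ftp` is expected to behave like a connected ftplib.FTP instance.
    """
    for name, facts in ftp.mlsd(path):
        # Skip the current/parent directory entries some servers return.
        if name in (".", ".."):
            continue
        full = posixpath.join(path, name)
        kind = facts.get("type")
        if kind == "dir":
            # Recurse into subdirectories.
            yield from walk_ftp(ftp, full)
        elif kind == "file":
            yield full
```

Each yielded path could then be fetched with `ftp.retrbinary(f"RETR {path}", fh.write)` into a local file handle opened in binary mode.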