Skip to content

sckott/pytaxize

Folders and files

NameName
Last commit message
Last commit date

Latest commit

dc4a2ed · Sep 11, 2020
Jul 14, 2020
Jul 21, 2020
Sep 11, 2020
Jul 21, 2020
Jul 10, 2020
Jul 14, 2020
Aug 10, 2019
Jun 15, 2020
Feb 6, 2020
Jun 15, 2020
Aug 8, 2019
Jul 14, 2020
Jun 15, 2020
Jul 13, 2020
Jul 13, 2020
Jul 13, 2020

Repository files navigation

pytaxize

pypi travis coverage black

This is a port of the R package taxize. There is a lot going on in the R version of this library, so it will take a while to get all the same functionality over here.

Why? A significant advantage of a Python version of taxize will be for those that are pythonistas at heart. Also, you could use pytaxize in a web app, whereas you could with taxize (e.g., in a Shiny app), but it wouldn't scale, be very fast, etc.

python 2/3

pytaxize is only developed in and tested with Python 3

Installation

Stable from pypi

pip install pytaxize

Development version

sudo pip install git+git://github.com/sckott/pytaxize.git#egg=pytaxize

Taxonomic Ids

I've started working on a class interface for taxonomic IDs, which will have a bunch of extension methods to do various things with taxon ids. What's available right now is just getting COL ids.

from pytaxize import Ids
res = Ids('Poa annua')
res.ncbi()
res.ids
{'Poa annua': [{'id': '93036',
   'name': 'Poa annua',
   'rank': 'species',
   'uri': 'https://www.ncbi.nlm.nih.gov/taxonomy/93036'}]}

Vascan search

import pytaxize
pytaxize.vascan_search(q = ["Helianthus annuus"])
{u'apiVersion': u'0.1',
 u'results': [{u'matches': [{u'canonicalName': u'Helianthus annuus',
     u'distribution': [{u'establishmentMeans': u'introduced',
       u'locality': u'NS',
       u'locationID': u'ISO 3166-2:CA-NS',
       u'occurrenceStatus': u'introduced'},
      {u'establishmentMeans': u'',
       u'locality': u'PE',
       u'locationID': u'ISO 3166-2:CA-PE',
       u'occurrenceStatus': u'excluded'},
      {u'establishmentMeans': u'',
       u'locality': u'NT',
       u'locationID': u'ISO 3166-2:CA-NT',
       u'occurrenceStatus': u'doubtful'},
      {u'establishmentMeans': u'introduced',

Scrape taxonomic names

out = pytaxize.scrapenames(url = 'http://www.mapress.com/zootaxa/2012/f/z03372p265f.pdf')
out['data'][0:3]
[{'verbatim': '(Hemiptera:',
  'scientificName': 'Hemiptera',
  'offsetStart': 222,
  'offsetEnd': 233},
 {'verbatim': 'Sternorrhyncha:',
  'scientificName': 'Sternorrhyncha',
  'offsetStart': 234,
  'offsetEnd': 249},
 {'verbatim': 'Coccoidea:',
  'scientificName': 'Coccoidea',
  'offsetStart': 250,
  'offsetEnd': 260}]

ITIS low level functions

from pytaxize import itis
itis.accepted_names(504239)

{'acceptedName': 'Dasiphora fruticosa',
   'acceptedTsn': '836659',
   'author': '(L.) Rydb.'}
itis.comment_detail(tsn=180543)

[{'commentDetail': 'Status: CITES - Appendix I as U. arctos (Mexico, Bhutan, China, and Mongolia populations) and U. a. isabellinus; otherwise Appendix II. U. S. ESA - Endangered as U. arctos pruinosus, as U. arctos in Mexico, and as U. a. arctos in Italy. Threatened as U. a. ho...',
  'commentId': '18556',
  'commentTimeStamp': '2007-08-20 15:06:38.0',
  'commentator': 'Wilson & Reeder, eds. (2005)',
  'updateDate': '2014-02-03'},
 {'commentDetail': "Comments: Reviewed by Erdbrink (1953), Couturier (1954), Rausch (1963a), Kurtén (1973), Hall (1984) and Pasitschniak-Arts (1993). Ognev (1931) and Allen (1938) recognized U. pruinosus as distinct; not followed by Ellerman and Morrison-Scott (1951), Gao (1987), and Stroganov (1962). Lönnberg (1923b) believed that differences between pruinosus and arctos warranted subgeneric distinction as (Mylarctos) pruinosus; however, this was not supported by Pocock's (1932b) thorough revision. Synonyms allocated a...",
  'commentId': '18557',
  'commentTimeStamp': '2007-08-20 15:06:38.0',
  'commentator': 'Wilson & Reeder, eds. (2005)',
  'updateDate': '2014-02-03'}]
itis.hierarchy_up(tsn = 36485)

{'author': 'Raf.',
 'parentName': 'Asteraceae',
 'parentTsn': '35420',
 'rankName': 'Genus',
 'taxonName': 'Agoseris',
 'tsn': '36485'}

Catalogue of Life

from pytaxize import col
x = col.children(name=["Apis"])
x[0][0:3]
[{'id': '7a4a38c5095963949d6d6ec917d471de',
  'name': 'Apis andreniformis',
  'rank': 'Species'},
 {'id': '39610a4ceff7e5244e334a3fbc5e47e5',
  'name': 'Apis cerana',
  'rank': 'Species'},
 {'id': 'e1d4cbf3872c6c310b7a1c17ddd00ebc',
  'name': 'Apis dorsata',
  'rank': 'Species'}]

Parse names

Parse names using GBIF's parser API

from pytaxize import gbif
gbif.parse(name=['Arrhenatherum elatius var. elatius',
     'Secale cereale subsp. cereale', 'Secale cereale ssp. cereale',
   'Vanessa atalanta (Linnaeus, 1758)'])
[{'scientificName': 'Arrhenatherum elatius var. elatius',
  'type': 'SCIENTIFIC',
  'genusOrAbove': 'Arrhenatherum',
  'specificEpithet': 'elatius',
  'infraSpecificEpithet': 'elatius',
  'parsed': True,
  'parsedPartially': False,
  'canonicalName': 'Arrhenatherum elatius elatius',
  'canonicalNameWithMarker': 'Arrhenatherum elatius var. elatius',
  'canonicalNameComplete': 'Arrhenatherum elatius var. elatius',
  'rankMarker': 'var.'},
 {'scientificName': 'Secale cereale subsp. cereale',
  'type': 'SCIENTIFIC',
  ...

Contributors

Meta