Python library to deal with taxonomic IDs and lineages from the NCBI's Taxdump files

CVUA-RRW/taxidtools

CD/CI PyPI - License GitHub release (latest by date) Conda Version Pypi Version Docker Image Version DOI

TaxidTools - A Python Toolkit for Taxonomy

taxidTools is a Python library for working with taxonomy definitions.

Highlights

  • Load taxonomy definitions from the NCBI's taxdump files
  • Prune, filter, and normalize branches
  • Save as JSON for later use
  • Determine consensus, last common ancestor, or distances
  • Retrieve ancestries or list descendants
  • Export as Newick trees
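To illustrate what a Newick export looks like, here is a minimal sketch in plain Python. It does not use taxidTools and is not the library's implementation: the `to_newick` helper and the `TOY_CHILDREN` mapping are hypothetical names, and the tree is a heavily abridged stand-in for real NCBI data.

```python
# Toy parent -> children mapping. Names stand in for taxids and the
# tree is heavily abridged; this is NOT real NCBI data.
TOY_CHILDREN = {
    "Euarchontoglires": ["Primates", "Rodentia"],
    "Primates": ["Homo sapiens"],
    "Rodentia": ["Mus musculus"],
}


def to_newick(children, node):
    """Serialize the subtree rooted at `node` as Newick, without the final ';'."""
    # Unquoted Newick labels cannot contain spaces; underscores stand in for them.
    label = node.replace(" ", "_")
    kids = children.get(node, [])
    if not kids:
        return label  # a leaf is written as its bare label
    # An internal node is written as "(child1,child2,...)label".
    inner = ",".join(to_newick(children, kid) for kid in kids)
    return "(" + inner + ")" + label


newick = to_newick(TOY_CHILDREN, "Euarchontoglires") + ";"
print(newick)
# ((Homo_sapiens)Primates,(Mus_musculus)Rodentia)Euarchontoglires;
```

Internal nodes are labeled here so that named clades survive the round trip. Some Newick consumers ignore or reject internal labels, so whether to emit them depends on the downstream tool.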

Installation

With pip:

pip install taxidtools

With conda:

conda install -c conda-forge taxidtools

With docker:

docker pull gregdenay/taxidtools

Quickstart

With the NCBI's taxdump files installed locally:

>>> import taxidTools
>>> tax = taxidTools.read_taxdump('nodes.dmp', 'rankedlineage.dmp', 'merged.dmp')
>>> tax.getName('9606')
'Homo sapiens'
>>> lineage = tax.getAncestry('9606')
>>> lineage.filter()
>>> [node.name for node in lineage]
['Homo sapiens', 'Homo', 'Hominidae', 'Primates', 'Mammalia', 'Chordata', 'Metazoa']
>>> tax.lca(['9606', '10090']).name
'Euarchontoglires'
>>> tax.distance('9606', '10090')
18
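To make the ancestry, filtering, last common ancestor, and distance results above more concrete, the following sketch reproduces the ideas on a toy tree in plain Python. It is not taxidTools code and makes several assumptions: `TOY_NODES`, `ancestry`, `filter_ranks`, `lca`, and `distance` are hypothetical names, the lineage is heavily abridged, and distance is assumed to mean the number of parent-child edges on the path between two nodes. Because the toy tree skips most intermediate ranks, its distance is much smaller than the 18 reported for the real taxonomy.

```python
# Toy taxonomy: node -> (parent, rank). Names stand in for taxids and the
# lineage is heavily abridged; this is NOT real NCBI data.
TOY_NODES = {
    "root": (None, "no rank"),
    "Metazoa": ("root", "kingdom"),
    "Chordata": ("Metazoa", "phylum"),
    "Mammalia": ("Chordata", "class"),
    "Euarchontoglires": ("Mammalia", "superorder"),
    "Primates": ("Euarchontoglires", "order"),
    "Homo sapiens": ("Primates", "species"),
    "Rodentia": ("Euarchontoglires", "order"),
    "Mus musculus": ("Rodentia", "species"),
}

LINNAEAN_RANKS = {"species", "genus", "family", "order", "class", "phylum", "kingdom"}


def ancestry(node):
    """Return the path from `node` up to the root, starting with `node`."""
    path = []
    while node is not None:
        path.append(node)
        node = TOY_NODES[node][0]  # step to the parent
    return path


def filter_ranks(path, ranks=LINNAEAN_RANKS):
    """Keep only the nodes whose rank is in `ranks`, preserving order."""
    return [n for n in path if TOY_NODES[n][1] in ranks]


def lca(a, b):
    """Return the deepest node that is an ancestor of both `a` and `b`."""
    ancestors_of_a = set(ancestry(a))
    # Walking up from b, the first node shared with a's lineage is the LCA.
    for node in ancestry(b):
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes do not share a root")


def distance(a, b):
    """Count the edges on the path from `a` to `b` through their LCA."""
    common = lca(a, b)
    return ancestry(a).index(common) + ancestry(b).index(common)


print(filter_ranks(ancestry("Homo sapiens")))
# ['Homo sapiens', 'Primates', 'Mammalia', 'Chordata', 'Metazoa']
print(lca("Homo sapiens", "Mus musculus"))       # Euarchontoglires
print(distance("Homo sapiens", "Mus musculus"))  # 4
```

Filtering drops the unranked root and the superorder, which is why a filtered lineage is shorter than the full ancestry.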

Documentation

Full documentation is hosted on the project homepage.

Cite us

If you use taxidTools in your research, you can cite it using the DOI at the top of this page.