scikit-bio is an open-source, BSD-licensed Python 3 package providing data structures, algorithms and educational resources for bioinformatics.
We are excited to announce the resurgence of scikit-bio (https://scikit.bio)! With a re-assembled developer team and funding from the DOE, we're back with renewed vigor in 2024!
Our vision is to expand scikit-bio into a more robust and versatile library, catering to the ever-growing demands of multi-omic data analysis. This resurgence marks a new chapter in which we will focus on:
- Streamlining the analysis of diverse, massive omic data, emphasizing efficiency and versatility.
- Incorporating advanced techniques for multi-omic integration to unravel the complex interplay between biological systems and environments.
- Implementing methods for modeling and annotating biological features utilizing community ecology and phylogenetics.
We invite the scientific community to join us in shaping the future of scikit-bio. Your expertise, feedback, and contributions will be the driving force behind this exciting phase.
Stay tuned for updates, and let's innovate together for a deeper understanding of bio-complexities!
Your questions, ideas, and contributions matter!
- Join our community on GitHub Discussions: This is your go-to place for asking questions, sharing insights, and participating in discussions about scikit-bio. Engage with both the developers and fellow users here.
- Report issues and bugs: If you encounter specific problems when using scikit-bio, let us know directly through the GitHub Issues page. Your reports are vital for the continuous improvement of scikit-bio.
- Wanna contribute? We enthusiastically welcome community contributors! Whether it's adding new features, improving code, or enhancing documentation, your contributions drive scikit-bio and open-source bioinformatics forward. Start your journey by reading the Contributor's guidelines.
Visit the new scikit-bio website: https://scikit.bio to learn more about this project.
Latest release: 0.5.9 (documentation, changelog). Compatible with Python 3.8 and above.
Install the latest release of scikit-bio using conda:
conda install -c conda-forge scikit-bio
Or using pip:
pip install scikit-bio
Verify the installation:
python -m skbio.test
See further instructions on installing scikit-bio on various platforms.
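As a quick complement to the test command above, the following is a minimal sketch (not part of scikit-bio itself) that checks the interpreter against the stated Python 3.8 minimum and reports which scikit-bio version, if any, is installed. It uses only the standard library; the helper names `installed_version` and `python_is_supported` are hypothetical and chosen for illustration.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(version_info=sys.version_info, minimum=(3, 8)):
    """Compare (major, minor) of the interpreter against the minimum version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    # "scikit-bio" is the distribution name used in the pip command above.
    print("Python supported:", python_is_supported())
    print("scikit-bio version:", installed_version("scikit-bio"))
```

If the second line prints `None`, the package is not visible to the current interpreter, which usually means a different environment is active than the one used for installation.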
Some of the projects that we know of that are using scikit-bio are:
- QIIME 2, Qiita, Emperor, tax2tree, ghost-tree, Platypus-Conquistador, An Introduction to Applied Bioinformatics.
scikit-bio is available under the new BSD license. See LICENSE.txt for scikit-bio's license, and the licenses directory for the licenses of third-party software that is (either partially or entirely) distributed with scikit-bio.
Our core development team consists of three lead developers: Dr. Qiyun Zhu at Arizona State University (ASU) (@qiyunzhu), Dr. James Morton at Gutz Analytics (@mortonjt), and Dr. Daniel McDonald at the University of California San Diego (UCSD) (@wasade), one software engineer: Matthew Aton (@mataton) and one bioinformatician: Dr. Lars Hunger (@LarsHunger). Dr. Rob Knight at UCSD (@rob-knight) provides guidance on the development and research. Dr. Greg Caporaso (@gregcaporaso) at Northern Arizona University (NAU), the former leader of the scikit-bio project, serves as an advisor on the current project.
We thank the many contributors to scikit-bio. A complete list of contributors to the scikit-bio codebase is available at GitHub. This list, however, may miss the larger community who contributed by testing the software and providing valuable comments, and to whom we are equally grateful.
The development of scikit-bio is currently supported by the U.S. Department of Energy, Office of Science under award number DE-SC0024320, awarded to Dr. Qiyun Zhu at ASU (lead PI), Dr. James Morton at Gutz Analytics, and Dr. Rob Knight at UCSD.
If you use scikit-bio for any published research, please see our Zenodo page for how to cite.
For collaboration inquiries and other formal communications, please reach out to Dr. Qiyun Zhu at [email protected]. We welcome academic and industrial partnerships to advance our mission.
The logo of scikit-bio was created by Alina Prassas. Vector and bitmap image files are available at the logos directory.
scikit-bio began from code derived from PyCogent and QIIME, and the contributors and/or copyright holders have agreed to make the code they wrote for PyCogent and/or QIIME available under the BSD license. The contributors to PyCogent and/or QIIME modules that have been ported to scikit-bio are listed below:
- Rob Knight (@rob-knight), Gavin Huttley (@gavinhuttley), Daniel McDonald (@wasade), Micah Hamady, Antonio Gonzalez (@antgonza), Sandra Smit, Greg Caporaso (@gregcaporaso), Jai Ram Rideout (@jairideout), Cathy Lozupone (@clozupone), Mike Robeson (@mikerobeson), Marcin Cieslik, Peter Maxwell, Jeremy Widmann, Zongzhi Liu, Michael Dwan, Logan Knecht (@loganknecht), Andrew Cochran, Jose Carlos Clemente (@cleme), Damien Coy, Levi McCracken, Andrew Butterfield, Will Van Treuren (@wdwvt1), Justin Kuczynski (@justin212k), Jose Antonio Navas Molina (@josenavas), Matthew Wakefield (@genomematt) and Jens Reeder (@jensreeder).