Open-source software pipeline for cancer classification from high-throughput data using machine learning.

HelikarLab/CancerDiscover

CancerDiscover

A data mining suite for cancer classification

CancerDiscover is an open-source command-line pipeline (released under the GNU General Public License v3) that allows users to efficiently and automatically process large high-throughput datasets: it converts raw data (for example, CEL files), normalizes it, and selects the best-performing features using multiple feature selection algorithms. The pipeline lets users apply different feature thresholds and various learning algorithms to generate multiple prediction models that distinguish different types and subtypes of cancer.
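To make the workflow described above more concrete, here is a minimal, self-contained Python sketch of the same general idea: normalize the data, rank features, then build one model per feature threshold and compare them. It is not CancerDiscover's code and does not use its scripts or file formats. The synthetic data, the class labels ("tumor"/"normal"), the mean-difference ranking score, the nearest-centroid classifier, and every function name are assumptions chosen for illustration only.

```python
import random
import statistics

def make_data(rng, n_samples=100, n_features=20, n_informative=3):
    """Synthetic stand-in for expression data (hypothetical; not real samples)."""
    rows, labels = [], []
    for i in range(n_samples):
        label = "tumor" if i % 2 else "normal"
        shift = 4.0 if label == "tumor" else 0.0
        # Only the first n_informative features differ between the classes.
        rows.append([rng.gauss(shift if j < n_informative else 0.0, 1.0)
                     for j in range(n_features)])
        labels.append(label)
    return rows, labels

def zscore_columns(rows):
    """Scale every feature (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # avoid dividing by zero
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def rank_features(rows, labels):
    """Rank feature indices by absolute difference of the two class means."""
    first, second = sorted(set(labels))
    scores = []
    for j in range(len(rows[0])):
        a = [r[j] for r, y in zip(rows, labels) if y == first]
        b = [r[j] for r, y in zip(rows, labels) if y == second]
        scores.append(abs(statistics.fmean(a) - statistics.fmean(b)))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

def nearest_centroid_accuracy(train, train_y, test, test_y, features):
    """Fit class centroids on the chosen features and score the test samples."""
    centroids = {}
    for c in set(train_y):
        members = [r for r, y in zip(train, train_y) if y == c]
        centroids[c] = [statistics.fmean([r[j] for r in members]) for j in features]
    correct = 0
    for r, y in zip(test, test_y):
        pred = min(centroids, key=lambda c: sum(
            (r[j] - m) ** 2 for j, m in zip(features, centroids[c])))
        correct += pred == y
    return correct / len(test)

rng = random.Random(0)
rows, labels = make_data(rng)
# Simplification: the whole dataset is normalized before the train/test split.
rows = zscore_columns(rows)
train, test = rows[:60], rows[60:]
train_y, test_y = labels[:60], labels[60:]

ranking = rank_features(train, train_y)
# One model per feature threshold (top-k ranked features).
results = {k: nearest_centroid_accuracy(train, train_y, test, test_y, ranking[:k])
           for k in (1, 3, 10, 20)}
for k, acc in results.items():
    print(f"top {k:2d} features: accuracy {acc:.2f}")
```

Comparing the accuracies across thresholds mirrors, in miniature, how generating several models lets you pick the feature set and learner that perform best on your data.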

Cite: If you use our tool, please cite Mohammed, A., Biegert, G., Adamec, J., & Helikar, T. (2018). CancerDiscover: an integrative pipeline for cancer biomarker and cancer class prediction from high-throughput sequencing data. Oncotarget, 9(2), 2565–2573. (https://doi.org/10.18632/oncotarget.23511)

Note: CancerDiscover is open-source software. If you run into bugs or errors, please raise an issue in the HelikarLab/CancerDiscover repository.

Table of Contents

This README serves as a guide to using this software tool. We suggest reading through the whole document to get an idea of the available options and how to customize the pipeline to fit your needs.

System Requirements

You will need a current or very recent version of a supported operating system: Linux or macOS.

Downloading CancerDiscover and Dependencies

Linux:
curl -sL bit.do/installation_linux | sh

macOS:
curl -sL bit.do/installation_mac | sh

To install CancerDiscover dependencies right from scratch, check out our exhaustive guides:

Directory Structure of the Pipeline

Execution of Pipeline

Contribution

Dr. Akram Mohammed

Dr. Tomas Helikar (PI)

Dr. Jiri Adamec

Greyson Biegert

License

This software has been released under the GNU General Public License v3.
