Self-omics: A Self-supervised Learning Framework for Multi-omics Cancer Data

Published in Pacific Symposium on Biocomputing © 2022 World Scientific Publishing Co., Singapore, https://psb.stanford.edu/

Paper: https://psb.stanford.edu/psb-online/proceedings/psb23/hashim.pdf

Architecture

[Figure: pretext_arch]

Create environment

To create and activate a conda environment from the provided environment file, run:

conda env create -f environment.yml
conda activate self-omics

Prepare data

  1. Download the data from the UCSC Xena Data Portal
  2. Rename the gene expression data to A.tsv, the DNA methylation data to B.tsv, and the miRNA expression data to C.tsv
  3. Place the files in the data folder
  4. (Optional) Run the cells in notebooks/preprocessing.ipynb to convert the .tsv files to .npy files. This speeds up data loading and alleviates memory issues.
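
For readers who want to see what the optional conversion step amounts to, the following is a minimal sketch and not the code in notebooks/preprocessing.ipynb. It assumes each .tsv file is a features-by-samples matrix whose first row holds sample IDs and whose first column holds feature IDs. The function name tsv_to_npy, the output names A.npy, B.npy and C.npy, and the handling of "NA" cells are illustrative assumptions.

```python
import csv
from pathlib import Path

import numpy as np


def tsv_to_npy(tsv_path, npy_path, dtype=np.float32):
    """Convert a features-by-samples TSV matrix to a .npy file.

    Assumption: the first row holds sample IDs and the first column holds
    feature IDs. Empty or "NA" cells are stored as NaN.
    """
    feature_ids, rows = [], []
    with open(tsv_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        sample_ids = header[1:]  # drop the label of the feature-ID column
        for record in reader:
            if not record:
                continue  # skip blank lines
            feature_ids.append(record[0])
            rows.append(
                [float(v) if v not in ("", "NA") else float("nan") for v in record[1:]]
            )
    values = np.array(rows, dtype=dtype)
    np.save(npy_path, values)
    return values, feature_ids, sample_ids


if __name__ == "__main__":
    data_dir = Path("data")
    for name in ("A", "B", "C"):  # file names from the steps above
        source = data_dir / f"{name}.tsv"
        if source.exists():
            # Output names (A.npy, ...) are hypothetical, not taken from the notebook.
            tsv_to_npy(source, data_dir / f"{name}.npy")
```

A saved array can later be opened with np.load(path, mmap_mode="r"), which maps the file into memory instead of reading it all at once.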

Steps to run the code

  1. Clone this repository: git clone https://github.com/hashimsayed0/self-omics.git
  2. Change directory to this project folder: cd self-omics
  3. Edit scripts/train.sh as needed and run the script: sh ./scripts/train.sh
  4. Once you log in to wandb, logs will be uploaded there, and models will be saved in the checkpoints folder
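
Before launching training, it can help to confirm that the inputs named in the steps above are in place. The sketch below is a hypothetical pre-flight check and not part of the repository. It assumes it is run from the project root and that the .tsv files are still in the data folder; the function name missing_inputs is illustrative.

```python
from pathlib import Path

# Paths taken from the steps above, relative to the project root.
REQUIRED_PATHS = (
    "data/A.tsv",        # gene expression
    "data/B.tsv",        # DNA methylation
    "data/C.tsv",        # miRNA expression
    "scripts/train.sh",  # training script
)


def missing_inputs(root="."):
    """Return the required paths that do not exist under root."""
    root = Path(root)
    return [path for path in REQUIRED_PATHS if not (root / path).exists()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```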

Acknowledgments

Code for a few functions and networks was taken from the repository OmiEmbed and modified as needed.
