Step 1: Link count files with phenotypes

This description applies to each directory. Here I will use thca as an example.

Step 1: Link count files with phenotypes

The convert.sh file prepares the text files necessary to create the count and phenotype Robjects.

First, create 3 files

File with absolute path to each count file

ls /group/stranger-lab/grossman_cancer_data/thca/*.gz > thca_files.txt

File provided by Allison to link GDC UUID with TCGA UUID

thca_htseq.txt

File of phenotype data, 3 columns with TCGA_ID, TCGA_UUID, and Gender

Created by:

grep bcr_patient_barcode All_CDEs.txt | tr '\t' '\n' > barcode.txt
grep bcr_patient_uuid All_CDEs.txt | tr '\t' '\n' > uuid.txt
grep gender All_CDEs.txt | tr '\t' '\n' > gender.txt

paste barcode.txt uuid.txt gender.txt > phens.txt

Run the convert.sh script with these input files

sh convert.sh thca_files.txt thca_htseq.txt phens.txt

This creates the id_conversions.txt file read by the expr.R script.

Step 2 Generate count and phenotype R objects

First, create a file with duplicated IDs

cut -f 1 id_conversions.txt | sort | uniq -d > duplicated.txt

This was used previously to remove individuals with both tumor and normal samples.

Run the expr.R script, which filters out duplicated ids, and sorts both the phen and count Robjects to have the same individuals in the same order.

Step 3 Run the differential expression analysis

The count_DE.R script takes the previously generated count and phenotype Robjects and runs the DESeq2 sex-biased differential expression analysis.

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
kirp		kirp
lgg		lgg
luad		luad
thca		thca
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Step 1: Link count files with phenotypes

Step 2 Generate count and phenotype R objects

Step 3 Run the differential expression analysis

About

Releases

Packages

Languages

cczysz/gdc

Folders and files

Latest commit

History

Repository files navigation

Step 1: Link count files with phenotypes

Step 2 Generate count and phenotype R objects

Step 3 Run the differential expression analysis

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages