Provide a list of TitanCall abbreviations in an easily-accessible form #8

Open
lbeltrame opened this issue Sep 5, 2017 · 9 comments


@lbeltrame
Contributor

lbeltrame commented Sep 5, 2017

As the subject says, having to hunt for a supplementary table in a manuscript to see what an abbreviation means is awkward (and, IIRC, there is no "supplementary table 2" in the TitanCNA paper).

Ideally this should be in the README.md or in the repo (or in the help files).

@gavinha
Owner

gavinha commented Sep 5, 2017

Hi Luca,

Sorry for the inconvenience. It was an oversight on my part not to have put this directly into the R documentation.
I will add these to the documentation. I also plan to port some of the original website material (plus updates) into the GitHub repo wiki. I will try to get to this soon.
For now, the definitions are actually found in Supplementary Table 14 (sorry for the further confusion caused by the incorrect table number).

homozygous deletion (HOMD),  
hemizygous deletion LOH (DLOH),  
copy neutral LOH (NLOH),  
diploid heterozygous (HET),  
amplified LOH (ALOH),  
gain/duplication of 1 allele (GAIN),  
allele-specific copy number amplification (ASCNA),  
balanced copy number amplification (BCNA),  
unbalanced copy number amplification (UBCNA)

I typically use the Copy_Number, MajorCN, MinorCN columns in the *segs.txt files.

Best,
Gavin
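
A minimal sketch of reading those three columns, assuming the `*segs.txt` files are tab-delimited with a header row. Only the names `Copy_Number`, `MajorCN` and `MinorCN` come from the comment above; the `Chromosome` column, the example rows, and the helper name `read_allelic_copy_numbers` are hypothetical and used for illustration only.

```python
import csv
import io

# Made-up rows for illustration; only the three copy number column names
# (Copy_Number, MajorCN, MinorCN) are taken from the thread.
EXAMPLE_SEGS = (
    "Chromosome\tCopy_Number\tMajorCN\tMinorCN\n"
    "1\t2\t1\t1\n"
    "9\t1\t1\t0\n"
    "17\t2\t2\t0\n"
    "8\t4\t3\t1\n"
)


def read_allelic_copy_numbers(handle):
    """Yield (chromosome, total, major, minor, is_loh) for each segment row."""
    for row in csv.DictReader(handle, delimiter="\t"):
        total = int(row["Copy_Number"])
        major = int(row["MajorCN"])
        minor = int(row["MinorCN"])
        # Treat a segment as LOH when no copy of the minor allele remains
        # but at least one copy is still present.
        is_loh = minor == 0 and total > 0
        yield row["Chromosome"], total, major, minor, is_loh


segments = list(read_allelic_copy_numbers(io.StringIO(EXAMPLE_SEGS)))
for segment in segments:
    print(segment)
```

Working from the integer columns rather than the call abbreviations avoids depending on which state numbering a given run used.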

@fpbarthel

fpbarthel commented Dec 5, 2018

What are all the possible values for Corrected_Call? I've noticed that, in addition to the 9 above, there is also an HLAMP class (not mentioned in Supp Table 14 either).

I've also observed states > 20. I figured the table might simply continue in a linear fashion, since there are 2 rows for CN=1, 3 for CN=2, 4 for CN=3, 5 for CN=4, and 6 for CN=5. However, I've observed states 21 and 23 for CN=8?

I've copied Supp Table 14 below:

| titan_state | titan_genotype | titan_total_copy_number | titan_call | titan_call_description |
| --- | --- | --- | --- | --- |
| -1 | NULL | NULL | OUT | Outlier state |
| 0 | NULL | 0 | HOMD | Homozygous deletion |
| 1 | A | 1 | DLOH | Hemizygous deletion LOH |
| 2 | B | 1 | DLOH | Hemizygous deletion LOH |
| 3 | AA | 2 | NLOH | Copy neutral LOH |
| 4 | AB | 2 | HET | Diploid heterozygous |
| 5 | BB | 2 | NLOH | Copy neutral LOH |
| 6 | AAA | 3 | ALOH | Amplified LOH |
| 7 | AAB | 3 | GAIN | Gain/duplication of 1 allele |
| 8 | ABB | 3 | GAIN | Gain/duplication of 1 allele |
| 9 | BBB | 3 | ALOH | Amplified LOH |
| 10 | AAAA | 4 | ALOH | Amplified LOH |
| 11 | AAAB | 4 | ASCNA | Allele-specific copy number amplification |
| 12 | AABB | 4 | BCNA | Balanced copy number amplification |
| 13 | ABBB | 4 | ASCNA | Allele-specific copy number amplification |
| 14 | BBBB | 4 | ALOH | Amplified LOH |
| 15 | AAAAA | 5 | ALOH | Amplified LOH |
| 16 | AAAAB | 5 | ASCNA | Allele-specific copy number amplification |
| 17 | AAABB | 5 | UBCNA | Unbalanced copy number amplification |
| 18 | AABBB | 5 | UBCNA | Unbalanced copy number amplification |
| 19 | ABBBB | 5 | ASCNA | Allele-specific copy number amplification |
| 20 | BBBBB | 5 | ALOH | Amplified LOH |

Here are the (distinct) titan states from two of my TITAN seg output files, which seem to match each other, but not Supplementary Table 14?

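| TITAN_state | TITAN_call | Copy_Number |
| --- | --- | --- |
| 1 | DLOH | 1 |
| 2 | NLOH | 2 |
| 3 | HET | 2 |
| 4 | ALOH | 3 |
| 5 | GAIN | 3 |
| 6 | ALOH | 4 |
| 7 | ASCNA | 4 |
| 8 | BCNA | 4 |
| 9 | ALOH | 5 |
| 12 | ALOH | 6 |
| 21 | ASCNA | 8 |
| 23 | UBCNA | 8 |

| TITAN_state | TITAN_call | Copy_Number |
| --- | --- | --- |
| 0 | HOMD | 0 |
| 1 | DLOH | 1 |
| 2 | NLOH | 2 |
| 3 | HET | 2 |
| 4 | ALOH | 3 |
| 5 | GAIN | 3 |
| 6 | ALOH | 4 |
| 7 | ASCNA | 4 |
| 8 | BCNA | 4 |
| 9 | ALOH | 5 |
| 10 | ASCNA | 5 |
| 11 | UBCNA | 5 |
| 13 | ASCNA | 6 |
| 14 | UBCNA | 6 |
| 15 | BCNA | 6 |
| 17 | ASCNA | 7 |
| 18 | UBCNA | 7 |
| 19 | UBCNA | 7 |
| 20 | ALOH | 8 |
| 21 | ASCNA | 8 |
| 22 | UBCNA | 8 |
| 23 | UBCNA | 8 |
| 24 | BCNA | 8 |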

EDIT: Looking at the code here, it seems this depends on whether the parameter symmetric is set to TRUE or FALSE. In the two examples I shared above the maximum state is 24, so it is probably set to TRUE. Is this a new feature that was added after the paper was released?
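
One way to check which numbering an output file follows is to recompute the copy number that each state index implies and compare it with the reported Copy_Number. The sketch below is not TitanCNA code. It assumes the pattern visible in the tables above: without symmetry there are CN+1 states per copy number (as in Supp Table 14), and with symmetry there are floor(CN/2)+1. The function names are hypothetical.

```python
def copy_number_for_state(state, symmetric):
    """Return the total copy number implied by a state index.

    Assumes states are numbered consecutively from 0, grouped by copy number,
    with CN + 1 states per copy number (asymmetric) or CN // 2 + 1 (symmetric).
    Returns None for the outlier state (-1).
    """
    if state < 0:
        return None
    copy_number = 0
    first_state = 0
    while True:
        n_states = copy_number // 2 + 1 if symmetric else copy_number + 1
        if state < first_state + n_states:
            return copy_number
        first_state += n_states
        copy_number += 1


def consistent(pairs, symmetric):
    """True if every (state, copy_number) pair agrees with the numbering."""
    return all(copy_number_for_state(s, symmetric) == cn for s, cn in pairs)


# (TITAN_state, Copy_Number) pairs from the first seg file listed above.
OBSERVED = [(1, 1), (2, 2), (3, 2), (4, 3), (5, 3), (6, 4),
            (7, 4), (8, 4), (9, 5), (12, 6), (21, 8), (23, 8)]

symmetric_ok = consistent(OBSERVED, symmetric=True)
asymmetric_ok = consistent(OBSERVED, symmetric=False)
print(symmetric_ok, asymmetric_ok)
```

Under these assumptions the observed pairs agree with the symmetric numbering and disagree with the Supp Table 14 numbering from state 2 onward.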

@fpbarthel

Working back from the code, I reconstructed the following version of Supplementary Table 14 for symmetric=TRUE. I'm not sure I got everything correct; @gavinha, could you verify?

It may be useful to add this to the wiki somewhere, where it is easily visible.

| titan_state | titan_genotype | titan_total_copy_number | titan_call | titan_corrected_call | titan_call_description | titan_corrected_call_description |
| --- | --- | --- | --- | --- | --- | --- |
| -1 | NULL | NULL | OUT | OUT | Outlier state | Outlier state |
| 0 | NULL | 0 | HOMD | HOMD | Homozygous deletion | Homozygous deletion |
| 1 | A | 1 | DLOH | DLOH | Hemizygous deletion LOH | Hemizygous deletion LOH |
| 2 | AA | 2 | NLOH | NLOH | Copy neutral LOH | Copy neutral LOH |
| 3 | AB | 2 | HET | HET | Diploid heterozygous | Diploid heterozygous |
| 4 | AAA | 3 | ALOH | ALOH | Amplified LOH | Amplified LOH |
| 5 | AAB | 3 | GAIN | GAIN | Gain/duplication of 1 allele | Gain/duplication of 1 allele |
| 6 | AAAA | 4 | ALOH | ALOH | Amplified LOH | Amplified LOH |
| 7 | AAAB | 4 | ASCNA | ASCNA | Allele-specific copy number amplification | Allele-specific copy number amplification |
| 8 | AABB | 4 | BCNA | BCNA | Balanced copy number amplification | Balanced copy number amplification |
| 9 | AAAAA | 5 | ALOH | ALOH | Amplified LOH | Amplified LOH |
| 10 | AAAAB | 5 | ASCNA | ASCNA | Allele-specific copy number amplification | Allele-specific copy number amplification |
| 11 | AAABB | 5 | UBCNA | UBCNA | Unbalanced copy number amplification | Unbalanced copy number amplification |
| 12 | AAAAAA | 6 | ALOH | ALOH | Amplified LOH | Amplified LOH |
| 13 | AAAAAB | 6 | ASCNA | ASCNA | Allele-specific copy number amplification | Allele-specific copy number amplification |
| 14 | AAAABB | 6 | UBCNA | UBCNA | Unbalanced copy number amplification | Unbalanced copy number amplification |
| 15 | AAABBB | 6 | BCNA | BCNA | Balanced copy number amplification | Balanced copy number amplification |
| 16 | AAAAAAA | 7 | ALOH | ALOH | Amplified LOH | Amplified LOH |
| 17 | AAAAAAB | 7 | ASCNA | ASCNA | Allele-specific copy number amplification | Allele-specific copy number amplification |
| 18 | AAAAABB | 7 | UBCNA | UBCNA | Unbalanced copy number amplification | Unbalanced copy number amplification |
| 19 | AAAABBB | 7 | UBCNA | UBCNA | Unbalanced copy number amplification | Unbalanced copy number amplification |
| 20 | AAAAAAAA | 8 | ALOH | HLAMP | Amplified LOH | High-level amplification |
| 21 | AAAAAAAB | 8 | ASCNA | HLAMP | Allele-specific copy number amplification | High-level amplification |
| 22 | AAAAAABB | 8 | UBCNA | HLAMP | Unbalanced copy number amplification | High-level amplification |
| 23 | AAAAABBB | 8 | UBCNA | HLAMP | Unbalanced copy number amplification | High-level amplification |
| 24 | AAAABBBB | 8 | BCNA | HLAMP | Balanced copy number amplification | High-level amplification |
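
The pattern in the table above can be regenerated programmatically, which makes it easier to extend or compare against other settings. The sketch below is not taken from the TitanCNA source. The call rules are inferred from the tables in this thread, the function names are hypothetical, and the `hlamp_from=8` threshold for the corrected call is an assumption based only on the CN=8 rows above. The outlier state (-1) is not generated.

```python
def call_for(major, minor):
    """Call abbreviation for a (major, minor) allele count pair.

    Rules inferred from the tables in this thread, not from TitanCNA itself.
    """
    total = major + minor
    if total == 0:
        return "HOMD"
    if total == 1:
        return "DLOH"
    if total == 2:
        return "HET" if minor == 1 else "NLOH"
    if minor == 0:
        return "ALOH"
    if total == 3:
        return "GAIN"
    if minor == 1:
        return "ASCNA"
    if major == minor:
        return "BCNA"
    return "UBCNA"


def state_table(max_copy_number, symmetric=True, hlamp_from=8):
    """Return rows of (state, genotype, total_cn, call, corrected_call)."""
    rows = []
    state = 0
    for total in range(max_copy_number + 1):
        # Symmetric numbering keeps one genotype per mirrored pair (AAB, not ABB).
        b_counts = range(total // 2 + 1) if symmetric else range(total + 1)
        for b_count in b_counts:
            a_count = total - b_count
            genotype = ("A" * a_count + "B" * b_count) or "NULL"
            call = call_for(max(a_count, b_count), min(a_count, b_count))
            # Assumption: corrected call becomes HLAMP from hlamp_from copies up.
            corrected = "HLAMP" if total >= hlamp_from else call
            rows.append((state, genotype, total, call, corrected))
            state += 1
    return rows


symmetric_rows = state_table(8, symmetric=True)
asymmetric_rows = state_table(5, symmetric=False)
for row in symmetric_rows:
    print(*row, sep="\t")
```

With `symmetric=False` and a maximum copy number of 5, the same function yields the 21 states (0 to 20) of Supp Table 14. With `symmetric=True` and a maximum of 8, it yields states 0 to 24 as in the reconstructed table.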

@fpbarthel

This is also not documented anywhere, I believe, but males seem to get the NEUT corrected copy number designation for X-chromosomal events.

@gavinha
Owner

gavinha commented Dec 7, 2018

Thanks @fpbarthel

As I probably mentioned earlier in this issue, the table of states went down with the original website.

TitanCNA is usually run with symmetric=TRUE because it simplifies the model and has better runtime.

The chromosome X correction is a new addition and, like many other things, I haven't had time to document it. The idea behind using NEUT by default for chrX is that it is neutral relative to the matched normal. Yes, there is only 1 copy originally, and any copy number changes in the signal should be scaled by half compared to autosomes (which started with 2 copies). It's a subtle detail, but I had to deal with it when analyzing prostate cancer data, so I thought I'd include it in the TitanCNA post-processing.

I hope that makes sense. For the actual formulation, in which using a baseline of 1 copy for chrX in males leads to a scaling of 1/2, see the writeup here:
https://www.cell.com/cell/fulltext/S0092-8674(18)30648-2#secsectitle0075
under the section "Copy number analysis - 10X Genomics WGS data", after step 6 of TITAN.

@fpbarthel

Thank you for the clarification!

@sbamin

sbamin commented Feb 4, 2019

@gavinha,

In cases with ploidy 3 or more, I am seeing GAIN or ASCNA calls (not exceeding a corrected copy number of 4) in regions where canonical tumor suppressors like CDKN2A and PTEN are located. While other cases have the expected HOMD and DLOH calls for these genes, I believe GAIN/ASCNA regions could actually reflect a deletion of one or more alleles following a whole-genome duplication event. Comments?

I did an enrichment analysis to test whether CDKN2A GAIN calls fall in ploidy2 vs ploidy3 cases, compared with CDKN2A HOMD/DLOH calls (equally present in ploidy2 and ploidy3 cases). While 18/19 GAIN calls (CN < 4) are in ploidy3 cases, I think I am missing something here: given the way TitanCNA selects the optimal solution, a case with a majority of GAIN (2+1) calls should fall under the ploidy3 solution anyway, so perhaps I need a different way to test significance, if any.

@gavinha
Owner

gavinha commented Feb 7, 2019

Hi @sbamin

If regions are first deleted and then genome doubled, then (hopefully) the prediction is copy neutral LOH (NLOH) or amplified LOH (ALOH).

The tumor ploidy parameter represents, in practice, the average tumor copy number across the genome. So your interpretation is correct.

I am less clear on what your question is. Do you have example plots that can help illustrate it?

Thanks,
Gavin

@sbamin

sbamin commented Feb 8, 2019

Thanks Gavin,

Indeed, early CDKN2A loss shows LOH or NLOH in most cases, so TitanCNA is calling it correctly. For CDKN2A calls with a GAIN/ASCNA state, I am labelling them as late events if I have reasonable confidence in WGD based on TitanCNA major:minor CN of >5:<1, ploidy of >4, and segmental logR copy number of > 2 on at least 50% of at least half of the canonical chromosomes (the last criterion is based on https://www.biorxiv.org/content/10.1101/415133v2). These are ad hoc rules for now, and I am open to comments.

Samir
