Inconsistencies across intron/exon boundaries #655

cassiemk · 2023-05-23T16:39:47Z

We have a number of variants at the intron/exon or exon/intron boundary for which c_to_p returns no protein change (no var_p). We believe they should be treated as coding, because the splice site and splice region remain completely intact.

In [1]: hgvs_c = "NM_004380.2:c.3251-1dup"
In [2]: var_c = parse(hgvs_c)
In [3]: c_to_p(var_c)
Out[3]: SequenceVariant(ac=NP_004371.2, type=p, posedit=None, gene=None)

In [4]: hgvs_c = "NM_004380.2:c.3250_3250+1insT"
In [5]: var_c = parse(hgvs_c)
In [6]: c_to_p(var_c)
Out[6]: SequenceVariant(ac=NP_004371.2, type=p, posedit=None, gene=None)

In [7]: hgvs_c = "NM_004380.2:c.3251-1_3251insA"
In [8]: var_c = parse(hgvs_c)
In [9]: c_to_p(var_c)
Out[9]: SequenceVariant(ac=NP_004371.2, type=p, posedit=None, gene=None)

Other variants at the same boundary, however, do return a protein change:
In [10]: hgvs_c = "NM_004380.2:c.3251dup"
In [11]: var_c = parse(hgvs_c)
In [12]: c_to_p(var_c)
Out[12]: SequenceVariant(ac=NP_004371.2, type=p, posedit=(Phe1085LeufsTer2), gene=None)

The package seems to decide whether a variant is coding based on the var_c nomenclature (the presence of +/-1 in this case) rather than on biology.
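
To make the observed pattern concrete, here is a minimal, self-contained sketch that sorts the four examples above purely by whether the c. interval carries a +/- offset. The helper `has_intronic_offset` is hypothetical and is not part of the hgvs package; it only mirrors the behavior seen in the outputs above and is not a description of how c_to_p is actually implemented.

```python
import re

# Hypothetical helper, not part of the hgvs package. Matches one c. position
# such as "3251", "3251-1" or "3250+1": a base, then an optional signed offset.
_POS = re.compile(r"(?P<base>[-*]?\d+)(?P<offset>[+-]\d+)?")


def has_intronic_offset(hgvs_c: str) -> bool:
    """Return True if any endpoint of the c. interval carries a +/- offset."""
    # Keep the text after "c.", then cut it off where the edit (dup, ins, ...) starts.
    body = hgvs_c.split(":c.", 1)[1]
    interval = re.match(r"[-*+\d_]+", body).group(0)
    return any(_POS.fullmatch(p).group("offset") for p in interval.split("_"))


examples = [
    "NM_004380.2:c.3251-1dup",
    "NM_004380.2:c.3250_3250+1insT",
    "NM_004380.2:c.3251-1_3251insA",
    "NM_004380.2:c.3251dup",
]
for hgvs_c in examples:
    print(hgvs_c, has_intronic_offset(hgvs_c))
```

The first three names contain an offset and are the ones that came back with posedit=None; the last one has no offset and is the one that got a protein change.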

katiestahl · 2023-09-18T22:05:31Z

@cassiemk, @reece can likely explain this better, but I will give it a shot!

You are correct; the package does return no protein change for converted sequence variants based on the nomenclature when offsets are provided, like in your top 3 examples.

I believe this is working as designed, because we cannot guarantee that every splice site/region will remain intact and unaffected when an intronic variant is present.

I am unsure if there are plans to change this or add edge cases for specific variants where the coding regions are not affected. I will defer to Reece to comment on that.

gostachowiak · 2023-09-19T01:32:39Z

@katiestahl
When there's an insertion right at the intron/exon boundary, there is a choice to make: should the inserted material be treated as part of the coding region (because the canonical splice site, and in fact the entire intron, is intact), or as part of the intron (because it is adjacent to the canonical splice site)?

Currently, the behavior is inconsistent.

The 4 examples from the original issue are all insertions right at the boundary. 3 of them are treated as intronic, and 1 is treated as CDS. The difference seems to be arbitrary, depending only on whether the c. nomenclature includes an intronic position or not. So it appears that a conscious decision has not yet been made.

We have a developer working on updating the logic so that insertions at the boundary are treated as CDS, and we are planning to open a pull request soon, once all of our tests pass. This seems to be the more common choice, and it is the choice that our users seem to expect.

So the immediate task would be to see if we can come to alignment about which decision is most appropriate for insertions right at the boundary. As far as I can tell, HGVS (the society) doesn't have any guidance on this situation (they don't talk much about the right decisions to make for edge cases when projecting DNA changes onto transcripts).

Reasons we think these insertions should be treated as part of the coding region:

- The canonical splice site and the entire intron are fully intact.
- It is very difficult to say what actually happens in the cell; it is certainly context dependent, and that is context we don't have. So the default should be the most "visible" change, i.e. one with an amino acid change, for example so that the variant is not dropped by filters that eliminate non-coding variants.
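
The proposed rule for plain insertions could be sketched as below. This is only an illustration of the policy described in this comment, not the code from the pull request: `CPos` and `ins_region` are hypothetical names, the two flanks are assumed to be adjacent positions inside the CDS range (no UTR handling), and dups would first have to be rewritten as the equivalent insertion.

```python
from typing import NamedTuple


class CPos(NamedTuple):
    """A c. position: hypothetical stand-in for the library's position type."""
    base: int    # nearest exonic c. coordinate
    offset: int  # 0 if exonic; +n / -n for bases inside the intron


def ins_region(left: CPos, right: CPos) -> str:
    """Classify an insertion between two adjacent positions (proposed policy)."""
    if left.offset == 0 and right.offset == 0:
        return "cds"  # both flanks are exonic
    # Insertion between the last exon base and the first intron base.
    exon_intron = left.offset == 0 and right == CPos(left.base, 1)
    # Insertion between the last intron base and the first exon base.
    intron_exon = right.offset == 0 and left == CPos(right.base, -1)
    if exon_intron or intron_exon:
        return "cds"  # the intron is untouched, so count the insertion as coding
    return "intron"


print(ins_region(CPos(3250, 0), CPos(3250, 1)))   # c.3250_3250+1ins
print(ins_region(CPos(3251, -1), CPos(3251, 0)))  # c.3251-1_3251ins
print(ins_region(CPos(3251, -2), CPos(3251, -1))) # one base deeper in the intron
```

With a rule like this, the region no longer depends on which of the equivalent names was used for the insertion, only on where the inserted material lands relative to the boundary.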

github-actions · 2023-12-08T01:57:24Z

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions · 2023-12-16T01:54:46Z

This issue was closed because it has been stalled for 7 days with no activity.

gostachowiak · 2023-12-16T03:13:25Z

Would it be possible to re-open this issue? It describes a real flaw, and there is a PR out to fix it.

github-actions · 2024-03-19T01:49:15Z

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

b0d0nne11 mentioned this issue Oct 26, 2023

Fix variant region for ins and dup on intron-exon boundary #709

Closed

github-actions bot added the stale label Dec 8, 2023

github-actions bot added the closed-by-stale label Dec 16, 2023

github-actions bot closed this as not planned Dec 16, 2023

reece reopened this Dec 18, 2023

github-actions bot removed the closed-by-stale and stale labels Dec 19, 2023

b0d0nne11 mentioned this issue Dec 19, 2023

c_to_p at intron/exon boundary where splice region is preserved #714

Open

github-actions bot added the stale label Mar 19, 2024

jsstevenson added the keep alive label and removed the stale label Mar 19, 2024

holtgrewe mentioned this issue Jul 24, 2024

Port over biocommons/hgvs#709 (Fix variant region for ins and dup on intron-exon boundary) varfish-org/hgvs-rs#193

Open

b0d0nne11 linked a pull request Sep 15, 2024 that will close this issue

Fix variant region for ins and dup on intron-exon boundary #748

Open
