Rendering issue with neat features #2063

Closed
abretaud opened this issue Feb 20, 2019 · 26 comments

@abretaud
Collaborator

Testing Apollo 2.3.1 with NeatHTMLFeatures and NeatCanvasFeatures, I get this rendering issue:

apollo_neat

Here's the track config for UCA:

{
  "maxFeatureSizeForUnderlyingRefSeq": 250000,
  "subfeatureDetailLevel": 2,
  "maxFeatureScreenDensity": 0.5,
  "maxHeight": 600,
  "style": {
    "arrowheadClass": "annot-arrowhead",
    "className": "annot",
    "_defaultHistScale": 4,
    "_defaultLabelScale": 30,
    "_defaultDescriptionScale": 120,
    "minSubfeatureWidth": 1,
    "maxDescriptionLength": 70,
    "showLabels": true,
    "label": "name,id",
    "description": "note, description",
    "centerChildrenVertically": false,
    "renderClassName": "annot-render",
    "subfeatureClasses": {
      "UTR": "annot-UTR",
      "CDS": "annot-CDS",
      "exon": "container-100pct",
      "intron": null,
      "wholeCDS": null,
      "start_codon": null,
      "stop_codon": null,
      "match_part": "darkblue-80pct",
      "non_canonical_three_prime_splice_site": "noncanonical-splice-site",
      "non_canonical_five_prime_splice_site": "noncanonical-splice-site"
    },
    "alternateClasses": {
      "terminator": {
        "renderClassName": "terminator-render annot-apollo",
        "className": "terminator"
      },
      "transposable_element": {
        "renderClassName": "blue-ibeam-render annot-apollo",
        "className": "blue-ibeam"
      },
      "pseudogene": {
        "renderClassName": "gray-center-30pct annot-apollo",
        "className": "light-purple-80pct"
      },
      "snRNA": {
        "renderClassName": "gray-center-30pct annot-apollo",
        "className": "brightgreen-80pct"
      },
      "rRNA": {
        "renderClassName": "gray-center-30pct annot-apollo",
        "className": "brightgreen-80pct"
      },
      "snoRNA": {
        "renderClassName": "gray-center-30pct annot-apollo",
        "className": "brightgreen-80pct"
      },
      "repeat_region": {
        "className": "magenta-80pct"
      },
      "ncRNA": {
        "renderClassName": "gray-center-30pct annot-apollo",
        "className": "brightgreen-80pct"
      },
      "miRNA": {
        "renderClassName": "gray-center-30pct annot-apollo",
        "className": "brightgreen-80pct"
      },
      "tRNA": {
        "renderClassName": "gray-center-30pct annot-apollo",
        "className": "brightgreen-80pct"
      },
      "SNV": {
        "renderClassName": "snv-variant",
        "className": "snv-variant-render"
      },
      "MNV": {
        "renderClassName": "mnv-variant",
        "className": "mnv-variant-render"
      },
      "insertion": {
        "renderClassName": "insertion-variant",
        "className": "insertion-variant-render"
      },
      "deletion": {
        "renderClassName": "deletion-variant",
        "className": "deletion-variant-render"
      }
    },
    "uniqueIdField": "id",
    "centerSubFeature": {
      "non_canonical_three_prime_splice_site": false,
      "non_canonical_five_prime_splice_site": false
    }
  },
  "hooks": {},
  "events": {},
  "menuTemplate": null,
  "noExport": true,
  "pinned": true,
  "autocomplete": "none",
  "key": "User-created Annotations",
  "storeClass": "WebApollo/Store/SeqFeature/ScratchPad",
  "phase": 0,
  "compress": 0,
  "label": "Annotations",
  "type": "WebApollo/View/Track/AnnotTrack",
  "subfeatures": 1,
  "baseUrl": "http://localhost:8500/apollo//1155672999863085291160881064/jbrowse/plugins/WebApollo/json/",
  "metadata": {}
}

And here's the config for the track I dragged models from (I left them untouched in the uca):

{
  "maxFeatureSizeForUnderlyingRefSeq": 250000,
  "subfeatureDetailLevel": 2,
  "maxFeatureScreenDensity": 0.5,
  "maxHeight": "600",
  "style": {
    "arrowheadClass": "webapollo-arrowhead",
    "className": "feature2",
    "_defaultHistScale": 4,
    "_defaultLabelScale": 30,
    "_defaultDescriptionScale": 120,
    "minSubfeatureWidth": 1,
    "maxDescriptionLength": 70,
    "showLabels": true,
    "label": "product,name,id",
    "description": "note,description",
    "centerChildrenVertically": false,
    "renderClassName": "gray-center-30pct annot-apollo",
    "subfeatureClasses": {
      "UTR": "webapollo-UTR",
      "CDS": "webapollo-CDS",
      "exon": "container-100pct",
      "intron": null,
      "wholeCDS": null,
      "start_codon": null,
      "stop_codon": null,
      "match_part": "darkblue-80pct"
    },
    "color": "#a6cee3"
  },
  "hooks": {},
  "events": {},
  "menuTemplate": [
    {
      "label": "View details",
      "title": "{type} {name}",
      "action": "contentDialog",
      "iconClass": "dijitIconTask"
    },
    {
      "iconClass": "dijitIconFilter"
    },
    {},
    {}
  ],
  "trackType": "NeatHTMLFeatures/View/Track/NeatFeatures",
  "topLevelFeatures": "mRNA",
  "overridePlugins": false,
  "urlTemplate": "raw/58e6be8d7034a88d34fec4bb2a578578_0.gff.gz",
  "overrideDraggable": false,
  "label": "58e6be8d7034a88d34fec4bb2a578578_0",
  "type": "WebApollo/View/Track/DraggableNeatHTMLFeatures",
  "storeClass": "JBrowse/Store/SeqFeature/GFF3Tabix",
  "category": "Default",
  "key": "merlin_html_2.gff",
  "baseUrl": "http://localhost:8500/apollo//1155672999863085291160881064/jbrowse/data/",
  "index": 1
}

(I think I've seen it in another issue, but I can't find it... Sorry if it's a duplicate)

@abretaud
Collaborator Author

Just in case, the gff file I loaded in the merlin_html_2.gff track:

##gff-version 3
##sequence-region Merlin 1 172788
Merlin	GeneMark.hmm	gene	752	1039	-339.046618	+	.	ID=Merlin_2;seqid=Merlin
Merlin	GeneMark.hmm	mRNA	752	1039	.	+	.	ID=Merlin_2_mRNA;Parent=Merlin_2;seqid=Merlin
Merlin	GeneMark.hmm	CDS	752	830	.	+	0	ID=Merlin_2_CDS1;Parent=Merlin_2_mRNA;seqid=Merlin
Merlin	GeneMark.hmm	CDS	890	1039	.	+	0	ID=Merlin_2_CDS2;Parent=Merlin_2_mRNA;seqid=Merlin
Merlin	GeneMark.hmm	gene	1067	2011	-1229.683915	-	.	ID=Merlin_3;seqid=Merlin
Merlin	GeneMark.hmm	mRNA	1067	2011	.	-	.	ID=Merlin_3_mRNA;Parent=Merlin_3;seqid=Merlin
Merlin	GeneMark.hmm	CDS	1067	1500	.	-	0	ID=Merlin_3_CDS1;Parent=Merlin_3_mRNA;seqid=Merlin
Merlin	GeneMark.hmm	CDS	1600	1911	.	-	0	ID=Merlin_3_CDS2;Parent=Merlin_3_mRNA;seqid=Merlin
Merlin	GeneMark.hmm	UTR	1912	2011	.	-	0	ID=Merlin_3_UTR;Parent=Merlin_3_mRNA;seqid=Merlin

@nathandunn
Contributor

@abretaud Can you do a couple of things:

  1. Can you right-click on the annotations and display the GFF3 and various FASTA options for the two genes?
  2. That being said, do you have any non-standard configuration options set?

It's possible we aren't properly annotating CDSs, but I thought we would turn them into exons first.

@abretaud
Collaborator Author

  1. Hum, not sure what you mean here :/
  2. I don't think so, though it's still a jbrowse exported from galaxy into my apollo docker image... The apollo-config.groovy is there: https://github.com/abretaud/docker-apollo/blob/bipaa/apollo-config.groovy

In case it helps, here's the html of the first gene in uca:

<div class="feature-label" style="top: 18px; left: 50.4%;">
    <div class="feature-name">Merlin_2-00001</div>
</div>
<div class="feature plus-annot ui-droppable" style="left: 50.4%; top: 0px; width: 57.6%; background-color: transparent; border-width: 0px;" _dijitmenudijit_menu_5="3">
    <div class="plus-annot-arrowhead" style="right: -12px;"></div>
    <div class="subfeature plus-container-100pct" style="left: 0%; width: 100%;">
        <div class="subfeature annot-CDS cds-frame1 neat-subfeature" style="left: 0%; width: 100%;"></div>
    </div>
    <div class="subfeature plus-container-100pct" style="left: 0%; width: 27.4306%;">
        <div class="subfeature annot-CDS cds-frame1 neat-subfeature" style="left: 0%; width: 100%;"></div>
    </div>
    <div class="subfeature plus-container-100pct ui-resizable" style="left: 47.9167%; width: 52.0833%;">
        <div class="subfeature annot-CDS cds-frame0 neat-subfeature" style="left: 0%; width: 100%;"></div>
        <div class="ui-resizable-handle ui-resizable-e" style="z-index: 90;"></div>
        <div class="ui-resizable-handle ui-resizable-w" style="z-index: 90;"></div>
    </div>
    <svg class="jb-intron" viewBox="0 0 100 100" preserveAspectRatio="none" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" style="position:absolute;z-index: 15;left: 27.4306%;width: 20.4861%;height: 100%"><polyline class="neat-intron" points="0,50 50,5 100,50" shape-rendering="optimizeQuality"></polyline></svg>
    <div class="feature-render annot-render"></div>
</div>

@nathandunn
Contributor

  1. I mean what is the exported GFF3 / FASTA of the annotation?

screen shot 2019-02-20 at 8 27 36 am
screen shot 2019-02-20 at 8 27 41 am

  2. I mean in Apollo, there are some CDS options. If you have any relevant options in your apollo-config.groovy it would be useful to see.

@abretaud
Collaborator Author

  1. Ok, here's the exported GFF:
##gff-version 3
##sequence-region Merlin 1 172788
Merlin	.	gene	1067	2011	.	-	.	owner=abretaud@bipaa;ID=c2151e23-dde4-4bd5-bfd5-29d809d4b3ee;date_last_modified=2019-02-20;Name=Merlin_3_mRNA;date_creation=2019-02-20
Merlin	.	mRNA	1067	2011	.	-	.	owner=abretaud@bipaa;Parent=c2151e23-dde4-4bd5-bfd5-29d809d4b3ee;ID=edebd2c1-d755-47d6-9f61-e8cb82548624;date_last_modified=2019-02-20;Name=Merlin_3_mRNA-00002;date_creation=2019-02-20
Merlin	.	exon	1067	1464	.	-	.	Parent=edebd2c1-d755-47d6-9f61-e8cb82548624;ID=85560d2b-8f18-4fa9-a306-84fe04136335;Name=85560d2b-8f18-4fa9-a306-84fe04136335
Merlin	.	exon	1498	2011	.	-	.	Parent=edebd2c1-d755-47d6-9f61-e8cb82548624;ID=5a34afd8-4889-4337-b338-8d2d42f1d170;Name=5a34afd8-4889-4337-b338-8d2d42f1d170
Merlin	.	CDS	1498	2011	.	-	0	Parent=edebd2c1-d755-47d6-9f61-e8cb82548624;ID=edebd2c1-d755-47d6-9f61-e8cb82548624-CDS;Name=edebd2c1-d755-47d6-9f61-e8cb82548624-CDS
Merlin	.	CDS	1067	1464	.	-	2	Parent=edebd2c1-d755-47d6-9f61-e8cb82548624;ID=edebd2c1-d755-47d6-9f61-e8cb82548624-CDS;Name=edebd2c1-d755-47d6-9f61-e8cb82548624-CDS
###
Merlin	.	gene	752	1039	.	+	.	owner=abretaud@bipaa;ID=7d64d66d-32e9-46a0-b7e4-1b056f17b0a7;date_last_modified=2019-02-20;Name=Merlin_2;date_creation=2019-02-20
Merlin	.	mRNA	752	1039	.	+	.	owner=abretaud@bipaa;Parent=7d64d66d-32e9-46a0-b7e4-1b056f17b0a7;ID=58514be3-b539-481b-8a22-9fe91d1275e1;date_last_modified=2019-02-20;Name=Merlin_2-00001;date_creation=2019-02-20
Merlin	.	exon	752	1039	.	+	.	Parent=58514be3-b539-481b-8a22-9fe91d1275e1;ID=4ff428f2-a826-4c69-b5e1-96b3db640ac3;Name=4ff428f2-a826-4c69-b5e1-96b3db640ac3
Merlin	.	CDS	752	1039	.	+	0	Parent=58514be3-b539-481b-8a22-9fe91d1275e1;ID=58514be3-b539-481b-8a22-9fe91d1275e1-CDS;Name=58514be3-b539-481b-8a22-9fe91d1275e1-CDS
###

The exported cds fasta:

>edebd2c1-d755-47d6-9f61-e8cb82548624 (mRNA) 912 residues [Merlin:1067-2011 - strand] [cds] name=Merlin_3_mRNA-00002
ATGCTAACTTTAGATGAATTTAAAAACCAAGCGGGTAATATAGACTTTCAGCGTACTAAT
ATGTTTAGTTGTGTATTTGCAACTACTCCGTCAGCAAAGTCTCAACAATTACTCGATCAA
TTTGGCGGTATGCTCTTTAATAACCTTCCGTTGAATAATGACTGGCTTGGATTAACACAA
GGTGAGTTCACATCAGGACTCACCTCAATTATCACTGCCGGTACTCAACAGCTGGTAAGA
AAGTCTGGTGTATCGAAATATCTTATTGGAGCAATGAGCAATCGTGTTGTTCAGTCTTTA
TTAGGTGAATTTGAAGTCGGAACTTATTTGTTAGACTTCTTTAACATGGCTTATCCGCAA
TCTGGATTGATGATTTATTCGGTCAAAATTCCAGAGAACAGATTGTCTCATGAAATGGAT
TTCAACCATAACTCACCGAATATTAGAATAACTGGACGTGAACTCGATCCGTTAACTATA
TCATTCAGAATGGATCCCGAAGCAAGTAACTATCACCCGGTTACTGGATTGCGAGCATTA
CCAACTGACGTCGAAGCTGACATTCAGGTTAACCTTCATGCTCGAAATGGATTACCTCAT
ACTGTGATAATGTTCACAGGTTGTGTTCCTGTTGCGTGTGGAGCTCCTGAGCTTACATAT
GAAGGAGATAACCAAATTGCGGTTTTCGATGTTACATTTGCTTACAGAGTAATGCAAACG
GGTGCTGTTGGACGTCAAGCTGCTCTTGATTGGATTGAAGATAGAGCTGTTAATTCTATA
ACTGGAATTAATAGTGAAATGTCTCTTAATGGAAGTTTAAGTAGATTATCTAGACTTGGA
GGAGCTGCTGGAGGGTTGTCTCACGTCATTAATTCGACCCGAAACTCTACTTCGAAAATA
CTTGGATTGTAA
>58514be3-b539-481b-8a22-9fe91d1275e1 (mRNA) 288 residues [Merlin:752-1039 + strand] [cds] name=Merlin_2-00001
ATGAAATCAATTTTTCGTATCAACGGTGTAGAAATTGTAGTTGAAGATGTAGTTCCTATG
TCTTATGAATTCAATGAAGTTGTTTTCAAAGAGCTTAAGAAAATTTTAGGCGATAAGAAG
CTTCAAAGTACTCCAATTGGACGTTTTGGAATGAAAGAAAACGTTGATACTTATATTGAA
AGTGTAGTGACAGGGCAGTTAGAAGGTGAATTTTCTGTAGCAGTTCAAACTGTAGAAAAT
GATGAAGTTATTTTAACTTTACCAGCTTTCGTAATTTTCCGCAAATAA
  2. OK, the apollo-config.groovy is still this one: https://github.com/abretaud/docker-apollo/blob/bipaa/apollo-config.groovy

@nathandunn added this to the 2.4.0 milestone Feb 20, 2019
@nathandunn
Contributor

Okay.

  1. You're not passing any options here?

https://github.com/abretaud/docker-apollo/blob/bipaa/apollo-config.groovy#L74-L90

  1. You get the same results on refresh?

  2. Can you click on the individual exons for each?

  3. What happens if you change the CDS to exon in your input GFF3? Apollo will recalculate by default, so it's better (though not essential) if the input is exon. There are some options to override this behavior.

@abretaud
Collaborator Author

  1. Hum, I have this:
        WEBAPOLLO_CDS_FOR_NEW_TRANSCRIPTS: "true"
        WEBAPOLLO_FEATURE_HAS_DBXREFS: "true"
        WEBAPOLLO_FEATURE_HAS_ATTRS: "true"
        WEBAPOLLO_FEATURE_HAS_PUBMED: "true"
        WEBAPOLLO_FEATURE_HAS_GO: "true"
        WEBAPOLLO_FEATURE_HAS_COMMENTS: "true"
        WEBAPOLLO_FEATURE_HAS_STATUS: "true"
  1. yep

  2. yep

  3. I've just tried, it's the same

@nathandunn
Contributor

nathandunn commented Feb 20, 2019

I would remove the WEBAPOLLO_CDS_FOR_NEW_TRANSCRIPTS line.

The default is false. Basically, this tries to use the existing CDS to calculate the new one. By default, Apollo always tries to recalculate the most likely CDS based on the largest ORF. The most common use case for setting it to true is to promote a bunch of existing predicted annotations, preserving their annotations.

If you don't have a good reason for making that true, I wouldn't set it to true.

@abretaud
Collaborator Author

It's the same when removing WEBAPOLLO_CDS_FOR_NEW_TRANSCRIPTS

@nathandunn
Contributor

By "the same", do you mean the problem persisted when you created an annotation again after removing the line (or setting it to false) and redeploying?

If it doesn't take too much time, you might want to try explicitly setting it to false, redeploying, and re-creating the annotations.

I'll try to take a closer look at it this week or early next.

@abretaud
Collaborator Author

Yep, I meant the problem is still there after unsetting the env var (or setting it to false explicitly), redeploying, adding a new organism with my jbrowse instance, and adding genes to the uca.
Thanks for the help!

@nathandunn
Contributor

nathandunn commented Feb 21, 2019 via email

@abretaud
Collaborator Author

I've just sent 2 sample data dirs by email, did you receive them? (sent to your lbl address)
I have to leave now, sorry

@nathandunn
Contributor

nathandunn commented Feb 21, 2019 via email

@nathandunn
Contributor

Interesting... the GFF3 is correct, but for some reason the returned element has two exons (or an exon and a CDS) in separate places:

screen shot 2019-02-25 at 10 17 12 am
screen shot 2019-02-25 at 10 17 05 am

The problem is that the exon has 3 children (?!?): a CDS plus two exons, and those two exons are the original correct ones.

{"track":"Merlin","features":[{"location":{"fmin":751,"fmax":1039,"strand":1},"type":{"cv":{"name":"sequence"},"name":"mRNA"},"name":"Merlin_2","orig_id":"Merlin_2","children":[{"location":{"fmin":751,"fmax":1039,"strand":1},"type":{"cv":{"name":"sequence"},"name":"exon"},"orig_id":"Merlin_2_mRNA","children":[{"location":{"fmin":751,"fmax":1039,"strand":1},"type":{"cv":{"name":"sequence"},"name":"CDS"}},{"location":{"fmin":751,"fmax":830,"strand":1},"type":{"cv":{"name":"sequence"},"name":"exon"},"orig_id":"Merlin_2_CDS1"},{"location":{"fmin":889,"fmax":1039,"strand":1},"type":{"cv":{"name":"sequence"},"name":"exon"},"orig_id":"Merlin_2_CDS2"}]}]}],"operation":"add_transcript","clientToken":"13959384401953500954"}

Viewing it in the JSON viewer, it looks like there is an intermediate sequence layer that shouldn't be there:

screen shot 2019-02-25 at 10 25 28 am

@nathandunn
Contributor

With the other one, it's slightly different, but the evidence is still not properly defined:

image

Basically, it should always go, mRNA -> (exon|CDS) and exons should not generally have subfeatures, though if there is a good argument, I can take a look.

If sequence is something that should be mapped somewhere else, then let me know, but I'm not sure what it would be, since we are generally just looking for exon coordinates, or CDS if exons are unavailable.
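
The expected shape (mRNA -> exon|CDS, with no children under an exon) can be checked mechanically against an add_transcript payload. The following is only a minimal sketch in Python, not Apollo code: the function name `find_nested_exons` is hypothetical, and the payload is a trimmed copy of the JSON quoted above with locations left out.

```python
import json

def find_nested_exons(feature, path=()):
    """Yield (path, child_types) for every exon that carries its own children."""
    ftype = feature.get("type", {}).get("name", "?")
    here = path + (ftype,)
    children = feature.get("children", [])
    if ftype == "exon" and children:
        yield here, [c.get("type", {}).get("name", "?") for c in children]
    # Recurse so that deeper levels are inspected as well
    for child in children:
        yield from find_nested_exons(child, here)

# Trimmed copy of the payload structure quoted above (locations omitted)
payload = json.loads("""
{"features": [{"type": {"name": "mRNA"}, "name": "Merlin_2", "children": [
  {"type": {"name": "exon"}, "orig_id": "Merlin_2_mRNA", "children": [
    {"type": {"name": "CDS"}},
    {"type": {"name": "exon"}, "orig_id": "Merlin_2_CDS1"},
    {"type": {"name": "exon"}, "orig_id": "Merlin_2_CDS2"}]}]}]}
""")

problems = [hit for f in payload["features"] for hit in find_nested_exons(f)]
for path, kids in problems:
    print(" -> ".join(path), "has children:", kids)
```

For the payload above this reports a single offending exon, the one whose orig_id is the mRNA's ID.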

@nathandunn
Contributor

Though, the evidence suggests that what you have is correct when I view details (either the CDS or exon version should have worked), so I'm unsure why we are getting two layers.

Looking at the evidence, the GFF3 is a bit funky:

Merlin  GeneMark.hmm    CDS 752 830 .   +   0   ID=Merlin_2_CDS1;Parent=Merlin_2_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    gene    752 1039    -339.046618 +   .   ID=Merlin_2;seqid=Merlin
Merlin  GeneMark.hmm    mRNA    752 1039    .   +   .   ID=Merlin_2_mRNA;Parent=Merlin_2;seqid=Merlin
Merlin  GeneMark.hmm    CDS 890 1039    .   +   0   ID=Merlin_2_CDS2;Parent=Merlin_2_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    CDS 1067    1500    .   -   0   ID=Merlin_3_CDS1;Parent=Merlin_3_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    gene    1067    2011    -1229.683915    -   .   ID=Merlin_3;seqid=Merlin
Merlin  GeneMark.hmm    mRNA    1067    2011    .   -   .   ID=Merlin_3_mRNA;Parent=Merlin_3;seqid=Merlin
Merlin  GeneMark.hmm    CDS 1600    1911    .   -   0   ID=Merlin_3_CDS2;Parent=Merlin_3_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    UTR 1912    2011    .   -   0   ID=Merlin_3_UTR;Parent=Merlin_3_mRNA;seqid=Merlin

Fixed it here:

Merlin  GeneMark.hmm    gene    752 1039    -339.046618 +   .   ID=Merlin_2;seqid=Merlin
Merlin  GeneMark.hmm    mRNA    752 1039    .   +   .   ID=Merlin_2_mRNA;Parent=Merlin_2;seqid=Merlin
Merlin  GeneMark.hmm    CDS 752 830 .   +   0   ID=Merlin_2_CDS1;Parent=Merlin_2_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    CDS 890 1039    .   +   0   ID=Merlin_2_CDS2;Parent=Merlin_2_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    gene    1067    2011    -1229.683915    -   .   ID=Merlin_3;seqid=Merlin
Merlin  GeneMark.hmm    mRNA    1067    2011    .   -   .   ID=Merlin_3_mRNA;Parent=Merlin_3;seqid=Merlin
Merlin  GeneMark.hmm    CDS 1067    1500    .   -   0   ID=Merlin_3_CDS1;Parent=Merlin_3_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    CDS 1600    1911    .   -   0   ID=Merlin_3_CDS2;Parent=Merlin_3_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    UTR 1912    2011    .   -   0   ID=Merlin_3_UTR;Parent=Merlin_3_mRNA;seqid=Merlin
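
The reordering above can be reproduced with a small script. This is a rough sketch in Python under stated assumptions: the input is tab-separated GFF3, gene and mRNA are the only container types, and the names `TYPE_RANK`, `sort_key` and `reorder` are made up for illustration. Whether record order actually matters to the GFF3Tabix store is not established here.

```python
# Assumption: genes and mRNAs are the only container types in the file
TYPE_RANK = {"gene": 0, "mRNA": 1}

def sort_key(line):
    cols = line.split("\t")
    seqid, ftype = cols[0], cols[2]
    start, end = int(cols[3]), int(cols[4])
    # Same start: longer (enclosing) features first, then containers before leaves
    return (seqid, start, -end, TYPE_RANK.get(ftype, 2))

def reorder(gff_lines):
    """Return header/directive lines first, then records with parents ahead of children."""
    headers = [l for l in gff_lines if l.startswith("#")]
    records = [l for l in gff_lines if l.strip() and not l.startswith("#")]
    return headers + sorted(records, key=sort_key)

# First four records of the "funky" ordering (attributes trimmed)
rows = [
    ("Merlin", "GeneMark.hmm", "CDS", "752", "830", ".", "+", "0", "ID=Merlin_2_CDS1;Parent=Merlin_2_mRNA"),
    ("Merlin", "GeneMark.hmm", "gene", "752", "1039", "-339.046618", "+", ".", "ID=Merlin_2"),
    ("Merlin", "GeneMark.hmm", "mRNA", "752", "1039", ".", "+", ".", "ID=Merlin_2_mRNA;Parent=Merlin_2"),
    ("Merlin", "GeneMark.hmm", "CDS", "890", "1039", ".", "+", "0", "ID=Merlin_2_CDS2;Parent=Merlin_2_mRNA"),
]
funky = ["\t".join(r) for r in rows]
fixed = reorder(funky)
print([line.split("\t")[2] for line in fixed])
```

Records stay sorted by sequence and start coordinate; only ties at the same start are rearranged.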

@nathandunn
Contributor

Actually, I think the sequence cv type should be there. I'm going to go through some regressions to see if I didn't inadvertently introduce a problem.

@nathandunn
Contributor

I redid this with just a straight GFF3 with 2.3.1 and got the same problem:

##gff-version 3
##sequence-region Merlin 1 172788
Merlin  GeneMark.hmm    gene    752 1039    -339.046618 +   .   ID=Merlin_2;seqid=Merlin
Merlin  GeneMark.hmm    mRNA    752 1039    .   +   .   ID=Merlin_2_mRNA;Parent=Merlin_2;seqid=Merlin;Name=bob
Merlin  GeneMark.hmm    CDS 752 830 .   +   0   ID=Merlin_2_CDS1;Parent=Merlin_2_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    CDS 890 1039    .   +   0   ID=Merlin_2_CDS2;Parent=Merlin_2_mRNA;seqid=Merlin


Merlin  GeneMark.hmm    gene    1067    2011    -1229.683915    -   .   ID=Merlin_3;seqid=Merlin
Merlin  GeneMark.hmm    mRNA    1067    2011    .   -   .   ID=Merlin_3_mRNA;Parent=Merlin_3;seqid=Merlin;Name=jenny
Merlin  GeneMark.hmm    CDS 1600    1911    .   -   0   ID=Merlin_3_CDS2;Parent=Merlin_3_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    CDS 1067    1500    .   -   0   ID=Merlin_3_CDS1;Parent=Merlin_3_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    UTR 1912    2011    .   -   0   ID=Merlin_3_UTR;Parent=Merlin_3_mRNA;seqid=Merlin

Testing with a 2.2.0 regression

@nathandunn
Contributor

Same result for 2.2.0... I'm wondering if the issue might be more related to it using the GFF3Tabix store versus the NCList one. This would be good to fix, as I would prefer the native stores.

@nathandunn
Contributor

@abretaud I remember the problem.

The issue is that when using the GFF3Tabix store, it flips out if it has a top-level gene class.

I changed it to be top-level mRNA:

Merlin  GeneMark.hmm    mRNA    752 1039    .   +   .   ID=Merlin_2_mRNA;seqid=Merlin;Name=bob
Merlin  GeneMark.hmm    CDS 752 830 .   +   0   ID=Merlin_2_CDS1;Parent=Merlin_2_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    CDS 890 1039    .   +   0   ID=Merlin_2_CDS2;Parent=Merlin_2_mRNA;seqid=Merlin


Merlin  GeneMark.hmm    mRNA    1067    2011    .   -   .   ID=Merlin_3_mRNA;seqid=Merlin;Name=jenny
Merlin  GeneMark.hmm    CDS 1600    1911    .   -   0   ID=Merlin_3_CDS2;Parent=Merlin_3_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    CDS 1067    1500    .   -   0   ID=Merlin_3_CDS1;Parent=Merlin_3_mRNA;seqid=Merlin
Merlin  GeneMark.hmm    UTR 1912    2011    .   -   0   ID=Merlin_3_UTR;Parent=Merlin_3_mRNA;seqid=Merlin

and it works:

screen shot 2019-02-25 at 11 11 26 am
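
As a stopgap only, the same edit can be scripted. This is a minimal sketch in Python, assuming tab-separated GFF3 lines without trailing newlines and that only `gene` records wrap the mRNAs; `promote_mrna` is a hypothetical helper, not part of Apollo or JBrowse.

```python
def promote_mrna(gff_lines):
    """Drop gene records and remove the Parent attribute from mRNA records."""
    out = []
    for line in gff_lines:
        cols = line.split("\t")
        # Pass through directives, blank lines and anything that is not a 9-column record
        if line.startswith("#") or len(cols) < 9:
            out.append(line)
            continue
        if cols[2] == "gene":
            continue  # the mRNA becomes the top-level feature
        if cols[2] == "mRNA":
            attrs = [a for a in cols[8].split(";") if a and not a.startswith("Parent=")]
            cols[8] = ";".join(attrs)
        out.append("\t".join(cols))
    return out

lines = [
    "##gff-version 3",
    "Merlin\tGeneMark.hmm\tgene\t752\t1039\t-339.046618\t+\t.\tID=Merlin_2;seqid=Merlin",
    "Merlin\tGeneMark.hmm\tmRNA\t752\t1039\t.\t+\t.\tID=Merlin_2_mRNA;Parent=Merlin_2;seqid=Merlin;Name=bob",
    "Merlin\tGeneMark.hmm\tCDS\t752\t830\t.\t+\t0\tID=Merlin_2_CDS1;Parent=Merlin_2_mRNA;seqid=Merlin",
]
result = promote_mrna(lines)
print("\n".join(result))
```

The gene-level score and ID are lost in the process, so this is only useful for confirming the diagnosis.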

@nathandunn
Contributor

This obviously isn't acceptable, as GFF3s should have genes in them. Two solutions:

  1. If topLevelFeatures are defined, then use what is there
  - OR -
  2. If gene (or pseudogene) has subfeatures, we automatically use those unless gene is specified in topLevelFeatures

Anyway, this is critical for 2.4.0
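
To make the second option concrete, here is a rough sketch of the selection rule in Python. It is not the eventual fix: the feature dict shape (`type`, `subfeatures`), the comma-separated parsing of `topLevelFeatures`, and the names `parse_top_level` and `features_to_promote` are all assumptions for illustration.

```python
CONTAINER_TYPES = {"gene", "pseudogene"}

def parse_top_level(value):
    """Turn a topLevelFeatures string such as "mRNA" into a set of type names."""
    if not value:
        return set()
    return {t.strip() for t in value.split(",") if t.strip()}

def features_to_promote(feature, top_level_types=frozenset()):
    """Return the features that should become transcripts when `feature` is dragged."""
    children = feature.get("subfeatures", [])
    explicitly_top = feature["type"] in top_level_types
    # A gene (or pseudogene) with subfeatures is unwrapped unless the track asks for it
    if feature["type"] in CONTAINER_TYPES and children and not explicitly_top:
        return children
    return [feature]

mrna = {"type": "mRNA", "id": "Merlin_2_mRNA", "subfeatures": [{"type": "CDS"}, {"type": "CDS"}]}
gene = {"type": "gene", "id": "Merlin_2", "subfeatures": [mrna]}

print([f["type"] for f in features_to_promote(gene, parse_top_level("mRNA"))])
print([f["type"] for f in features_to_promote(gene, parse_top_level("gene"))])
```

With this rule a track that leaves topLevelFeatures unset still behaves sensibly for gene-wrapped GFF3, which is the main difference from the first option.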

@abretaud
Collaborator Author

Cool, thanks for looking into it! I won't be able to do it until tomorrow, but no problem if you want me to test some patch

@nathandunn
Contributor

Nothing to do today (or tonight) for you. Hopefully I'll have something more working tomorrow so I can finish #2064
