Update nfosi jhu dataset #2035

Open
wants to merge 8 commits into
base: master

Conversation

anngvu
Contributor

@anngvu anngvu commented May 22, 2024

What?

Fix #2034

Cancer studies updated in this pull request:

  • nst_nfosi_ntap

Checks

For all pull requests:

  • Passes validation

For a new study (in addition to above):

  • Does study name and study ID follow our convention? e.g. Tumor_Type (Institute, Journal Year); brca_mskcc_2015
  • is study meta data complete? e.g. pmid, group of PUBLIC
  • were all samples profiled with WES/WGS? If not, is gene panel file curated?
  • are oncotree codes of all samples curated; Cancer Type and Cancer Type Detailed need to be added in addition to Oncotree Code
  • clinical sample and patient data with meta files
  • mutations data with meta files
  • MAF is based on hg19
  • MAF with 2 isoforms: uniprot and mskcc
  • CNA data with meta files
  • CNA segment data with meta files
  • Expression data including z-scores with meta files
  • Case-lists for all profiles.
  • Manual checking (Niki or JJ): Triage or private Portal link here

@anngvu
Contributor Author

anngvu commented May 30, 2024

@ritikakundra Help with this?

@ritikakundra
Collaborator

@anngvu is this for private access?

@jaybee84
Contributor

@ritikakundra I am chiming in here on behalf of @anngvu. This is for the public portal. This is the original study, which needs to be updated.

We are hoping to highlight this update/new submission at a conference in June 2024. Would it be possible to add these updates before mid-June?

@Rima-Waleed
Collaborator

Rima-Waleed commented Jun 4, 2024

Hi @jaybee84 @anngvu,

Thank you for the study update and PR! The data looks good but I have a couple of fixes before we can merge:

  1. The paper mentions 55 samples from 23 patients, while the study files have 134 samples from 54 patients. Can you clarify the difference in patient/sample count?
  2. Do all the samples have matched normals? Can this info be added to the sample file? The attribute we use is SOMATIC_STATUS (Matched/Unmatched).
  3. The mRNA expression files have to be renamed to the recommended file names: I assume data_mrna_seq_rpkm and data_mrna_seq_tpm, correct?
  4. It would be helpful if you could add a README file to outline how the data was curated.

I added some attributes required by cBioPortal to the clinical files such as ONCOTREE_CODE, CANCER_TYPE, CANCER_TYPE_DETAILED, & TMB_NONSYNONYMOUS.
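
As a rough illustration of the checks in the list above, here is a minimal sketch for counting patients and samples in a clinical sample file and flagging SOMATIC_STATUS values other than Matched/Unmatched. It assumes a tab-separated file whose metadata header lines start with `#`, followed by a row of column names that includes PATIENT_ID and SAMPLE_ID. The function name `summarize_clinical_samples` and the IDs in the example are hypothetical, not taken from this study or from cBioPortal tooling.

```python
import csv

# Values named in the review comment for the SOMATIC_STATUS attribute.
ALLOWED_SOMATIC_STATUS = {"Matched", "Unmatched"}


def summarize_clinical_samples(text):
    """Count patients/samples and list samples with an unexpected SOMATIC_STATUS.

    Assumption: lines starting with '#' are metadata headers and are skipped;
    the first remaining line holds the column names.
    """
    rows = [line for line in text.splitlines() if line and not line.startswith("#")]
    records = list(csv.DictReader(rows, delimiter="\t"))
    patients = {r["PATIENT_ID"] for r in records}
    samples = {r["SAMPLE_ID"] for r in records}
    invalid = sorted(
        r["SAMPLE_ID"]
        for r in records
        if r.get("SOMATIC_STATUS") not in ALLOWED_SOMATIC_STATUS
    )
    return {
        "patients": len(patients),
        "samples": len(samples),
        "invalid_somatic_status": invalid,
    }


# Hypothetical example content; real files would be read from disk instead.
example_text = "\n".join(
    [
        "#Patient Identifier\tSample Identifier\tSomatic Status",
        "PATIENT_ID\tSAMPLE_ID\tSOMATIC_STATUS",
        "P1\tS1\tMatched",
        "P1\tS2\tUnmatched",
        "P2\tS3\tunknown",
    ]
)
summary = summarize_clinical_samples(example_text)
print(summary)
```

Running this against the real sample file would give the patient and sample counts to compare with the numbers reported in the paper or preprint.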

Looking forward to updating the study!
Thanks!

@jaybee84
Contributor

jaybee84 commented Jun 4, 2024

@Rima-Waleed Thanks for the review.

The paper mentions 55 samples from 23 patients, while the study files have 134 samples from 54 patients. Can you clarify the difference in patient/sample count?

This submission updates the previous cohort by adding additional batches of samples. The new preprint that describes the full cohort is here: https://www.biorxiv.org/content/10.1101/2024.01.23.576977v1. Please add this preprint along with the original publication for reference.

I am tagging @anngvu to address points 2, 3, and 4.

@anngvu
Contributor Author

anngvu commented Jun 5, 2024

@Rima-Waleed Thanks for the notes, we'll update these. I think the SOMATIC_STATUS is new (at least it wasn't pointed out last time).

@anngvu
Contributor Author

anngvu commented Jun 6, 2024

Hmm, I am unable to update the last thing because I get this:
image

It sounds as if the cBioPortal account, rather than ours, needs to purchase bandwidth for LFS?

UPDATE: Actually, it looks like we are the ones who need to update storage according to this:

@Rima-Waleed
Collaborator

Hi @anngvu,
Apologies about that. We have hit the datahub LFS limit this month, and it will reset on the 12th. This is why you're not able to push the study files to datahub. It should work on the 12th, and I'll review the files as soon as they are updated.
Thanks!

@anngvu
Contributor Author

anngvu commented Jun 13, 2024

@Rima-Waleed No worries, I confirmed our side of the bandwidth issue a couple of days ago. Maybe we can find a way to donate bandwidth through GitHub sponsorship? Just an idea -- I'll talk to Ino and Ritika. Anyway, I made the suggested updates with somatic status (and we're only going to keep TPM). Thanks!

@anngvu
Contributor Author

anngvu commented Jun 14, 2024

@Rima-Waleed All files are updated.

@Rima-Waleed
Collaborator

Thank you @anngvu! Doing a final review of the study and sending you the study link soon.

@jaybee84
Contributor

jaybee84 commented Jun 18, 2024

@Rima-Waleed @ritikakundra Just wanted to follow up here to see if it would be possible to update the study on the public portal by today or tomorrow. We are hoping to showcase the study and cBioPortal functionalities at an upcoming international conference (on Friday) if possible. :)

@ritikakundra
Collaborator

@jaybee84 We are just wrapping it up. We will have the link ready by tomorrow morning.

@jaybee84
Contributor

Super! Thanks to the whole cBioPortal team, and especially to @Rima-Waleed and @ritikakundra, for pushing this through! Really appreciate the help.

@Rima-Waleed
Collaborator

Hi @jaybee84, thank you for your patience and collaboration! We have the public link ready and you can access the study using this link: https://www.cbioportal.org/study/summary?id=nst_jhusm_2020

I had to make some edits to make sure the study complies with cBioPortal formats, so please let me know if you have any questions.

Also, this is an easy fix but I just wanted to make sure which paper you want linked to the study: just the preprint, or the original study, or both?

@jaybee84
Contributor

@Rima-Waleed Thanks for making this ready so quickly. A couple of thoughts:

We have the public link ready and you can access the study using this link: https://www.cbioportal.org/study/summary?id=nst_jhusm_2020

Is it possible to update the original study link: https://www.cbioportal.org/study/summary?id=nst_nfosi_ntap instead of making a new one? This is an ongoing project with additional samples from the same biobank. Also, can we add the description that has been included in the Readme of this PR?

this is an easy fix but I just wanted to make sure which paper you want linked to the study: just the preprint, or the original study, or both?

We should include both.

@Rima-Waleed
Collaborator

@jaybee84 thanks for the feedback.

Of course! This link was a test link just to get your approval before updating the original study link. I have added the original description along with the preprint and citation in the description. The study should be imported to the public portal after midnight, and you'll be able to see the updated study here: https://www.cbioportal.org/study/summary?id=nst_nfosi_ntap

@jaybee84
Contributor

Thanks @Rima-Waleed. The updated study looks great. Thanks again for all the help!

@jaybee84
Contributor

Oh! Actually, one more request:
Could we make the title of the study "Nerve Sheath Tumors (Johns Hopkins NF1 Biospecimen Repository, 2024)" instead of "Nerve Sheath Tumors (Johns Hopkins, Sci Data 2020)"? The data currently exceeds the samples described in the Sci Data 2020 paper.

Development

Successfully merging this pull request may close these issues.

Update current study nst_nfosi_ntap with new samples, data
4 participants