Add reference recommendations to usage docs #1314

Open
wants to merge 13 commits into
base: dev

lazappi commented Jun 11, 2024

Adds a section to the usage docs providing more information and recommendations about reference files. Fixes #1086.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/rnaseq branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

drpatelh and others added 3 commits January 8, 2024 17:22
Dev -> master for 3.14.0 release
Sentence per line to allow easier commenting.

lazappi (Author) commented Jun 11, 2024

I've opened this as a draft as I think it would be good to get more comments etc. before merging. Most of the checklist didn't seem relevant for a docs-only change and I wasn't sure how to edit CHANGELOG.md.

pinin4fjords (Member) left a comment

Good suggestions on content, but much of this is already covered, if at a lower level of detail, in the pre-existing material.

I've suggested some minor improvements to flow etc of your additions, but in general these need to be worked into existing sections to make sure we're not repeating ourselves.

docs/usage.md: 9 review threads (outdated, resolved)

lazappi (Author) commented Jun 12, 2024

> Good suggestions on content, but much of this is already covered, if at a lower level of detail, in the pre-existing material.
>
> I've suggested some minor improvements to flow etc of your additions, but in general these need to be worked into existing sections to make sure we're not repeating ourselves.

This was just a bit of an information dump to start with but I agree it makes sense to combine with what is already there. I guess I overlooked it because it is an "options" section further down and I figured that was more about the technical details of the different arguments and not so much the broader question of which genome/annotation to use. Could we move the combined section closer to the start as it's likely to be relevant to more users than the options sections?

pinin4fjords (Member) commented

> > Good suggestions on content, but much of this is already covered, if at a lower level of detail, in the pre-existing material.
> > I've suggested some minor improvements to flow etc of your additions, but in general these need to be worked into existing sections to make sure we're not repeating ourselves.
>
> This was just a bit of an information dump to start with but I agree it makes sense to combine with what is already there. I guess I overlooked it because it is an "options" section further down and I figured that was more about the technical details of the different arguments and not so much the broader question of which genome/annotation to use. Could we move the combined section closer to the start as it's likely to be relevant to more users than the options sections?

Let's postpone reorganisation for a future PR (@drpatelh may have views), and stick with the new content integration for the moment.

lazappi (Author) commented Jun 14, 2024

@pinin4fjords I've incorporated your suggestions and moved the new content to the existing sections

pinin4fjords (Member) left a comment

Just some more tweaks, then we probably need @drpatelh to approve.

docs/usage.md: 5 review threads (outdated, resolved)
@pinin4fjords pinin4fjords marked this pull request as ready for review June 20, 2024 09:41
@pinin4fjords pinin4fjords added this to the 3.15.0 milestone Jun 20, 2024
drpatelh (Member) commented

Thanks a lot for this, @lazappi! We will include it in the next release!

Would love for @tdanhorn @MatthiasZepper to give this a once-over too since they were involved in discussions in the parent issue.

PS: I also sent you an invite to become a member of the nf-core community on GitHub. Hopefully, future contributions won't have to be approved on PRs.

MatthiasZepper (Member) left a comment

This is a really nice write-up and summary! I think the pipeline will profit greatly from this additional documentation. The only concern that I have is using the toplevel assemblies directly.

On the other hand, the quantification may be good enough and the impact not as big as I presume.

docs/usage.md: 3 review threads (outdated, resolved)
github-actions bot commented Jul 10, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit c6c8bcc

+| ✅ 173 tests passed       |+
#| ❔   9 tests were ignored |#
!| ❗   7 tests had warnings |!

❗ Test warnings:

  • files_exist - File not found: assets/multiqc_config.yml
  • files_exist - File not found: .github/workflows/awstest.yml
  • files_exist - File not found: .github/workflows/awsfulltest.yml
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-07-10 06:05:44

pinin4fjords (Member) left a comment

Just restoring the pre-existing structure. All of this content belongs under the Explicit Reference File Specification section.

More information and links to further resources are [available from Ensembl](https://www.ensembl.org/info/website/upload/gff.html).
:::

### Reference transcriptome
Member

Suggested change
-### Reference transcriptome
+#### Reference transcriptome


We recommend not providing a transcriptome FASTA file and instead allowing the pipeline to create it from the provided genome and annotation. Similar to aligner indexes, you can save the created transcriptome FASTA and BED files to a central location for future pipeline runs. This helps avoid redundant computation and having multiple copies on your system. Ensure that all genome, annotation, transcriptome, and index versions match to maintain consistency.
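A minimal sketch of a first run that follows this recommendation. It is not taken from the PR, and the file paths and the helper name `print_first_run_cmd` are hypothetical. No transcriptome FASTA is supplied, so the pipeline derives it from the genome and annotation, and `--save_reference` asks the pipeline to keep the generated reference files. The function only prints the command so it can be reviewed before running.

```shell
# Hypothetical paths; replace with your own genome and annotation files.
GENOME_FASTA="/refs/GRCh38/genome.fa"
ANNOTATION_GTF="/refs/GRCh38/annotation.gtf"

# Prints (does not run) a first-run command.
# No --transcript_fasta is passed, so the transcriptome is built by the pipeline.
print_first_run_cmd() {
  echo nextflow run nf-core/rnaseq \
    -profile docker \
    --input samplesheet.csv \
    --outdir results \
    --fasta "$GENOME_FASTA" \
    --gtf "$ANNOTATION_GTF" \
    --save_reference
}

print_first_run_cmd
```

After such a run, the saved reference files can be copied to a central location and supplied to later runs, provided the genome, annotation, transcriptome, and index versions all match.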

### Indices
Member

Suggested change
-### Indices
+#### Indices

Remember to note the genome and annotation versions as well as the versions of the software used for indexing, as an index created with one version may not be compatible with other versions.
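One way to keep such a record is a small plain-text manifest stored next to the saved index. The sketch below is an illustration only: the function name `record_reference_versions`, the file name `VERSIONS.txt`, and the example values are all assumptions, not something the pipeline provides or reads.

```shell
# Writes a plain-text manifest next to a saved index.
# Function name, file name, and field names are hypothetical.
record_reference_versions() {
  index_dir="$1"   # directory holding the saved index
  genome="$2"      # genome assembly and version
  annotation="$3"  # annotation source and release
  indexer="$4"     # software and version used to build the index

  mkdir -p "$index_dir"
  {
    printf 'genome=%s\n' "$genome"
    printf 'annotation=%s\n' "$annotation"
    printf 'indexer=%s\n' "$indexer"
    printf 'recorded=%s\n' "$(date +%Y-%m-%d)"
  } > "$index_dir/VERSIONS.txt"
}

# Example call with placeholder values:
record_reference_versions "refs/star_index" "GRCh38" "annotation release X" "aligner vX.Y.Z"
```

Checking this file before reusing an index makes a version mismatch visible before the run starts.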

### GENCODE
Member

Suggested change
-### GENCODE
+#### GENCODE

As well as the standard annotations, GENCODE also provides "basic" annotations, which include only representative transcripts, but we do not recommend using these.

### Prokaryotic genome annotations
Member

Suggested change
-### Prokaryotic genome annotations
+#### Prokaryotic genome annotations

4 participants