Add new input to multiqc module for use with --rename_samples #5973

pinin4fjords · 2024-07-12T13:38:41Z

Supplying a TSV to MultiQC with --replace-names lets you substitute sample identifiers so that they are consistent across a report.

PR checklist

Closes #XXX

MatthiasZepper · 2024-07-12T14:18:18Z

Given how frequently some form of name cleaning is required, I think it is very useful to have a separate channel to supply the alternative names in a straightforward manner (in contrast to collecting a Nextflow channel with the sample names into a file, referencing that path in a config file and mixing that file into the extra_multiqc_config channel).

However, I think that the --sample_names argument will be used almost as frequently as --rename_samples.

Hence, I wonder if it would be advisable to change the channel from a path channel to a map like [ val(argument), path(config) ]?

Inside the module, instead of

def replace = replace_names ? "--replace-names ${replace_names}" : ''

we would have something along the lines of

def replace = sampleMap ? sampleMap.collect { argument, value -> "--${argument} ${value}"}.join(' ') : ''

which would work on an input channel like

def sampleMap = [
    replace_names: '/path/to/sample_aliases.tsv'
]

In that case, when using the MultiQC module, the pipeline authors could .map the path to the argument they wish to use?

Basically what we already did in the rnaseq pipeline for the extra_args, just this time inside the module?
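A minimal sketch of the proposal above, for illustration only: the helper name buildSampleArgs is hypothetical and not part of the module. It assumes that map keys use underscores, as in the sampleMap example, and converts them to hyphens so that a key such as replace_names yields the --replace-names flag used in the current module code.

```groovy
// Hypothetical helper: turn a map of option name -> file path into a CLI string.
// Keys use underscores (as in the sampleMap example); the flags use hyphens.
def buildSampleArgs(Map sampleMap) {
    // An empty or null map yields no extra arguments.
    if (!sampleMap) {
        return ''
    }
    return sampleMap
        .collect { argument, value -> "--${argument.replace('_', '-')} ${value}" }
        .join(' ')
}

def sampleMap = [
    replace_names: '/path/to/sample_aliases.tsv'
]

println buildSampleArgs(sampleMap)
```

Without the underscore-to-hyphen conversion, the one-liner quoted above would emit --replace_names, which does not match the flag in the existing module line.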

pinin4fjords · 2024-07-12T15:20:21Z

@MatthiasZepper I'm not convinced, that's not a pattern I've seen elsewhere in nf-core. Can you imagine explaining that in the meta.yml? If we need another file input I think we should have another file input- I'll add it now

MatthiasZepper · 2024-07-12T16:25:40Z

> I'm not convinced, that's not a pattern I've seen elsewhere in nf-core.

But maybe it should become a more common pattern? I agree that this requires a larger discussion, e.g. on the #modules channel prior to merging it into such an important module, but I have always felt that the ext.args pattern is hard to wrap your head around, in particular for end users of the pipelines.

I do not recall how often I have explained how to add extra arguments to a tool on various Slack channels (many times) and the extra_star_align_args & Co. parameters are really popular in the rnaseq pipeline - at least I have seen them quite often being used in param files.

Having a key-value map that is consolidated on a module level with the ext.args could be really neat for pipeline developers to support such extra_[...]_arg pipeline parameters without much headache. It would also help to avoid duplication if, e.g., the same module is used in two different subworkflows that represent two routes of a pipeline. It would become possible to mix their specific settings into a default map that applies to both.
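As a rough illustration of mixing route-specific settings into a shared default map, here is a sketch using plain Groovy map addition. All variable names and file names are hypothetical; nothing here is taken from the module or from any pipeline.

```groovy
// Hypothetical defaults shared by both routes of a pipeline.
def defaultArgs = [sample_names: 'default_names.tsv']

// Hypothetical route-specific settings. With Groovy's map '+', entries from
// the right-hand map win when both maps contain the same key.
def routeA = defaultArgs + [replace_names: 'route_a_aliases.tsv']
def routeB = defaultArgs + [sample_names: 'route_b_names.tsv']

println routeA
println routeB
```

The '+' operator returns a new map, so the shared defaults stay unchanged and can be reused by both routes.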

> Can you imagine explaining that in the meta.yml? If we need another file input I think we should have another file input- I'll add it now

I have not yet thought about that problem, but it seems solvable to me. If the code itself is brief and works reliably, it should not be a dealbreaker to describe that functionality schematically.

What worries me more is duplicating the boilerplate consolidation code in every module. I guess that should then be some central function, like check_max() for the configs.

> I'll add it now

Fair enough. Hardcoding too many input channels doesn't seem desirable to me either, but I guess two additional ones would work.

pinin4fjords · 2024-07-12T16:28:48Z

Yep, I agree there are things we could improve in the conventions, but that's a discussion for elsewhere.

Merging for now.

pinin4fjords added 2 commits July 12, 2024 14:37

Add new input to multiqc module for use with --rename_samples

9616c58

Update tests

b880013

pinin4fjords requested review from abhi18av, bunop, drpatelh, jfy133 and ewels as code owners July 12, 2024 13:38

pinin4fjords added 3 commits July 12, 2024 14:43

Merge branch 'master' into multiqc_rename_input

25d4ef2

update meta.yml

6123a87

Merge branch 'multiqc_rename_input' of github.com:nf-core/modules into multiqc_rename_input

0761174

jfy133 approved these changes Jul 12, 2024

Add provision for sample_names

c3afd85

pinin4fjords added this pull request to the merge queue Jul 12, 2024

Merged via the queue into master with commit b80f5fd Jul 12, 2024
12 checks passed

pinin4fjords deleted the multiqc_rename_input branch July 12, 2024 16:38
