collate_igblast_results function #138

Open
wants to merge 8 commits into
base: master

@mannycruz (Collaborator) commented Nov 3, 2022

Pull Request Checklists

Important: When opening a pull request, keep only the applicable checklist and delete all other sections.

Checklist for all PRs

Required

  • I tested the new code for my use case (please provide a reproducible example of how you tested the new functionality)

```
sample_table <- read_tsv("/projects/nhl_meta_analysis_scratch/gambl/results_local/shared/gambl_genome_results.tsv")
sample_table <- collate_igblast_results(sample_table = sample_table)
[1] "Searching for MiXCR results in directory:/projects/nhl_meta_analysis_scratch/gambl/results_local/gambl/mixcr-1.2/01-mixcr/genome/"
[1] "Removing empty entries"
[1] "Removing rows that exceed missing threshold"
[1] "Joining tables"
Joining, by = "biopsy_id"
```

  • I ensured all dplyr functions that commonly conflict with other packages are fully qualified.

This can be checked and addressed by running check_functions.pl and responding to the prompts. Test your code after you do this.

  • I generated the documentation and checked for errors relating to the new function (e.g. devtools::document()) and added NAMESPACE and all other modified files in the root directory and under man.

Optional but preferred with PRs

  • I updated and/or successfully knitted a vignette that relies on the modified code (which ones?)

Checklist for New Functions

Required

  • I documented my function using Roxygen style.

  • All parameters for the function are described in the documentation and the function has a descriptive title.

Example:

#' Use GISTIC2.0 scores output to reproduce maftools::chromoplot with more flexibility
#'
#' @param scores output file scores.gistic from the run of GISTIC2.0
#' @param genes_to_label optional. Provide a data frame of genes to label (if mutated). The first 3 columns must contain chromosome, start, and end coordinates. Another required column must contain gene names and be named `gene`. (truncated for example)
#' @param cutoff optional. Used to determine which regions to color as aberrant. Must be float in the range [0-1]. (truncated for example)

  • My function uses a library that isn't already a dependency of GAMBLR and I made the package aware of this dependency using the function documentation import statement.

Example:

#' @return nothing
#' @export
#' @import tidyverse ggrepel

Checklist for changes to existing code

  • I added/removed arguments to a function and updated documentation for all changed/new arguments

  • I tested the new code for compatibility with existing functionality in the master branch (please provide a reprex of how you tested the original functionality)

@rdmorin (Collaborator) left a comment

Some suggestions for how to improve efficiency and generalize

R/utilities.R Outdated
@@ -1323,6 +1323,8 @@ collate_results = function(sample_table,
these_samples_metadata,
case_set,
sbs_manipulation = "",
mixcr_results_directory = "",

Parameters that are specific to functions being called by collate_results should really not be passed to them from this function. It's going to get too messy as the collate_ family grows. Remove from here but leave them as available parameters in the new function.
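
One way to read this suggestion, as a minimal sketch: the helper keeps its own optional arguments with defaults, and the wrapper calls it without forwarding them. The function names, the placeholder default directory, and the default threshold below are all hypothetical stand-ins, not the GAMBLR implementations.

```r
# Hypothetical, simplified stand-ins; not the actual GAMBLR functions.
collate_igblast_sketch = function(sample_table,
                                  results_directory = "default/mixcr/dir", # placeholder default
                                  missing_threshold = 2) {                 # placeholder default
  # Record which settings were used so the behaviour is visible in the output.
  sample_table$igblast_dir_used = results_directory
  sample_table$igblast_threshold_used = missing_threshold
  sample_table
}

collate_results_sketch = function(sample_table) {
  # The wrapper does not carry helper-specific arguments;
  # the helper falls back to its own defaults.
  collate_igblast_sketch(sample_table = sample_table)
}

samples = data.frame(sample_id = c("S1", "S2"), stringsAsFactors = FALSE)

# Through the wrapper: defaults apply.
via_wrapper = collate_results_sketch(samples)

# Called directly: the caller can still override the defaults.
direct = collate_igblast_sketch(samples,
                                results_directory = "custom/dir",
                                missing_threshold = 1)
```

Users who need non-default settings would call the helper directly, which keeps the wrapper's signature from growing with each new collate_ function.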

R/utilities.R Outdated
#' Identify samples with mutated IGHV from MiXCR + IgBLAST results
#'
#' @param sample_table A data frame with sample_id as the first column.
#' @param results_directory Directory containing MiXCR/IgBLASTN results. Depends on files having the columns "mutatedStatus" and "numMissingRegions"

Update docs after you make this optional so the users know they only need to specify it if they want to override the default
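
A sketch of what the updated documentation could look like once the argument is optional. The helper name `resolve_results_directory` and the placeholder default path are assumptions made for illustration; the real default is not shown in this thread.

```r
#' Sketch: resolve an optional results directory (hypothetical helper)
#'
#' @param results_directory Optional. Directory containing MiXCR/IgBLASTN
#'   results. Only specify this to override the default location.
#'
#' @return The directory that will be searched.
resolve_results_directory = function(results_directory = NULL) {
  default_directory = "path/to/default/results" # placeholder, not the real path
  if (is.null(results_directory)) {
    return(default_directory)
  }
  results_directory
}

default_used = resolve_results_directory()
override_used = resolve_results_directory("my/own/results")
```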

R/utilities.R Outdated
@@ -1367,6 +1369,7 @@ collate_results = function(sample_table,
sample_table = collate_ancestry(sample_table = sample_table, seq_type_filter = seq_type_filter)
sample_table = collate_sbs_results(sample_table = sample_table, sbs_manipulation = sbs_manipulation, seq_type_filter = seq_type_filter)
sample_table = collate_qc_results(sample_table = sample_table, seq_type_filter = seq_type_filter)
sample_table = collate_igblast_results(sample_table = sample_table, seq_type_filter = seq_type_filter, results_directory = mixcr_results_directory, missing_threshold = missing_threshold)

Update code above to ensure the unix_group is always present in the samples table
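
As a rough illustration of such a guard, assuming the metadata table carries `sample_id` and `unix_group` columns (an assumption; the actual upstream code is not shown here), a hypothetical helper could add the column only when it is missing:

```r
# Hypothetical helper; assumes metadata has sample_id and unix_group columns.
ensure_unix_group = function(sample_table, metadata) {
  if ("unix_group" %in% colnames(sample_table)) {
    return(sample_table) # nothing to do
  }
  stopifnot(all(c("sample_id", "unix_group") %in% colnames(metadata)))
  # Left join so every row of sample_table is kept.
  merge(sample_table,
        metadata[, c("sample_id", "unix_group")],
        by = "sample_id",
        all.x = TRUE)
}

# Toy data; the group names are illustrative only.
samples = data.frame(sample_id = c("S1", "S2"), stringsAsFactors = FALSE)
metadata = data.frame(sample_id = c("S1", "S2"),
                      unix_group = c("gambl", "other_group"),
                      other_column = 1:2,
                      stringsAsFactors = FALSE)

with_group = ensure_unix_group(samples, metadata)
```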

R/utilities.R Outdated
# Make table with necessary columns from result files
mixcr_table = data.frame(biopsy_id=character(),
mixcr_mutated=character(),
missing=double())

Start with a directory listing to avoid a loop. You probably will have to do this twice (once for each unix group). I think another recently added collate function does something similar
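
A minimal sketch of the directory-listing approach, under several assumptions: one result file per biopsy named `<biopsy_id>.tsv`, one directory per unix group, and hypothetical group names. Only the column names `mutatedStatus` and `numMissingRegions` come from the function's documentation. The sketch builds a throwaway directory so it can run anywhere.

```r
# Build a temporary directory tree with toy result files (illustrative only).
base_dir = tempfile("igblast_sketch_")
unix_groups = c("group_a", "group_b") # hypothetical group names
for (g in unix_groups) {
  dir.create(file.path(base_dir, g), recursive = TRUE)
}
write.table(data.frame(mutatedStatus = "mutated", numMissingRegions = 0),
            file.path(base_dir, "group_a", "B1.tsv"),
            sep = "\t", quote = FALSE, row.names = FALSE)
write.table(data.frame(mutatedStatus = "unmutated", numMissingRegions = 3),
            file.path(base_dir, "group_b", "B2.tsv"),
            sep = "\t", quote = FALSE, row.names = FALSE)

# One directory listing per unix group, instead of probing a path per sample.
result_files = unlist(lapply(unix_groups, function(g) {
  list.files(file.path(base_dir, g), pattern = "\\.tsv$", full.names = TRUE)
}))

# Read one file and keep only the columns the collate step needs.
read_one = function(path) {
  res = read.delim(path, stringsAsFactors = FALSE)
  data.frame(biopsy_id = sub("\\.tsv$", "", basename(path)),
             mixcr_mutated = res$mutatedStatus,
             missing = res$numMissingRegions,
             stringsAsFactors = FALSE)
}

# Bind everything once at the end rather than growing a data frame row by row.
mixcr_table = do.call(rbind, lapply(result_files, read_one))
```

The `lapply` call still iterates over files, but only over files that actually exist, and the table is assembled in a single `rbind` rather than being appended to inside a loop.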

@Kdreval (Contributor) commented Feb 9, 2023

@mannycruz is this ready to go or is it still in development?

@mannycruz (Collaborator, Author)

@Kdreval Ready to go!

3 participants