Some updates #27
Merged
Updating from mmollina
update DESCRIPTION
Updating from mmollina
Example rules:
- unwrapped: examples that run in < 4s
- \dontrun{}: only for missing software, APIs, etc.
- \donttest{}: examples that take more than 4s

Functions:
- get_genomic_order: unwrapped (takes less than 1s)
- sim_homologous: (already unwrapped)
- extract_map: unwrapped (takes less than 1s)
- print_mrk: unwrapped (takes less than 1s)
- rev_map: unwrapped (takes less than 1s)
- export_map_list: unwrapped (takes less than 1s); also changed example to avoid writing local files
- drop_marker: unwrapped (takes less than 1s)
- plot_genome_vs_map: unwrapped (takes less than 1s)
- elim_redundant: (already unwrapped)
- summary_maps: removed dontrun and formattable (is not a dependency)
- segreg_poly: (already unwrapped - runs in less than 2s)
- plot_map_list: changed from dontrun to donttest (takes more than 5s)
- import_data_from_polymapR: changed from dontrun to donttest (polymapR is a suggested package)
- filter_segregation: removed dontrun because it takes less than 1s
- ls_linkage_phases: removed dontrun because it takes less than 3s
- calc_genoprob_error: changed from dontrun to donttest
- import_from_updog: changed from dontrun to donttest (updog is a suggested package)
- update_map: removed dontrun and formattable (is not a dependency)
- plot_mrk_info: removed dontrun (runs in less than 3s)
- poly_cross_simulate: (already unwrapped - runs in less than 1s)
- read_geno_csv: changed from dontrun to donttest
- check_data_sanity: changed from dontrun to donttest
- add_marker: changed from dontrun to donttest
- make_mat_mappoly: changed from dontrun to donttest
- loglike_hmm: changed from dontrun to donttest
- dist_prob_to_class: changed from dontrun to donttest
- dist_prob_to_class: (duplicated)
- import_phased_maplist_from_polymapR: changed from dontrun to donttest (polymapR is a suggested package)
- dist_prob_to_class: (duplicated)
- update_missing: changed from dontrun to donttest
- cache_counts_twopt: changed from dontrun to donttest (also changed n.cores example from 8 to 1 due to problems when checking with --run-donttest)
- read_geno: changed from dontrun to donttest
- calc_genoprob: changed from dontrun to donttest
- export_data_to_polymapR: changed from dontrun to donttest (polymapR is a suggested package)
- get_submap: changed from dontrun to donttest
- est_rf_hmm: changed from dontrun to donttest
- mds_mappoly: changed from dontrun to donttest
- split_and_rephase: changed from dontrun to donttest
- est_rf_hmm_single: changed from dontrun to donttest
- merge_datasets: changed from dontrun to donttest
- calc_genoprob_dist: changed from dontrun to donttest
- read_geno_prob: changed from dontrun to donttest
- calc_prefpair_profiles: changed from dontrun to donttest
- est_rf_hmm_sequential: changed from dontrun to donttest
- merge_maps: changed from dontrun to donttest
- est_full_hmm_with_global_error: changed from dontrun to donttest
- calc_homoprob: changed from dontrun to donttest
- make_pairs_mappoly: changed from dontrun to donttest
- rf_list_to_matrix: changed from dontrun to donttest
- group_mappoly: changed from dontrun to donttest
- est_full_hmm_with_prior_prob: changed from dontrun to donttest
- read_vcf: changed from dontrun to donttest
- est_pairwise_rf: changed from dontrun to donttest
- filter_missing: unwrapped (takes less than 1s)
- make_seq_mappoly: changed from dontrun to donttest
- rf_snp_filter: changed from dontrun to donttest

No function with \dontrun{} remains.

Check package tests for all \donttest{} examples. Processes cannot spawn on multiple cores during this step, so the example below was removed from est_pairwise_rf: it could not achieve its goals without multi-core processing. Also, the tetraploid example was changed to include only the first chromosome.

Removed example:

    ## Hexaploid example
    fl = "https://github.com/mmollina/MAPpoly_vignettes/raw/master/data/BT/sweetpotato_chr1.vcf.gz"
    tempfl <- tempfile(pattern = 'chr1_', fileext = '.vcf.gz')
    download.file(fl, destfile = tempfl)
    dat.dose.vcf = read_vcf(file = tempfl, parent.1 = "PARENT1", parent.2 = "PARENT2")
    ## Filtering dataset by marker
    dat.filt.mrk <- filter_missing(input.data = dat.dose.vcf, type = "marker", filter.thres = 0.10, inter = FALSE)
    ## Filtering dataset by individual
    dat.filt.ind <- filter_missing(input.data = dat.filt.mrk, type = "individual", filter.thres = 0.10, inter = FALSE)
    ## Segregation test
    pval.bonf <- 0.05/dat.filt.ind$n.mrk
    mrks.chi.filt <- filter_segregation(dat.filt.ind, chisq.pval.thres = pval.bonf, inter = FALSE)
    seq.ch1<-make_seq_mappoly(mrks.chi.filt)
    plot(seq.ch1)
    ## will take ~ 19 min / peak of memory usage ~ 10GB
    all.pairs.1 <- est_pairwise_rf(input.seq = seq.ch1, ncpus = 7, verbose=TRUE)
    ## same thing, but it will take ~ 21 min / peak of memory usage ~ 6GB
    all.pairs.2 <- est_pairwise_rf(input.seq = seq.ch1, ncpus = 7, n.batch = 10, verbose=TRUE)
    plot(all.pairs.1, 161, 162)
    mat <- rf_list_to_matrix(all.pairs.1)
    plot(mat)

Also removed this from function make_seq_mappoly (in make_seq.R):

    ## Making a sequence using the intersection between groups and genomic information
    s <- make_seq_mappoly(tetra.solcap, 'all')
    tpt <- est_pairwise_rf(input.seq = s, ncpus = 7)
    mat <- rf_list_to_matrix(tpt)
    grs <- group_mappoly(input.mat = mat, expected.groups = 12, comp.mat = FALSE)
    seq1 = make_seq_mappoly(grs, arg = 1, genomic.info = 1)

Changes to the number of cores:
- est_rf_hmm_sequential (in est_map_hmm.R): changed number of cores from 7 to 1
- make_pairs_mappoly (in make_pairs.R): changed number of cores from 7 to 1
- rf_list_to_matrix (in rf_list_to_matrix.R): changed number of cores from 7 to 1 and arg="all" to arg="seq1"

In file import_from_polymapR.R, changed ncpus from 7 to 1 and replaced the following code:

    #### Reestimating recombination fractions using HMM
    cl <- parallel::makeCluster(5)
    parallel::clusterEvalQ(cl, require(mappoly))
    parallel::clusterExport(cl, "mappoly.data")
    reest.maps <- parallel::parLapply(cl, mappoly.maplist, est_full_hmm_with_global_error, error = 0.05)
    parallel::stopCluster(cl)

with this:

    reest.maps <- lapply(mappoly.maplist, est_full_hmm_with_global_error, error = 0.05)

Also replaced this:

    cl <- parallel::makeCluster(5)
    parallel::clusterEvalQ(cl, require(mappoly))
    parallel::clusterExport(cl, "mappoly.data")
    recons.maps <- parallel::parLapply(cl, MAPs, est_full_hmm_with_global_error, error = 0.05)
    parallel::stopCluster(cl)

with this:

    recons.maps <- lapply(MAPs, est_full_hmm_with_global_error, error = 0.05)

Another error: "T used instead of TRUE". Searched for and fixed every occurrence of this in the examples.

Also changed the number of cores in the import_from_updog function example (in import_from_updog.R) and removed this section:

    mydata = import_from_updog(mout, filter.non.conforming = TRUE)
    mydata
    plot(mydata)

Bugs detected when running previously untested examples:
- function plot.mappoly.homoprob (homolog_probs.R) line 131: changed if(lg=="all") to if(all(lg=="all")) to fix a problem when passing a vector of chromosomes as the lg argument
- function plot.mappoly.homoprob (homolog_probs.R): added verbose parameter
- function plot.mappoly.homoprob (homolog_probs.R) lines 116 and 139: check for verbosity
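
To illustrate the if(lg=="all") fix, here is a minimal sketch that does not reproduce the mappoly source. The function select_groups and its available argument are hypothetical stand-ins. With a vector lg, the comparison lg == "all" returns one logical per element, and if() needs exactly one; in R 4.2.0 and later a longer condition is an error, while earlier versions warned and used only the first element.

```r
# Hypothetical stand-in for the chromosome selection in plot.mappoly.homoprob;
# the real function does more than this.
select_groups <- function(lg, available = 1:12) {
  # lg == "all" is evaluated element-wise, so a vector lg gives a vector
  # condition. all() collapses it to the single logical that if() expects.
  if (all(lg == "all")) {
    return(available)
  }
  as.integer(lg)
}

every_group <- select_groups("all")        # the keyword selects everything
some_groups <- select_groups(c(1, 3, 5))   # a vector of chromosomes now works
```

Wrapping the comparison in all() keeps the single-string case unchanged and makes the vector case fall through to the explicit selection.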
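
The example rules listed at the top can be sketched in roxygen2 form as follows. The function rev_sketch is hypothetical and used only for illustration; the timings in the comments are assumptions standing in for measured run times.

```r
#' Reverse a vector (hypothetical function, for illustration only)
#'
#' @param x A vector.
#' @return The elements of x in reverse order.
#' @examples
#' # Runs in well under 4s: left unwrapped so it is always checked.
#' rev_sketch(1:3)
#'
#' \donttest{
#' # Assumed to take more than 4s: skipped by default, but still run
#' # when checking with --run-donttest, so it must work on one core.
#' rev_sketch(seq_len(1e6))
#' }
rev_sketch <- function(x) {
  rev(x)
}
```

Because \donttest{} code is still executed under --run-donttest, it has to be valid and single-core, which is why the core counts in the examples were lowered to 1.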
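
The parLapply to lapply replacement above can also be expressed as a serial fallback, so the same call works with one core during checks and with more cores elsewhere. This is only a sketch: run_over_list and its n.cores argument are hypothetical and not part of mappoly.

```r
# Hypothetical helper: apply `fun` to each element of `x`, serially when
# n.cores is 1 and on a local cluster otherwise.
run_over_list <- function(x, fun, ..., n.cores = 1) {
  if (n.cores <= 1) {
    # Serial path: no worker processes are spawned.
    return(lapply(x, fun, ...))
  }
  cl <- parallel::makeCluster(n.cores)
  # Make sure the workers are stopped even if fun() fails.
  on.exit(parallel::stopCluster(cl), add = TRUE)
  parallel::parLapply(cl, x, fun, ...)
}

# With the default n.cores = 1 this is equivalent to a plain lapply().
squares <- run_over_list(list(1, 2, 3), function(v) v^2)
```

Both paths return a list in the input order, so callers do not need to know which one ran.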
More information in commit messages.