ohun #568
Comments
Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon.
Dear @maRce10, many thanks for your submission and apologies for the delay. The editorial team was taking a break for the holidays and is now catching up. We'll respond shortly with a decision on whether to accept the package for review.
Hello again @maRce10, I can confirm we are happy to accept the package for review! I will shortly assign a handling editor. In the meantime, I'll run our automated checks. If any issues are flagged, feel free to start addressing them.
@ropensci-review-bot check package
Thanks, about to send the query.
🚀 Editor check started 👋
Checks for ohun (v0.1.0), git hash: c1b9be0e
Important: All failing checks above must be addressed prior to proceeding.
Package License: GPL (>= 2)
1. Package Dependencies
Details of Package Dependency Usage (click to open)
The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.
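The per-package call tallies that follow can be reproduced locally with the pkgstats package, as the report notes. A minimal sketch, assuming pkgstats is installed and that "path/to/ohun" (a placeholder) is a local clone of the repository:

```r
library(pkgstats)

# Run the full static analysis on a local clone of the package
# ('path/to/ohun' is a placeholder path):
s <- pkgstats("path/to/ohun")

# 'external_calls' tags each function call with the package it
# resolves to, plus its file and line location:
head(s$external_calls)

# Tallying calls per package reproduces the summary counts below:
sort(table(s$external_calls$package), decreasing = TRUE)
```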
Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)' and examining the 'external_calls' table.

base: c (33), lapply (27), length (25), sapply (24), unique (22), data.frame (13), rbind (13), is.na (12), unlist (12), do.call (11), if (11), list (11), paste0 (11), sum (11), names (10), drop (9), nrow (9), table (9), for (8), grep (7), strsplit (7), as.data.frame (6), mean (6), which (6), col (5), file (5), ncol (5), round (5), setdiff (5), match.call (4), min (4), seq (4), cos (3), eval (3), getOption (3), paste (3), proc.time (3), sin (3), split (3), any (2), apply (2), as.list (2), by (2), call (2), deparse (2), diff (2), expand.grid (2), matrix (2), scale (2), abs (1), as.numeric (1), cat (1), colMeans (1), colnames (1), complex (1), file.path (1), floor (1), grepl (1), ifelse (1), is.null (1), is.numeric (1), max (1), order (1), pi (1), plot (1), regexpr (1), rep (1), rownames (1), seq.int (1), substring (1), summary (1), t (1), try (1), vector (1), which.min (1)
warbleR: pblapply_wrblr_int (17), read_sound_file (8), sound_pressure_level (4), info_sound_files (3), duration_sound_ (2), duration_sound_files (2), gaps (2), duration_wavs (1), envelope (1), stft_wrblr_int (1)
ohun: FUN (10), diagnose_detection (2), find_templates (2), internal_feature_reference (2), spc_FUN (2), XC_FUN (2), detect_FUN (1), energy_detector (1), feature_acoustic_data (1), label_detection (1), t2xy (1)
sp: over (3), Polygon (3), Polygons (3), Spatia (3), SpatialPoints (3), SpatialPolygons (1)
graphics: lines (5), par (4), clip (1), grid (1)
methods: is (10)
stats: poly (4), prcomp (2), smooth (2), na.omit (1), step (1)
parallel: makeCluster (4), makePSOCKcluster (3)
igraph: as_data_frame (1), graph_from_incidence_matrix (1), max_bipartite_match (1)
seewave: env (2), duration (1)
utils: packageVersion (2), head (1)
crayon: italic (1), silver (1)
grDevices: cm (1)
rlang: call_args (1)
tuneR: writeWave (1)

NOTE: Some imported packages appear to have no associated function calls; please ensure with the author that these 'Imports' are listed appropriately.

2. Statistical Properties
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
Details of statistical properties (click to open)
The package has:
Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the pkgcheck package.
2a. Network visualisation
Click to see the interactive network visualisation of calls between objects in the package.
3.
id | name | conclusion | sha | run_number | date |
---|---|---|---|---|---|
3864222159 | pages build and deployment | success | c1b9be | 94 | 2023-01-07 |
3864222174 | test-coverage | success | c1b9be | 7 | 2023-01-07 |
3899091232 | tic | success | c1b9be | 49 | 2023-01-12 |
3b. goodpractice results
R CMD check with rcmdcheck
R CMD check generated the following note:
- checking installed package size ... NOTE
installed size is 5.9Mb
sub-directories of 1Mb or more:
doc 5.2Mb
R CMD check generated the following check_fails:
- cyclocomp
- no_description_depends
- no_import_package_as_a_whole
- rcmdcheck_reasonable_installed_size
Test coverage with covr
ERROR: Test Coverage Failed
Cyclocomplexity with cyclocomp
The following functions have cyclocomplexity >= 15:
function | cyclocomplexity |
---|---|
energy_detector | 52 |
label_detection | 35 |
template_correlator | 34 |
split_acoustic_data | 26 |
feature_reference | 25 |
get_envelopes | 21 |
optimize_energy_detector | 20 |
summarize_diagnostic | 20 |
diagnose_detection | 19 |
label_spectro | 18 |
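For context, cyclomatic complexity counts the independent paths through a function, and the cyclocomp package used by this report can be run directly. A small illustration with toy functions (not ohun code; the package path below is a placeholder):

```r
library(cyclocomp)

# A straight-line function has the minimum complexity:
simple <- function(x) x + 1

# Each decision point (if, loop, &&, ||) adds an independent path:
branchy <- function(x) {
  if (is.na(x)) return(NA_character_)
  if (x > 0) {
    if (x > 10) "large" else "small"
  } else {
    "non-positive"
  }
}

cyclocomp(simple)   # low
cyclocomp(branchy)  # higher: one unit per decision point

# cyclocomp_package_dir("path/to/ohun") tabulates every function in a
# source directory, producing a table like the one above.
```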
Static code analyses with lintr
lintr found the following 558 potential issues:
message | number of times |
---|---|
Avoid 1:length(...) expressions, use seq_len. | 2 |
Avoid 1:ncol(...) expressions, use seq_len. | 3 |
Avoid 1:nrow(...) expressions, use seq_len. | 34 |
Avoid library() and require() calls in packages | 3 |
Avoid using sapply, consider vapply instead, that's type safe | 27 |
Lines should not be more than 80 characters. | 488 |
unexpected symbol | 1 |
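The seq_len lints above flag a real edge case rather than a style nitpick: `1:n` counts backwards when `n` is 0. A short base-R illustration:

```r
# With a zero-row data frame, 1:nrow(df) yields c(1, 0), so a loop
# body runs twice on empty input instead of not at all:
df <- data.frame(x = numeric(0))
1:nrow(df)         # 1 0
seq_len(nrow(df))  # integer(0): the loop correctly runs zero times

# The sapply lint is similar: vapply pins down the result type and
# shape, while sapply may silently return a list or matrix:
vapply(c("a", "bb", "ccc"), nchar, integer(1))
```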
4. Other Checks
Details of other checks (click to open)
✔️ Package contains the following (potentially) obsolete packages:
- sp
See our Recommended Scaffolding for alternatives.
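Since sp has been retired, the usual replacement is sf. A hedged sketch of how an sp-style point-in-polygon test (the kind done with sp::over and SpatialPolygons) might look in sf, using illustrative data rather than ohun's actual code:

```r
library(sf)

# A unit-square polygon (rings must be closed, so the first
# coordinate is repeated at the end):
square <- st_polygon(list(rbind(
  c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)
)))

# Two test points: one inside the square, one outside:
pts <- st_sfc(st_point(c(0.5, 0.5)), st_point(c(2, 2)))

# st_intersects() plays the role of sp::over() for membership tests;
# with sparse = FALSE it returns a plain logical matrix:
st_intersects(pts, st_sfc(square), sparse = FALSE)
```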
Package Versions
package | version |
---|---|
pkgstats | 0.1.3 |
pkgcheck | 0.1.0.32 |
Editor-in-Chief Instructions:
Processing may not proceed until the items marked with ✖️ have been resolved.
Hi, thanks for reviewing my package. I'm not sure why you got problems with coverage: it shows 78% coverage in Codecov (https://app.codecov.io/github/maRce10/ohun) as well as in the coverage status badge (https://github.com/maRce10/ohun), so I am not sure how to proceed now. Cheers, Marcelo
Hello @maRce10. Indeed, that's odd. Let us look into it and get back to you.
I replaced 1:n with seq_len() and sapply() with vapply() where possible, replaced cat() with message(), and removed unused dependencies/suggests: 6ea380d
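For reference, the cat() to message() change matters because the two are handled differently by callers. A small illustration (noisy() is a hypothetical function, not ohun code):

```r
# cat() writes directly to stdout and cannot be silenced by callers.
# message() signals a condition on stderr, so users can opt out:
noisy <- function() {
  message("running detector...")
  invisible(42)
}

noisy()                    # prints the message
suppressMessages(noisy())  # silent; still returns 42 invisibly
```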
@maRce10 It took a bit of digging, but the test failures are because
Many of the libraries given in your actions file will automatically pull in several of the others, so you should reduce this set to the minimal possible length, either by checking distribution listings (at least for Debian or Ubuntu) or by iterating in a Docker container. Please ping here when your DESCRIPTION has been updated, and let us know what else we might need to install. Thanks.
OK, thanks. These non-standard dependencies are not from the ohun package but from a package it relies heavily on, warbleR. These dependencies are now included in the DESCRIPTION file of warbleR on GitHub. Should I also add them to ohun?
No, if they're from another package then they're dependencies of that package, not yours. I'll just add flac to our system and ping here when checks can be re-run.
@maRce10 Our system has been rebuilt, and your tests all run successfully now. You may call the bot to run the checks whenever you like.
@ropensci-review-bot assign @jhollist as editor
Assigned! @jhollist is now the editor
@ropensci-review-bot check package
Thanks, about to send the query.
🚀 Editor check started 👋
Checks for ohun (v0.1.0), git hash: 62cdc87d
(Checks marked with 👀 may be optionally addressed.)
Package License: GPL (>= 2)
1. Package Dependencies
Details of Package Dependency Usage (click to open)
The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.
Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)' and examining the 'external_calls' table.

base: c (35), lapply (27), length (26), unique (22), vapply (17), sum (14), data.frame (13), rbind (13), unlist (13), is.na (12), do.call (11), if (11), list (11), paste0 (11), names (10), seq_len (10), drop (9), nrow (9), table (9), for (8), mean (8), strsplit (8), grep (7), numeric (7), sapply (7), as.data.frame (6), which (6), col (5), ncol (5), round (5), setdiff (5), any (4), file (4), match.call (4), min (4), seq (4), cos (3), eval (3), getOption (3), logical (3), proc.time (3), sin (3), split (3), apply (2), as.list (2), by (2), call (2), deparse (2), diff (2), expand.grid (2), matrix (2), paste (2), scale (2), abs (1), as.numeric (1), character (1), colMeans (1), colnames (1), complex (1), file.path (1), floor (1), grepl (1), ifelse (1), is.null (1), is.numeric (1), match.arg (1), max (1), order (1), pi (1), plot (1), regexpr (1), rep (1), rownames (1), seq.int (1), substring (1), summary (1), t (1), try (1), vector (1), which.min (1)
warbleR: pblapply_wrblr_int (17), read_sound_file (8), sound_pressure_level (4), info_sound_files (3), duration_sound_ (2), duration_sound_files (2), gaps (2), duration_wavs (1), envelope (1), stft_wrblr_int (1)
ohun: FUN (10), diagnose_detection (2), find_templates (2), internal_feature_reference (2), spc_FUN (2), XC_FUN (2), colortext (1), detect_FUN (1), energy_detector (1), feature_acoustic_data (1), label_detection (1), message2 (1), t2xy (1)
sp: over (3), Polygon (3), Polygons (3), Spatia (3), SpatialPoints (3), SpatialPolygons (1)
graphics: lines (5), par (4), clip (1), grid (1), text (1)
stats: poly (4), prcomp (2), smooth (2), na.omit (1), step (1)
parallel: makeCluster (4), makePSOCKcluster (3)
igraph: as_data_frame (1), graph_from_incidence_matrix (1), max_bipartite_match (1)
seewave: env (2), duration (1)
utils: packageVersion (2), head (1)
cli: style_italic (1)
grDevices: cm (1)
methods: as (1)
rlang: call_args (1)
tuneR: writeWave (1)

NOTE: Some imported packages appear to have no associated function calls; please ensure with the author that these 'Imports' are listed appropriately.

2. Statistical Properties
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
Details of statistical properties (click to open)
The package has:
Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the pkgcheck package.
2a. Network visualisation
Click to see the interactive network visualisation of calls between objects in the package.
3.
id | name | conclusion | sha | run_number | date |
---|---|---|---|---|---|
3944417614 | pages build and deployment | success | 62cdc8 | 103 | 2023-01-18 |
3944417809 | test-coverage | success | 62cdc8 | 16 | 2023-01-18 |
3945647101 | tic | success | 62cdc8 | 64 | 2023-01-18 |
3b. goodpractice results
R CMD check with rcmdcheck
R CMD check generated the following note:
- checking installed package size ... NOTE
installed size is 5.9Mb
sub-directories of 1Mb or more:
doc 5.2Mb
R CMD check generated the following check_fails:
- cyclocomp
- no_description_depends
- no_import_package_as_a_whole
- rcmdcheck_reasonable_installed_size
Test coverage with covr
Package coverage: 78.11
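The coverage figure can be reproduced locally with the covr package. A sketch, assuming a local clone at a placeholder path:

```r
library(covr)

# Runs the package tests and records which lines they execute
# ('path/to/ohun' is a placeholder):
cov <- package_coverage("path/to/ohun")

percent_coverage(cov)  # the overall percentage reported above
# zero_coverage(cov) lists lines never executed by any test
# report(cov) renders an interactive per-file coverage report
```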
Cyclocomplexity with cyclocomp
The following functions have cyclocomplexity >= 15:
function | cyclocomplexity |
---|---|
energy_detector | 52 |
label_detection | 35 |
template_correlator | 34 |
split_acoustic_data | 26 |
feature_reference | 25 |
get_envelopes | 21 |
optimize_energy_detector | 20 |
summarize_diagnostic | 20 |
diagnose_detection | 19 |
label_spectro | 18 |
Static code analyses with lintr
lintr found the following 512 potential issues:
message | number of times |
---|---|
Avoid 1:nrow(...) expressions, use seq_len. | 3 |
Avoid library() and require() calls in packages | 3 |
Avoid using sapply, consider vapply instead, that's type safe | 10 |
Lines should not be more than 80 characters. | 495 |
unexpected symbol | 1 |
4. Other Checks
Details of other checks (click to open)
✖️ Package contains the following obsolete packages:
- sp
See our Recommended Scaffolding for alternatives.
Package Versions
package | version |
---|---|
pkgstats | 0.1.3 |
pkgcheck | 0.1.1.11 |
Editor-in-Chief Instructions:
This package is in top shape and may be passed on to a handling editor
🥳
@ropensci-review-bot submit review #568 (comment) time 10
Logged review for sammlapp (hours: 10)
@ropensci-review-bot submit review #568 (comment) time 15
Logged review for robitalec (hours: 15)
@maRce10: please post your response with @ropensci-review-bot submit response. Here's the author guide for responses: https://devguide.ropensci.org/authors-guide.html
Apologies for the delay. Many thanks to both reviewers; I am really grateful for all your time and effort. I have made changes and responded to your comments/suggestions below, quoting the commit in which each change was made.

Response to @sammlapp

@ropensci-review-bot submit response #568 (comment)

comment: Vignette(s): demonstrating major functionality that runs successfully locally - I wasn't able to see or access the vignette in RStudio with vignette()
response: I have added a link to each of the 3 vignettes to the README. ropensci/ohun@f857949

comment: I don't typically use R, so some things are not obvious to me. Where can I find the full API documentation? I opened the /docs/index.html to view the docs by rendering the HTML locally, but presumably they are online somewhere. I would recommend linking to the documentation near the top of the README.
response: I added a link to the package documentation at the top of the README. ropensci/ohun@1ffc441

comment: The use of the word "reference" to mean "audio annotations" or "labels" is unfamiliar to me; I'm not sure if this word is typically used in the related warbleR package.
response: Using the term "reference" for the ground-truth annotations seems to be common in the field (see for instance Mesaros, A., Heittola, T., Virtanen, T., & Plumbley, M. D. (2021). Sound event detection:

comment: The package should give the user some expectation for under what circumstances the detection methods are likely to be effective. Although it mentions that high SNR is required for amplitude thresholding and stereotyped sounds are necessary for template matching, there are other requirements of the data for these approaches to be effective.
response: I added that time-frequency overlap with other sounds can affect performance. I am not aware of any other main requirement.
ropensci/ohun@d85c0e0

comment: to improve documentation and function/variable/argument names to better match the action that is actually taken by a function;
response: The function filter_detection() has been renamed consensus_detection(). ropensci/ohun@c1e108d

comment: to consider a moderate refactor of code to increase modularity, reduce redundancy, and aim to follow the Single Responsibility Principle;
response: A modular approach was used for checking arguments. Note that all tasks shared by different functions are carried out by a single function. For instance, diagnosing detection performance in optimize_template_detector() or optimize_energy_detector() is done by calling diagnose_detection() and summarize_diagnostic(); diagnose_detection() also calls label_detection(), which can also be called by users, and get_templates() is called within template_detector(). Now all functions call check_arguments(). ropensci/ohun@c1e108d

comment: increase code readability by breaking complex logical lines into separate lines with additional comments, plus comments describing the general purpose of large blocks of code.
response: I simplified the code in diagnose_detection() and summarize_diagnostic(). maRce10/ohun@69d81da42cc5b831779be1ebb3aaa2e794464877e. Note that functions that deal with envelopes tend to avoid duplicating envelope objects, to avoid using too much memory.

comment: Template-based cross correlation is implemented in monitoR, while energy-based detection is implemented in bioacoustics (the implementation seems more generic and featured than the one in ohun). I wonder if the proposed value of ohun is mostly in its specific ability to interact with the formats of annotations produced by warbleR. If so, the top-level documentation and README could clarify this purpose.
response: monitoR offers no way to evaluate detection performance. ohun does, and can even be applied to optimizing detections.
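The shared argument-checking approach mentioned above can be sketched generically. check_cores and my_detector here are hypothetical illustrations of the pattern, not ohun's actual internals:

```r
# One validator owns the rule; every user-facing function delegates
# to it, so a change to the rule propagates consistently:
check_cores <- function(cores) {
  if (!is.numeric(cores) || length(cores) != 1 || cores < 1) {
    stop("'cores' must be a single number >= 1", call. = FALSE)
  }
  invisible(as.integer(cores))
}

# A hypothetical user-facing function delegating its validation:
my_detector <- function(files, cores = 1) {
  cores <- check_cores(cores)
  # ... detection work would go here ...
  cores
}

my_detector(files = character(0), cores = 2)  # returns 2L
```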
comment: Specifically, when I started using the program without having any experience with warbleR, it was not clear to me how to create annotation tables or generally what the utility of most of the package functions is. Writing a description of the characteristics of data that this package expects as input would be helpful. For instance, "This package provides utilities for comparing detection and annotations of audio events described by frequency and time boxes. Creating annotations containing such frequency-time boxes [is possible in the warbleR package?]."
response: I added a paragraph similar to that proposed by the reviewer to the README. ropensci/ohun@d4edd0d
comment: In the language of rOpenSci software recommendations, this is a sign that the functions are not "modular". In other places it might be seen as a violation of the Single Responsibility Principle (SRP). For example, the label_spectro function violates SRP because it doesn't just label a spectrogram: it computes a spectrogram, plots it, then overlays labels, and might additionally compute and plot an amplitude signal. As another example, get_envelopes.R and energy_detector.R seem to both implement the process of getting an amplitude envelope, rather than using shared modular functionality specific to this purpose. Though it would take a significant refactoring effort, I believe creating a more modular code base that aims to achieve the SRP could improve the package and user experience overall.
response: The function label_spectro() was created for making the graphs in the vignette. This is stated in the description of the function in bold font ("Created for graphs included in the vignette, and probably only useful for that or for very short recordings"). So it is not called by any other function, nor does it call any other ohun function; it is just for the vignette. Not sure how to make this more evident. ropensci/ohun@f857949

comment: In general, the code is difficult to read because of the formatting and smushing lots of logic into a single functional "line" (which is broken into many lines anyway due to formatting). Separating out logical steps into sequential lines, with comments where appropriate, would make the code easier to read/understand/maintain. However, it would take a major refactoring effort.
response: As mentioned above, I simplified the code in diagnose_detection() and summarize_diagnostic(), which were the most cumbersome. maRce10/ohun@69d81da42cc5b831779be1ebb3aaa2e794464877e.
Note that functions that deal with envelopes tend to avoid duplicating envelope objects, to avoid using too much memory.

comment: Consider adding units to variable names, e.g. hold_time_ms
response: I think that is a good suggestion, although it is extremely uncommon in R packages, so I would rather not include units in names, to match what R users are expecting.

split_acoustic_data.R:
comment: documentation should list supported sound file types
response: I added supported formats to all functions working on sound files. ropensci/ohun@63f2be1
comment: documentation should list required columns of table
response: The argument description already mentioned that:
comment: documentation should say what will happen with incomplete audio clip at end of file
response: Not sure I follow. I think there is only one thing that can happen: create the short clip (not really incomplete).
comment: comments denoting the purpose of larger sections of code are helpful, e.g. # INPUT VALIDATION
response: I added more comments. ropensci/ohun@05daadb

summarize_diagnostic.R:
comment: The code style makes it difficult to review the logic or infer the operation performed by much of this code. For example, lines 131-141 of summarize_diagnostic.R. In general I'm not reading this code base to check for logical errors, but I think that the code style will make it difficult for others (or the authors) to debug, extend, understand or modify the code. In this example, separating the logical operation into a few separate and sequential lines of code could make it easier to digest, and would provide opportunities to add comments for any individual steps whose purposes are not obvious. (For example in this code, a comment could explain why we would want a weighted mean of true positive durations.)
response: As mentioned above, I simplified the code to make it easier to understand (and debug). maRce10/ohun@69d81da42cc5b831779be1ebb3aaa2e794464877e

comment: Input validation is redundant across many functions, e.g. validating the cores argument. Rather than repeating code in several modules, use modular helper functions. That way, if you change something it is consistent across the codebase.
comment: Comments should explain not just what's being done but also why (rather than re-stating the code in English). For example, instead of # set bottom and top freq if not supplied you could explain that the bottom and top frequency are needed to make a spectrogram; if they're not supplied, we set them to 0 and Ns=sample_rate/2. Another example: in template_detector.R L177, "# if no detections" is not a helpful comment - the comment should explain what will happen and why.
response: More comments were added. ropensci/ohun@05daadb and ropensci/ohun@4e3449d

comment: Since I'm not an R programmer, I might not understand conventions for writing R modules. However, I don't think it's a good idea to define your helper functions inside another function. For example, L218 of template_correlator.R begins a definition of a function to create a spectrogram. Even if this is not part of the API (it's a hidden or internal function), I think it should be defined outside of the template_correlator function (1) because it doesn't make sense that it would be re-defined each time template_correlator is called, and (2) because it makes reading the code for template_correlator unnecessarily long.
response: I moved the internal function out of template_correlator() ropensci/ohun@7c47092. Same thing for energy_detector() ropensci/ohun@4ca47bd

comment: I'm not sure how namespaces work in R, but it would be helpful for a code reader to know where functions are coming from. For instance, get_templates.R calls find_templates, which is not explicitly imported or defined anywhere in the file.
It apparently can be accessed because it exists in another .R file in the package, or because all functions from the package have been imported into the default namespace.
response: Internal functions are common in R. They can be called using namespace:::function. All internal functions in ohun are found in internal_functions.R

Notes on the Getting Started tutorial page
comment: The "main features" description is jargon-y and could be written in broader language. For example, "The use of signal detection theory indices to evaluate detection performance" -> "Evaluate detection accuracy with precision, recall, and other metrics". As is, I feel like it obscures the functionality of the package. As another example, I wasn't sure what "Optimize detection routines based on reference annotations" meant - after reading further, I think it basically means "Adjust detection parameters to optimize accuracy on a labeled set of recordings"
response: I have included "Adjust detection parameters to optimize accuracy" in the package description (ropensci/ohun@7617753). However, "Evaluate detection accuracy with precision, recall, and other metrics" assumes that users know what those metrics are; most biologists working on bioacoustics are not familiar with those terms.
comment: "Template-based detection" It's not true that template-based detection "doesn't depend on the signal-to-noise ratio" - high background noise or low signal level will decrease the accuracy of this approach
response: I reworded this: "As it is less affected by signal-to-noise ratio it's more robust to higher levels of background noise" ropensci/ohun@3fd70a7
comment: outputs (e.g., tables) in the Getting Started tutorial are very hard to interpret in the current form; could they be rendered as human-readable tables?
response: I don't think they are unreadable (I can read them). I agree they could be printed in a friendlier way, but that would require adding more dependencies (e.g.
knitr, kableExtra).
comment: An example of how to analyze a large set of files, for instance all files in a folder-subfolder structure, would be helpful. I had to do some googling to figure out how to do "globbing", i.e. find all paths matching a pattern.
response: I agree, but the amount of example data that can be included is very limited for R packages.

Notes on API documentation and variable/function names
comment: Throughout the codebase, I recommend more informative variable and argument names: for example, in the function merge_overlaps, to a new user, pb is not interpreted as "progress bar"
response: I tried to follow the same variable names used by other packages (including those creating progress bars).
comment: Throughout the documentation: why are arguments that are numbers/strings called "numeric vectors of length 1" and "character vectors of length 1"? This makes me think I should pass c(0) or [0] instead of 0, for instance, as an argument.
response: c(1) and 1 are equivalent and are both numeric vectors of length 1 (0 is not used as an index in R). Not sure how else I can call them.

get_envelopes:
comment: smooth argument - should state the specific function/implementation of smoothing used
response: I added that it is done using the function envelope from warbleR. ropensci/ohun@6a2e11e
comment: paths and files arguments: using the current path as default is fragile and does not match the conventions used elsewhere in the package. I recommend making the argument(s) mandatory. It is also unclear that specifying a folder but not files means that the function should attempt to analyze any/all audio files in the folder (only a specific file extension? how does it know which files to analyze or not?). Is that what happens if files=NULL?
response: I agree that making path mandatory would be safer. However, many R users like to set the working directory (using setwd()) before running analyses.
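For the reviewer's "globbing" question, base R covers this without extra packages. A sketch ("path/to/recordings" is a placeholder folder):

```r
# Collect every .wav file under a folder, including subfolders:
wavs <- list.files(
  path        = "path/to/recordings",  # placeholder path
  pattern     = "\\.wav$",
  recursive   = TRUE,
  full.names  = TRUE,
  ignore.case = TRUE
)

# basename()/dirname() recover file names and containing folders,
# and 'wavs' can then be passed to a 'files'-style argument.
```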
That's why it's common for functions to look in the current working directory by default. All 5 functions with the argument files use the same default value (NULL). I added that: "If not supplied the function will work on all sound files (in the supported format) in 'path'." ropensci/ohun@a00e687

energy_detector:
comment: again, I don't think making files optional and defaulting to all audio in the working directory is a good idea.
response: Agree, but again it's friendly for most R users.

label_spectro:
comment: can you provide an example of how to use this function with an audio file? If I want to run it on anything other than a simulated array of numbers or the two built-in audio files, I need some prior line of code to load the audio file.
response: As mentioned above, I made this function just to get a nice-looking visualization of the data in the vignette; I don't think it is very helpful for most real data. I have this info in bold in the description: "Created for graphs included in the vignette, and probably only useful for that or for very short recordings". However, people seem to get confused. I thought about just having it inside the vignette without including it as a function in the package, but users always ask how those figures are made, so that's why I included it as a function.
comment: bp argument (in several functions): how is the signal bandpassed? Should state specifically, for example, a 9th order Butterworth bandpass filter
response: I added that: "Bandpass is done using the function code{\link[seewave]{ffilter}}, which applies a short-term Fourier transformation to first create a spectrogram in which the target frequencies are filtered and then is back-transformed into a wave object using a reverse Fourier transformation."
ropensci/ohun@59a984f

feature_reference:
comment: "Extract quantitative features of references" - from the function name and short description, I expected this to extract acoustic features of the annotated audio events themselves, rather than summary statistics of the annotations per se. Perhaps the description could be "Summarize temporal and frequency dimensions of annotations and gaps" or something similar. Ideally, the function's name would reflect its functionality, something like summarize_annotation_boxes.
response: I adopted the suggested description and added an alternative name for the function: summarize_reference(). ropensci/ohun@e0f0c75
comment: gap times: I tried using this summary to choose a hold.time argument for energy_detector, but the min, max, and mean are not very informative, because what matters is the typical gap time between close-together annotations. Maybe the median or 25th percentile, or some other statistic, would be more informative.
response: Yeah, that is a tricky thing to get and depends a lot on duty cycle and signal duration. But I don't think there is a single metric that can get a good estimate. min and mean still

feature_acoustic_data:
comment: Similarly, feature_acoustic_data sounds like it would measure some acoustic feature. Since the function returns information about the file types, I would call it something like summarize_audio_file_characteristics
response: I added an alternative name for the function: summarize_acoustic_data(). ropensci/ohun@6a54b30

filter_detection:
comment: I'm confused as to what this function is doing. The documentation says it "removes ambiguous detections (split and merged detections)", but the detailed documentation should explain what is meant by split and merged detections. I'm also confused whether this is meant to be applied to human-created annotations, automatically-generated detections, or can be used for either purpose.
response: I added an alternative name for the function: consensus_detection(). I also added that: "Useful when several detections match the same reference as in the case of template detection with multiple templates (see \code{\link{template_detector}})." ropensci/ohun@a6e8e3a

optimize_template_detector:
comment: this function compares performance across parameter values; it does not itself optimize the algorithm, so its name is a bit misleading. Perhaps template_parameter_sweep is closer. I'm not sure what parameter values it sweeps over after reading the documentation, since it requires the user to have already run template_correlator. Also, I think the part of the documentation that says "Those parameters can be later used for performing a more efficient detection using optimize_template_detector." must be a typo; perhaps it means "using template_correlator" instead?
response: Thanks, I fixed the typo. Naming functions is always hard; to me, optimize_template_detector says more about what is going on than template_parameter_sweep.

label_spectro:
comment: this is a plotting function, so it would be better named something like plot_spec_and_labels
response: Again, naming functions is a tricky and subjective thing. In addition, as I said above, this function is mostly used to create spectrograms in the vignette.

get_templates:
comment: can you explain what features are being used to create this principal component analysis of template similarity? Also, after understanding what this function does, it seems like it would be better named something like select_representative_templates or select_representative_annotations, since it chooses representative templates from a set of annotations.
response: I expanded the explanation in the function documentation: "the function will estimate it by measuring several acoustic features using the function \code{\link[warbleR]{spectro_analysis}} (features related to energy distribution in the frequency and time domain as well as features of the dominant frequency contours, see \code{\link[warbleR]{spectro_analysis}} for more details) and summarizing it with Principal Component Analysis (after z-transforming parameters) using the function \code{\link[stats]{prcomp}}" (ropensci/ohun@293efe7). I think get_templates gives users a sense of what the function does. They should always read the documentation to get a full sense of what a function does.
split_acoustic_data
comment: it should be possible, if not required, to list the audio files to split. It seems like I must split every file within a folder in the current implementation. response: Explained above. Behavior is due to common habits of R users.
label_detection
comment: the function name does not match what it does. It seems this function evaluates detections against a set of annotations/labels/"references", so perhaps it should be called evaluate_detections or compare_annotations_and_detections response: It adds labels in a new column to each detection. To me the name makes sense. evaluate_detections sounds more like diagnose_detections, which is another function.
template_correlator
comment: it's unclear to me how to use the files argument. I don't know what 'X' is as referred to in the documentation, and I would expect to simply be able to pass a vector of file paths. response: Explained above. Behavior is due to common habits of R users. There is no 'X' argument in template_correlator()
full_spectrograms
comment: it was surprising to me that this function automatically saved image files to the current working directory, rather than asking for a save dir or displaying them in the console.
This doesn't seem like a good practice, because I might not want to save the results and, if I do, I'd like to choose where they go. response: That function is from another package. comment: Other note: I would be very surprised if template matching is faster than amplitude-based detection, as claimed in the "Improving detection speed" section. Amplitude-based detection is about the lightest task you can perform on a time signal, while 2D cross-correlation is relatively heavy. response: That has been my experience in R. It seems that getting and manipulating envelopes in R is particularly slow.
Errors during usage
comment: When I try to run get_templates on my own file set, I get this error:
I tried to run get_templates on a set of local files, but it seems somehow one of the features has no variance and this is causing the PCA to fail. I wasn't able to resolve this. response: I added code to remove zero-variance variables from the PCA input data ropensci/ohun@a16ff19 comment: Running template_correlator then template_detector on the results produced a table containing NA values for many rows in the start, end, and scores columns. I did not expect this, and it caused errors when I tried to run other analyses on these detections. response: That can happen when no detections were found for some sound files. Just remove the NAs. comment: Once I ran template_correlator and template_detector, I wasn't sure how to inspect my detections, although it seemed like I should be able to use full_spectrograms or some other package function. I eventually realized that if I used na.omit(selections) to drop rows with NA values, I could run full_spectrograms to look at the results. I wonder if there is a way to more selectively choose to look at some of the detections alone, rather than having to view the entire audio files (imagine if I had a 1-hour audio file with three 1-second detections). response: full_spectrograms is not an ohun function. I added a new function plot_detection() to visually inspect detections ropensci/ohun@4bca6b2 comment: When I first ran label_spectro, I got the message "Error in loadNamespace(x) : there is no package called ‘viridis’". I'm not sure if this suggests a missing dependency. I resolved the issue by running install.packages("viridis") response: I added a checking step to make sure users have viridis installed ropensci/ohun@d6a4110
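The zero-variance fix described above can be sketched in base R. This is a hypothetical feature matrix, not the package's actual implementation: constant columns are dropped before calling prcomp, since scale. = TRUE cannot rescale a zero-variance variable.

```r
# Hypothetical acoustic-feature matrix; column "flat" has zero variance,
# which would break prcomp(scale. = TRUE)
feats <- data.frame(
  dur  = c(0.3, 0.5, 0.4, 0.6),
  peak = c(2.1, 3.4, 2.8, 3.0),
  flat = c(1, 1, 1, 1)
)

# Keep only columns with non-zero variance before running the PCA
keep <- vapply(feats, function(x) stats::var(x) > 0, logical(1))
pca  <- stats::prcomp(feats[, keep, drop = FALSE], scale. = TRUE)
```

Here `pca$x` ends up with one row per sound and one column per retained (non-constant) feature.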
response: I added the error message ropensci/ohun@9fdd163 comment: Add a more helpful error message for accidental use of Hz instead of kHz: if my detection table has Hz instead of kHz and I run get_templates, the error message says "Caution! This is a null spectrum. The spectral entropy is null!" response: There is no easy way to tell if users mean Hz or kHz, unfortunately. comment: When I run label_spectro with env = TRUE, I get an error message: response: Not sure what the problem is. To me it runs, and it is checked by testthat. The full name of the argument is "envelope".
Response to @robitalec
A statement of need
comment: Please consider adding a section to the README that 1) clarifies how ohun fits into a broader workflow of working with sound events and 2) states whether any other packages or approaches are available that share similar functionality with ohun. For 1), this could be adding a few sentences describing what typical steps (and related packages) would come before and after using ohun. You could also add the workflow from the vignette to the README in this section. Consider a new user looking at the Reference page on the website or the list of available functions in ohun... how do they fit together? response: I added that "The implementation of detection diagnostics that can be applied to both built-in detection methods and to those obtained from other software packages makes the package ohun a useful tool for conducting direct comparisons of the performance of different routines. In addition, the compatibility of 'ohun' with data formats already used by other sound analysis R packages (e.g. seewave, warbleR) enables the integration of 'ohun' into more complex acoustic analysis workflows in a popular programming environment within the research community." ropensci/ohun@78658c7
Installation instructions
comment: The dependency seewave brings system dependencies that might not be immediately obvious to a new user of ohun.
It might be helpful to include a link to seewave's installing and troubleshooting system requirements page, for example: https://rug.mnhn.fr/seewave/inst.html response: I added "Further system requirements due to the dependency seewave may be needed. Take a look at this link for instructions on how to install/troubleshoot these external dependencies." ropensci/ohun@b237f58
Vignette(s)
comment: It might be worth splitting this vignette into multiple vignettes. At its current length, I find it a bit hard to digest all the moving pieces and options. I personally find it easier to learn and use R packages if there are many, shorter vignettes that describe the different major functions or steps offered. You could have maybe four vignettes: introduction, energy based detection, template based detection, and diagnosing/improving. The introductory vignette could expand on how ohun integrates with other R packages in this field, and provide the background given in the sections "Automatic sound event detection" and "Signal detection theory applied to bioacoustics". The following vignettes could then use the sections already written for the current vignette. This might just be my opinion - so if you feel like it is better for ohun users to have everything in one place, let me know. Throughout the vignette, the warnings and messages were not shown, due to the chunk options warning=FALSE and message=FALSE. I would recommend leaving all messages and warnings in the rendered vignette, because a user may be surprised and confused to see them appear locally when they are not shown in the vignette. response: I accepted the commit to show warnings and messages in the vignette. I also split the vignette into 3: introduction, energy based detection, template based detection.
ropensci/ohun@1b9c95f
Community guidelines
comment: Please consider also including the link to the repository (https://github.com/maRce10/ohun) in the URL field in the DESCRIPTION, as it should have the most consistently up-to-date information for the package, for example in the case that the website is not updated. response: Done (accepted the commit)
Functionality
Packaging guidelines
comment: Looks good: package name, snake_case, object_verb function naming, citation, and code metadata. Consider using the following options for keeping the code metadata up to date and for keeping the CITATION file up to date. Code style improved with a recent commit using the styler package. Before that, logical statements with if else were not easy to read. It is not clear whether or not functions in ohun are pipeable (see the second point here). If you think that is a relevant option, consider including an example in the vignette that shows this. response: They are not planned to be pipeable. comment: As mentioned above, the README could use more information about how ohun fits into the broader ecosystem, comparing ohun to other packages that provide related functionality, and details on the system requirements for seewave. The README is also missing a brief demonstration of functionality (see the sixth point here). The long-form examples in the vignettes are useful for a user that is already intending to use ohun, but shorter examples in the README help new users understand what ohun is offering. response: I added more details about how it fits into the broader ecosystem following another comment above. I agree that some examples in the README might be helpful. However, it is hard to pick a short example that can fit in the README, as sound event detection tends to require several steps.
There is good use of @seealso to link users to related functions in ohun. Only one URL needed editing (according to urlchecker); see below under Additional comments/README. You could consider using vdiffr for testing plots. Currently the tests for functions that produce plots are just testing that they return NULL. This isn't a very useful test, and while testing plots is not straightforward or intuitive, you could see if vdiffr helps! More detailed comments on testing below under Additional comments/Tests. response: I added vdiffr tests ropensci/ohun@333817c
Additional comments
**comment:** The following sections have further details from above, and sometimes an accompanying pull request. I found it easier to work through the review by looking at the source code locally and making changes as I noticed things. I also felt like it would be easier to highlight specific lines of code by just editing them, instead of referring to eg. Lines 83-85 in some file. No pressure to accept any of these pull requests directly, and I hope that you find it a clear and useful approach.
Repository
response: I ACCEPTED ALL PULL REQUESTS. THANK YOU!
**comment:** The repository size is ~120 MB. I recommend that you delete, at least, the following files:
response: pull request accepted comment: Some of these files were presumably added before they were listed in the .gitignore file. I have deleted them in the following pull request: ropensci/ohun#12 response: pull request accepted comment: Consider stating explicitly which package provides the parallel processing for ohun in the sentence starting "All functions allow the parallelization of tasks..." response: Done: "All functions allow the parallelization of tasks (using the packages parallel and pbapply)..." ropensci/ohun@2c332d4 The badge for R-CMD-check seems to point to an old workflow. The current workflow (tic) shows failing tests, whereas the R-CMD-check badge shows passing tests. I would recommend removing this old badge if it is no longer relevant. response: Done ropensci/ohun@46a1eaf comment: A couple of minor edits to the README are in the following pull request: response: pull request accepted comment: It might be helpful to users if the items in NEWS were accompanied by references to the relevant pull requests and issues (eg. #5). For example: https://github.com/tidyverse/ggplot2/blob/main/NEWS.md response: OK. comment: Vignette
As mentioned above, consider splitting the single vignette into multiple vignettes:
response: As mentioned, I split the vignette into 3: introduction, energy based detection, template based detection. ropensci/ohun@1b9c95f comment: Minor proposed edits to the vignette are in the following pull request: ropensci/ohun#14 The CONTRIBUTING.md document looks like it's based on the template from https://contributing.md/. Because of this, there were some places where links to sections didn't work, and it often still referred to the template version of CONTRIBUTING.md. response: pull request accepted comment: There isn't a code of conduct in the repository to link to, so I removed that dead link. But please consider including one (see the rOpenSci pages for more details: Contributing Guide, Code of Conduct). response: pull request accepted Relevant pull request: ropensci/ohun#15 comment: I propose removing the "last modification date" from the R/ files, since this information is redundant and more error-prone than the information provided by Git. This change is included in the pull request under R. response: Done ropensci/ohun@88fdf91
DESCRIPTION
comment: I suggest moving packages under "Depends" to "Imports". The case of adding packages to the Depends field does not seem like a universal yes/no clear-cut decision, but from my understanding and reading around this topic, I feel like for the ohun package it is not necessary to include packages in the Depends field. From https://devguide.ropensci.org/building.html#pkgdependencies
See also https://github.com/leeper/Depends. For an example where it might be relevant to list a package in Depends: From https://r-pkgs.org/dependencies-mindset-background.html#sec-dependencies-imports-vs-depends
In the case of ohun, some of the packages listed under Depends also list some packages under Depends:
The full dependency tree is here: If the recommendation to move some packages to Imports instead of Depends is followed, the following issues in the examples can be resolved, for example in feature_acoustic_data.R:
response: pull request accepted comment: Minor suggested edits to the Description include:
Relevant pull request: ropensci/ohun#16 response: pull request accepted comment: I suggested some edits to the test scripts to better take advantage of the variety and precision of the functions available from testthat. Often, the tests used expect_true with a logical statement, eg. nrow(DT) == 3. This works, but when it fails, there is nothing informative about the error. Alternatively, if you set this test using expect_length, then you would get a numeric difference between expectation and result. This might make it easier to identify why or how the test is failing. response: Added a few more tests ropensci/ohun@333817c comment: I also suggest using a skip_on_windows function. response: pull request accepted comment: Writing tests can be really hard (and rewarding!) - I found these resources
Specifically in the first link: "avoid testing simple code [...] test what you're not sure about, what's fragile, ...". This is a great opportunity for you to put your deep understanding of this kind of analysis to work, because you know when results don't look right, beyond the basic "is it a data.frame", etc. To this point, a lot of tests are just length checks: do you have any other expectations of the functions, other than that they will return the expected length? I've adjusted some of the tests where I noticed an alternative function that might be more informative. Here's the pull request: response: pull request accepted Note: if packages are moved from Depends, it may require them to be explicitly loaded with library() in the tests where their functions are used. I don't really understand why the custom functions stop2 and warning2 are being used. They return the error or warning message without the call being printed. Would this possibly make it harder for a new user to troubleshoot, if they cannot see where the message originates from? stop2 is used 105 times across 14 files and warning2 is used 3 times across 3 files. For example with stop2: R: label_detection(reference = matrix(lbh_reference[-1, ]), detection = lbh_reference) and with stop: R: label_detection(reference = matrix(lbh_reference[-1, ]), detection = lbh_reference) response: I copied this from the package brms. I like the message to be less wordy. But most importantly, having the function name is not useful when the error comes from an internal function, which is often the case. Related to this, I added more detailed argument evaluations with the new internal function check_arguments() ropensci/ohun@9fdd163
**comment:** Consider using seq_len instead of 1:nrow, 1:length, 1:ncol, ...
For example:
x <- data.frame(LETTERS[1:5])
1:nrow(x)
seq_len(nrow(x))
x <- data.frame()
1:nrow(x)
seq_len(nrow(x))
response: I did this earlier, during the first round of default rOpenSci checks. Related to dependencies described above under Description:
response: OK **comment:** There is mixed use of importFrom and ::. If packages listed under Depends are moved to Imports, I would suggest using the :: syntax, because it helps a reader understand whether functions are from ohun and, if not, which package they come from. response: Done. There is commented code in
response: Removed ropensci/ohun@9fdd163 Related to @sammlapp's review, ohun could be improved by refactoring some functions. For example, there are nested functions in the following scripts:
response: I moved those functions to "internal_functions.R" ropensci/ohun@7c47092. ropensci/ohun@4ca47bd Thank you for the very useful comments. Could you please add yourself to the package DESCRIPTION? (I do not know your full name :) ) |
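The testthat point above (bare expect_true checks versus more specific expectations) can be illustrated with a small sketch. The toy detection table here is hypothetical, not taken from the package's test suite; all three expectations pass, but the latter two would report the expected versus actual values if they ever failed:

```r
library(testthat)

# Toy stand-in for a detection table used in tests (hypothetical)
dt <- data.frame(sound.files = c("a.wav", "b.wav"), start = c(0.1, 0.9))

# On failure this only reports "nrow(dt) == 2 is not TRUE"
expect_true(nrow(dt) == 2)

# On failure these report the expected vs actual values, which makes
# diagnosing a broken test much easier
expect_equal(nrow(dt), 2)
expect_length(dt$start, 2)
```

The same idea extends beyond lengths: expectations on column names or value ranges encode what a correct detection table should look like.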
Thanks for your responses to the reviews, @maRce10! @sammlapp and @robitalec could you both take a look and let me know what you think? You can use the reviewer approval template to indicate your response: https://devdevguide.netlify.app/approval2template.html |
@robitalec I split the vignette into 3, but now the tarball is 15 MB. Do you have any suggestions on how to shrink the package a bit? |
Thanks @maRce10 for the detailed response, and for accepting those PRs. I am grateful that approach worked well for you too; I found it easier to look at logical chunks of changes as PRs, personally.
Reviewer Response
Following the same headers as above: A statement of need
Installation instructions
Packaging guidelines
Nice to see the {vdiffr} use. Should {vdiffr} be added to the Suggests list (like {testthat}) instead of Depends? If so, you can remove the corresponding @importFrom call. Relatedly though, the files under tests/testthat/_snaps are quite large:
Is there any way to reduce the size of these SVGs?
Additional comments
Some files I deleted in the accepted PR have resurfaced/are still present in the repository:
stop2, warning2
OK with removing the call. My initial concern was that it would be harder for a user to follow the traceback, but I see now that the .call argument doesn't impact that. Since a user can still troubleshoot using the traceback, this should be fine.
stop2 <- function(...) {
stop(..., call. = FALSE)
}
test <- function() {
stop2('test error')
}
test()
#> Error: test error
traceback()
#> 3: stop(..., call. = FALSE) at #2
#> 2: stop2("test error") at #1
#> 1: test()
New comments
Tests
Two directories are in the Rbuildignore (https://github.com/maRce10/ohun/blob/master/.Rbuildignore#L4-L5):
This means that the tests aren't being run on CRAN, eg. https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-debian-clang/ohun-00check.html Compare this to eg. {dplyr} https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-debian-clang/dplyr-00check.html and we can see the section "checking tests"... I'm surprised CRAN didn't notice there were no tests in the submitted package, but maybe it isn't a requirement of CRAN packages; I'm not sure. It would be good to have tests running on CRAN.
Vignettes
If we run
Most importantly, a file is written
You could consider deleting files that correspond to this wildcard, then adding this new one. Two options I can think of for reducing the size of your vignettes:
For example, the original output options for
8 MB -> 7.1 MB using theme: null and removing toc_float
7.1 MB -> 3.4 MB dropping css (might be due to fonts)
3.4 MB --> 2.5 MB switching from output type html_document to html_vignette
You could keep these changes for the CRAN version, then apply the CSS independently to your site.
Here is my author chunk from other packages.
Let me know what you think of these additional comments and responses. Thank you! |
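The size reductions measured above boil down to a leaner vignette YAML header. A minimal sketch of the html_vignette variant (standard rmarkdown/knitr vignette metadata; the title and index entry are placeholders, not the package's actual vignette names) might look like:

```yaml
# Hypothetical slimmed-down vignette header (placeholders for ohun's titles)
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{Energy-based detection}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
```

Dropping the floating TOC, custom theme, and bundled CSS is what accounts for most of the savings, since html_document embeds fonts and Bootstrap assets into every rendered file.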
@maRce10 Can you take a look at @robitalec responses and revise/respond to each? @sammlapp could you take a look at the revisions and let me know what you think? You can use the reviewer approval template to indicate your response: https://devdevguide.netlify.app/approval2template.html Thanks! |
Hi @robitalec Thanks again! I added you as a contributor, moved vdiffr to "Suggests" (ropensci/ohun@50b7a55), rendered the README (ropensci/ohun@7f53096) and added a vignette gitignore file (ropensci/ohun@8bc198a). Will change the vignette size before submitting to CRAN. Any suggestions on how to keep the 2 versions of the vignettes (CRAN and site) without having to manually change the YAML header? cheers |
Reviewer Response
Final approval (post-review)
Estimated hours spent reviewing: 10 (second review: 1) |
All, I think we are very close to finishing this up! @sammlapp thank you for the reviewer response! @robitalec could you provide a final reviewer response based on the most recent changes? The link to that form is https://devdevguide.netlify.app/approval2template. @maRce10 Instead of listing the reviewers as contributors, I think it best to have them as reviewers. The role in that case would be "rev". For the purposes of rOpenSci, the vignettes do not need to adhere to the size suggestions for CRAN. Although I agree with @robitalec suggestion that you reduce their size prior to submission to CRAN. I like the suggestion to consider using these only as articles on the pkgdown site. Another place to look for more on that is https://usethis.r-lib.org/reference/use_vignette.html. The decision on that is yours and can take place after we complete the review and prior to submission to CRAN. Once we get these last few things resolved, I think we are good to go! |
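The pkgdown-only-articles route mentioned above is one answer to the two-versions-of-the-vignettes question. A sketch, assuming the usethis package and run interactively inside the package project (the article name is a placeholder):

```r
# use_article() scaffolds vignettes/articles/<name>.Rmd and ignores the
# articles folder in .Rbuildignore, so the piece is rendered on the
# pkgdown site but never shipped to CRAN. Requires the usethis package.
usethis::use_article("template_based_detection")
```

With this split, the CRAN vignettes can keep the lightweight html_vignette headers while the site articles retain the full styling, with no manual YAML switching between the two.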
Thanks @maRce10! Thanks for clarifying @jhollist that the vignettes do not need to adhere to CRAN size suggestions.
Reviewer Response
Final approval (post-review)
Estimated hours spent reviewing: 17 |
Awesome! Is there anything else you need from my end at this point? |
Just make the change in the DESCRIPTION to @robitalec's role, from ctb to rev, and go ahead and add @sammlapp as rev as well. Once that is done, we are good to go! |
OK I just did it ropensci/ohun@cfa533c |
@ropensci-review-bot approve ohun |
Approved! Thanks @maRce10 for submitting and @sammlapp, @robitalec for your reviews! 😁 To-dos:
Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors. Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @ropensci/blog-editors in your reply. They will get in touch about timing and can answer any questions. We maintain an online book with our best practice and tips; this chapter starts the 3rd section, which gives guidance for after onboarding (with advice on releases, package marketing, GitHub grooming); the guide also features CRAN gotchas. Please tell us what could be improved. Last but not least, you can volunteer as a reviewer by filling in a short form. |
Thank you all for your hard work on this. Great to get it approved! I don't have much to add. The important details are above. @maRce10 if you have any questions, feel free to ping me in this issue or you can reach me via email hollister . jeff epa . gov. Congratulations! |
@ropensci-review-bot invite me to ropensci/ohun |
Invitation sent! |
@ropensci-review-bot finalize transfer of ohun |
Transfer completed. |
Date accepted: 2023-09-13
Submitting Author Name: Marcelo Araya-Salas
Submitting Author Github Handle: @maRce10
Repository: https://github.com/maRce10/ohun
Version submitted: 0.1.0
Submission type: Standard
Editor: @jhollist
Reviewers: @sammlapp, @robitalec
Archive: TBD
Version accepted: TBD
Language: en
Scope
Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
Explain how and why the package falls under these categories (briefly, 1-2 sentences):
The package allows users to extract data from animal acoustic signal recordings in an automated manner
Who is the target audience and what are scientific applications of this package?
Scientific community working with bioacoustic data. Useful for automating the analysis of animal acoustic signals for a variety of research questions (e.g. behavior, ecology, evolution, conservation)
Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
The warbleR package has similar detection functions, but the implementation is not as user-friendly (I am also its maintainer) and it has no tools for detection diagnostics/optimization. monitoR also implements one of the detection approaches in ohun, but supplies no diagnostic/optimization tools.
(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.
Explain reasons for any pkgcheck items which your package is unable to pass.
Technical checks
Confirm each of the following by checking the box.
This package:
Publication options
Do you intend for this package to go on CRAN?
Do you intend for this package to go on Bioconductor?
Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
MEE Options
Code of conduct