ohun #568
Comments
Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon.
Dear @maRce10, many thanks for your submission and apologies for the delay. The editorial team was taking a break for the holidays and is now catching up. We'll respond shortly with a decision on whether to accept the package for review.
Hello again @maRce10, I can confirm we are happy to accept the package for review! I will shortly assign a handling editor. In the meantime, I'll run our automated checks. If any issues are flagged, feel free to start addressing them.
@ropensci-review-bot check package
Thanks, about to send the query.
🚀 Editor check started 👋
Checks for ohun (v0.1.0), git hash: c1b9be0e
Important: All failing checks above must be addressed prior to proceeding.
Package License: GPL (>= 2)
1. Package Dependencies
Details of Package Dependency Usage (click to open)
The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.
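The per-package call tallies that follow can be reproduced locally with the pkgstats package, as the report notes. A minimal sketch, assuming pkgstats is installed and that "path/to/ohun" (a placeholder) is a local clone of the repository:

```r
library(pkgstats)

# Run the full static analysis on a local clone of the package
# ('path/to/ohun' is a placeholder path):
s <- pkgstats("path/to/ohun")

# 'external_calls' tags each function call with the package it
# resolves to, plus its file and line location:
head(s$external_calls)

# Tallying calls per package reproduces the summary counts below:
sort(table(s$external_calls$package), decreasing = TRUE)
```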
Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)' and examining the 'external_calls' table.

base: c (33), lapply (27), length (25), sapply (24), unique (22), data.frame (13), rbind (13), is.na (12), unlist (12), do.call (11), if (11), list (11), paste0 (11), sum (11), names (10), drop (9), nrow (9), table (9), for (8), grep (7), strsplit (7), as.data.frame (6), mean (6), which (6), col (5), file (5), ncol (5), round (5), setdiff (5), match.call (4), min (4), seq (4), cos (3), eval (3), getOption (3), paste (3), proc.time (3), sin (3), split (3), any (2), apply (2), as.list (2), by (2), call (2), deparse (2), diff (2), expand.grid (2), matrix (2), scale (2), abs (1), as.numeric (1), cat (1), colMeans (1), colnames (1), complex (1), file.path (1), floor (1), grepl (1), ifelse (1), is.null (1), is.numeric (1), max (1), order (1), pi (1), plot (1), regexpr (1), rep (1), rownames (1), seq.int (1), substring (1), summary (1), t (1), try (1), vector (1), which.min (1)
warbleR: pblapply_wrblr_int (17), read_sound_file (8), sound_pressure_level (4), info_sound_files (3), duration_sound_ (2), duration_sound_files (2), gaps (2), duration_wavs (1), envelope (1), stft_wrblr_int (1)
ohun: FUN (10), diagnose_detection (2), find_templates (2), internal_feature_reference (2), spc_FUN (2), XC_FUN (2), detect_FUN (1), energy_detector (1), feature_acoustic_data (1), label_detection (1), t2xy (1)
sp: over (3), Polygon (3), Polygons (3), Spatia (3), SpatialPoints (3), SpatialPolygons (1)
graphics: lines (5), par (4), clip (1), grid (1)
methods: is (10)
stats: poly (4), prcomp (2), smooth (2), na.omit (1), step (1)
parallel: makeCluster (4), makePSOCKcluster (3)
igraph: as_data_frame (1), graph_from_incidence_matrix (1), max_bipartite_match (1)
seewave: env (2), duration (1)
utils: packageVersion (2), head (1)
crayon: italic (1), silver (1)
grDevices: cm (1)
rlang: call_args (1)
tuneR: writeWave (1)

NOTE: Some imported packages appear to have no associated function calls; please ensure with the author that these 'Imports' are listed appropriately.

2. Statistical Properties
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
Details of statistical properties (click to open)
The package has:
Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the pkgcheck package.
2a. Network visualisation
Click to see the interactive network visualisation of calls between objects in the package.
3.
id | name | conclusion | sha | run_number | date |
---|---|---|---|---|---|
3864222159 | pages build and deployment | success | c1b9be | 94 | 2023-01-07 |
3864222174 | test-coverage | success | c1b9be | 7 | 2023-01-07 |
3899091232 | tic | success | c1b9be | 49 | 2023-01-12 |
3b. goodpractice results
R CMD check with rcmdcheck
R CMD check generated the following note:
- checking installed package size ... NOTE
installed size is 5.9Mb
sub-directories of 1Mb or more:
doc 5.2Mb
R CMD check generated the following check_fails:
- cyclocomp
- no_description_depends
- no_import_package_as_a_whole
- rcmdcheck_reasonable_installed_size
Test coverage with covr
ERROR: Test Coverage Failed
Cyclocomplexity with cyclocomp
The following functions have cyclocomplexity >= 15:
function | cyclocomplexity |
---|---|
energy_detector | 52 |
label_detection | 35 |
template_correlator | 34 |
split_acoustic_data | 26 |
feature_reference | 25 |
get_envelopes | 21 |
optimize_energy_detector | 20 |
summarize_diagnostic | 20 |
diagnose_detection | 19 |
label_spectro | 18 |
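For context, cyclomatic complexity counts the independent paths through a function, and the cyclocomp package used by this report can be run directly. A small illustration with toy functions (not ohun code; the package path below is a placeholder):

```r
library(cyclocomp)

# A straight-line function has the minimum complexity:
simple <- function(x) x + 1

# Each decision point (if, loop, &&, ||) adds an independent path:
branchy <- function(x) {
  if (is.na(x)) return(NA_character_)
  if (x > 0) {
    if (x > 10) "large" else "small"
  } else {
    "non-positive"
  }
}

cyclocomp(simple)   # low
cyclocomp(branchy)  # higher: one unit per decision point

# cyclocomp_package_dir("path/to/ohun") tabulates every function in a
# source directory, producing a table like the one above.
```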
Static code analyses with lintr
lintr found the following 558 potential issues:
message | number of times |
---|---|
Avoid 1:length(...) expressions, use seq_len. | 2 |
Avoid 1:ncol(...) expressions, use seq_len. | 3 |
Avoid 1:nrow(...) expressions, use seq_len. | 34 |
Avoid library() and require() calls in packages | 3 |
Avoid using sapply, consider vapply instead, that's type safe | 27 |
Lines should not be more than 80 characters. | 488 |
unexpected symbol | 1 |
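The seq_len lints above flag a real edge case rather than a style nitpick: `1:n` counts backwards when `n` is 0. A short base-R illustration:

```r
# With a zero-row data frame, 1:nrow(df) yields c(1, 0), so a loop
# body runs twice on empty input instead of not at all:
df <- data.frame(x = numeric(0))
1:nrow(df)         # 1 0
seq_len(nrow(df))  # integer(0): the loop correctly runs zero times

# The sapply lint is similar: vapply pins down the result type and
# shape, while sapply may silently return a list or matrix:
vapply(c("a", "bb", "ccc"), nchar, integer(1))
```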
4. Other Checks
Details of other checks (click to open)
✔️ Package contains the following (potentially) obsolete packages:
- sp
See our Recommended Scaffolding for alternatives.
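Since sp has been retired, the usual replacement is sf. A hedged sketch of how an sp-style point-in-polygon test (the kind done with sp::over and SpatialPolygons) might look in sf, using illustrative data rather than ohun's actual code:

```r
library(sf)

# A unit-square polygon (rings must be closed, so the first
# coordinate is repeated at the end):
square <- st_polygon(list(rbind(
  c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)
)))

# Two test points: one inside the square, one outside:
pts <- st_sfc(st_point(c(0.5, 0.5)), st_point(c(2, 2)))

# st_intersects() plays the role of sp::over() for membership tests;
# with sparse = FALSE it returns a plain logical matrix:
st_intersects(pts, st_sfc(square), sparse = FALSE)
```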
Package Versions
package | version |
---|---|
pkgstats | 0.1.3 |
pkgcheck | 0.1.0.32 |
Editor-in-Chief Instructions:
Processing may not proceed until the items marked with ✖️ have been resolved.
Hi, thanks for reviewing my package. I'm not sure why you got problems with coverage: it shows 78% coverage in Codecov (https://app.codecov.io/github/maRce10/ohun) as well as in the coverage status badge (https://github.com/maRce10/ohun), so I am not sure how to proceed now. Cheers, Marcelo
Hello @maRce10. Indeed, that's odd. Let us look into it and get back to you.
I replaced 1:n with seq_len() and sapply() with vapply() where possible, replaced cat() with message(), and removed unused dependencies/suggests: 6ea380d
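For reference, the cat() to message() change matters because the two are handled differently by callers. A small illustration (noisy() is a hypothetical function, not ohun code):

```r
# cat() writes directly to stdout and cannot be silenced by callers.
# message() signals a condition on stderr, so users can opt out:
noisy <- function() {
  message("running detector...")
  invisible(42)
}

noisy()                    # prints the message
suppressMessages(noisy())  # silent; still returns 42 invisibly
```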
@maRce10 It took a bit of digging, but the test failures are because
Many of the libraries given in your actions file will automatically pull in several of the others, so you should reduce this set to the minimal possible length, either by checking distribution listings (at least for Debian or Ubuntu) or by iterating in a Docker container. Please ping here when your DESCRIPTION has been updated, and let us know what else we might need to install. Thanks.
OK, thanks. These non-standard dependencies are not from the ohun package but from a package it relies heavily on, warbleR. These dependencies are now included in the DESCRIPTION file of warbleR on GitHub. Should I also add them to ohun?
No, if they're from another package then they're dependencies of that package, not yours. I'll just add flac to our system and ping here when checks can be re-run.
@maRce10 Our system has been rebuilt, and your tests all run successfully now. You may call the bot to run the checks whenever you like.
@ropensci-review-bot assign @jhollist as editor
Assigned! @jhollist is now the editor
@ropensci-review-bot check package
Thanks, about to send the query.
🚀 Editor check started 👋
Checks for ohun (v0.1.0), git hash: 62cdc87d
(Checks marked with 👀 may be optionally addressed.)
Package License: GPL (>= 2)
1. Package Dependencies
Details of Package Dependency Usage (click to open)
The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.
Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)' and examining the 'external_calls' table.

base: c (35), lapply (27), length (26), unique (22), vapply (17), sum (14), data.frame (13), rbind (13), unlist (13), is.na (12), do.call (11), if (11), list (11), paste0 (11), names (10), seq_len (10), drop (9), nrow (9), table (9), for (8), mean (8), strsplit (8), grep (7), numeric (7), sapply (7), as.data.frame (6), which (6), col (5), ncol (5), round (5), setdiff (5), any (4), file (4), match.call (4), min (4), seq (4), cos (3), eval (3), getOption (3), logical (3), proc.time (3), sin (3), split (3), apply (2), as.list (2), by (2), call (2), deparse (2), diff (2), expand.grid (2), matrix (2), paste (2), scale (2), abs (1), as.numeric (1), character (1), colMeans (1), colnames (1), complex (1), file.path (1), floor (1), grepl (1), ifelse (1), is.null (1), is.numeric (1), match.arg (1), max (1), order (1), pi (1), plot (1), regexpr (1), rep (1), rownames (1), seq.int (1), substring (1), summary (1), t (1), try (1), vector (1), which.min (1)
warbleR: pblapply_wrblr_int (17), read_sound_file (8), sound_pressure_level (4), info_sound_files (3), duration_sound_ (2), duration_sound_files (2), gaps (2), duration_wavs (1), envelope (1), stft_wrblr_int (1)
ohun: FUN (10), diagnose_detection (2), find_templates (2), internal_feature_reference (2), spc_FUN (2), XC_FUN (2), colortext (1), detect_FUN (1), energy_detector (1), feature_acoustic_data (1), label_detection (1), message2 (1), t2xy (1)
sp: over (3), Polygon (3), Polygons (3), Spatia (3), SpatialPoints (3), SpatialPolygons (1)
graphics: lines (5), par (4), clip (1), grid (1), text (1)
stats: poly (4), prcomp (2), smooth (2), na.omit (1), step (1)
parallel: makeCluster (4), makePSOCKcluster (3)
igraph: as_data_frame (1), graph_from_incidence_matrix (1), max_bipartite_match (1)
seewave: env (2), duration (1)
utils: packageVersion (2), head (1)
cli: style_italic (1)
grDevices: cm (1)
methods: as (1)
rlang: call_args (1)
tuneR: writeWave (1)

NOTE: Some imported packages appear to have no associated function calls; please ensure with the author that these 'Imports' are listed appropriately.

2. Statistical Properties
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
Details of statistical properties (click to open)
The package has:
Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the pkgcheck package.
2a. Network visualisation
Click to see the interactive network visualisation of calls between objects in the package.
3.
id | name | conclusion | sha | run_number | date |
---|---|---|---|---|---|
3944417614 | pages build and deployment | success | 62cdc8 | 103 | 2023-01-18 |
3944417809 | test-coverage | success | 62cdc8 | 16 | 2023-01-18 |
3945647101 | tic | success | 62cdc8 | 64 | 2023-01-18 |
3b. goodpractice results
R CMD check with rcmdcheck
R CMD check generated the following note:
- checking installed package size ... NOTE
installed size is 5.9Mb
sub-directories of 1Mb or more:
doc 5.2Mb
R CMD check generated the following check_fails:
- cyclocomp
- no_description_depends
- no_import_package_as_a_whole
- rcmdcheck_reasonable_installed_size
Test coverage with covr
Package coverage: 78.11
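The coverage figure can be reproduced locally with the covr package. A sketch, assuming a local clone at a placeholder path:

```r
library(covr)

# Runs the package tests and records which lines they execute
# ('path/to/ohun' is a placeholder):
cov <- package_coverage("path/to/ohun")

percent_coverage(cov)  # the overall percentage reported above
# zero_coverage(cov) lists lines never executed by any test
# report(cov) renders an interactive per-file coverage report
```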
Cyclocomplexity with cyclocomp
The following functions have cyclocomplexity >= 15:
function | cyclocomplexity |
---|---|
energy_detector | 52 |
label_detection | 35 |
template_correlator | 34 |
split_acoustic_data | 26 |
feature_reference | 25 |
get_envelopes | 21 |
optimize_energy_detector | 20 |
summarize_diagnostic | 20 |
diagnose_detection | 19 |
label_spectro | 18 |
Static code analyses with lintr
lintr found the following 512 potential issues:
message | number of times |
---|---|
Avoid 1:nrow(...) expressions, use seq_len. | 3 |
Avoid library() and require() calls in packages | 3 |
Avoid using sapply, consider vapply instead, that's type safe | 10 |
Lines should not be more than 80 characters. | 495 |
unexpected symbol | 1 |
4. Other Checks
Details of other checks (click to open)
✖️ Package contains the following obsolete packages:
- sp
See our Recommended Scaffolding for alternatives.
Package Versions
package | version |
---|---|
pkgstats | 0.1.3 |
pkgcheck | 0.1.1.11 |
Editor-in-Chief Instructions:
This package is in top shape and may be passed on to a handling editor
🥳
@ropensci-review-bot submit review #568 (comment) time 10
Logged review for sammlapp (hours: 10)
@ropensci-review-bot submit review #568 (comment) time 15
Logged review for robitalec (hours: 15)
@maRce10: please post your response with @ropensci-review-bot submit response. Here's the author guide for responses: https://devguide.ropensci.org/authors-guide.html
Apologies for the delay. Many thanks to both reviewers; I am really grateful for all your time and effort. I have made changes and responded to your comments/suggestions below, quoting the commit in which each change was made.

Response to @sammlapp

@ropensci-review-bot submit response #568 (comment)

comment: Vignette(s): demonstrating major functionality that runs successfully locally - I wasn't able to see or access the vignette in RStudio with vignette()
response: I have added a link to each of the 3 vignettes to the README. ropensci/ohun@f857949

comment: I don't typically use R, so some things are not obvious to me. Where can I find the full API documentation? I opened the /docs/index.html to view the docs by rendering the HTML locally, but presumably they are online somewhere. I would recommend linking to the documentation near the top of the README.
response: I added a link to the package documentation at the top of the README. ropensci/ohun@1ffc441

comment: The use of the word "reference" to mean "audio annotations" or "labels" is unfamiliar to me; I'm not sure if this word is typically used in the related warbleR package.
response: Using the term "reference" for the ground-truth annotations seems to be common in the field (see for instance Mesaros, A., Heittola, T., Virtanen, T., & Plumbley, M. D. (2021). Sound event detection:

comment: The package should give the user some expectation for under what circumstances the detection methods are likely to be effective. Although it mentions that high SNR is required for amplitude thresholding and stereotyped sounds are necessary for template matching, there are other requirements of the data for these approaches to be effective.
response: I added that time-frequency overlap with other sounds can affect performance. I am not aware of any other main requirement.
ropensci/ohun@d85c0e0

comment: to improve documentation and function/variable/argument names to better match the action that is actually taken by a function;
response: The function filter_detection() has been renamed consensus_detection(). ropensci/ohun@c1e108d

comment: to consider a moderate refactor of code to increase modularity, reduce redundancy, and aim to follow the Single Responsibility Principle;
response: A modular approach was used for checking arguments. Note that all tasks shared by different functions are carried out by a single function. For instance, diagnosing detection performance in optimize_template_detector() or optimize_energy_detector() is done by calling diagnose_detection() and summarize_diagnostic(); diagnose_detection() also calls label_detection(), which can also be called by users, and get_templates() is called within template_detector(). Now all functions call check_arguments(). ropensci/ohun@c1e108d

comment: increase code readability by breaking complex logical lines into separate lines with additional comments, plus comments describing the general purpose of large blocks of code.
response: I simplified the code in diagnose_detection() and summarize_diagnostic(). maRce10/ohun@69d81da42cc5b831779be1ebb3aaa2e794464877e. Note that functions that deal with envelopes tend to avoid duplicating envelope objects, to avoid using too much memory.

comment: Template-based cross correlation is implemented in monitoR, while energy-based detection is implemented in bioacoustics (the implementation seems more generic and featured than the one in ohun). I wonder if the proposed value of ohun is mostly in its specific ability to interact with the formats of annotations produced by warbleR. If so, the top-level documentation and README could clarify this purpose.
response: monitoR offers no way to evaluate detection performance. ohun does, and can even be applied to optimizing detections.
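The shared argument-checking approach mentioned above can be sketched generically. check_cores and my_detector here are hypothetical illustrations of the pattern, not ohun's actual internals:

```r
# One validator owns the rule; every user-facing function delegates
# to it, so a change to the rule propagates consistently:
check_cores <- function(cores) {
  if (!is.numeric(cores) || length(cores) != 1 || cores < 1) {
    stop("'cores' must be a single number >= 1", call. = FALSE)
  }
  invisible(as.integer(cores))
}

# A hypothetical user-facing function delegating its validation:
my_detector <- function(files, cores = 1) {
  cores <- check_cores(cores)
  # ... detection work would go here ...
  cores
}

my_detector(files = character(0), cores = 2)  # returns 2L
```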
comment: Specifically, when I started using the program without having any experience with warbleR, it was not clear to me how to create annotation tables or generally what the utility of most of the package functions is. Writing a description of the characteristics of data that this package expects as input would be helpful. For instance, "This package provides utilities for comparing detection and annotations of audio events described by frequency and time boxes. Creating annotations containing such frequency-time boxes [is possible in the warbleR package?]."
response: I added a paragraph similar to that proposed by the reviewer to the README. ropensci/ohun@d4edd0d
comment: In the language of rOpenSci software recommendations, this is a sign that the functions are not "modular". In other places it might be seen as a violation of the Single Responsibility Principle (SRP). For example, the label_spectro function violates SRP because it doesn't just label a spectrogram: it computes a spectrogram, plots it, then overlays labels, and might additionally compute and plot an amplitude signal. As another example, get_envelopes.R and energy_detector.R seem to both implement the process of getting an amplitude envelope, rather than using shared modular functionality specific to this purpose. Though it would take a significant refactoring effort, I believe creating a more modular code base that aims to achieve the SRP could improve the package and user experience overall.
response: The function label_spectro() was created for making the graphs in the vignette. This is stated in the description of the function in bold font ("Created for graphs included in the vignette, and probably only useful for that or for very short recordings"). So it is not called by any other function, nor does it call any other ohun function; it is just for the vignette. Not sure how to make this more evident. ropensci/ohun@f857949

comment: In general, the code is difficult to read because of the formatting and smushing lots of logic into a single functional "line" (which is broken into many lines anyway due to formatting). Separating out logical steps into sequential lines, with comments where appropriate, would make the code easier to read/understand/maintain. However, it would take a major refactoring effort.
response: As mentioned above, I simplified the code in diagnose_detection() and summarize_diagnostic(), which were the most cumbersome. maRce10/ohun@69d81da42cc5b831779be1ebb3aaa2e794464877e.
Note that functions that deal with envelopes tend to avoid duplicating envelope objects, to avoid using too much memory.

comment: Consider adding units to variable names, e.g. hold_time_ms
response: I think that is a good suggestion, although it is extremely uncommon in R packages, so I would rather not include units in names, to match what R users are expecting.

split_acoustic_data.R:
comment: documentation should list supported sound file types
response: I added supported formats to all functions working on sound files. ropensci/ohun@63f2be1
comment: documentation should list required columns of table
response: The argument description already mentioned that:
comment: documentation should say what will happen with incomplete audio clip at end of file
response: Not sure I follow. I think there is only one thing that can happen: create the short clip (not really incomplete).
comment: comments denoting the purpose of larger sections of code are helpful, e.g. # INPUT VALIDATION
response: I added more comments. ropensci/ohun@05daadb

summarize_diagnostic.R:
comment: The code style makes it difficult to review the logic or infer the operation performed by much of this code. For example, lines 131-141 of summarize_diagnostic.R. In general I'm not reading this code base to check for logical errors, but I think that the code style will make it difficult for others (or the authors) to debug, extend, understand or modify the code. In this example, separating the logical operation into a few separate and sequential lines of code could make it easier to digest, and would provide opportunities to add comments for any individual steps whose purposes are not obvious. (For example in this code, a comment could explain why we would want a weighted mean of true positive durations.)
response: As mentioned above, I simplified the code to make it easier to understand (and debug). maRce10/ohun@69d81da42cc5b831779be1ebb3aaa2e794464877e

comment: Input validation is redundant across many functions, e.g. validating the cores argument. Rather than repeating code in several modules, use modular helper functions. That way, if you change something it is consistent across the codebase.
comment: Comments should explain not just what's being done but also why (rather than re-stating the code in English). For example, instead of # set bottom and top freq if not supplied you could explain that the bottom and top frequency are needed to make a spectrogram; if they're not supplied, we set them to 0 and Ns=sample_rate/2. Another example: in template_detector.R L177, "# if no detections" is not a helpful comment - the comment should explain what will happen and why.
response: More comments were added. ropensci/ohun@05daadb and ropensci/ohun@4e3449d

comment: Since I'm not an R programmer, I might not understand conventions for writing R modules. However, I don't think it's a good idea to define your helper functions inside another function. For example, L218 of template_correlator.R begins a definition of a function to create a spectrogram. Even if this is not part of the API (it's a hidden or internal function), I think it should be defined outside of the template_correlator function (1) because it doesn't make sense that it would be re-defined each time template_correlator is called, and (2) because it makes reading the code for template_correlator unnecessarily long.
response: I moved the internal function out of template_correlator() ropensci/ohun@7c47092. Same thing for energy_detector() ropensci/ohun@4ca47bd

comment: I'm not sure how namespaces work in R, but it would be helpful for a code reader to know where functions are coming from. For instance, get_templates.R calls find_templates, which is not explicitly imported or defined anywhere in the file.
It apparently can be accessed because it exists in another .R file in the package, or because all functions from the package have been imported into the default namespace.
response: Internal functions are common in R. They can be called using namespace:::function. All internal functions in ohun are found in internal_functions.R

Notes on the Getting Started tutorial page
comment: The "main features" description is jargon-y and could be written in broader language. For example, "The use of signal detection theory indices to evaluate detection performance" -> "Evaluate detection accuracy with precision, recall, and other metrics". As is, I feel like it obscures the functionality of the package. As another example, I wasn't sure what "Optimize detection routines based on reference annotations" meant - after reading further, I think it basically means "Adjust detection parameters to optimize accuracy on a labeled set of recordings"
response: I have included "Adjust detection parameters to optimize accuracy" in the package description (ropensci/ohun@7617753). However, "Evaluate detection accuracy with precision, recall, and other metrics" assumes that users know what those metrics are; most biologists working on bioacoustics are not familiar with those terms.
comment: "Template-based detection" It's not true that template-based detection "doesn't depend on the signal-to-noise ratio" - high background noise or low signal level will decrease the accuracy of this approach
response: I reworded this: "As it is less affected by signal-to-noise ratio it's more robust to higher levels of background noise" ropensci/ohun@3fd70a7
comment: outputs (e.g., tables) in the Getting Started tutorial are very hard to interpret in the current form; could they be rendered as human-readable tables?
response: I don't think they are unreadable (I can read them). I agree they could be printed in a friendlier way, but that would require adding more dependencies (e.g.
knitr, kableExtra).
comment: An example of how to analyze a large set of files, for instance all files in a folder-subfolder structure, would be helpful. I had to do some googling to figure out how to do "globbing", i.e. find all paths matching a pattern.
response: I agree, but the amount of example data that can be included is very limited for R packages.

Notes on API documentation and variable/function names
comment: Throughout the codebase, I recommend more informative variable and argument names: for example, in the function merge_overlaps, to a new user, pb is not interpreted as "progress bar"
response: I tried to follow the same variable names used by other packages (including those creating progress bars).
comment: Throughout the documentation: why are arguments that are numbers/strings called "numeric vectors of length 1" and "character vectors of length 1"? This makes me think I should pass c(0) or [0] instead of 0, for instance, as an argument.
response: c(1) and 1 are equivalent and are both numeric vectors of length 1 (0 is not used as an index in R). Not sure how else I can call them.

get_envelopes:
comment: smooth argument - should state the specific function/implementation of smoothing used
response: I added that it is done using the function envelope from warbleR. ropensci/ohun@6a2e11e
comment: paths and files arguments: using the current path as default is fragile and does not match the conventions used elsewhere in the package. I recommend making the argument(s) mandatory. It is also unclear that specifying a folder but not files means that the function should attempt to analyze any/all audio files in the folder (only a specific file extension? how does it know which files to analyze or not?). Is that what happens if files=NULL?
response: I agree that making path mandatory would be safer. However, many R users like to set the working directory (using setwd()) before running analyses.
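For the reviewer's "globbing" question, base R covers this without extra packages. A sketch ("path/to/recordings" is a placeholder folder):

```r
# Collect every .wav file under a folder, including subfolders:
wavs <- list.files(
  path        = "path/to/recordings",  # placeholder path
  pattern     = "\\.wav$",
  recursive   = TRUE,
  full.names  = TRUE,
  ignore.case = TRUE
)

# basename()/dirname() recover file names and containing folders,
# and 'wavs' can then be passed to a 'files'-style argument.
```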
That's why it's common for functions to look in the current working directory by default. All 5 functions with the argument files use the same default value (NULL). I added that: "If not supplied the function will work on all sound files (in the supported format) in 'path'." ropensci/ohun@a00e687

energy_detector:
comment: again, I don't think making files optional and defaulting to all audio in the working directory is a good idea.
response: Agree, but again it's friendly for most R users.

label_spectro:
comment: can you provide an example of how to use this function with an audio file? If I want to run it on anything other than a simulated array of numbers or the two built-in audio files, I need some prior line of code to load the audio file.
response: As mentioned above, I made this function just to get a nice-looking visualization of the data in the vignette; I don't think it is very helpful for most real data. I have this info in bold in the description: "Created for graphs included in the vignette, and probably only useful for that or for very short recordings". However, people seem to get confused. I thought about just having it inside the vignette without including it as a function in the package, but users always ask how those figures are made, so that's why I included it as a function.
comment: bp argument (in several functions): how is the signal bandpassed? Should state specifically, for example, a 9th order Butterworth bandpass filter
response: I added that: "Bandpass is done using the function code{\link[seewave]{ffilter}}, which applies a short-term Fourier transformation to first create a spectrogram in which the target frequencies are filtered and then is back-transformed into a wave object using a reverse Fourier transformation."
ropensci/ohun@59a984f

feature_reference:
comment: "Extract quantitative features of references" - from the function name and short description, I expected this to extract acoustic features of the annotated audio events themselves, rather than summary statistics of the annotations per se. Perhaps the description could be "Summarize temporal and frequency dimensions of annotations and gaps" or something similar. Ideally, the function's name would reflect its functionality, something like summarize_annotation_boxes.
response: I adopted the suggested description and added an alternative name for the function: summarize_reference(). ropensci/ohun@e0f0c75
comment: gap times: I tried using this summary to choose a hold.time argument for energy_detector, but the min, max, and mean are not very informative, because what matters is the typical gap time between close-together annotations. Maybe the median or 25th percentile, or some other statistic, would be more informative.
response: Yeah, that is a tricky thing to get and depends a lot on duty cycle and signal duration. But I don't think there is a single metric that can get a good estimate. min and mean still

feature_acoustic_data:
comment: Similarly, feature_acoustic_data sounds like it would measure some acoustic feature. Since the function returns information about the file types, I would call it something like summarize_audio_file_characteristics
response: I added an alternative name for the function: summarize_acoustic_data(). ropensci/ohun@6a54b30

filter_detection:
comment: I'm confused as to what this function is doing. The documentation says it "removes ambiguous detections (split and merged detections)", but the detailed documentation should explain what is meant by split and merged detections. I'm also confused whether this is meant to be applied to human-created annotations, automatically-generated detections, or can be used for either purpose.
response: I added an alternative name for the function: consensus_detection(). I also added that: "Useful when several detections match the same reference as in the case of template detection with multiple templates (see \code{\link{template_detector}})." ropensci/ohun@a6e8e3a

optimize_template_detector:
comment: this function compares performance across parameter values; it does not itself optimize the algorithm, so its name is a bit misleading. Perhaps template_parameter_sweep is closer. I'm not sure what parameter values it sweeps over after reading the documentation, since it requires the user to have already run template_correlator. Also, I think the part of the documentation that says "Those parameters can be later used for performing a more efficient detection using optimize_template_detector." must be a typo; perhaps it means "using template_correlator" instead?
response: Thanks, I fixed the typo. Naming functions is always hard; to me, optimize_template_detector says more about what is going on than template_parameter_sweep.

label_spectro:
comment: this is a plotting function, so it would be better named something like plot_spec_and_labels
response: Again, naming functions is a tricky and subjective thing. In addition, as I said above, this function is mostly used to create spectrograms in the vignette.

get_templates:
comment: can you explain what features are being used to create this principal component analysis of template similarity? Also, after understanding what this function does, it seems like it would be better named something like select_representative_templates or select_representative_annotations, since it chooses representative templates from a set of annotations.
response: I expanded the explanation in the function documentation: "the function will estimate it by measuring several acoustic features using the function \code{\link[warbleR]{spectro_analysis}} (features related to energy distribution in the frequency and time domain as well as features of the dominant frequency contours, see \code{\link[warbleR]{spectro_analysis}} for more details) and summarizing it with Principal Component Analysis (after z-transforming parameters) using the function \code{\link[stats]{prcomp}}" (ropensci/ohun@293efe7). I think get_templates gives users a sense of what the function does. They should always read the documentation to get a full sense of what a function does.
split_acoustic_data
comment: it should be possible, if not required, to list the audio files to split. It seems like I must split every file within a folder in the current implementation. response: Explained above. Behavior is due to common habits of R users.
label_detection
comment: the function name does not match what it does. It seems this function evaluates detections against a set of annotations/labels/"references", so perhaps it should be called evaluate_detections or compare_annotations_and_detections response: It adds labels in a new column to each detection. To me the name makes sense. evaluate_detections sounds more like diagnose_detections, which is another function.
template_correlator
comment: it's unclear to me how to use the files argument. I don't know what 'X' is as referred to in the documentation, and I would expect to simply be able to pass a vector of file paths. response: Explained above. Behavior is due to common habits of R users. There is no 'X' argument in template_correlator()
full_spectrograms
comment: it was surprising to me that this function automatically saved image files to the current working directory, rather than asking for a save dir or displaying them in the console.
This doesn't seem like a good practice, because I might not want to save the results and, if I do, I'd like to choose where they go. response: That function is from another package. comment: Other note: I would be very surprised if template matching is faster than amplitude-based detection, as claimed in the "Improving detection speed" section. Amplitude-based detection is about the lightest task you can perform on a time signal, while 2D cross-correlation is relatively heavy. response: That has been my experience in R. It seems that getting and manipulating envelopes in R is particularly slow.
Errors during usage
comment: When I try to run get_templates on my own file set, I get this error:
I tried to run get_templates on a set of local files, but it seems somehow one of the features has no variance and this is causing the PCA to fail. I wasn't able to resolve this. response: I added code to remove zero-variance variables from the PCA input data ropensci/ohun@a16ff19 comment: Running template_correlator then template_detector on the results produced a table containing NA values for many rows in the start, end, and scores columns. I did not expect this, and it caused errors when I tried to run other analyses on these detections. response: That can happen when no detections were found for some sound files. Just remove the NAs. comment: Once I ran template_correlator and template_detector, I wasn't sure how to inspect my detections, although it seemed like I should be able to use full_spectrograms or some other package function. I eventually realized that if I used na.omit(selections) to drop rows with NA values, I could run full_spectrograms to look at the results. I wonder if there is a way to more selectively choose to look at some of the detections alone, rather than having to view the entire audio files (imagine if I had a 1-hour audio file with three 1-second detections). response: full_spectrograms is not an ohun function. I added a new function plot_detection() to visually inspect detections ropensci/ohun@4bca6b2 comment: When I first ran label_spectro, I got the message "Error in loadNamespace(x) : there is no package called ‘viridis’". I'm not sure if this suggests a missing dependency. I resolved the issue by running install.packages("viridis") response: I added a checking step to make sure users have viridis installed ropensci/ohun@d6a4110
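The zero-variance fix described above can be sketched in base R. This is a hypothetical feature matrix, not the package's actual implementation: constant columns are dropped before calling prcomp, since scale. = TRUE cannot rescale a zero-variance variable.

```r
# Hypothetical acoustic-feature matrix; column "flat" has zero variance,
# which would break prcomp(scale. = TRUE)
feats <- data.frame(
  dur  = c(0.3, 0.5, 0.4, 0.6),
  peak = c(2.1, 3.4, 2.8, 3.0),
  flat = c(1, 1, 1, 1)
)

# Keep only columns with non-zero variance before running the PCA
keep <- vapply(feats, function(x) stats::var(x) > 0, logical(1))
pca  <- stats::prcomp(feats[, keep, drop = FALSE], scale. = TRUE)
```

Here `pca$x` ends up with one row per sound and one column per retained (non-constant) feature.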
response: I added the error message ropensci/ohun@9fdd163 comment: Add a more helpful error message for accidental use of Hz instead of kHz: if my detection table has Hz instead of kHz and I run get_templates, the error message says "Caution! This is a null spectrum. The spectral entropy is null!" response: There is no easy way to tell if users mean Hz or kHz, unfortunately. comment: When I run label_spectro with env = TRUE, I get an error message: response: Not sure what the problem is. To me it runs, and it is checked by testthat. The full name of the argument is "envelope".
Response to @robitalec
A statement of need
comment: Please consider adding a section to the README that 1) clarifies how ohun fits into a broader workflow of working with sound events and 2) states whether any other packages or approaches are available that share similar functionality with ohun. For 1), this could be adding a few sentences describing what typical steps (and related packages) would come before and after using ohun. You could also add the workflow from the vignette to the README in this section. Consider a new user looking at the Reference page on the website or the list of available functions in ohun... how do they fit together? response: I added that "The implementation of detection diagnostics that can be applied to both built-in detection methods and to those obtained from other software packages makes the package ohun a useful tool for conducting direct comparisons of the performance of different routines. In addition, the compatibility of 'ohun' with data formats already used by other sound analysis R packages (e.g. seewave, warbleR) enables the integration of 'ohun' into more complex acoustic analysis workflows in a popular programming environment within the research community." ropensci/ohun@78658c7
Installation instructions
comment: The dependency seewave brings system dependencies that might not be immediately obvious to a new user of ohun.
It might be helpful to include a link to seewave's installing and troubleshooting system requirements page, for example: https://rug.mnhn.fr/seewave/inst.html response: I added "Further system requirements due to the dependency seewave may be needed. Take a look at this link for instructions on how to install/troubleshoot these external dependencies." ropensci/ohun@b237f58
Vignette(s)
comment: It might be worth splitting this vignette into multiple vignettes. At its current length, I find it a bit hard to digest all the moving pieces and options. I personally find it easier to learn and use R packages if there are many, shorter vignettes that describe the different major functions or steps offered. You could have maybe four vignettes: introduction, energy based detection, template based detection, and diagnosing/improving. The introductory vignette could expand on how ohun integrates with other R packages in this field, and provide the background given in the sections "Automatic sound event detection" and "Signal detection theory applied to bioacoustics". The following vignettes could then use the sections already written for the current vignette. This might just be my opinion - so if you feel like it is better for ohun users to have everything in one place, let me know. Throughout the vignette, the warnings and messages were not shown, due to the chunk options warning=FALSE and message=FALSE. I would recommend leaving all messages and warnings in the rendered vignette, because a user may be surprised and confused to see them appear locally when they are not shown in the vignette. response: I accepted the commit to show warnings and messages in the vignette. I also split the vignette into 3: introduction, energy based detection, template based detection.
ropensci/ohun@1b9c95f
Community guidelines
comment: Please consider also including the link to the repository (https://github.com/maRce10/ohun) in the URL field in the DESCRIPTION, as it should have the most consistently up-to-date information for the package, for example in the case that the website is not updated. response: Done (accepted the commit)
Functionality
Packaging guidelines
comment: Looks good: package name, snake_case, object_verb function naming, citation, and code metadata. Consider using the following options for keeping the code metadata up to date and for keeping the CITATION file up to date. Code style improved with a recent commit using the styler package. Before that, logical statements with if else were not easy to read. It is not clear whether or not functions in ohun are pipeable (see the second point here). If you think that is a relevant option, consider including an example in the vignette that shows this. response: They are not planned to be pipeable. comment: As mentioned above, the README could use more information about how ohun fits into the broader ecosystem, comparing ohun to other packages that provide related functionality, and details on the system requirements for seewave. The README is also missing a brief demonstration of functionality (see the sixth point here). The long-form examples in the vignettes are useful for a user that is already intending to use ohun, but shorter examples in the README help new users understand what ohun is offering. response: I added more details about how it fits into the broader ecosystem following another comment above. I agree that some examples in the README might be helpful. However, it is hard to pick a short example that can fit in the README, as sound event detection tends to require several steps.
There is good use of @seealso to link users to related functions in ohun. Only one URL needed editing (according to urlchecker); see below under Additional comments/README. You could consider using vdiffr for testing plots. Currently the tests for functions that produce plots are just testing that they return NULL. This isn't a very useful test, and while testing plots is not straightforward or intuitive, you could see if vdiffr helps! More detailed comments on testing below under Additional comments/Tests. response: I added vdiffr tests ropensci/ohun@333817c
Additional comments
**comment:** The following sections have further details from above, and sometimes an accompanying pull request. I found it easier to work through the review by looking at the source code locally and making changes as I noticed things. I also felt like it would be easier to highlight specific lines of code by just editing them, instead of referring to eg. Lines 83-85 in some file. No pressure to accept any of these pull requests directly, and I hope that you find it a clear and useful approach.
Repository
response: I ACCEPTED ALL PULL REQUESTS. THANK YOU!
**comment:** The repository size is ~120 MB. I recommend that you delete, at least, the following files:
response: pull request accepted comment: Some of these files were presumably added before they were listed in the .gitignore file. I have deleted them in the following pull request: ropensci/ohun#12 response: pull request accepted comment: Consider stating explicitly which package provides the parallel processing for ohun in the sentence starting "All functions allow the parallelization of tasks..." response: Done: "All functions allow the parallelization of tasks (using the packages parallel and pbapply)..." ropensci/ohun@2c332d4 The badge for R-CMD-check seems to point to an old workflow. The current workflow (tic) shows failing tests, whereas the R-CMD-check badge shows passing tests. I would recommend removing this old badge if it is no longer relevant. response: Done ropensci/ohun@46a1eaf comment: A couple of minor edits to the README are in the following pull request: response: pull request accepted comment: It might be helpful to users if the items in NEWS were accompanied by references to the relevant pull requests and issues (eg. #5). For example: https://github.com/tidyverse/ggplot2/blob/main/NEWS.md response: OK. comment: Vignette
As mentioned above, consider splitting the single vignette into multiple vignettes:
response: As mentioned, I split the vignette into 3: introduction, energy based detection, template based detection. ropensci/ohun@1b9c95f comment: Minor proposed edits to the vignette are in the following pull request: ropensci/ohun#14 The CONTRIBUTING.md document looks like it's based on the template from https://contributing.md/. Because of this, there were some places where links to sections didn't work, and it often still referred to the template version of CONTRIBUTING.md. response: pull request accepted comment: There isn't a code of conduct in the repository to link to, so I removed that dead link. But please consider including one (see the rOpenSci pages for more details: Contributing Guide, Code of Conduct). response: pull request accepted Relevant pull request: ropensci/ohun#15 comment: I propose removing the "last modification date" from the R/ files, since this information is redundant and more error-prone than the information provided by Git. This change is included in the pull request under R. response: Done ropensci/ohun@88fdf91
DESCRIPTION
comment: I suggest moving packages under "Depends" to "Imports". The case of adding packages to the Depends field does not seem like a universal yes/no clear-cut decision, but from my understanding and reading around this topic, I feel like for the ohun package it is not necessary to include packages in the Depends field. From https://devguide.ropensci.org/building.html#pkgdependencies
See also https://github.com/leeper/Depends. For an example where it might be relevant to list a package in Depends: From https://r-pkgs.org/dependencies-mindset-background.html#sec-dependencies-imports-vs-depends
In the case of ohun, some of the packages listed under Depends also list some packages under Depends:
The full dependency tree is here: If the recommendation to move some packages to Imports instead of Depends is followed, the following issues in the examples can be resolved, for example in feature_acoustic_data.R:
response: pull request accepted comment: Minor suggested edits to the Description include:
Relevant pull request: ropensci/ohun#16 response: pull request accepted comment: I suggested some edits to the test scripts to better take advantage of the variety and precision of the functions available from testthat. Often, the tests used expect_true with a logical statement, eg. nrow(DT) == 3. This works, but when it fails, there is nothing informative about the error. Alternatively, if you set this test using expect_length, then you would get a numeric difference between expectation and result. This might make it easier to identify why or how the test is failing. response: Added a few more tests ropensci/ohun@333817c comment: I also suggest using a skip_on_windows function. response: pull request accepted comment: Writing tests can be really hard (and rewarding!) - I found these resources
Specifically in the first link: "avoid testing simple code [...] test what you're not sure about, what's fragile, ...". This is a great opportunity for you to put your deep understanding of this kind of analysis to work, because you know when results don't look right, beyond the basic "is it a data.frame", etc. To this point, a lot of tests are just length checks: do you have any other expectations of the functions, other than that they will return the expected length? I've adjusted some of the tests where I noticed an alternative function that might be more informative. Here's the pull request: response: pull request accepted Note: if packages are moved from Depends, it may require them to be explicitly loaded with library() in the tests where their functions are used. I don't really understand why the custom functions stop2 and warning2 are being used. They return the error or warning message without the call being printed. Would this possibly make it harder for a new user to troubleshoot, if they cannot see where the message originates from? stop2 is used 105 times across 14 files and warning2 is used 3 times across 3 files. For example with stop2: R: label_detection(reference = matrix(lbh_reference[-1, ]), detection = lbh_reference) and with stop: R: label_detection(reference = matrix(lbh_reference[-1, ]), detection = lbh_reference) response: I copied this from the package brms. I like the message to be less wordy. But most importantly, having the function name is not useful when the error comes from an internal function, which is often the case. Related to this, I added more detailed argument evaluations with the new internal function check_arguments() ropensci/ohun@9fdd163
**comment:** Consider using seq_len instead of 1:nrow, 1:length, 1:ncol, ...
For example:
x <- data.frame(LETTERS[1:5])
1:nrow(x)
seq_len(nrow(x))
x <- data.frame()
1:nrow(x)
seq_len(nrow(x))
response: I did this earlier, during the first round of default rOpenSci checks. Related to dependencies described above under Description:
response: OK **comment:** There is mixed use of importFrom and ::. If packages listed under Depends are moved to Imports, I would suggest using the :: syntax, because it helps a reader understand whether functions are from ohun and, if not, which package they come from. response: Done. There is commented code in
response: Removed ropensci/ohun@9fdd163 Related to @sammlapp's review, ohun could be improved by refactoring some functions. For example, there are nested functions in the following scripts:
response: I moved those functions to "internal_functions.R" ropensci/ohun@7c47092. ropensci/ohun@4ca47bd Thank you for the very useful comments. Could you please add yourself to the package DESCRIPTION? (I do not know your full name :) ) |
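The testthat point above (bare expect_true checks versus more specific expectations) can be illustrated with a small sketch. The toy detection table here is hypothetical, not taken from the package's test suite; all three expectations pass, but the latter two would report the expected versus actual values if they ever failed:

```r
library(testthat)

# Toy stand-in for a detection table used in tests (hypothetical)
dt <- data.frame(sound.files = c("a.wav", "b.wav"), start = c(0.1, 0.9))

# On failure this only reports "nrow(dt) == 2 is not TRUE"
expect_true(nrow(dt) == 2)

# On failure these report the expected vs actual values, which makes
# diagnosing a broken test much easier
expect_equal(nrow(dt), 2)
expect_length(dt$start, 2)
```

The same idea extends beyond lengths: expectations on column names or value ranges encode what a correct detection table should look like.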
Thanks for your responses to the reviews, @maRce10! @sammlapp and @robitalec could you both take a look and let me know what you think? You can use the reviewer approval template to indicate your response: https://devdevguide.netlify.app/approval2template.html |
@robitalec I split the vignette into 3, but now the tarball is 15 MB. Do you have any suggestions on how to shrink the package a bit? |
Thanks @maRce10 for the detailed response, and for accepting those PRs. I am grateful that approach worked well for you too; I found it easier to look at logical chunks of changes as PRs, personally.
Reviewer Response
Following the same headers as above: A statement of need
Installation instructions
Packaging guidelines
Nice to see the {vdiffr} use. Should {vdiffr} be added to the Suggests list (like {testthat}) instead of Depends? If so, you can remove the corresponding @importFrom call. Relatedly though, the files under tests/testthat/_snaps are quite large:
Is there any way to reduce the size of these SVGs?
Additional comments
Some files I deleted in the accepted PR have resurfaced/are still present in the repository:
stop2, warning2
OK with removing the call. My initial concern was that it would be harder for a user to follow the traceback, but I see now that the .call argument doesn't impact that. Since a user can still troubleshoot using the traceback, this should be fine.
stop2 <- function(...) {
stop(..., call. = FALSE)
}
test <- function() {
stop2('test error')
}
test()
#> Error: test error
traceback()
#> 3: stop(..., call. = FALSE) at #2
#> 2: stop2("test error") at #1
#> 1: test()
New comments
Tests
Two directories are in the Rbuildignore (https://github.com/maRce10/ohun/blob/master/.Rbuildignore#L4-L5):
This means that the tests aren't being run on CRAN, eg. https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-debian-clang/ohun-00check.html Compare this to eg. {dplyr} https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-debian-clang/dplyr-00check.html and we can see the section "checking tests"... I'm surprised CRAN didn't notice there were no tests in the submitted package, but maybe it isn't a requirement of CRAN packages; I'm not sure. It would be good to have tests running on CRAN.
Vignettes
If we run
Most importantly, a file is written
You could consider deleting files that correspond to this wildcard, then adding this new one. Two options I can think of for reducing the size of your vignettes:
For example, the original output options for
8 MB -> 7.1 MB using theme: null and removing toc_float
7.1 MB -> 3.4 MB dropping css (might be due to fonts)
3.4 MB --> 2.5 MB switching from output type html_document to html_vignette
You could keep these changes for the CRAN version, then apply the CSS independently to your site.
Here is my author chunk from other packages.
Let me know what you think of these additional comments and responses. Thank you! |
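The size reductions measured above boil down to a leaner vignette YAML header. A minimal sketch of the html_vignette variant (standard rmarkdown/knitr vignette metadata; the title and index entry are placeholders, not the package's actual vignette names) might look like:

```yaml
# Hypothetical slimmed-down vignette header (placeholders for ohun's titles)
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{Energy-based detection}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
```

Dropping the floating TOC, custom theme, and bundled CSS is what accounts for most of the savings, since html_document embeds fonts and Bootstrap assets into every rendered file.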
@maRce10 Can you take a look at @robitalec responses and revise/respond to each? @sammlapp could you take a look at the revisions and let me know what you think? You can use the reviewer approval template to indicate your response: https://devdevguide.netlify.app/approval2template.html Thanks! |
Hi @robitalec Thanks again! I added you as a contributor, moved vdiffr to "Suggests" (ropensci/ohun@50b7a55), rendered the README (ropensci/ohun@7f53096) and added a vignette gitignore file (ropensci/ohun@8bc198a). Will change the vignette size before submitting to CRAN. Any suggestions on how to keep the 2 versions of the vignettes (CRAN and site) without having to manually change the YAML header? cheers |
Reviewer Response
Final approval (post-review)
Estimated hours spent reviewing: 10 (second review: 1) |
All, I think we are very close to finishing this up! @sammlapp thank you for the reviewer response! @robitalec could you provide a final reviewer response based on the most recent changes? The link to that form is https://devdevguide.netlify.app/approval2template. @maRce10 Instead of listing the reviewers as contributors, I think it best to have them as reviewers. The role in that case would be "rev". For the purposes of rOpenSci, the vignettes do not need to adhere to the size suggestions for CRAN. Although I agree with @robitalec suggestion that you reduce their size prior to submission to CRAN. I like the suggestion to consider using these only as articles on the pkgdown site. Another place to look for more on that is https://usethis.r-lib.org/reference/use_vignette.html. The decision on that is yours and can take place after we complete the review and prior to submission to CRAN. Once we get these last few things resolved, I think we are good to go! |
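The pkgdown-only-articles route mentioned above is one answer to the two-versions-of-the-vignettes question. A sketch, assuming the usethis package and run interactively inside the package project (the article name is a placeholder):

```r
# use_article() scaffolds vignettes/articles/<name>.Rmd and ignores the
# articles folder in .Rbuildignore, so the piece is rendered on the
# pkgdown site but never shipped to CRAN. Requires the usethis package.
usethis::use_article("template_based_detection")
```

With this split, the CRAN vignettes can keep the lightweight html_vignette headers while the site articles retain the full styling, with no manual YAML switching between the two.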
Thanks @maRce10! Thanks for clarifying @jhollist that the vignettes do not need to adhere to CRAN size suggestions.
Reviewer Response
Final approval (post-review)
Estimated hours spent reviewing: 17 |
Awesome! Is there anything else you need from my end at this point? |
Just make the change in the DESCRIPTION to @robitalec's role, from ctb to rev, and go ahead and add @sammlapp as rev as well. Once that is done, we are good to go! |
OK I just did it ropensci/ohun@cfa533c |
@ropensci-review-bot approve ohun |
Approved! Thanks @maRce10 for submitting and @sammlapp, @robitalec for your reviews! 😁 To-dos:
Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors. Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @ropensci/blog-editors in your reply. They will get in touch about timing and can answer any questions. We maintain an online book with our best practice and tips; this chapter starts the 3rd section, which gives guidance for after onboarding (with advice on releases, package marketing, GitHub grooming); the guide also features CRAN gotchas. Please tell us what could be improved. Last but not least, you can volunteer as a reviewer by filling in a short form. |
Thank you all for your hard work on this. Great to get it approved! I don't have much to add. The important details are above. @maRce10 if you have any questions, feel free to ping me in this issue or you can reach me via email hollister . jeff epa . gov. Congratulations! |
@ropensci-review-bot invite me to ropensci/ohun |
Invitation sent! |
@ropensci-review-bot finalize transfer of ohun |
Transfer completed. |
Date accepted: 2023-09-13
Submitting Author Name: Marcelo Araya-Salas
Submitting Author Github Handle: @maRce10
Repository: https://github.com/maRce10/ohun
Version submitted: 0.1.0
Submission type: Standard
Editor: @jhollist
Reviewers: @sammlapp, @robitalec
Archive: TBD
Version accepted: TBD
Language: en
Scope
Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
Explain how and why the package falls under these categories (briefly, 1-2 sentences):
The package allows users to extract data from animal acoustic signal recordings in an automated manner
Who is the target audience and what are scientific applications of this package?
Scientific community working with bioacoustic data. Useful for automating the analysis of animal acoustic signals for a variety of research questions (e.g. behavior, ecology, evolution, conservation)
Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
The warbleR package has similar detection functions, but the implementation is not as user-friendly (I am also its maintainer) and it has no tools for detection diagnostics/optimization. monitoR also implements one of the detection approaches in ohun, but supplies no diagnostic/optimization tools.
(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.
Explain reasons for any pkgcheck items which your package is unable to pass.
Technical checks
Confirm each of the following by checking the box.
This package:
Publication options
Do you intend for this package to go on CRAN?
Do you intend for this package to go on Bioconductor?
Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
MEE Options
Code of conduct