Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix recurrent substitution issues with tibbles in check_latex #51

Merged
merged 1 commit into from
Dec 8, 2023

Conversation

giocomai
Copy link
Contributor

@giocomai giocomai commented Dec 6, 2023

If the data frame given as input for either of the functions running check_latex is a tibble, then the gsub in check_latex() start with recurrent substitutions of "/" characters and ultimately an almost endless process that gobbles up memory until R crashes (or anyway, one wouldn't get anything resembling the desired output).

Tibbles are very common, as they are the default output of common functions such as readr::read_csv or googlesheets4::read_sheet and many other packages.

The problem can be easily solved by selecting columns in a way that is compatible to both base R and tibbles, basically replacing df[,column] with df[[column]]

library("labeleR")
column <- "List"
df <- badges.table
gsub("&", "\\\\&", df[,column])
#> [1] "Cornelius Fudge"    "Rita Skeeter"       "Bartemius Crouch"  
#> [4] "Newt Scamander"     "Garrick Ollivander" "Lily Potter"       
#> [7] "Peter Pettigrew"    "Sirius Black"

df <- tibble::as_tibble(badges.table)
gsub("&", "\\\\&", df[,column])
#> [1] "c(\"Cornelius Fudge\", \"Rita Skeeter\", \"Bartemius Crouch\", \"Newt Scamander\", \"Garrick Ollivander\", \"Lily Potter\", \"Peter Pettigrew\", \"Sirius Black\")"

# see output, and with every gsub the mess will only grow, and very quickly

# the following may run for very long:
# check_latex(tibble::as_tibble(badges.table), "List")

Created on 2023-12-06 with reprex v2.0.2

@Pakillo
Copy link
Member

Pakillo commented Dec 6, 2023

Dear @giocomai

Thanks a lot for the pull request. The check_latex function was added recently and it seemed to work but we still hadn't quite settled on its design (#50). Your suggested change makes sense to me (at least by now). @iramosgutierrez I think we can merge this quickly, it's not good to crash R session for early labeleR users!

@Pakillo
Copy link
Member

Pakillo commented Dec 6, 2023

@iramosgutierrez Also this should have been caught by tests... We need to include some test with tibbles, not only data.frame

@Pakillo Pakillo mentioned this pull request Dec 6, 2023
@giocomai
Copy link
Contributor Author

giocomai commented Dec 6, 2023

Great, thanks! Yes, as tibble are so common nowadays I initially assumed there was some issue with my data inputs, special characters or the likes, and did not think about this, as everything worked fine with the test dataset and some early tests I did with data.frame, so troubleshooting took some time, as initially I would let the process run for many minutes.

Anyway, great package, looks like we'll have nice conference badges for our next event. Thanks, cheers!

@Pakillo
Copy link
Member

Pakillo commented Dec 6, 2023

@giocomai Apologies for the inconvenience, but many thanks again for taking the time to investigate and send a fix. Glad to hear labeleR it's going to be useful for you

@iramosgutierrez I think all functions end up converting the data to data.frame before rendering, right? (eg.) Not sure, but maybe we can do this conversion earlier in each function, as another way to circumvent potential problems with tibbles?

@iramosgutierrez
Copy link
Collaborator

Hi @giocomai , thank you for te time used to check this error and for fixing it! I will take a look on it and accept it ASAP! I agree with @Pakillo , maybe converting tibbles to data.frame earlier in the function may be a way to avoid problems!

@iramosgutierrez iramosgutierrez merged commit 26d8151 into EcologyR:master Dec 8, 2023
5 checks passed
@iramosgutierrez
Copy link
Collaborator

@iramosgutierrez Also this should have been caught by tests... We need to include some test with tibbles, not only data.frame

Concerning the tests, having to create a tibble in the test files does not make labeleR to depend on "tibble" package? If it does not, I will immediately include some!

@Pakillo
Copy link
Member

Pakillo commented Dec 8, 2023

If you include tibble in tests, you'd need to include tibble in Suggests in DESCRIPTION.
But if we convert all data to data.frame at the beginning of the function then no need to do anything else I think

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants