Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: as_tibble() calls as.data.frame() for objects that are not subclasses of "tbl_df" #1557

Merged
merged 6 commits into from
Dec 19, 2023

Conversation

TimTaylor
Copy link
Contributor

@TimTaylor TimTaylor commented Oct 27, 2023

This patch ensures as_tibble.data.frame first calls as.data.frame
to allow for methods provided by extended data frame classes. The
idea was proposed by Jan Gorecki in
Rdatatable/data.table#5698.

Fixes #1556.

@TimTaylor
Copy link
Contributor Author

TimTaylor commented Oct 27, 2023

Failing on pre 4.0.0 release due to use of .S3method() which is "new". Unsure if this is an issue for you or not as the usage is only within the test, not package, code.

Copy link
Member

@krlmlr krlmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Running revdepchecks now.

x <- structure(mtcars, extra = "extra", class = c("ext_df_", "data.frame"))
y <- as_tibble(head(mtcars))
fun <- function(x, row.names = NULL, optional = FALSE, ...) y
.S3method("as.data.frame", "ext_df_", fun)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now updated using with_mocked_bindings().

test_that("as_tibble.data.frame coerces extended data.frames first", {
x <- structure(mtcars, extra = "extra", class = c("ext_df_", "data.frame"))
y <- as_tibble(head(mtcars))
fun <- function(x, row.names = NULL, optional = FALSE, ...) y
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this method should be doing something more exciting to verify that it's actually called. Maybe set a flag in the test, or remove some attribute of the object?

Copy link
Contributor Author

@TimTaylor TimTaylor Oct 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was pretty extreme as is (returning head of global variable with one less attribute) but I may be misunderstanding

@krlmlr
Copy link
Member

krlmlr commented Oct 27, 2023

I suspect it's worth understanding the dplyr failure -- the other failures could be a consequence of that.

@TimTaylor
Copy link
Contributor Author

TimTaylor commented Oct 27, 2023

I suspect it's worth understanding the dplyr failure -- the other failures could be a consequence of that.

hmm - I hadn't seen that there was no explicit method for tbl_df.

Now fixed (hopefully) with latest commit. I've rechecked dplyr and seems ok so 🤞 . Note the actions are currently failing but not due to package code.

@krlmlr
Copy link
Member

krlmlr commented Oct 27, 2023

Perhaps we want a separate as_tibble.tbl_df() method that would do less work? I vaguely remember this existed at some point.

It's always amazing to see how an innocuous change can quickly turn into a deep rabbit hole. I can't spend too much time digging myself, but happy to assist.

I'm seeing the GHA failure elsewhere, need to fix it.

@TimTaylor
Copy link
Contributor Author

Yeah - I wondered that too but didn't want to make the hole any bigger. I think one or two other packages (e.g. LexisNexisTools) may be genuine breakages (only due to an unneeded attribute that has been tested against). I'll dig in to these over the next few days and update with any info. No rush to get this sorted - we will get there in the end.

@TimTaylor
Copy link
Contributor Author

@krlmlr - I've looked through the revdep problems in more detail and I'm pretty sure it is only {LexisNexisTools} that will need patching. Is it possible to bump the actions and then if possible your revdep checks? (tests failed after 23dd204 due to the r-project domain issue).

@krlmlr
Copy link
Member

krlmlr commented Dec 17, 2023

I wonder what's wrong with the GHA checks. The main branch looks good.

@TimTaylor
Copy link
Contributor Author

Happy to poke at the revdeps again if it's not too costly and they can be rerun? (unsure on your CI setup for the gazillion revdeps you have).

TimTaylor and others added 6 commits December 19, 2023 17:18
This patch ensures as_tibble.data.frame first calls as.data.frame
to allow for methods provided by extended data frame classes. The
idea was proposed by Jan Gorecki in
Rdatatable/data.table#5698.
@krlmlr krlmlr changed the title Fixes #1556 feat: as_tibble() calls as.data.frame() for objects that are not subclasses of "tbl_df" Dec 19, 2023
@krlmlr
Copy link
Member

krlmlr commented Dec 19, 2023

Thanks. I think it's good as is, I'll wait for a bit until the CRAN release.

@krlmlr krlmlr merged commit 3c4744a into tidyverse:main Dec 19, 2023
3 checks passed
y <- as_tibble(head(mtcars))
with_mocked_bindings(
as.data.frame = function(x, row.names = NULL, optional = FALSE, ...) y,
code = expect_identical(as_tibble(x), y),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just from looking at the code here it feels it should fail because y is only a head of mtcars while x is complete mtcars.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should it fail? Are you expecting as_tibble(x) to be identical to mtcars converted to a tibble?

Would the test would be clearer if we used two completely unrelated data frames for x and y?

Copy link
Contributor Author

@TimTaylor TimTaylor Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the test would be clearer if we used two completely unrelated data frames for x and y?

It's been a while since I wrote the test and it did take me a moment to grok what was going on when rereading so it may be worth changing to y <- airquality . I can also add some commentary and/or link to the issue/PR for context/clarity if useful.

Copy link

@jangorecki jangorecki Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because y was created from already filtered out object, using head(mtcars) 3 lines above while x contains complete mtcars

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the tests succeed, and the example should also work when running it line by line.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Could as_tibble.data.frame() be stricter?
3 participants