special handling of national accounts cubes #14

Closed
GregorDeCillia opened this issue Aug 9, 2021 · 2 comments
Labels
feature New feature or request

Comments

GregorDeCillia commented Aug 9, 2021

National accounts cubes such as sc_example("foreign_trade") do not provide values for total codes.

"database" : "str:database:denatec06",

Therefore, they should be aggregated directly in $tabulate() because otherwise the result would be a table filled with NAs in all measure columns.

library(STATcubeR)
library(magrittr)  # provides %>% and the exposition pipe %$%

sc_example("foreign_trade") %>%
  sc_table() %$%
  tabulate("Reference year")
# A STATcubeR tibble: 11 x 5
   `Reference year` `Import, number… `Import, value … `Export, number… `Export, value …
 * <date>                      <dbl>            <dbl>            <dbl>            <dbl>
 1 2008-01-01                     NA               NA               NA               NA
 2 2009-01-01                     NA               NA               NA               NA
 3 2010-01-01                     NA               NA               NA               NA
 4 2011-01-01                     NA               NA               NA               NA
 5 2012-01-01                     NA               NA               NA               NA
 6 2013-01-01                     NA               NA               NA               NA
 7 2014-01-01                     NA               NA               NA               NA
 8 2015-01-01                     NA               NA               NA               NA
 9 2016-01-01                     NA               NA               NA               NA
10 2017-01-01                     NA               NA               NA               NA
11 2018-01-01                     NA               NA               NA               NA

In one of our internal projects, we currently use the condition

"T" %in% table$annotation_legend$annotation

to determine whether a direct aggregation via rowsum() should be applied.
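A minimal sketch of that check in base R, on toy data rather than a real cube. The data frame `cells`, the helper `sum_if_flagged()` and all values are hypothetical, and the legend is assumed to be a data frame with an `annotation` column, as the condition above suggests.

```r
# Toy stand-in for a cube without totals (illustrative names and values).
cells <- data.frame(
  year    = c(2008, 2008, 2009, 2009),
  partner = c("A", "B", "A", "B"),
  import  = c(10, 5, 7, 9)
)

# Assumed shape of table$annotation_legend: a data frame with an
# `annotation` column. Only that column is used here.
annotation_legend <- data.frame(annotation = "T")

# Hypothetical helper, not part of STATcubeR.
sum_if_flagged <- function(cells, legend, by, measure) {
  if (!("T" %in% legend$annotation)) {
    return(NULL)  # no flag: leave aggregation to the regular code path
  }
  # rowsum() adds up the measure within each level of the grouping column
  sums <- rowsum(cells[[measure]], group = cells[[by]])
  data.frame(group = rownames(sums), value = sums[, 1], row.names = NULL)
}

by_year <- sum_if_flagged(cells, annotation_legend, by = "year", measure = "import")
```

Here `by_year` holds one row per year with the unweighted sum over all partners, which is what a direct aggregation would put in place of the NA totals.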

GregorDeCillia added the feature label Aug 9, 2021
GregorDeCillia commented

Update: this problem also occurs in the following cube from the environmental accounts:

x <- STATcubeR::sc_table_custom(
  db = 'str:database:deeehh02',
  measures = "str:statfn:deeehh02:F-DATA:F-EEGJ:SUM", 
  dimensions = c(
    "str:field:deeehh02:F-DATA:C-ENEETRAEG0-0",
    "str:field:deeehh02:F-DATA:C-C57-0",
    "str:field:deeehh02:F-DATA:C-ENEZEIT-0",
    "str:field:deeehh02:F-DATA:C-ENEVERW-0"
  )
)

GregorDeCillia commented

For now, it is probably best to use the following workaround:

  • Use sc_table(json_file, add_totals=FALSE)
  • Make sure the totals are not explicitly specified inside the JSON request document

This makes sure that $tabulate() falls back to aggregating via sums, which resolves the above issue provided that unweighted sums are an appropriate way of aggregating the data.

It would be useful if STATcubeR did this fallback automatically in certain situations, but the danger is that "real missings" could be replaced by sums in situations where this is not meaningful.
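The risk can be illustrated with toy data in base R. The sketch below is not how STATcubeR behaves. The data, the "Total" category and the guard `all_totals_missing` are assumptions made for illustration: the fallback is applied only when no total is reported at all.

```r
# Toy cells: two child categories plus a reported total per year.
# The 2008 total is a "real missing", the 2009 total is reported.
cells <- data.frame(
  year     = c(2008, 2008, 2008, 2009, 2009, 2009),
  category = c("A", "B", "Total", "A", "B", "Total"),
  value    = c(10, 5, NA, 7, 9, 16)
)

is_total <- cells$category == "Total"
children <- cells[!is_total, ]
reported <- cells[is_total, ]

# Unweighted sums of the child cells per year
summed <- rowsum(children$value, group = children$year)

# Hypothetical guard: fall back to sums only if the cube reports no totals
all_totals_missing <- all(is.na(reported$value))

final <- if (all_totals_missing) {
  summed[as.character(reported$year), 1]
} else {
  reported$value  # keep reported totals, including genuine NAs
}
```

A blanket fallback would replace the missing 2008 total with the sum of its child cells. With the guard, the NA is kept because the cube does report a total for 2009.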
