re-implement annotations #39

Open
3 of 5 tasks
Tracked by #32
GregorDeCillia opened this issue Feb 20, 2023 · 0 comments
GregorDeCillia commented Feb 20, 2023

In early versions, STATcubeR used to include annotations in the output of as.data.frame.sc_table(). This was dropped when support for OGD datasets was introduced in #11. Back then, the annotations were included using separate columns.

It is planned to re-implement this feature in a slightly different manner using {tibble} and {vctrs} by providing a custom vector class that acts as an "annotated numeric". The result of printing those values should look something like this:

image

Annotations should either replace the values while printing or use color coding to reference a specific annotation

image

The "annotation legend" (which color corresponds to which annotation) can then be included in the footer of the tibble. Some technical details:

  • In order to keep things backwards compatible, the default behavior of sc_tabulate() and as.data.frame.sc_table() should be to return simple tibbles that only include columns of type numeric and factor. Adding annotations should be "opt-in".
  • Annotated cell values containing a zero can usually be interpreted as not available. Therefore, it makes sense to show the annotation code instead of the zero value (first screenshot). For annotated non-zero values, the values will be color-coded based on the annotation (second screenshot).
  • The "annotated numeric" class used to represent the columns will have an as.numeric() method which drops the annotations and returns a canonical double-type vector.
  • Aggregating annotations will not be pursued. If sc_tabulate() is called in a way that requires aggregation via rowsums() while annotations is set to TRUE, an error will be thrown.
  • Color-coding values with multiple annotations will not be pursued. Instead, one of the annotations will be selected for the color.
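
To make the printing and as.numeric() behavior described above more concrete, here is a minimal sketch. It is not the planned implementation: it uses plain base R S3 instead of {vctrs}/{tibble}, and the names annotated_numeric() and the "annotation" attribute are made up for illustration. Color coding and the annotation legend are left out.

```r
# Hypothetical constructor (name is an assumption, not STATcubeR API):
# a double vector with one annotation code per element, NA = no annotation.
annotated_numeric <- function(value,
                              annotation = rep(NA_character_, length(value))) {
  stopifnot(is.numeric(value), length(value) == length(annotation))
  structure(as.double(value),
            annotation = as.character(annotation),
            class = "annotated_numeric")
}

# Keep values and annotations aligned when subsetting.
`[.annotated_numeric` <- function(x, i) {
  annotated_numeric(unclass(x)[i], attr(x, "annotation")[i])
}

# as.numeric() dispatches to as.double(): drop the annotations and
# return a plain double vector.
as.double.annotated_numeric <- function(x, ...) {
  as.double(unclass(x))
}

# Annotated zeros are shown as their annotation code; all other values
# are shown as regular numbers (no color coding in this sketch).
format.annotated_numeric <- function(x, ...) {
  value <- as.double(x)
  annotation <- attr(x, "annotation")
  out <- format(value, ...)
  hide <- !is.na(annotation) & !is.na(value) & value == 0
  out[hide] <- annotation[hide]
  out
}

print.annotated_numeric <- function(x, ...) {
  print(format(x), quote = FALSE)
  invisible(x)
}

# Usage: the first value is an annotated zero, the third an annotated
# non-zero value.
x <- annotated_numeric(c(0, 12.5, 3), c("X", NA, "B"))
print(x)
as.numeric(x)
```

In this sketch the annotated zero prints as its code "X", while as.numeric(x) gives back the original doubles without any annotation. A {vctrs}-based class would additionally need to integrate with tibble printing to get the colors and the footer legend.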
@GregorDeCillia GregorDeCillia added feature New feature or request render labels Feb 20, 2023
@GregorDeCillia GregorDeCillia self-assigned this Feb 20, 2023
@GregorDeCillia GregorDeCillia added this to the Version 1.0 milestone Feb 20, 2023
@GregorDeCillia GregorDeCillia linked a pull request Feb 20, 2023 that will close this issue
5 tasks
GregorDeCillia added a commit that referenced this issue Mar 29, 2023
this is the first step to resolving
#27 by adding a function that creates
sc_table() like objects based on sdmx
archives

The sdmx format contains all metadata
that is necessary for STATcubeR to reuse
the existing $tabulate() workflow and this
first version already provides support for
various features via the base class (sc_data)

- $tabulate() to aggregate data
- $total_codes() to set/unset total codes
- $recoder to recode datasets (change labels)
  change codes, toggle visibility of
  elements, reorder elements, etc.
- importing German and English labels
  simultaneously (both languages are included
  in a zip download) and allowing to switch
  between them using $language<-().

New features
- sdmx archives provide a $parent column
  in the $fields() table which is used
  to represent hierarchical classifications.
  Previously, this was only possible with
  od_table()

There are still some improvements to be
made. See the issue #27 for more details

- properly parse time variables -
  currently they are treated as generic
  categories.
- parse element annotations (detailed
  descriptions for classification
  elements) and add them to
  $field()$de_desc just like with
  OGD datasets
- parse value annotations (see #39)
- provide a print/format method
- add a reasonable logic for total
  codes that takes the parent codes into
  account
- fill meta$measures$fun and
  $meta$measures$precision based on
  the sdmx metadata
- modify very long codes which use
  the @-symbol (probably for escapes)
- extend documentation
- possibly check SuperCROSS compatibility