allow recoding of sc_data objects #18

Merged
merged 22 commits into from
Aug 31, 2021


GregorDeCillia
Contributor

Allow recoding such as changing labels via a recode method

closes #17

@GregorDeCillia
Contributor Author

The initial version only works with open data, and recodes are only applied when the language is set explicitly. Example:

x <- od_table("OGD_krebs_ext_KREBS_1")

x$recode$
  label_field("C-KRE_GESCHLECHT-0", "en", "SEX")$
  label_measure("F-KRE", "en", "NUMBER")$
  level("C-KRE_GESCHLECHT-0", "GESCHLECHT-1", "en", "MALE")

x$language <- "en"
x$tabulate("C-KRE_GESCHLECHT-0", "F-KRE")

currently, this has no effect on labeling
right now, there is a redundancy with $language in od_table, but $language will be moved to the base class and then it is required that no labels are empty
@GregorDeCillia
Contributor Author

GregorDeCillia commented Aug 19, 2021

The recodes should also work with API tables now:

x <- sc_table(json_file = sc_example("population_timeseries.json"))

x$recode$
  label_field("C-BESC11-0", "en", "SEX")$
  label_measure("F-ISIS-1", "en", "PERSONS")$
  level("C-BESC11-0", "1", "en", "MALE")

x$tabulate("C-BESC11-0")

Also the line x$language <- "en" in the comment above is no longer necessary.

@GregorDeCillia
Contributor Author

GregorDeCillia commented Aug 19, 2021

Recoding operations that should be supported

  • rename fields
  • rename measures
  • rename levels
  • reorder fields
  • set total codes
  • set visibility of levels

@GregorDeCillia GregorDeCillia changed the title from "allow recoding of scdata objects" to "allow recoding of sc_data objects" Aug 19, 2021
@GregorDeCillia GregorDeCillia added the feature New feature or request label Aug 19, 2021
@GregorDeCillia GregorDeCillia added this to the Version 1.0 milestone Aug 19, 2021
@GregorDeCillia
Contributor Author

this now works

earnings <- od_table("OGD_veste309_Veste309_1")
earnings$recode$
  total_codes("C-A11-0", "A11-1")$
  total_codes("C-STAATS-0", "STAATS-9")$
  total_codes("C-VEBDL-0", "VEBDL-10")$
  total_codes("C-BESCHV-0", "BESCHV-1")

earnings$total_codes()

@GregorDeCillia
Contributor Author

It might make more sense to set the visibility automatically via $total_codes() and replace the "drop totals" logic with a "drop invisibles" logic

drop_total <- intersect(fields, has_total)
for (field_code in drop_total) {
  i <- match(field_code, mf$code)
  x <- x[x[[field_code]] != mf$total_code[i], ]
}
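
A minimal sketch of what a "drop invisibles" variant of the loop above could look like. The `levels_meta` list, its `visible` column and the `drop_invisibles()` helper are hypothetical names used for illustration only; they are not taken from the STATcubeR code base.

```r
# Toy data: one row per cell, the field column holds level codes
x <- data.frame(
  `C-A11-0` = c("A11-1", "A11-2", "A11-3"),
  value = c(100, 60, 40),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical per-field level metadata with a visibility flag.
# Here the total code "A11-1" is marked as invisible.
levels_meta <- list(
  `C-A11-0` = data.frame(
    code = c("A11-1", "A11-2", "A11-3"),
    visible = c(FALSE, TRUE, TRUE),
    stringsAsFactors = FALSE
  )
)

# Drop every row whose level is flagged as invisible in any requested field
drop_invisibles <- function(x, levels_meta, fields) {
  for (field_code in intersect(fields, names(levels_meta))) {
    lv <- levels_meta[[field_code]]
    hidden <- lv$code[!lv$visible]
    x <- x[!(x[[field_code]] %in% hidden), , drop = FALSE]
  }
  x
}

res <- drop_invisibles(x, levels_meta, "C-A11-0")
```

With this shape, setting a total code would only need to flip the `visible` flag of that level, and the filtering step would no longer need to know about totals at all.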

@GregorDeCillia
Contributor Author

It turns out we also need a way to import both languages (German and English) into one sc_table object. Currently, only one language can be specified in the factory functions:

STATcubeR/R/table.R

Lines 184 to 185 in 4175326

sc_table <- function(json_file, language = c("en", "de"), add_totals = TRUE,
                     key = sc_key()) {

We could either add a new language option "both" for all factory functions or define a method which adds the missing language:

x <- sc_example("accomodation") %>% sc_table(language = "en")
x$recode$add_language("de")

The $add_language() method would send a new api request and then use $recode$label_field(), $recode$label_measure() and $recode$level() to add the new labels.

In any case this requires an additional request against the /table endpoint or an (additional) request against /schema. The /schema version is probably faster, but certainly more complicated to implement.
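
The label-merging step could be sketched roughly as follows. This is a toy illustration and not the package implementation: `labels_de` stands in for labels extracted from a second response, and `meta` and `add_language_labels()` are hypothetical names. No API request is made here.

```r
# Hypothetical stand-in for the labels extracted from a second response
labels_de <- data.frame(
  code = c("C-BESC11-0", "F-ISIS-1"),
  label = c("Geschlecht", "Anzahl der Personen"),
  stringsAsFactors = FALSE
)

# Toy metadata store keeping one label column per language
meta <- data.frame(
  code = c("C-BESC11-0", "F-ISIS-1"),
  label_en = c("Sex", "Number of persons"),
  label_de = NA_character_,
  stringsAsFactors = FALSE
)

# Copy the labels of one language into the matching column, matched by code.
# Codes that are unknown to `meta` are skipped.
add_language_labels <- function(meta, new_labels, language) {
  column <- paste0("label_", language)
  i <- match(new_labels$code, meta$code)
  keep <- !is.na(i)
  meta[[column]][i[keep]] <- new_labels$label[keep]
  meta
}

meta <- add_language_labels(meta, labels_de, "de")
```

Matching by code rather than by position keeps the merge robust if the second response returns fields or measures in a different order.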

@GregorDeCillia
Contributor Author

The add_language() method has been added. It makes a request against /table, extracts only the labels and adds them to the object:

pop <- sc_example("pop") %>% sc_table(language = "en")
pop$add_language("de")
pop
#> An object of class sc_table
#> 
#> Database    Population at the beginning of the quarter since 2002 
#> Measures    Number of persons 
#> Fields      Quarter, Age in single years <96>, Sex <2>, Commune <2383> (Province-District) 
#> 
#> Request     2021-08-30 08:01:02 
#> STATcubeR   0.2.5
pop$language <- "de"
pop
#> An object of class sc_table
#>
#> Database    Bevölkerung zu Quartalsbeginn ab 2002 
#> Measures    Anzahl der Personen 
#> Fields      Quartal, Alter in Einzeljahren, Geschlecht, Gemeinde (Vergröberung über Politischen Bezirk) 
#> 
#> Request     2021-08-30 08:01:02 
#> STATcubeR   0.2.5

for convenience, this makes two requests against table and returns both languages in one object
the param language now behaves differently for sc_table compared to other REST API functions
if present, this language will be used for labelling the dataset. Otherwise table$language is used
Format: YYYY-WW with an isoweek in WW
closes #15
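
For illustration, a string in the YYYY-WW shape with an ISO week can be produced in base R as sketched below. It is an assumption that the year part is the ISO week-based year (`%G`); the commit note does not say which year is meant, and the `iso_week()` helper is hypothetical.

```r
# Format a date as "<ISO week-based year>-<ISO week>"
iso_week <- function(date) {
  format(as.Date(date), "%G-%V")
}

week <- iso_week("2021-08-30")

# Around New Year the ISO year can differ from the calendar year:
# 2021-01-01 is a Friday and belongs to the last ISO week of 2020
edge <- iso_week("2021-01-01")
```

Using `%Y` instead of `%G` would pair the calendar year with the ISO week, which gives misleading results for dates like the second one.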
@GregorDeCillia GregorDeCillia marked this pull request as ready for review August 31, 2021 14:48
@GregorDeCillia GregorDeCillia merged commit 5243c49 into master Aug 31, 2021
@GregorDeCillia
Contributor Author

Closing this because we will soon add some updates that are unrelated to the recoder class. The documentation of the new features is currently limited to the class reference, which should be enough for now.

@GregorDeCillia GregorDeCillia deleted the recodes branch November 5, 2021 11:42