Create data frame of specified fields from database collection
Source: R/utils.R
dbGetFieldsIntoDf.Rd
Fields in the collection are retrieved into a data frame (or tibble). Note that fields within the record of a trial can be hierarchical and structured, that is, nested. Names of fields can be found with dbFindFields. The function uses the field names to appropriately type the values that it returns, harmonising original values (e.g. "Information not present in EudraCT" to `NA`, "Yes" to `TRUE`, "false" to `FALSE`, date strings to class Date, number strings to numbers). The function also attempts to simplify the structure of nested data and may concatenate multiple strings in a field using " / " (see example). For full handling of complex nested data, use function dfTrials2Long followed by dfName2Value to extract the sought nested variable(s).
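The kind of harmonisation described above can be pictured with a minimal base R sketch. It is not the package's implementation: `harmonise_values` is a hypothetical helper, and it takes the target type as an explicit argument, whereas the function described here derives the type from the field name.

```r
# Hypothetical helper for illustration only, not part of ctrdata.
# Converts a character vector to a declared target type.
harmonise_values <- function(x, type = c("logical", "date", "number", "character")) {
  type <- match.arg(type)
  # Treat the placeholder for missing information (and empty strings) as NA
  x[x %in% c("Information not present in EudraCT", "")] <- NA_character_
  switch(
    type,
    logical = {
      out <- rep(NA, length(x))
      out[tolower(x) %in% c("yes", "true")] <- TRUE
      out[tolower(x) %in% c("no", "false")] <- FALSE
      out
    },
    # Assumes ISO 8601 date strings; real registers use several formats
    date = as.Date(x, format = "%Y-%m-%d"),
    number = suppressWarnings(as.numeric(x)),
    character = x
  )
}

harmonise_values(c("Yes", "false", "Information not present in EudraCT"), "logical")
harmonise_values("2012-03-15", "date")
harmonise_values(c("12", "3.5"), "number")
```

Values that match none of the recognised strings stay `NA` in the logical branch, which keeps unexpected input visible instead of silently coercing it.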
Arguments
- fields
Vector of one or more strings, with names of sought fields. See function dbFindFields for how to find names of fields. "item.subitem" notation is supported.
- con
A connection object, see section `Databases` in ctrdata.
- verbose
Prints additional information if set to `TRUE` (default `FALSE`).
- stopifnodata
Stops with an error (default `TRUE`) or with a warning (`FALSE`) if the sought field is empty in all, or not available in any of the records in the database collection.
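The "item.subitem" notation accepted by `fields` can be read as a path through a nested record. The following is a minimal sketch of that idea in base R, assuming a record is represented as a nested named list; `get_nested` and the `record` object are hypothetical and only illustrate the notation, not how the package or the database backend resolves fields.

```r
# Hypothetical helper for illustration only, not part of ctrdata.
# Follows a dotted field name through a nested named list.
get_nested <- function(record, field) {
  parts <- strsplit(field, ".", fixed = TRUE)[[1]]
  value <- record
  for (part in parts) {
    # Stop if the path leaves the nested structure or the name is absent
    if (!is.list(value) || is.null(value[[part]])) return(NULL)
    value <- value[[part]]
  }
  value
}

# Illustrative record, shaped after the field used in the Examples section
record <- list(
  `_id` = "2012-003632-23-AT",
  b1_sponsor = list(b31_and_b32_status_of_the_sponsor = "Commercial")
)

get_nested(record, "b1_sponsor.b31_and_b32_status_of_the_sponsor")
get_nested(record, "b1_sponsor.not_a_field")
```

Returning `NULL` for a missing path mirrors the case that `stopifnodata` is concerned with, where a sought field is not available in a record.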
Value
A data frame (or tibble, if `tibble` is loaded) with columns corresponding to the sought fields. A column for the records' `_id` will always be included. Each column can be either a simple data type (numeric, character, date) or a list (typically for nested data, see above). For complicated lists, use function dfTrials2Long followed by function dfName2Value to extract values for sought nested variables. The returned data frame has at most as many rows as there are records of trials in the database collection.
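To make the difference between a simple column and a list column concrete, here is a small base R sketch with made-up data. The `_id` values and keywords are illustrative, and the collapsing step only imitates the " / " concatenation described above; it is not the package's own code.

```r
# Illustrative data only; "trial-1" and "trial-2" are made-up identifiers
df <- data.frame(
  `_id` = c("trial-1", "trial-2"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# A list column: each row holds a character vector of any length
df$keyword <- list(c("Pain", "Neonate"), "PDA")

# Collapse each list element into a single string, separated by " / "
df$keyword_flat <- vapply(df$keyword, paste, character(1), collapse = " / ")

str(df)
```

A collapsed character column is convenient for display and filtering, while the list column keeps the individual values available for further processing.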
Examples
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials")
# get fields that are nested within another field
# and can have multiple values within the nested field
dbGetFieldsIntoDf(
  fields = "b1_sponsor.b31_and_b32_status_of_the_sponsor",
  con = dbc)
#> b1_sponsor.b31_and_b32_status_of_the_sponsor...
#>
#> .
#> _id b1_sponsor.b31_and_b32_status_of_the_sponsor
#> 1 2012-003632-23-AT Commercial
#> 2 2012-003632-23-CZ Commercial
#> 3 2012-003632-23-DE Commercial
#> 4 2012-003632-23-ES Commercial
#> 5 2012-003632-23-GB Commercial
#> 6 2012-003632-23-IT Commercial
#> 7 2012-003632-23-SE Commercial
#> 8 2014-002606-20-3RD Commercial
#> 9 2014-002606-20-AT Commercial
#> 10 2014-002606-20-DE Commercial
#> 11 2014-002606-20-ES Commercial
#> 12 2014-002606-20-GB Commercial
#> 13 2014-002606-20-IT Commercial
#> 14 2014-002606-20-PT Commercial
#> 15 2014-003556-31-GB Commercial
#> 16 2014-003556-31-IT Commercial
#> 17 2014-003556-31-NL Commercial
#> 18 2014-003556-31-PL Commercial
#> 19 2014-003556-31-SE Commercial
# fields that are lists of string values are
# returned by concatenating values with " / "
dbGetFieldsIntoDf(
  fields = "keyword",
  con = dbc)
#> keyword...
#>
#> _id
#> 1 NCT01035190
#> 2 NCT01190995
#> 3 NCT01265589
#> 4 NCT01389856
#> 5 NCT01490580
#> 6 NCT01517828
#> 7 NCT01528852
#> 8 NCT01536483
#> 9 NCT01720524
#> 10 NCT01723501
#> 11 NCT01783041
#> 12 NCT01875757
#> 13 NCT01899677
#> 14 NCT01954082
#> 15 NCT02056223
#> 16 NCT02374281
#> 17 NCT02499393
#> 18 NCT03022253
#> 19 NCT03058666
#> 20 NCT03082001
#> 21 NCT03272594
#> keyword
#> 1 Preterm Infants / Inhaled Corticosteroids / Bronchopulmonary Dysplasia
#> 2 Pain / Neonate / Procedure / Neurobehavioural outcome
#> 3 RDS / Infant, newborn / Vitamin A / Surfactant
#> 4 Persistent pulmonary hypertension, newborn / PPHN
#> 5 Pain / Newborn / Anaesthesia / opioids / propofol / neuromuscular blocker / atropine
#> 6 Intubation / Delivery room / Sedation / Analgesia
#> 7 neonatal Mortality / chlorhexidine cord care / cord infections / african settings / clinical trials
#> 8 Butyrate / Preterms / Very Low Birth Weight (<1250g) / Rectocolonic enemas / Digestive maturation / Parenteral nutrition weaning / ECUN / Whole gut transit time
#> 9 persistent pulmonary hypertension / newborn / neonates / iv sildenafil / hypoxic respiratory failure and at risk of persistent pulmonary hypertension of the newborn
#> 10 chlorhexidine / sepsis / neonatal / whole-body skin cleansing
#> 11 Prematurity / Carnitine supplementation / Neonatal Intensive Care / MRI / Amplitude-integrated EEG / NICU Network Neurobehavioral Scale / Bayley Scale of Infant Development III
#> 12 Vitamin D / Acute bronchitis / Upper respiratory tract infection / Recurrent bronchitis / Acute bronchiolitis / Infants
#> 13 Symbiotic,cytokines, necrotising enterocolitis
#> 14 Retinopathy / Prematurity / Infant, Newborn, Diseases
#> 15 Patent ductus arteriosus / Preterm infant / Paracetamol / Ibuprofen
#> 16 Newborn / Neonatal Screening / Autonomic Nervous System / Pain / non-nutritive sucking / sucrose administration / nociceptive action / Electro acoustical characteristics of crying
#> 17 newborn / therapeutic hypothermia / magnesium sulphate
#> 18 PDA / Platelets
#> 19 Aerosol / surfactant / calfactant / Infasurf / Respiratory Distress Syndrome / RDS / Premature / Neonates
#> 20 Neonate / Pain / ELBW
#> 21 Neonatal pain / Pediatric pain / Pain assessment / Breastfeeding / Sucrose / Randomized controlled trial / Behavioural pain response / Neurophysiological pain response