GitHub - CC-HIC/icnarc2omop: Easily convert ICNARC XML to OMOP CDM

Overview

The role of this package is to help convert the most elementary components of the ICNARC XML to OMOP CDM. This facilitates the set up of an OMOP style CDM, acting as a base from which users can then populate with data from their local EHR. This was a somewhat non-trivial task, since the ICNARC specification is episode centric, and the OMOP CDM is patient centric. This necessitated a number of opinionated design decisions, especially in places where inforamtion of a person’s hospital journey is lacking. These decisions may not work universally for all study designs.

Installation

# install directly from github with
remotes::install_github("cc-hic/icnarc2omop")

Prerequisites

A running target database that is empty. PostgreSQL and MySQL are currently supported.
R (R is mandatory, R Studio is useful)
The ICNARC xmls you wish to convert
A text file describing the ETL process that generated the ICNARC files (for data provenance)
A download of the vocabuaries you would like to use from (Athena)[http:https://athena.ohdsi.org/]
A PC with at least 8Gb of memory (moving the OMOP vocabularies through memory is the main bottleneck). If this is a problem, please let me know and I will write a chunking function to mitigate this issue. You could also load the vocabularies yourself, and omit that functionality from the muncher.

Usage

Before using this package, you will need to:

Set up a new empty database (e.g. CREATE DATABASE omop;)
Create a project folder and populate it with the necessary files. They MUST follow this convention EXACTLY:

|--xml/
    |--icnarc01.xml <- names must be unequviocal in order.
    |--icnarc02.xml
|--meta/
    |--src_desc.txt
|--vocab/
    |--CONCEPT_ANCESTOR.csv
    |--CONCEPT.csv
    |--CONCEPT_SYNONYM.csv
    |--DRUG_STRENGTH.csv
    |--VOCABULARY.csv
    |--CONCEPT_CLASS.csv
    |--CONCEPT_RELATIONSHIP.csv
    |--DOMAIN.csv
    |--RELATIONSHIP.csv

Metadata text file.

“src_desc.txt” must be placed in the meta folder. This file contains a description of the source data origin and purpose for collection. The description may contain a summary of the period of time that is expected to be covered by this dataset.

Vocabularies

Please download the following vocabularies from athena, and add them to your project vocabulary folder:

1 SNOMED
6 LOINC
8 RxNorm
12 Gender
13 Race
14 CMS Place of Service
34 ICD10
44 Ethnicity
55 OPCS4
57 HES Specialty
75 dm+d
82 RxNorm Extension
87 Specimen Type
111 Episode Type
115 Provider
116 Supplier

Omopification

You can now run the muncher by calling:

library(icnarc2omop)

omopify_xml(
  project_path = ".",
  nhs_trust = "Demo St. Elsewhere",
  cdm_version = "5.3.1",
  database_name = "omop",
  database_engine = "postgres",
  host_name = "localhost",
  port_no = 5432,
  username = "db username goes here"
  )

You will be asked for your password to connect to the database. If all goes well. You should now have a lovely barebones set up of OMOP. Enjoy these magical good times.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
R		R
docs		docs
inst		inst
man		man
tests		tests
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
NAMESPACE		NAMESPACE
README.Rmd		README.Rmd
README.md		README.md
icnarc2omop.Rproj		icnarc2omop.Rproj

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Overview

Installation

Prerequisites

Usage

Metadata text file.

Vocabularies

Omopification

About

Releases

Packages

Languages

License

CC-HIC/icnarc2omop

Folders and files

Latest commit

History

Repository files navigation

Overview

Installation

Prerequisites

Usage

Metadata text file.

Vocabularies

Omopification

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages