A library and microservice implementing the health and care terminology SNOMED CT with support for cross-maps, inference, fast full-text search, autocompletion, compositional grammar and the expression constraint language.

Hermes: terminology tools, library and microservice.

Hermes: "Herald of the gods."


Hermes provides a set of terminology tools built around SNOMED CT including:

  • a fast RESTful terminology server with full-text search functionality; ideal for driving autocompletion in user interfaces
  • an inference engine in order to analyse SNOMED CT expressions and concepts and derive meaning
  • cross-mapping to and from other code systems
  • support for SNOMED CT compositional grammar and the SNOMED CT expression constraint language.

It is designed both as a library for embedding into larger applications and as a microservice.

It is fast, both for import and for use. It imports and indexes the International and UK editions of SNOMED CT in less than 5 minutes; you can have a server running seconds after that.

It replaces previous similar tools written in Java and golang and is designed to fit into a wider architecture with identifier resolution, mapping and semantics as first-class abstractions.

Rather than a single monolithic terminology server, it is entirely reasonable to build multiple services, each providing an API around a specific edition or version of SNOMED CT, and to use an API gateway to manage client access. Hermes is lightweight and designed to be composed with other services.

It is part of my PatientCare v4 development; previous versions have been operational within NHS Wales since 2007.

You can have a working terminology server running by typing only a few lines at a terminal. There's no need for any special hardware, or any special dependencies such as setting up your own elasticsearch or solr cluster. You just need a filesystem! Many other tools take hours to import the SNOMED data; you'll be finished in less than 10 minutes!

An HL7 FHIR terminology facade, hades, is under development. This exposes the functionality available in hermes via a FHIR terminology API. It already supports search and autocompletion using the $expand operation.

Quickstart

You can have a terminology server running in minutes. Full documentation is below, but here is a quickstart.

  1. Install clojure

e.g. on macOS

brew install clojure

  2. Clone the repository and change directory

git clone https://github.com/wardle/hermes
cd hermes

  3. Download and install a distribution

If you're a UK user and want to use automatic downloads, you can do this:

clj -M:run --db snomed.db download uk.nhs/sct-clinical api-key trud-api-key.txt cache-dir /tmp/trud

Ensure you have a TRUD API key.

If you've downloaded a distribution manually, import like this:

clj -M:run --db snomed.db import ~/Downloads/snomed-2021/

  4. Compact and index

clj -M:run --db snomed.db compact
clj -M:run --db snomed.db index

  5. Run a server!

clj -M:run --db snomed.db --port 8080 serve

You can use hades with the 'snomed.db' index to give you a FHIR terminology server.

Common questions

What is the use of hermes?

hermes provides a simple library, and optionally a microservice, to help you make use of SNOMED CT.

A library can be embedded into your application; this is easy using Clojure or Java. You make calls using the API just as you'd use any regular library.

A microservice runs independently and you make use of the data and software by making an API call over the network.

Like all PatientCare components, you can use hermes in either way. Usually, when you're starting out, it's best to use it as a library, but larger projects and larger installations will want to run their software components independently, optimising for usage patterns, resilience, reliability and rate of change.

Most people who use a terminology run a server and make calls over the network.

How is this different to a national terminology service?

Previously, I implemented SNOMED CT within an EPR. Later I realised how important it was to build it as a separate module; I created terminology servers in Java, and then later in golang; hermes is written in Clojure. While I support the provision of a national terminology server for convenience, I think it's important to recognise that it is the data that matter most. We need to cooperate and collaborate on semantic interoperability, but the software services that make use of those data can be centralised or distributed; when I do analytics, I can't see myself making server round-trips for every check of subsumption! That would be silly; I've been using SNOMED for analytics for longer than most; you need flexibility in provisioning terminology services. I want tooling that can provide services at scale while also being capable of running on my home computer.

Unlike other available terminology servers, hermes is lightweight and has no other dependencies except a filesystem, which can be read-only when in operation.

I don't believe in the idea of uploading codesystems and value sets in place. My approach to versioning is to run different services; I simply switch API endpoints.
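
As a rough illustration of switching endpoints by version, a client might keep a small table of base URLs, one per running hermes instance. This is only a sketch: the release labels, host names and the `concept_url` helper are hypothetical, and only the `/v1/snomed/concepts/<id>/extended` path is taken from the examples later in this document.

```python
# Hypothetical deployment: one hermes service per installed release.
# The labels and host names below are placeholders, not real services.
ENDPOINTS = {
    "uk-clinical-2020-07": "http://snomed-2020-07.example.internal:8080",
    "uk-clinical-2021-01": "http://snomed-2021-01.example.internal:8080",
}


def concept_url(release: str, concept_id: int) -> str:
    """Return the extended-concept URL on the service for one release."""
    base = ENDPOINTS[release]  # raises KeyError for an unknown release
    return f"{base}/v1/snomed/concepts/{concept_id}/extended"


print(concept_url("uk-clinical-2021-01", 24700007))
```

Moving a client to a newer release then means changing the key it looks up, with no change to the services themselves.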

Why are you building so many small repositories?

Small modules of functionality are easier to develop, easier to understand, easier to test and easier to maintain. I design modules to be composable so that I can stitch different components together in order to solve problems.

In larger systems, it is easy to see code rotting. Dependencies become outdated and the software becomes difficult to change because of other software that depends on it. Small, well-defined modules are much easier to build and are less likely to need ongoing changes over time; my goal is to need to update modules only in response to changes in the domain, not in the software itself. I aim for an accretion of functionality.

It is very difficult to 'prove' software is working as designed when there are lots of moving parts.

What are you using hermes for?

I have embedded it into clinical systems; I use it for a fast autocompletion service so users start typing and the diagnosis, or procedure, or occupation, or ethnicity, or whatever, pops up. Users don't generally know they're using SNOMED CT. I use it to populate pop-ups and drop-down controls, and I use it for decision support to switch functionality on and off in my user interface - e.g. does this patient have a type of 'x' such as motor neurone disease - as well as analytics. A large number of my academic publications are as a result of using SNOMED in analytics.

What is this graph stuff you're doing?

I think health and care data are and always will be heterogeneous, incomplete and difficult to process. I do not think trying to build entities or classes representing our domain works at scale; it is fine for toy applications and trivial data modelling such as e-observations, but classes and object-orientation cannot scale across such a complete and disparate environment. Instead, I find it much easier to think about first-class properties - entity - attribute - value - and use such triples as a way of building and navigating a complex, hierarchical graph.

I am using a graph API in order to decouple subsystems and can now navigate from clinical data into different types of reference data seamlessly. For example, with the same backend data, I can view an X.500 representation of a practitioner, or a FHIR R4 Practitioner resource model. The key is to recognise that identifier resolution and mapping are first-class problems within the health and care domain. Similarly, I think the semantics of reading data are very different from those of writing data. I cannot shoehorn health and care data into a REST model in which we read and write to resources representing the type. Instead, just as in real life, we record event data which can effect change. In the end, it is all data.

Documentation

A. How to download and build a terminology service

Ensure you have a pre-built jar file, or the source code checked out from github. See below for build instructions.

I'd recommend installing clojure and running using source code but use the pre-built jar file if you prefer.

1. Download and install at least one distribution.

If your local distributor is supported, hermes can do this automatically for you. Otherwise, you will need to download your local distribution(s) manually.

i) Use a registered SNOMED CT distributor to automatically download and import

There is currently only support for automatic download and import for the UK, but other distribution sources can be added if those services provide an API.

The basic command is:

clj -M:run --db snomed.db download <distribution-identifier> [properties] 

or if you are using a precompiled jar:

java -jar hermes.jar --db snomed.db download <distribution-identifier> [properties]

The distribution, as defined by distribution-identifier, will be downloaded and imported to the file-based database snomed.db.

| Distribution-identifier | Description                                        |
|-------------------------|----------------------------------------------------|
| uk.nhs/sct-clinical     | UK SNOMED CT clinical - incl international release |
| uk.nhs/sct-drug-ext     | UK SNOMED CT drug extension - incl dm+d            |

Each distribution might require custom configuration options. These can be given as key value pairs after the command, and their use will depend on which distribution you are using.

For example, the UK releases use the NHS Digital TRUD API, and so you need to pass in the following parameters:

  • api-key : path to a file containing your NHS Digital TRUD api key
  • cache-dir : directory to use for downloading and caching releases

For example, these commands will download, cache and install the International release, the UK clinical edition and the UK drug extension:

clj -M:run --db snomed.db download uk.nhs/sct-clinical api-key trud-api-key.txt cache-dir /tmp/trud
clj -M:run --db snomed.db download uk.nhs/sct-drug-ext api-key trud-api-key.txt cache-dir /tmp/trud

hermes will tell you what configuration parameters are required:

clj -M:run download uk.nhs/sct-drug-ext

Will result in:

Invalid parameters for provider 'uk.nhs/sct-drug-ext':

should contain keys: :api-key, :cache-dir

| key        | spec    |
|============+=========|
| :api-key   | string? |
|------------+---------|
| :cache-dir | string? |

So we know we need to pass in api-key and cache-dir as above.
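
To make the key value convention concrete, here is a minimal Python sketch of how trailing arguments could be paired up and checked against the required keys. It is an illustration only and not how hermes itself parses its command line; the `parse_properties` helper and `REQUIRED` set are hypothetical names.

```python
# Keys reported as required for the UK providers in the output above.
REQUIRED = {"api-key", "cache-dir"}


def parse_properties(args):
    """Pair up trailing command-line arguments as key/value properties."""
    if len(args) % 2 != 0:
        raise ValueError("properties must be given as key value pairs")
    # Even positions are keys, odd positions are their values.
    props = dict(zip(args[::2], args[1::2]))
    missing = REQUIRED - set(props)
    if missing:
        raise ValueError("should contain keys: " + ", ".join(sorted(missing)))
    return props


props = parse_properties(["api-key", "trud-api-key.txt", "cache-dir", "/tmp/trud"])
print(props)
```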

ii) Download and install SNOMED CT distribution file(s) manually

Depending on where you live in the world, download the most appropriate distribution(s) for your needs.

In the UK, we can obtain these from TRUD.

For example, you can download the UK "Clinical Edition", containing the International and UK clinical distributions as part of TRUD pack 26/subpack 101.

Optionally, you can also download the UK SNOMED CT drug extension, which contains the dictionary of medicines and devices (dm+d) and is available as part of TRUD pack 26/subpack 105.

Once you have downloaded what you need, unzip them to a common directory and then you can use hermes to create a file-based database.

If you are running using the jar file:

java -jar hermes.jar --db snomed.db import ~/Downloads/snomed-2020

If you are running from source code:

clj -M:run --db snomed.db import ~/Downloads/snomed-2020/

The import of both International and UK distribution files takes a total of less than 3 minutes on my machine.

2. Compact database (optional).

This reduces the file size by around 20% and takes about 1 minute. This is an optional step, but recommended.

java -jar hermes.jar --db snomed.db compact

or

clj -M:run --db snomed.db compact

You may need to give java more memory for compaction; I only need to do so after importing three different distributions, but it will depend on the size of each.

For example

java -Xmx8g -jar hermes.jar --db snomed.db compact
clj -J-Xmx8g -M:run --db snomed.db compact

3. Build search index

Run

java -jar hermes.jar --db snomed.db index

or

clj -M:run --db snomed.db index

This will build the search index; it takes about 2 minutes on my machine.

4. Run a REPL (optional)

When I first built terminology tools, either in Java or in golang, I also needed to build a custom command-line interface in order to explore the ontology. That is no longer necessary: most developers using Clojure quickly learn the value of the REPL, a read-evaluate-print loop in which one can issue arbitrary commands to execute. As such, one has a full Turing-complete language (a lisp) in which to explore the domain.

Run a REPL and use the terminology services interactively. I usually use a REPL from within my IDE.

clj -A:dev

5. Get the status of your installed index

You can obtain status information about any index by using:

clj -M:run --db snomed.db status

Result:

{:installed-releases
 ("SNOMED Clinical Terms version: 20200731 [R] (July 2020 Release)"
  "31.3.0_20210120000001 UK clinical extension"),
 :concepts 574414,
 :descriptions 1720404,
 :relationships 3263996,
 :refsets 9424174,
 :indices
 {:descriptions-concept 1720404,
  :concept-parent-relationships 1210561,
  :concept-child-relationships 1210561,
  :installed-refsets 293,
  :component-refsets 6094742,
  :map-target-component 1125516}}

The result will be different after I also import the UK dm+d (dictionary of medicines and devices) distribution.

6. Run a terminology web service

By default, data are returned as edn. To receive JSON instead, add "Accept: application/json" to the request headers. You can see examples below.

java -jar hermes.jar --db snomed.db --port 8080 serve 

or

clj -M:run --db snomed.db --port 8080 serve

Example usage of search endpoint.

curl "http://localhost:8080/v1/snomed/search?s=mnd&constraint=<64572001&maxHits=5" -H "Accept: application/json"  | jq
[
  {
    "id": 486696014,
    "conceptId": 37340000,
    "term": "MND - Motor neurone disease",
    "preferredTerm": "Motor neuron disease"
  }
]
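
When the search URL is built in code rather than typed at a shell, the `<` in the constraint is safest percent-encoded. The following Python sketch assumes only the `s`, `constraint` and `maxHits` parameters shown above; the `search_url` helper is a hypothetical name and not part of hermes.

```python
from urllib.parse import urlencode


def search_url(base, term, constraint=None, max_hits=None):
    """Build a search URL, percent-encoding each query parameter."""
    params = {"s": term}
    if constraint is not None:
        params["constraint"] = constraint
    if max_hits is not None:
        params["maxHits"] = max_hits
    return f"{base}/v1/snomed/search?{urlencode(params)}"


url = search_url("http://localhost:8080", "mnd", "<64572001", 5)
print(url)
```

Any HTTP client can then fetch `url` with an `Accept: application/json` header to get the JSON form of the results.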

Here I use the httpie command-line tool:

http -j localhost:8080/v1/snomed/concepts/24700007/extended

The result is an extended concept definition - all the information needed for inference, logic and display. For example, at the client level, we can then check whether this is a type of demyelinating disease or is a disease affecting the central nervous system without further server round-trips. Each relationship also includes the transitive closure tables for that relationship, making it easier to execute logical inference. Note how the list of descriptions includes a convenient acceptableIn and preferredIn so you can easily display the preferred term for your locale. If you provide an Accept-Language header, then you will also get a preferredDescription that is the best choice for those language preferences given what is installed.

HTTP/1.1 200 OK
Content-Type: application/json
Date: Mon, 08 Mar 2021 22:01:13 GMT

{
    "concept": {
        "active": true,
        "definitionStatusId": 900000000000074008,
        "effectiveTime": "2002-01-31",
        "id": 24700007,
        "moduleId": 900000000000207008
    },
    "descriptions": [
        {
            "acceptableIn": [],
            "active": true,
            "caseSignificanceId": 900000000000448009,
            "conceptId": 24700007,
            "effectiveTime": "2017-07-31",
            "id": 41398015,
            "languageCode": "en",
            "moduleId": 900000000000207008,
            "preferredIn": [
                900000000000509007,
                900000000000508004,
                999001261000000100
            ],
            "refsets": [
                900000000000509007,
                900000000000508004,
                999001261000000100
            ],
            "term": "Multiple sclerosis",
            "typeId": 900000000000013009
        },
        {
            "acceptableIn": [],
            "active": false,
            "caseSignificanceId": 900000000000020002,
            "conceptId": 24700007,
            "effectiveTime": "2002-01-31",
            "id": 41399011,
            "languageCode": "en",
            "moduleId": 900000000000207008,
            "preferredIn": [],
            "refsets": [],
            "term": "Multiple sclerosis, NOS",
            "typeId": 900000000000013009
        },
        {
            "acceptableIn": [],
            "active": false,
            "caseSignificanceId": 900000000000020002,
            "conceptId": 24700007,
            "effectiveTime": "2015-01-31",
            "id": 41400016,
            "languageCode": "en",
            "moduleId": 900000000000207008,
            "preferredIn": [],
            "refsets": [],
            "term": "Generalized multiple sclerosis",
            "typeId": 900000000000013009
        },
        {
            "acceptableIn": [],
            "active": false,
            "caseSignificanceId": 900000000000020002,
            "conceptId": 24700007,
            "effectiveTime": "2015-01-31",
            "id": 481990016,
            "languageCode": "en",
            "moduleId": 900000000000207008,
            "preferredIn": [],
            "refsets": [],
            "term": "Generalised multiple sclerosis",
            "typeId": 900000000000013009
        },
        {
            "acceptableIn": [],
            "active": true,
            "caseSignificanceId": 900000000000448009,
            "conceptId": 24700007,
            "effectiveTime": "2017-07-31",
            "id": 754365011,
            "languageCode": "en",
            "moduleId": 900000000000207008,
            "preferredIn": [
                900000000000509007,
                900000000000508004,
                999001261000000100
            ],
            "refsets": [
                900000000000509007,
                900000000000508004,
                999001261000000100
            ],
            "term": "Multiple sclerosis (disorder)",
            "typeId": 900000000000003001
        },
        {
            "acceptableIn": [
                900000000000509007,
                900000000000508004,
                999001261000000100
            ],
            "active": true,
            "caseSignificanceId": 900000000000448009,
            "conceptId": 24700007,
            "effectiveTime": "2017-07-31",
            "id": 1223979019,
            "languageCode": "en",
            "moduleId": 900000000000207008,
            "preferredIn": [],
            "refsets": [
                900000000000509007,
                900000000000508004,
                999001261000000100
            ],
            "term": "Disseminated sclerosis",
            "typeId": 900000000000013009
        },
        {
            "acceptableIn": [
                900000000000509007,
                900000000000508004,
                999001261000000100
            ],
            "active": true,
            "caseSignificanceId": 900000000000017005,
            "conceptId": 24700007,
            "effectiveTime": "2003-07-31",
            "id": 1223980016,
            "languageCode": "en",
            "moduleId": 900000000000207008,
            "preferredIn": [],
            "refsets": [
                900000000000509007,
                900000000000508004,
                999001261000000100
            ],
            "term": "MS - Multiple sclerosis",
            "typeId": 900000000000013009
        },
        {
            "acceptableIn": [
                900000000000509007,
                900000000000508004,
                999001261000000100
            ],
            "active": true,
            "caseSignificanceId": 900000000000017005,
            "conceptId": 24700007,
            "effectiveTime": "2003-07-31",
            "id": 1223981017,
            "languageCode": "en",
            "moduleId": 900000000000207008,
            "preferredIn": [],
            "refsets": [
                900000000000509007,
                900000000000508004,
                999001261000000100
            ],
            "term": "DS - Disseminated sclerosis",
            "typeId": 900000000000013009
        }
    ],
    "directParentRelationships": {
        "116676008": [
            409774005,
            32693004
        ],
        "116680003": [
            6118003,
            414029004,
            39367000
        ],
        "363698007": [
            21483005
        ],
        "370135005": [
            769247005
        ]
    },
    "parentRelationships": {
        "116676008": [
            138875005,
            107669003,
            123037004,
            409774005,
            32693004,
            49755003,
            118956008
        ],
        "116680003": [
            6118003,
            138875005,
            404684003,
            123946008,
            118234003,
            128139000,
            23853001,
            246556002,
            363170005,
            64572001,
            118940003,
            414029004,
            362975008,
            363171009,
            39367000,
            80690008,
            362965005
        ],
        "363698007": [
            138875005,
            21483005,
            442083009,
            123037004,
            25087005,
            91689009,
            91723000
        ],
        "370135005": [
            138875005,
            769247005,
            308489006,
            303102005,
            281586009,
            362981000,
            719982003
        ]
    },
    "refsets": [
        991381000000107,
        999002271000000101,
        991411000000109,
        1127581000000103,
        1127601000000107,
        900000000000497000,
        447562003
    ]
}
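
As a sketch of the client-side checks described above, the Python below tests subsumption against the transitive closure in `parentRelationships` and picks a preferred synonym for a language reference set, with no further server round-trips. The `extended` dictionary is a trimmed copy of the response above; the helper names are hypothetical, and a real client would parse the full JSON body instead.

```python
IS_A = "116680003"                # 'Is a' relationship type, as keyed in the response
SYNONYM = 900000000000013009      # description typeId for a synonym
GB_ENGLISH = 900000000000508004   # one of the language refsets listed in the response

# Trimmed copy of the extended concept shown above.
extended = {
    "concept": {"id": 24700007},
    "descriptions": [
        {"term": "Multiple sclerosis", "active": True,
         "typeId": 900000000000013009,
         "preferredIn": [900000000000509007, 900000000000508004, 999001261000000100]},
        {"term": "Multiple sclerosis (disorder)", "active": True,
         "typeId": 900000000000003001,
         "preferredIn": [900000000000509007, 900000000000508004, 999001261000000100]},
        {"term": "Disseminated sclerosis", "active": True,
         "typeId": 900000000000013009, "preferredIn": []},
    ],
    "parentRelationships": {
        "116680003": [6118003, 138875005, 404684003, 64572001, 414029004, 39367000],
    },
}


def is_a(ext, ancestor_id):
    """True if the concept is, or is a descendant of, ancestor_id."""
    if ext["concept"]["id"] == ancestor_id:
        return True
    return ancestor_id in ext["parentRelationships"].get(IS_A, [])


def preferred_synonym(ext, language_refset_id):
    """Return the active synonym preferred in the given language refset, if any."""
    for d in ext["descriptions"]:
        if d["active"] and d["typeId"] == SYNONYM and language_refset_id in d["preferredIn"]:
            return d["term"]
    return None


print(is_a(extended, 64572001))                 # 64572001 is 'Disease'
print(preferred_synonym(extended, GB_ENGLISH))
```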

Here we use the expression constraint language to search for the term "mnd", ensuring we only receive results that are a type of 'Disease' ("<64572001"):

http -j 'localhost:8080/v1/snomed/search?s=mnd&constraint=<64572001'

Results:

[
    {
        "conceptId": 37340000,
        "id": 486696014,
        "preferredTerm": "Motor neuron disease",
        "term": "MND - Motor neurone disease"
    }
]

More complex expressions are supported, and no search term is actually needed.

Let's get all drugs with exactly three active ingredients:

http -j 'localhost:8080/v1/snomed/search?constraint=<373873005|Pharmaceutical / biologic product| : [3..3]  127489000 |Has active ingredient|  = <  105590001 |Substance|'

Or, what about all disorders of the lung that are associated with oedema?

http -j 'localhost:8080/v1/snomed/search?constraint= <  19829001 |Disorder of lung|  AND <  301867009 |Edema of trunk|'

The ECL can be written in a more concise fashion:

http -j 'localhost:8080/v1/snomed/search?constraint= <19829001 AND <301867009'
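
If constraints are assembled in code, small helpers can keep the concise form readable. This Python sketch composes the same constraint as above and percent-encodes it for use in a query string; `descendants_of` and `all_of` are hypothetical helper names and cover only the `<` and `AND` operators used in this example.

```python
from urllib.parse import quote


def descendants_of(concept_id):
    """ECL clause matching descendants of a concept (the '<' operator)."""
    return f"<{concept_id}"


def all_of(*clauses):
    """Join ECL clauses with AND."""
    return " AND ".join(clauses)


ecl = all_of(descendants_of(19829001), descendants_of(301867009))
print(ecl)
print(f"/v1/snomed/search?constraint={quote(ecl)}")
```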

There are endpoints for crossmapping to and from SNOMED.

Let's map one of our diagnostic terms into ICD-10:

http -j localhost:8080/v1/snomed/concepts/24700007/map/999002271000000101

Result:

[
    {
        "active": true,
        "correlationId": 447561005,
        "effectiveTime": "2020-08-05",
        "id": "57433204-2371-5c6f-855f-94ff9dad7ba6",
        "mapAdvice": "ALWAYS G35.X",
        "mapCategoryId": 1,
        "mapGroup": 1,
        "mapPriority": 1,
        "mapRule": "",
        "mapTarget": "G35X",
        "moduleId": 999000031000000106,
        "referencedComponentId": 24700007,
        "refsetId": 999002271000000101
    }
]
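
A client will usually want just the target codes from a result like this. The Python sketch below keeps the active entries and orders them by `mapGroup` and `mapPriority`; the `map_targets` helper is a hypothetical name, and the input is a trimmed copy of the result above.

```python
# Trimmed copy of the crossmap result shown above.
result = [
    {"active": True, "mapGroup": 1, "mapPriority": 1,
     "mapTarget": "G35X", "referencedComponentId": 24700007},
]


def map_targets(entries):
    """Return target codes of active map entries, ordered by group then priority."""
    active = [e for e in entries if e["active"]]
    active.sort(key=lambda e: (e["mapGroup"], e["mapPriority"]))
    return [e["mapTarget"] for e in active]


print(map_targets(result))
```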

And of course, we can crossmap back to SNOMED as well:

http -j localhost:8080/v1/snomed/crossmap/999002271000000101/G35X

7. Embed into another application

In the :deps map of your deps.edn file (make sure you change the commit-id):

com.eldrix.hermes {:git/url "https://github.com/wardle/hermes.git"
                   :sha     "097e3094070587dc9362ca4564401a924bea952c"}

Or, build a library jar (see below)

B. How to use a running service in your own applications

The terminology server can be embedded into your own applications or, more commonly, you would use it as a standalone web service. Further documentation will follow.

C. How to run or build from source code

Run compilation checks (optional)

clj -M:check

Run unit tests and linters (optional)

clj -M:test
clj -M:lint/kondo
clj -M:lint/eastwood

Run direct from the command-line

You will get help text.

clj -M -m com.eldrix.hermes.core

View outdated dependencies

clj -M:outdated

You can view a complete list of dependencies; try:

clj -X:deps tree

Building uberjar

Build the uberjar:

clojure -X:uberjar

Building library jar

A library jar contains only hermes-code, and none of the bundled dependencies.

clojure -X:jar
