Streams Healthcare Analytics Platform Roadmap

Our roadmap and goals are always subject to discussion. Please open an issue if you have any feedback or requests.

Goals

Allow users with little or no Streams experience to design and build a real-time healthcare analytics application. Users should be able to create an application with little or no coding.
Allow users to focus on the analytics of a Streams Healthcare application.
Handle all the plumbing and infrastructure of an application.
Define and promote a reference architecture and best practices.

Platform Design

This diagram describes the Streams Healthcare Analytics Platform. This demonstrates the major components of the platform and the types of services that we would like to provide.

The platform is designed around a microservice architecture. A microservice is a small application written in SPL, Java, or Python that fulfills a specific task in a bigger healthcare application. An application is made up of one or more microservices, loosely connected to each other using the dynamic connection feature (Import/Export operators) in Streams. To learn more about the microservice architecture in Streams, refer to this post.

In the diagram above, the blue boxes represent services that are part of the platform. The purple and yellow boxes represent the areas that we want our users to focus on.

Data Ingest

The first problem that everyone runs into is data ingest. In this part of the application, we are trying to solve these main problems:

How to ingest data from medical devices or device integrators?
How to transform the device data into a format that can be analyzed in a Streams application?
What are the common data schema types when analyzing medical device data?

As part of the framework, we will provide adapters to some of the most common device integrators. For example:

Bedmaster
Capsule
Vines from True Process
Websphere IIB
EPIC
MLLP with HL7 (for any device integrator that uses MLLP/HL7 as their protocol)
FHIR
what else?

We will also attempt to define a set of common data schema types. Regardless of where and how the device data is coming from, the data will be represented in a standard manner for downstream applications to analyze.
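
As a rough illustration of what a common schema type could look like, here is a minimal Python sketch. The `Observation` type, its field names, and the vendor record keys (`pid`, `dev`, `ts`, `type`, `val`, `uom`) are all hypothetical; the platform has not defined its schema types yet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """One device reading in a hypothetical common schema."""
    patient_id: str      # identifier assigned by the hospital or integrator
    device_id: str       # source device or integrator channel
    timestamp_ms: int    # epoch time of the reading, in milliseconds
    reading_type: str    # e.g. "HR", "SpO2", "ECG_LEAD_II"
    value: float         # numeric value of the reading
    unit: Optional[str] = None  # unit of measure, if the source provides one


def from_vendor_record(record: dict) -> Observation:
    """Map one vendor-specific record (made-up field names) to the common schema."""
    return Observation(
        patient_id=str(record["pid"]),
        device_id=str(record["dev"]),
        timestamp_ms=int(record["ts"]),
        reading_type=str(record["type"]),
        value=float(record["val"]),
        unit=record.get("uom"),
    )
```

Each adapter would own a mapping function like `from_vendor_record`, so that downstream analytics only ever see the common type.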

Prepare

Next, we need to prepare the data for downstream applications to analyze. Some of the common problems around this space include:

Deduplication
Resampling
Normalization
Cleaning
Noise reduction
etc.
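
Two of these steps, deduplication and resampling, can be sketched in a few lines of Python. This is only an illustration: it assumes samples arrive as `(timestamp_ms, value)` pairs, treats a repeated timestamp as a duplicate, and resamples by averaging into fixed-width buckets. The function names are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

Sample = Tuple[int, float]  # (timestamp in ms, value)


def deduplicate(samples: Iterable[Sample]) -> Iterator[Sample]:
    """Drop samples whose timestamp was already seen, keeping the first one."""
    seen = set()
    for ts, value in samples:
        if ts not in seen:
            seen.add(ts)
            yield ts, value


def resample_mean(samples: Iterable[Sample], bucket_ms: int) -> List[Sample]:
    """Average samples into fixed-width time buckets, keyed by bucket start time."""
    buckets = {}
    for ts, value in samples:
        start = (ts // bucket_ms) * bucket_ms
        buckets.setdefault(start, []).append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]
```

On an unbounded stream, the `seen` set would have to be limited to a time window, and resampling would be done per window rather than over a whole list.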

Base Analytics

The platform should provide a set of common / basic analytics for physiological data. These analytics can be used as building blocks for the complex analytics and prediction models. This is an area where we need help and contribution: to define and implement the core analytics. Here are some of the things we can do:

Simple vital analytics - calculating a rolling average, raising alerts when vitals go beyond the normal range
Common ECG Analytics
- R-Peak Detection
- R-R Interval
- Heart rate variability (HRV)
Common EEG Analytics
- What are they?
ICP Waveform Analytics
- What analytics are useful here?
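
To make the simpler building blocks concrete, here is a minimal Python sketch of a rolling average, a range check, R-R intervals, and one HRV measure. It assumes R-peak detection has already produced a list of peak times in milliseconds, and it uses RMSSD as the HRV measure, which is only one of several in common use. All names are hypothetical, and the normal range is supplied by the caller.

```python
from collections import deque
from math import sqrt
from typing import List, Optional


class RollingAverage:
    """Mean of the most recent `size` values."""

    def __init__(self, size: int) -> None:
        self._window = deque(maxlen=size)

    def update(self, value: float) -> float:
        self._window.append(value)
        return sum(self._window) / len(self._window)


def out_of_range(value: float, low: float, high: float) -> bool:
    """True when a vital falls outside the caller-supplied normal range."""
    return value < low or value > high


def rr_intervals(r_peak_times_ms: List[float]) -> List[float]:
    """R-R intervals: differences between consecutive R-peak times."""
    return [b - a for a, b in zip(r_peak_times_ms, r_peak_times_ms[1:])]


def rmssd(rr_ms: List[float]) -> Optional[float]:
    """RMSSD of the R-R intervals; None if there are fewer than two intervals."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        return None
    return sqrt(sum(d * d for d in diffs) / len(diffs))
```

In a Streams application, each of these would typically sit in its own operator or microservice, so that more complex analytics can subscribe to the results.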

Aggregated Analytics

This is an area where we combine and aggregate results from the base analytics to form more sophisticated analytic rules. This is where we would like our end users to focus. The goal of the platform is to facilitate the development and usage of these more complex analytics.

There are two areas for this work:

Aggregating base analytics results - In this area, we would like our users to easily take the results from the base analytics and express what the more complex rules are. For example, for septic shock detection, the user should be able to describe a rule like this:
- if temperature is > 38 degrees Celsius or < 36 degrees Celsius
- and heart rate is > 90
- and respiratory rate is > 20 or PaCO2 is < 32 mmHg
- and WBC is > 12,000/mm3, or < 4,000/mm3, or > 10% bands
- then raise an alert for early septic shock detection.
Reusing complex analytics developed in R, Matlab, Python or others.
- Many of the healthcare analytics are first developed in Matlab, Python or R. We would like to make it easy for our users to test, validate and reuse these analytics.
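
For illustration, the example rule in the first item could be written as a plain Python predicate over the latest base analytics results. This is a sketch only: the `Vitals` type and its field names are hypothetical, the thresholds are copied from the example rule, and the platform may end up offering a different way to express rules.

```python
from dataclasses import dataclass


@dataclass
class Vitals:
    temperature_c: float      # body temperature, degrees Celsius
    heart_rate: float         # beats per minute
    respiratory_rate: float   # breaths per minute
    paco2_mmhg: float         # PaCO2, mmHg
    wbc_per_mm3: float        # white blood cell count per mm3
    band_percent: float       # band neutrophils, percent


def early_septic_shock_alert(v: Vitals) -> bool:
    """Evaluate the example rule; all four conditions must hold."""
    abnormal_temp = v.temperature_c > 38 or v.temperature_c < 36
    fast_heart = v.heart_rate > 90
    abnormal_resp = v.respiratory_rate > 20 or v.paco2_mmhg < 32
    abnormal_wbc = (
        v.wbc_per_mm3 > 12000 or v.wbc_per_mm3 < 4000 or v.band_percent > 10
    )
    return abnormal_temp and fast_heart and abnormal_resp and abnormal_wbc
```

Naming each condition keeps the code close to the way the rule is written, which is the kind of readability we would want from any rule format.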

Patient Data Correlation / EMR Integration

For some of the more complex analytics, we may need to incorporate EMR data or doctor's notes as part of the analysis. For example, we may want to retrieve a patient's medical history, or doctor's notes that include some of the doctor's observations.

These types of data can be ingested from existing hospital infrastructure. They are usually communicated using the HL7 / FHIR protocols.

As part of this, we are going to provide an integration with Watson Explorer in order to take advantage of their cognitive and analytics capabilities in Streams.

Central Monitoring Dashboard

As part of the platform, we would like to create a simple dashboard. The simple dashboard will help users visualize their data and validate their analytic results. The dashboard can be web-based, mobile, or be integrated with a third-party tool.

This simple dashboard can be used as an example / reference implementation of how to build a dashboard for a Streams healthcare application.

Alert and Notification Framework

When an important event occurs, we need to be able to notify and alert the right people to help the patient. In this part of the framework, we want to:

Allow clinicians to subscribe to patient events. Event subscription should be role-based. A nurse may be interested in a different set of events from an attending physician.
Deliver notification to the right people based on alert types and patient information
Alerts can be delivered via email, text messaging, etc
Alerts should be displayed onto a dashboard
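
Role-based subscription and routing could look roughly like the following Python sketch. The `AlertRouter` class, the role names, and the alert type strings are hypothetical. Delivery by email, text message, or dashboard is left out; the sketch only decides who should receive an alert.

```python
from collections import defaultdict
from typing import Dict, List, Set


class AlertRouter:
    """Decide which clinicians receive an alert, using role-based subscriptions."""

    def __init__(self) -> None:
        # role -> alert types that the role subscribes to
        self._role_subscriptions: Dict[str, Set[str]] = defaultdict(set)
        # patient_id -> {clinician name: role}
        self._care_team: Dict[str, Dict[str, str]] = defaultdict(dict)

    def subscribe_role(self, role: str, alert_type: str) -> None:
        self._role_subscriptions[role].add(alert_type)

    def assign(self, patient_id: str, clinician: str, role: str) -> None:
        self._care_team[patient_id][clinician] = role

    def recipients(self, patient_id: str, alert_type: str) -> List[str]:
        """Clinicians on the patient's care team whose role subscribes to the alert."""
        team = self._care_team.get(patient_id, {})
        return sorted(
            name for name, role in team.items()
            if alert_type in self._role_subscriptions.get(role, set())
        )
```

For example, if nurses subscribe to `"low_spo2"` and both nurses and physicians subscribe to `"septic_shock"`, a low SpO2 alert for a patient reaches only the nurses assigned to that patient.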

Persistence

It is common for a hospital to store patient data in persistent storage. In this module, we will look at:

What persistent storage should be used (HBase? Hive? Databases?)
How to organize patient data in the persistent storage?
From a Streams application, how to write patient data into persistent storage?

Research

Another important aspect of the platform is to support researchers in developing analytics for real-time monitoring and prediction. Researchers need access to real patient data to spot patterns, develop algorithms and validate their work. Therefore, it is important for the framework to provide the following services:

Adapter to public online patient databases (like PhysioNet)
Collect and anonymize patient data - this involves removing or encrypting any personally identifiable information from the data set.
Persist research data into the research database
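
As an illustration of the anonymization step, the Python sketch below removes direct identifiers and replaces the patient identifier with a keyed hash. Note that this swaps encryption for keyed hashing (HMAC-SHA256), which keeps records for the same patient linkable without storing the original identifier. The field names are hypothetical, and a real deployment would need its own list of identifying fields.

```python
import hashlib
import hmac
from typing import Dict

# Hypothetical field names; a real deployment would define its own lists.
DROP_FIELDS = {"name", "address", "phone"}
PSEUDONYMIZE_FIELDS = {"patient_id"}


def anonymize(record: Dict[str, object], secret_key: bytes) -> Dict[str, object]:
    """Remove direct identifiers and replace linking identifiers with keyed hashes."""
    cleaned = {}
    for field, value in record.items():
        if field in DROP_FIELDS:
            continue  # drop the field entirely
        if field in PSEUDONYMIZE_FIELDS:
            digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
            cleaned[field] = digest.hexdigest()
        else:
            cleaned[field] = value
    return cleaned
```

The same key always maps a patient to the same pseudonym, so the key itself must be kept out of the research database.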

Researchers need to test and validate their analytics. We would like to provide the following services for test / validation purposes:

Replaying data from a research database - A research database can be an internal database, or a popular external database like PhysioNet.
Create a test framework to help researchers validate results
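
A replay service could be as simple as the following Python sketch, which yields stored samples in order while preserving their original spacing. It assumes samples are `(timestamp_ms, value)` pairs already read from the research database; the `replay` function and its `speed` and `sleep` parameters are hypothetical.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple

Sample = Tuple[int, float]  # (timestamp in ms, value)


def replay(samples: Iterable[Sample], speed: float = 1.0,
           sleep: Callable[[float], None] = time.sleep) -> Iterator[Sample]:
    """Yield stored samples in order, pausing to preserve their original spacing."""
    previous_ts = None
    for ts, value in samples:
        if previous_ts is not None:
            gap_s = max(ts - previous_ts, 0) / 1000.0
            sleep(gap_s / speed)  # speed > 1 replays faster than real time
        previous_ts = ts
        yield ts, value
```

Passing a custom `sleep` function makes the replay easy to test: a test can record the requested delays instead of actually waiting.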
