Henosis is a cloud-native, lightweight Python-based recommender framework that helps applications provide recommendations to their users.
Henosis is in active development and is released as alpha software. Henosis is being developed at the NASA Jet Propulsion Laboratory (JPL) using user-driven development (UDD) with generous support from the Office of Safety and Mission Success (5X).
Henosis brings together model training and testing, storage and deployment, and querying under a single framework. Henosis provides data scientists with a straightforward and generalizable environment in which to train, test, store, and deploy categorical machine learning models for making form field recommendations, while also providing software engineers and web developers with a REST API that can be easily queried for recommendations and integrated across different enterprise applications.
Data scientists and developers can use the framework together to provide recommendations to users. Data scientists get a straightforward interface for training, testing, storing, and deploying categorical machine learning models. Developers get an API for serving form field recommendations to users of enterprise applications.
Henosis is intended to work well with data scientists' modeling workflow, and can be used to create training and test splits, fit models, and deploy models to a running Henosis instance.
from Henosis.model import Data, Models
from sklearn.naive_bayes import MultinomialNB
# read in data from a csv
d = Data()
d.load(csv_path='file.csv')
print(d.all.head())
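# split data
# X_vars (feature column names) and y_var (target column name)
# are placeholders that you define for your own data
d.test_train_split(
    d.all[X_vars],
    d.all[y_var],
    share_train=0.8
)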
# fit a model (use any categorical scikit-learn model)
# this is where you do your "magic"!
m = Models().SKModel(MultinomialNB(alpha=0.25))
m.train(d)
m.test(d)
# print results
print(m.train_results)
print(m.test_results)
# connect to the deployed Henosis instance by referencing a configuration file
from Henosis.utils import Connect
s = Connect().config(config_yaml_path='config.yaml')
# store the model in S3 and Elasticsearch
# count_vect is the encoder fitted on your features; define it before this call
m.store(
    server_config=s,
    model_path='model_' + y_var + '_1.pickle',
    encoder_path='encoder_' + y_var + '_1.pickle',
    encoder=count_vect
)
# deploy a model for use
m.deploy(
    server_config=s,
    deploy=True
)
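The snippet above leaves the encoder and the feature and target columns undefined. As a rough illustration of the kind of categorical model being trained, here is a minimal sketch in plain scikit-learn. It does not use the Henosis API, the toy data is hypothetical, and treating `count_vect` as a `CountVectorizer` is an assumption rather than something the snippet confirms.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data: free-text form input and the categorical field to recommend.
texts = [
    "battery voltage dropped during thermal test",
    "battery cell overheated in vacuum chamber",
    "power bus voltage spike at startup",
    "battery charge controller fault",
    "antenna gain lower than expected",
    "antenna pointing error during downlink",
    "radio downlink signal lost after antenna deploy",
    "downlink antenna cable loose",
]
labels = ["power", "power", "power", "power",
          "telecom", "telecom", "telecom", "telecom"]

# Hold out 20% of the rows for testing, keeping both classes in each split.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, train_size=0.8, random_state=0, stratify=labels
)

# Encode text as token counts. The encoder is fitted on training data only.
count_vect = CountVectorizer()
X_train_counts = count_vect.fit_transform(X_train)
X_test_counts = count_vect.transform(X_test)

# Fit a categorical model with the same estimator used in the snippet above.
clf = MultinomialNB(alpha=0.25)
clf.fit(X_train_counts, y_train)

# Predicted category and per-class probabilities for each held-out row.
predictions = clf.predict(X_test_counts)
probabilities = clf.predict_proba(X_test_counts)
print(list(zip(X_test, predictions)))
```

The per-class probabilities are what make a model like this usable for recommendations: the highest-ranked categories can be offered as suggestions for a form field.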
More detailed information and examples are available in the rest of the documentation.
Running a Henosis instance for serving recommendations requires first setting up an Elasticsearch index (details in the configuration section) and an Amazon S3 bucket. Following that, deploying an instance of Henosis for making recommendations is as easy as placing the following in a Python file.
from Henosis.server import Connect, Server
# run the server
c = Connect().config(config_yaml_path='config.yaml')
s = Server(c).config()
s.run(port=5005)
Once a Henosis instance is running, developers can query for recommendations, model information, and API request information from the available Henosis API (REST) endpoints.
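To give a feel for what a query might look like, the sketch below builds a GET URL that carries a partially completed form as JSON. The endpoint path and the `formData` parameter name are placeholders chosen for illustration, not taken from the Henosis API reference. Only the port matches the server example above. Consult the API documentation for the actual endpoints.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumptions: host, endpoint path, and parameter name are all hypothetical.
BASE_URL = "http://localhost:5005"
RECOMMEND_PATH = "/api/recommend"


def build_recommendation_url(form_data, base_url=BASE_URL, path=RECOMMEND_PATH):
    """Return a GET URL carrying the partially completed form as JSON."""
    query = urlencode({"formData": json.dumps(form_data, sort_keys=True)})
    return f"{base_url}{path}?{query}"


form = {"title": "battery voltage dropped", "project": "demo"}
url = build_recommendation_url(form)
print(url)

# Round trip: decode the query string back into the original form data.
decoded = json.loads(parse_qs(urlsplit(url).query)["formData"][0])
print(decoded == form)
```

An HTTP client such as `requests` could then send the URL, for example with `requests.get(url)`, and read the recommendations from the response body.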
Henosis acts as a bridge between end users and the data scientists who train recommendation (predictive) models. Several classes facilitate the interaction between data, scikit-learn models, and a REST API that provides recommendations and other information.
Henosis classes (bold borders) interface with the data scientist or statistician, developers querying for recommendations, and trained models. When queried for recommendations, Henosis references model information in Elasticsearch, loads appropriate models from AWS S3, and serves recommendations by using data available in the query (REST API request).
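The query flow described above can be sketched with in-memory stand-ins. This is a simplified illustration, not the Henosis implementation: a dict plays the role of the Elasticsearch index, another dict plays the role of the S3 bucket, and every field name, key, and the toy "model" format are hypothetical.

```python
import json

# Stand-in for the Elasticsearch index: metadata about each stored model.
model_index = {
    "subsystem": {"model_path": "model_subsystem_1.json", "deployed": True},
    "severity": {"model_path": "model_severity_1.json", "deployed": False},
}

# Stand-in for the S3 bucket: serialized models keyed by path.
# Each toy "model" is just a ranked list of categories.
bucket = {
    "model_subsystem_1.json": json.dumps(["power", "telecom", "thermal"]),
    "model_severity_1.json": json.dumps(["low", "high"]),
}


def load_model(path):
    """Deserialize a toy model; it ignores its input and returns top categories."""
    ranked_labels = json.loads(bucket[path])
    return lambda form_data, n=2: ranked_labels[:n]


def recommend(form_data):
    """Return recommendations for every form field with a deployed model."""
    recommendations = {}
    for field, info in model_index.items():  # 1. look up model information
        if not info["deployed"]:
            continue  # models that are stored but not deployed are skipped
        model = load_model(info["model_path"])  # 2. load the stored model
        recommendations[field] = model(form_data)  # 3. predict from query data
    return recommendations


result = recommend({"title": "battery voltage dropped"})
print(result)
```

Keeping model metadata separate from the stored model files means a model can be stored without being served, which mirrors the separate store and deploy steps in the modeling workflow.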
- Python 3.6+ (untested on lower versions) with the following packages:
  - boto3
  - dill
  - gevent
  - Flask
  - Flask-CORS
  - Flask-HTTPAuth
  - Flask-RESTful
  - imbalanced-learn
  - Jinja2
  - jwt
  - numpy
  - pandas
  - PyYAML
  - requests
  - scikit-learn
- A working Amazon Web Services (AWS) S3 bucket, along with:
  - AWS key
  - AWS secret
- A running Elasticsearch 6+ server (untested on lower versions). You'll need to create an index and specify a mapping (documentation is provided and helpful scripts are available in the scripts directory).
Install the Henosis library with pip, after which it can be imported in Python:
pip install Henosis
If you'd like, you can also fork the repository and pull Henosis to a local directory.
The latest Henosis documentation covers how to use Henosis for modeling and for providing recommendations within your applications.
In October 2017, our data science team at the NASA Jet Propulsion Laboratory (JPL) was approached by the Office of Safety and Mission Success (5X) to improve processes for reporting in the Problem Reporting System (PRS). PRS is an internal tool that allows engineers to submit Problem Failure Reports (PFRs) and Incident Surprise, Anomaly reports (ISAs), which document pre-launch test failures and post-launch operational anomalies experienced by spacecraft. These reports not only serve as a record of past problems but also of past solutions to the problems described.
Despite their value, the reports contained within the PRS are costly to fill out and submit. With dozens of textual, categorical, and other inputs in the forms, the PFRs and ISAs draw valuable time away from mission staff, time better spent on spacecraft operations and mission work. A solution, such as a recommendation system for form fields, was needed that would reduce the time needed to file reports in PRS while ensuring ease of use for users already familiar with the current system. From a data science and IT operations perspective, we needed a straightforward process for deploying a simple recommendation system in enterprise applications containing categorical form inputs (like dropdown menus).
While the initial effort focused on one internal use case, Henosis was developed as a generalized, open-source framework and is freely available for use in other applications.
See a bug that needs fixing or want to add a new feature? Fork the repository with the name of the issue (e.g. 'issue_12'; open an issue first if there isn't one already) or the name of your new feature. When ready, submit a pull request!
This project is licensed under the Apache 2.0 license and released by the California Institute of Technology.
- NASA Jet Propulsion Laboratory
- Ian Colwell
- Leslie Callum
- Harald Schone
- Kyle Hundman
- Paul Ramirez