This guide outlines how to use Helm to deploy and manage the Analysis Engine (AE) on Kubernetes (tested on 1.13.3).
It requires the following prerequisites before getting started:
- Access to a running Kubernetes cluster
- Helm is installed
- A valid account for IEX Cloud
- A valid account for Tradier
- Optional - Install Ceph Cluster for Persistent Storage Support
- Optional - Install the Stock Analysis Engine for Local Development Outside of Kubernetes
AE builds multiple helm charts that are hosted on a local helm repository, and everything runs within the ae kubernetes namespace.
Please change to the ./helm directory:
cd helm
This will build all the AE charts, download stable/redis and stable/minio, and ensure the local helm server is running:
./build.sh
Each AE chart supports attributes for connecting to the stack's backing services (such as Redis and Minio). Depending on your environment, these services may require you to edit the associated helm chart's values.yaml file(s) before deploying AE with the start.sh script.
Below are some common integration questions on how to configure each one (hopefully) for your environment:
The start.sh script installs the stable/redis chart with the included ./redis/values.yaml, which can be configured as needed before the start script boots up the included Bitnami Redis cluster.
The start.sh script installs the stable/minio chart with the included ./minio/values.yaml, which can be configured as needed before the start script boots up the included Minio.
Each of the AE charts can be configured prior to running the stack's core AE chart found in:
Please set your AWS credentials (which will be installed as kubernetes secrets) in the file:
Data collection is broken up into three categories of jobs: intraday, daily, and weekly. Intraday data collection is built to be fast and pull data that changes often, while weekly data is mostly static and expensive for IEX Cloud users to pull. These chart jobs are intended to be used with cron jobs that fire work into the AE workers, which compress + cache the pricing data for algorithms and backtesting.
Set your IEX Cloud account up in each chart.

Supported IEX Cloud Attributes:

# IEX Cloud
# https://iexcloud.io/docs/api/
iex:
  addToSecrets: true
  secretName: ae.k8.iex.<intraday|daily|weekly>
  # Publishable Token:
  token: ""
  # Secret Token:
  secretToken: ""
  apiVersion: beta
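As a concrete sketch for one chart (the token values here are placeholders, not real credentials), the intraday chart's iex section could be filled in like this:

```yaml
# Hypothetical example for ae-intraday/values.yaml;
# replace the placeholder tokens with your IEX Cloud tokens
iex:
  addToSecrets: true
  secretName: ae.k8.iex.intraday
  # Publishable Token:
  token: "YOUR_PUBLISHABLE_TOKEN"
  # Secret Token:
  secretToken: "YOUR_SECRET_TOKEN"
  apiVersion: beta
```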
Set your Tradier account up in each chart.

Supported Tradier Attributes:

# Tradier
# https://developer.tradier.com/documentation
tradier:
  addToSecrets: true
  secretName: ae.k8.tradier.<intraday|daily|weekly>
  token: ""
  apiFQDN: api.tradier.com
  dataFQDN: sandbox.tradier.com
  streamFQDN: sandbox.tradier.com
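A minimal sketch for the intraday chart, keeping the default sandbox endpoints shown above (the token is a placeholder, not a real credential):

```yaml
# Hypothetical example for ae-intraday/values.yaml;
# replace the placeholder token with your Tradier token
tradier:
  addToSecrets: true
  secretName: ae.k8.tradier.intraday
  token: "YOUR_TRADIER_TOKEN"
  apiFQDN: api.tradier.com
  dataFQDN: sandbox.tradier.com
  streamFQDN: sandbox.tradier.com
```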
- Set the intraday.tickers to a comma-delimited list of tickers to pull per minute.
- Set the daily.tickers to a comma-delimited list of tickers to pull at the end of each trading day.
- Set the weekly.tickers to a comma-delimited list of tickers to pull every week. This is used for pulling "quota-expensive" data that does not change often, like IEX Financials or Earnings data.
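For instance, the three settings in the respective values.yaml files could look like the sketch below (the ticker choices are arbitrary examples, not recommendations):

```yaml
# Example only - pick the tickers you actually need
# (weekly data is quota-expensive on IEX Cloud)
intraday:
  tickers: "SPY,AAPL,TSLA"
daily:
  tickers: "SPY,AAPL,TSLA"
weekly:
  tickers: "SPY,AAPL"
```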
Please set your Jupyter login password that works with a browser:

jupyter:
  password: admin
By default, Jupyter is hosted with nginx-ingress with TLS encryption at:
The default login password is:
- password: admin
By default, Minio is hosted with nginx-ingress with TLS encryption at:
Default login credentials are:
- Access Key: trexaccesskey
- Secret Key: trex123321
The AE pods use a Distributed Ceph Cluster for persisting data outside kubernetes with ~300 GB of disk space.
To set your kubernetes cluster's StorageClass to use ceph-rbd, use the script:
./set-storage-class.sh ceph-rbd
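For reference, a ceph-rbd StorageClass typically looks something like the sketch below; the monitor address, pool, and secret names are assumptions for a generic Ceph cluster, not values from this repository:

```yaml
# Hypothetical ceph-rbd StorageClass using the in-tree rbd provisioner
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd
provisioner: kubernetes.io/rbd
parameters:
  monitors: 10.0.0.1:6789           # assumed Ceph monitor address
  pool: rbd                         # assumed pool name
  adminId: admin
  adminSecretName: ceph-secret      # assumed kubernetes secret holding the Ceph admin key
  adminSecretNamespace: kube-system
```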
By default, the AE charts use the Stock Analysis Engine container. Here is how to set up each AE component chart to use a private docker image in a private docker registry (for building your own algos in-house).
Each of the AE charts values.yaml files contain two required sections for deploying from a private docker registry.
Set the Private Docker Registry Authentication values in each chart.

Please set the registry address, secret name, and docker config JSON for authentication using this format.

Note

The imagePullSecrets attribute uses a naming convention format: <base key>.<component name>. The base key is ae.docker.creds., and this approach allows different docker images for each component (for testing), like intraday data collection vs. running a backup job or even hosting Jupyter.

Supported Private Docker Registry Authentication Attributes:

registry:
  addToSecrets: true
  address: <FQDN to docker registry>:<PORT - the registry uses a default port of 5000>
  imagePullSecrets: ae.docker.creds.<core|backtester|backup|intraday|daily|weekly|jupyter>
  dockerConfigJSON: '{"auths":{"<FQDN>:<PORT>":{"Username":"username","Password":"password","Email":""}}}'
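A small sketch showing how the dockerConfigJSON string and the per-component imagePullSecrets names line up; the registry address and credentials are placeholders:

```shell
#!/bin/sh
# Placeholders - substitute your registry FQDN:PORT and credentials
REGISTRY="registry.example.com:5000"
USER="username"
PASS="password"

# Build the dockerConfigJSON value for the chart's registry section
JSON="{\"auths\":{\"${REGISTRY}\":{\"Username\":\"${USER}\",\"Password\":\"${PASS}\",\"Email\":\"\"}}}"
echo "dockerConfigJSON: '${JSON}'"

# One imagePullSecrets name per AE component, following ae.docker.creds.<component>
for component in core backtester backup intraday daily weekly jupyter; do
  echo "ae.docker.creds.${component}"
done
```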
Set the AE Component's docker image name, tag, pullPolicy, and private flag.

Supported Private Docker Image Attributes per AE Component:

image:
  private: true
  name: YOUR_IMAGE_NAME_HERE
  tag: latest
  pullPolicy: Always
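Putting the two sections together, one component's values.yaml for a private registry might look like this sketch (all addresses, credentials, and image names are placeholders):

```yaml
# Hypothetical private-registry configuration for a single AE component
registry:
  addToSecrets: true
  address: registry.example.com:5000
  imagePullSecrets: ae.docker.creds.intraday
  dockerConfigJSON: '{"auths":{"registry.example.com:5000":{"Username":"username","Password":"password","Email":""}}}'

image:
  private: true
  name: my-org/stock-analysis-engine
  tag: latest
  pullPolicy: Always
```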
This command can take a few minutes to download and start all the components:
./start.sh
If you do not want to use start.sh, you can start the charts with helm using:
helm install \
    --name=ae \
    ./ae \
    --namespace=ae \
    -f ./ae/values.yaml

helm install \
    --name=ae-redis \
    stable/redis \
    --namespace=ae \
    -f ./redis/values.yaml

helm install \
    --name=ae-minio \
    stable/minio \
    --namespace=ae \
    -f ./minio/values.yaml

helm install \
    --name=ae-jupyter \
    ./ae-jupyter \
    --namespace=ae \
    -f ./ae-jupyter/values.yaml

helm install \
    --name=ae-backup \
    ./ae-backup \
    --namespace=ae \
    -f ./ae-backup/values.yaml

helm install \
    --name=ae-intraday \
    ./ae-intraday \
    --namespace=ae \
    -f ./ae-intraday/values.yaml

helm install \
    --name=ae-daily \
    ./ae-daily \
    --namespace=ae \
    -f ./ae-daily/values.yaml

helm install \
    --name=ae-weekly \
    ./ae-weekly \
    --namespace=ae \
    -f ./ae-weekly/values.yaml
./show-pods.sh
------------------------------------
getting pods in ae:
kubectl get pods -n ae
NAME                              READY   STATUS    RESTARTS   AGE
ae-minio-55d56cf646-87znm         1/1     Running   0          3h30m
ae-redis-master-0                 1/1     Running   0          3h30m
ae-redis-slave-68fd99b688-sn875   1/1     Running   0          3h30m
backtester-5c9687c645-n6mmr       1/1     Running   0          4m22s
engine-6bc677fc8f-8c65v           1/1     Running   0          4m22s
engine-6bc677fc8f-mdmcw           1/1     Running   0          4m22s
jupyter-64cf988d59-7s7hs          1/1     Running   0          4m21s
Once your ae-intraday/values.yaml is ready, you can automate intraday data collection by using the helper script to start the helm release for ae-intraday:
./run-intraday-job.sh <PATH_TO_VALUES_YAML>
And for a cron job, include the -r argument to ensure the job is recreated:
./run-intraday-job.sh -r <PATH_TO_VALUES_YAML>
After data collection, you can view compressed data for a ticker within the redis cluster with:
./view-ticker-data-in-redis.sh TICKER
Once your ae-daily/values.yaml is ready, you can automate daily data collection by using the helper script to start the helm release for ae-daily:
./run-daily-job.sh <PATH_TO_VALUES_YAML>
And for a cron job, include the -r argument to ensure the job is recreated:
./run-daily-job.sh -r <PATH_TO_VALUES_YAML>
Once your ae-weekly/values.yaml is ready, you can automate weekly data collection by using the helper script to start the helm release for ae-weekly:
./run-weekly-job.sh <PATH_TO_VALUES_YAML>
And for a cron job, include the -r argument to ensure the job is recreated:
./run-weekly-job.sh -r <PATH_TO_VALUES_YAML>
Once your ae-backup/values.yaml is ready, you can automate backing up your collected + compressed pricing data from within the redis cluster and publishing it to AWS S3 with the helper script:
Warning
Please remember AWS S3 has usage costs. Please set only the tickers you need to back up before running the ae-backup job.
./run-backup-job.sh <PATH_TO_VALUES_YAML>
And for a cron job, include the -r argument to ensure the job is recreated:
./run-backup-job.sh -r <PATH_TO_VALUES_YAML>
Add these lines to your cron with crontab -e for automating data collection:
Pull Data Per Minute of each Trading Day
Note
This will also pull data on holidays and closed trading days, but PRs are welcome!
Every minute, Monday through Friday, between 9 AM and 5 PM (assuming system time is EST):

# intraday job:
# min  hour  day  month  dayofweek  job script path               job    KUBECONFIG
*      9-17  *    *      1,2,3,4,5  /opt/sa/helm/cron/run-job.sh  intra  /opt/k8/config
Monday through Friday at 6:01 PM (assuming system time is EST):

# daily job:
# min  hour  day  month  dayofweek  job script path               job    KUBECONFIG
1      18    *    *      1,2,3,4,5  /opt/sa/helm/cron/run-job.sh  daily  /opt/k8/config
Friday at 7:01 PM (assuming system time is EST):

# weekly job:
# min  hour  day  month  dayofweek  job script path               job     KUBECONFIG
1      19    *    *      5          /opt/sa/helm/cron/run-job.sh  weekly  /opt/k8/config
Monday through Friday at 8:01 PM (assuming system time is EST):

# backup job:
# min  hour  day  month  dayofweek  job script path               job     KUBECONFIG
1      20    *    *      1,2,3,4,5  /opt/sa/helm/cron/run-job.sh  backup  /opt/k8/config
On a server reboot (assuming your kubernetes cluster is running on just one host):

# restore job:
@reboot /opt/sa/helm/cron/run-job.sh restore /opt/k8/config
Describe:
./describe-engine.sh
View Logs:
./logs-engine.sh
Describe:
./describe-intraday.sh
View Logs:
./logs-job-intraday.sh
Describe:
./describe-daily.sh
View Logs:
./logs-job-daily.sh
Describe:
./describe-weekly.sh
View Logs:
./logs-job-weekly.sh
Describe Pod:
./describe-jupyter.sh
View Logs:
./logs-jupyter.sh
View Service:
./describe-service-jupyter.sh
Jupyter uses the backtester pod to perform asynchronous processing like running an algo backtest. To debug this, run:
Describe:
./describe-backtester.sh
View Logs:
./logs-backtester.sh
Describe:
./describe-minio.sh
Describe Service:
./describe-service-minio.sh
Describe Ingress:
./describe-ingress-minio.sh
Describe:
./describe-redis.sh
To stop AE run:
./stop.sh
And if you really, really want to permanently delete ae-minio and ae-redis, run:
Warning
Running this can delete cached pricing data. Please be careful.
./stop.sh -f