OpenShift Container Platform is built on top of Kubernetes, and offers a consistent hybrid cloud foundation for building and scaling containerized applications.
Note: These instructions were tested on OpenShift 4.12, with KServe 0.10
KServe can use Knative as its model serving layer. In OpenShift, Knative comes bundled as a product called OpenShift Serverless. OpenShift Serverless can be installed with two different ingress layers: Kourier or Istio (via OpenShift Service Mesh).
Follow the corresponding installation guide:
# Install OpenShift Serverless operator
oc apply -f openshift/serverless/operator.yaml
oc wait --for=condition=ready pod --all -n openshift-serverless --timeout=300s
# Create a Knative instance
oc apply -f openshift/serverless/knativeserving-kourier.yaml
# Wait for all pods in `knative-serving` to be ready
oc get pod -n knative-serving
# Install cert-manager operator
oc apply -f openshift/cert-manager/operator.yaml
# Install KServe
export KSERVE_VERSION=v0.10.1
oc apply -f "https://github.com/kserve/kserve/releases/download/${KSERVE_VERSION}/kserve.yaml"
# Patch the inferenceservice-config according to https://kserve.github.io/website/0.10/admin/serverless/kourier_networking/
oc edit configmap/inferenceservice-config --namespace kserve
# Add the flag `"disableIstioVirtualHost": true` under the ingress section
ingress: |-
  {
    "disableIstioVirtualHost": true
  }
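# Alternatively, the same change can be applied non-interactively. This is a
# sketch; it assumes `jq` is available and that the `ingress` entry of the
# ConfigMap is a JSON document:
oc get configmap inferenceservice-config -n kserve -o jsonpath='{.data.ingress}' \
  | jq '. + {"disableIstioVirtualHost": true}' > /tmp/ingress.json
oc patch configmap inferenceservice-config -n kserve --type merge \
  -p "{\"data\": {\"ingress\": $(jq -Rs . < /tmp/ingress.json)}}"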
oc rollout restart deployment kserve-controller-manager -n kserve
oc wait --for=condition=ready pod -l control-plane=kserve-controller-manager -n kserve --timeout=300s
# Install the KServe built-in serving runtimes and storage containers
oc apply -f "https://github.com/kserve/kserve/releases/download/${KSERVE_VERSION}/kserve-cluster-resources.yaml"
# Install Service Mesh operators
oc apply -f openshift/service-mesh/operators.yaml
oc wait --for=condition=ready pod --all -n openshift-operators --timeout=300s
# Create an Istio instance
oc apply -f openshift/service-mesh/namespace.yaml
oc apply -f openshift/service-mesh/smcp.yaml
oc wait --for=condition=ready pod --all -n openshift-operators --timeout=300s
oc wait --for=condition=ready pod --all -n istio-system --timeout=300s
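# Optionally inspect the control plane status; the SMCP name is defined in smcp.yaml
oc get smcp -n istio-system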
# Make sure to add your namespaces to the ServiceMeshMemberRoll and create a
# PeerAuthentication Policy for each of your namespaces
oc create ns kserve
oc create ns kserve-demo
oc apply -f openshift/service-mesh/smmr.yaml
oc apply -f openshift/service-mesh/peer-authentication.yaml
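# For reference, a minimal inline sketch of what the two manifests above might
# contain (member list and mTLS mode are assumptions; the files in this
# repository may differ):
cat <<EOF | oc apply -f -
apiVersion: maistra.io/v1
kind: ServiceMeshMemberRoll
metadata:
  name: default
  namespace: istio-system
spec:
  members:
    - kserve
    - kserve-demo
---
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: kserve-demo   # repeat for each member namespace, e.g. kserve
spec:
  mtls:
    mode: PERMISSIVE
EOF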
# Install OpenShift Serverless operator
oc apply -f openshift/serverless/operator.yaml
oc wait --for=condition=ready pod --all -n openshift-serverless --timeout=300s
# Create a Knative instance
oc apply -f openshift/serverless/knativeserving-istio.yaml
# Wait for all pods in `knative-serving` to be ready
oc get pod -n knative-serving
# Create the Knative gateways
# These include a self-signed TLS certificate; you can replace it with your own.
# See https://access.redhat.com/documentation/en-us/openshift_container_platform/4.12/html/serverless/integrations#serverlesss-ossm-external-certs_serverless-ossm-setup for more information
oc apply -f openshift/serverless/gateways.yaml
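# If you want to replace the bundled self-signed certificate with your own, you
# could generate a wildcard certificate and store it as a TLS secret (sketch;
# the secret name `wildcard-certs` is an assumption and must match the one
# referenced in gateways.yaml):
openssl req -x509 -newkey rsa:4096 -sha256 -days 365 -nodes \
  -keyout wildcard.key -out wildcard.crt \
  -subj "/CN=*.apps.<your-cluster-domain>" \
  -addext "subjectAltName=DNS:*.apps.<your-cluster-domain>"
oc create secret tls wildcard-certs --cert=wildcard.crt --key=wildcard.key -n istio-system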
# Install cert-manager operator
oc apply -f openshift/cert-manager/operator.yaml
# Install KServe
export KSERVE_VERSION=v0.10.1
oc apply -f "https://github.com/kserve/kserve/releases/download/${KSERVE_VERSION}/kserve.yaml"
# Wait for the KServe controller manager to be ready
oc wait --for=condition=ready pod -l control-plane=kserve-controller-manager -n kserve --timeout=300s
# Install the KServe built-in serving runtimes and storage containers
oc apply -f "https://github.com/kserve/kserve/releases/download/${KSERVE_VERSION}/kserve-cluster-resources.yaml"
# Add NetworkPolicies to allow traffic to kserve webhook
oc apply -f openshift/networkpolicies.yaml
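# For reference, a sketch of a policy that allows ingress to the KServe webhook
# (names, labels and the webhook port 9443 are assumptions; the file above may
# contain additional policies):
cat <<EOF | oc apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-kserve-webhook
  namespace: kserve
spec:
  podSelector:
    matchLabels:
      control-plane: kserve-controller-manager
  policyTypes:
    - Ingress
  ingress:
    - ports:
        - protocol: TCP
          port: 9443
EOF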
# Create a namespace for testing
oc create ns kserve-demo
# Allow pods to run with the UID (1000) used by the KServe Python images, see python/*.Dockerfile
oc adm policy add-scc-to-user anyuid -z default -n kserve-demo
OpenShift Serverless integrates with the OpenShift Ingress controller, so in contrast to KServe on plain Kubernetes, in OpenShift each KServe service automatically gets a routable domain.
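Once the InferenceService from the next step is deployed, its externally routable URL can, for example, be read straight from its status:
oc get inferenceservice sklearn-iris -n kserve-demo -o jsonpath='{.status.url}'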
Create an InferenceService. From the `docs` directory of the kserve repository, run:
oc apply -f ./samples/v1beta1/sklearn/v1/sklearn.yaml -n kserve-demo
Give it a minute, then check the InferenceService status:
oc get inferenceservices sklearn-iris -n kserve-demo
NAME URL READY PREV LATEST PREVROLLEDOUTREVISION LATESTREADYREVISION AGE
sklearn-iris http:https://sklearn-iris-predictor-default-kserve-demo.apps.<your-cluster-domain> True 100 sklearn-iris-predictor-default-00001 44s
Note: It is possible that the InferenceService shows `http` as the protocol. Depending on your OpenShift installation, the route usually serves `https`, so try calling the service with `https` in that case.
Now try curling the InferenceService using the routable URL from above:
curl -k https://sklearn-iris-predictor-default-kserve-demo.apps.<your-cluster-domain>/v1/models/sklearn-iris:predict -d '{"instances": [[6.8, 2.8, 4.8, 1.4],[6.0, 3.4, 4.5, 1.6]]}'
You should see an output like:
{"predictions": [1, 1]}
You can now try more examples from https://kserve.github.io/website/.
Service Mesh in OpenShift Container Platform requires some annotations to be present on a KnativeService. These annotations can be propagated from the InferenceService and InferenceGraph. For this, you need to add the following annotations to your resources:
📝 Note: OpenShift runs Istio with istio-cni enabled. To allow init containers to reach DNS and other external services such as S3 buckets, KServe's storage-initializer init container must run with the same user ID as the istio-proxy. In OpenShift, the istio-proxy is assigned the start of the namespace's UID range incremented by one. You have to set the annotation `serving.kserve.io/storage-initializer-uid` to that value. You can find your namespace's UID range by running `oc describe namespace <your-namespace>` and checking for an annotation like `openshift.io/sa.scc.uid-range=1008050000/10000`. More details on the root cause can be found here: https://istio.io/latest/docs/setup/additional-setup/cni/#compatibility-with-application-init-containers.
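A hedged sketch of deriving the value on the command line (the namespace name is an example; adjust it to yours):
UID_START=$(oc get namespace kserve-demo -o jsonpath='{.metadata.annotations.openshift\.io/sa\.scc\.uid-range}' | cut -d/ -f1)
echo "serving.kserve.io/storage-initializer-uid: $((UID_START + 1))"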
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
...
annotations:
sidecar.istio.io/inject: "true"
sidecar.istio.io/rewriteAppHTTPProbers: "true"
serving.knative.openshift.io/enablePassthrough: "true"
serving.kserve.io/storage-initializer-uid: "1008050001" # has to be changed to your namespaces value, see note above
spec:
...
apiVersion: "serving.kserve.io/v1alpha1"
kind: "InferenceGraph"
metadata:
...
annotations:
sidecar.istio.io/inject: "true"
sidecar.istio.io/rewriteAppHTTPProbers: "true"
serving.knative.openshift.io/enablePassthrough: "true"
serving.kserve.io/storage-initializer-uid: "1008050001" # has to be changed to your namespaces value, see note above
spec:
...
You can now run the same example as above, including the annotations:
cat <<EOF | oc apply -f -
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris"
  namespace: kserve-demo
  annotations:
    sidecar.istio.io/inject: "true"
    sidecar.istio.io/rewriteAppHTTPProbers: "true"
    serving.knative.openshift.io/enablePassthrough: "true"
spec:
  predictor:
    sklearn:
      storageUri: "gs:https://kfserving-examples/models/sklearn/1.0/model"
EOF
Give it a minute, then check and call it the same way as described above in the Kourier example.