Deploying models on k8s with tensorflow-serving

Introduction

tensorflow-serving is a solution for deploying TensorFlow models, and it was designed to be very flexible. For example:

  • It supports different file systems and is easy to extend.
  • Model discovery, loading, serving and unloading (that is, model lifecycle management) is decoupled from the external serving interface, so it is easy to extend the model discovery mechanism and to integrate models from other frameworks.
  • The service itself is stateless, which makes it easy to deploy on k8s.

The diagram below shows the overall architecture of tensorflow serving:

tensorflow serving architecture

Ways to load models

tensorflow-serving can load models from different places and in different ways. For example, we can pass the model path directly when starting tensorflow-serving, or provide a model configuration file and start the service from that.

Passing the model as startup arguments:

tensorflow_model_server --port=9000 --rest_api_port=8500 --model_name=resnet --model_base_path=/home/jiang/data/yolov3

Loading models from a configuration file:

/etc/config/models.config

model_config_list {
    config {
        name: 'fashion'
        base_path: 's3://models/fashion/'
        model_platform: 'tensorflow'
    }
    config {
        name: 'resnet'
        base_path: 's3://models/resnet/'
        model_platform: 'tensorflow'
    }
}
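The config file supports more fields than the ones shown above. For example, a model_version_policy block can control which versions of a model are served (by default only the latest version found under base_path is loaded). Below is a sketch of such an entry; it goes inside model_config_list just like the entries above, and the version numbers are made up for illustration:

config {
    name: 'resnet'
    base_path: 's3://models/resnet/'
    model_platform: 'tensorflow'
    model_version_policy {
        specific {
            versions: 1
            versions: 2
        }
    }
}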

Run the following command to load both the fashion and resnet models:

tensorflow_model_server --port=9000 --rest_api_port=8500 --model_config_file=/etc/config/models.config
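Once the server is up, you can verify that both models were loaded through the model status endpoint of the REST API. A minimal sketch in Python, assuming the server is reachable on localhost with the REST port 8500 used above:

import requests

# Query tensorflow-serving's model status endpoint for each configured model.
# The port (8500) matches the --rest_api_port passed on the command line above.
for model in ('fashion', 'resnet'):
    r = requests.get('http://localhost:8500/v1/models/%s' % model)
    print(model, r.json())  # prints the version status, e.g. state AVAILABLE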

Model storage systems

Another feature of tensorflow-serving is that it can load models from different types of storage systems, such as the local file system, S3, HDFS and so on. Whatever the backend, model_base_path should point to a directory containing numeric version subdirectories (for example 1/, 2/), each holding an exported SavedModel (saved_model.pb plus its variables directory).

Loading from the local file system

tensorflow_model_server --port=9000 --rest_api_port=8500 --model_name=resnet --model_base_path=/home/jiang/data/yolov3

Loading from S3

To load models from S3 (any S3-compatible object storage works), a few environment variables need to be set:

export AWS_ACCESS_KEY_ID=<key id>
export AWS_SECRET_ACCESS_KEY=<key>
export S3_ENDPOINT=minio-service.minio:9000
export S3_USE_HTTPS=0
export S3_VERIFY_SSL=0
export AWS_REGION=us-west-1
export S3_REGION=us-west-1
export AWS_LOG_LEVEL=3

Then start the service with the following command:

tensorflow_model_server --port=9000 --rest_api_port=8500 --model_name=resnet --model_base_path=s3://models/resnet/

Loading from HDFS

Loading from HDFS requires the following environment variables to be set:

  • JAVA_HOME: the Java installation path.

  • HADOOP_HDFS_HOME: the HDFS installation path. This variable can be omitted if the path to libhdfs.so is already included in LD_LIBRARY_PATH.

  • LD_LIBRARY_PATH: must include the path to libjvm.so. If your Hadoop distribution does not ship libhdfs.so under ${HADOOP_HDFS_HOME}/lib/native, its path needs to be included as well.

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${JAVA_HOME}/jre/lib/amd64/server

  • CLASSPATH: note that simply exporting the CLASSPATH environment variable is not enough; it has to be set on the command line when starting the server, like this:

CLASSPATH=$(${HADOOP_HDFS_HOME}/bin/hadoop classpath --glob) tensorflow_model_server --port=9000 --rest_api_port=8500 --model_name=yolov3 --model_base_path=hdfs://worknode2:9000/pipeline/models/yolov3

Deploying on k8s

S3

tensorflow-serving provides an official Docker image, so deploying with models loaded from S3 is straightforward.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: tfserving-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: tfserving
    spec:
      containers:
      - name: serving-container
        image: tensorflow/serving:1.14.0  
        ports:
        - containerPort: 8500
        - containerPort: 9000
        env:
        - name: AWS_ACCESS_KEY_ID
          value: J5WW5NKKV7AE9S0WZCM1
        - name: AWS_SECRET_ACCESS_KEY
          value: TbG0Y6nnUV8nQNLL9n4B3u3UPMMCJvqs2COx3and
        - name: S3_ENDPOINT
          value: minio-service.minio:9000
        - name: S3_USE_HTTPS
          value: "0"
        - name: S3_VERIFY_SSL
          value: "0"
        - name: AWS_REGION
          value: us-west-1
        - name: S3_REGION
          value: us-west-1
        - name: AWS_LOG_LEVEL
          value: "3"
        command: ["/usr/bin/tensorflow_model_server"]
        args: ["--port=9000", "--rest_api_port=8500", "--model_name=resnet", "--model_base_path=s3://models/resnet/"]

---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: tf-service 
  name: tf-service
spec:
  ports:
  - name: rest-api-port
    port: 8500
    targetPort: 8500
  - name: grpc-port
    port: 9000
    targetPort: 9000
  selector:
    app: tfserving 
  type: NodePort
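The manifest above embeds the S3 credentials as plain-text environment variables in the Deployment. A common alternative is to keep them in a Kubernetes Secret and reference it from the container, which keeps credentials out of the Deployment manifest and makes rotation easier. A minimal sketch, where the Secret name s3-credentials and its key names are just examples:

apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: <key id>
  AWS_SECRET_ACCESS_KEY: <key>

The corresponding env entries in the Deployment would then look like this:

        env:
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: s3-credentials
              key: AWS_ACCESS_KEY_ID
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: s3-credentials
              key: AWS_SECRET_ACCESS_KEY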

HDFS

The official Docker image does not include the HDFS environment, so we need to build our own image.

The directory containing the Dockerfile looks like this:

hdfs_dockerfile
├── Dockerfile
└── hadoop-2.10.0
    ├── bin
    ├── etc
    ├── include
    ├── lib
    ├── libexec
    ├── LICENSE.txt
    ├── logs
    ├── NOTICE.txt
    ├── README.txt
    ├── sbin
    └── share

The Dockerfile is as follows:

FROM tensorflow/serving:1.14.0

RUN apt update && apt install -y openjdk-8-jre

COPY hadoop-2.10.0 /root/hadoop

ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-amd64/
ENV HADOOP_HDFS_HOME /root/hadoop
ENV LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:${JAVA_HOME}/jre/lib/amd64/server

RUN echo '#!/bin/bash \n\n\
CLASSPATH=$(${HADOOP_HDFS_HOME}/bin/hadoop classpath --glob) tensorflow_model_server --port=9000 --rest_api_port=8500 \
--model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} \
"$@"' > /usr/bin/tf_serving_entrypoint.sh \
&& chmod +x /usr/bin/tf_serving_entrypoint.sh

EXPOSE 8500
EXPOSE 9000
ENTRYPOINT ["/usr/bin/tf_serving_entrypoint.sh"]

Build the image:

docker build -t tensorflow_serving:1.14-hadoop-2.10.0 .

Run it:

docker run -p 9000:9000 --name tensorflow-serving -e MODEL_NAME=yolov3 -e MODEL_BASE_PATH=hdfs://192.168.50.166:9000/pipeline/models -t tensorflow_serving:1.14-hadoop-2.10.0

With this image, the deployment manifests above only need a few small changes:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: tfserving-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: tfserving
    spec:
      containers:
      - name: serving-container
        image: joyme/tensorflow_serving:1.14-hadoop-2.10.0
        ports:
        - containerPort: 8500
        - containerPort: 9000
        env:
        - name: MODEL_NAME
          value: yolov3
        - name: MODEL_BASE_PATH
          value: hdfs://192.168.50.166:9000/pipeline/models

---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: tf-service 
  name: tf-service
spec:
  ports:
  - name: rest-api-port
    port: 8500
    targetPort: 8500
  - name: grpc-port
    port: 9000
    targetPort: 9000
  selector:
    app: tfserving 
  type: NodePort

Calling the model

tensorflow-serving supports two ways of calling a model for prediction: gRPC and a RESTful API.

The gRPC approach looks like this:

from __future__ import print_function

import grpc
import requests
import tensorflow as tf

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

IMAGE_URL = 'https://tensorflow.org/images/blogs/serving/cat.jpg'

tf.app.flags.DEFINE_string('server', '192.168.50.201:30806', 'PredictionService host:port')
tf.app.flags.DEFINE_string('image', '', 'path to image in jpeg format')
FLAGS = tf.app.flags.FLAGS

def main(_):
    if FLAGS.image:
        with open(FLAGS.image, 'rb') as f:
            data = f.read()
    else:
        dl_request = requests.get(IMAGE_URL, stream=True)
        dl_request.raise_for_status()
        data = dl_request.content

    channel = grpc.insecure_channel(FLAGS.server)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    # Send request
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'resnet'
    request.model_spec.signature_name = 'serving_default'
    request.inputs['image_bytes'].CopyFrom(
            tf.contrib.util.make_tensor_proto(data, shape=[1]))

    result = stub.Predict(request, 10.0) # 10 secs timeout
    print(result)

if __name__ == '__main__':
    tf.app.run()
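Assuming the client above is saved as resnet_client_grpc.py (the file name is arbitrary), it can be run against the gRPC NodePort of the Service, for example:

python resnet_client_grpc.py --server=192.168.50.201:30806 --image=./cat.jpg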

The RESTful API approach looks like this:

import requests
import json
import base64

with open("cat.jpg", "rb") as image_file:
    # base64-encode the image and decode to str so json.dumps can serialize it
    encoded_string = base64.b64encode(image_file.read()).decode('utf-8')

headers = {"content-type": "application/json"}
body = {
        "instances": [
            {'b64': encoded_string}
           ]
        }
r = requests.post('http://192.168.50.201:32063/v1/models/resnet:predict', data = json.dumps(body), headers = headers)

print(r.text)
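The signature name ('serving_default') and input name ('image_bytes') used in the gRPC example come from the exported SavedModel. If you are not sure what a model expects, the metadata endpoint of the REST API returns its signature definitions. A short sketch, reusing the address from the REST example above:

import requests

# Fetch the SavedModel's signature definitions from tensorflow-serving.
r = requests.get('http://192.168.50.201:32063/v1/models/resnet/metadata')
print(r.json())  # contains signature_def with input/output names and dtypes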