bentoml: Turn Your Machine Learning Model into a Production-Grade API Service!

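Hi everyone, I'm Atu! Today I want to introduce a very practical Python library: bentoml. If you have studied machine learning, you have surely run into this problem: how do you deploy a trained model, and how do you let other people use it? bentoml is built to solve exactly that. It packages your model into a standardized service and generates the API documentation for you automatically.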

What is bentoml?

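bentoml works like a packaging tool: it wraps your machine learning model in a standardized service. An analogy: you have cooked a delicious dish (the model), and bentoml is the delivery courier who boxes it up as takeout so that everyone can easily enjoy your cooking (use your model).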

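First, let's install bentoml. Note that the examples in this post use the runner-based Service API from BentoML 1.0/1.1; releases from 1.2 onward favor a class-based API built around @bentoml.service, so the code may need adapting on newer versions: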

pip install bentoml

A simple example

We'll start with a simple scikit-learn model:

import bentoml
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Train a simple model on random data (4 features, binary labels)
model = RandomForestClassifier()
X = np.random.randn(100, 4)
y = np.random.randint(0, 2, 100)
model.fit(X, y)

# Save the model to the local BentoML model store
bentoml.sklearn.save_model(
    "iris_classifier",
    model,
    signatures={
        "predict": {"batchable": True}
    },
)
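The `"batchable": True` signature marks `predict` as safe to call on several requests merged along the batch dimension. The following is only a toy sketch of that idea in plain Python, not BentoML's actual implementation: `fake_predict` and `run_batched` are hypothetical names, and the "model" is a made-up rule so that the example runs without any dependencies.

```python
from typing import Callable, List

Row = List[float]


def fake_predict(rows: List[Row]) -> List[int]:
    """Hypothetical stand-in for model.predict: label 1 if the row sum is positive."""
    return [1 if sum(row) > 0 else 0 for row in rows]


def run_batched(
    requests: List[List[Row]],
    predict: Callable[[List[Row]], List[int]],
) -> List[List[int]]:
    """Merge several requests into one predict call, then split the results back."""
    merged = [row for request in requests for row in request]
    predictions = predict(merged)  # a single model call covers every request

    results, start = [], 0
    for request in requests:
        results.append(predictions[start:start + len(request)])
        start += len(request)
    return results


# Three independent requests with 1, 2 and 1 rows respectively.
pending = [
    [[0.1, 0.2, 0.3, 0.4]],
    [[-1.0, -2.0, 0.5, 0.5], [2.0, 0.0, 0.0, 0.0]],
    [[-0.1, -0.1, -0.1, -0.1]],
]
batched_results = run_batched(pending, fake_predict)
print(batched_results)
```

Each caller still gets back only its own rows, which is why a method has to treat rows independently before it can be marked batchable.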

Creating the Service

Next, we create a service that wraps our model:

import bentoml
import numpy as np
from bentoml.io import NumpyNdarray

# Load the saved model and wrap it in a runner
iris_clf_runner = bentoml.sklearn.get("iris_classifier:latest").to_runner()

# Create the service
svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

# Define the API endpoint
@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
async def predict(input_array: np.ndarray) -> np.ndarray:
    result = await iris_clf_runner.predict.async_run(input_array)
    return result
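The endpoint above is declared `async` and awaits `async_run`, so the server can keep accepting requests while a prediction is in progress. A minimal sketch of that behaviour with only `asyncio` follows; `FakeRunner` is a hypothetical stand-in that simulates inference latency and is not part of bentoml.

```python
import asyncio
from typing import List


class FakeRunner:
    """Hypothetical stand-in for a runner; it only simulates inference latency."""

    async def async_run(self, rows: List[List[float]]) -> List[int]:
        await asyncio.sleep(0.05)  # pretend the model needs 50 ms
        return [1 if sum(row) > 0 else 0 for row in rows]


async def predict(runner: FakeRunner, rows: List[List[float]]) -> List[int]:
    # Awaiting hands control back to the event loop while the runner works.
    return await runner.async_run(rows)


async def main() -> List[List[int]]:
    runner = FakeRunner()
    # Three requests are in flight at the same time instead of one after another.
    return await asyncio.gather(
        predict(runner, [[1.0, 2.0, 3.0, 4.0]]),
        predict(runner, [[-1.0, -2.0, -3.0, -4.0]]),
        predict(runner, [[0.5, 0.5, -0.2, 0.1]]),
    )


concurrent_results = asyncio.run(main())
print(concurrent_results)
```

Because the three calls overlap, the total wait is roughly one simulated inference rather than three.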

Deployment and usage

Once the service is written, save it as service.py and start it from the command line:

bentoml serve service:svc

You can then call the service with curl or from Python:

import requests
import numpy as np

# Prepare test data: one row with 4 features
test_data = np.random.randn(1, 4)

# Send the request
response = requests.post(
    "http://localhost:3000/predict",
    json=test_data.tolist(),
)
print(response.json())
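The request body here is simply a JSON array of feature rows. As a rough sketch, the same request can be built with only the standard library, assuming the service is listening on localhost:3000 as above. `build_predict_request` is a hypothetical helper, and the request is constructed but not sent, so the snippet runs without a server.

```python
import json
import urllib.request
from typing import List


def build_predict_request(
    rows: List[List[float]],
    url: str = "http://localhost:3000/predict",
) -> urllib.request.Request:
    """Build a POST request whose JSON body is a 2-D array of feature rows."""
    if not rows or any(len(row) != 4 for row in rows):
        raise ValueError("expected one or more rows with exactly 4 features each")
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


predict_request = build_predict_request([[5.1, 3.5, 1.4, 0.2]])
print(predict_request.get_method(), predict_request.full_url)
print(predict_request.data.decode("utf-8"))

# To actually send it (requires the service to be running):
# with urllib.request.urlopen(predict_request, timeout=5) as response:
#     print(json.loads(response.read()))
```

Checking the row length on the client side is a design choice for this sketch: it surfaces a malformed payload before the request ever reaches the model.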

Advanced features

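bentoml also supports more advanced features, such as model version management and automatically generated API documentation. The example below validates the JSON request body against a pydantic model: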

import bentoml
import numpy as np
from bentoml.io import JSON
from pydantic import BaseModel

# Load the saved model as a runner, as in the first service
iris_clf_runner = bentoml.sklearn.get("iris_classifier:latest").to_runner()

class IrisInput(BaseModel):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

svc = bentoml.Service(
    "iris_classifier_advanced",
    runners=[iris_clf_runner],
)

@svc.api(
    input=JSON(pydantic_model=IrisInput),
    output=JSON(),
    doc="Predict the iris species",  # svc.api takes `doc`, not `description`
)
async def predict_species(input_data: IrisInput):
    # Build a single-row feature array in the order the model was trained on
    input_array = np.array([[
        input_data.sepal_length,
        input_data.sepal_width,
        input_data.petal_length,
        input_data.petal_width,
    ]])
    result = await iris_clf_runner.predict.async_run(input_array)
    return {"predicted_species": int(result[0])}
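A client of this endpoint sends a JSON object keyed by field name rather than a bare array. The sketch below shows what that body looks like and how the fields map onto the feature row. `IrisFeatures` is a dependency-free dataclass stand-in for the `IrisInput` pydantic model, and the `/predict_species` path assumes the route defaults to the function name.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class IrisFeatures:
    """Dependency-free stand-in for the IrisInput pydantic model."""

    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

    def to_row(self) -> List[List[float]]:
        # Same column order the handler uses when it builds input_array.
        return [[self.sepal_length, self.sepal_width,
                 self.petal_length, self.petal_width]]


sample = IrisFeatures(5.1, 3.5, 1.4, 0.2)
payload = json.dumps(asdict(sample))  # body to POST to /predict_species
print(payload)
print(sample.to_row())
```

Named fields make the column order explicit, which a bare array leaves for the caller to get right.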