
abottle

Triton/TensorRT/ONNX Runtime/PyTorch Python server wrapper

put your model into a bottle, then you get a working server and more.

usage: abottle [-h] [--wrapper WRAPPER] [--as AS_] [--config CONFIG] [--host HOST] [--port PORT] usermodel_name

Wrap your Python object with a bottle

positional arguments:
  usermodel_name     your Python object module

optional arguments:
  -h, --help         show this help message and exit
  --wrapper WRAPPER  which model wrapper you want to use: abottle.TritonModel, abottle.ONNXModel, abottle.TensorRTModel,
                     abottle.PytorchModel, or any wrapper class that implements abottle.BaseModel!
  --as AS_           server? tester?
  --config CONFIG    config yaml file path or content in string
  --host HOST
  --port PORT

Demo

Write any class that contains a function named predict and receives a list as input, like below:

import numpy as np
from transformers import AutoTokenizer


class MiniLM:
    def __init__(self):
        self.tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")

    def predict(self, X):
        encode_dict = self.tokenizer(
            X, padding="max_length", max_length=128, truncation=True
        )
        input_ids = np.array(encode_dict["input_ids"], dtype=np.int32)
        attention_mask = np.array(encode_dict["attention_mask"], dtype=np.int32)

        # self.model is injected by the abottle wrapper (abottle.TritonModel by default)
        outputs = self.model.infer(
            {"input_ids": input_ids, "attention_mask": attention_mask}, ["y"]
        )

        return outputs['y']


    # you can write the config in the class, or provide it as a YAML file or YAML string
    class Config:
        class TritonModel:
            triton_url = "triton.triton-system"
            name = "minilm"
            version = "2"

Start it with abottle, passing your file path and class in the format 'a.b.c', like below:

abottle main.MiniLM

By default abottle starts an HTTP server and uses abottle.TritonModel to wrap your class, which handles talking to the Triton server. You can configure the Triton server information and model information in a class named Config.TritonModel.

You can also pass the config as a string on the command line and omit the Config class from your code:

abottle main.MiniLM --config """TritonModel:
        triton_url: localhost
        name: minilm
        version: 2
    """

You can also configure with a YAML file:

abottle main.MiniLM --config <config yaml file path>

If you choose another model wrapper, such as abottle.ONNXModel, your config key should be ONNXModel, etc.
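
For example, a minimal sketch of an in-class config for abottle.ONNXModel (the ort_file field is described in the wrapper configs section below; the file path is a placeholder):

class MiniLM:
    def predict(self, X):
        ...

    class Config:
        class ONNXModel:
            # placeholder path to your exported .onnx file
            ort_file = "model.onnx"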

Class Template

class YourClass:
    def predict(self, X):
        return
    def evaluate(self, **kwargs):
        return
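
As a toy illustration of this template (the Echo class below is made up for this example and doesn't wrap a real model): predict is what the HTTP server calls in server mode, and evaluate is what tester mode calls.

from typing import List


class Echo:
    # served under --as server: receives a batch and returns one result per input
    def predict(self, X: List[str]) -> List[str]:
        return [x.upper() for x in X]

    # used under --as tester: computes a score over a local file
    def evaluate(self, file_path: str) -> float:
        with open(file_path) as f:
            lines = [line.strip() for line in f if line.strip()]
        predictions = self.predict(lines)
        correct = sum(p == x.upper() for p, x in zip(predictions, lines))
        return correct / max(len(lines), 1)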

Type Hint your Code

import typing

class YourClass:
    def predict(self, X: typing.List[str]) -> str:
        pass

If you add type hints to your code, the server started by abottle can generate OpenSchema metadata.

And you can do more things with abottle:

import numpy as np
import pandas as pd
from transformers import AutoTokenizer
from typing import List


class MiniLM:
    def __init__(self):
        self.tokenizer = AutoTokenizer.from_pretrained(
            "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
        )

    def cosine(self, a: List[List[float]], b: List[List[float]]) -> np.ndarray:
        a, b = np.array(a), np.array(b)
        # |A|, broadcast to a (len(a), len(b)) matrix
        sqrt_square_A = np.tile(
            np.sqrt(np.sum(np.square(a), axis=1)).reshape((a.shape[0], 1)),
            (1, b.shape[0]),
        )
        # |B|, broadcast to a (len(a), len(b)) matrix
        sqrt_square_B = np.tile(
            np.sqrt(np.sum(np.square(b), axis=1)).reshape((1, b.shape[0])),
            (a.shape[0], 1),
        )
        # cosine similarity matrix: A·B / (|A||B|)
        score_matrix = np.divide(np.dot(a, b.T), sqrt_square_A * sqrt_square_B)
        return score_matrix

    def predict(self, X: List[str]) -> List[List[float]]:
        encode_dict = self.tokenizer(
            X, padding="max_length", max_length=128, truncation=True
        )
        input_ids = np.array(encode_dict["input_ids"], dtype=np.int32)
        attention_mask = np.array(encode_dict["attention_mask"], dtype=np.int32)

        outputs = self.model.infer(
            {"input_ids": input_ids, "attention_mask": attention_mask}, ["y"]
        )

        return outputs['y']

    def evaluate(self, file_path: str) -> float:
        test_data = pd.read_csv(
            file_path, sep=", ", names=["query", "label"], engine="python"
        )
        query, label = test_data["query"].tolist(), test_data["label"].tolist()
        assert len(query) == len(label)

        query_embedding, label_embedding = [], []
        for i in range(len(query)):
            # predict expects a batch, so wrap each sentence in a list and take the first row
            query_embedding.append(self.predict([query[i]])[0])
            label_embedding.append(self.predict([label[i]])[0])
        assert len(query_embedding) == len(label_embedding)

        # score matrix
        score_matrix = self.cosine(query_embedding, label_embedding)
        # top-1 accuracy: count how often the diagonal entry is the column maximum
        raw_result = np.argmax(score_matrix, axis=0) == np.array(
            [i for i in range(score_matrix.shape[0])]
        )
        unique, counts = np.unique(raw_result, return_counts=True)
        top_1_accuracy = counts[unique.tolist().index(True)] / np.sum(counts)

        return top_1_accuracy

The evaluate function can be used as a tester, as shown below; tester mode checks your model's accuracy:

abottle main.MiniLM --as tester file_path='test.csv'

The arguments you defined in the evaluate function can be set as CLI args in the format xxx=xxx.
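
As a hypothetical illustration (the extra top_k argument and its handling below are not from abottle's docs), each keyword argument of evaluate maps to one xxx=xxx pair on the command line:

class MyModel:
    def predict(self, X):
        return X

    # invoked as: abottle main.MyModel --as tester file_path='test.csv' top_k=3
    # whether abottle converts CLI values to numbers is not documented here,
    # so this sketch converts the string itself
    def evaluate(self, file_path: str, top_k: str = "1") -> float:
        k = int(top_k)
        with open(file_path) as f:
            n_rows = sum(1 for line in f if line.strip())
        # a dummy score just to show both arguments being used
        return min(k, n_rows) / max(n_rows, 1)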

you can use different wrapper for your model, including:

  • abottle.ONNXModel
  • abottle.TensorRTModel
  • abottle.TritonModel
  • abottle.PytorchModel

If you want to add more wrappers, you can simply implement abottle.BaseModel (a rough sketch follows the commands below):

abottle main.MiniLM --as server --wrapper abottle.TritonModel
abottle main.MiniLM --as server --wrapper anything.you.write.which.implemented.abottle.BaseModel
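
The exact abottle.BaseModel interface is not documented on this page, so the following is only a sketch: it assumes a custom wrapper mainly has to provide the infer(input_dict, output_names) method that user code calls through self.model in the examples above, and the mywrappers module path in the comment is hypothetical.

import numpy as np
from abottle import BaseModel  # import path assumed from the --wrapper examples above


class ZeroModel(BaseModel):
    # hypothetical wrapper that returns zero vectors instead of running a real backend;
    # it would be selected with: abottle main.MiniLM --as server --wrapper mywrappers.ZeroModel
    def infer(self, input_dict, output_names):
        # user code calls self.model.infer({...}, ["y"]) and reads outputs["y"],
        # so return a dict keyed by the requested output names
        batch = np.asarray(next(iter(input_dict.values())))
        fake_output = np.zeros((batch.shape[0], 384), dtype=np.float32)
        return {name: fake_output for name in output_names}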

Configs (model creators don't need to read this)

Every wrapper has its own config fields, but in general:

Config with a class

It should follow the template below. Note that WrapperNameHere should be replaced with your wrapper's name, i.e. the class name; for example, abottle.ONNXModel's wrapper name is ONNXModel.

class YourClass:
    def predict(self, X):
        return
    def evaluate(self, **kwargs):
        return
    class Config:
        class WrapperNameHere:
            pass

If you want to use an outside config, such as a YAML string or a YAML file, remove the Config class from your code; otherwise, the outside config will be ignored.

Config with YAML should follow the template below. Again, WrapperNameHere means your wrapper's class name; for example, abottle.ONNXModel's wrapper name is ONNXModel.

WrapperNameHere:
    wrapper_fields: here

abottle's wrapper configs

abottle.ONNXModel

ONNXModel:
    ort_file: "the .onnx file path"

abottle.PytorchModel

PytorchModel:
    model: "should be an importable string (in fact, not implemented at the time of writing)"

abottle.TensorRTModel

TensorRTModel:
    trt_file: "the .plan or .trt file path"

abottle.TritonModel

TritonModel:
    name: "your model's name in your Triton server; you can find it in the server's log"
    version: "your model's version in your Triton server; you can find it in the server's log"
    triton_url: "your Triton server's URL; it `should not` contain a schema like `http://`"

Motivation

As a DL model creator, you don't need to focus on how to serve a model, how to test its performance on a target platform, or how to optimize it without losing accuracy. Just find a bottle and put your logic code into it; the DL engineers can handle those things for you. All you need to do is export your model to an ONNX file and write logic code like in the examples above.

Features

We will build this bottle as strong as possible and make it a standardized interface across the MLOps cycle. You will see more and more scenarios, such as optimization, graph fusion, performance testing, deployment, and data gathering, using this bottle.
