
Creating a prediction service in Python in less than 5 min

At the heart of any machine learning application, be it production optimization, anomaly detection, or a simple information dashboard, there is a prediction service. The role of the prediction service is to receive (raw or processed) data and provide predictions, which are used by other services to solve the problem at hand.

A simple prediction service

Regardless of the final use case, converting a trained model into a stable service with an interface that allows the outside world to access it is probably one of the most frequently recurring tasks in every machine learning project. In this post, I would like to show you how to create a prediction service using Daeploy in less than 5 minutes!

Note: the source code for this and several other examples are available in our public example repository.

Anatomy of a prediction service using Daeploy

Let us start with example code for a prediction service for the famous iris flowers dataset. There are tons of examples of how to train a classifier for the iris flowers dataset; there is a train_model.py in the “ml_model_serving” folder of our example repository as well. Here I would like to focus on the prediction service and its deployment.

Let us start by taking a look at the service.py. As you can see, it is just 24 lines of code.

Anatomy of a prediction service by Daeploy

Barebone service (1 min)

A service always starts by importing the needed libraries. Lines 1–3 import the required standard Python libraries. Lines 5 and 6 import the service and data types from the Daeploy SDK.

Line 8 sets up a logger using the Python standard logging library. Finally, lines 23 and 24 convert our script into a service. When we deploy our script using the Daeploy command line interface (CLI), the service.run() call tells the Daeploy manager that this script is a service. The Daeploy manager then automatically builds a docker image, installs all the dependencies defined in the requirements.txt, and runs the script (more on this later).

Loading the model (1 min)

There are different ways to export models after training. In this example, the model is trained with the scikit-learn library and exported using the pickle library. Create a folder called “models” in your project and copy your classifier.pkl file into this folder.

Lines 9 to 12 together set up the right path for the location of the classifier and load it from the pickle file.
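The pattern behind those lines can be sketched with the standard library alone. The round-trip below uses a stand-in object in a temporary directory so it is self-contained; a real service would instead load the models/classifier.pkl file written by train_model.py:

```python
import pickle
import tempfile
from pathlib import Path

def load_model(path: Path):
    """Load a pickled model from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Round-trip demo with a stand-in object. In the service, the path would
# be built relative to the script itself, e.g.:
#   Path(__file__).resolve().parent / "models" / "classifier.pkl"
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "classifier.pkl"
    with open(model_path, "wb") as f:
        pickle.dump({"name": "stand-in classifier"}, f)
    model = load_model(model_path)
```

Building the path relative to the script (rather than the working directory) matters because the manager, not you, decides where the service is launched from.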

Creating an API (1 min)

One of the powerful features of the Daeploy SDK is that it converts any Python function into an API simply by decorating it. Lines 16–20 define a simple predict function that receives a dataframe as input and returns an array as output. The service.entrypoint decorator on line 15 converts this function into an API.
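Putting the pieces together, the full service.py looks roughly like this. This is a sketch reconstructed from the description above, not the repository file verbatim; in particular, the data type import path (here assumed to be daeploy.data_types) may differ between Daeploy versions, so check the SDK documentation:

```
import logging
import pickle
from pathlib import Path

from daeploy import service
from daeploy.data_types import ArrayOutput, DataFrameInput

logger = logging.getLogger(__name__)

# Load the trained iris classifier pickled by train_model.py
MODEL_PATH = Path(__file__).resolve().parent / "models" / "classifier.pkl"
with open(MODEL_PATH, "rb") as f:
    classifier = pickle.load(f)

@service.entrypoint
def predict(data: DataFrameInput) -> ArrayOutput:
    """Return class predictions for the rows of the input dataframe."""
    logger.info(f"Received data with shape: {data.shape}")
    return classifier.predict(data)

if __name__ == "__main__":
    service.run()
```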

It took us approximately 3 minutes and 24 lines of code to write the code for a fully functioning prediction service. Next, we are going to deploy and test our service.

Deploying the service

To deploy our service, we need to run the Daeploy manager on the target machine, which could be a server in a factory, a virtual machine in the cloud, or your own computer. The manager packages your code into services that can be communicated with through an HTTP-based REST API.

Run the Daeploy manager (1 min)

I am going to deploy my prediction service locally on my own computer, but the same procedure applies regardless of where the target host is located. The Daeploy manager is a Docker image, freely available on Docker Hub. To start a free trial manager on localhost, run the following command:

$ docker run -v /var/run/docker.sock:/var/run/docker.sock -p 80:80 -p 443:443 -d daeploy/manager:latest

You can check that it started correctly by opening http://localhost/ in your browser. The trial manager can be used without restrictions for 12 hours before it has to be restarted.

Note: The Manager is highly configurable using environment variables. To learn about manager setup for production environment, see here.

Deploy the prediction service (1 min)

The Daeploy SDK conveniently comes with a command line interface (CLI) which can be used to communicate with the Daeploy manager. If you haven’t done that already, install the daeploy library:

$ pip install daeploy

The Daeploy manager comes with built-in authentication. Hence, to be able to communicate with the manager, we need to login first. The default username and password are “admin” and “admin”.

$ daeploy login
Enter daeploy host: http://localhost

After a successful login, we are connected to our specified host and can communicate with the Daeploy manager. We are now ready to deploy our prediction service using the CLI's deploy command, which takes three inputs: the service name, the service version, and the path to the service folder. In this example, we call our service prediction_service, with version 1.0.0.

$ daeploy deploy prediction_service 1.0.0 <path to the project>

It will take a few seconds for the manager to build a docker image from our service.py, install the dependencies defined in the requirements.txt, and run it as a service.

Testing our prediction service

There are several ways to test our newly running prediction service. We are going to look at two: the interactive API and the Python requests library.

Interactive API

The Daeploy manager comes with a dashboard. Visit http://localhost/ and use the default username (admin) and password (admin) to log in. You should see your prediction_service running:

Daeploy dashboard — prediction_service is running.

Every service comes with two links: Logs and Docs. Clicking on the service Docs opens the interactive API documentation.

The interactive API for prediction_service

As you can see, the Daeploy manager has exposed our predict method as an API. Expand the predict endpoint and use the “Try it out” button to interact with it: provide test data and get back a prediction from our service.
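For the iris classifier, the request body pasted into “Try it out” could look like the following. This is an illustrative payload, not taken from the repository: the column names must match the feature names used during training, and the exact dataframe encoding expected by the SDK's dataframe input type may differ, so adjust it to your model:

```json
{
  "data": {
    "sepal length (cm)": [5.1],
    "sepal width (cm)": [3.5],
    "petal length (cm)": [1.4],
    "petal width (cm)": [0.2]
  }
}
```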

Use the interactive API to try out your endpoint

Python requests library

We can also interact with our service using any software that can send a POST request. In Python, we can use the requests library to communicate with our service. If you haven’t already, install the requests library:

$ pip install requests

If your service is running on a manager with authentication enabled, which it should for a production setup, then any requests to that service must be authenticated by including a valid authentication token. Tokens are generated by the Daeploy Manager. To generate a new token, use the Daeploy CLI:

$ daeploy token

This will generate a long-lived authentication token. To get a prediction from our prediction service, we then run the following code:

import requests

TOKEN = "your_token"
# The URL follows Daeploy's <host>/services/<service_name>/<entrypoint>
# convention; adjust it if your manager runs on a different host.
response = requests.post(
    "http://localhost/services/prediction_service/predict",
    json={"data": <test data>},
    headers={"Authorization": f"Bearer {TOKEN}"},
)
print(f"Response: {response.status_code} - {response.reason}")

Converting a model into a running prediction service is one of the main building blocks of most machine learning solutions. Using Daeploy, such a service is a single Python script of fewer than 30 lines of code. The Daeploy manager takes care of the deployment difficulties and turns our script into a running service. The entire process took less than 5 minutes. You should try it! Happy Daeploying!