Here at Engel & Völkers Technologies our mission is to enhance the user experience for our real estate agents and property owners. One part of this is providing data-based tools. As a first task on this mission we built a predictive model for property price evaluation. During this process we came up with the following approach for implementing machine learning tools in a real-time production environment.
Developing a machine learning model and bringing it into production is a task with a lot of challenges, like model and attribute selection, dealing with missing values, normalization and others. What we want to share here is a workflow that puts all the gears into motion, from data preprocessing and analysis, through building models and selecting the best-performing one, to serving the model in a real-time API.
The life cycle of machine learning is basically described by the iteration of the following four steps:

1. Data extraction
2. Data preprocessing and analysis
3. Model training and evaluation
4. Model deployment and serving
Each of these steps is under constant evaluation to check whether model performance can be improved by adding different data attributes or different preprocessing methods. For our approach we split the modelling process into two parts. Part one contains the four steps mentioned above and we call it Manual Run Modeling; part two is automating the steps of part one.
In this manual part we first analyse our new task and then come up with a hypothesis we want to prove and test.
Development and Prototyping Environment
First we set up a development environment for working on the new task. For this we spin up a Jupyter notebook server, which can easily be deployed on Google Cloud AI Platform. The notebook approach enables us to develop quickly and share results with the team using a browser. With the ability to easily visualize data directly in a notebook, this approach is especially useful during data extraction and preprocessing.
Data Preparation and Visualization
Python provides some nice packages for visualizing data and gaining quick insights, which speeds up our prototyping process in the notebook. We are especially fond of Seaborn. After loading the data identified for this model into the notebook, normally as a dataframe, we begin by looking at each attribute and its values, often in combination with the other attributes. For this first overview we use a pairplot provided by Seaborn.
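A minimal sketch of this first look at the data; the file name and column names are placeholders for the actual dataset:

```python
import pandas as pd
import seaborn as sns

# Load the data identified for the model into a dataframe
# (file and column names are placeholders).
df = pd.read_csv("property_data.csv")

# Pairwise scatter plots and distributions give a first overview of
# each attribute and how the attributes relate to each other.
sns.pairplot(df[["living_area", "rooms", "year_built", "price"]])
```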
In combination with other visualizations, such as a correlation matrix, we decide which attributes to use and how to handle outliers and missing values. After this process we use one-hot encoding for the categorical attributes and normalize the continuous attributes to produce the input for our models.
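A sketch of this preprocessing step, again with placeholder column names:

```python
import pandas as pd
import seaborn as sns
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("property_data.csv")  # placeholder file name

# Correlation matrix of the numeric attributes as a heatmap.
sns.heatmap(df.select_dtypes("number").corr(), annot=True)

# One-hot encode the categorical attributes ...
df = pd.get_dummies(df, columns=["property_type", "region"])

# ... and scale the continuous attributes into [0, 1].
scaler = MinMaxScaler()
continuous = ["living_area", "rooms", "year_built"]
df[continuous] = scaler.fit_transform(df[continuous])
```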
Model Selection and Evaluation
When the data is ready we choose several models as candidate solutions for our problem. These models can range from a multilinear regression model through random forests to deep neural networks with TensorFlow. After splitting the data into training, evaluation and test sets, we decide on a measure each model has to optimize, e.g. mean squared error or precision, depending on the kind of problem. Once we have identified what we consider the best of these models, we start transforming the code for Google Cloud AI Platform.
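For a regression problem like price evaluation, the comparison could look like the following sketch; the candidate models and features are illustrative, not the exact production setup:

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Preprocessed data from the step above (placeholder file and target column).
df = pd.read_csv("property_data_preprocessed.csv")
X = df.drop(columns=["price"])
y = df["price"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

candidates = {
    "multilinear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

# Train every candidate and compare them on the chosen measure, here MSE.
for name, model in candidates.items():
    model.fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name}: mean squared error = {mse:,.0f}")
```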
After the manual evaluation of preprocessing and modeling, we start the task of automating training and deployment for our production environment. This can be split into three tasks:

1. Training on Google Cloud AI Platform
2. Deploying the model on GCP
3. Deploying the real-time API
Training on Google Cloud AI Platform
After deciding on a model to take into production, we optimize our code for data extraction and preprocessing to make it reusable and compliant with the Google Cloud AI Platform requirements. Basically, this means we have to create a Python package out of the first three steps.
A project could, for example, be set up as sketched below.
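This layout is a sketch based on Google's scikit-learn template (linked at the end of this post); the package and file names are illustrative:

```
price_model/
├── setup.py               # package definition and dependencies
├── hptuning_config.yaml   # hyperparameter tuning configuration
└── trainer/
    ├── __init__.py
    ├── task.py            # entry point: argument parsing and training run
    ├── model.py           # model definition
    └── util.py            # data extraction and preprocessing helpers
```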
This Python package is then deployed to Google Cloud Platform and executed there. If you need to include custom packages in this process, there is an option to supply those as well. An example call for training on the cloud could then look like this:
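The job name, bucket, region and arguments below are placeholders; --packages is the option for supplying additional custom packages.

```bash
# Placeholder names throughout; adjust bucket, region and arguments to your project.
gcloud ai-platform jobs submit training "price_model_$(date +%Y%m%d_%H%M%S)" \
    --package-path trainer/ \
    --module-name trainer.task \
    --job-dir gs://your-bucket/price-model \
    --region europe-west1 \
    --python-version 3.7 \
    --runtime-version 2.1 \
    --packages dist/transform_functions-0.1.tar.gz \
    --config hptuning_config.yaml \
    -- \
    --training-data gs://your-bucket/data/training.csv
```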
One advantage of using Google Cloud AI Platform is that it offers automated hyperparameter tuning for models. This enables us to train a model automatically with different configurations and then select the one performing best for the measure defined in hptuning_config.yaml.
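A minimal hptuning_config.yaml could look like the following sketch; the parameters and their ranges are placeholders for whatever the chosen model exposes:

```yaml
trainingInput:
  hyperparameters:
    goal: MINIMIZE
    hyperparameterMetricTag: mean_squared_error
    maxTrials: 20
    maxParallelTrials: 4
    params:
      - parameterName: n_estimators
        type: INTEGER
        minValue: 50
        maxValue: 500
        scaleType: UNIT_LINEAR_SCALE
      - parameterName: max_depth
        type: INTEGER
        minValue: 3
        maxValue: 15
        scaleType: UNIT_LINEAR_SCALE
```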
In the AI Platform dashboard you can then see which hyperparameter combination of your defined values in params had the best results for the defined hyperparameterMetricTag and goal.
The identified model is then ready to be deployed to the platform, where Google provides a URL to access the model in real time.
Deploying the Model on GCP
Deploying to production is done with a Jenkins job. We use a Jenkinsfile to define our jobs as part of our code. A model deployment consists of the following steps:
If all of these steps are successful, the model is ready for use in the specified environment via a URL endpoint.
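Whatever the surrounding pipeline looks like, the core of such a deployment typically boils down to creating the model resource once and then creating a new version that points to the exported artifacts in Cloud Storage; a sketch with placeholder names:

```bash
# Placeholder model name, region and bucket.
gcloud ai-platform models create price_model --regions europe-west1

gcloud ai-platform versions create v1 \
    --model price_model \
    --origin gs://your-bucket/price-model/export/ \
    --framework scikit-learn \
    --runtime-version 2.1 \
    --python-version 3.7
```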
Deploying the Real Time API
Now that the model is deployed and accessible via a URL endpoint, we have to build a transformation API that takes the input data, transforms it into the format the model endpoint needs, calls the model and returns its result. To make using the model easier for other services, our data entry format is JSON. This keeps the data human-readable, and changes to any step concerning the model (except changing the set of attributes) can be made without dependencies on our client services.
REST Service
As the framework for our REST API we chose Flask, since it is lightweight, flexible, easy to use and also written in Python. Because API and model are written in the same language, we can reuse the preprocessing code from the training package described above. The main work here lies in adapting the code to handle one single event instead of the batch predictions used to validate the results during training.
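A minimal sketch of such a service, assuming an AI Platform model endpoint; the project, model and field names are placeholders, and the real transformation would reuse the shared preprocessing code:

```python
import googleapiclient.discovery
from flask import Flask, jsonify, request

app = Flask(__name__)
ml_service = googleapiclient.discovery.build("ml", "v1")
MODEL_NAME = "projects/your-project/models/price_model"  # placeholder


def transform(payload):
    """Turn the incoming JSON document into the feature vector the model expects."""
    # Placeholder: one-hot encoding, normalization etc. would happen here,
    # reusing the functions from the training package.
    return [payload["living_area"], payload["rooms"], payload["year_built"]]


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True)
    if not payload:
        return jsonify({"error_code": 400, "message": "invalid input"}), 400

    instance = transform(payload)
    response = (
        ml_service.projects()
        .predict(name=MODEL_NAME, body={"instances": [instance]})
        .execute()
    )
    return jsonify({
        "error_code": 0,
        "message": "ok",
        "prediction": response["predictions"][0],
    })
```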
For stability and security reasons we added some additional checks:
We also created an extra package containing all transformation functions we use in several of our models. This package contains, for example, min-max normalization and distance calculation functions.
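To illustrate what lives in such a shared package, here is a sketch of two helpers of this kind; names and signatures are assumptions, not the actual package:

```python
import math


def min_max_normalize(value, min_value, max_value):
    """Scale a value into the range [0, 1] using known min/max bounds."""
    return (value - min_value) / (max_value - min_value)


def haversine_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two coordinates,
    e.g. from a property to the nearest city centre."""
    radius = 6371.0  # mean earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))
```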
Since speed is important in this component, we refrained from using database calls and instead keep all the data needed for enriching and transforming incoming requests in a cache. After receiving the prediction from the model, we sometimes qualify the results of regression models by adding a confidence value. This helps our clients better understand the results and decide how they want to use them, especially if they are meant to be shown to end users.
Each of our responses carries its own error code and message, supplied alongside the prediction. The result is again in JSON format and basically consists of the fields shown in the example below.
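A hypothetical response; field names and values are illustrative:

```json
{
  "error_code": 0,
  "message": "ok",
  "prediction": 425000.0,
  "confidence": 0.87
}
```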
Deployment to our production system is then handled by a Jenkins job with the following steps:
By using Cloud Run we do not need to worry about hardware configuration and can focus on optimizing the API and the model.
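The exact steps depend on the setup, but for a containerized Flask service the core of such a job typically builds an image, pushes it to a registry and deploys it to Cloud Run; a sketch with placeholder names:

```bash
# Placeholder project, image and service names.
docker build -t eu.gcr.io/your-project/price-api:latest .
docker push eu.gcr.io/your-project/price-api:latest

gcloud run deploy price-api \
    --image eu.gcr.io/your-project/price-api:latest \
    --region europe-west1 \
    --platform managed
```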
Conclusion
By following this process we make sure that the time spent on necessary work besides building the model itself is kept to a minimum and does not include managing underlying infrastructure or worrying about availability. Especially the part after the manual data and model selection process can be used as a best-practice template to speed up deployment. This is thanks to the tools provided by Google and to deliberately extracting reusable functions into their own Python package.
Links:
https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/sklearn/sklearn-template/template
https://cloud.google.com/ai-platform/training/docs/using-hyperparameter-tuning
https://www.jenkins.io/doc/book/pipeline/jenkinsfile/
https://flask.palletsprojects.com/en/1.1.x/