How to Deploy ML Models with FastAPI

Deploying your machine learning model as an API makes it easy to integrate predictions into applications and services. In this post, we'll walk through deploying a simple ML model using FastAPI, a modern, high-performance web framework for building APIs with Python.


Introduction

Machine learning models can be used to predict outcomes, classify data, or perform other tasks based on input data. Once you have a trained model, you need a way to expose it to other applications. FastAPI makes it easy to build robust APIs with automatic interactive documentation (Swagger UI) and excellent performance.

In this tutorial, we will:

  • Train a simple model.

  • Save the model to disk.

  • Create an API using FastAPI to serve predictions.

  • Test the API locally.


Why FastAPI?

FastAPI is popular for several reasons:

  • Speed: Built on Starlette and Pydantic, FastAPI is one of the fastest Python frameworks.

  • Type Hints: Automatic data validation and documentation using Python type hints (see the short sketch after this list).

  • Easy to Use: Clean and intuitive syntax makes it beginner-friendly.

  • Automatic Documentation: Interactive API docs generated automatically using Swagger UI and ReDoc.
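
To make the type-hints point concrete, here is a minimal sketch of what Pydantic-backed validation looks like in practice. The Item model and its fields are illustrative only; they are not part of this tutorial's model:

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Declaring the schema as a Pydantic model is all it takes: FastAPI
# validates incoming JSON against it and documents it automatically.
class Item(BaseModel):
    name: str
    price: float

@app.post("/items")
def create_item(item: Item):
    # By the time we get here, `item` is guaranteed to match the schema;
    # malformed payloads are rejected with a 422 response before this runs.
    return {"name": item.name, "price": item.price}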


Step-by-Step Guide

Step 1: Prepare and Save Your ML Model

First, let’s create a simple machine learning model using scikit-learn. For example, we can train a logistic regression model on a synthetic dataset.

Code: Train and Save the Model

import numpy as np
import pickle
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Generate a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=4, random_state=42)

# Train a logistic regression model
model = LogisticRegression()
model.fit(X, y)

# Save the model to disk
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

print("Model saved successfully!")

Run this script to create a file named model.pkl in your working directory.
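
As an optional sanity check, you can load the pickle back and score a sample row before building the API around it. This snippet assumes model.pkl is in the current directory:

import pickle
import numpy as np

# Reload the saved model and run a single test prediction.
with open('model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)

sample = np.array([[0.5, -0.2, 1.3, 0.7]])
print(loaded_model.predict(sample))  # prints the predicted class, e.g. [1]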


Step 2: Create a FastAPI App

Now, we will create a FastAPI app that will load the saved model and define an endpoint for making predictions.

Code: FastAPI Application (main.py)

from fastapi import FastAPI
from pydantic import BaseModel
import pickle
import numpy as np

# Load the saved model
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)

# Define the input data schema using Pydantic
class ModelInput(BaseModel):
    feature1: float
    feature2: float
    feature3: float
    feature4: float

app = FastAPI(title="ML Model Deployment with FastAPI")

@app.get("/")
def read_root():
    return {"message": "Welcome to the ML model API. Use the /predict endpoint to get predictions."}

@app.post("/predict")
def predict(input_data: ModelInput):
    # Convert input data to a numpy array
    data = np.array([[input_data.feature1, input_data.feature2,
                      input_data.feature3, input_data.feature4]])
    
    # Get model prediction
    prediction = model.predict(data)
    
    # Return the prediction result
    return {"prediction": int(prediction[0])}

In this script:

  • We load the model using pickle.

  • We define a ModelInput class with the expected features.

  • We create an endpoint /predict that accepts a JSON payload with four features and returns a prediction.


Step 3: Define Endpoints for Predictions

The /predict endpoint is where the magic happens:

  • Accepts JSON input and validates it against the ModelInput schema.

  • Converts the input into a NumPy array.

  • Uses the loaded ML model to predict the outcome.

  • Returns the prediction as a JSON response.

Interactive API Documentation:
Once your FastAPI server is running, visit http://127.0.0.1:8000/docs to see an interactive interface to test your endpoint.
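
If you would rather test from code than from a browser, one option is FastAPI's bundled TestClient (it needs the httpx package installed). A minimal sketch, assuming the app above lives in main.py:

from fastapi.testclient import TestClient
from main import app

client = TestClient(app)

# Send a request directly to the app; no running server is required.
response = client.post("/predict", json={
    "feature1": 0.5,
    "feature2": -0.2,
    "feature3": 1.3,
    "feature4": 0.7,
})
print(response.status_code)  # 200
print(response.json())       # e.g. {"prediction": 1}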


Step 4: Run and Test the API

You can run your FastAPI application locally using Uvicorn. From your terminal, run:

uvicorn main:app --reload

  • The --reload flag is great for development because it automatically restarts the server whenever your code changes.

  • Once running, navigate to http://127.0.0.1:8000 to see the welcome message.

  • You can test the /predict endpoint using the interactive Swagger UI at http://127.0.0.1:8000/docs.
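
The endpoint can also be exercised from a short Python script. Here is a small sketch using the third-party requests library (install it with pip if you don't have it):

import requests

payload = {
    "feature1": 0.5,
    "feature2": -0.2,
    "feature3": 1.3,
    "feature4": 0.7,
}

# POST the four features to the running server and print the prediction.
response = requests.post("http://127.0.0.1:8000/predict", json=payload)
print(response.json())  # e.g. {"prediction": 1}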


Testing with curl

You can also test your API using curl:

curl -X POST "http://127.0.0.1:8000/predict" -H "Content-Type: application/json" -d '{
  "feature1": 0.5,
  "feature2": -0.2,
  "feature3": 1.3,
  "feature4": 0.7
}'

You should receive a response similar to:

{"prediction": 1}


Conclusion

Deploying your machine learning model with FastAPI is a straightforward process that involves:

  • Preparing and saving your model.

  • Creating an API to serve predictions.

  • Defining endpoints with data validation using Pydantic.

  • Running and testing your API locally.

With FastAPI, you can quickly build and scale your ML-powered services with minimal overhead. Experiment with these examples and extend them to suit your project needs. Happy coding and deploying! 🚀
