In this tutorial, we will learn how to deploy PyTorch models using FastAPI in Google Colab. FastAPI is a modern, fast web framework for building APIs with Python, and Google Colab is a free, cloud-based platform that allows you to run Python code in a Jupyter notebook environment. By combining these tools, we can easily deploy our PyTorch models as APIs in the cloud.
Here are the steps we will follow in this tutorial:
- Install FastAPI and PyTorch
- Load a pre-trained PyTorch model
- Create a FastAPI app
- Define API endpoints
- Deploy the FastAPI app using ngrok
Step 1: Install FastAPI and PyTorch
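First, we need to install the required libraries. In Google Colab, we can do this by running the following command. The python-multipart package is included because FastAPI needs it to parse uploaded files, and torchvision and Pillow are imported in the steps that follow:
!pip install fastapi uvicorn pyngrok python-multipart torch torchvision pillow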
Step 2: Load a pre-trained PyTorch model
Next, we will load a pre-trained PyTorch model that we want to deploy. For demonstration purposes, let’s use a simple image classification model:
import torch
import torchvision.models as models
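# weights= replaces the deprecated pretrained=True argument (torchvision 0.13 and later)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)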
model.eval()
Step 3: Create a FastAPI app
Now, we will create a FastAPI app that will serve as our API endpoint. We will define a single POST endpoint that accepts an image as input and returns the model’s predictions:
from fastapi import FastAPI, UploadFile, File
from PIL import Image
import io
app = FastAPI()
@app.post("/predict")
async def predict(image: UploadFile = File(...)):
    img = Image.open(io.BytesIO(await image.read()))
    # Preprocess the image and make predictions using the loaded model
    # Replace this with your own prediction logic
    return {"prediction": "cat"}
Step 4: Define API endpoints
In the code above, we defined a single API endpoint, /predict, that accepts an image file as input. You can replace the placeholder prediction logic with your own model predictions.
Step 5: Deploy the FastAPI app using ngrok
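To deploy our FastAPI app, we will use ngrok, a tool that creates secure tunnels to localhost. First, start the FastAPI app. The command below assumes the code from Steps 2 and 3 has been saved to a file named app.py (for example, with the %%writefile app.py cell magic), because uvicorn imports the app from a module and cannot see objects that exist only in the notebook's memory. The trailing & runs the server in the background so that the following cells can still execute:
!nohup uvicorn app:app --host 0.0.0.0 --port 8000 > uvicorn.log 2>&1 &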
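Before opening a tunnel, it can help to confirm that the server is actually listening, since a tunnel to a port with nothing behind it produces an unreachable URL. The helper below is a small standard-library sketch; the name wait_for_port and its defaults are illustrative, not part of FastAPI or uvicorn.

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=30.0, interval=0.5):
    """Return True once host:port accepts TCP connections, False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means some process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not up yet; retry until the deadline
    return False
```

Calling wait_for_port(8000) in the next cell returns True once the app is ready; if it returns False, the server most likely failed to start and its output is the first place to look.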
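Next, use pyngrok (installed in Step 1) to create a secure tunnel to the FastAPI app running on port 8000. ngrok requires an authtoken from a free ngrok account, so set it before connecting:
from pyngrok import ngrok
ngrok.set_auth_token("YOUR_NGROK_AUTHTOKEN")  # placeholder: paste your own token here
# Open a secure tunnel to the FastAPI app and keep its public URL
public_url = ngrok.connect(8000).public_url
public_url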
Copy the generated public URL and append /docs to access the Swagger UI documentation for your FastAPI app. You can now upload an image to the /predict endpoint and see the predictions made by your PyTorch model.
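The endpoint can also be called from code instead of the Swagger UI. The following sketch uses only the standard library to send an image as multipart form data; build_multipart and post_image are hypothetical helper names, and the form field is called image to match the parameter name of the predict function.

```python
import json
import mimetypes
import os
import urllib.request
import uuid


def build_multipart(field, filename, data, content_type):
    """Encode one file as a multipart/form-data body; return (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def post_image(base_url, image_path, field="image"):
    """POST an image file to <base_url>/predict and return the decoded JSON response."""
    content_type = mimetypes.guess_type(image_path)[0] or "application/octet-stream"
    with open(image_path, "rb") as f:
        data = f.read()
    body, header = build_multipart(field, os.path.basename(image_path), data, content_type)
    request = urllib.request.Request(
        base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, post_image(public_url, "cat.jpg") sends a local file named cat.jpg (an assumed file name) through the tunnel; with the placeholder endpoint from Step 3, the response is {"prediction": "cat"}.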
That’s it! You have now successfully deployed a PyTorch model using FastAPI in Google Colab. Feel free to experiment with different models and prediction logic to create your own APIs.