Developing APIs for Machine Learning Models using FastAPI


FastAPI is a modern web framework for building APIs with Python. It is fast, easy to use, and automatically validates requests and serializes responses. In this article, we will explore how to create APIs for machine learning models using FastAPI.

Step 1: Install FastAPI

To get started, you will need to install FastAPI and Uvicorn, which is a lightning-fast ASGI server that can run FastAPI applications. You can install them using pip:

pip install fastapi uvicorn

Step 2: Create a Machine Learning Model

Before creating an API, you will need to have a machine learning model that you want to serve. You can train a model using popular libraries like Scikit-learn, TensorFlow, or PyTorch.
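For example, here is a minimal Scikit-learn classifier trained on the built-in Iris dataset and saved to disk with pickle (the filename model.pkl is just an illustrative choice that the later examples assume):

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Train a simple classifier on the built-in Iris dataset
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100)
model.fit(X, y)

# Save the trained model to disk so the API can load it later
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)
```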

Step 3: Create an API with FastAPI

Next, create a new Python file and import FastAPI. Define an instance of FastAPI and create a route that will accept input data and return the predictions from your machine learning model. Here is an example:

```python
import pickle

from fastapi import FastAPI

app = FastAPI()

# Load the trained model once at startup
# (assumes the model.pkl file saved in Step 2)
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.post("/predict")
def predict(data: dict):
    # Make a prediction from the "features" field of the request body
    prediction = model.predict([data["features"]])
    return {"prediction": prediction.tolist()}
```
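Because FastAPI validates request bodies automatically, you can also describe the expected input with a Pydantic model instead of a plain dict. Below is a sketch of the same route with a typed request body; the field name features is just an illustrative choice:

```python
from pydantic import BaseModel

class PredictRequest(BaseModel):
    features: list[float]

@app.post("/predict")
def predict(request: PredictRequest):
    # FastAPI parses and validates the JSON body into PredictRequest
    prediction = model.predict([request.features])
    return {"prediction": prediction.tolist()}
```

With this version, FastAPI rejects malformed input with a 422 response and documents the schema automatically at /docs.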

Step 4: Run the API

To run your API, use the Uvicorn command-line interface (CLI) and specify the file that contains your FastAPI application:

uvicorn yourfile:app --reload

Your API will now be running on http://localhost:8000. You can make POST requests to the /predict endpoint with input data to get predictions from your machine learning model.
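For example, assuming the Iris model and the features field from the sketches above, you could call the endpoint with the requests library:

```python
import requests

# Send one sample (four Iris measurements) to the running API
response = requests.post(
    "http://localhost:8000/predict",
    json={"features": [5.1, 3.5, 1.4, 0.2]},
)
print(response.json())  # e.g. {"prediction": [0]}
```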

Conclusion

In this article, we have explored how to create APIs for machine learning models using FastAPI. FastAPI is a powerful tool for building APIs with Python, and its automatic validation of requests and serialization of responses make it easy to deploy machine learning models as web services. Try it out and start serving your models today!

14 Comments
@tcgvsocg1458
5 months ago

I am not entirely sure I understand how that works, but thanks a lot for the video.

@zedcodes
5 months ago

Why do I get this error: `module 'PIL.Image' has no attribute 'ANTIALIAS'`? @10:41

@Lyphnet
5 months ago

Please ensure that your Discord server remains joinable. Thanks!

@cheukmingau983
5 months ago

In production, the async endpoint should not be used. An async function (coroutine) is executed on the main-thread event loop, and like the event loop in JS inside the browser, it can only execute one coroutine at a time. Running the synchronous, CPU-intensive `model.predict` inside the async endpoint will freeze the event loop while the CPU is busy predicting, so the QPS of your handler is at most one.

Better options could be: 1) using a synchronous function as the inference endpoint; 2) creating a ThreadPoolExecutor outside of the async function and calling `loop.run_in_executor()` with it, so the model runs inside a worker thread (see the sketch below); or 3) using a ProcessPoolExecutor, similar to option 2. The problem with option 3 is that multiprocessing requires pickling, and you might have to tweak your model case by case.
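A minimal sketch of option 2, assuming the pickled model from the article (the executor size and the "features" field name are illustrative):

```python
import asyncio
import pickle
from concurrent.futures import ThreadPoolExecutor

from fastapi import FastAPI

app = FastAPI()

with open("model.pkl", "rb") as f:
    model = pickle.load(f)

# Create the executor once, outside the request handler
executor = ThreadPoolExecutor(max_workers=4)

@app.post("/predict")
async def predict(data: dict):
    loop = asyncio.get_running_loop()
    # Run the blocking model.predict in a worker thread so the
    # event loop stays free to accept other requests
    prediction = await loop.run_in_executor(
        executor, model.predict, [data["features"]]
    )
    return {"prediction": prediction.tolist()}
```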

Also, pickling the model and deserializing it in the API server doesn't reveal the identity and method signatures of that model. If you are the only one who trains and deploys it, that might not be a big problem, but in production you might want to use an inference framework like ONNX Runtime, where you first serialize your trained model to the preferred format (onnxruntime has a very small package size compared to other DL libraries, which keeps the deployment dependencies small). Lastly, running a scikit-learn model in Python doesn't utilize all the cores of your CPU, whereas other runtimes usually achieve a higher utilization.
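For instance, a trained Scikit-learn model can be converted with the skl2onnx package and served with onnxruntime alone (a rough sketch; the input shape of four features matches the Iris example above, and names like "input" are arbitrary):

```python
import pickle

import numpy as np
import onnxruntime as ort
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# Convert the trained model (four input features, as for Iris)
with open("model.pkl", "rb") as f:
    model = pickle.load(f)
onnx_model = convert_sklearn(
    model, initial_types=[("input", FloatTensorType([None, 4]))]
)
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

# At serving time, only onnxruntime is needed as a dependency
session = ort.InferenceSession("model.onnx")
features = np.array([[5.1, 3.5, 1.4, 0.2]], dtype=np.float32)
prediction = session.run(None, {"input": features})[0]
print(prediction)
```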

I understand that the model in this video is small and a proof of concept, so at that size running async and pickling is fine. However, for even moderately larger CV and NLP models (e.g. BERT) it is nearly impossible to adopt the same approach as in this tutorial.

@Hardy_21
5 months ago

For me it correctly guesses only the digits 4 and 6. For the rest it says they're 7 or 5.

@omegasigma4500
5 months ago

I'm glad you uploaded a video about FastAPI. We prefer it over Flask.
There are two topics where we need some help.
1.) Hosting: How do you deploy the app so that others can access it via the web, and how do you manage the cloud infrastructure?
2.) Frontend: There are now plenty of frameworks and libraries. The standard approach is probably JavaScript, HTML, and CSS, but I'm wondering what you think about pure Python libraries like Taipy, FastUI, and Reflex. What do you think is the best approach here? We would highly appreciate your input. Thanks!

Keep up the great work! 💪💪👍👍

@smstudio1035
5 months ago

Can we see a hosting video of the same?

@kutilkol
5 months ago

The head is fatter

@TheDigitalSight
5 months ago

We use FastAPI more than Django and Flask. Can you please create a video on LangChain and FastAPI as well?

@thelifehackerpro9943
5 months ago

Why not use the model directly instead of pickle?

@timothyelems1357
5 months ago

Exactly what I was looking for! Thanks man!

@khandoor7228
5 months ago

This was excellent; the capabilities this opens up are really powerful. Good job as always.

@dipeshsamrawat7957
5 months ago

You are making requested videos. Thank you 💯

@systembreaker4651
5 months ago

What is your daily Linux distro? ❤