Building a High-Performance Sensor Data API with FastAPI and Postgres' TimescaleDB Extension
Create an API for streaming, storing, and querying sensor data using Postgres TimescaleDB and FastAPI
In this guide, you'll build a high-performance API for streaming, storing, and querying sensor data using FastAPI and TimescaleDB for efficient time-series storage. By combining FastAPI with TimescaleDB's advanced time-series features, you'll be able to maintain low-latency queries even at petabyte scale, making this a great fit for IoT systems that generate large volumes of sensor data.
Prerequisites
Before starting, ensure you have the following tools and services ready:
- `pip`: Required for installing and managing Python packages, including `uv` for creating virtual environments. You can check if `pip` is installed by running the following command:
- Neon serverless Postgres: You will need a Neon account for provisioning and scaling your PostgreSQL database. If you don't have an account yet, sign up here.
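As a quick check of the `pip` prerequisite, the most portable invocation is through the Python interpreter itself:

```bash
python -m pip --version
```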
Setting up the Project
Follow these steps to set up your project and virtual environment:
- Create a `uv` project.

  If you don't already have `uv` installed, install it first, and then create a new project. This will create a new project directory called `timescale_fastapi`. Open this directory in the code editor of your choice.
- Set up the virtual environment.

  You will now create and activate a virtual environment in which your project's dependencies will be installed.
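With `uv`, creating and activating the environment might look like this (macOS/Linux activation shown):

```bash
uv venv                      # creates a .venv in the project directory
source .venv/bin/activate    # activate it (macOS/Linux)
```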
  You should see `(timescale_fastapi)` in your terminal; this means that your virtual environment is activated.
- Install dependencies.

  Next, add all the necessary dependencies for your project.
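Using `uv`, the packages can be added in one command:

```bash
uv add fastapi asyncpg uvicorn loguru python-dotenv
```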
  Each package does the following:

  - FastAPI: a web/API framework
  - AsyncPG: an asynchronous PostgreSQL client
  - Uvicorn: an ASGI server for the app
  - Loguru: a logging library
  - python-dotenv: loads environment variables from a `.env` file
- Create the project structure.

  Create the following directory structure to organize your project files:
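A minimal layout consistent with the files this guide refers to (`models.py` and `routes.py` are naming assumptions):

```
timescale_fastapi/
├── .env
├── main.py
├── database.py
├── models.py
└── routes.py
```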
Setting up your Database
In this section, you will set up the TimescaleDB
extension using Neon's console, add the database's schema, and create the database connection pool and lifecycle management logic in FastAPI. Optionally, you can also add some mock data to test your API endpoints.
Since TimescaleDB is an extension on top of vanilla Postgres, you must first enable it by running the following SQL in the SQL Editor tab of the Neon console.
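The SQL to enable the extension:

```sql
CREATE EXTENSION IF NOT EXISTS timescaledb;
```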
Next, you will add the necessary tables to your database with:
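A schema consistent with the endpoints and statistics described in this guide might look like this (exact column names and types are assumptions):

```sql
-- Sensor metadata: one row per physical sensor
CREATE TABLE sensors (
    sensor_id   SERIAL PRIMARY KEY,
    sensor_type TEXT NOT NULL,
    description TEXT,
    location    TEXT
);

-- Time-series readings: one row per measurement
CREATE TABLE sensor_data (
    time      TIMESTAMPTZ NOT NULL,
    sensor_id INTEGER NOT NULL REFERENCES sensors (sensor_id),
    value     DOUBLE PRECISION NOT NULL
);
```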
One of TimescaleDB's core features is hypertables, an optimized abstraction for handling large volumes of time-series data. A hypertable partitions your data into chunks based on time, allowing efficient storage, querying, and performance at scale. By converting the sensor_data table into a hypertable, you let TimescaleDB manage the underlying chunking and indexing automatically.
To convert the sensor_data
table into a hypertable, use the following command:
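Assuming the time column is named `time`, the conversion uses TimescaleDB's `create_hypertable()` function:

```sql
SELECT create_hypertable('sensor_data', 'time');
```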
Now that the schema is ready, you can optionally populate the database with some sample sensor data. First, insert the metadata for two sensors:
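For example, with illustrative values:

```sql
INSERT INTO sensors (sensor_type, description, location)
VALUES
    ('temperature', 'Outdoor temperature sensor', 'rooftop'),
    ('humidity', 'Outdoor humidity sensor', 'rooftop');
```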
Next, generate time-series data for the past 14 days at one-minute intervals for both sensors. Here's how you can insert random data for each sensor using Postgres's `generate_series()` function.
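A sketch of such an insert, assuming a `sensor_data(time, sensor_id, value)` table (the values are random, so your numbers will differ):

```sql
INSERT INTO sensor_data (time, sensor_id, value)
SELECT ts, s.sensor_id, random() * 100
FROM generate_series(
         NOW() - INTERVAL '14 days',
         NOW(),
         INTERVAL '1 minute'
     ) AS ts
CROSS JOIN (SELECT sensor_id FROM sensors) AS s;
```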
With your schema and sample data in place, you're now ready to connect to your database from the FastAPI application. To do this, create a `.env` file in the root of the project to hold environment-specific variables, such as the connection string for your Neon PostgreSQL database.
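For example (`DATABASE_URL` as the variable name is this guide's assumption; keep the placeholders until you paste in your real credentials):

```
DATABASE_URL=postgresql://user:password@your-neon-hostname/neondb?sslmode=require
```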
Make sure to replace the placeholders (user, password, your-neon-hostname, etc.) with your actual Neon database credentials, which are available in the console.
In your project, the `database.py` file manages the connection to PostgreSQL using `asyncpg` and its connection pool, a mechanism for managing and reusing database connections efficiently. With this, you can run asynchronous queries, allowing the application to handle multiple requests concurrently.
init_postgres
is responsible for opening the connection pool to the PostgreSQL
database and close_postgres
is responsible for gracefully closing all connections in the pool when the FastAPI
app shuts down to properly manage the lifecycle of the database.
Throughout your API you will also need access to the pool to get connection instances and run queries. get_postgres
returns the active connection pool. If the pool is not initialized, an error is raised.
Defining the Pydantic Models
Now, you will create Pydantic models to define the structure of the data your API expects and returns, automatically validating incoming requests and responses against the defined schema.
Each of the models represents the following:

- SensorData: a single sensor reading, including the value recorded and the timestamp when the reading occurred
- SensorDataBatch: a batch of data points, to support batch streaming in your API
- SensorCreate: the fields for creating a new sensor
- SensorDailyStatsResponse: the daily sensor statistics
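A sketch of these models (the field names are this guide's assumptions):

```python
# models.py: request/response models (sketch)
from datetime import datetime
from typing import List

from pydantic import BaseModel


class SensorData(BaseModel):
    """A single sensor reading: the recorded value and when it occurred."""
    value: float
    timestamp: datetime


class SensorDataBatch(BaseModel):
    """A batch of data points, to support batch streaming."""
    data: List[SensorData]


class SensorCreate(BaseModel):
    """Fields required to create a new sensor."""
    sensor_type: str
    description: str
    location: str


class SensorDailyStatsResponse(BaseModel):
    """Daily statistics for a sensor."""
    day: datetime
    avg_value: float
    min_value: float
    max_value: float
    median_value: float
    iqr_value: float
```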
Creating the API Endpoints
In this section, you will define the FastAPI endpoints that allow you to manage sensor data. These endpoints handle tasks like creating new sensors, streaming sensor data (both single points and batches), and querying daily statistics for a specific sensor. With these endpoints, you can efficiently manage and analyze sensor data using TimescaleDB’s time-series capabilities.
The code defines endpoints for:

- POST /sensors: creates a new sensor from the provided sensor type, description, and location.
- POST /sensor_data/{sensor_id}: streams sensor data for a specific sensor. The payload can be a single point or a batch.
- GET /daily_avg/{sensor_id}: retrieves daily statistics (average, min, max, median, IQR) for the given sensor over the last 7 days.
In the sensor statistics query, Timescale's time_bucket() function partitions the data into daily buckets quickly, using the indexes generated when you created the hypertable. Likewise, you can calculate statistics such as the interquartile range (IQR) directly in SQL.
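A query of this shape illustrates the idea, combining `time_bucket()` with Postgres's standard `percentile_cont()` ordered-set aggregate (column names are assumptions; `$1` is the sensor ID parameter):

```sql
SELECT
    time_bucket('1 day', time) AS day,
    AVG(value) AS avg_value,
    MIN(value) AS min_value,
    MAX(value) AS max_value,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY value) AS median_value,
    PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY value)
      - PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY value) AS iqr_value
FROM sensor_data
WHERE sensor_id = $1
  AND time > NOW() - INTERVAL '7 days'
GROUP BY day
ORDER BY day;
```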
Running the Application
After setting up the database, models, and API routes, the next step is to run the FastAPI
application and test it out.
The main.py
file defines the FastAPI
application, manages the database lifecycle, and includes the routes you created above.
To run the application, use the uvicorn CLI with the following command:
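Running on port 8080 to match the docs URLs used in this guide:

```bash
uvicorn main:app --port 8080 --reload
```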
Once the server is running, you can access the API documentation and test the endpoints directly in your browser:
- Interactive API Docs (Swagger UI): visit http://127.0.0.1:8080/docs to access the automatically generated API documentation, where you can test the endpoints.
- Alternative Docs (ReDoc): visit http://127.0.0.1:8080/redoc for another style of API documentation.
Testing the API
You can test your application using HTTPie, a command-line tool for making HTTP requests. The following steps will guide you through creating sensors, streaming data, and querying sensor statistics.
- Retrieve sensor statistics for pre-generated data (optional).

  If you followed the optional data generation steps, you can retrieve daily statistics for the pre-generated sensors:
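Assuming the server is running locally on port 8080 and the two pre-generated sensors received IDs 1 and 2:

```bash
http GET http://127.0.0.1:8080/daily_avg/1
http GET http://127.0.0.1:8080/daily_avg/2
```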
These commands will return the daily statistics (average, min, max, median, and IQR) for the pre-generated temperature and humidity sensors over the last 7 days.
- Create a new sensor.

  Start by creating a new sensor (e.g., a temperature sensor for the living room):
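With HTTPie, the request might look like this (the field names are this guide's assumptions):

```bash
http POST http://127.0.0.1:8080/sensors \
    sensor_type="temperature" \
    description="Living room temperature sensor" \
    location="living_room"
```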
You should see a response confirming the creation of the sensor with a unique ID:
- Stream a single sensor data point.

  Stream a single data point for the newly created sensor (`sensor_id = 3`):
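For example (HTTPie's `:=` sends a raw JSON number; the timestamp value is illustrative):

```bash
http POST http://127.0.0.1:8080/sensor_data/3 \
    value:=22.4 \
    timestamp="2025-01-01T12:00:00Z"
```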
  You should get a response indicating success:
- Stream a batch of sensor data.

  You can also stream multiple sensor data points in a batch for the same sensor:
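For example, sending two illustrative points as a raw JSON array:

```bash
http POST http://127.0.0.1:8080/sensor_data/3 \
    data:='[{"value": 22.8, "timestamp": "2025-01-01T12:01:00Z"}, {"value": 23.1, "timestamp": "2025-01-01T12:02:00Z"}]'
```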
This will send two data points to the sensor. The response will confirm successful streaming of the batch data:
- Retrieve daily statistics for the new sensor.

  After streaming the sensor data, you can retrieve the daily statistics for the new sensor (`sensor_id = 3`):
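For example:

```bash
http GET http://127.0.0.1:8080/daily_avg/3
```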
  This will return daily statistics (average, min, max, median, and IQR) for the new sensor over the last 7 days.
By following these steps, you can easily create sensors, stream sensor data, and query statistics from your API. For sensors with pre-generated data, you can retrieve the statistics immediately. For new sensors, you can stream data and retrieve their daily stats dynamically.
Conclusion
You have now created and tested an API for managing, streaming, and querying sensor data in TimescaleDB using FastAPI. By leveraging TimescaleDB for time-series storage, you have a high-performance solution for handling sensor data at scale.
As a next step, you can look into streaming data into the database using a distributed event platform like Kafka or Redpanda, or using Timescale to monitor the sensor data with Apache Superset or Grafana.