Time series databases are purpose-built for append-heavy, timestamp-indexed workloads. This project answers the question nobody asks clearly enough: what does a general purpose database actually get wrong for time series data, and at what point does a specialised engine start to matter?
The answer is not binary. It depends entirely on the query pattern.
The same 18,144,000-row IoT sensor dataset is loaded into three databases. The same 5 queries are run against all three. Results are timed, compared, and explained.
| Database | Type | Port |
|---|---|---|
| Vanilla PostgreSQL 15 | General purpose RDBMS — the baseline | 5432 |
| TimescaleDB (PG15 + extension) | PostgreSQL with hypertable + continuous aggregate | 5433 |
| InfluxDB 2.7 | Purpose-built TSDB, Flux query language | 8086 |
30 IoT sensors at 1-second intervals over 7 days.
device_id string sensor_001 … sensor_030
ts timestamptz 1-second intervals, UTC
temperature float Gaussian per device baseline ± noise
humidity float Gaussian per device baseline ± noise
pressure float Gaussian per device baseline ± noise
battery_pct float linear drain 100% → ~70% over 7 days
is_anomaly bool 1% of readings flagged
Total rows: 18,144,000 (30 devices × 604,800 seconds) File size: ~180MB (Snappy-compressed Parquet) Seed: 42 — fully reproducible
time-series-databases/
├── docker-compose.yml # postgres_vanilla, timescaledb, influxdb
├── requirements.txt
├── generate_sensor.py # generates data/sensors.parquet
├── ingest_postgres.py # vanilla PG — COPY + btree indexes
├── ingest_timescale.py # hypertable + continuous aggregate + retention
├── ingest_influx.py # line protocol write to InfluxDB
├── benchmark_queries.py # 5 queries × 3 databases, timed comparison
└── data/ # gitignored — sensors.parquet, benchmark_results.json
# 1. Start containers
docker compose up -d
# 2. Retrieve InfluxDB token (generated at container init)
docker exec influxdb influx auth list --hide-headers | awk '{print $4}'
export INFLUX_TOKEN="your-token-here"
# 3. Environment
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
# 4. Generate dataset
python generate_sensor.py
# 5. Ingest into all three databases
python ingest_postgres.py
python ingest_timescale.py
python ingest_influx.py
# 6. Run benchmark
python benchmark_queries.py
# 7. Teardown
docker compose down # stop, preserve volumes
docker compose down -v # stop, destroy volumes (full reset)
# Note: docker compose down -v destroys the InfluxDB token — retrieve a new one on next start| # | Query | What it tests |
|---|---|---|
| Q1 | Last value per device | Point lookup — latest reading for all 30 devices |
| Q2 | 5-min rolling average (sensor_001, 1h window) | Time-windowed aggregation |
| Q3 | Gap detection — missing readings > 2 min | Temporal reasoning across full dataset |
| Q4 | Hourly downsample (7 days) | Aggregation — the critical query |
| Q5 | Range scan — sensor_001, full day | Selective time predicate, 86,400 rows |
Each query runs 3 times per database. Best time is recorded.
| Query | Vanilla PG | TimescaleDB | InfluxDB | Winner |
|---|---|---|---|---|
| Q1 Last value | 2.594s | 9.296s | 0.031s | InfluxDB ✓ |
| Q2 Rolling avg | 0.078s | 0.097s | 0.013s | InfluxDB ✓ |
| Q3 Gap detection | 17.557s | 13.991s | ERROR | TimescaleDB ✓ |
| Q4 Hourly downsample | 0.206s | 0.003s | 0.349s | TimescaleDB ✓ |
| Q5 Range scan | 0.194s | 0.320s | ERROR | Vanilla PG ✓ |
| Query | TimescaleDB | InfluxDB |
|---|---|---|
| Q1 | 0.3× (slower) | 83.9× faster |
| Q2 | 0.8× (slower) | 5.9× faster |
| Q3 | 1.3× faster | — (timeout) |
| Q4 | 76.4× faster | 0.6× (slower) |
| Q5 | 0.6× (slower) | — (error) |
No single database wins every query. The right tool depends on the dominant query pattern in your workload.
TimescaleDB's continuous aggregate is the standout result. Q4 at 76.4× over vanilla PG is not a marginal improvement — it is a different order of magnitude. The continuous aggregate pre-computes hourly aggregates incrementally as data arrives. At query time TimescaleDB reads a materialised view, not 18M raw rows.
InfluxDB dominates point lookups. last() is a native primitive. Q1 at 83.9× over vanilla PG shows what purpose-built means — the engine was designed from the ground up to answer "what is the latest value" instantly.
TimescaleDB's hypertable can be slower than vanilla PG. Q1 and Q5 both show TimescaleDB slower than vanilla PG on the same query. For point lookups and selective single-device range scans, the chunk file overhead exceeds the pruning benefit. Hypertables win at wide time ranges and high cardinality, not on narrow selective queries.
Vanilla PG is not the wrong tool. On Q5 it beats TimescaleDB. On Q2 it is within 25% of TimescaleDB. A well-indexed PostgreSQL table handles time series workloads at moderate scale — the TSDB argument only becomes compelling at the specific query patterns where specialisation matters.
InfluxDB fails at cross-series temporal analysis. Q3 gap detection timed out. InfluxDB is optimised for tag-filtered, field-scoped queries within bounded time ranges. Full dataset temporal reasoning is outside its sweet spot.
Hypertable — a standard PostgreSQL table whose storage is automatically partitioned into time-based chunks. A query with a time predicate only opens the relevant chunk files. Queries without time predicates must open all chunks — which is why Q1 is slower on TimescaleDB than vanilla PG.
Continuous aggregate — a materialised view that recomputes incrementally on a background schedule. sensor_readings_hourly stores pre-computed hourly averages per device. Q4 reads from this view rather than aggregating 18M rows live.
Retention policy — automatic chunk deletion after N days at the filesystem level. O(1) regardless of row count. Vanilla PG DELETE WHERE ts < cutoff is O(n).
Tags vs Fields — tags are indexed string metadata (device_id). Fields are the actual numeric values (temperature, humidity). Filtering on tags is fast. Filtering on field values is a full scan. Getting this wrong is the most common InfluxDB production mistake.
Line protocol — measurement,tag=value field=value timestamp_ns. The wire format for writes. Built for high-throughput appends.
Flux — InfluxDB's functional query language. Every query must start with range() — InfluxDB always filters by time first. Queries without a time bound are rejected.