June 10, 2026

Async FastAPI, Measured

Everyone says async is faster. I finally ran the numbers.

Same Books API built in Flask and FastAPI as mirror projects: same Postgres, same schema, same 10k seeded rows, same Docker host, each capped at 2 CPUs. Flask runs under gunicorn with 4 workers × 4 threads; FastAPI runs under uvicorn with 4 workers. k6 drives the load: 3 workloads, 3 runs each, median reported.
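
For readers who want to picture the server setup, here is a rough sketch of what the gunicorn side could look like as a config file. It is not copied from the repo: the bind address is a placeholder, and the `app.main:app` module path in the uvicorn comment is hypothetical. The 2-CPU cap would sit at the container level (Docker's `--cpus` flag is one way to do that).

```python
# gunicorn.conf.py (Flask side): 4 worker processes x 4 threads each.
bind = "0.0.0.0:8000"      # placeholder address, not taken from the repo
workers = 4                # worker processes
threads = 4                # threads per worker, so 4 x 4 = 16 request slots
worker_class = "gthread"   # threaded worker type

# FastAPI side, same worker count, started from the shell.
# "app.main:app" is a hypothetical module path:
#   uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4
```

The number that matters later is the product: 4 workers × 4 threads is a hard limit of 16 requests in progress at once on the Flask side.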

Workload 1 — read-light (GET random book of 10k, ramp 50 → 1000 VUs):
1. Flask: 1,476 req/s. FastAPI: 1,692 req/s. ~15% gap.
2. Client p50: Flask 276 ms. FastAPI 165 ms.
3. Verdict: similar. Pick on ergonomics.

Workload 2 — mixed (70% GET / 25% POST / 5% PATCH, ramp 100 → 500 VUs):
1. Flask: 1,326 req/s. FastAPI: 1,413 req/s. ~7% gap.
2. Client p50: Flask 232 ms. FastAPI 171 ms.
3. Verdict: also similar. Tail (p95) tilts to Flask under saturation here.

Workload 3 — I/O fanout (every request does pg_sleep(50ms), ramp 50 → 500 VUs):
1. Flask: 265 req/s. FastAPI: 992 req/s.
2. Client p50: Flask 905 ms. FastAPI 226 ms.
3. **FastAPI = 3.7× the throughput at ¼ the client p50.**

Why the fanout gap opens that wide:
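gunicorn's 4 workers × 4 threads give Flask 16 concurrent slots. Once VUs exceed 16, every extra request waits in the queue while all 16 threads sit blocked on pg_sleep. At 50 ms per request, that caps Flask at roughly 16 / 0.05 s = 320 req/s in the best case, in line with the 265 req/s measured. FastAPI's event loop keeps hundreds of coroutines in flight per worker and yields during the wait instead of blocking. The framework difference shows up exactly where the workload spends its time waiting.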
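
The slot math can be reproduced without any web framework. The sketch below is a stdlib-only simulation, not the benchmark code. `time.sleep` and `asyncio.sleep` stand in for a blocking and a non-blocking 50 ms database wait, a 16-thread pool stands in for gunicorn's 4 × 4 slots, and the 160-request burst is an arbitrary illustrative number. It ignores connection pools, network, and serialization.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

WAIT_S = 0.05    # stands in for a 50 ms pg_sleep
SLOTS = 16       # 4 workers x 4 threads
REQUESTS = 160   # illustrative burst size, not a number from the benchmark

# Best-case throughput when every slot is always busy waiting 50 ms.
CEILING_RPS = SLOTS / WAIT_S  # 320 req/s


def blocking_handler() -> None:
    """Sync handler: the thread is held for the whole wait."""
    time.sleep(WAIT_S)


def run_threaded() -> float:
    """Serve the burst through a fixed pool of SLOTS threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=SLOTS) as pool:
        futures = [pool.submit(blocking_handler) for _ in range(REQUESTS)]
        for future in futures:
            future.result()
    return time.perf_counter() - start


async def async_handler() -> None:
    """Async handler: the event loop is free during the wait."""
    await asyncio.sleep(WAIT_S)


async def run_async() -> float:
    """Serve the same burst as coroutines on one event loop."""
    start = time.perf_counter()
    await asyncio.gather(*(async_handler() for _ in range(REQUESTS)))
    return time.perf_counter() - start


if __name__ == "__main__":
    threaded_s = run_threaded()
    async_s = asyncio.run(run_async())
    print(f"slot ceiling: {CEILING_RPS:.0f} req/s")
    print(f"threaded: {threaded_s:.2f} s, async: {async_s:.2f} s")
```

On an idle machine the threaded run should need about 10 waves of 50 ms, so roughly half a second, while the async run finishes in about one 50 ms wait. The real gap in the benchmark is smaller (3.7×) because real requests also spend time on work that does not overlap this cleanly.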

The honest rule:
1. Low-concurrency CRUD? Either framework is fine. The latency gap is below the noise.
2. Many concurrent slow I/O calls? Async wins clearly. WebSockets and streaming live here too.
3. CPU-bound work? Neither helps. Move to a worker queue.
4. The cost of async is real — harder debugging, smaller hiring pool, lazy-load gotchas. Pay it when the workload justifies it.
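
Rules 2 and 3 can be seen in a few lines of plain asyncio. This is a toy trace, not FastAPI code, and the handler names are made up for illustration. One handler waits on I/O, the other does pure computation, and a second "request" is scheduled alongside each to show whether the event loop gets a chance to serve it in the meantime.

```python
import asyncio


async def io_bound(log: list) -> None:
    log.append("io start")
    await asyncio.sleep(0.05)  # yields to the event loop while waiting
    log.append("io end")


async def cpu_bound(log: list) -> None:
    log.append("cpu start")
    sum(i * i for i in range(200_000))  # pure computation, never awaits
    log.append("cpu end")


async def other_request(log: list) -> None:
    log.append("other request served")


async def trace(handler) -> list:
    """Run one handler next to a second request and record the order."""
    log: list = []
    await asyncio.gather(handler(log), other_request(log))
    return log


print(asyncio.run(trace(io_bound)))
# ['io start', 'other request served', 'io end']
print(asyncio.run(trace(cpu_bound)))
# ['cpu start', 'cpu end', 'other request served']
```

The I/O handler lets the other request through while it waits. The CPU handler holds the loop until it is done, which is why CPU-heavy work belongs in a separate worker rather than in an async endpoint.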

Caveat I owe you: this ran on a developer machine, not a cloud cluster. Treat the numbers as a relative comparison under identical, controlled conditions, not as absolute production capacity.

Async in FastAPI is tooling, not a miracle. Use it where the workload demands it, not because the README is pretty.

This closes the 4-part series. Thanks for reading.

Full reproduction + raw data: https://github.com/bilouro/FastAPIProject/tree/main/benchmark

What workload finally pushed you from sync to async — and was the gain measurable?

P.S. New tech post every Wednesday.

#FastAPI #PerformanceEngineering #Python