Guide

PostgreSQL fundamentals explained

Supabase, Neon, RDS, and most greenfield backends default to PostgreSQL for good reason: strict SQL semantics, rich indexing, extensions (PostGIS, pgvector, Citus), and a concurrency model that keeps readers from blocking writers under normal OLTP load. The catch is that Postgres is not “install and forget” — you need to understand MVCC (multi-version concurrency control), how the write-ahead log (WAL) guarantees durability, why VACUUM is not optional housekeeping, and how B-tree and GIN indexes interact with the planner. ORMs like SQLAlchemy, Drizzle, and Prisma hide SQL generation but not physics — a missing index or bloated table still hurts at scale. This guide covers Postgres process architecture, MVCC snapshots and isolation, indexing strategy, WAL and checkpoints, autovacuum and bloat, connection pooling with PgBouncer, streaming replication, JSONB and partitioning, a Harbor Commerce order ledger worked example, a database decision table, common pitfalls, and a production checklist alongside our SQL fundamentals guide.

What PostgreSQL is

PostgreSQL is an open-source object-relational database descended from the Berkeley POSTGRES research project. Clients connect over TCP (port 5432) or a Unix socket, authenticate (password, cert, or SSO via extensions), and issue SQL. Unlike MySQL’s storage-engine plug-ins, Postgres uses a single unified storage layer (heap + index access methods) with a powerful extension system loaded into the same server process.

Process architecture

A running cluster is a postmaster parent that forks backend processes — one per client connection by default. Background workers handle WAL writing, checkpointing, autovacuum, logical replication, and stats collection. Each backend owns private memory for sorts and hash aggregates (work_mem), while shared buffers cache data pages for all backends.

Because every connection is a full OS process, Postgres is memory-hungry at high concurrency. Production stacks almost always front the database with connection pooling (PgBouncer in transaction mode, RDS Proxy, or pooler built into serverless Postgres) rather than opening thousands of direct sessions from serverless functions.

MVCC: how Postgres handles concurrent reads and writes

Postgres implements multi-version concurrency control: when you UPDATE a row, Postgres writes a new row version and leaves the old version in place until vacuum reclaims it. Each row carries xmin and xmax transaction IDs marking visibility bounds. Readers take a snapshot at transaction start and see only row versions visible to that snapshot — writers do not block readers, and readers do not block writers (except when both need row-level locks on the same row).

Isolation levels

Default isolation is READ COMMITTED: each statement sees committed data as of statement start. REPEATABLE READ and SERIALIZABLE provide stronger guarantees; serializable uses predicate locking and may raise serialization failures your app must retry. See our isolation levels guide for cross-database comparison with MySQL InnoDB.

Why dead tuples matter

Heavy UPDATE/DELETE workloads leave dead tuples that bloat tables and indexes. Queries scan more pages; indexes point at obsolete versions. VACUUM (usually via autovacuum) marks space reusable; VACUUM FULL rewrites the whole table (locks exclusively — avoid on hot tables). Monitor n_dead_tup in pg_stat_user_tables and autovacuum lag alerts.

WAL, checkpoints, and durability

Every committed change appends to the write-ahead log (WAL) before dirty data pages hit disk. On crash, Postgres replays WAL to reach a consistent state. This is the same durability contract as InnoDB’s redo log, but Postgres exposes more tuning knobs:

  • synchronous_commit = on (default) — commit waits for WAL flush; safest.
  • synchronous_commit = off — faster commits with a window of lost transactions on crash; rare in financial OLTP.
  • Checkpoints — periodic WAL flush to data files; tune checkpoint_timeout and max_wal_size to avoid I/O spikes.

Streaming replication ships WAL to standbys for read scaling and failover. Synchronous replicas (synchronous_standby_names) trade latency for zero data loss on primary failure; async replicas may lag seconds. Logical replication decodes WAL into row changes for selective table sync or cross-version upgrades. Point-in-time recovery (PITR) combines base backups with archived WAL segments — test restore procedures quarterly, not only backup jobs.

Indexing strategy

Postgres heap tables store rows in insertion order (unless clustered). Indexes are separate B-tree structures pointing to row locations (ctid). Unlike MySQL InnoDB, the primary key does not dictate physical row order unless you CLUSTER (rare, reverts over time).

Index types you will actually use

  • B-tree (default) — equality and range on scalars, timestamps, UUIDs; supports multicolumn and partial indexes (WHERE status = 'open').
  • GIN — inverted indexes for jsonb, arrays, full-text tsvector; larger and slower to update than B-tree but essential for containment queries (@>, @@).
  • GiST / SP-GiST — geometric data, ranges, nearest-neighbor patterns.
  • BRIN — block-range summaries for append-only time-series where rows correlate with insert order.

Use EXPLAIN (ANALYZE, BUFFERS) on production-shaped queries. A sequential scan on a million-row table is often correct if you need most rows; forcing indexes on low-selectivity columns hurts. Expression indexes (CREATE INDEX ON orders ((lower(email)))) and covering indexes (INCLUDE (col)) eliminate heap lookups for index-only scans when visibility map bits allow.

Foreign keys should be indexed on the referencing side — Postgres does not auto-create them. Missing FK indexes turn cascades and parent deletes into sequential scans on child tables.

JSONB, partitioning, and extensions

JSONB stores JSON in a decomposed binary format with GIN indexing support. Use it for semi-structured attributes (metadata, feature flags) that change shape often — not as a substitute for normalized columns you filter and join on every request. Validate shape at the application layer or with CHECK (jsonb_typeof(payload->'amount') = 'number') constraints.

Declarative partitioning (RANGE, LIST, HASH) splits large tables into child partitions pruned at plan time. Time-series event tables partitioned by month keep indexes small and allow dropping old partitions instead of mass DELETE. Partition key must appear in queries for pruning to help.

The extension ecosystem is a major Postgres advantage: PostGIS for geospatial, pgvector for embeddings, pg_cron for scheduled jobs, citext for case-insensitive text. Managed providers whitelist subsets — verify extension availability before designing around one.

Connection pooling and serverless

PgBouncer multiplexes many client connections onto fewer server backends. Transaction pooling returns the backend after each transaction — great for stateless APIs but breaks session-level features (prepared statements without DEALLOCATE, SET, advisory locks, LISTEN/NOTIFY). Session pooling preserves session state at the cost of lower multiplexing gain.

Serverless platforms (Vercel functions, Cloudflare Workers calling Hyperdrive) open bursty connections. Pair short-lived compute with a pooler and conservative max_connections on Postgres itself. ORMs that prepare every statement may need prepare: false in transaction pool mode — a common production misconfiguration that surfaces as mysterious protocol errors.

Worked example: Harbor Commerce order ledger

Harbor Commerce runs a multi-tenant order API on Postgres 16 behind FastAPI. Requirements: idempotent payment webhooks, fast lookup by order ID and customer email, nightly revenue aggregates without locking checkout, and row-level security when exposing read replicas to analytics tools.

CREATE TABLE customers (
  id           uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  email        citext NOT NULL UNIQUE,
  created_at   timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE orders (
  id           uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  customer_id  uuid NOT NULL REFERENCES customers(id),
  status       text NOT NULL CHECK (status IN ('pending','paid','shipped','refunded')),
  total_cents  bigint NOT NULL CHECK (total_cents >= 0),
  metadata     jsonb NOT NULL DEFAULT '{}',
  created_at   timestamptz NOT NULL DEFAULT now(),
  updated_at   timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX idx_orders_customer_created
  ON orders (customer_id, created_at DESC);

CREATE INDEX idx_orders_status_open
  ON orders (created_at)
  WHERE status IN ('pending', 'paid');

CREATE UNIQUE INDEX idx_orders_idempotency
  ON orders ((metadata->>'idempotency_key'))
  WHERE metadata ? 'idempotency_key';

-- Idempotent insert from payment webhook
INSERT INTO orders (customer_id, status, total_cents, metadata)
VALUES ($1, 'paid', $2, $3)
ON CONFLICT ((metadata->>'idempotency_key'))
  WHERE metadata ? 'idempotency_key'
DO NOTHING
RETURNING id;

Design choices explained:

  • UUID primary keys — safe for public APIs and distributed writers; accept slightly larger indexes than bigint serials.
  • citext email — case-insensitive uniqueness without application normalization bugs.
  • Partial index on open statuses — ops dashboards query pending/paid queues without indexing shipped history.
  • Expression unique index on idempotency_key — webhook retries cannot double-charge; partial predicate skips rows without the key.
  • jsonb metadata — gateway-specific fields without schema migrations per processor.

FastAPI connects through PgBouncer in transaction mode with pool_size capped per pod. Read-heavy reporting hits a streaming replica lag-monitored under five seconds; financial reconciliation jobs use REPEATABLE READ snapshots on the primary during low-traffic windows.

Database decision table

Choose PostgreSQL when… Prefer MySQL when… Consider something else when…
Rich SQL, CTEs, windows, and partial indexes are daily needs Existing LAMP/WordPress stack is fixed on InnoDB Single-file embedded app — SQLite
Extensions (PostGIS, pgvector, Citus) matter PlanetScale/Vitess sharding path is already chosen Columnar analytics at petabyte scale — warehouse
JSONB + GIN beats document DB for hybrid workloads Team tooling is Percona/MySQL EXPLAIN-centric Sub-ms KV at millions QPS — Redis
Logical replication and foreign data wrappers integrate pipelines Managed Aurora MySQL is already provisioned Edge-local sync — Turso, CRDT stores
Serializable isolation or complex CHECK constraints are required Read replicas + async binlog are sufficient Schema-free burst writes only — document store

Common pitfalls

  • Too many direct connections — serverless bursts exhaust max_connections; use PgBouncer or a managed pooler.
  • Neglecting autovacuum — dead tuple bloat causes sequential scans and index swell; tune thresholds on hot tables.
  • Missing indexes on FK columns — parent deletes and joins degrade to full child scans.
  • SELECT * on wide jsonb rows — ships megabytes per row; project columns explicitly.
  • ORM N+1 queries — lazy loading in loops multiplies round trips; use joins or batched IN queries.
  • Long transactions holding xmin — blocks vacuum, inflates bloat; keep transactions short.
  • Prepared statements + transaction pooling — protocol errors until ORM disables prepare or pool mode changes.
  • Random UUID inserts without plan — index fragmentation on monotonic inserts; consider uuidv7() or bigint IDs for append-heavy tables.

Production checklist

  • Set shared_buffers ~25% of RAM (not more without measurement) and size work_mem conservatively per connection.
  • Enable pg_stat_statements and log slow queries (log_min_duration_statement).
  • Monitor autovacuum, dead tuples, replication lag, and disk growth on WAL and data directories.
  • Index every foreign key and frequent WHERE/JOIN predicate; verify with EXPLAIN ANALYZE.
  • Front OLTP with PgBouncer or equivalent; cap app pool sizes to match pooler math.
  • Automate base backups + WAL archiving; test point-in-time restore quarterly.
  • Use migrations (Flyway, Liquibase, Alembic, Prisma migrate) — never hand-edit production schema.
  • Set statement timeouts on analytics roles to kill runaway queries.
  • Run major version upgrades through blue/green or logical replication with a rehearsed cutover.
  • Document extension dependencies and managed-provider allowlists before architecture sign-off.

Key takeaways

  • PostgreSQL uses MVCC so readers and writers rarely block each other — at the cost of vacuum discipline.
  • WAL provides crash recovery and replication; synchronous settings trade latency for durability.
  • B-tree, GIN, and partial indexes are the main levers for OLTP and JSONB query performance.
  • Connection pooling is mandatory at serverless scale; mind transaction-mode limitations.
  • Extensions and SQL depth make Postgres the default for new backends when the team can operate it well.

Related reading