When Partitioning Becomes Necessary
I spent the better part of a year watching our main transactions table grow from 50 million rows to over 400 million. For a while, indexes held things together. But once we crossed roughly 200 million rows, the cracks started showing: nightly settlement queries that used to finish in seconds were taking minutes, VACUUM operations were blocking writes during peak hours, and our index bloat was eating disk space faster than we could provision it.
There's no magic number, but in my experience these are the signals that it's time to partition:
- Table size exceeds 50–100 GB or surpasses 100–200 million rows
- VACUUM and ANALYZE take longer than your maintenance window
- Index scans slow down despite proper indexing, because the B-tree depth keeps growing
- You have a natural time-based or categorical access pattern (most payment queries do)
Tip: Don't partition prematurely. If your table is under 10 GB and queries are fast, partitioning adds complexity without meaningful benefit. Measure first.
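A quick way to check where you stand before deciding; a minimal sketch, assuming your table is named transactions (reltuples is an estimate maintained by VACUUM/ANALYZE, not an exact count):

```sql
-- Total on-disk size (table + indexes + TOAST) and approximate row count
SELECT pg_size_pretty(pg_total_relation_size('transactions')) AS total_size,
       reltuples::bigint AS approx_rows
FROM pg_class
WHERE relname = 'transactions';
```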
How Partitioning Works
PostgreSQL's declarative partitioning (available since version 10, significantly improved in 12+) lets you split a single logical table into smaller physical tables. Queries against the parent table are automatically routed to the relevant partitions. The planner skips partitions that can't contain matching rows — a mechanism called partition pruning.
[Diagram: the transactions table split into monthly partitions of roughly 38M, 35M, 40M, 12M, and 42M rows]
Instead of scanning 400 million rows, a query for March 2026 settlements only touches the ~40 million rows in that single partition. The difference is dramatic.
Choosing a Partitioning Strategy
PostgreSQL supports three partitioning methods. For payment data, the choice depends on your query patterns.
| Strategy | Best For | Pros | Cons |
|---|---|---|---|
| Range | Time-series data (created_at, settlement_date) | Natural fit for payment timelines; easy to drop old data; excellent pruning on date queries | Uneven partition sizes if volume fluctuates seasonally |
| List | Categorical data (status, currency, region) | Good for queries that always filter by a specific category; simple to reason about | Doesn't scale well if categories grow; unbalanced partition sizes |
| Hash | Even distribution when no natural range exists | Guarantees uniform partition sizes; works for merchant_id or transaction_id | No partition pruning on range queries; can't easily drop a "time window" of data |
For most payment systems, range partitioning on created_at is the right default. Your settlement reports, reconciliation jobs, and compliance queries almost always filter by date range. If you also need to shard by merchant, consider a composite approach: range-partition by month, then sub-partition by merchant hash.
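If you do go composite, each monthly range partition becomes itself a partitioned table, hashed by merchant. A hedged sketch under illustrative names (not the schema used below):

```sql
-- Parent: range-partitioned by month
CREATE TABLE txns (
    id BIGINT NOT NULL,
    merchant_id UUID NOT NULL,
    created_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (created_at);

-- The January partition is itself hash-partitioned by merchant
CREATE TABLE txns_2026_01 PARTITION OF txns
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01')
    PARTITION BY HASH (merchant_id);

CREATE TABLE txns_2026_01_h0 PARTITION OF txns_2026_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
-- ...repeat for REMAINDER 1 through 3
```

Rows can only land once every hash sub-partition for a month exists, which is part of the operational overhead mentioned above.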
Setting It Up
Here's the DDL I used for our transactions table. The parent table defines the schema and partition key; each child table covers one month.
CREATE TABLE transactions (
id BIGINT GENERATED ALWAYS AS IDENTITY,
merchant_id UUID NOT NULL,
amount NUMERIC(15,2) NOT NULL,
currency CHAR(3) NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'pending',
payment_method VARCHAR(30),
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
settled_at TIMESTAMPTZ,
metadata JSONB
) PARTITION BY RANGE (created_at);
-- Create monthly partitions
CREATE TABLE transactions_2026_01 PARTITION OF transactions
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE transactions_2026_02 PARTITION OF transactions
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE transactions_2026_03 PARTITION OF transactions
FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');
CREATE TABLE transactions_2026_04 PARTITION OF transactions
FOR VALUES FROM ('2026-04-01') TO ('2026-05-01');
-- Indexes are created on each partition automatically
CREATE INDEX ON transactions (created_at);
CREATE INDEX ON transactions (merchant_id, created_at);
CREATE INDEX ON transactions (status) WHERE status IN ('pending', 'processing');
Important: The partition key (created_at) must be included in any unique constraint or primary key. This is a PostgreSQL requirement. You can't have a global unique index on id alone — it needs to be (id, created_at).
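In practice that means declaring the key explicitly; a minimal sketch, assuming you want id to stay unique within the table:

```sql
-- Rejected by PostgreSQL: a unique index on id alone can't be
-- enforced across partitions
-- ALTER TABLE transactions ADD PRIMARY KEY (id);

-- Accepted: the partition key is part of the constraint
ALTER TABLE transactions ADD PRIMARY KEY (id, created_at);
```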
Partition Pruning in Action
When you query with a filter on the partition key, PostgreSQL skips irrelevant partitions entirely. You can verify this with EXPLAIN:
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, amount, status
FROM transactions
WHERE created_at >= '2026-03-01'
AND created_at < '2026-04-01'
AND merchant_id = 'a1b2c3d4-...'
AND status = 'settled';
-- Output shows:
-- -> Index Scan on transactions_2026_03
-- Rows Removed by Filter: 0
-- Buffers: shared hit=42
-- Only the March partition was scanned.
Without partitioning, that same query would scan the full index across all 400 million rows. With partitioning, it touches only the March partition's index — roughly 40 million rows.
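Pruning only works when the filter compares the partition key directly. Wrapping created_at in an expression hides it from the planner; a hedged illustration:

```sql
-- Prunes to the March partition: direct range comparison
SELECT count(*) FROM transactions
WHERE created_at >= '2026-03-01' AND created_at < '2026-04-01';

-- Scans every partition: the planner can't prune through the expression
SELECT count(*) FROM transactions
WHERE date_trunc('month', created_at) = '2026-03-01';
```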
Automating Maintenance with pg_partman
Manually creating partitions every month is a recipe for a 2 AM incident when the next month's partition doesn't exist and inserts start failing. That's where pg_partman comes in. It automates partition creation and optionally handles retention.
-- Install the extension
CREATE EXTENSION pg_partman;
-- Configure automatic partition management
SELECT partman.create_parent(
p_parent_table => 'public.transactions',
p_control => 'created_at',
p_type => 'native',
p_interval => 'monthly',
p_premake => 3 -- create 3 future partitions ahead
);
-- Schedule maintenance (run via pg_cron or external scheduler)
-- This creates upcoming partitions and optionally drops old ones
SELECT partman.run_maintenance('public.transactions');
I run run_maintenance daily via pg_cron. It checks whether the next three months of partitions exist and creates any that are missing. For retention, you can configure pg_partman to automatically detach or drop partitions older than a specified threshold:
-- Keep 24 months of data, detach older partitions
UPDATE partman.part_config
SET retention = '24 months',
retention_keep_table = true -- detach but don't drop
WHERE parent_table = 'public.transactions';
Warning: Setting retention_keep_table = false will permanently delete data. In payment systems, regulatory requirements often mandate keeping transaction records for 5–7 years. Detach old partitions and move them to cold storage instead of dropping them.
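One way to do that cold-storage move manually; the partition name and archive steps here are illustrative:

```sql
-- Detach the partition so it becomes a standalone table
-- (CONCURRENTLY avoids a long lock; PostgreSQL 14+)
ALTER TABLE transactions DETACH PARTITION transactions_2024_01 CONCURRENTLY;

-- Archive it outside the database, e.g.:
--   pg_dump -t transactions_2024_01 mydb > transactions_2024_01.sql
-- Only then, if your retention policy allows it:
DROP TABLE transactions_2024_01;
```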
Migrating an Existing Table
This is the part nobody enjoys. You can't just ALTER TABLE an existing unpartitioned table into a partitioned one. The migration requires creating a new partitioned table and moving data over. Here's the approach I've used in production with minimal downtime:
- Create the new partitioned table with the same schema (as shown above).
- Backfill historical data in batches during off-peak hours:
-- Migrate in monthly chunks to avoid long locks
INSERT INTO transactions_new
SELECT * FROM transactions_old
WHERE created_at >= '2026-01-01'
  AND created_at < '2026-02-01';
-- Repeat for each month
- Set up a trigger or logical replication on the old table to capture new writes during migration.
- Swap tables in a short transaction once the backfill catches up:
BEGIN;
ALTER TABLE transactions_old RENAME TO transactions_archive;
ALTER TABLE transactions_new RENAME TO transactions;
COMMIT;
- Update application connection pools and verify queries hit the new table.
The entire swap step takes milliseconds since it's just metadata changes. The backfill is the slow part, but it runs in the background without blocking reads or writes on the original table.
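The dual-write in step 3 can be done with a simple trigger; a hedged sketch (the function and trigger names are mine, and it doesn't handle updates, deletes, or deduplication against the backfill):

```sql
CREATE OR REPLACE FUNCTION copy_to_new() RETURNS trigger AS $$
BEGIN
    -- Mirror every new row into the partitioned table
    INSERT INTO transactions_new VALUES (NEW.*);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mirror_writes
    AFTER INSERT ON transactions_old
    FOR EACH ROW EXECUTE FUNCTION copy_to_new();
```

Drop the trigger inside the swap transaction so it doesn't fire against the renamed tables.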
Partition Key Selection: created_at vs merchant_id
I've seen teams debate this endlessly. Here's my take after running both in production:
- created_at (range) — wins for settlement, reconciliation, reporting, and compliance queries. Also makes retention trivial: drop a month, done. This is the right choice for 90% of payment systems.
- merchant_id (hash) — useful if your primary access pattern is "give me all transactions for merchant X" and you have thousands of merchants with roughly equal volume. But you lose the ability to efficiently drop old data by time range.
- Composite (range + hash sub-partitioning) — the best of both worlds, but adds operational complexity. Only worth it if you genuinely query both dimensions frequently and have the team to manage it.
Gotchas I Learned the Hard Way
- Foreign keys referencing a partitioned table are supported only since PostgreSQL 12, and the referenced columns must include the partition key. Because the primary key is (id, created_at), other tables can't reference transactions.id alone: either store both columns on the referencing side, or handle referential integrity at the application level.
- Cross-partition queries without the partition key will scan all partitions. If someone writes SELECT * FROM transactions WHERE merchant_id = '...' without a date filter, they'll hit every partition. Add guardrails in your application layer.
- enable_partition_pruning must be on (it's on by default, but verify). Without it, the planner won't skip irrelevant partitions.
- Partition-wise joins and aggregates (enable_partitionwise_join, enable_partitionwise_aggregate) are off by default. Turning them on can significantly speed up joins between partitioned tables, but increases planning time.
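Because of that planning-time cost, it's reasonable to flip the settings per session for heavy reporting queries rather than globally; a minimal sketch:

```sql
-- Enable partition-wise plans only for this session's reporting work
SET enable_partitionwise_join = on;
SET enable_partitionwise_aggregate = on;
-- ...run the joins between partitioned tables, then restore defaults:
RESET enable_partitionwise_join;
RESET enable_partitionwise_aggregate;
```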