Using PostgreSQL JSONB for Flexible Payment Metadata

The Problem with Rigid Schemas

I spent the better part of 2024 working on a payment orchestration platform that routed transactions across four different gateways. Early on, we made the classic mistake: we tried to normalize every field from every provider into dedicated columns. gateway_transaction_id, gateway_status, risk_score, auth_code — the list kept growing.

Every time we integrated a new provider, we'd run an ALTER TABLE to add columns that only applied to that one gateway. The payments table ballooned to 40+ columns, most of them nullable. Queries got confusing. New engineers couldn't tell which columns mattered for which provider. And the migration overhead was slowing down our integration timeline from days to weeks.

The fix was straightforward: keep the core transactional fields relational, and push everything provider-specific into a single JSONB column.

The Data Model

The idea is simple. Your payments table keeps the fields you actually query and join on — amount, currency, status, timestamps — as proper typed columns. Everything else goes into metadata.

Payment Table Schema — Relational + JSONB

Relational Columns (Indexed, Typed)

id (UUID PK)

amount (BIGINT)

currency (VARCHAR)

status (VARCHAR)

provider (VARCHAR)

merchant_id (UUID FK)

created_at (TIMESTAMPTZ)

updated_at (TIMESTAMPTZ)

↓

One row, one payment

JSONB metadata Column (Flexible, GIN-Indexed)

Stripe: charge_id, receipt_url, payment_method_details, radar_outcome

Adyen: pspReference, resultCode, refusalReason, additionalData

Merchant Custom: order_id, line_items[], customer_notes, internal_ref

This gives you the best of both worlds. The relational columns give you type safety, foreign keys, and fast B-tree index lookups. The JSONB column absorbs whatever shape of data each provider throws at you.

The Table Definition

CREATE TABLE payments (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    amount          BIGINT NOT NULL,
    currency        VARCHAR(3) NOT NULL,
    status          VARCHAR(20) NOT NULL DEFAULT 'pending',
    provider        VARCHAR(30) NOT NULL,
    merchant_id     UUID NOT NULL REFERENCES merchants(id),
    metadata        JSONB NOT NULL DEFAULT '{}',
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- GIN index for general JSONB containment queries
CREATE INDEX idx_payments_metadata ON payments USING GIN (metadata);

-- Partial index: fast lookup for Stripe charge IDs specifically
CREATE INDEX idx_payments_stripe_charge
    ON payments ((metadata->>'charge_id'))
    WHERE provider = 'stripe';

-- Partial index: Adyen PSP reference lookups
CREATE INDEX idx_payments_adyen_psp
    ON payments ((metadata->>'pspReference'))
    WHERE provider = 'adyen';

The GIN index supports the @> containment operator, which is your workhorse for JSONB queries. The partial indexes on extracted text values are for the high-frequency lookups you'll actually hit in production — like finding a payment by its gateway reference during reconciliation.

JSONB vs. the Alternatives

Before you reach for JSONB, it's worth understanding what you're trading off against the other common patterns.

Criteria	JSONB Column	Normalized Columns	EAV Pattern
Schema flexibility	High — no migrations for new fields	Low — ALTER TABLE per field	High — but at a steep cost
Query performance	Good with GIN/partial indexes	Best — native B-tree indexes	Poor — requires pivoting/joins
Type safety	Application-level enforcement	Database-enforced	None — everything is text
Storage efficiency	Good — no null columns wasted	Poor with many nullable cols	Poor — row per attribute
Reporting / analytics	Moderate — needs extraction	Best — direct column access	Painful — heavy pivoting
Integration speed	Fast — just store the response	Slow — migration per provider	Fast — but regret comes later
Ideal use case	Provider-varying metadata	Core business fields	Avoid in payments

The short version: use relational columns for anything you filter, join, or aggregate on regularly. Use JSONB for the rest. Don't use EAV in a payment system — I've seen it, and the reconciliation queries alone will make you question your career choices.

Query Patterns That Actually Work

Here are the queries we use daily. The containment operator @> is the one you'll reach for most often because it leverages the GIN index.

-- Find a payment by Stripe charge ID (uses partial index)
SELECT id, amount, status, metadata
FROM payments
WHERE provider = 'stripe'
  AND metadata->>'charge_id' = 'ch_3PqR7s2eZvKYlo2C';

-- Find all payments flagged as elevated risk by any provider
SELECT id, provider, amount, metadata->'risk'->>'score' AS risk_score
FROM payments
WHERE metadata @> '{"risk": {"level": "elevated"}}';

-- Aggregate merchant custom metadata — total by order source
SELECT metadata->>'order_source' AS source,
       COUNT(*) AS payment_count,
       SUM(amount) AS total_amount
FROM payments
WHERE merchant_id = '550e8400-e29b-41d4-a716-446655440000'
  AND metadata ? 'order_source'
GROUP BY metadata->>'order_source';

Reading JSONB in Go

On the application side, we unmarshal the JSONB column into a typed struct per provider. Here's the pattern we settled on after a few iterations.

package payment

import (
	"database/sql"
	"encoding/json"
	"fmt"
	"time"
)

type Payment struct {
	ID         string          `db:"id"`
	Amount     int64           `db:"amount"`
	Currency   string          `db:"currency"`
	Status     string          `db:"status"`
	Provider   string          `db:"provider"`
	MerchantID string          `db:"merchant_id"`
	Metadata   json.RawMessage `db:"metadata"`
	CreatedAt  time.Time       `db:"created_at"`
}

type StripeMetadata struct {
	ChargeID       string `json:"charge_id"`
	ReceiptURL     string `json:"receipt_url"`
	PaymentMethod  string `json:"payment_method"`
	RadarOutcome   string `json:"radar_outcome,omitempty"`
}

type AdyenMetadata struct {
	PSPReference  string `json:"pspReference"`
	ResultCode    string `json:"resultCode"`
	RefusalReason string `json:"refusalReason,omitempty"`
}

func (p *Payment) StripeData() (*StripeMetadata, error) {
	if p.Provider != "stripe" {
		return nil, fmt.Errorf("payment %s is not a stripe payment", p.ID)
	}
	var m StripeMetadata
	if err := json.Unmarshal(p.Metadata, &m); err != nil {
		return nil, fmt.Errorf("unmarshal stripe metadata: %w", err)
	}
	return &m, nil
}

// Storing a payment with metadata
func InsertPayment(db *sql.DB, p *Payment) error {
	query := `
		INSERT INTO payments (amount, currency, status, provider, merchant_id, metadata)
		VALUES ($1, $2, $3, $4, $5, $6)
		RETURNING id, created_at`

	return db.QueryRow(query,
		p.Amount, p.Currency, p.Status, p.Provider, p.MerchantID, p.Metadata,
	).Scan(&p.ID, &p.CreatedAt)
}

The key detail: we store Metadata as json.RawMessage on the base struct. This avoids deserializing it until we actually need the provider-specific fields. For high-throughput listing endpoints where you're just returning payment summaries, you skip the unmarshal entirely.

Performance in Practice

After migrating from the 40-column approach to the relational + JSONB model, we measured the impact across our production workload (roughly 2M payments, PostgreSQL 16 on RDS db.r6g.xlarge).

3.2ms

Avg query time with partial index on gateway reference lookups

38%

Table size reduction after dropping 26 nullable columns

Schema migrations needed for last 3 provider integrations

The GIN index does add write overhead — roughly 10-15% slower inserts compared to a table with no JSONB index. For our throughput (a few hundred TPS), that was negligible. If you're processing tens of thousands of TPS, benchmark the GIN index cost carefully and consider whether partial indexes alone are sufficient.

Migration Strategy

If you're sitting on a wide payments table and want to move to this model, here's the approach that worked for us without downtime.

Add the metadata JSONB DEFAULT '{}' column. This is a non-blocking operation in PostgreSQL — no table rewrite.
Deploy application code that writes to both the old columns and the new JSONB column (dual-write).
Run a backfill migration that copies existing column values into the JSONB field for historical rows. Do this in batches of 5,000-10,000 to avoid long-running transactions.
Switch reads to pull from JSONB. Verify with shadow reads if you're cautious.
Drop the old columns in a later release once you've confirmed everything is stable.

-- Backfill in batches (run in a loop from your migration script)
UPDATE payments
SET metadata = jsonb_build_object(
    'charge_id', stripe_charge_id,
    'receipt_url', stripe_receipt_url,
    'radar_outcome', stripe_radar_outcome
)
WHERE provider = 'stripe'
  AND metadata = '{}'
  AND id > $1  -- cursor-based pagination
ORDER BY id
LIMIT 10000;

When Not to Use JSONB

JSONB isn't a silver bullet. Keep data in proper columns when:

You need foreign key constraints referencing the value.
The field is used in WHERE clauses on the majority of your queries — a B-tree index on a typed column will always outperform a GIN index on a JSONB path.
You need CHECK constraints or NOT NULL enforcement at the database level.
The field is part of your core domain model, not provider-specific metadata.

A good rule of thumb: if you'd put the field in a WHERE clause in more than 30% of your queries against that table, it deserves its own column. Everything else is a JSONB candidate.

References

Disclaimer: This article reflects the author's personal experience and opinions. Product names, logos, and brands are property of their respective owners. Pricing and features mentioned are subject to change — always verify with official documentation.

The Problem with Rigid Schemas

The Data Model

The Table Definition

JSONB vs. the Alternatives

Query Patterns That Actually Work

Reading JSONB in Go

Performance in Practice

Migration Strategy

When Not to Use JSONB

References

Related Articles