The Problem with Rigid Schemas
I spent the better part of 2024 working on a payment orchestration platform that routed transactions across four different gateways. Early on, we made the classic mistake: we tried to normalize every field from every provider into dedicated columns. gateway_transaction_id, gateway_status, risk_score, auth_code — the list kept growing.
Every time we integrated a new provider, we'd run an ALTER TABLE to add columns that only applied to that one gateway. The payments table ballooned to 40+ columns, most of them nullable. Queries got confusing. New engineers couldn't tell which columns mattered for which provider. And the migration overhead was slowing down our integration timeline from days to weeks.
The fix was straightforward: keep the core transactional fields relational, and push everything provider-specific into a single JSONB column.
The Data Model
The idea is simple. Your payments table keeps the fields you actually query and join on — amount, currency, status, timestamps — as proper typed columns. Everything else goes into metadata.
This gives you the best of both worlds. The relational columns give you type safety, foreign keys, and fast B-tree index lookups. The JSONB column absorbs whatever shape of data each provider throws at you.
The Table Definition
CREATE TABLE payments (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
amount BIGINT NOT NULL,
currency VARCHAR(3) NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'pending',
provider VARCHAR(30) NOT NULL,
merchant_id UUID NOT NULL REFERENCES merchants(id),
metadata JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- GIN index for general JSONB containment queries
CREATE INDEX idx_payments_metadata ON payments USING GIN (metadata);
-- Partial index: fast lookup for Stripe charge IDs specifically
CREATE INDEX idx_payments_stripe_charge
ON payments ((metadata->>'charge_id'))
WHERE provider = 'stripe';
-- Partial index: Adyen PSP reference lookups
CREATE INDEX idx_payments_adyen_psp
ON payments ((metadata->>'pspReference'))
WHERE provider = 'adyen';
The GIN index supports the @> containment operator, which is your workhorse for JSONB queries. The partial indexes on extracted text values are for the high-frequency lookups you'll actually hit in production — like finding a payment by its gateway reference during reconciliation.
JSONB vs. the Alternatives
Before you reach for JSONB, it's worth understanding what you're trading off against the other common patterns.
| Criteria | JSONB Column | Normalized Columns | EAV Pattern |
|---|---|---|---|
| Schema flexibility | High — no migrations for new fields | Low — ALTER TABLE per field | High — but at a steep cost |
| Query performance | Good with GIN/partial indexes | Best — native B-tree indexes | Poor — requires pivoting/joins |
| Type safety | Application-level enforcement | Database-enforced | None — everything is text |
| Storage efficiency | Good — no null columns wasted | Poor with many nullable cols | Poor — row per attribute |
| Reporting / analytics | Moderate — needs extraction | Best — direct column access | Painful — heavy pivoting |
| Integration speed | Fast — just store the response | Slow — migration per provider | Fast — but regret comes later |
| Ideal use case | Provider-varying metadata | Core business fields | Avoid in payments |
The short version: use relational columns for anything you filter, join, or aggregate on regularly. Use JSONB for the rest. Don't use EAV in a payment system — I've seen it, and the reconciliation queries alone will make you question your career choices.
Query Patterns That Actually Work
Here are the queries we use daily. The containment operator @> is the one you'll reach for most often because it leverages the GIN index.
-- Find a payment by Stripe charge ID (uses partial index)
SELECT id, amount, status, metadata
FROM payments
WHERE provider = 'stripe'
AND metadata->>'charge_id' = 'ch_3PqR7s2eZvKYlo2C';
-- Find all payments flagged as elevated risk by any provider
SELECT id, provider, amount, metadata->'risk'->>'score' AS risk_score
FROM payments
WHERE metadata @> '{"risk": {"level": "elevated"}}';
-- Aggregate merchant custom metadata — total by order source
SELECT metadata->>'order_source' AS source,
COUNT(*) AS payment_count,
SUM(amount) AS total_amount
FROM payments
WHERE merchant_id = '550e8400-e29b-41d4-a716-446655440000'
AND metadata ? 'order_source'
GROUP BY metadata->>'order_source';
Reading JSONB in Go
On the application side, we unmarshal the JSONB column into a typed struct per provider. Here's the pattern we settled on after a few iterations.
package payment
import (
"database/sql"
"encoding/json"
"fmt"
"time"
)
type Payment struct {
ID string `db:"id"`
Amount int64 `db:"amount"`
Currency string `db:"currency"`
Status string `db:"status"`
Provider string `db:"provider"`
MerchantID string `db:"merchant_id"`
Metadata json.RawMessage `db:"metadata"`
CreatedAt time.Time `db:"created_at"`
}
type StripeMetadata struct {
ChargeID string `json:"charge_id"`
ReceiptURL string `json:"receipt_url"`
PaymentMethod string `json:"payment_method"`
RadarOutcome string `json:"radar_outcome,omitempty"`
}
type AdyenMetadata struct {
PSPReference string `json:"pspReference"`
ResultCode string `json:"resultCode"`
RefusalReason string `json:"refusalReason,omitempty"`
}
func (p *Payment) StripeData() (*StripeMetadata, error) {
if p.Provider != "stripe" {
return nil, fmt.Errorf("payment %s is not a stripe payment", p.ID)
}
var m StripeMetadata
if err := json.Unmarshal(p.Metadata, &m); err != nil {
return nil, fmt.Errorf("unmarshal stripe metadata: %w", err)
}
return &m, nil
}
// Storing a payment with metadata
func InsertPayment(db *sql.DB, p *Payment) error {
query := `
INSERT INTO payments (amount, currency, status, provider, merchant_id, metadata)
VALUES ($1, $2, $3, $4, $5, $6)
RETURNING id, created_at`
return db.QueryRow(query,
p.Amount, p.Currency, p.Status, p.Provider, p.MerchantID, p.Metadata,
).Scan(&p.ID, &p.CreatedAt)
}
The key detail: we store Metadata as json.RawMessage on the base struct. This avoids deserializing it until we actually need the provider-specific fields. For high-throughput listing endpoints where you're just returning payment summaries, you skip the unmarshal entirely.
Performance in Practice
After migrating from the 40-column approach to the relational + JSONB model, we measured the impact across our production workload (roughly 2M payments, PostgreSQL 16 on RDS db.r6g.xlarge).
The GIN index does add write overhead — roughly 10-15% slower inserts compared to a table with no JSONB index. For our throughput (a few hundred TPS), that was negligible. If you're processing tens of thousands of TPS, benchmark the GIN index cost carefully and consider whether partial indexes alone are sufficient.
Migration Strategy
If you're sitting on a wide payments table and want to move to this model, here's the approach that worked for us without downtime.
- Add the
metadata JSONB DEFAULT '{}'column. This is a non-blocking operation in PostgreSQL — no table rewrite. - Deploy application code that writes to both the old columns and the new JSONB column (dual-write).
- Run a backfill migration that copies existing column values into the JSONB field for historical rows. Do this in batches of 5,000-10,000 to avoid long-running transactions.
- Switch reads to pull from JSONB. Verify with shadow reads if you're cautious.
- Drop the old columns in a later release once you've confirmed everything is stable.
-- Backfill in batches (run in a loop from your migration script)
UPDATE payments
SET metadata = jsonb_build_object(
'charge_id', stripe_charge_id,
'receipt_url', stripe_receipt_url,
'radar_outcome', stripe_radar_outcome
)
WHERE provider = 'stripe'
AND metadata = '{}'
AND id > $1 -- cursor-based pagination
ORDER BY id
LIMIT 10000;
When Not to Use JSONB
JSONB isn't a silver bullet. Keep data in proper columns when:
- You need foreign key constraints referencing the value.
- The field is used in
WHEREclauses on the majority of your queries — a B-tree index on a typed column will always outperform a GIN index on a JSONB path. - You need
CHECKconstraints orNOT NULLenforcement at the database level. - The field is part of your core domain model, not provider-specific metadata.
A good rule of thumb: if you'd put the field in a WHERE clause in more than 30% of your queries against that table, it deserves its own column. Everything else is a JSONB candidate.
References
- PostgreSQL Documentation — JSON Types
- PostgreSQL Documentation — JSON Functions and Operators
- PostgreSQL Documentation — GIN Indexes
- PostgreSQL Documentation — Partial Indexes
- PostgreSQL Documentation — ALTER TABLE
Disclaimer: This article reflects the author's personal experience and opinions. Product names, logos, and brands are property of their respective owners. Pricing and features mentioned are subject to change — always verify with official documentation.