Event Sourcing for Payment Systems — Why Your Transaction Log Is Your Most Valuable Asset
Most payment systems store only the current state of a transaction — a single row that gets mutated as the payment moves through its lifecycle. Event sourcing flips that model: every state change becomes an immutable event, and the current state is just a projection of those events. When money is involved, that distinction matters more than you might think.
Why Traditional CRUD Falls Short for Payments
In a typical CRUD-based payment system, you have a payments table with columns like status, amount, and updated_at. When a payment moves from AUTHORIZED to CAPTURED, you run an UPDATE statement. The old status is gone. If someone asks "when exactly was this payment authorized, and by which service?" you're digging through application logs and hoping they haven't been rotated.
I learned this the hard way during a reconciliation incident. A merchant disputed a batch of transactions, claiming they were captured without authorization. Our payments table showed them as CAPTURED — but we had no record of the authorization step. The application logs had rolled over. We spent three days reconstructing the timeline from gateway API logs and database WAL archives. That was the week I started taking event sourcing seriously.
The core problem with CRUD for payments comes down to three things:
- State mutations destroy history. An UPDATE overwrites the previous value. You lose the "who, when, and why" of every transition.
- Audit gaps are invisible. You don't know what you've lost until someone asks for it. By then, it's too late.
- Debugging production issues is archaeology. When a payment is stuck in a weird state, you need the full sequence of events that got it there — not just the final snapshot.
The Event Sourcing Model
Event sourcing is deceptively simple in concept. Instead of storing the current state of an entity, you store the sequence of events that led to that state. The current state is derived by replaying those events in order. That's it.
For a payment, this means you don't have a status column that gets updated. Instead, you have an append-only log of events: PaymentInitiated, PaymentAuthorized, PaymentCaptured, PaymentSettled. The "current status" is whatever the last event says it is. But you also know exactly when each transition happened, what data was associated with it, and you can reconstruct the state at any point in time.
Three principles make this work in practice:
- Events are immutable. Once written, an event is never modified or deleted. If something needs to be corrected, you append a compensating event (like PaymentRefunded or AuthorizationVoided).
- Events are the source of truth. The event log is the canonical data. Everything else — read models, dashboards, reports — is a projection derived from events.
- Order matters. Events have a sequence number within their aggregate (the payment). Replaying them out of order produces incorrect state.
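To make these principles concrete, here is a minimal sketch of deriving current state by folding over events in version order. The `Event` struct and `currentStatus` helper are pared-down stand-ins for illustration, assuming each event type maps to a single status:

```go
import "sort"

// Event is a simplified stand-in for the article's PaymentEvent.
type Event struct {
	Type    string
	Version int
}

// currentStatus derives a payment's status by replaying its events.
// Order matters: we sort by version (in place) before folding.
func currentStatus(events []Event) string {
	sort.Slice(events, func(i, j int) bool {
		return events[i].Version < events[j].Version
	})
	status := "UNKNOWN"
	for _, e := range events {
		switch e.Type {
		case "PaymentInitiated":
			status = "INITIATED"
		case "PaymentAuthorized":
			status = "AUTHORIZED"
		case "PaymentCaptured":
			status = "CAPTURED"
		case "PaymentSettled":
			status = "SETTLED"
		case "PaymentRefunded":
			status = "REFUNDED"
		case "PaymentFailed":
			status = "FAILED"
		}
	}
	return status
}
```

Even out-of-order input produces the right answer, because the fold sorts by version first.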
Key insight: In a payment system, your event log isn't just a technical pattern — it's your audit trail, your dispute resolution tool, and your debugging superpower, all in one. Every regulator, auditor, and on-call engineer will thank you for it.
Designing Your Payment Event Store
The event store is the heart of the system. I've found that a single PostgreSQL table works remarkably well for most payment platforms — you don't need a specialized event store database until you're processing millions of events per second.
Here's the schema I've used across two production systems:
CREATE TABLE payment_events (
event_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
payment_id UUID NOT NULL,
event_type VARCHAR(50) NOT NULL,
event_data JSONB NOT NULL,
version INTEGER NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
created_by VARCHAR(100) NOT NULL,
UNIQUE (payment_id, version)
);
CREATE INDEX idx_payment_events_payment_id
ON payment_events (payment_id, version ASC);
The UNIQUE (payment_id, version) constraint is critical. It gives you optimistic concurrency control for free — if two processes try to append an event with the same version for the same payment, one of them gets a unique violation and has to retry. No distributed locks needed.
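The retry loop that pairs with this constraint can be sketched independently of the database. Here `loadVersion` and `appendAt` are illustrative stand-ins for "read the payment's latest version" and "insert at version+1", and `ErrConcurrencyConflict` is the sentinel the store returns on a unique violation:

```go
import "errors"

// ErrConcurrencyConflict signals that a concurrent writer claimed the version first.
var ErrConcurrencyConflict = errors.New("concurrent append: version already taken")

// appendWithRetry re-reads the latest version and retries when a concurrent
// writer wins the race on UNIQUE (payment_id, version).
func appendWithRetry(
	loadVersion func() (int, error), // e.g. SELECT max(version) for the payment
	appendAt func(version int) error, // INSERT at this version
	maxAttempts int,
) error {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		current, err := loadVersion()
		if err != nil {
			return err
		}
		err = appendAt(current + 1)
		if err == nil {
			return nil
		}
		if !errors.Is(err, ErrConcurrencyConflict) {
			return err
		}
		// Someone else appended first; reload the version and try again.
	}
	return ErrConcurrencyConflict
}
```

Losing the race is expected and cheap: the loser simply re-reads and tries the next version number.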
Event Types in Go
On the application side, I model events as a base struct with typed payloads. This keeps serialization clean and makes it easy to add new event types without breaking existing code:
type PaymentEvent struct {
EventID uuid.UUID `json:"event_id"`
PaymentID uuid.UUID `json:"payment_id"`
Type EventType `json:"event_type"`
Data json.RawMessage `json:"event_data"`
Version int `json:"version"`
CreatedAt time.Time `json:"created_at"`
CreatedBy string `json:"created_by"`
}
type EventType string
const (
EventPaymentInitiated EventType = "PaymentInitiated"
EventPaymentAuthorized EventType = "PaymentAuthorized"
EventPaymentCaptured EventType = "PaymentCaptured"
EventPaymentSettled EventType = "PaymentSettled"
EventPaymentRefunded EventType = "PaymentRefunded"
EventPaymentFailed EventType = "PaymentFailed"
)
type PaymentInitiatedData struct {
Amount int64 `json:"amount"`
Currency string `json:"currency"`
MerchantID string `json:"merchant_id"`
CustomerID string `json:"customer_id"`
CardToken string `json:"card_token"`
IdempotencyKey string `json:"idempotency_key"`
}
type PaymentAuthorizedData struct {
GatewayRef string `json:"gateway_ref"`
AuthCode string `json:"auth_code"`
AVSResult string `json:"avs_result"`
CVVResult string `json:"cvv_result"`
}
Using json.RawMessage for the Data field means you can deserialize the base event without knowing the payload type, then unmarshal the payload based on Type. This is important when you're replaying events and need to handle types that might not exist in older versions of your code.
Appending Events
The append function is where optimistic concurrency kicks in:
// ErrConcurrencyConflict is a package-level sentinel, e.g.
// var ErrConcurrencyConflict = errors.New("payment version conflict")
func (s *EventStore) Append(ctx context.Context, event PaymentEvent) error {
	_, err := s.db.ExecContext(ctx, `
		INSERT INTO payment_events
			(event_id, payment_id, event_type, event_data, version, created_by)
		VALUES ($1, $2, $3, $4, $5, $6)`,
		event.EventID, event.PaymentID, event.Type,
		event.Data, event.Version, event.CreatedBy,
	)
	if err != nil {
		var pgErr *pgconn.PgError
		// 23505 is PostgreSQL's unique_violation: another writer already
		// claimed this (payment_id, version) pair.
		if errors.As(err, &pgErr) && pgErr.Code == "23505" {
			return ErrConcurrencyConflict
		}
		return fmt.Errorf("append event: %w", err)
	}
	return nil
}
Projections and Read Models
Storing events is only half the story. Your API endpoints and dashboards don't want to replay 50 events every time someone checks a payment status. That's where projections come in — materialized read models that are built by processing the event stream.
The simplest projection is a "current state" view. You subscribe to the event stream (or poll the event table) and maintain a denormalized payment_summary table:
func (p *PaymentProjection) Apply(event PaymentEvent) error {
	switch event.Type {
	case EventPaymentInitiated:
		var data PaymentInitiatedData
		if err := json.Unmarshal(event.Data, &data); err != nil {
			return err
		}
		// ExecContext returns (sql.Result, error); only the error matters here.
		_, err := p.db.ExecContext(p.ctx, `
			INSERT INTO payment_summary
				(payment_id, status, amount, currency, merchant_id, created_at)
			VALUES ($1, 'INITIATED', $2, $3, $4, $5)`,
			event.PaymentID, data.Amount, data.Currency,
			data.MerchantID, event.CreatedAt,
		)
		return err
	case EventPaymentAuthorized:
		_, err := p.db.ExecContext(p.ctx, `
			UPDATE payment_summary
			SET status = 'AUTHORIZED', updated_at = $2
			WHERE payment_id = $1`,
			event.PaymentID, event.CreatedAt,
		)
		return err
	case EventPaymentCaptured:
		_, err := p.db.ExecContext(p.ctx, `
			UPDATE payment_summary
			SET status = 'CAPTURED', updated_at = $2
			WHERE payment_id = $1`,
			event.PaymentID, event.CreatedAt,
		)
		return err
	}
	return nil
}
The key mental shift: the projection is disposable. If it gets corrupted or you need to change its schema, you drop the table, replay all events, and rebuild it from scratch. The events are the truth; the projection is just a convenient lens.
Balance Calculations
One of the most powerful projections in a payment system is a running balance. Instead of querying the events table with a SUM every time, you maintain a merchant_balances projection that updates incrementally as capture and refund events flow through:
// merchantID and amount come from the decoded event payload.
case EventPaymentCaptured:
	_, err := p.db.ExecContext(p.ctx, `
		UPDATE merchant_balances
		SET available_balance = available_balance + $2,
		    last_updated = $3
		WHERE merchant_id = $1`,
		merchantID, amount, event.CreatedAt,
	)
	return err
case EventPaymentRefunded:
	_, err := p.db.ExecContext(p.ctx, `
		UPDATE merchant_balances
		SET available_balance = available_balance - $2,
		    last_updated = $3
		WHERE merchant_id = $1`,
		merchantID, amount, event.CreatedAt,
	)
	return err
If the balance ever looks wrong, you don't debug the projection logic in production. You replay the events into a fresh table and diff the results. Nine times out of ten, the bug becomes obvious when you can see exactly which event produced the wrong state transition.
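The diff step can be sketched as a pure function. The map shapes and the one-directional comparison are simplifications for illustration; a full version would also report merchants present in the live projection but missing from the rebuild:

```go
import "fmt"

// diffBalances compares a live balance projection against one rebuilt from a
// replay and reports every merchant whose numbers disagree. Amounts are in
// minor units (cents), matching the int64 amounts used elsewhere.
func diffBalances(live, rebuilt map[string]int64) []string {
	var mismatches []string
	for merchant, want := range rebuilt {
		if got, ok := live[merchant]; !ok || got != want {
			mismatches = append(mismatches,
				fmt.Sprintf("%s: live=%d rebuilt=%d", merchant, live[merchant], want))
		}
	}
	return mismatches
}
```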
Replay and Recovery
This is where event sourcing really earns its keep. Replay — the ability to rebuild any projection from scratch by re-processing the event log — gives you capabilities that are simply impossible with CRUD.
Rebuilding state after a bug. We once shipped a projection bug that miscalculated settlement amounts for partial refunds. In a CRUD system, we'd have been writing migration scripts and manually reconciling data. With event sourcing, we fixed the projection code, dropped the settlement table, and replayed three months of events. The correct balances materialized in about 40 minutes. Zero manual intervention.
Time-travel debugging. When a merchant reports that a payment "disappeared," you can replay events for that payment ID up to any point in time and see exactly what state the system thought it was in. This has saved us countless hours in support escalations.
Retroactive features. Need a new report that shows average authorization-to-capture time by merchant? You don't need to start collecting that data from today. You replay your existing events through a new projection and get historical data going back to day one.
The replay function itself is straightforward:
func (s *EventStore) ReplayAll(ctx context.Context, handler func(PaymentEvent) error) error {
rows, err := s.db.QueryContext(ctx, `
SELECT event_id, payment_id, event_type, event_data,
version, created_at, created_by
FROM payment_events
ORDER BY created_at ASC, version ASC`)
if err != nil {
return err
}
defer rows.Close()
for rows.Next() {
var event PaymentEvent
if err := rows.Scan(
&event.EventID, &event.PaymentID, &event.Type,
&event.Data, &event.Version, &event.CreatedAt,
&event.CreatedBy,
); err != nil {
return err
}
if err := handler(event); err != nil {
return fmt.Errorf("replay event %s: %w", event.EventID, err)
}
}
return rows.Err()
}
In production, you'll want to batch this (process 1,000 events at a time), add progress logging, and support resuming from a checkpoint. But the core idea stays the same: read events in order, apply them to a projection.
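The batched, resumable variant can be sketched without committing to a storage layer. Here events are identified by an int64 position, and `fetchBatch`, `handle`, and `saveCheckpoint` are illustrative stand-ins for a `SELECT ... WHERE position > $1 ORDER BY position LIMIT $2`, the projection's Apply, and an UPDATE on a hypothetical projection_checkpoints row:

```go
import "fmt"

// replayFrom drives a batched replay that can resume from a checkpoint.
func replayFrom(
	start int64,
	batchSize int,
	fetchBatch func(after int64, limit int) ([]int64, error),
	handle func(pos int64) error,
	saveCheckpoint func(pos int64) error,
) error {
	cp := start
	for {
		batch, err := fetchBatch(cp, batchSize)
		if err != nil {
			return err
		}
		if len(batch) == 0 {
			return nil // caught up with the log
		}
		for _, pos := range batch {
			if err := handle(pos); err != nil {
				return fmt.Errorf("replay at position %d: %w", pos, err)
			}
			cp = pos
		}
		// Persist progress once per batch, not per event, to keep writes cheap.
		if err := saveCheckpoint(cp); err != nil {
			return err
		}
	}
}
```

If the process dies mid-replay, at worst the events since the last checkpoint are re-applied, so projection handlers should be idempotent.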
The Trade-offs Nobody Mentions
Event sourcing isn't free. After running it in production for a couple of years, here are the trade-offs that don't show up in the conference talks:
Eventual consistency is real. Your projections lag behind the event log. In practice, this lag is milliseconds — but it means a customer might submit a payment and not see it in their dashboard for a brief moment. You need to design your UI for this, either with optimistic updates or by reading from the event log directly for critical paths.
Storage grows linearly. Every state change is a new row. A payment that goes through initiation, authorization, capture, and settlement is four rows instead of one. For high-volume systems, this adds up. We partition our event table by month and archive older partitions to cold storage. PostgreSQL's native partitioning handles this well.
Schema evolution is tricky. When you need to add a field to PaymentInitiatedData, old events won't have it. Your projection code needs to handle both the old and new schema. I use a version field in the event data and explicit migration functions for each version bump. It's more work than adding a nullable column, but it keeps the event log honest.
Querying events directly is slow. You can't efficiently ask "show me all payments over $1,000 that were refunded last week" against the event table. That's what projections are for — but it means you need to think about your read patterns upfront and build projections for them.
Despite these trade-offs, I'd choose event sourcing for any new payment system. The audit trail alone justifies the complexity. When a regulator asks "show me every state change for this transaction," you hand them a query result instead of a forensic investigation. When a bug corrupts derived data, you fix the code and replay instead of writing one-off migration scripts at 2 AM. The upfront investment pays for itself the first time you avoid a multi-day reconciliation incident.
References
- Martin Fowler — Event Sourcing Pattern
- PostgreSQL Documentation — JSON Types (JSONB)
- Go Standard Library — encoding/json Package
- Go Standard Library — database/sql Package
Disclaimer: This article reflects the author's personal experience and opinions. Product names, logos, and brands are property of their respective owners. Code examples are simplified for clarity — always review and adapt for your specific use case and security requirements. This is not financial or legal advice.