April 8, 2026 · 10 min read

Event Sourcing for Payment Systems — Why Your Transaction Log Is Your Most Valuable Asset

Most payment systems store only the current state of a transaction — a single row that gets mutated as the payment moves through its lifecycle. Event sourcing flips that model: every state change becomes an immutable event, and the current state is just a projection of those events. When money is involved, that distinction matters more than you might think.

Why Traditional CRUD Falls Short for Payments

In a typical CRUD-based payment system, you have a payments table with columns like status, amount, and updated_at. When a payment moves from AUTHORIZED to CAPTURED, you run an UPDATE statement. The old status is gone. If someone asks "when exactly was this payment authorized, and by which service?" you're digging through application logs and hoping they haven't been rotated.

I learned this the hard way during a reconciliation incident. A merchant disputed a batch of transactions, claiming they were captured without authorization. Our payments table showed them as CAPTURED — but we had no record of the authorization step. The application logs had rolled over. We spent three days reconstructing the timeline from gateway API logs and database WAL archives. That was the week I started taking event sourcing seriously.

The core problem with CRUD for payments comes down to three things:

  1. Every UPDATE destroys the previous state, so history is gone the moment the row changes.
  2. There is no built-in audit trail, so disputes and compliance questions depend on whatever logging you bolted on.
  3. Debugging relies on application logs, which rotate away exactly when you need them most.

[Diagram: CRUD approach vs. event sourcing. CRUD: a payments table mutated in place (UPDATE status='AUTH', then UPDATE status='CAPT'), history lost. Event sourcing: PaymentInitiated → PaymentAuthorized → PaymentCaptured → PaymentSettled, full history preserved.]

The Event Sourcing Model

Event sourcing is deceptively simple in concept. Instead of storing the current state of an entity, you store the sequence of events that led to that state. The current state is derived by replaying those events in order. That's it.

For a payment, this means you don't have a status column that gets updated. Instead, you have an append-only log of events: PaymentInitiated, PaymentAuthorized, PaymentCaptured, PaymentSettled. The "current status" is whatever the last event says it is. But you also know exactly when each transition happened, what data was associated with it, and you can reconstruct the state at any point in time.

Three principles make this work in practice:

  1. Events are immutable. Once written, an event is never modified or deleted. If something needs to be corrected, you append a compensating event (like PaymentRefunded or AuthorizationVoided).
  2. Events are the source of truth. The event log is the canonical data. Everything else — read models, dashboards, reports — is a projection derived from events.
  3. Order matters. Events have a sequence number within their aggregate (the payment). Replaying them out of order produces incorrect state.
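
In code, that replay is just a fold over the event list. Here's a minimal in-memory sketch (the types and status strings are illustrative, not the store schema shown later):

```go
package main

import "fmt"

// Event is a minimal in-memory stand-in for a stored payment event.
type Event struct {
    Type    string
    Version int
}

// currentStatus derives the payment's state by replaying events in
// version order; the last event wins.
func currentStatus(events []Event) string {
    status := "UNKNOWN"
    for _, e := range events {
        switch e.Type {
        case "PaymentInitiated":
            status = "INITIATED"
        case "PaymentAuthorized":
            status = "AUTHORIZED"
        case "PaymentCaptured":
            status = "CAPTURED"
        case "PaymentSettled":
            status = "SETTLED"
        case "PaymentRefunded":
            status = "REFUNDED"
        }
    }
    return status
}

func main() {
    events := []Event{
        {Type: "PaymentInitiated", Version: 1},
        {Type: "PaymentAuthorized", Version: 2},
        {Type: "PaymentCaptured", Version: 3},
    }
    fmt.Println(currentStatus(events)) // CAPTURED
}
```

Truncate the slice at any version and you get the state as of that point in time, which is exactly the time-travel property discussed below.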

Key insight: In a payment system, your event log isn't just a technical pattern — it's your audit trail, your dispute resolution tool, and your debugging superpower, all in one. Every regulator, auditor, and on-call engineer will thank you for it.

Designing Your Payment Event Store

The event store is the heart of the system. I've found that a single PostgreSQL table works remarkably well for most payment platforms — you don't need a specialized event store database until your event volume outgrows what a single Postgres instance can absorb.

Here's the schema I've used across two production systems:

CREATE TABLE payment_events (
    event_id     UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    payment_id   UUID NOT NULL,
    event_type   VARCHAR(50) NOT NULL,
    event_data   JSONB NOT NULL,
    version      INTEGER NOT NULL,
    created_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
    created_by   VARCHAR(100) NOT NULL,
    UNIQUE (payment_id, version)
);

CREATE INDEX idx_payment_events_payment_id
    ON payment_events (payment_id, version ASC);

The UNIQUE (payment_id, version) constraint is critical. It gives you optimistic concurrency control for free — if two processes try to append an event with the same version for the same payment, one of them gets a unique violation and has to retry. No distributed locks needed.

Event Types in Go

On the application side, I model events as a base struct with typed payloads. This keeps serialization clean and makes it easy to add new event types without breaking existing code:

type PaymentEvent struct {
    EventID   uuid.UUID       `json:"event_id"`
    PaymentID uuid.UUID       `json:"payment_id"`
    Type      EventType       `json:"event_type"`
    Data      json.RawMessage `json:"event_data"`
    Version   int             `json:"version"`
    CreatedAt time.Time       `json:"created_at"`
    CreatedBy string          `json:"created_by"`
}

type EventType string

const (
    EventPaymentInitiated  EventType = "PaymentInitiated"
    EventPaymentAuthorized EventType = "PaymentAuthorized"
    EventPaymentCaptured   EventType = "PaymentCaptured"
    EventPaymentSettled    EventType = "PaymentSettled"
    EventPaymentRefunded   EventType = "PaymentRefunded"
    EventPaymentFailed     EventType = "PaymentFailed"
)

type PaymentInitiatedData struct {
    Amount         int64  `json:"amount"`
    Currency       string `json:"currency"`
    MerchantID     string `json:"merchant_id"`
    CustomerID     string `json:"customer_id"`
    CardToken      string `json:"card_token"`
    IdempotencyKey string `json:"idempotency_key"`
}

type PaymentAuthorizedData struct {
    GatewayRef   string `json:"gateway_ref"`
    AuthCode     string `json:"auth_code"`
    AVSResult    string `json:"avs_result"`
    CVVResult    string `json:"cvv_result"`
}

Using json.RawMessage for the Data field means you can deserialize the base event without knowing the payload type, then unmarshal the payload based on Type. This is important when you're replaying events and need to handle types that might not exist in older versions of your code.

Appending Events

The append function is where optimistic concurrency kicks in:

func (s *EventStore) Append(ctx context.Context, event PaymentEvent) error {
    _, err := s.db.ExecContext(ctx, `
        INSERT INTO payment_events
            (event_id, payment_id, event_type, event_data, version, created_by)
        VALUES ($1, $2, $3, $4, $5, $6)`,
        event.EventID, event.PaymentID, event.Type,
        event.Data, event.Version, event.CreatedBy,
    )
    if err != nil {
        var pgErr *pgconn.PgError
        if errors.As(err, &pgErr) && pgErr.Code == "23505" {
            return ErrConcurrencyConflict
        }
        return fmt.Errorf("append event: %w", err)
    }
    return nil
}
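
A caller handles ErrConcurrencyConflict by reloading the current version and retrying. Here's a toy sketch of that loop against an in-memory stand-in for the store (memStore and appendWithRetry are illustrative names, not part of the real system):

```go
package main

import (
    "errors"
    "fmt"
)

// ErrConcurrencyConflict mirrors the sentinel the store returns on a
// unique (payment_id, version) violation.
var ErrConcurrencyConflict = errors.New("concurrency conflict")

// memStore is an in-memory stand-in for the Postgres event store,
// enforcing the same unique-version rule.
type memStore struct {
    versions map[string]int // payment_id -> highest version seen
}

func (s *memStore) Append(paymentID string, version int) error {
    if version <= s.versions[paymentID] {
        return ErrConcurrencyConflict
    }
    s.versions[paymentID] = version
    return nil
}

// appendWithRetry re-reads the current version and retries on conflict:
// the optimistic-concurrency loop a real caller would run.
func appendWithRetry(s *memStore, paymentID string, maxRetries int) (int, error) {
    for i := 0; i < maxRetries; i++ {
        next := s.versions[paymentID] + 1 // re-read current version
        if err := s.Append(paymentID, next); err != nil {
            if errors.Is(err, ErrConcurrencyConflict) {
                continue // someone appended first; reload and retry
            }
            return 0, err
        }
        return next, nil
    }
    return 0, fmt.Errorf("gave up after %d retries", maxRetries)
}

func main() {
    s := &memStore{versions: map[string]int{}}
    v, _ := appendWithRetry(s, "pay_123", 3)
    fmt.Println(v) // 1
}
```

Against the real store, the "re-read" step is a SELECT MAX(version) for the payment, and each retry re-validates any business rules before re-appending.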

Projections and Read Models

Storing events is only half the story. Your API endpoints and dashboards don't want to replay 50 events every time someone checks a payment status. That's where projections come in — materialized read models that are built by processing the event stream.

The simplest projection is a "current state" view. You subscribe to the event stream (or poll the event table) and maintain a denormalized payment_summary table:

func (p *PaymentProjection) Apply(ctx context.Context, event PaymentEvent) error {
    switch event.Type {
    case EventPaymentInitiated:
        var data PaymentInitiatedData
        if err := json.Unmarshal(event.Data, &data); err != nil {
            return err
        }
        _, err := p.db.ExecContext(ctx, `
            INSERT INTO payment_summary
                (payment_id, status, amount, currency, merchant_id, created_at)
            VALUES ($1, 'INITIATED', $2, $3, $4, $5)`,
            event.PaymentID, data.Amount, data.Currency,
            data.MerchantID, event.CreatedAt,
        )
        return err

    case EventPaymentAuthorized:
        _, err := p.db.ExecContext(ctx, `
            UPDATE payment_summary
            SET status = 'AUTHORIZED', updated_at = $2
            WHERE payment_id = $1`,
            event.PaymentID, event.CreatedAt,
        )
        return err

    case EventPaymentCaptured:
        _, err := p.db.ExecContext(ctx, `
            UPDATE payment_summary
            SET status = 'CAPTURED', updated_at = $2
            WHERE payment_id = $1`,
            event.PaymentID, event.CreatedAt,
        )
        return err
    }
    return nil
}

The key mental shift: the projection is disposable. If it gets corrupted or you need to change its schema, you drop the table, replay all events, and rebuild it from scratch. The events are the truth; the projection is just a convenient lens.
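
The "drop and replay" idea fits in a few lines. This in-memory sketch (illustrative types, a map standing in for the summary table) shows a projection being rebuilt from the log:

```go
package main

import "fmt"

// Event is a minimal stand-in for a stored payment event.
type Event struct {
    PaymentID string
    Type      string
}

// Projection is a disposable read model: payment_id -> status.
type Projection map[string]string

func (p Projection) Apply(e Event) {
    switch e.Type {
    case "PaymentInitiated":
        p[e.PaymentID] = "INITIATED"
    case "PaymentCaptured":
        p[e.PaymentID] = "CAPTURED"
    }
}

// Rebuild throws the projection away and replays every event in order:
// the in-memory analogue of dropping the table and replaying.
func Rebuild(events []Event) Projection {
    p := Projection{}
    for _, e := range events {
        p.Apply(e)
    }
    return p
}

func main() {
    log := []Event{
        {PaymentID: "pay_1", Type: "PaymentInitiated"},
        {PaymentID: "pay_1", Type: "PaymentCaptured"},
        {PaymentID: "pay_2", Type: "PaymentInitiated"},
    }
    fresh := Rebuild(log)
    fmt.Println(fresh["pay_1"], fresh["pay_2"]) // CAPTURED INITIATED
}
```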

[Diagram: payment event flow. 1 Initiated (customer submits) → 2 Authorized (gateway approves) → 3 Captured (funds reserved) → 4 Settled (money transferred) → 5 Projected (read model updated). Each step appends an immutable event; projections materialize the current state for queries.]

Balance Calculations

One of the most powerful projections in a payment system is a running balance. Instead of querying the events table with a SUM every time, you maintain a merchant_balances projection that updates incrementally as capture and refund events flow through:

case EventPaymentCaptured:
    // merchantID and amount are unmarshaled from event.Data first.
    _, err := p.db.ExecContext(ctx, `
        UPDATE merchant_balances
        SET available_balance = available_balance + $2,
            last_updated = $3
        WHERE merchant_id = $1`,
        merchantID, amount, event.CreatedAt,
    )
    return err

case EventPaymentRefunded:
    _, err := p.db.ExecContext(ctx, `
        UPDATE merchant_balances
        SET available_balance = available_balance - $2,
            last_updated = $3
        WHERE merchant_id = $1`,
        merchantID, amount, event.CreatedAt,
    )
    return err

If the balance ever looks wrong, you don't debug the projection logic in production. You replay the events into a fresh table and diff the results. Nine times out of ten, the bug becomes obvious when you can see exactly which event produced the wrong state transition.

Replay and Recovery

This is where event sourcing really earns its keep. Replay — the ability to rebuild any projection from scratch by re-processing the event log — gives you capabilities that are simply impossible with CRUD.

Rebuilding state after a bug. We once shipped a projection bug that miscalculated settlement amounts for partial refunds. In a CRUD system, we'd have been writing migration scripts and manually reconciling data. With event sourcing, we fixed the projection code, dropped the settlement table, and replayed three months of events. The correct balances materialized in about 40 minutes. Zero manual intervention.

Time-travel debugging. When a merchant reports that a payment "disappeared," you can replay events for that payment ID up to any point in time and see exactly what state the system thought it was in. This has saved us countless hours in support escalations.

Retroactive features. Need a new report that shows average authorization-to-capture time by merchant? You don't need to start collecting that data from today. You replay your existing events through a new projection and get historical data going back to day one.

The replay function itself is straightforward:

func (s *EventStore) ReplayAll(ctx context.Context, handler func(PaymentEvent) error) error {
    rows, err := s.db.QueryContext(ctx, `
        SELECT event_id, payment_id, event_type, event_data,
               version, created_at, created_by
        FROM payment_events
        ORDER BY created_at ASC, version ASC`)
    if err != nil {
        return err
    }
    defer rows.Close()

    for rows.Next() {
        var event PaymentEvent
        if err := rows.Scan(
            &event.EventID, &event.PaymentID, &event.Type,
            &event.Data, &event.Version, &event.CreatedAt,
            &event.CreatedBy,
        ); err != nil {
            return err
        }
        if err := handler(event); err != nil {
            return fmt.Errorf("replay event %s: %w", event.EventID, err)
        }
    }
    return rows.Err()
}

In production, you'll want to batch this (process 1,000 events at a time), add progress logging, and support resuming from a checkpoint. But the core idea stays the same: read events in order, apply them to a projection.
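
Here's a toy sketch of that batching-with-checkpoint shape, with events reduced to bare sequence numbers for brevity (the function and its signature are invented for illustration):

```go
package main

import "fmt"

// replayFrom processes events in fixed-size batches starting at a saved
// checkpoint, returning the new checkpoint so a crashed replay can resume.
func replayFrom(events []int, checkpoint, batchSize int, handle func(int)) int {
    for i := checkpoint; i < len(events); i += batchSize {
        end := i + batchSize
        if end > len(events) {
            end = len(events)
        }
        for _, e := range events[i:end] {
            handle(e)
        }
        // In production, persist the checkpoint after each batch,
        // ideally in the same transaction as the projection writes.
        checkpoint = end
    }
    return checkpoint
}

func main() {
    events := []int{1, 2, 3, 4, 5}
    processed := 0
    cp := replayFrom(events, 0, 2, func(int) { processed++ })
    fmt.Println(processed, cp) // 5 5
}
```

Resuming after a crash is then a matter of reading the stored checkpoint and calling the same function with it.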

The Trade-offs Nobody Mentions

Event sourcing isn't free. After running it in production for a couple of years, here are the trade-offs that don't show up in the conference talks:

Eventual consistency is real. Your projections lag behind the event log. In practice, this lag is milliseconds — but it means a customer might submit a payment and not see it in their dashboard for a brief moment. You need to design your UI for this, either with optimistic updates or by reading from the event log directly for critical paths.

Storage grows linearly. Every state change is a new row. A payment that goes through initiation, authorization, capture, and settlement is four rows instead of one. For high-volume systems, this adds up. We partition our event table by month and archive older partitions to cold storage. PostgreSQL's native partitioning handles this well.

Schema evolution is tricky. When you need to add a field to PaymentInitiatedData, old events won't have it. Your projection code needs to handle both the old and new schema. I use a version field in the event data and explicit migration functions for each version bump. It's more work than adding a nullable column, but it keeps the event log honest.

Querying events directly is slow. You can't efficiently ask "show me all payments over $1,000 that were refunded last week" against the event table. That's what projections are for — but it means you need to think about your read patterns upfront and build projections for them.

Aspect                CRUD                        Event Sourcing
Audit trail           Requires separate logging   Built-in by design
Debugging             Current state only          Full history, time-travel
Storage               One row per entity          N rows per entity (grows)
Consistency           Immediate (strong)          Eventual for read models
Schema changes        ALTER TABLE migration       Event versioning required
Bug recovery          Manual data fixes           Fix code, replay events
Complexity            Low (familiar pattern)      Higher (projections, replay)
Retroactive reports   Only from collection date   Replay from day one

Despite these trade-offs, I'd choose event sourcing for any new payment system. The audit trail alone justifies the complexity. When a regulator asks "show me every state change for this transaction," you hand them a query result instead of a forensic investigation. When a bug corrupts derived data, you fix the code and replay instead of writing one-off migration scripts at 2 AM. The upfront investment pays for itself the first time you avoid a multi-day reconciliation incident.

Disclaimer: This article reflects the author's personal experience and opinions. Product names, logos, and brands are property of their respective owners. Code examples are simplified for clarity — always review and adapt for your specific use case and security requirements. This is not financial or legal advice.