April 13, 2026 · 8 min read

Engineering Compliance Audit Trails in FinTech

Our first MAS audit was a disaster. The regulator asked for a complete trail of every balance change on a specific account over 18 months. We had logs, sure — scattered across Elasticsearch, application databases, and S3 archives. It took us three days to reconstruct the timeline. That's when we built a proper audit trail system.

What Regulators Actually Want

After going through audits with MAS (Monetary Authority of Singapore), external SOX auditors, and PCI DSS assessors, I can tell you the questions always boil down to the same four things: who performed the action, what exactly changed, when it happened, and can you prove nobody tampered with this record.

MAS TRM (Technology Risk Management) guidelines are explicit about this. Section 9.1 requires financial institutions to maintain audit trails for all critical system activities with sufficient detail to facilitate investigations. SOX Section 404 demands that internal controls over financial reporting are documented and verifiable. PCI DSS Requirement 10 goes even further — you need to track all access to network resources and cardholder data, with logs that are tamper-evident and retained for at least one year.

The common thread? Regulators don't care about your fancy observability stack. They want a single, authoritative source of truth that they can query, verify, and trust. Application logs in Elasticsearch won't cut it — those are mutable, deletable, and lack cryptographic integrity.

Lesson learned the hard way: During our MAS audit, the examiner specifically asked us to prove that a log entry from 14 months ago hadn't been modified after the fact. If you can't answer that question in under 5 minutes, your audit trail isn't compliant — it's just logging.

The Append-Only Event Log

The foundation of a proper audit trail is an append-only event log. No updates, no deletes — ever. Every state change in your system produces an immutable record. We store ours in PostgreSQL because it gives us ACID guarantees, strong typing, and the query flexibility auditors need.

Here's the schema we use in production:

CREATE TABLE audit_events (
    id              BIGSERIAL PRIMARY KEY,
    entity_type     VARCHAR(100)  NOT NULL,  -- 'account', 'transaction', 'user'
    entity_id       VARCHAR(255)  NOT NULL,
    action          VARCHAR(50)   NOT NULL,  -- 'balance_update', 'login', 'config_change'
    actor_id        VARCHAR(255)  NOT NULL,
    actor_ip        INET          NOT NULL,
    payload_before  JSONB,
    payload_after   JSONB,
    hash            CHAR(64)      NOT NULL,  -- SHA-256 hex
    prev_hash       CHAR(64)      NOT NULL,  -- chain link to previous event
    created_at      TIMESTAMPTZ   NOT NULL DEFAULT NOW()
);

-- Prevent any modifications
REVOKE UPDATE, DELETE ON audit_events FROM app_service;
GRANT INSERT, SELECT ON audit_events TO app_service;

The key detail is the permission model. The application service account can only INSERT and SELECT. No UPDATE, no DELETE. We also set up a separate database role for auditors with read-only access. The DBA role that could modify rows is locked behind MFA and requires two-person approval — and that access itself is audit-logged by our PAM system.
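A minimal version of that split looks like the following. The role names (`auditor_ro`, `jane_auditor`) are illustrative, not our actual production names:

```sql
-- Read-only group role for compliance and audit staff
CREATE ROLE auditor_ro NOLOGIN;
GRANT USAGE ON SCHEMA public TO auditor_ro;
GRANT SELECT ON audit_events TO auditor_ro;

-- Individual auditor logins inherit the group's read-only grants
CREATE ROLE jane_auditor LOGIN PASSWORD '...' IN ROLE auditor_ro;
```

Keeping the grants on a group role rather than individual logins makes the access control matrix easier to show an auditor: one role, one set of grants, a short list of members.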

Computing the Hash Chain in Go

Each event's hash field is a SHA-256 digest of its contents concatenated with the previous event's hash. This creates a chain — if anyone modifies a historical record, every subsequent hash breaks.

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
)

// computeAuditHash returns the SHA-256 hex digest of the event's fields
// concatenated with the previous event's hash. Feeding prevHash into the
// digest is what links each record to the one before it.
func computeAuditHash(event AuditEvent, prevHash string) string {
    // "|" is the field delimiter. If a payload could ever contain "|",
    // length-prefix the fields (or hash a canonical JSON encoding) instead.
    data := fmt.Sprintf("%s|%s|%s|%s|%s|%s|%s|%s",
        event.EntityType,
        event.EntityID,
        event.Action,
        event.ActorID,
        event.ActorIP,
        event.PayloadBefore,
        event.PayloadAfter,
        prevHash,
    )
    h := sha256.Sum256([]byte(data))
    return hex.EncodeToString(h[:])
}

We call this function inside a database transaction that first reads the newest event's hash (which becomes the new record's prev_hash) using SELECT ... FOR UPDATE to prevent race conditions. The serialization cost is real, roughly 2-3ms of added latency per write, but for audit events, correctness beats throughput every time.
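The transaction pattern is sketched below. This is a simplified outline against the schema above, with the hash computation happening in application code between the two statements; the very first event in an empty table uses the all-zeros genesis hash instead of a locked row:

```sql
BEGIN;

-- Lock the current chain tip so concurrent writers serialize here
SELECT hash
FROM audit_events
ORDER BY id DESC
LIMIT 1
FOR UPDATE;

-- Application computes the new hash from the row above, then appends
INSERT INTO audit_events (entity_type, entity_id, action, actor_id, actor_ip,
                          payload_before, payload_after, hash, prev_hash)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9);

COMMIT;
```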

Hash Chaining for Tamper Evidence

The hash chain is what turns your log table from "we promise we didn't change it" into "we can mathematically prove we didn't change it." Here's how the chain links together:

Event #1 — balance_update
    prev_hash: 000...000
    hash:      a3f...c81
        ↓
Event #2 — login
    prev_hash: a3f...c81   (hash of Event #1)
    hash:      7b2...e47
        ↓
Event #3 — config_change
    prev_hash: 7b2...e47   (hash of Event #2)
    hash:      f19...d53

Verification is straightforward: iterate through the chain, recompute each hash from the event data plus the previous hash, and compare. If any record was altered — even a single character in the payload — the computed hash won't match the stored one, and every event after it will also fail verification. We run this check nightly as a cron job and alert on any mismatch. In three years of production use, we've had zero false positives and caught one incident where a junior DBA accidentally ran an UPDATE against the wrong table in staging (which is how we discovered our staging permissions were too lax).
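The nightly check described above can be sketched as a small standalone program. It repeats the hashing function from earlier so the sketch compiles on its own; the struct fields mirror the table columns, and the genesis value is the all-zeros hash from the diagram:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// AuditEvent mirrors the audit_events columns that feed the hash.
type AuditEvent struct {
	EntityType, EntityID, Action, ActorID, ActorIP string
	PayloadBefore, PayloadAfter                    string
	Hash, PrevHash                                 string
}

// computeAuditHash links an event to its predecessor by digesting the
// event fields together with the previous event's hash.
func computeAuditHash(e AuditEvent, prevHash string) string {
	data := fmt.Sprintf("%s|%s|%s|%s|%s|%s|%s|%s",
		e.EntityType, e.EntityID, e.Action, e.ActorID, e.ActorIP,
		e.PayloadBefore, e.PayloadAfter, prevHash)
	h := sha256.Sum256([]byte(data))
	return hex.EncodeToString(h[:])
}

// verifyChain walks events in insertion order and returns the index of the
// first record whose stored hashes don't recompute, or -1 if the chain holds.
func verifyChain(events []AuditEvent, genesis string) int {
	prev := genesis
	for i, e := range events {
		if e.PrevHash != prev || computeAuditHash(e, prev) != e.Hash {
			return i
		}
		prev = e.Hash
	}
	return -1
}

func main() {
	genesis := "0000000000000000000000000000000000000000000000000000000000000000"

	e1 := AuditEvent{EntityType: "account", EntityID: "acct-42", Action: "balance_update",
		ActorID: "svc-ledger", ActorIP: "10.0.0.5",
		PayloadBefore: `{"balance":100}`, PayloadAfter: `{"balance":90}`, PrevHash: genesis}
	e1.Hash = computeAuditHash(e1, genesis)

	e2 := AuditEvent{EntityType: "user", EntityID: "u-7", Action: "login",
		ActorID: "u-7", ActorIP: "10.0.0.9", PrevHash: e1.Hash}
	e2.Hash = computeAuditHash(e2, e1.Hash)

	chain := []AuditEvent{e1, e2}
	fmt.Println(verifyChain(chain, genesis)) // -1: chain intact

	chain[0].PayloadAfter = `{"balance":9999}` // simulate tampering
	fmt.Println(verifyChain(chain, genesis)) // 0: first broken record
}
```

In production the loop streams rows ordered by id rather than holding the whole table in memory, but the comparison logic is the same.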

Retention and Archival

Keeping seven years of audit data in PostgreSQL is a terrible idea. Our production audit table generates about 2 million rows per month. At that volume, query performance degrades and storage costs balloon. We use a tiered retention strategy:

| Tier         | Retention window  | Storage      | Access            | Retrieval time      |
|--------------|-------------------|--------------|-------------------|---------------------|
| Hot storage  | 0 – 90 days       | PostgreSQL   | Full query access | Sub-second lookups  |
| Warm storage | 90 days – 2 years | S3 + Parquet | Athena queries    | Minutes to retrieve |
| Cold storage | 2 – 7 years       | S3 Glacier   | Deep archive      | Hours to retrieve   |

A nightly job exports audit events older than 90 days to Parquet files on S3, partitioned by year/month/entity_type. We include the hash chain in the export so integrity verification works across storage tiers. After two years, data moves to Glacier Deep Archive. The total cost for storing 7 years of audit data dropped from ~$2,400/month (all in PostgreSQL with provisioned IOPS) to ~$180/month with this tiered approach.
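The Glacier transition and the end-of-retention expiry can be enforced declaratively with an S3 lifecycle configuration rather than a custom job. A rough sketch, where the `audit/` prefix is illustrative and the day counts approximate 2 and 7 years:

```json
{
  "Rules": [
    {
      "ID": "audit-archive",
      "Status": "Enabled",
      "Filter": { "Prefix": "audit/" },
      "Transitions": [
        { "Days": 730, "StorageClass": "DEEP_ARCHIVE" }
      ],
      "Expiration": { "Days": 2555 }
    }
  ]
}
```

Putting the policy in the bucket configuration also gives the auditor something concrete to compare against your written retention policy.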

Querying Audit Trails at Scale

Even with 90 days of hot data, the audit table can hold 6 million+ rows. Without proper indexing, the compliance team's queries will time out and they'll start asking you uncomfortable questions.

We partition the table by month using PostgreSQL's native declarative partitioning:

CREATE TABLE audit_events (
    -- columns as above; note that on a partitioned table the primary key
    -- must include the partition key, e.g. PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

CREATE TABLE audit_events_2026_03 PARTITION OF audit_events
    FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');

CREATE TABLE audit_events_2026_04 PARTITION OF audit_events
    FOR VALUES FROM ('2026-04-01') TO ('2026-05-01');

On top of partitioning, we maintain covering indexes for the two most common audit queries: the full change history of a single entity, and everything a given actor did within a date range.
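A sketch of what those two indexes might look like; the index names and INCLUDE column choices here are illustrative, not our exact production definitions:

```sql
-- "Show every change to entity X" — covers the per-entity timeline scan
CREATE INDEX idx_audit_entity
    ON audit_events (entity_type, entity_id, created_at)
    INCLUDE (action, actor_id);

-- "What did actor Y touch between these dates?" — supports access reports
CREATE INDEX idx_audit_actor
    ON audit_events (actor_id, created_at)
    INCLUDE (entity_type, entity_id, action);
```

Created on the partitioned parent, these propagate automatically to each monthly partition.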

For recurring compliance reports (monthly transaction summaries, access frequency reports), we use materialized views that refresh on a schedule. The compliance team gets sub-second dashboard loads instead of waiting for expensive aggregation queries to run against the raw table.
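As an example of that pattern, a monthly activity rollup could look like the following. The view name and grouping are illustrative; the CONCURRENTLY refresh avoids blocking dashboard readers but requires a unique index on the view:

```sql
-- Monthly action counts per actor, refreshed nightly by a scheduled job
CREATE MATERIALIZED VIEW audit_monthly_activity AS
SELECT date_trunc('month', created_at) AS month,
       actor_id,
       action,
       count(*) AS events
FROM audit_events
GROUP BY 1, 2, 3;

CREATE UNIQUE INDEX ON audit_monthly_activity (month, actor_id, action);

REFRESH MATERIALIZED VIEW CONCURRENTLY audit_monthly_activity;
```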

Surviving the Audit

Building the system is half the battle. The other half is not panicking when the auditor shows up. Here's what we learned:

  1. Pre-build your report templates. Every audit we've been through asks for the same reports: user access logs for a date range, transaction trails for specific accounts, configuration change history, and privilege escalation events. We have parameterized SQL queries ready to go. The auditor asks, we run the query, export to CSV, done in minutes.
  2. Lock down the audit tables. Your application service account should never have UPDATE or DELETE on audit tables. Period. Document this in your access control matrix. Auditors love seeing explicit REVOKE statements.
  3. Use a dedicated read replica for auditors. Don't let auditors run ad-hoc queries against your production primary. Set up a read replica with a separate connection string. This protects production performance and gives auditors the freedom to run whatever queries they need without you hovering over their shoulder.
  4. Automate integrity checks. Run hash chain verification daily. Keep the verification logs. When the auditor asks "how do you know this data hasn't been tampered with?", you hand them 365 days of automated verification reports showing zero integrity violations.
  5. Document your retention policy. Write it down, get it signed by your compliance officer, and actually follow it. Auditors will check that your stated policy matches reality. If you say "7 years" but your Glacier lifecycle policy deletes after 5, that's a finding.
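As an example of item 1, a pre-built access-log report can be a single parameterized query. The action values here are illustrative placeholders; `$1` and `$2` are the requested date range:

```sql
-- User access log for a date range: $1 = start (inclusive), $2 = end (exclusive)
SELECT created_at, actor_id, actor_ip, action, entity_type, entity_id
FROM audit_events
WHERE action IN ('login', 'logout', 'permission_change')
  AND created_at >= $1
  AND created_at <  $2
ORDER BY created_at;
```

Keeping these as checked-in, reviewed files means the query the auditor sees is the same one you ran in the last audit.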

Pro tip: Create a "compliance runbook" — a step-by-step document that any engineer on your team can follow to pull standard audit reports. You don't want to be the single point of failure when the auditor calls and you're on vacation in Bali.

Disclaimer: This article reflects the author's personal experience and opinions. Product names, logos, and brands are property of their respective owners. Pricing and features mentioned are subject to change — always verify with official documentation.