Why Application-Level Logging Isn't Enough
Most teams start with application-level audit logging — an after_save callback in Rails, or a middleware in Go that logs mutations. This works until it doesn't:
- Database migrations that update transaction statuses bypass your application entirely.
- Manual SQL fixes during incidents — someone runs
UPDATE payments SET status = 'refunded'directly, and there's no record of it. - Background jobs that use raw SQL for performance skip your ORM callbacks.
- Multiple services writing to the same tables — each would need identical logging logic.
PostgreSQL triggers fire on every row change regardless of how it was made. They're the only way to guarantee complete coverage.
The Audit Table Schema
The audit table needs to capture enough context to reconstruct what happened, without storing so much data that it becomes a storage problem. Here's the schema we use:
CREATE TABLE audit_log (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
table_name TEXT NOT NULL,
record_id TEXT NOT NULL,
operation TEXT NOT NULL CHECK (operation IN ('INSERT','UPDATE','DELETE')),
old_data JSONB,
new_data JSONB,
changed_fields TEXT[],
actor_id TEXT,
actor_ip INET,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_audit_log_table_record ON audit_log (table_name, record_id);
CREATE INDEX idx_audit_log_created_at ON audit_log (created_at);
Key design decisions:
record_idis TEXT, not UUID or BIGINT — different tables use different primary key types. Casting to text is universal.changed_fieldsarray lets you quickly filter "show me all status changes" without parsing JSONB.actor_idandactor_ipcome from session variables set by the application (more on this below).- JSONB for old/new data — flexible, queryable, and handles schema evolution without migration headaches.
Building the Trigger Function
One trigger function handles all audited tables. It uses TG_TABLE_NAME and TG_OP to adapt to the context:
CREATE OR REPLACE FUNCTION audit_trigger_func()
RETURNS TRIGGER AS $$
DECLARE
old_json JSONB;
new_json JSONB;
changed TEXT[];
rid TEXT;
key TEXT;
BEGIN
IF TG_OP = 'DELETE' THEN
old_json := to_jsonb(OLD);
rid := OLD.id::TEXT;
ELSIF TG_OP = 'INSERT' THEN
new_json := to_jsonb(NEW);
rid := NEW.id::TEXT;
ELSIF TG_OP = 'UPDATE' THEN
old_json := to_jsonb(OLD);
new_json := to_jsonb(NEW);
rid := NEW.id::TEXT;
-- Compute changed fields
FOR key IN SELECT jsonb_object_keys(new_json)
LOOP
IF old_json->key IS DISTINCT FROM new_json->key THEN
changed := array_append(changed, key);
END IF;
END LOOP;
-- Skip if nothing actually changed
IF array_length(changed, 1) IS NULL THEN
RETURN NEW;
END IF;
END IF;
INSERT INTO audit_log (table_name, record_id, operation,
old_data, new_data, changed_fields,
actor_id, actor_ip)
VALUES (TG_TABLE_NAME, rid, TG_OP,
old_json, new_json, changed,
current_setting('app.actor_id', true),
current_setting('app.actor_ip', true)::INET);
RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
The no-op optimization matters. Payment tables get frequent UPDATE calls that don't actually change data — ORM save-all patterns, idempotent retries. Skipping these avoids audit log bloat. In our system, this cut audit volume by ~30%.
Attaching Triggers to Payment Tables
Attach the trigger to each table you want to audit. For payment systems, the critical tables are usually transactions, refunds, settlements, and merchant configurations:
CREATE TRIGGER audit_payments
AFTER INSERT OR UPDATE OR DELETE ON payments
FOR EACH ROW EXECUTE FUNCTION audit_trigger_func();
CREATE TRIGGER audit_refunds
AFTER INSERT OR UPDATE OR DELETE ON refunds
FOR EACH ROW EXECUTE FUNCTION audit_trigger_func();
CREATE TRIGGER audit_merchant_configs
AFTER INSERT OR UPDATE OR DELETE ON merchant_configs
FOR EACH ROW EXECUTE FUNCTION audit_trigger_func();
To pass actor context from your application, set session variables before each transaction:
-- In your Go/Ruby/Python code, at the start of each request:
SET LOCAL app.actor_id = 'user_abc123';
SET LOCAL app.actor_ip = '203.0.113.42';
SET LOCAL scopes the variable to the current transaction, so it's automatically cleaned up. No risk of leaking one user's identity into another request's audit entries.
Making the Audit Log Tamper-Evident
An audit log that can be silently modified isn't an audit log. For PCI DSS and SOC 2 compliance, you need to demonstrate that logs haven't been altered. Two approaches:
| Approach | How It Works | Trade-off |
|---|---|---|
| Append-only permissions | REVOKE UPDATE, DELETE on audit_log from the app role. Only a separate admin role can modify. | Simple but relies on role discipline |
| Hash chaining | Each row includes a SHA-256 hash of its content + the previous row's hash. Tampering breaks the chain. | Stronger guarantee but adds write latency |
| External replication | Stream audit rows to an immutable store (S3 with Object Lock, or a separate read-only replica). | Best for compliance but adds infrastructure |
We use append-only permissions as the baseline and replicate to S3 nightly for long-term retention:
-- Restrict the application role
REVOKE UPDATE, DELETE ON audit_log FROM app_role;
GRANT INSERT, SELECT ON audit_log TO app_role;
-- The trigger function runs as SECURITY DEFINER (superuser-owned),
-- so it can still INSERT even if the calling role is restricted.
Performance Considerations
Triggers add overhead to every write. On a payments table processing 5,000 transactions per second, that overhead matters. Here's what we measured and how we tuned it:
- Partition the audit table by month. Use PostgreSQL's native range partitioning on
created_at. This makes dropping old partitions instant instead of running slow DELETEs. - Store only changed fields for UPDATEs. Instead of storing the full old and new row, you can store only the diff. We chose full rows for easier debugging, but if storage is a concern, the diff approach cuts size by 60-70%.
- Use AFTER triggers, not BEFORE. AFTER triggers don't block the main transaction's write path. The audit INSERT happens after the row is committed, so it doesn't add to the critical path latency as much.
- Batch archive to cold storage. A nightly job exports partitions older than 90 days to Parquet files on S3, then drops the partition. Auditors get a Parquet reader; the database stays lean.
Querying the Audit Trail
Common queries during incident investigation and compliance audits:
-- Who changed this payment's status?
SELECT operation, old_data->>'status' AS old_status,
new_data->>'status' AS new_status,
actor_id, created_at
FROM audit_log
WHERE table_name = 'payments' AND record_id = 'pay_abc123'
AND 'status' = ANY(changed_fields)
ORDER BY created_at;
-- All manual changes (no actor = direct SQL)
SELECT * FROM audit_log
WHERE actor_id IS NULL
AND created_at > now() - INTERVAL '7 days'
ORDER BY created_at DESC;
-- Refund volume by actor (fraud investigation)
SELECT actor_id, COUNT(*) AS refund_count,
SUM((new_data->>'amount')::NUMERIC) AS total_refunded
FROM audit_log
WHERE table_name = 'refunds' AND operation = 'INSERT'
AND created_at > now() - INTERVAL '30 days'
GROUP BY actor_id
ORDER BY total_refunded DESC;
Lessons from Production
- Don't audit everything. Session tables, cache tables, and analytics staging tables generate noise. Audit only tables with financial or compliance significance.
- Monitor audit table growth. We set alerts when the audit table exceeds 50GB per partition. Unexpected growth usually means a runaway background job doing unnecessary updates.
- Test trigger behavior in CI. Write integration tests that verify audit rows are created for critical operations. We caught a migration that accidentally dropped a trigger — the test caught it before production.
- Include the "why" when possible. We added an
app.audit_reasonsession variable for manual operations. When an engineer runs a fix, they set the reason:SET LOCAL app.audit_reason = 'INCIDENT-1234: correcting duplicate charge'. - Plan for schema evolution. When you add columns to an audited table, the JSONB approach handles it automatically — new fields appear in
new_datawithout any trigger changes.
References
- PostgreSQL Documentation — Trigger Functions in PL/pgSQL
- PostgreSQL Documentation — Table Partitioning
- PCI Security Standards Council — Requirement 10: Log and Monitor Access
- PostgreSQL Wiki — Audit Trigger (9.1+)
Disclaimer: This article reflects the author's personal experience and opinions. Product names, logos, and brands are property of their respective owners. SQL examples are simplified for clarity — always test thoroughly in a staging environment before applying to production payment databases.