Data Encryption at Rest in Payment Systems: A Practical Engineering Guide

Beyond "PCI Says So"

Let's get the obvious out of the way: yes, PCI DSS Requirement 3 mandates that stored cardholder data be rendered unreadable. But if compliance is your only motivation for encryption at rest, you're thinking about it wrong.

Encryption at rest is a breach containment strategy. When — not if — someone gets unauthorized access to your database backups, your disk snapshots, or your decommissioned hardware, encryption is the difference between a security incident and a catastrophic data breach. It's defense in depth. Your firewall rules, your access controls, your network segmentation — those are all important. But encryption at rest is the last line of defense when everything else has already failed.

I've seen teams treat encryption at rest as a checkbox exercise. They enable full-disk encryption on their EC2 volumes, tell the auditor "we encrypt at rest," and move on. That's not wrong, but it's incomplete. A compromised application with database credentials can read every plaintext row regardless of whether the underlying disk is encrypted. You need to think in layers.

The Three Layers of Encryption at Rest

There are three distinct layers where you can encrypt data at rest, and they protect against different threat models. Understanding which threats each layer addresses is the key to building a real encryption strategy rather than compliance theater.

Encryption Layers — Outside In

Layer 1: Full-Disk Encryption (FDE)

Encrypts the entire storage volume. Protects against physical theft, improper disk disposal, and unauthorized snapshot access. Transparent to applications — no code changes needed.

Layer 2: Transparent Data Encryption (TDE)

Database-level encryption of data files, tablespaces, or backups. Protects against unauthorized access to database files and backup tapes. Still transparent to SQL queries — the database engine handles encrypt/decrypt.

Layer 3: Application-Level Encryption (ALE)

Your application encrypts specific fields before they ever reach the database. Protects against compromised database credentials, SQL injection, rogue DBAs, and application-layer breaches. This is the only layer where a stolen database dump is still useless.

Here's the thing most teams miss: you usually need all three. FDE is table stakes — it's a cloud provider checkbox. TDE adds protection at the database layer. But only application-level encryption protects you when an attacker has valid database credentials, which is the most common real-world attack vector for payment data theft.

When to Use Which Layer

Approach	Protection Scope	Performance Impact	Key Management Complexity	PCI DSS Coverage
FDE	Physical media, snapshots	Negligible (<2%)	Low (cloud KMS handles it)	Partial (Req 3.4.1)
TDE	DB files, backups, WAL	Low (3–8%)	Moderate (DB key rotation)	Partial (Req 3.5)
ALE	Individual fields/columns	Moderate (5–15%)	High (per-field key strategy)	Full (Req 3.5, 3.6, 3.7)

For payment systems specifically, I recommend FDE as the baseline (just enable EBS encryption or equivalent), TDE if your database supports it natively, and ALE for any column containing PAN, CVV, or other sensitive cardholder data. The performance overhead of ALE is real but manageable if you're selective about what you encrypt — more on that later.

Key Management Is the Hard Part

I've never seen an encryption implementation fail because the cipher was weak. It's always the key management. Every single time. You can use AES-256-GCM with perfect implementation, and it means nothing if your encryption keys are sitting in a config file next to the encrypted data.

The industry has converged on a pattern called envelope encryption, and once you understand it, everything else clicks into place. The idea is simple: you have two tiers of keys. A Data Encryption Key (DEK) encrypts your actual data. A Key Encryption Key (KEK) encrypts the DEK. The KEK lives in a Hardware Security Module (HSM) or a managed KMS service and never leaves that boundary.

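Envelope Encryption Pattern (diagram): the plaintext (PAN / CVV) is encrypted with the DEK using AES-256-GCM to produce the ciphertext. The plaintext DEK is in turn encrypted with the KEK, which stays in the HSM / KMS, to produce the encrypted DEK that is stored with the data.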

Why this indirection? Because rotating the KEK is cheap — you just re-encrypt the DEKs, not all the data. And the KEK never leaves the HSM boundary, so even if your entire application server is compromised, the attacker gets encrypted DEKs they can't unwrap without access to the KMS.

In practice, AWS KMS, GCP Cloud KMS, or HashiCorp Vault's Transit engine all implement this pattern. The managed services handle the HSM-backed key storage, and you call their API to wrap/unwrap DEKs. Don't roll your own key hierarchy unless you have a very specific reason and a dedicated security team.

Envelope Encryption in Go

Here's a practical implementation of envelope encryption using AES-256-GCM. This is the pattern I've used in production payment services — generate a unique DEK per record, encrypt the data, then encrypt the DEK with your KEK (in practice, you'd call KMS for that last step).

package encryption

import (
    "crypto/aes"
    "crypto/cipher"
    "crypto/rand"
    "fmt"
    "io"
)

// EncryptedPayload holds the ciphertext and the wrapped DEK.
// Store both together — the encrypted DEK is useless without the KEK.
type EncryptedPayload struct {
    Ciphertext   []byte
    Nonce        []byte
    EncryptedDEK []byte
}

// GenerateDEK creates a random 256-bit data encryption key.
func GenerateDEK() ([]byte, error) {
    dek := make([]byte, 32) // 256 bits
    if _, err := io.ReadFull(rand.Reader, dek); err != nil {
        return nil, fmt.Errorf("generating DEK: %w", err)
    }
    return dek, nil
}

// Encrypt encrypts plaintext using AES-256-GCM with a fresh DEK.
// In production, replace wrapKey with a call to AWS KMS or Vault Transit.
func Encrypt(plaintext, kek []byte) (*EncryptedPayload, error) {
    dek, err := GenerateDEK()
    if err != nil {
        return nil, err
    }
    // Zero out the plaintext DEK on every return path, including errors.
    defer func() {
        for i := range dek {
            dek[i] = 0
        }
    }()

    // Encrypt the data with the DEK
    block, err := aes.NewCipher(dek)
    if err != nil {
        return nil, fmt.Errorf("creating cipher: %w", err)
    }

    gcm, err := cipher.NewGCM(block)
    if err != nil {
        return nil, fmt.Errorf("creating GCM: %w", err)
    }

    // A fresh random nonce per call; it is not secret and is stored with the ciphertext.
    nonce := make([]byte, gcm.NonceSize())
    if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
        return nil, fmt.Errorf("generating nonce: %w", err)
    }

    ciphertext := gcm.Seal(nil, nonce, plaintext, nil)

    // Wrap the DEK with the KEK (simplified; use KMS in production)
    encryptedDEK, err := wrapKey(dek, kek)
    if err != nil {
        return nil, fmt.Errorf("wrapping DEK: %w", err)
    }

    return &EncryptedPayload{
        Ciphertext:   ciphertext,
        Nonce:        nonce,
        EncryptedDEK: encryptedDEK,
    }, nil
}

// wrapKey encrypts the DEK using the KEK with AES-256-GCM.
// The returned slice is nonce || ciphertext || tag, so unwrapping must split off the nonce first.
// In production, this would be a KMS Encrypt API call.
func wrapKey(dek, kek []byte) ([]byte, error) {
    block, err := aes.NewCipher(kek)
    if err != nil {
        return nil, err
    }
    gcm, err := cipher.NewGCM(block)
    if err != nil {
        return nil, err
    }
    nonce := make([]byte, gcm.NonceSize())
    if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
        return nil, err
    }
    return gcm.Seal(nonce, nonce, dek, nil), nil
}

A few things to note: we generate a unique DEK per encryption operation. This limits the blast radius if a single DEK is compromised. We also zero out the plaintext DEK after use — it's not bulletproof in a garbage-collected language, but it reduces the window where the key sits in memory. And we use GCM mode, which provides both confidentiality and integrity (authenticated encryption). Never use ECB or plain CBC for payment data.
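The read path is the mirror image of Encrypt. The following is a minimal sketch of it, assuming the same payload layout and the same simplified local KEK as the code above; unwrapKey, newGCM, and Decrypt are illustrative names that do not appear in the original code, and in production the unwrap step would be a KMS Decrypt call rather than local AES-GCM.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"errors"
	"fmt"
)

// EncryptedPayload mirrors the struct produced by Encrypt.
type EncryptedPayload struct {
	Ciphertext   []byte
	Nonce        []byte
	EncryptedDEK []byte
}

// newGCM builds an AES-GCM AEAD from a 16-, 24-, or 32-byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// unwrapKey reverses wrapKey: the input is nonce || ciphertext || tag.
// Assumption: a local KEK; a real system would call KMS here instead.
func unwrapKey(wrapped, kek []byte) ([]byte, error) {
	gcm, err := newGCM(kek)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(wrapped) < n {
		return nil, errors.New("wrapped DEK too short")
	}
	return gcm.Open(nil, wrapped[:n], wrapped[n:], nil)
}

// Decrypt unwraps the DEK, then authenticates and decrypts the ciphertext.
// Any tampering with the ciphertext, nonce, or wrapped DEK yields an error.
func Decrypt(p *EncryptedPayload, kek []byte) ([]byte, error) {
	dek, err := unwrapKey(p.EncryptedDEK, kek)
	if err != nil {
		return nil, fmt.Errorf("unwrapping DEK: %w", err)
	}
	// Zero the plaintext DEK on every return path.
	defer func() {
		for i := range dek {
			dek[i] = 0
		}
	}()

	gcm, err := newGCM(dek)
	if err != nil {
		return nil, fmt.Errorf("creating GCM: %w", err)
	}
	// Open panics on a wrong-sized nonce, so validate it first.
	if len(p.Nonce) != gcm.NonceSize() {
		return nil, errors.New("invalid nonce length")
	}
	plaintext, err := gcm.Open(nil, p.Nonce, p.Ciphertext, nil)
	if err != nil {
		return nil, fmt.Errorf("decrypting payload: %w", err)
	}
	return plaintext, nil
}
```

Because GCM is authenticated, a failed Open should be treated as a hard error and never as "return whatever decrypted". Callers should also avoid putting the returned plaintext into error messages or logs.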

Column-Level Encryption: Be Selective

One of the most common questions I get is whether to encrypt entire database rows or individual columns. The answer for payment systems is almost always column-level. Here's why.

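You need to render the PAN (Primary Account Number) unreadable wherever it is stored; that's non-negotiable under PCI DSS. The CVV is a different case: PCI DSS prohibits storing it after authorization, even encrypted, so encryption only applies while it is held transiently before authorization. You probably don't need to encrypt the transaction timestamp, the merchant ID, or the currency code. Encrypting everything means you can't index or query those fields in the database, and your reporting pipeline grinds to a halt.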

The pattern I've used is: encrypt the sensitive columns (PAN, cardholder name) at the application layer before writing to the database. Store a truncated or hashed version alongside for lookups. The last four digits of the PAN, for example, are something PCI allows in the clear. This gives you the ability to search and filter without decrypting, while the actual sensitive data stays encrypted.

Tip: Store the encrypted DEK alongside the encrypted column data. A common schema pattern is: pan_encrypted BYTEA, pan_last4 VARCHAR(4), pan_dek_encrypted BYTEA, pan_key_version INT. The key version lets you track which KEK was used, which is critical for key rotation.
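To make the schema in the tip concrete, here is a rough sketch of how a row could be assembled in Go before it is written. CardRow, normalizePAN, Last4, and NewCardRow are hypothetical names chosen for illustration, and the 13 to 19 digit length check is an assumption about typical card numbers rather than something the schema requires.

```go
package main

import (
	"errors"
	"strings"
)

// CardRow mirrors the columns from the schema pattern in the tip.
type CardRow struct {
	PANEncrypted    []byte // pan_encrypted: AES-GCM ciphertext of the PAN
	PANLast4        string // pan_last4: searchable in the clear
	PANDEKEncrypted []byte // pan_dek_encrypted: DEK wrapped by the KEK
	PANKeyVersion   int    // pan_key_version: which KEK wrapped the DEK
}

// normalizePAN strips spaces and dashes and checks that only digits remain.
// Error messages deliberately never include the PAN itself.
func normalizePAN(pan string) (string, error) {
	digits := strings.NewReplacer(" ", "", "-", "").Replace(pan)
	// Assumption: typical card PANs are 13 to 19 digits long.
	if len(digits) < 13 || len(digits) > 19 {
		return "", errors.New("PAN has an unexpected length")
	}
	for _, r := range digits {
		if r < '0' || r > '9' {
			return "", errors.New("PAN must contain only digits")
		}
	}
	return digits, nil
}

// Last4 returns the last four digits, the part stored unencrypted for lookups.
func Last4(pan string) (string, error) {
	digits, err := normalizePAN(pan)
	if err != nil {
		return "", err
	}
	return digits[len(digits)-4:], nil
}

// NewCardRow combines already-encrypted material with the searchable suffix.
// The ciphertext and wrapped DEK are expected to come from the Encrypt step.
func NewCardRow(pan string, ciphertext, wrappedDEK []byte, keyVersion int) (CardRow, error) {
	last4, err := Last4(pan)
	if err != nil {
		return CardRow{}, err
	}
	if len(ciphertext) == 0 || len(wrappedDEK) == 0 {
		return CardRow{}, errors.New("ciphertext and wrapped DEK are required")
	}
	return CardRow{
		PANEncrypted:    ciphertext,
		PANLast4:        last4,
		PANDEKEncrypted: wrappedDEK,
		PANKeyVersion:   keyVersion,
	}, nil
}
```

An index on pan_last4 then supports "find cards ending in 1234" queries without any decryption.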

Key Rotation Without Downtime

Key rotation is a PCI requirement (Requirement 3.7.4), but doing it without downtime is the real engineering challenge. The pattern that works is what I call the "dual-read" approach.

When you rotate the KEK, you don't re-encrypt any data. The ciphertext and its DEK stay as they are; only the wrapped copy of the DEK changes. New writes wrap their DEKs with the new KEK version. For reads, your decryption path checks the key version column (pan_key_version in the schema above) and uses the matching KEK to unwrap the DEK. You keep the old KEK available for decryption, but not for new encryptions, until every existing DEK has been re-wrapped with the new KEK, either through a background migration or by re-wrapping on read.

This means that during the rotation window your read path accepts both the old and the new KEK version. No downtime, no big-bang migration. The background job gradually re-wraps everything, and once it's done, you can decommission the old KEK version.
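The re-wrap step can be sketched as follows. This is an illustration under stated assumptions, not a production design: Keyring is a hypothetical in-memory stand-in for versioned KEKs that would normally live in a KMS, and the wrapped format (nonce || ciphertext || tag) follows the wrapKey function shown earlier.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// Keyring is a hypothetical stand-in for versioned KEKs held in a KMS.
type Keyring struct {
	Current int            // version used for all new wraps
	KEKs    map[int][]byte // version -> 32-byte key; old versions stay readable
}

// aead returns an AES-GCM instance for the requested KEK version.
func (k *Keyring) aead(version int) (cipher.AEAD, error) {
	kek, ok := k.KEKs[version]
	if !ok {
		return nil, fmt.Errorf("unknown KEK version %d", version)
	}
	block, err := aes.NewCipher(kek)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// Wrap encrypts a DEK under the given KEK version.
func (k *Keyring) Wrap(dek []byte, version int) ([]byte, error) {
	gcm, err := k.aead(version)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, dek, nil), nil
}

// Unwrap decrypts a wrapped DEK using the KEK version recorded with the row.
func (k *Keyring) Unwrap(wrapped []byte, version int) ([]byte, error) {
	gcm, err := k.aead(version)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(wrapped) < n {
		return nil, fmt.Errorf("wrapped DEK too short")
	}
	return gcm.Open(nil, wrapped[:n], wrapped[n:], nil)
}

// Rewrap moves a wrapped DEK to the current KEK version. The record's
// ciphertext is untouched; only the wrapped DEK and key version change.
func (k *Keyring) Rewrap(wrapped []byte, version int) ([]byte, int, error) {
	if version == k.Current {
		return wrapped, version, nil // already up to date
	}
	dek, err := k.Unwrap(wrapped, version)
	if err != nil {
		return nil, 0, fmt.Errorf("unwrapping with v%d: %w", version, err)
	}
	defer func() {
		for i := range dek {
			dek[i] = 0
		}
	}()
	out, err := k.Wrap(dek, k.Current)
	if err != nil {
		return nil, 0, fmt.Errorf("wrapping with v%d: %w", k.Current, err)
	}
	return out, k.Current, nil
}
```

A background job or the read path would call Rewrap and then update pan_dek_encrypted and pan_key_version in one statement, so a row never pairs a wrapped DEK with the wrong version. With a managed KMS, the unwrap and wrap usually happen inside the service, so the plaintext DEK never reaches the application during rotation.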

Performance: Encrypt Only What You Must

AES-256-GCM on modern hardware with AES-NI instructions is fast — we're talking single-digit microseconds for a typical PAN-length payload. The real performance cost isn't the cipher itself; it's the KMS round-trip for DEK unwrapping. If you're calling AWS KMS for every decrypt operation, you're adding 5–15ms of network latency per call.

The solution is DEK caching. After unwrapping a DEK, cache it in memory (with a TTL) so subsequent decryptions of records using the same DEK don't require a KMS call. Keep in mind that with a strictly unique DEK per record, the cache only saves a KMS call when the same record is read again within the TTL. This is a tradeoff, since cached plaintext DEKs in memory are a risk, but with a short TTL (5–10 minutes) and proper memory handling, it's the pragmatic choice for high-throughput payment systems.

The other performance lever is being selective. Don't encrypt your entire transactions table. Encrypt the PAN column, the CVV (which you shouldn't be storing at all post-authorization, but that's another article), and the cardholder name. Leave the amount, currency, timestamp, and status in the clear so your indexes and queries work normally.

Common Mistakes That Will Get You Breached

Key Management Pitfalls — Don't Learn These the Hard Way

Storing keys next to encrypted data. If your encryption key is in the same database, S3 bucket, or config file as the encrypted data, you don't have encryption — you have obfuscation. Keys must live in a separate trust boundary (KMS, HSM, Vault).
Never rotating keys. A key that's been in use for three years has had three years of exposure. Rotate KEKs at least annually, and have the automation to do it without a maintenance window.
Using ECB mode. ECB encrypts identical plaintext blocks to identical ciphertext blocks. For structured data like PANs, this leaks patterns. Always use an authenticated mode like GCM or CCM.
Reusing nonces. With AES-GCM, a nonce/key pair must never repeat. Reusing a nonce with the same key breaks the authentication guarantee (an attacker can recover the authentication subkey and forge tags) and leaks the XOR of the two plaintexts. Use random nonces with unique-per-record DEKs to make this practically impossible.
Logging decrypted values. Your encryption is worthless if the plaintext PAN shows up in application logs, error messages, or stack traces. Audit every log path.

The most insidious mistake I've seen in production was a team that implemented AES-256 encryption perfectly — then stored the encryption key as an environment variable that got dumped into a crash report and shipped to their error tracking service. The encryption was technically flawless. The operational security around the key was nonexistent.

Disclaimer: This article reflects the author's personal experience and opinions. Product names, logos, and brands are property of their respective owners. Pricing and features mentioned are subject to change — always verify with official documentation.
