The $43,000 Lesson
It started with a simple POST to /v1/charges. The client sent a payment request, our gateway processed it, Stripe debited the card — and then the response timed out. The client's retry logic kicked in, sent the same request again, and Stripe happily charged the card a second time. Different request, same intent, two charges.
We didn't catch it until Monday morning when the finance team flagged 127 duplicate transactions from the weekend. Some customers had been charged three or four times. The support queue exploded. It took us two days to identify every duplicate, issue refunds, and send apology emails. All because we didn't have idempotency on our payment endpoint.
That was the last time. Here's the pattern I've used on every payment API since.
What Idempotency Actually Means
An idempotent operation produces the same result no matter how many times you execute it. For a payment API, that means: if a client sends the same charge request twice (or ten times), the customer gets charged exactly once, and every response returns the same result.
This isn't optional for payment systems. Networks are unreliable. Clients retry. Load balancers re-route. Mobile apps lose connectivity mid-request. If your payment endpoint isn't idempotent, you will double-charge someone. It's a matter of when, not if.
Key principle: Every mutating payment endpoint (charges, refunds, transfers) must be idempotent. GET requests are naturally idempotent. POST requests that create resources or trigger side effects are where things go wrong.
Idempotency Key Design
The mechanism is straightforward: the client sends a unique key with each request. The server uses that key to detect duplicates. If it's seen the key before, it returns the cached response instead of processing again.
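To make the mechanism concrete, here is a minimal in-memory sketch. The `Store` and `ChargeResult` types are hypothetical names for illustration, and a mutex-guarded map stands in for the durable store discussed later, so treat it as a toy rather than something to deploy.

```go
package main

import (
	"fmt"
	"sync"
)

// ChargeResult is a hypothetical stored response for a charge request.
type ChargeResult struct {
	ChargeID string
	Amount   int64 // minor units, e.g. cents
}

// Store remembers the first result produced for each idempotency key.
// A map guarded by a mutex is for illustration only; it is not durable.
type Store struct {
	mu      sync.Mutex
	results map[string]ChargeResult
	charges int // how many times the card was actually charged
}

func NewStore() *Store {
	return &Store{results: make(map[string]ChargeResult)}
}

// Charge runs the side effect at most once per key and replays the stored
// result for every later call that carries the same key.
func (s *Store) Charge(key string, amount int64) (ChargeResult, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()

	if prev, ok := s.results[key]; ok {
		return prev, true // duplicate: replay, do not charge again
	}
	s.charges++ // stands in for the real call to the payment processor
	res := ChargeResult{ChargeID: fmt.Sprintf("ch_%d", s.charges), Amount: amount}
	s.results[key] = res
	return res, false
}

func main() {
	s := NewStore()
	first, _ := s.Charge("key-1", 4300)
	retry, replayed := s.Charge("key-1", 4300)
	fmt.Println(first == retry, replayed, s.charges) // true true 1
}
```

Holding the lock across the whole operation serializes every charge, which is fine for a demonstration but not for a real gateway.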
I've settled on UUID v4 for idempotency keys, generated client-side. Here's why:
- Client-generated: the client knows its intent. If it's retrying the same operation, it sends the same key. If it's a new operation, it generates a new key. Server-generated keys can't distinguish retries from new requests.
- UUID v4: 122 bits of randomness means collision probability is negligible. I've processed over 80 million payment requests and never seen a collision. Some teams use composite keys like {merchant_id}:{order_id}, which works too, but UUIDs are simpler and universal.
- Passed via header: Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000. Keep it out of the request body so your idempotency logic stays decoupled from your business logic.
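On the client side, the three points above might look like the following sketch. It builds a version 4 UUID from `crypto/rand` using only the standard library; the helper names and the `api.example.com` URL are placeholders, not part of any real API.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"net/http"
	"strings"
)

// newIdempotencyKey returns a random (version 4) UUID string.
func newIdempotencyKey() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// newChargeRequest builds the POST for one attempt. Every retry of the same
// operation must pass the same key; a new operation gets a new key.
func newChargeRequest(url, key, body string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, url, strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Idempotency-Key", key)
	return req, nil
}

func main() {
	key, err := newIdempotencyKey()
	if err != nil {
		panic(err)
	}
	// Two attempts at the same operation share one key.
	for attempt := 1; attempt <= 2; attempt++ {
		req, err := newChargeRequest("https://api.example.com/v1/charges", key, `{"amount":4300}`)
		if err != nil {
			panic(err)
		}
		fmt.Println(attempt, req.Header.Get("Idempotency-Key"))
	}
}
```

The key is generated once, outside the retry loop. Generating it inside the loop would give each retry a fresh key and defeat the whole mechanism.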
The Idempotency Check Flow
Every payment request follows this decision tree. The key insight is that the lookup happens before any business logic or external API calls.
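[Flowchart: look up the idempotency key. If the key already exists, is it completed or in-progress? A completed key returns the cached response; an in-progress key returns 409 Conflict. If the key is new, store it as "in-progress", process the request (call PSP), mark it "completed", and return the response to the client.]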
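The decision tree can be written as one small function, which is a useful way to check that every branch is covered. This is a sketch with hypothetical `Record` and `Action` types; it looks only at the key's status and leaves the request-hash comparison aside.

```go
package main

import "fmt"

// Action is what the server should do after looking up the key.
type Action string

const (
	Process  Action = "process"  // new key: mark in-progress, call the PSP, mark completed, respond
	Replay   Action = "replay"   // completed key: return the stored response
	Conflict Action = "conflict" // in-progress key: answer 409 Conflict
)

// Record is the stored state for a key; only the status matters here.
type Record struct {
	Status string // "in_progress" or "completed"
}

// decide maps the lookup result to the next step.
// rec is nil when the key has never been seen.
func decide(rec *Record) Action {
	switch {
	case rec == nil:
		return Process
	case rec.Status == "completed":
		return Replay
	default:
		return Conflict
	}
}

func main() {
	fmt.Println(decide(nil))                            // process
	fmt.Println(decide(&Record{Status: "in_progress"})) // conflict
	fmt.Println(decide(&Record{Status: "completed"}))   // replay
}
```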
PostgreSQL Implementation — The Foundation
Your idempotency store needs to be durable. If the server crashes after processing a payment but before returning the response, the client will retry — and you need that stored result. PostgreSQL is my go-to for this because you get ACID guarantees and the ON CONFLICT clause makes upserts clean.
CREATE TABLE idempotency_keys (
    key           UUID PRIMARY KEY,
    status        TEXT NOT NULL DEFAULT 'in_progress',
    request_path  TEXT NOT NULL,
    request_hash  BYTEA NOT NULL,
    response_code INTEGER,
    response_body JSONB,
    created_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
    completed_at  TIMESTAMPTZ,
    expires_at    TIMESTAMPTZ NOT NULL DEFAULT now() + INTERVAL '24 hours'
);
CREATE INDEX idx_idempotency_expires ON idempotency_keys (expires_at)
    WHERE status = 'in_progress';
-- Attempt to claim the key. Returns the new row if it was inserted (new request).
-- Returns no rows if the key was already present; in that case, read the
-- existing row with a separate SELECT.
INSERT INTO idempotency_keys (key, request_path, request_hash)
VALUES ($1, $2, $3)
ON CONFLICT (key) DO NOTHING
RETURNING *;
The request_hash column is critical. If a client reuses an idempotency key with a different request body, that's a bug on their end, not a retry. I hash the request body with SHA-256 and compare it against the stored hash. If the hashes don't match, I return a 422 Unprocessable Entity explaining the mismatch. Stripe performs the same check, though it responds with a 400.
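The comparison itself is only a few lines. A minimal sketch, with `fingerprint` and `sameRequest` as illustrative helper names:

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// fingerprint returns the SHA-256 digest of the raw request body.
func fingerprint(body []byte) []byte {
	sum := sha256.Sum256(body)
	return sum[:]
}

// sameRequest reports whether a retry carries the same body that the key
// was first used with. stored is the digest saved with the key.
func sameRequest(stored, body []byte) bool {
	return bytes.Equal(stored, fingerprint(body))
}

func main() {
	stored := fingerprint([]byte(`{"amount":4300,"currency":"usd"}`))

	fmt.Println(sameRequest(stored, []byte(`{"amount":4300,"currency":"usd"}`))) // true
	fmt.Println(sameRequest(stored, []byte(`{"amount":9900,"currency":"usd"}`))) // false
}
```

Because this hashes raw bytes, two bodies that are semantically equal JSON but differ in whitespace or field order will not match. That is usually acceptable for retries, since a retrying client resends the same bytes, but it is worth knowing before you debug a surprising 422.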
Go Middleware for Idempotency
The idempotency check wraps your payment handler as middleware. This keeps the logic decoupled — your handler doesn't need to know about idempotency at all.
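import (
	"bytes"
	"context"
	"crypto/sha256"
	"database/sql"
	"errors"
	"io"
	"log"
	"net/http"
)
// IdempotencyRecord mirrors the columns read back for an existing key.
type IdempotencyRecord struct {
	Status       string
	RequestPath  string
	RequestHash  []byte
	ResponseCode sql.NullInt64 // NULL while the request is in progress
	ResponseBody []byte
}
func IdempotencyMiddleware(db *sql.DB) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			key := r.Header.Get("Idempotency-Key")
			if key == "" {
				http.Error(w, `{"error":"missing Idempotency-Key header"}`, http.StatusBadRequest)
				return
			}
			bodyBytes, err := io.ReadAll(r.Body)
			if err != nil {
				http.Error(w, `{"error":"could not read request body"}`, http.StatusBadRequest)
				return
			}
			r.Body = io.NopCloser(bytes.NewReader(bodyBytes))
			reqHash := sha256.Sum256(bodyBytes)
			// Try to claim the key. RETURNING yields a row only when the insert
			// happened, so sql.ErrNoRows means the key already existed.
			var claimed bool
			err = db.QueryRowContext(r.Context(),
				`INSERT INTO idempotency_keys (key, request_path, request_hash)
				 VALUES ($1, $2, $3)
				 ON CONFLICT (key) DO NOTHING
				 RETURNING true`,
				key, r.URL.Path, reqHash[:],
			).Scan(&claimed)
			if err != nil && !errors.Is(err, sql.ErrNoRows) {
				http.Error(w, `{"error":"idempotency store unavailable"}`, http.StatusInternalServerError)
				return
			}
			if !claimed {
				// Existing key: load the stored record and decide how to answer.
				var existing IdempotencyRecord
				err = db.QueryRowContext(r.Context(),
					`SELECT status, request_path, request_hash, response_code, response_body
					 FROM idempotency_keys WHERE key = $1`, key,
				).Scan(&existing.Status, &existing.RequestPath, &existing.RequestHash,
					&existing.ResponseCode, &existing.ResponseBody)
				if err != nil {
					http.Error(w, `{"error":"idempotency store unavailable"}`, http.StatusInternalServerError)
					return
				}
				// Same key with a different endpoint or body is a client bug, not a retry.
				if existing.RequestPath != r.URL.Path || !bytes.Equal(existing.RequestHash, reqHash[:]) {
					http.Error(w, `{"error":"idempotency key reused with different request"}`, http.StatusUnprocessableEntity)
					return
				}
				if existing.Status == "in_progress" {
					w.Header().Set("Retry-After", "2")
					http.Error(w, `{"error":"request already in progress"}`, http.StatusConflict)
					return
				}
				w.Header().Set("Idempotent-Replayed", "true")
				w.WriteHeader(int(existing.ResponseCode.Int64))
				w.Write(existing.ResponseBody)
				return
			}
			// New request: run the handler and capture the response.
			rec := &responseRecorder{ResponseWriter: w, statusCode: http.StatusOK}
			next.ServeHTTP(rec, r)
			// Store the result. The context is detached (Go 1.21+) so that a
			// client disconnect does not cancel the write.
			_, err = db.ExecContext(context.WithoutCancel(r.Context()),
				`UPDATE idempotency_keys
				 SET status = 'completed', response_code = $2,
				     response_body = $3, completed_at = now()
				 WHERE key = $1`,
				key, rec.statusCode, rec.body.Bytes(),
			)
			if err != nil {
				// Leave the key in progress; reconciliation resolves it later.
				log.Printf("idempotency: storing result for key %s failed: %v", key, err)
			}
		})
	}
}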
Notice the Idempotent-Replayed: true header on cached responses. This tells the client "I didn't process this again, I'm returning a stored result." Stripe includes this header too, and it's invaluable for debugging.
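The middleware relies on a `responseRecorder` type that isn't shown. One plausible shape, assuming it only needs to pass the response through to the client while keeping a copy of the status code and body, is sketched below; the `capture` helper exists only to exercise it.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// responseRecorder wraps an http.ResponseWriter. Everything still reaches
// the client, and a copy of the status code and body is kept for storage.
type responseRecorder struct {
	http.ResponseWriter
	statusCode  int
	body        bytes.Buffer
	wroteHeader bool
}

func (r *responseRecorder) WriteHeader(code int) {
	if r.wroteHeader {
		return // ignore repeated calls; the first status wins
	}
	r.wroteHeader = true
	r.statusCode = code
	r.ResponseWriter.WriteHeader(code)
}

func (r *responseRecorder) Write(p []byte) (int, error) {
	if !r.wroteHeader {
		r.WriteHeader(http.StatusOK) // net/http implies 200 on first write
	}
	r.body.Write(p)
	return r.ResponseWriter.Write(p)
}

// capture runs a sample handler through the recorder and reports what was
// stored and what the client received.
func capture() (status int, stored string, sent string) {
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusCreated)
		w.Write([]byte(`{"id":"ch_123"}`))
	})
	client := httptest.NewRecorder()
	rec := &responseRecorder{ResponseWriter: client, statusCode: http.StatusOK}
	handler.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/v1/charges", nil))
	return rec.statusCode, rec.body.String(), client.Body.String()
}

func main() {
	status, stored, sent := capture()
	fmt.Println(status, stored, stored == sent) // 201 {"id":"ch_123"} true
}
```

This sketch records the status and body only. If replayed responses need the original headers, such as Content-Type, those would have to be stored as well.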
Redis Cache for High-Throughput Systems
PostgreSQL is durable but adds latency on every request — you're hitting disk for the lookup. For high-throughput payment APIs (we were doing 3,000+ charges per second), I add a Redis layer in front. Redis handles the fast-path lookup; PostgreSQL remains the source of truth.
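func (s *IdempotencyStore) Check(ctx context.Context, key string) (*CachedResponse, error) {
	// Fast path: check Redis first.
	cached, err := s.redis.Get(ctx, "idem:"+key).Bytes()
	if err == nil {
		var resp CachedResponse
		if err := json.Unmarshal(cached, &resp); err == nil {
			return &resp, nil
		}
		// Corrupt cache entry: fall through to PostgreSQL.
	}
	// A Redis miss and a Redis outage both fall through here, because
	// PostgreSQL is the source of truth.

	// Slow path: check PostgreSQL.
	var resp CachedResponse
	err = s.db.QueryRowContext(ctx,
		`SELECT response_code, response_body FROM idempotency_keys
		 WHERE key = $1 AND status = 'completed'`, key,
	).Scan(&resp.Code, &resp.Body)
	if errors.Is(err, sql.ErrNoRows) {
		return nil, ErrNotFound // no completed record for this key
	}
	if err != nil {
		// A database failure must never be mistaken for "new request".
		return nil, err
	}

	// Backfill Redis for next time (best effort).
	if data, err := json.Marshal(resp); err == nil {
		s.redis.Set(ctx, "idem:"+key, data, 24*time.Hour)
	}
	return &resp, nil
}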
Choosing Your Idempotency Strategy
There's no one-size-fits-all. The right approach depends on your throughput, durability requirements, and operational complexity budget.
I started with PostgreSQL-only on our first payment service. It handled 200 charges per second without breaking a sweat. We only added the Redis layer when we scaled to a multi-tenant platform processing for dozens of merchants simultaneously. Don't over-engineer it on day one.
Edge Cases That Will Bite You
The happy path is easy. It's the edge cases that cause production incidents.
Partial Failures
What happens when Stripe charges the card successfully, but your database write fails? The customer is charged, but you have no record of it. This is the scariest failure mode. My approach: write the idempotency key as "in-progress" before calling the PSP, then update it to "completed" after. If the update fails, a background reconciliation job picks up stale "in-progress" keys and checks the PSP for the actual charge status.
Warning — never delete in-progress keys on failure. If the PSP actually processed the charge but you didn't get the response, deleting the key means the next retry creates a genuine duplicate. Always leave the key and let a reconciliation process resolve the ambiguity.
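A reconciliation pass could look roughly like the sketch below. The `PSPLookup` function type is a hypothetical stand-in for whatever query your processor offers, and the "failed" status is an assumption of this example (the schema above only names in-progress and completed); the point is that ambiguous keys are resolved by asking the PSP and are never deleted.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Key is a trimmed-down idempotency record for this sketch.
type Key struct {
	ID        string
	Status    string // "in_progress", "completed", or the assumed "failed"
	CreatedAt time.Time
}

// PSPLookup asks the processor whether a charge exists for a key.
// It is a hypothetical stand-in for a real PSP query.
type PSPLookup func(keyID string) (charged bool, err error)

// reconcile resolves keys that have been in progress for longer than
// staleAfter. It updates statuses and never deletes a key.
func reconcile(keys []*Key, now time.Time, staleAfter time.Duration, lookup PSPLookup) int {
	resolved := 0
	for _, k := range keys {
		if k.Status != "in_progress" || now.Sub(k.CreatedAt) < staleAfter {
			continue
		}
		charged, err := lookup(k.ID)
		if err != nil {
			continue // still ambiguous: leave it for the next run
		}
		if charged {
			k.Status = "completed"
		} else {
			k.Status = "failed"
		}
		resolved++
	}
	return resolved
}

func runExample() (int, []string) {
	now := time.Date(2024, 1, 6, 12, 0, 0, 0, time.UTC)
	keys := []*Key{
		{ID: "fresh", Status: "in_progress", CreatedAt: now.Add(-5 * time.Second)},
		{ID: "charged", Status: "in_progress", CreatedAt: now.Add(-10 * time.Minute)},
		{ID: "not-charged", Status: "in_progress", CreatedAt: now.Add(-10 * time.Minute)},
		{ID: "psp-down", Status: "in_progress", CreatedAt: now.Add(-10 * time.Minute)},
		{ID: "done", Status: "completed", CreatedAt: now.Add(-10 * time.Minute)},
	}
	lookup := func(id string) (bool, error) {
		switch id {
		case "charged":
			return true, nil
		case "psp-down":
			return false, errors.New("psp timeout")
		default:
			return false, nil
		}
	}
	resolved := reconcile(keys, now, time.Minute, lookup)
	statuses := make([]string, len(keys))
	for i, k := range keys {
		statuses[i] = k.ID + "=" + k.Status
	}
	return resolved, statuses
}

func main() {
	resolved, statuses := runExample()
	fmt.Println(resolved, statuses)
}
```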
Timeout Handling
Clients will time out before your server finishes processing. When they retry, the key is "in-progress." I return 409 Conflict with a Retry-After: 2 header. The client waits two seconds and tries again. If the original request completed in the meantime, they get the cached response. If it's still processing, they get another 409. After 30 seconds of this, I treat the original request as stale and allow reprocessing only once reconciliation has confirmed with the PSP that no charge went through; reprocessing blindly would reopen the duplicate-charge window.
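From the client's point of view, honoring the 409 and Retry-After contract might look like this sketch. `postWithRetry` is an illustrative helper, the sleep function is injected so the demo does not actually wait, and the test server simply answers 409 twice before succeeding.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
	"strings"
	"sync/atomic"
	"time"
)

// postWithRetry resends the same body with the same key until the server
// stops answering 409 Conflict or maxAttempts is reached.
func postWithRetry(client *http.Client, url, key, body string, maxAttempts int,
	sleep func(time.Duration)) (*http.Response, int, error) {
	for attempt := 1; ; attempt++ {
		req, err := http.NewRequest(http.MethodPost, url, strings.NewReader(body))
		if err != nil {
			return nil, attempt, err
		}
		req.Header.Set("Idempotency-Key", key)
		resp, err := client.Do(req)
		if err != nil {
			return nil, attempt, err
		}
		if resp.StatusCode != http.StatusConflict || attempt == maxAttempts {
			return resp, attempt, nil
		}
		resp.Body.Close()
		wait := 2 * time.Second // fallback when the header is missing or malformed
		if secs, err := strconv.Atoi(resp.Header.Get("Retry-After")); err == nil && secs >= 0 {
			wait = time.Duration(secs) * time.Second
		}
		sleep(wait)
	}
}

// runExample returns the final status, the number of attempts, and the
// total number of seconds the client was asked to wait.
func runExample() (int, int, int, error) {
	var calls int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&calls, 1) <= 2 {
			w.Header().Set("Retry-After", "2")
			http.Error(w, `{"error":"request already in progress"}`, http.StatusConflict)
			return
		}
		w.Header().Set("Idempotent-Replayed", "true")
		w.Write([]byte(`{"id":"ch_123"}`))
	}))
	defer srv.Close()

	var total time.Duration
	fakeSleep := func(d time.Duration) { total += d } // record instead of waiting
	resp, attempts, err := postWithRetry(srv.Client(), srv.URL,
		"550e8400-e29b-41d4-a716-446655440000", `{"amount":4300}`, 5, fakeSleep)
	if err != nil {
		return 0, attempts, 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, attempts, int(total / time.Second), nil
}

func main() {
	status, attempts, waited, err := runExample()
	if err != nil {
		panic(err)
	}
	fmt.Println(status, attempts, waited) // 200 3 4
}
```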
Key Expiration
Idempotency keys shouldn't live forever. I expire them after 24 hours — long enough to handle any reasonable retry scenario, short enough to keep the table manageable. A nightly cron job cleans up expired keys. Stripe uses the same 24-hour window.
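The cleanup job can be a short loop that deletes in batches so no single statement holds locks for long. The sketch below is one way to do it, not the author's actual job: the `execer` interface matches the `ExecContext` method of `*sql.DB`, the fake database exists only so the example runs without PostgreSQL, and restricting the delete to completed keys is an assumption made here so that unresolved in-progress keys are left for reconciliation.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// execer is the single method of *sql.DB this sketch needs. Using an
// interface lets the example run against a fake.
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// purgeSQL removes one batch of expired keys. Only completed keys are
// deleted (an assumption of this sketch).
const purgeSQL = `
DELETE FROM idempotency_keys
WHERE key IN (
    SELECT key FROM idempotency_keys
    WHERE expires_at < now() AND status = 'completed'
    LIMIT $1
)`

// purgeExpired deletes expired keys batch by batch and returns the total.
func purgeExpired(ctx context.Context, db execer, batchSize int64) (int64, error) {
	var total int64
	for {
		res, err := db.ExecContext(ctx, purgeSQL, batchSize)
		if err != nil {
			return total, err
		}
		n, err := res.RowsAffected()
		if err != nil {
			return total, err
		}
		total += n
		if n < batchSize {
			return total, nil // last, partial batch
		}
	}
}

// fakeDB pretends to hold a number of expired rows.
type fakeDB struct {
	remaining int64
	calls     int
}

type fakeResult int64

func (r fakeResult) LastInsertId() (int64, error) { return 0, nil }
func (r fakeResult) RowsAffected() (int64, error) { return int64(r), nil }

func (f *fakeDB) ExecContext(_ context.Context, _ string, args ...any) (sql.Result, error) {
	f.calls++
	n := args[0].(int64) // the LIMIT parameter
	if f.remaining < n {
		n = f.remaining
	}
	f.remaining -= n
	return fakeResult(n), nil
}

func runExample() (int64, int) {
	db := &fakeDB{remaining: 2500}
	deleted, err := purgeExpired(context.Background(), db, 1000)
	if err != nil {
		panic(err)
	}
	return deleted, db.calls
}

func main() {
	deleted, calls := runExample()
	fmt.Println(deleted, calls) // 2500 3
}
```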
Pro tip: Add a request_path column to your idempotency table. A key should only be valid for the endpoint it was originally used on. If a client accidentally sends the same key to /v1/charges and /v1/refunds, you want to catch that — not silently return a charge response for a refund request.
How the Big PSPs Handle It
I've integrated with most major payment processors, and they all approach idempotency slightly differently:
- Stripe: accepts an Idempotency-Key header on all POST requests. Keys expire after 24 hours. They store the full response and replay it, including error responses. If you reuse a key with different parameters, you get a 400 error. This is the gold standard implementation.
- Adyen: uses a reference field in the request body as the idempotency mechanism. It's merchant-scoped, so the same reference from different merchants won't collide. Less flexible than a header-based approach, but it works well for their use case.
- PayPal: supports a PayPal-Request-Id header. Similar to Stripe's approach but with a 72-hour expiration window. They also return a DUPLICATE_TRANSACTION error code when a duplicate is detected, which is helpful for client-side handling.
If you're building a payment API, follow Stripe's model. Header-based keys, 24-hour expiration, full response caching, and parameter mismatch detection. It's battle-tested at massive scale.
The Checklist
Before you ship an idempotent payment endpoint, verify these:
- Every POST endpoint that moves money requires an Idempotency-Key header
- Keys are stored durably (PostgreSQL) before calling any external PSP
- Request body hashes are compared to detect key reuse with different parameters
- In-progress requests return 409 Conflict with Retry-After
- A reconciliation job handles stale in-progress keys
- Keys expire after 24 hours with automated cleanup
- Cached responses include the Idempotent-Replayed: true header
Idempotency isn't glamorous. It doesn't show up in feature demos or sprint reviews. But it's the difference between a payment system that works and one that costs you $43,000 on a quiet Saturday. Build it before you need it.
Disclaimer: This article reflects the author's personal experience and opinions. Product names, logos, and brands are property of their respective owners. Code examples are simplified for clarity and may omit error handling, logging, and security measures — always review and adapt for your specific use case and security requirements. This is not financial or legal advice.