Why Table-Driven Tests Fall Short
I love table-driven tests. They're clean, idiomatic, and they make code review easy. But here's the thing — when I was working on a settlement engine that processed card transactions, our table-driven tests had 94% coverage and still let a rounding bug slip into production. The bug only manifested when a specific sequence of partial refunds hit a currency with zero decimal places (like JPY). No one thought to write that test case, because no one imagined that exact scenario.
That's the fundamental problem. Table-driven tests only catch the bugs you can imagine. Financial systems fail in ways you can't.
We needed tests that could explore the input space on their own, verify outputs against known-good snapshots, and simulate the messy state transitions that real payment flows produce. Here's what we built.
Property-Based Testing for Payment Calculations
The idea behind property-based testing is simple: instead of specifying exact inputs and outputs, you define properties that should always hold true, and the framework generates hundreds of random inputs to try to break them.
For payment amount calculations, the properties are obvious once you think about them. A charge followed by a full refund should always net to zero. Splitting a payment into N parts and summing them should equal the original. Converting USD to EUR and back should never drift by more than one minor unit.
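To make the second property concrete, here is a minimal sketch that uses only the standard library's testing/quick. The names `Split` and `CheckSplitSums` are hypothetical, not the settlement engine's API, and the rule that leftover minor units go to the first parts is an assumption made for illustration.

```go
package main

import "testing/quick"

// Split divides amount (in minor units) into n parts that differ by at most
// one minor unit. Hypothetical helper: leftover units go to the first parts.
func Split(amount int64, n int) []int64 {
	parts := make([]int64, n)
	base, rem := amount/int64(n), amount%int64(n)
	for i := range parts {
		parts[i] = base
		if int64(i) < rem {
			parts[i]++
		}
	}
	return parts
}

// CheckSplitSums lets testing/quick search for a random counterexample to
// the property "the parts always sum to the original amount".
func CheckSplitSums() error {
	property := func(rawAmount uint32, rawN uint8) bool {
		amount := int64(rawAmount) // non-negative by construction
		n := int(rawN)%12 + 1      // between 1 and 12 parts
		var sum int64
		for _, p := range Split(amount, n) {
			sum += p
		}
		return sum == amount
	}
	return quick.Check(property, nil)
}
```

In a real test file, the body of `CheckSplitSums` would sit inside a `func TestXxx(t *testing.T)` and report the returned error with `t.Error`.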
We started with testing/quick from Go's standard library, then moved to pgregory.net/rapid for its shrinking of failing inputs (which testing/quick does not do) and its custom generators. Here's a simplified version of what our property tests looked like:
func TestRefundNetsToZero(t *testing.T) {
	rapid.Check(t, func(t *rapid.T) {
		amount := rapid.Int64Range(1, 999999999).Draw(t, "amount")
		currency := rapid.SampledFrom([]string{"USD", "EUR", "GBP", "JPY"}).Draw(t, "currency")
		charge := NewMoney(amount, currency)
		refund := charge.Negate()
		result := charge.Add(refund)
		if result.Amount != 0 {
			t.Fatalf("charge + full refund != 0: got %d for %s %d",
				result.Amount, currency, amount)
		}
	})
}
This test found a bug in our Negate() method within the first week. For JPY amounts, we were applying a decimal scaling step even though JPY has no decimal places. The random generator hit JPY with a large amount, and the property failed. A table-driven test would have caught it too, if someone had thought to test JPY specifically. Nobody did.
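One way to keep that kind of scaling mistake out of individual methods is to look up each currency's minor-unit exponent in one place. The sketch below is an assumption about how that could look, not our actual Money type: `minorUnitExponent` and `FormatMinor` are hypothetical names, and the table covers only the currencies mentioned in this article (exponents per ISO 4217).

```go
package main

import "fmt"

// minorUnitExponent holds ISO 4217 minor-unit exponents for a few currencies.
// JPY and KRW have no decimal places; BHD has three.
var minorUnitExponent = map[string]int{
	"USD": 2, "EUR": 2, "GBP": 2,
	"JPY": 0, "KRW": 0,
	"BHD": 3,
}

// FormatMinor renders an amount stored in minor units as a decimal string.
// Hypothetical helper; it does not guard against math.MinInt64.
func FormatMinor(amount int64, currency string) (string, error) {
	exp, ok := minorUnitExponent[currency]
	if !ok {
		return "", fmt.Errorf("unknown currency %q", currency)
	}
	if exp == 0 {
		// Zero-decimal currencies must not be scaled at all.
		return fmt.Sprintf("%d %s", amount, currency), nil
	}
	scale := int64(1)
	for i := 0; i < exp; i++ {
		scale *= 10
	}
	sign := ""
	if amount < 0 {
		sign = "-"
		amount = -amount
	}
	// %0*d pads the fractional part with zeros to exactly exp digits.
	return fmt.Sprintf("%s%d.%0*d %s", sign, amount/scale, exp, amount%scale, currency), nil
}
```

With this table, 1000 minor units of JPY format as "1000 JPY", while 10000 minor units of USD format as "100.00 USD".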
Golden File Testing for Settlement Reports
Our settlement engine generates CSV and ISO 20022 XML reports that get sent to acquiring banks. These reports have strict formatting requirements: if the column order, a date format, or the decimal precision is wrong, the bank rejects the whole batch.
Golden file testing works perfectly here. You generate a report from a known set of transactions, save the output as a "golden" file, and future test runs compare against it. Any change to the output — intentional or not — causes a diff.
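// update is set by running: go test -update
var update = flag.Bool("update", false, "rewrite golden files")

func TestSettlementReport(t *testing.T) {
	transactions := loadFixtures(t, "testdata/settlement_batch.json")
	report := GenerateSettlementCSV(transactions)
	golden := filepath.Join("testdata", "golden", t.Name()+".csv")
	if *update {
		// Regenerate the golden file instead of comparing against it.
		if err := os.WriteFile(golden, []byte(report), 0644); err != nil {
			t.Fatalf("write golden file: %v", err)
		}
		return
	}
	expected, err := os.ReadFile(golden)
	if err != nil {
		t.Fatalf("read golden file (run go test -update to create it): %v", err)
	}
	if diff := cmp.Diff(string(expected), report); diff != "" {
		t.Errorf("settlement report mismatch (-want +got):\n%s", diff)
	}
}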
The -update flag is key. When you intentionally change the report format, you run go test -update to regenerate the golden files, review the diff in your PR, and commit. It turns report format changes into reviewable code changes. We caught three accidental format regressions in the first month after adopting this pattern.
Test Fixtures and Factories for Payment State Machines
A payment goes through many states: created, authorized, captured, partially refunded, fully refunded, disputed, settled. Testing every transition path with raw struct literals is painful and brittle. One new field on the Payment struct and you're updating 50 test files.
We built a factory pattern inspired by what I'd seen in Ruby's FactoryBot, but idiomatic to Go:
func NewTestPayment(t *testing.T, opts ...PaymentOption) *Payment {
	t.Helper()
	p := &Payment{
		ID:        uuid.New(),
		Amount:    NewMoney(10000, "USD"), // $100.00
		Status:    StatusCreated,
		CreatedAt: time.Now(),
		Merchant:  DefaultTestMerchant(),
	}
	for _, opt := range opts {
		opt(p)
	}
	return p
}

// Usage in tests:
payment := NewTestPayment(t,
	WithAmount(50000, "EUR"),
	WithStatus(StatusCaptured),
	WithPartialRefund(10000),
)
The functional options pattern keeps the factory flexible without a massive constructor signature. When we added a RiskScore field to Payment, we updated the factory once and every test that didn't care about risk scores kept working.
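For readers who haven't used functional options, the option helpers themselves are tiny. The sketch below is self-contained and therefore uses simplified stand-ins: the `Payment`, `Money`, and `Status` types here, the `Refunded` field, and the `NewPayment` constructor are assumptions for illustration, not our real definitions.

```go
package main

// Simplified stand-ins for the real domain types (assumed for this sketch).
type Status string

const (
	StatusCreated           Status = "created"
	StatusCaptured          Status = "captured"
	StatusPartiallyRefunded Status = "partially_refunded"
)

type Money struct {
	Amount   int64 // minor units
	Currency string
}

type Payment struct {
	Amount   Money
	Status   Status
	Refunded int64 // minor units refunded so far
}

// PaymentOption mutates a Payment that already carries sensible defaults.
type PaymentOption func(*Payment)

func WithAmount(amount int64, currency string) PaymentOption {
	return func(p *Payment) { p.Amount = Money{Amount: amount, Currency: currency} }
}

func WithStatus(s Status) PaymentOption {
	return func(p *Payment) { p.Status = s }
}

// WithPartialRefund records a refund and moves the payment to the matching state.
func WithPartialRefund(amount int64) PaymentOption {
	return func(p *Payment) {
		p.Refunded = amount
		p.Status = StatusPartiallyRefunded
	}
}

// NewPayment builds the default payment, then applies options in order.
func NewPayment(opts ...PaymentOption) *Payment {
	p := &Payment{Amount: Money{Amount: 10000, Currency: "USD"}, Status: StatusCreated}
	for _, opt := range opts {
		opt(p)
	}
	return p
}
```

Options run in the order they are passed, so a later option can overwrite an earlier one. In this sketch, `WithPartialRefund` placed after `WithStatus` wins the status field.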
Integration Testing with Gateway Mocks
Mocking external payment gateways is non-negotiable — you can't hit Stripe's API in CI on every push. But the mock needs to be realistic enough to be useful. We built an httptest.Server that simulated our gateway's behavior, including latency, error rates, and idempotency key handling.
func NewMockGateway(t *testing.T) *MockGateway {
	t.Helper()
	gw := &MockGateway{
		charges:    make(map[string]*Charge),
		idempotent: make(map[string]string),
	}
	gw.Server = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Simulate 200ms network latency
		time.Sleep(200 * time.Millisecond)
		// net/http serves connections on separate goroutines, so guard the maps.
		// This assumes MockGateway has a `mu sync.Mutex` field.
		gw.mu.Lock()
		defer gw.mu.Unlock()
		// Replay the stored response for a repeated idempotency key
		idempKey := r.Header.Get("Idempotency-Key")
		if prev, ok := gw.idempotent[idempKey]; ok {
			w.Write([]byte(prev))
			return
		}
		// ... handle charge creation, capture, refund
	}))
	t.Cleanup(gw.Server.Close)
	return gw
}
The critical detail: we made the mock stateful. A charge had to be created before it could be captured. A refund couldn't exceed the captured amount. This caught a whole class of bugs where our code was calling gateway endpoints in the wrong order or with stale data. One test discovered that our retry logic was sending captures for already-captured charges — something that would have triggered duplicate settlements in production.
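The state rules behind the mock are small enough to sketch separately from the HTTP layer. This is an illustration under assumptions, not our gateway code: `Ledger`, its methods, and the error values are hypothetical names, and refunds are only allowed against captured charges.

```go
package main

import (
	"errors"
	"sync"
)

var (
	ErrUnknownCharge   = errors.New("charge does not exist")
	ErrAlreadyCaptured = errors.New("charge already captured")
	ErrRefundTooLarge  = errors.New("refund exceeds captured amount")
)

type chargeState struct {
	amount   int64 // minor units
	captured bool
	refunded int64
}

// Ledger tracks charge state so out-of-order calls fail the way a real
// gateway would reject them.
type Ledger struct {
	mu      sync.Mutex
	charges map[string]*chargeState
}

func NewLedger() *Ledger {
	return &Ledger{charges: make(map[string]*chargeState)}
}

func (l *Ledger) Create(id string, amount int64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.charges[id] = &chargeState{amount: amount}
}

// Capture fails for unknown charges and for charges captured earlier.
func (l *Ledger) Capture(id string) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	c, ok := l.charges[id]
	if !ok {
		return ErrUnknownCharge
	}
	if c.captured {
		return ErrAlreadyCaptured
	}
	c.captured = true
	return nil
}

// Refund fails when the total refunded would exceed the captured amount.
func (l *Ledger) Refund(id string, amount int64) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	c, ok := l.charges[id]
	if !ok {
		return ErrUnknownCharge
	}
	var refundable int64 // stays zero until the charge is captured
	if c.captured {
		refundable = c.amount - c.refunded
	}
	if amount > refundable {
		return ErrRefundTooLarge
	}
	c.refunded += amount
	return nil
}
```

A second `Capture` on the same charge returns `ErrAlreadyCaptured`, which is the kind of signal that exposed our retry bug.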
Warning: Don't mock at the HTTP client level with a generic round-tripper. Mock at the server level with httptest.Server. Client-level mocks skip connection handling, timeouts, and header serialization — exactly the things that break in production. I learned this the hard way when a Content-Type header mismatch caused silent JSON parsing failures that our client-level mocks never caught.
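To show what server-level mocking catches, here is a minimal sketch in which the mock itself enforces the Content-Type header. The names `NewStrictServer` and `PostBody` are hypothetical, and the 415 response is an assumption about how a strict gateway might behave.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// NewStrictServer starts a test server that rejects any request whose
// Content-Type is not JSON, the way a strict gateway might.
func NewStrictServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ct := r.Header.Get("Content-Type")
		if !strings.HasPrefix(ct, "application/json") {
			http.Error(w, "unsupported content type: "+ct, http.StatusUnsupportedMediaType)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		io.WriteString(w, `{"status":"ok"}`)
	}))
}

// PostBody sends body over a real HTTP connection and returns the status
// code, so headers are serialized exactly as they would be in production.
func PostBody(client *http.Client, url, contentType, body string) (int, error) {
	resp, err := client.Post(url, contentType, strings.NewReader(body))
	if err != nil {
		return 0, fmt.Errorf("post: %w", err)
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
	return resp.StatusCode, nil
}
```

A test would call `PostBody(srv.Client(), srv.URL, ...)` with the production client code in place of `PostBody`, then assert on the status it receives.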
Fuzzing for Currency Conversion Edge Cases
Go 1.18 shipped native fuzzing support, and it's been a game-changer for our currency conversion code. Currency conversion sounds simple until you deal with triangulated rates (converting KRW to BRL via USD), currencies with different minor unit scales, and amounts that push the boundaries of int64 arithmetic.
func FuzzCurrencyConvert(f *testing.F) {
	// Seed corpus with known tricky values
	f.Add(int64(1), "USD", "JPY")                // minimal amount
	f.Add(int64(999999999), "BHD", "USD")        // 3-decimal currency
	f.Add(int64(9223372036854775), "USD", "KRW") // large amount
	f.Fuzz(func(t *testing.T, amount int64, from, to string) {
		if amount <= 0 {
			t.Skip()
		}
		result, err := Convert(NewMoney(amount, from), to)
		if err != nil {
			return // invalid currency pair is fine
		}
		if result.Amount < 0 {
			t.Errorf("negative conversion result: %d %s -> %d %s",
				amount, from, result.Amount, to)
		}
	})
}
Within 30 seconds of running go test -fuzz=FuzzCurrencyConvert, the fuzzer found an integer overflow when converting large BHD (Bahraini Dinar, 3 decimal places) amounts to KRW (Korean Won, 0 decimal places). The multiplication overflowed int64 silently. We switched to math/big for intermediate calculations and added overflow checks. That bug had been in production for months — no customer had triggered it yet, but it was a ticking time bomb.
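The shape of the fix can be sketched in a few lines. This is an illustration, not our production code: `ConvertMinor` and `ErrOverflow` are hypothetical names, the rate is assumed to be a fraction of two integers, and the division truncates toward zero where a real implementation needs an explicit rounding policy.

```go
package main

import (
	"errors"
	"math/big"
)

// ErrOverflow reports a result that cannot be stored in an int64.
var ErrOverflow = errors.New("conversion result does not fit in int64")

// ConvertMinor computes amount * rateNum / rateDen with big.Int
// intermediates, so the multiplication cannot wrap around silently.
func ConvertMinor(amount, rateNum, rateDen int64) (int64, error) {
	if rateDen == 0 {
		return 0, errors.New("rate denominator must not be zero")
	}
	product := new(big.Int).Mul(big.NewInt(amount), big.NewInt(rateNum))
	result := product.Quo(product, big.NewInt(rateDen)) // truncates toward zero
	if !result.IsInt64() {
		return 0, ErrOverflow
	}
	return result.Int64(), nil
}
```

The intermediate product may exceed the int64 range as long as the final result fits. Only a result that is itself out of range returns `ErrOverflow`.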
Key Lessons Learned
- Layer your testing strategy. Table-driven tests for the known cases, property-based tests for the invariants, golden files for output formats, fuzz tests for the unknown unknowns. Each layer catches different classes of bugs.
- Make test failures financial. When a test fails, the error message should include the monetary amount, currency, and transaction state. "Expected 0, got 1" is useless. "Refund of JPY 1000 left residual amount of 1 minor unit" tells you exactly what went wrong.
- Invest in test fixtures early. The factory pattern pays for itself within a week. Every new feature needs test payments in various states, and copy-pasting struct literals is how you get tests that lie.
- Run fuzz tests in CI nightly. Fuzzing needs time to explore. We run a 5-minute fuzz session in our nightly CI pipeline. It's found three bugs in six months that no other test caught.
- Golden files belong in version control. They're documentation. When a new developer asks "what does the settlement report look like?", the answer is in testdata/golden/.
Takeaway: The goal isn't 100% coverage — it's confidence that your system handles money correctly. Property-based testing and fuzzing explore the input space in ways humans can't. Golden files catch regressions humans won't notice. Together with table-driven tests, they form a safety net that actually holds.
References
- Go Standard Library — testing package documentation
- Go Fuzzing Documentation — Tutorial and reference
- testing/quick — Utility functions for black-box testing
- Testify — Go testing toolkit on GitHub
Disclaimer: This article reflects the author's personal experience and opinions. Product names, logos, and brands are property of their respective owners. Code examples are simplified for illustration — always adapt patterns to your specific requirements and verify with official documentation.