Engineering a Chargeback Management System — From Dispute Ingestion to Automated Representment
Chargebacks are the most expensive problem in payment processing that nobody talks about until it's too late. I've built dispute handling pipelines across multiple payment platforms, and I learned the hard way what actually matters — and what the documentation never tells you.
Every payment platform eventually hits the same wall: a merchant's chargeback ratio creeps past 0.9%, Visa sends a monitoring notice, and suddenly you're scrambling to build dispute tooling that should have existed six months ago. I've been on both sides of that scramble, and this article covers the engineering patterns that actually work.
The Chargeback Lifecycle
Before writing any code, you need to understand the full dispute lifecycle. A chargeback isn't a single event — it's a multi-stage process with strict deadlines at every step. Miss one deadline and you automatically lose, regardless of how strong your evidence is.
The critical thing to internalize: you typically have 18 to 30 days from the moment a dispute notification lands to submit your representment package. That clock starts ticking whether your system processes the webhook at 2 AM on a Saturday or not. Every hour of ingestion latency is an hour stolen from your evidence collection window.
In practice, I model the lifecycle as a state machine with strict transition rules. A dispute moves through RECEIVED → ENRICHED → EVIDENCE_COLLECTED → REPRESENTED → RESOLVED. Each transition has a deadline, and a background scheduler watches for disputes that are stuck in any state too long. If a dispute sits in ENRICHED for more than 48 hours without evidence being attached, it triggers an alert.
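Here is a minimal sketch of that state machine, assuming a strictly linear flow. The type and function names (Dispute, Advance, IsStuck) are illustrative rather than taken from a specific codebase, and only the 48-hour ENRICHED limit comes from the description above.

```go
package main

import (
	"fmt"
	"time"
)

// State is a dispute lifecycle state, using the names from the text.
type State string

const (
	Received          State = "RECEIVED"
	Enriched          State = "ENRICHED"
	EvidenceCollected State = "EVIDENCE_COLLECTED"
	Represented       State = "REPRESENTED"
	Resolved          State = "RESOLVED"
)

// next lists the single forward transition allowed from each state.
// Assumption: no branches; a real system may add accepted or expired states.
var next = map[State]State{
	Received:          Enriched,
	Enriched:          EvidenceCollected,
	EvidenceCollected: Represented,
	Represented:       Resolved,
}

// maxDwell is how long a dispute may sit in a state before an alert fires.
var maxDwell = map[State]time.Duration{
	Enriched: 48 * time.Hour,
}

type Dispute struct {
	ID        string
	State     State
	EnteredAt time.Time // when the current state was entered
}

// Advance moves the dispute to the target state if the transition is allowed.
func (d *Dispute) Advance(to State, now time.Time) error {
	if allowed, ok := next[d.State]; !ok || allowed != to {
		return fmt.Errorf("illegal transition %s -> %s", d.State, to)
	}
	d.State, d.EnteredAt = to, now
	return nil
}

// IsStuck reports whether the dispute has exceeded its state's dwell limit.
func (d *Dispute) IsStuck(now time.Time) bool {
	limit, ok := maxDwell[d.State]
	return ok && now.Sub(d.EnteredAt) > limit
}

func main() {
	start := time.Date(2024, 1, 6, 2, 0, 0, 0, time.UTC)
	d := &Dispute{ID: "dp_1", State: Received, EnteredAt: start}
	fmt.Println(d.Advance(Represented, start))        // illegal transition RECEIVED -> REPRESENTED
	fmt.Println(d.Advance(Enriched, start))           // <nil>
	fmt.Println(d.IsStuck(start.Add(49 * time.Hour))) // true
}
```

The background scheduler then only needs to call IsStuck on each open dispute at a fixed interval and raise an alert for any that return true.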
Ingesting Disputes from Card Networks
This is where the real engineering starts. Visa and Mastercard have fundamentally different approaches to dispute notification, and if you're building a platform that handles both, you're maintaining two parallel ingestion pipelines.
Visa uses VROL (Visa Resolve Online) as its primary dispute management system. Disputes arrive as structured records that you pull via API or receive through your acquirer's integration. Visa also offers RDR (Rapid Dispute Resolution), which lets you auto-resolve disputes before they become formal chargebacks — this is a huge win if you can implement it, because RDR resolutions don't count toward your chargeback ratio.
Mastercard's equivalent is the Mastercom system. Mastercard also owns Ethoca, which provides early alert notifications. Ethoca alerts give you a heads-up that a dispute is coming, often before the formal chargeback is filed. This pre-chargeback window is gold: you can issue a refund proactively and avoid the chargeback entirely.
The webhook payloads from these systems are not fun to parse. Reason codes come in different formats, amounts may be in different currencies than the original transaction, and the transaction identifiers don't always match cleanly to your internal records. I've found that building a normalization layer early — one that maps every incoming dispute to a canonical internal format — saves enormous pain later.
// Simplified dispute normalization
import (
    "encoding/json"
    "time"
)

// RawDispute is the dispute as the network or acquirer delivered it.
type RawDispute struct {
    Network       string    // "visa" | "mastercard"
    NetworkRef    string    // ARN or Mastercom case ID
    ReasonCode    string    // Network-specific code
    Amount        int64     // In minor units
    Currency      string    // ISO 4217
    TransactionID string    // May need fuzzy matching
    FiledAt       time.Time
    RespondBy     time.Time
}

// NormalizedDispute is the canonical internal format. DisputeCategory,
// ReasonCode, Money, and Network are internal types defined elsewhere.
type NormalizedDispute struct {
    ID          string
    OriginalTxn string          // Your internal transaction ID
    Category    DisputeCategory // FRAUD, SERVICE, PROCESSING, AUTHORIZATION
    ReasonCode  ReasonCode      // Mapped internal code
    Amount      Money
    Deadline    time.Time
    Network     Network
    RawPayload  json.RawMessage // Always keep the original
}
One lesson I learned the hard way: always store the raw payload alongside your normalized version. When a dispute goes to arbitration six months later and someone asks "what exactly did the network send us?", you'll be glad you kept it.
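As a rough sketch of the normalization step, the function below decodes a payload, validates it, resolves the internal transaction, and keeps a copy of the original bytes. The JSON field names, the TxnLookup callback, and the simplified struct fields are all assumptions made for illustration. Real network and acquirer payloads look different, and category mapping is left out here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// RawDispute mirrors the earlier struct; the JSON tags are hypothetical.
type RawDispute struct {
	Network       string    `json:"network"`
	NetworkRef    string    `json:"network_ref"`
	ReasonCode    string    `json:"reason_code"`
	Amount        int64     `json:"amount"`
	Currency      string    `json:"currency"`
	TransactionID string    `json:"transaction_id"`
	RespondBy     time.Time `json:"respond_by"`
}

// NormalizedDispute is a trimmed-down canonical record for this example.
type NormalizedDispute struct {
	ID          string
	OriginalTxn string
	Network     string
	ReasonCode  string
	AmountMinor int64
	Currency    string
	Deadline    time.Time
	RawPayload  json.RawMessage
}

// TxnLookup resolves a network-supplied reference to an internal transaction ID.
type TxnLookup func(network, ref string) (string, bool)

// Normalize turns one raw payload into the canonical form or explains why it cannot.
func Normalize(payload []byte, lookup TxnLookup) (NormalizedDispute, error) {
	var raw RawDispute
	if err := json.Unmarshal(payload, &raw); err != nil {
		return NormalizedDispute{}, fmt.Errorf("decode dispute: %w", err)
	}
	network := strings.ToLower(strings.TrimSpace(raw.Network))
	if network != "visa" && network != "mastercard" {
		return NormalizedDispute{}, fmt.Errorf("unsupported network %q", raw.Network)
	}
	if raw.RespondBy.IsZero() {
		return NormalizedDispute{}, errors.New("dispute has no response deadline")
	}
	txn, ok := lookup(network, raw.TransactionID)
	if !ok {
		return NormalizedDispute{}, fmt.Errorf("no internal transaction for %q", raw.TransactionID)
	}
	return NormalizedDispute{
		ID:          network + ":" + raw.NetworkRef,
		OriginalTxn: txn,
		Network:     network,
		ReasonCode:  raw.ReasonCode,
		AmountMinor: raw.Amount,
		Currency:    strings.ToUpper(raw.Currency),
		Deadline:    raw.RespondBy,
		// Copy the bytes so later mutation of the input cannot alter the record.
		RawPayload: append(json.RawMessage(nil), payload...),
	}, nil
}

func main() {
	payload := []byte(`{"network":"Visa","network_ref":"C-1001","reason_code":"10.4",` +
		`"amount":1800,"currency":"usd","transaction_id":"arn-77","respond_by":"2024-02-01T00:00:00Z"}`)
	lookup := func(network, ref string) (string, bool) { return "txn_42", ref == "arn-77" }
	d, err := Normalize(payload, lookup)
	fmt.Println(d.ID, d.OriginalTxn, err) // visa:C-1001 txn_42 <nil>
}
```

Failing loudly when the transaction cannot be matched is a deliberate choice in this sketch: an unmatched dispute should land in a manual review queue rather than continue with a guessed transaction.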
Reason Code Mapping and Routing
Reason codes are the DNA of a chargeback. They tell you why the cardholder disputed the charge, and they dictate what evidence you need to win. The problem is that Visa and Mastercard use completely different code systems, and they change them periodically.
I map network-specific codes to four internal categories: FRAUD, SERVICE, PROCESSING, and AUTHORIZATION. This simplifies routing logic significantly. A fraud dispute routes to the fraud evidence pipeline (pull 3DS results, device fingerprints, IP geolocation). A service dispute routes to the fulfillment pipeline (pull shipping data, delivery confirmation, customer support transcripts).
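The mapping itself can be as simple as a lookup table. The sketch below assumes Visa's dotted code families (10.x fraud, 11.x authorization, 12.x processing errors, 13.x consumer disputes) and a handful of common Mastercard codes. Treat the table as illustrative and check it against the current network guides, since the codes change. The Category type and the Categorize function are names invented for this example.

```go
package main

import (
	"fmt"
	"strings"
)

type Category string

const (
	Fraud         Category = "FRAUD"
	Service       Category = "SERVICE"
	Processing    Category = "PROCESSING"
	Authorization Category = "AUTHORIZATION"
	Unmapped      Category = "UNMAPPED" // route to manual review
)

// visaPrefixes maps the part of a Visa code before the dot to a category.
var visaPrefixes = map[string]Category{
	"10": Fraud,
	"11": Authorization,
	"12": Processing,
	"13": Service,
}

// mastercardCodes covers only a few codes; extend it from the chargeback guide.
var mastercardCodes = map[string]Category{
	"4837": Fraud,         // No Cardholder Authorization
	"4808": Authorization, // Authorization-Related Chargeback
	"4834": Processing,    // Point-of-Interaction Error
	"4853": Service,       // Cardholder Dispute
}

// Categorize maps a network-specific reason code to an internal category.
func Categorize(network, code string) Category {
	code = strings.TrimSpace(code)
	switch strings.ToLower(network) {
	case "visa":
		prefix, _, found := strings.Cut(code, ".")
		if c, ok := visaPrefixes[prefix]; found && ok {
			return c
		}
	case "mastercard":
		if c, ok := mastercardCodes[code]; ok {
			return c
		}
	}
	return Unmapped
}

func main() {
	fmt.Println(Categorize("visa", "10.4"))       // FRAUD
	fmt.Println(Categorize("visa", "13.1"))       // SERVICE
	fmt.Println(Categorize("mastercard", "4837")) // FRAUD
	fmt.Println(Categorize("mastercard", "9999")) // UNMAPPED
}
```

Returning an explicit UNMAPPED value instead of a default category means a newly introduced reason code is visible immediately rather than silently routed to the wrong pipeline.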
The routing decision also determines whether to fight or accept. Some disputes aren't worth representing — if the transaction amount is under your cost-per-dispute threshold, or if you know the evidence is weak, it's cheaper to accept the loss and move on. I typically set an auto-accept threshold around $15-20 for disputes where the win probability is below 30%.
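Expressed as code, the fight-or-accept rule is a two-condition check. This is a sketch with hypothetical names (Policy, Decide), and the threshold values in main are examples taken from the range mentioned above, not recommendations.

```go
package main

import "fmt"

type Decision string

const (
	Fight  Decision = "FIGHT"
	Accept Decision = "ACCEPT"
)

// Policy holds the auto-accept thresholds.
type Policy struct {
	AutoAcceptBelowMinor int64   // amount threshold in minor units, e.g. 2000 = $20.00
	MinWinProbability    float64 // below this, a small dispute is not worth fighting
}

// Decide accepts only when the dispute is both small and unlikely to be won.
func (p Policy) Decide(amountMinor int64, winProbability float64) Decision {
	if amountMinor < p.AutoAcceptBelowMinor && winProbability < p.MinWinProbability {
		return Accept
	}
	return Fight
}

func main() {
	p := Policy{AutoAcceptBelowMinor: 2000, MinWinProbability: 0.30}
	fmt.Println(p.Decide(1200, 0.15))  // ACCEPT
	fmt.Println(p.Decide(1200, 0.70))  // FIGHT
	fmt.Println(p.Decide(25000, 0.15)) // FIGHT
}
```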
Evidence Collection Pipeline
Winning a chargeback representment comes down to one thing: presenting the right evidence for the specific reason code, formatted in a way the issuer can't ignore. This means your system needs to automatically gather evidence from multiple internal services the moment a dispute lands.
For a typical e-commerce dispute, the evidence package might pull from five or six different sources:
- Payment service — transaction details, authorization response, AVS/CVV results, 3D Secure authentication proof
- Order service — what was purchased, pricing breakdown, any applied discounts
- Shipping service — tracking number, carrier, delivery confirmation with signature if available
- Customer service — support ticket history, chat transcripts, email communications
- Auth service — account creation date, login history, device fingerprints, IP addresses used
- Refund service — whether a partial or full refund was already issued
I build this as an async pipeline. When a dispute enters the ENRICHED state, a fan-out job fires requests to each evidence source in parallel. Each source returns what it has, and a collector aggregates everything into a structured evidence package. Sources that fail or time out get retried independently — you don't want a slow shipping API to block your entire evidence collection.
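The fan-out can be sketched with goroutines and a per-call timeout. Everything here is an assumption for illustration: the Source and Result types, the immediate in-process retry (a production pipeline would more likely requeue with backoff), and the three fake sources in demo.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// Source is one evidence provider, such as the payment or shipping service.
type Source struct {
	Name  string
	Fetch func(ctx context.Context, disputeID string) (string, error)
}

type Result struct {
	Source string
	Data   string
	Err    error
}

// Collect queries every source in parallel; a slow or failing source only
// affects its own Result, never the others.
func Collect(ctx context.Context, disputeID string, sources []Source, attempts int, perCall time.Duration) []Result {
	results := make([]Result, len(sources))
	var wg sync.WaitGroup
	for i, src := range sources {
		wg.Add(1)
		go func(i int, src Source) {
			defer wg.Done()
			results[i] = fetchWithRetry(ctx, disputeID, src, attempts, perCall)
		}(i, src)
	}
	wg.Wait()
	return results
}

// fetchWithRetry retries one source, giving each attempt its own timeout.
func fetchWithRetry(ctx context.Context, disputeID string, src Source, attempts int, perCall time.Duration) Result {
	err := errors.New("no attempts made")
	for n := 0; n < attempts && ctx.Err() == nil; n++ {
		callCtx, cancel := context.WithTimeout(ctx, perCall)
		var data string
		data, err = src.Fetch(callCtx, disputeID)
		cancel()
		if err == nil {
			return Result{Source: src.Name, Data: data}
		}
	}
	return Result{Source: src.Name, Err: err}
}

// demo wires up three fake sources: one healthy, one flaky, one that hangs.
func demo() []Result {
	calls := 0
	sources := []Source{
		{Name: "payment", Fetch: func(ctx context.Context, id string) (string, error) {
			return "3ds=authenticated", nil
		}},
		{Name: "order", Fetch: func(ctx context.Context, id string) (string, error) {
			calls++
			if calls == 1 {
				return "", errors.New("temporary failure")
			}
			return "sku=A1", nil
		}},
		{Name: "shipping", Fetch: func(ctx context.Context, id string) (string, error) {
			<-ctx.Done() // simulate a slow carrier API
			return "", ctx.Err()
		}},
	}
	return Collect(context.Background(), "dp_1", sources, 2, 20*time.Millisecond)
}

func main() {
	for _, r := range demo() {
		fmt.Printf("%s: data=%q err=%v\n", r.Source, r.Data, r.Err)
	}
}
```

Each goroutine writes to its own slice index, so the collector needs no mutex, and the results come back in the same order as the sources regardless of which one finishes first.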
The evidence package then gets scored. I assign a confidence rating based on what evidence is available versus what's needed for the specific reason code. A fraud dispute with 3DS proof and matching AVS gets a high confidence score. A "product not received" dispute without tracking data gets a low score — and that low score might trigger the auto-accept path instead of wasting time on a representment you'll lose.
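One simple way to compute such a score is a weighted checklist. The field names and weights below are invented for illustration, and the sketch keys requirements by category rather than by individual reason code to stay short.

```go
package main

import "fmt"

// Field names one piece of evidence.
type Field string

// Requirement says how much one field contributes to the score.
type Requirement struct {
	Field  Field
	Weight int
}

// requirements lists the evidence that matters per category (assumed values).
var requirements = map[string][]Requirement{
	"FRAUD": {
		{Field: "3ds_result", Weight: 5},
		{Field: "avs_match", Weight: 3},
		{Field: "device_fingerprint", Weight: 2},
	},
	"SERVICE": {
		{Field: "tracking_number", Weight: 4},
		{Field: "delivery_confirmation", Weight: 4},
		{Field: "support_transcript", Weight: 2},
	},
}

// Confidence returns the weighted share of required evidence that is present,
// from 0 (nothing useful) to 1 (everything on the checklist).
func Confidence(category string, available map[Field]bool) float64 {
	var have, total int
	for _, r := range requirements[category] {
		total += r.Weight
		if available[r.Field] {
			have += r.Weight
		}
	}
	if total == 0 {
		return 0 // unknown category: no basis for confidence
	}
	return float64(have) / float64(total)
}

func main() {
	fraud := map[Field]bool{"3ds_result": true, "avs_match": true}
	notReceived := map[Field]bool{"support_transcript": true}
	fmt.Printf("%.2f\n", Confidence("FRAUD", fraud))         // 0.80
	fmt.Printf("%.2f\n", Confidence("SERVICE", notReceived)) // 0.20
}
```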
Warning: Keep your chargeback ratio below 0.9% at all costs. Visa's Dispute Monitoring Program (VDMP) kicks in at 0.9% and 100 disputes per month. Once you're in the program, you face fines starting at $50 per dispute, escalating to $25,000/month in review fees. Mastercard's Excessive Chargeback Program (ECP) works along the same lines but with its own thresholds, so check both rule sets rather than assuming one number covers you. Getting into these programs is easy; getting out takes months of sustained improvement.
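A threshold check like this is easy to run per merchant on every billing cycle. The sketch below uses only the 0.9% and 100-dispute figures quoted above. The Threshold type is hypothetical, and how disputes and transactions are counted per month is network-specific, so treat the inputs as already windowed.

```go
package main

import "fmt"

// Threshold describes a monitoring program entry point.
type Threshold struct {
	MinDisputes    int // minimum monthly dispute count
	MinBasisPoints int // minimum ratio in basis points; 90 = 0.9%
}

// visaStandard uses the figures quoted in the warning above.
var visaStandard = Threshold{MinDisputes: 100, MinBasisPoints: 90}

// RatioPercent returns disputes as a percentage of transactions.
func RatioPercent(disputes, transactions int) float64 {
	if transactions <= 0 {
		return 0
	}
	return float64(disputes) / float64(transactions) * 100
}

// Breached is true only when both the count and the ratio conditions are met.
// Integer math avoids floating-point edge cases right at the threshold.
func (t Threshold) Breached(disputes, transactions int) bool {
	if transactions <= 0 {
		return false
	}
	return disputes >= t.MinDisputes && disputes*10000 >= t.MinBasisPoints*transactions
}

func main() {
	fmt.Printf("%.2f%%\n", RatioPercent(120, 12000))  // 1.00%
	fmt.Println(visaStandard.Breached(120, 12000))    // true: 1.0% and 120 disputes
	fmt.Println(visaStandard.Breached(90, 9000))      // false: ratio is high but count is under 100
	fmt.Println(visaStandard.Breached(120, 20000))    // false: 0.6%
}
```

In practice you would alert well before Breached returns true, for example by running the same check with a lower internal threshold.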
Automated Representment
Once you have evidence collected and scored, the representment itself can be largely automated. The key insight is that representment responses are highly templated — the same reason code always requires the same structure of response, just with different transaction-specific data filled in.
I maintain a template library keyed by internal dispute category and reason code. Each template defines the required evidence fields, the narrative structure, and the formatting rules for the specific network. Visa and Mastercard have different expectations for how evidence should be presented, and getting the format wrong can tank your win rate even with strong evidence.
// Template-based representment builder
import (
    "fmt"
    "time"
)

type RepresentmentTemplate struct {
    Category       DisputeCategory
    ReasonCodes    []ReasonCode
    RequiredFields []EvidenceField
    OptionalFields []EvidenceField
    Narrative      string // Go template with evidence placeholders
    MaxPages       int    // Network-specific page limits
}

// BuildRepresentment assembles a response for one dispute. GetTemplate and
// RenderNarrative are helpers defined elsewhere.
func BuildRepresentment(dispute NormalizedDispute, evidence EvidencePackage) (*Representment, error) {
    // Named tmpl rather than template so it never shadows the text/template package.
    tmpl := GetTemplate(dispute.Category, dispute.ReasonCode)

    // Verify all required fields are present
    for _, field := range tmpl.RequiredFields {
        if !evidence.Has(field) {
            return nil, fmt.Errorf("missing required evidence: %v", field)
        }
    }

    // Render narrative with evidence data
    narrative, err := RenderNarrative(tmpl.Narrative, evidence)
    if err != nil {
        return nil, fmt.Errorf("narrative render failed: %w", err)
    }

    return &Representment{
        DisputeID: dispute.ID,
        Narrative: narrative,
        Evidence:  evidence.Attachments(),
        SubmitBy:  dispute.Deadline.Add(-24 * time.Hour), // Buffer
    }, nil
}
Notice the 24-hour buffer before the deadline. Never submit at the last minute. Network systems have processing delays, and a submission that arrives at 11:59 PM on the deadline day might not register until the next day. I've seen disputes lost this way. Build in at least a day of buffer, and alert aggressively if anything is within 72 hours of its deadline without being submitted.
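The buffer and the alert rule fit in a few lines. This sketch uses the 24-hour and 72-hour values from the paragraph above; the Pending type and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

const (
	submitBuffer = 24 * time.Hour // submit at least this long before the deadline
	alertWindow  = 72 * time.Hour // alert when this close and still unsubmitted
)

// Pending is a dispute that may or may not have been represented yet.
type Pending struct {
	DisputeID string
	Deadline  time.Time // network response deadline
	Submitted bool
}

// SubmitBy is the internal target time, one buffer ahead of the network deadline.
func (p Pending) SubmitBy() time.Time {
	return p.Deadline.Add(-submitBuffer)
}

// NeedsAlert is true for unsubmitted disputes inside the alert window,
// including ones whose deadline has already passed.
func (p Pending) NeedsAlert(now time.Time) bool {
	return !p.Submitted && p.Deadline.Sub(now) <= alertWindow
}

func main() {
	deadline := time.Date(2024, 3, 20, 23, 59, 0, 0, time.UTC)
	now := time.Date(2024, 3, 18, 9, 0, 0, 0, time.UTC)
	p := Pending{DisputeID: "dp_1", Deadline: deadline}
	fmt.Println(p.SubmitBy().Format(time.RFC3339)) // 2024-03-19T23:59:00Z
	fmt.Println(p.NeedsAlert(now))                 // true
}
```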
Win rate tracking is essential for tuning your templates. I log the outcome of every representment and correlate it back to the template version, the evidence confidence score, and the reason code. Over time, this data tells you which templates are working, which evidence fields actually matter, and where your win rate is weakest. A/B testing different narrative approaches for the same reason code can yield surprising improvements — I've seen win rate jumps of 8-12% just from restructuring how evidence is presented.
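A minimal version of that outcome log can be aggregated like this. The Outcome fields and the choice to group by template version and reason code are assumptions for the example; correlating with the confidence score would add one more dimension to the key.

```go
package main

import "fmt"

// Outcome is one resolved representment.
type Outcome struct {
	TemplateVersion string
	ReasonCode      string
	Won             bool
}

// Key groups outcomes by the template version used and the reason code.
type Key struct {
	TemplateVersion string
	ReasonCode      string
}

// Tally counts wins and total attempts for one group.
type Tally struct {
	Won   int
	Total int
}

// Rate returns the win rate for the group, or 0 when there is no data.
func (t Tally) Rate() float64 {
	if t.Total == 0 {
		return 0
	}
	return float64(t.Won) / float64(t.Total)
}

// WinRates aggregates outcomes per template version and reason code.
func WinRates(outcomes []Outcome) map[Key]Tally {
	rates := make(map[Key]Tally)
	for _, o := range outcomes {
		k := Key{TemplateVersion: o.TemplateVersion, ReasonCode: o.ReasonCode}
		t := rates[k]
		t.Total++
		if o.Won {
			t.Won++
		}
		rates[k] = t
	}
	return rates
}

func main() {
	rates := WinRates([]Outcome{
		{"v1", "10.4", true}, {"v1", "10.4", false}, {"v1", "10.4", false}, {"v1", "10.4", false},
		{"v2", "10.4", true}, {"v2", "10.4", true}, {"v2", "10.4", false}, {"v2", "10.4", false},
	})
	fmt.Printf("v1: %.2f\n", rates[Key{"v1", "10.4"}].Rate()) // v1: 0.25
	fmt.Printf("v2: %.2f\n", rates[Key{"v2", "10.4"}].Rate()) // v2: 0.50
}
```

With only a handful of outcomes per group the rates are noisy, so compare template versions only once each has a meaningful sample.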
Metrics That Matter
You can't improve what you don't measure. These are the numbers I track on every chargeback management system I build, and they should be on a real-time dashboard that the payments team checks daily.
Chargeback ratio is the number everyone watches, but it's a lagging indicator. By the time your ratio spikes, the damage is already done. I prefer to monitor dispute velocity — the rate of new disputes per hour — as an early warning signal. A sudden spike in dispute velocity often means a fraud ring hit your platform or a product issue is driving legitimate complaints.
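A simple velocity check compares the latest hour against the trailing average. This is a sketch, not a tuned detector: the multiplier, the minimum count, and the Spike function itself are illustrative choices.

```go
package main

import "fmt"

// Spike reports whether the most recent hourly count is unusually high.
// hourly holds dispute counts per hour, oldest first; the last entry is the
// current hour. factor is how many times the trailing mean counts as a spike,
// and minCount suppresses alerts on tiny absolute numbers.
func Spike(hourly []int, factor float64, minCount int) bool {
	if len(hourly) < 2 {
		return false // no history to compare against
	}
	current := hourly[len(hourly)-1]
	history := hourly[:len(hourly)-1]
	sum := 0
	for _, n := range history {
		sum += n
	}
	baseline := float64(sum) / float64(len(history))
	return current >= minCount && float64(current) > factor*baseline
}

func main() {
	fmt.Println(Spike([]int{2, 3, 2, 3, 12}, 3, 5)) // true: 12 against a mean of 2.5
	fmt.Println(Spike([]int{2, 3, 2, 3, 4}, 3, 5))  // false: 4 is under the minimum count
}
```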
Win rate tells you how effective your representment process is. Industry average hovers around 30-40%. If you're below that, your evidence collection or templates need work. If you're consistently above 60%, you're doing well — but also check whether you're being too selective about which disputes to fight. You might be leaving money on the table by auto-accepting disputes you could win.
Time-to-respond measures how quickly your system goes from receiving a dispute notification to having a complete evidence package ready. Under 4 hours is good. Under 1 hour is excellent. This metric directly impacts your win rate because faster response times mean more time for human review of edge cases.
Cost per dispute is the total cost of handling a chargeback — network fees, processing costs, and the engineering/ops time spent. This number determines your auto-accept threshold. If it costs you $20 to fight a dispute and the transaction was $18, the math doesn't work regardless of your win probability.
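The arithmetic behind that statement is an expected-value comparison: fighting recovers the disputed amount with some probability and always costs the handling cost. A small sketch, with hypothetical function names and amounts in minor units:

```go
package main

import "fmt"

// ExpectedNet is the expected gain from fighting, in minor units:
// win probability times the recoverable amount, minus the cost to fight.
func ExpectedNet(amountMinor, costMinor int64, winProbability float64) float64 {
	return winProbability*float64(amountMinor) - float64(costMinor)
}

// BreakEvenProbability is the win probability needed for fighting to pay off.
// A value above 1 means fighting can never pay off at this amount.
func BreakEvenProbability(amountMinor, costMinor int64) float64 {
	if amountMinor <= 0 {
		return 2 // sentinel above 1: nothing to recover
	}
	return float64(costMinor) / float64(amountMinor)
}

func main() {
	// The $18 transaction with a $20 handling cost from the text:
	// even a guaranteed win loses $2.
	fmt.Printf("%.2f\n", ExpectedNet(1800, 2000, 1.0)/100)   // -2.00
	fmt.Printf("%.2f\n", BreakEvenProbability(1800, 2000))   // 1.11
	// A $100 transaction at the same cost needs only a 20% win probability.
	fmt.Printf("%.2f\n", BreakEvenProbability(10000, 2000))  // 0.20
	fmt.Printf("%.2f\n", ExpectedNet(10000, 2000, 0.4)/100)  // 20.00
}
```

The break-even probability is a useful input when tuning the auto-accept threshold, because it shows the minimum win rate a dispute of a given size has to clear.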
Putting It All Together
The architecture I've described isn't theoretical — it's the pattern I've seen work across multiple payment platforms handling thousands of disputes per month. The key principles are: normalize early, route by reason code, collect evidence in parallel, automate representment with templates, and measure everything.
Start with the ingestion pipeline and the state machine. Get disputes flowing into your system reliably before you worry about automation. Then build evidence collection one source at a time, starting with whatever data source covers your most common reason codes. Automated representment comes last — you need enough historical data to know what works before you can template it effectively.
The merchants and platforms that handle chargebacks well don't just save money on dispute fees. They build institutional knowledge about why disputes happen, which feeds back into fraud prevention, product quality, and customer experience. The chargeback management system becomes a feedback loop that makes the entire payment operation better.
Disclaimer: This article reflects the author's personal experience and opinions. Product names, logos, and brands are property of their respective owners. Pricing and features mentioned are subject to change — always verify with official documentation.