April 16, 2026

Automating PCI DSS Evidence Collection with CI/CD — How We Cut Audit Prep from 6 Weeks to 3 Days

Every year, our team spent six painful weeks gathering screenshots, exporting logs, and chasing down proof that we were actually doing what our security policies said. Last year, we finally automated most of it. Here's exactly how we did it.

The Annual Compliance Fire Drill

If you've ever been through a PCI DSS audit at a payment company, you know the drill. About two months before the QSA shows up, someone sends a spreadsheet with 300+ line items. Each row needs evidence — a screenshot, a log export, a policy document, a config dump. And every team owns a different slice.

Our first few audits looked like this: the infrastructure team scrambling to export CloudTrail logs, developers digging through Jira to prove we did code reviews, the security team manually running Nessus scans and saving PDFs. Half the evidence was stale by the time we compiled it. The QSA would ask for something slightly different than what we'd prepared, and we'd burn another week pulling fresh exports.

The worst part wasn't the work itself — it was the context switching. Engineers who should have been shipping features were instead taking screenshots of IAM policies and writing explanations for why a particular firewall rule existed.

The turning point came when our QSA told us: "I don't care how you produce the evidence, as long as it's current, complete, and I can verify its integrity." That one sentence changed everything.

The Pipeline That Generates Its Own Proof

The core idea is simple: if your CI/CD pipeline already runs security checks, why not capture the output as audit evidence? Every scan, every test, every policy check already produces structured output. We just needed to collect it, timestamp it, and store it somewhere tamper-proof.

Here's what our evidence-generating pipeline looks like end to end:

Code Push (Git commit signed) → SAST Scan (SonarQube report) → Container Scan (Trivy CVE report) → DAST Scan (OWASP ZAP report) → Evidence Store (S3 + Object Lock)

Each stage produces a JSON or PDF artifact. Those artifacts get SHA-256 hashed, timestamped, and pushed to an S3 bucket with object lock enabled. The QSA gets a read-only dashboard that links each PCI requirement to its corresponding evidence artifacts.
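To make the "hashed and timestamped" step concrete, here is a minimal Python sketch of a manifest entry for one artifact. It is an illustration rather than the production code: the function name, the field names, and the requirement label are assumptions, and the timestamp simply reuses the compact UTC format from the workflow further down.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def build_manifest_entry(artifact: Path, requirement: str) -> dict:
    """Describe one evidence artifact: name, SHA-256 digest, and UTC capture time."""
    return {
        "artifact": artifact.name,
        "requirement": requirement,  # hypothetical label such as "Req 6"
        "sha256": hashlib.sha256(artifact.read_bytes()).hexdigest(),
        "collected_at": datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ"),
    }


# Demo with a throwaway file standing in for a real scan report.
with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / "trivy-report.json"
    report.write_text(json.dumps({"Results": []}))
    print(json.dumps(build_manifest_entry(report, "Req 6"), indent=2))
```

Recording the digest at collection time is what later lets someone else recompute it and confirm the artifact is unchanged.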

Mapping Requirements to Automated Checks

The first real work was sitting down with our QSA's evidence request list and figuring out which PCI DSS requirements could be satisfied by automated tooling. Not everything can be: you still need written policies and physical security documentation. But a surprising number of the technical requirements map cleanly to things a CI/CD pipeline already does or can easily do.

PCI DSS Req | What It Covers                               | Automated Tool(s)       | Status
------------|----------------------------------------------|-------------------------|----------
Req 2       | No default passwords, hardened configs       | Trivy, Terraform        | Automated
Req 6       | Secure SDLC, code reviews, vuln management   | SonarQube, ZAP, Trivy   | Automated
Req 8       | Access controls, MFA, unique IDs             | AWS Config, IAM Analyzer | Automated
Req 10      | Logging and monitoring                       | Falco, CloudTrail       | Automated
Req 11      | Vulnerability scanning, penetration testing  | Trivy, ZAP              | Semi-auto
Req 1       | Network segmentation, firewall rules         | Terraform, AWS Config   | Automated

Req 11 is "semi-auto" because while we run automated vulnerability scans continuously, the annual penetration test still requires a human. But even there, we automated the scoping document and the remediation tracking.
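The mapping is easier to keep honest when it lives as data next to the pipeline instead of in a spreadsheet. The sketch below transcribes the rows from the table above into Python; the class name, field names, and helper functions are illustrative assumptions, not part of any tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceSource:
    requirement: str
    covers: str
    tools: tuple
    status: str  # "Automated" or "Semi-auto"


# Rows copied from the requirements table, in the same order.
EVIDENCE_MAP = [
    EvidenceSource("Req 2", "No default passwords, hardened configs",
                   ("Trivy", "Terraform"), "Automated"),
    EvidenceSource("Req 6", "Secure SDLC, code reviews, vuln management",
                   ("SonarQube", "ZAP", "Trivy"), "Automated"),
    EvidenceSource("Req 8", "Access controls, MFA, unique IDs",
                   ("AWS Config", "IAM Analyzer"), "Automated"),
    EvidenceSource("Req 10", "Logging and monitoring",
                   ("Falco", "CloudTrail"), "Automated"),
    EvidenceSource("Req 11", "Vulnerability scanning, penetration testing",
                   ("Trivy", "ZAP"), "Semi-auto"),
    EvidenceSource("Req 1", "Network segmentation, firewall rules",
                   ("Terraform", "AWS Config"), "Automated"),
]


def requirements_for_tool(tool: str) -> list:
    """Which requirements lose evidence if this tool's output goes missing?"""
    return [s.requirement for s in EVIDENCE_MAP if tool in s.tools]


def needs_manual_work() -> list:
    """Requirements that still need a human in the loop."""
    return [s.requirement for s in EVIDENCE_MAP if s.status != "Automated"]


print(requirements_for_tool("Trivy"))
print(needs_manual_work())
```

A lookup like `requirements_for_tool` is handy when a scanner is swapped out or breaks, because it shows at a glance which parts of the evidence package are affected.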

The Toolchain in Practice

We didn't adopt all of these at once. It took about three months to get the full pipeline running. Here's what each tool handles and why we picked it:

Trivy runs on every container image build. It catches known CVEs in OS packages and application dependencies, and it can also scan Terraform files for misconfigurations. We gate deployments on critical/high findings — no exceptions. The JSON output becomes evidence for Req 6.5 (address common coding vulnerabilities) and Req 11.3 (vulnerability scanning).
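For readers who want to see what gating on critical/high findings can look like, here is a small Python sketch that reads a Trivy JSON report and lists the findings that would block a deploy. It assumes the report layout Trivy emits with `--format json` (a top-level `Results` list whose entries may carry a `Vulnerabilities` list); the function name is hypothetical and the CVE IDs in the sample are made up.

```python
BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}


def blocking_findings(report: dict) -> list:
    """Return (target, vulnerability ID, severity) for each critical/high finding."""
    findings = []
    for result in report.get("Results") or []:
        # Trivy omits or nulls "Vulnerabilities" when a target is clean.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in BLOCKING_SEVERITIES:
                findings.append((
                    result.get("Target", "?"),
                    vuln.get("VulnerabilityID", "?"),
                    vuln["Severity"],
                ))
    return findings


# Made-up sample in the shape of a Trivy report.
sample_report = {
    "Results": [
        {
            "Target": "app:latest (debian 12)",
            "Vulnerabilities": [
                {"VulnerabilityID": "CVE-0000-0001", "Severity": "CRITICAL"},
                {"VulnerabilityID": "CVE-0000-0002", "Severity": "LOW"},
            ],
        },
        {"Target": "requirements.txt"},
    ]
}

for target, vuln_id, severity in blocking_findings(sample_report):
    print(f"BLOCK {severity} {vuln_id} in {target}")
```

Trivy can also enforce the gate by itself with `--severity CRITICAL,HIGH --exit-code 1`. Parsing the report separately is mainly useful when the same JSON has to double as the evidence artifact.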

SonarQube handles static analysis. It flags code quality issues and security hotspots, and tracks whether code reviews actually happened via branch analysis. The quality gate report maps directly to Req 6.3 (secure software development).

OWASP ZAP runs as a DAST scan against our staging environment after every deployment. It catches the runtime stuff that SAST misses — injection flaws, broken auth, security misconfigurations in the running application. We run the baseline scan on every PR and the full scan nightly.

Falco monitors our Kubernetes clusters at runtime. It detects unexpected process execution, privilege escalation attempts, and suspicious file access patterns. The alert logs feed directly into our Req 10 (logging and monitoring) evidence package.
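Raw alert logs are hard for an auditor to skim, so a short summary alongside them can help. The sketch below counts Falco alerts per priority and rule, assuming Falco is configured for JSON output with one event per line and the usual `priority` and `rule` fields. The function name and the sample events are illustrative.

```python
import json
from collections import Counter


def summarize_falco_alerts(lines) -> Counter:
    """Count alerts per (priority, rule) from Falco JSON-lines output."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        event = json.loads(line)
        key = (event.get("priority", "unknown"), event.get("rule", "unknown"))
        counts[key] += 1
    return counts


# Made-up events in the shape of Falco's JSON output.
sample_lines = [
    '{"time": "2026-01-05T02:00:00Z", "priority": "Notice", "rule": "Terminal shell in container", "output": "..."}',
    '{"time": "2026-01-05T02:05:00Z", "priority": "Notice", "rule": "Terminal shell in container", "output": "..."}',
    "",
    '{"time": "2026-01-05T03:00:00Z", "priority": "Warning", "rule": "Read sensitive file untrusted", "output": "..."}',
]

for (priority, rule), count in sorted(summarize_falco_alerts(sample_lines).items()):
    print(f"{priority:8} {count:3} {rule}")
```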

GitHub Actions: The Glue

Here's a simplified version of our evidence collection workflow. The real one has more error handling, but this captures the structure:

name: pci-evidence-collection
on:
  push:
    branches: [main]
  schedule:
    - cron: '0 2 * * 1'  # Weekly full scan

jobs:
  evidence:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Container vulnerability scan
        run: |
          trivy image --format json \
            --output trivy-report.json \
            ${{ env.IMAGE_TAG }}

      - name: SAST analysis
        run: |
          sonar-scanner \
            -Dsonar.projectKey=${{ env.PROJECT }} \
            -Dsonar.sources=./src

      - name: DAST scan (staging)
        run: |
          zap-baseline.py -t ${{ env.STAGING_URL }} \
            -J zap-report.json

      - name: Package and sign evidence
        run: |
          TIMESTAMP=$(date -u +%Y%m%dT%H%M%SZ)
          # Each run step starts a fresh shell, so export TIMESTAMP for later steps
          echo "TIMESTAMP=${TIMESTAMP}" >> "$GITHUB_ENV"
          sha256sum *-report.json > checksums.txt
          tar czf "evidence-${TIMESTAMP}.tar.gz" \
            *-report.json checksums.txt

      - name: Upload to immutable store
        run: |
          aws s3 cp "evidence-${TIMESTAMP}.tar.gz" \
            "s3://pci-evidence-vault/${TIMESTAMP}/" \
            --storage-class GLACIER_IR
        env:
          AWS_REGION: ap-southeast-1

The schedule trigger is important. PCI DSS requires evidence of regular scanning, not just on-deploy scanning. The weekly cron job ensures we have continuous evidence even during quiet periods with no deployments.
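It is worth checking that the weekly cadence actually held, since a silently failing cron job leaves exactly the kind of gap an auditor will ask about. Here is a minimal Python sketch that looks for gaps between evidence timestamps. It assumes the timestamps are the `TIMESTAMP` prefixes used as S3 key prefixes in the workflow above, and the 8-day threshold is an arbitrary allowance for delayed scheduled runs.

```python
from datetime import datetime, timedelta, timezone

STAMP_FORMAT = "%Y%m%dT%H%M%SZ"  # same layout as the workflow's TIMESTAMP


def parse_stamp(stamp: str) -> datetime:
    """Parse a compact UTC timestamp such as 20260105T020000Z."""
    return datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)


def find_gaps(stamps, max_gap=timedelta(days=8)) -> list:
    """Return consecutive (earlier, later) pairs spaced further apart than max_gap."""
    ordered = sorted(stamps, key=parse_stamp)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if parse_stamp(later) - parse_stamp(earlier) > max_gap
    ]


# Three Monday runs, with the run of 19 January missing.
stamps = ["20260105T020000Z", "20260126T020000Z", "20260112T020000Z"]
for earlier, later in find_gaps(stamps):
    print(f"No evidence between {earlier} and {later}")
```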

Audit prep time: 6 wks → 3 days | Evidence automated: 94% | PCI requirements covered: 12

Immutable Storage: The Part Auditors Actually Care About

Generating evidence is only half the battle. The QSA needs to trust that the evidence hasn't been tampered with after the fact. This is where S3 Object Lock comes in.

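We configured our evidence bucket with governance-mode object lock and a 400-day retention period. Once an evidence artifact lands in the bucket, that object version can't be overwritten or deleted within that window unless the caller explicitly holds the s3:BypassGovernanceRetention permission and sends the bypass header. (If you need a lock that nobody can lift, not even account admins or the root user, that's what compliance mode is for.) Each object also gets a SHA-256 checksum stored as metadata, and we maintain a separate append-only log of all evidence submissions.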
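As a rough illustration of a locked upload, the sketch below builds the keyword arguments for boto3's S3 `put_object` call with a per-object retention date. It only constructs the arguments, so it runs without AWS credentials. The function name and the `sha256` metadata key are assumptions, the bucket must already have Object Lock enabled, and `GOVERNANCE` is shown to match the text (`COMPLIANCE` is the other valid mode).

```python
import hashlib
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION_DAYS = 400  # retention period described in the text


def locked_put_args(bucket: str, key: str, body: bytes,
                    now: Optional[datetime] = None) -> dict:
    """Build put_object arguments that lock the object for RETENTION_DAYS."""
    now = now or datetime.now(timezone.utc)
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ChecksumAlgorithm": "SHA256",  # S3 verifies the upload against a SHA-256 checksum
        "ObjectLockMode": "GOVERNANCE",
        "ObjectLockRetainUntilDate": now + timedelta(days=RETENTION_DAYS),
        "Metadata": {"sha256": hashlib.sha256(body).hexdigest()},
    }


args = locked_put_args(
    "pci-evidence-vault",
    "20260105T020000Z/evidence.tar.gz",  # hypothetical key
    b"example archive bytes",
    now=datetime(2026, 1, 1, tzinfo=timezone.utc),
)
print(args["ObjectLockRetainUntilDate"].isoformat())

# With credentials configured, the upload itself would be:
#   import boto3
#   boto3.client("s3").put_object(**args)
```

Setting a default retention on the bucket achieves the same thing without per-object arguments. Passing them explicitly just makes the retention visible in the upload code.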

Pro tip: use S3 Glacier Instant Retrieval for evidence older than 90 days. Our storage costs dropped 68% after implementing lifecycle policies, and retrieval is still fast enough for audit requests.

The QSA can independently verify any artifact by checking the hash against our evidence log. We built a simple read-only web UI that lets them browse evidence by PCI requirement, date range, or artifact type. It saved hours of back-and-forth during the actual audit.
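The verification side is simple enough to sketch in a few lines of Python. This version assumes an unpacked evidence archive containing the `checksums.txt` written by `sha256sum` in the workflow above (lines of the form `<hex digest>  <file name>`). The function names are hypothetical, and the demo files are stand-ins for real reports.

```python
import hashlib
import tempfile
from pathlib import Path


def parse_checksums(text: str) -> dict:
    """Map file name -> expected digest from sha256sum-style output."""
    expected = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, name = line.split(maxsplit=1)
        expected[name.lstrip("*")] = digest.lower()  # "*" marks binary mode
    return expected


def verify_directory(directory: Path, checksums_file: str = "checksums.txt") -> dict:
    """Return file name -> True if the file exists and its SHA-256 matches."""
    expected = parse_checksums((directory / checksums_file).read_text())
    results = {}
    for name, digest in expected.items():
        path = directory / name
        actual = hashlib.sha256(path.read_bytes()).hexdigest() if path.is_file() else None
        results[name] = actual == digest
    return results


# Demo: one intact report and one modified after the checksums were written.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "trivy-report.json").write_text('{"Results": []}')
    (root / "zap-report.json").write_text('{"site": []}')
    lines = [
        f"{hashlib.sha256((root / name).read_bytes()).hexdigest()}  {name}"
        for name in ("trivy-report.json", "zap-report.json")
    ]
    (root / "checksums.txt").write_text("\n".join(lines) + "\n")
    (root / "zap-report.json").write_text('{"site": ["edited"]}')  # simulate tampering
    demo_results = verify_directory(root)
    print(demo_results)
```

Note that a checksum file stored next to the artifacts only proves internal consistency. The comparison against an independently stored evidence log is what makes the check meaningful.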

How the QSA Responded

I won't pretend the first automated audit was seamless. Our QSA had questions about the pipeline itself — how do we ensure the CI/CD system isn't compromised? Who has access to modify the workflow files? We ended up adding the GitHub audit log and branch protection evidence to the package too.

But once they understood the system, the feedback was overwhelmingly positive. The auditor told us it was the most organized evidence package they'd reviewed that quarter.

The audit itself took two days instead of the usual five. Most of that time was spent on the non-technical requirements (physical security, HR policies, vendor management) that we hadn't automated yet.

What I'd Do Differently

Start with the evidence mapping, not the tooling. We wasted a month setting up SonarQube before realizing we needed to restructure how we captured its output. If I did it again, I'd start by listing every evidence artifact the QSA needs, then work backward to figure out which tool produces it and in what format.

Also, get your QSA involved early. We showed them the pipeline design before building it, and their feedback saved us from a few dead ends. Not every QSA will be open to this, but most are tired of reviewing screenshot folders too.

Disclaimer: This article reflects the author's personal experience and opinions. Product names, logos, and brands are property of their respective owners. Pricing and features mentioned are subject to change — always verify with official documentation.