Technical deep-dive

How PullLight reviews pull requests

A 7-step walkthrough of the pipeline that catches real CVEs before they ship — and nothing posts to your PR without your sign-off.

TL;DR

The pipeline

Seven steps from GitHub webhook to a finding in your /reviews queue.

PR opened / updated
       │
       ▼
├─ GitHub webhook fires
       │
       ▼
1  Fetch diff + surrounding lines per hunk
       │
       ▼
2  Fetch top-level config files (automatically, no setup required)
       │    package.json, requirements.txt, go.mod, Cargo.toml, Gemfile, pom.xml, build.gradle, Dockerfile, .eslintrc, tsconfig.json
       │
       ▼
3  Prompt construction: diff + context + 9-bucket CWE taxonomy prior
       │
       ▼
4  Model inference (claude-sonnet-4-5)
       │
       ▼
5  Post-processing: dedup → severity → line anchors → CVSS estimate
       │
       ▼
6  Finding lands in /reviews, waiting for your approval
       │
       ▼
7  You approve ──► GitHub API posts review comment to the PR

Step 01

Webhook fires

PullLight receives a GitHub pull_request webhook event whenever a PR is opened or updated (the opened and synchronize actions). There is no polling and no background job scanning branches.
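A minimal sketch of the event filter this step implies, not PullLight's actual handler. The function name shouldReview and the REVIEW_ACTIONS set are hypothetical; the only facts assumed are that GitHub names the event pull_request and uses the actions opened and synchronize for new and updated PRs.

```javascript
// Hypothetical filter: decide whether a webhook delivery should start a review.
// `eventName` is the value of the X-GitHub-Event header; `payload` is the parsed JSON body.
const REVIEW_ACTIONS = new Set(["opened", "synchronize"]);

function shouldReview(eventName, payload) {
  // Ignore every event type except pull_request (pushes, issues, and so on).
  if (eventName !== "pull_request") return false;
  // Ignore pull_request actions that do not change the diff (labeled, closed, ...).
  return REVIEW_ACTIONS.has(payload.action);
}
```

Anything that returns false is acknowledged and ignored, which is what keeps the pipeline purely event-driven.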

Step 02

Diff + context fetch

PullLight calls GET /repos/{owner}/{repo}/pulls/{pull_number}/files. For each changed file, we read the full diff. For each hunk, we fetch N lines of surrounding context (default: 20 above/below). This gives the model enough context to trace variable origins without loading the entire file.
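One way to turn hunks into context ranges is to parse the unified-diff hunk headers in each file's patch text. The sketch below is an illustration under that assumption; contextWindows and its return shape are hypothetical names, and only the default of 20 lines comes from the description above.

```javascript
// Matches a unified-diff hunk header such as "@@ -10,3 +12,5 @@" and captures
// the start line and optional length on the new (post-change) side.
const HUNK_HEADER = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/;

// Returns one { from, to } line range per hunk, widened by `contextLines` on each side.
function contextWindows(patch, contextLines = 20) {
  const windows = [];
  for (const line of patch.split("\n")) {
    const m = HUNK_HEADER.exec(line);
    if (!m) continue; // not a hunk header: an added, removed, or context line
    const start = Number(m[1]);
    // The length is omitted when the hunk covers exactly one line.
    const length = m[2] === undefined ? 1 : Number(m[2]);
    const end = start + Math.max(length, 1) - 1;
    windows.push({
      from: Math.max(1, start - contextLines), // never go above line 1
      to: end + contextLines, // the caller should clamp this to the file length
    });
  }
  return windows;
}
```

Fetching only these ranges, rather than whole files, is what keeps the payload small while still showing where a variable was assigned a few lines earlier.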

Step 03

Config files for convention awareness

We fetch whichever of these repo-root files exist: package.json, requirements.txt, go.mod, Cargo.toml, Gemfile, pom.xml, build.gradle, Dockerfile, .eslintrc, tsconfig.json. They surface naming conventions, dependency versions, and lint rules, so the model can flag version-specific vulnerability patterns (e.g., "this requires express >= 4.19.2 because CVE-2024-29041 was fixed in that version").
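The "whatever exists" selection can be sketched as a filter over a listing of the repository root. This is an illustration only: pickConfigFiles is a hypothetical helper, and the entry shape ({ name, type }) is assumed to resemble a directory listing from GitHub's contents API.

```javascript
// The root-level files named above, in the order they are listed.
const CONFIG_FILES = [
  "package.json", "requirements.txt", "go.mod", "Cargo.toml", "Gemfile",
  "pom.xml", "build.gradle", "Dockerfile", ".eslintrc", "tsconfig.json",
];

// `rootEntries` is assumed to look like [{ name: "package.json", type: "file" }, ...].
function pickConfigFiles(rootEntries) {
  // Only plain files count; a directory that happens to share a name is skipped.
  const present = new Set(
    rootEntries.filter((e) => e.type === "file").map((e) => e.name)
  );
  return CONFIG_FILES.filter((name) => present.has(name));
}
```

A repository with only package.json and a Dockerfile yields just those two names, so no configuration is needed to tell the pipeline which ecosystem it is looking at.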

Step 04

Prompt construction

The prompt includes: (a) the diff with context, (b) the bug-class taxonomy (9 CWE buckets with descriptions and example sinks), (c) instructions to output structured JSON with severity, category, file:line, description, and suggested fix. The taxonomy prior acts like a security-engineering cheat sheet — it biases the model toward known vulnerability patterns rather than generic code style feedback.
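To make the three parts concrete, here is a rough sketch of how such a prompt could be assembled. The wording, section headings, and the two-entry TAXONOMY excerpt are all hypothetical; the real prompt and the full nine-bucket taxonomy are not reproduced here.

```javascript
// Hypothetical two-entry excerpt; the real taxonomy has nine buckets.
const TAXONOMY = [
  { cwe: "CWE-22", name: "Path Traversal", sinks: ["fs.writeFile(filename)", "sendFile(path)"] },
  { cwe: "CWE-89", name: "SQL Injection", sinks: ["template SQL with string interpolation"] },
];

// Fields the model is asked to return for each finding.
const OUTPUT_FIELDS = ["severity", "category", "file", "line", "description", "fix"];

function buildPrompt(diffWithContext, taxonomy = TAXONOMY) {
  // One line per bucket: identifier, name, and example sinks.
  const taxonomyText = taxonomy
    .map((b) => `- ${b.cwe} ${b.name}: example sinks: ${b.sinks.join(", ")}`)
    .join("\n");
  // (a) diff with context, (b) taxonomy prior, (c) output instructions.
  return [
    "## Diff with surrounding context",
    diffWithContext,
    "## Bug-class taxonomy",
    taxonomyText,
    "## Output format",
    `Respond with a JSON array. Each finding has the fields: ${OUTPUT_FIELDS.join(", ")}.`,
  ].join("\n\n");
}
```

Placing the taxonomy between the diff and the output instructions is a design choice of this sketch, not a documented property of the product.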

Step 05

Model inference

claude-sonnet-4-5 via Polsia AI proxy. Payload is the diff + context + config files + taxonomy prompt. No full repo clone. No git history. No .env files or credentials.

Step 06

Post-processing

Raw model output goes through: dedup (remove findings already commented on by other bots), severity classification (critical/high/medium/low based on CVSS-equivalent scoring), line-anchor resolution (map model file references to actual line numbers in the PR diff), CVSS estimation.
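As an illustration of the severity step, the sketch below maps a CVSS-style score to the four labels using the standard CVSS v3.x qualitative bands (9.0+ critical, 7.0+ high, 4.0+ medium). Whether PullLight uses exactly these cut-offs is an assumption, and severityFromCvss is a hypothetical name.

```javascript
// Maps a 0.0-10.0 CVSS-style score to one of the four severity labels.
// Assumption: the standard CVSS v3.x bands, with scores below 4.0
// (including 0.0, which CVSS itself calls "None") folded into "low".
function severityFromCvss(score) {
  if (typeof score !== "number" || Number.isNaN(score) || score < 0 || score > 10) {
    throw new RangeError(`CVSS score out of range: ${score}`);
  }
  if (score >= 9.0) return "critical";
  if (score >= 7.0) return "high";
  if (score >= 4.0) return "medium";
  return "low";
}
```

With these bands, an estimated score of 9.2 is classified as critical.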

Step 07

Human approval gate

All findings land in /reviews. Nothing auto-posts. You approve individual comments, dismiss false positives, or skip the review entirely. Only approved comments hit the PR via GitHub's POST /repos/{owner}/{repo}/pulls/{pull_number}/reviews API.
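The gate can be pictured as a function that builds the request body only from findings you approved. This is a sketch under assumptions: buildReviewPayload and the status field are hypothetical, and the field names (event, comments, path, line, side, body) follow our reading of GitHub's create-review endpoint, so check them against GitHub's documentation before relying on them.

```javascript
// Builds a request body for POST /repos/{owner}/{repo}/pulls/{pull_number}/reviews
// from the findings a human approved. Returns null when nothing was approved.
function buildReviewPayload(findings) {
  const approved = findings.filter((f) => f.status === "approved");
  if (approved.length === 0) return null; // nothing approved: make no API call
  return {
    event: "COMMENT", // comment only; never APPROVE or REQUEST_CHANGES
    comments: approved.map((f) => ({
      path: f.file,
      line: f.line,
      side: "RIGHT", // anchor on the new version of the file
      body: `**${f.severity.toUpperCase()} / ${f.cwe}**\n\n${f.description}`,
    })),
  };
}
```

Dismissed and pending findings never enter the payload, so skipping a review results in no request at all.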

What we actually look for

PullLight prompts the model with a 9-bucket CWE taxonomy. Every finding is tagged to one of these buckets. The model doesn't just look for "bad code"; it looks for specific attack primitives.

CWE | Bug Class | Example Sinks
CWE-78 | Remote Code Execution | eval(), new Function(), exec(), spawn(), deserialization
CWE-502 | Insecure Deserialization | pickle.load(), yaml.load() (without Loader=), fastjson
CWE-22 | Path Traversal | fs.writeFile(filename), sendFile(path), URL path join
CWE-1321 | Prototype Pollution | Object.assign({}, userInput), deep-merge without key validation
CWE-89 | SQL Injection | Template SQL with string interpolation, ORM raw queries
CWE-918 | Server-Side Request Forgery | fetch(url), axios.get(url) on user-supplied input
CWE-287 | Authentication Bypass | Middleware logic gaps, token expiry not checked, route ordering
CWE-94 | Server-Side Template Injection | render(template, ctx) with unsanitized user input
CWE-78 | OS Command Injection | child_process.exec(), shell=True subprocess calls
Browse the full CWE taxonomy →

How we keep noise down

Four-stage filtering pipeline. Each finding must pass all four stages before it reaches /reviews.

Stage 1 — Severity floor

Low-impact findings dropped

Findings below a minimum severity threshold (based on CVSS-equivalent impact scoring) are dropped before human presentation. Style preferences and variable naming suggestions don't make the cut.

Stage 2 — Confidence threshold

Low-confidence findings held

The model outputs a confidence score for each finding. Findings below a configurable floor are held back rather than surfaced. This keeps most hallucinated vulnerability claims out of your queue.
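Stages 1 and 2 can be sketched together as a single triage pass. The default thresholds below ("medium" and 0.7) are placeholders chosen for illustration, not PullLight's actual settings, and triage is a hypothetical name.

```javascript
const SEVERITY_RANK = { low: 0, medium: 1, high: 2, critical: 3 };

// Splits findings into surfaced and held lists and counts dropped ones.
// Each finding is assumed to carry `severity` (a label) and `confidence` (0 to 1).
function triage(findings, { minSeverity = "medium", minConfidence = 0.7 } = {}) {
  const surfaced = [];
  const held = [];
  let dropped = 0;
  for (const f of findings) {
    const rank = SEVERITY_RANK[f.severity];
    if (rank === undefined) throw new Error(`Unknown severity: ${f.severity}`);
    // Stage 1: anything below the severity floor is dropped outright.
    if (rank < SEVERITY_RANK[minSeverity]) {
      dropped++;
      continue;
    }
    // Stage 2: low-confidence findings are kept on hold, not shown.
    (f.confidence < minConfidence ? held : surfaced).push(f);
  }
  return { surfaced, held, dropped };
}
```

Keeping held findings in a separate list, instead of discarding them, matches the "held but not surfaced" behavior and leaves room to lower the floor later without re-running the model.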

Stage 3 — Dedup

No duplicate noise

Before posting, PullLight checks the existing PR comment thread (via GitHub API) for findings already commented on by CodeRabbit, Copilot, Sonar, Semgrep, etc. If a comment with matching file:line + similar description already exists, the finding is suppressed.
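How "similar description" is measured is not specified here. The sketch below assumes a simple token-overlap (Jaccard) score with a hypothetical 0.5 threshold; isDuplicate, jaccard, and the comment shape ({ path, line, body }) are illustrative names, and the real matcher may work quite differently.

```javascript
// Lowercased word tokens of a text, as a set.
function tokens(text) {
  return new Set(text.toLowerCase().match(/[a-z0-9_]+/g) ?? []);
}

// Jaccard similarity: shared tokens divided by all distinct tokens.
function jaccard(a, b) {
  const ta = tokens(a);
  const tb = tokens(b);
  if (ta.size === 0 && tb.size === 0) return 1;
  let shared = 0;
  for (const t of ta) if (tb.has(t)) shared++;
  return shared / (ta.size + tb.size - shared);
}

// A finding is a duplicate when an existing comment sits on the same
// file and line and its text overlaps enough with the finding's description.
function isDuplicate(finding, existingComments, threshold = 0.5) {
  return existingComments.some(
    (c) =>
      c.path === finding.file &&
      c.line === finding.line &&
      jaccard(c.body, finding.description) >= threshold
  );
}
```

Requiring both the location match and the text match avoids suppressing a security finding just because another bot left an unrelated style comment on the same line.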

Stage 4 — Suppression rules

Generated code, deps, and tests excluded

Findings are dropped if the code matches patterns like *.min.js, dist/, node_modules/, vendor/, test/, *.lock, __generated__/. Test fixtures, vendored deps, and compiled output never surface as findings.
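A minimal sketch of these suppression rules as a path check. It assumes the directory patterns match at any depth and the wildcard patterns match on the file name's suffix; isSuppressed is a hypothetical helper, not the product's matcher.

```javascript
// Directory names and file suffixes taken from the patterns listed above.
const SUPPRESSED_DIRS = new Set(["dist", "node_modules", "vendor", "test", "__generated__"]);
const SUPPRESSED_SUFFIXES = [".min.js", ".lock"];

// `filePath` is a repo-relative path with forward slashes, as in a PR diff.
function isSuppressed(filePath) {
  const segments = filePath.split("/");
  const fileName = segments.pop(); // what remains in `segments` are directories
  // Any suppressed directory anywhere in the path excludes the file.
  if (segments.some((dir) => SUPPRESSED_DIRS.has(dir))) return true;
  // Otherwise check the file name against *.min.js and *.lock.
  return SUPPRESSED_SUFFIXES.some((suffix) => fileName.endsWith(suffix));
}
```

Matching whole path segments, rather than substrings, means src/testing.js is still reviewed while src/test/helpers.js is not.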

Result: findings that reach /reviews are actionable, novel, and non-duplicative. You get fewer comments. They mean more.

A worked example: jsPDF path traversal

Here's exactly what PullLight saw and produced for a real CVE — from PR diff to /reviews comment.

The PR diff PullLight analyzed

// src/modules/fileloading.js: vulnerable route handler
import { jsPDF } from 'jspdf';

// User uploads a filename from client, server-side PDF generation
app.post('/api/generate-report', (req, res) => {
  const filename = req.body.filename; // <-- User-supplied, no validation
  const doc = new jsPDF();
  doc.text('Report generated: ' + new Date().toISOString(), 10, 20);
  doc.save(filename); // <-- Pass-through to writeFile, no path sanitization
  res.json({ status: 'ok' });
});

Context PullLight fetched

PullLight fetched three things: the full file contents; package.json, which pins "jspdf": "^3.0.4" (a vulnerable version range); and the surrounding writeFile() function definition, which shows that doc.save() calls writeFile directly without path resolution.

What PullLight flagged (simplified prompt taxonomy excerpt)

CWE-22 Path Traversal: File paths constructed from user input without path.resolve() validation. Attacker supplies "../" sequences to escape the intended directory. Example vulnerable pattern: fs.writeFile(userControlledFilename).

Raw model output (structured JSON)

{
  "severity": "critical",
  "cwe": "CWE-22",
  "file": "src/modules/fileloading.js",
  "line": 48,
  "description": "The `filename` parameter from `req.body.filename` is passed directly to `doc.save()` without any path sanitization. An attacker can supply `../../app/config/evil.js` to write files outside the intended output directory. In server-side Node.js deployments, this enables arbitrary file write → RCE. CVSS 9.2.",
  "fix": "Use path.resolve() and validate the resolved path stays within process.cwd(). Reject filenames with '..' or absolute path components."
}

After all four filtering stages

Severity: Critical → passes severity floor.
Confidence: High (strong model flag on unvalidated user input → file write) → passes confidence threshold.
Dedup: no existing comment at this file:line → not suppressed.
Suppression rules: not in node_modules/, test/, or dist/ → passes.

The final /reviews item

🔴 CRITICAL / CWE-22 src/modules/fileloading.js line ~48

The filename parameter from req.body.filename is passed directly to doc.save() without any path sanitization. An attacker can supply ../../app/config/evil.js to write files outside the intended output directory. In server-side Node.js deployments, this enables arbitrary file write → RCE.

Suggested fix:

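const pathModule = require('path');
const baseDir = process.cwd();
// Resolve the user-supplied name against the allowed base directory
const resolvedPath = pathModule.resolve(baseDir, filename);
// Compare against baseDir plus a separator so "/app-evil" does not pass as a child of "/app"
if (!resolvedPath.startsWith(baseDir + pathModule.sep)) {
  throw new Error('Invalid filename: path traversal detected');
}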

After you approve in /reviews, this posts as a GitHub PR review comment.

See the full jsPDF case study →

What we don't do

Explicitly. Senior engineers always ask.

What we DON'T do | Why it matters
We don't train on your code | API access only. Anthropic's commercial terms contractually prohibit training on customer data. Your diffs are never stored beyond the current session and are never used to fine-tune any model.
We don't auto-merge | PullLight posts comments only. The Check Run reports a risk score (0–100), but its conclusion stays neutral. You decide what to do with findings.
We don't approve PRs | Same as above: comments only, no approval state changes on GitHub.
We don't post style nits | No formatting, no lint complaints, no variable naming feedback unless it's a security risk (e.g., eval(userVariable)).
We don't read your full repo | Diff plus on-demand file reads for referenced functions only. No clone, no branch scan, no git history traversal.
We don't post without your approval | Every finding sits in /reviews until you explicitly approve. No auto-publish.

Why PullLight beats the competition on the benchmark

The benchmark: 8 real CVEs, all publicly confirmed and fixed before PullLight's testing. PullLight caught all 8. Every competitor (CodeRabbit, Greptile, Copilot PR Review, Qodo) caught 0.

Why general-purpose reviewers miss these bugs

CodeRabbit, Copilot PR Review, and Qodo are built for general code quality: readability, style, correctness. They are not purpose-built for security, and they lack the bug-class taxonomy prior that makes PullLight effective on known vulnerability patterns.

When CodeRabbit reviews a diff like doc.save(req.body.filename), it sees a normal API call. The vulnerability lives in the origin of filename (user-supplied HTTP body) — not in the save call itself. A generic reviewer has no model of "user input → file write = path traversal." PullLight does, because the taxonomy prior encodes exactly this pattern.

The three advantages that close the gap

1

Bug-class-aware prompting

The taxonomy prior biases the model toward known vulnerability patterns. It's the difference between "here's a PDF library call" and "here's a file write from user input — check for path traversal."

2

Larger context window

PullLight fetches surrounding code and config files — not just the diff hunk. Knowing the version range was jspdf: "^3.0.4" (vulnerable) is part of the context. General-purpose reviewers miss this.

3

Post-filter dedup

Many competitors post confident but wrong findings, cluttering the PR thread. PullLight's four-stage filter ensures that only novel, high-confidence, high-severity findings surface, so you don't learn to ignore the bot.

See the full benchmark (8 CVEs, auditable) →