Which AI model does PullLight use?

PullLight uses Claude Sonnet 4.5 (model ID claude-sonnet-4-5) via the Polsia AI proxy. This is commercial API access, not a consumer product. Anthropic's API terms explicitly prohibit using customer inputs for model training, so your code is never used to improve any model.

Does PullLight train on my code?

No. PullLight uses Anthropic's API — commercial terms that contractually prohibit training on customer data. Your PR diffs are never stored beyond the current analysis session and are never used to fine-tune or improve any model. See /trust for the full data handling breakdown.

How does PullLight reduce false positives?

PullLight runs all findings through a four-stage filter: severity floor (low-impact findings dropped), confidence threshold (low-confidence findings held), dedup against existing PR comments (no duplicate noise), and suppression rules (generated code, vendored deps, and test fixtures are excluded). Only findings that pass all four stages reach your /reviews queue.

Can PullLight auto-merge or approve PRs?

No. PullLight posts review comments only. It never approves, merges, or blocks PRs. The Check Run reports a risk score (0–100), but its conclusion stays neutral, so you decide what to do with the findings. All comments require explicit human approval in /reviews before anything posts to your PR.

How long does a review take?

Most PRs are reviewed within 30–90 seconds of the webhook firing. Complex PRs with large diffs (500+ file changes) may take up to 3 minutes. Findings land in /reviews immediately — you approve what you want posted, and approved comments post to the PR via GitHub API.

How PullLight Reviews PRs — The Technical Pipeline

The pipeline

Seven steps from GitHub webhook to a finding in your /reviews queue.

PR opened / updated
       │
       ▼
├─ GitHub webhook fires
       │
       ▼
1  Fetch diff + surrounding lines per hunk
       │
       ▼
2  Fetch top-level config files (automatic, no setup required)
       │    package.json, requirements.txt, go.mod, Cargo.toml,
       │    Gemfile, pom.xml, build.gradle, Dockerfile, .eslintrc, tsconfig.json
       │
       ▼
3  Prompt construction: diff + context + 9-bucket CWE taxonomy prior
       │
       ▼
4  Model inference (claude-sonnet-4-5)
       │
       ▼
5  Post-processing: dedup → severity → line anchors → CVSS estimate
       │
       ▼
6  Finding lands in /reviews, waiting for your approval
       │
       ▼
7  You approve ──► GitHub API posts review comment to the PR

Step 01

Webhook fires

PullLight receives a GitHub pull_request event with action opened (a new PR) or synchronize (new commits pushed to an existing PR). No polling. No background jobs scanning branches.
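As a rough illustration of that event filter, the sketch below accepts only pull_request deliveries for new or updated PRs. The helper name `shouldReview` is hypothetical and not part of PullLight; the `X-GitHub-Event` header and the payload's `action` field (where an update is reported as `synchronize`) come from GitHub's webhook format.

```javascript
// Hypothetical helper: decide whether a webhook delivery should trigger a review.
// `eventName` is the value of the X-GitHub-Event header; `payload` is the parsed JSON body.
const REVIEWABLE_ACTIONS = new Set(['opened', 'synchronize']);

function shouldReview(eventName, payload) {
  if (eventName !== 'pull_request') return false;
  return REVIEWABLE_ACTIONS.has(payload && payload.action);
}

console.log(shouldReview('pull_request', { action: 'opened' }));  // true
console.log(shouldReview('pull_request', { action: 'closed' }));  // false
console.log(shouldReview('push', { ref: 'refs/heads/main' }));    // false
```

Everything else (closed PRs, label changes, pushes to branches without a PR) is ignored, which is what keeps the system event-driven rather than polling.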

Step 02

Diff + context fetch

PullLight calls GET /repos/{owner}/{repo}/pulls/{pull_number}/files. For each changed file, we read the full diff. For each hunk, we fetch N lines of surrounding context (default: 20 above/below). This gives the model enough context to trace variable origins without loading the entire file.
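To make the context window concrete, here is a minimal sketch of how a hunk header could be turned into a line range to fetch. The function names `parseHunkHeader` and `contextRange` are illustrative assumptions, not PullLight internals; the `@@ -a,b +c,d @@` header is standard unified diff format, and the default of 20 lines mirrors the figure quoted above.

```javascript
// Parse a unified-diff hunk header such as "@@ -10,7 +12,9 @@ function foo() {".
// Returns the start line and length on the new (right-hand) side of the diff.
function parseHunkHeader(header) {
  const match = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/.exec(header);
  if (!match) throw new Error('Not a hunk header: ' + header);
  const newStart = Number(match[1]);
  // In unified diff format the length defaults to 1 when it is omitted.
  const newLength = match[2] === undefined ? 1 : Number(match[2]);
  return { newStart, newLength };
}

// Compute the inclusive line range to fetch: the hunk plus `contextLines`
// on each side, clamped to the file boundaries.
function contextRange(header, totalLines, contextLines = 20) {
  const { newStart, newLength } = parseHunkHeader(header);
  const hunkEnd = newStart + Math.max(newLength, 1) - 1;
  return {
    from: Math.max(1, newStart - contextLines),
    to: Math.min(totalLines, hunkEnd + contextLines),
  };
}

console.log(contextRange('@@ -40,6 +42,8 @@ app.post(', 200));
// { from: 22, to: 69 }
```

Clamping matters for hunks near the top or bottom of a file, where a full 20 lines of context do not exist.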

Step 03

Config files for convention awareness

We fetch repo root files (package.json, requirements.txt, go.mod, Cargo.toml, Gemfile, pom.xml, build.gradle, Dockerfile, .eslintrc, tsconfig.json), whatever exists. This surfaces dependency versions, lint rules, and project conventions so the model can flag version-specific vulnerability patterns (e.g., "this requires express >= 4.19.2 because CVE-2024-29041 was fixed in that version").
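The version-awareness idea can be sketched as a simple comparison between the range declared in package.json and the version that fixed a CVE. This is an illustration only: the package name `example-lib` and both helper names are made up, and the parsing deliberately handles just plain `major.minor.patch` numbers rather than the full semver range grammar.

```javascript
// Parse "major.minor.patch" out of a package.json range such as "^4.18.2" or "~4.17.1".
// Simplification: only the first version number in the range is considered.
function parseVersion(range) {
  const match = /(\d+)\.(\d+)\.(\d+)/.exec(range);
  if (!match) return null;
  return match.slice(1, 4).map(Number);
}

// True when the declared minimum version predates `fixedVersion`.
function declaresVersionBelow(range, fixedVersion) {
  const declared = parseVersion(range);
  const fixed = parseVersion(fixedVersion);
  if (!declared || !fixed) return false;
  for (let i = 0; i < 3; i++) {
    if (declared[i] !== fixed[i]) return declared[i] < fixed[i];
  }
  return false; // equal versions are not below the fix
}

// Hypothetical package.json contents.
const packageJson = { dependencies: { 'example-lib': '^4.18.2' } };
console.log(declaresVersionBelow(packageJson.dependencies['example-lib'], '4.19.2')); // true
console.log(declaresVersionBelow('^4.19.2', '4.19.2'));                               // false
```

A caret range such as ^4.18.2 can still resolve to a patched release at install time, so a check like this only signals that the manifest permits a vulnerable version. Confirming the installed version would need the lockfile.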

Step 04

Prompt construction

The prompt includes: (a) the diff with context, (b) the bug-class taxonomy (9 CWE buckets with descriptions and example sinks), (c) instructions to output structured JSON with severity, category, file:line, description, and suggested fix. The taxonomy prior acts like a security-engineering cheat sheet — it biases the model toward known vulnerability patterns rather than generic code style feedback.
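A minimal sketch of how those three parts might be assembled into one prompt follows. The wording of the instructions, the two sample taxonomy entries, and the name `buildPrompt` are all assumptions for illustration; the actual PullLight prompt is not reproduced here.

```javascript
// Hypothetical taxonomy entries; the production taxonomy has nine buckets.
const TAXONOMY = [
  { cwe: 'CWE-22', name: 'Path Traversal', sinks: ['fs.writeFile(filename)', 'sendFile(path)'] },
  { cwe: 'CWE-89', name: 'SQL Injection', sinks: ['template SQL with string interpolation'] },
];

// Field names assumed from the description above.
const OUTPUT_FIELDS = ['severity', 'category', 'file', 'line', 'description', 'fix'];

function buildPrompt(diffWithContext, taxonomy = TAXONOMY) {
  const taxonomyText = taxonomy
    .map((t) => `- ${t.cwe} ${t.name}: example sinks ${t.sinks.join(', ')}`)
    .join('\n');
  return [
    'You are reviewing a pull request for security vulnerabilities.',
    'Bug classes to check:',
    taxonomyText,
    `Respond with a JSON array of findings. Each finding has the fields: ${OUTPUT_FIELDS.join(', ')}.`,
    'Diff with surrounding context:',
    diffWithContext,
  ].join('\n\n');
}

const prompt = buildPrompt('+ doc.save(req.body.filename);');
console.log(prompt.includes('CWE-22 Path Traversal')); // true
```

Placing the taxonomy before the diff is one plausible ordering; the point is that the model reads the list of bug classes and sinks before it reads the code.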

Step 05

Model inference

claude-sonnet-4-5 via Polsia AI proxy. Payload is the diff + context + config files + taxonomy prompt. No full repo clone. No git history. No .env files or credentials.

Step 06

Post-processing

Raw model output goes through four steps: dedup (remove findings already commented on by other bots), severity classification (critical/high/medium/low based on CVSS-equivalent scoring), line-anchor resolution (map the model's file references to actual line numbers in the PR diff), and CVSS estimation.
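For the severity step, one plausible mapping is the qualitative rating scale from the CVSS v3.x specification, sketched below. Whether PullLight uses exactly these cut-offs is an assumption; the function name `severityFromCvss` is illustrative.

```javascript
// Map a CVSS base score (0.0 to 10.0) to a severity label using the
// CVSS v3.x qualitative rating scale. A score of 0.0 is rated "none".
function severityFromCvss(score) {
  if (typeof score !== 'number' || Number.isNaN(score) || score < 0 || score > 10) {
    throw new RangeError('CVSS score must be a number between 0 and 10');
  }
  if (score === 0) return 'none';
  if (score < 4.0) return 'low';
  if (score < 7.0) return 'medium';
  if (score < 9.0) return 'high';
  return 'critical';
}

console.log(severityFromCvss(9.2)); // critical
console.log(severityFromCvss(6.9)); // medium
```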

Step 07

Human approval gate

All findings land in /reviews. Nothing auto-posts. You approve individual comments, dismiss false positives, or skip the review entirely. Only approved comments hit the PR via GitHub's POST /repos/{owner}/{repo}/pulls/{pull_number}/reviews API.
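The approval gate can be pictured as a filter in front of the request body for that endpoint. In the sketch below, the `approved` flag and the function name `buildReviewBody` are hypothetical; `commit_id`, `event`, and the `path`/`line`/`side`/`body` comment fields are parameters of GitHub's create-review endpoint.

```javascript
// Build the JSON body for POST /repos/{owner}/{repo}/pulls/{pull_number}/reviews
// from the findings a human approved. Returns null when nothing was approved,
// so the caller can skip the API call entirely.
function buildReviewBody(findings, commitSha) {
  const comments = findings
    .filter((f) => f.approved === true)
    .map((f) => ({
      path: f.file,
      line: f.line,
      side: 'RIGHT', // comment on the new version of the file
      body: `**${f.severity.toUpperCase()} / ${f.cwe}**\n\n${f.description}`,
    }));
  if (comments.length === 0) return null;
  return {
    commit_id: commitSha,
    event: 'COMMENT', // never APPROVE or REQUEST_CHANGES
    comments,
  };
}

const review = buildReviewBody(
  [
    { approved: true, file: 'src/modules/fileloading.js', line: 48, severity: 'critical', cwe: 'CWE-22', description: 'Unvalidated filename reaches doc.save().' },
    { approved: false, file: 'src/app.js', line: 12, severity: 'high', cwe: 'CWE-89', description: 'Dismissed by reviewer.' },
  ],
  'abc123' // placeholder commit SHA
);
console.log(review.comments.length); // 1
```

Using the COMMENT event is what keeps the review from changing the PR's approval state.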

What we actually look for

PullLight is prompted with a 9-bucket CWE taxonomy. Every finding is tagged to one of these buckets. The model doesn't just look for "bad code"; it looks for specific attack primitives.

CWE	Bug Class	Example Sinks
CWE-94	Remote Code Execution	`eval()`, `new Function()`, `exec()`, `spawn()`, deserialization
CWE-502	Insecure Deserialization	`pickle.load()`, `yaml.load()` (without Loader=), `fastjson`
CWE-22	Path Traversal	`fs.writeFile(filename)`, `sendFile(path)`, URL path join
CWE-1321	Prototype Pollution	`Object.assign({}, userInput)`, deep-merge without key validation
CWE-89	SQL Injection	Template SQL with string interpolation, ORM raw queries
CWE-918	Server-Side Request Forgery	`fetch(url)`, `axios.get(url)` on user-supplied input
CWE-287	Authentication Bypass	Middleware logic gaps, token expiry not checked, route ordering
CWE-1336	Server-Side Template Injection	`render(template, ctx)` with unsanitized user input
CWE-78	OS Command Injection	`child_process.exec()`, `shell=True` subprocess calls

Browse the full CWE taxonomy →

How we keep noise down

Four-stage filtering pipeline. Each finding must pass all four stages before it reaches /reviews.

Stage 1 — Severity floor

Low-impact findings dropped

Findings below a minimum severity threshold (based on CVSS-equivalent impact scoring) are dropped before human presentation. Style preferences and variable naming suggestions don't make the cut.

Stage 2 — Confidence threshold

Low-confidence findings held

The model outputs a confidence score for each finding. Findings below a configurable floor are held back rather than surfaced. This keeps likely-hallucinated vulnerability claims out of your queue.
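Stages 1 and 2 can be sketched together as one pass over the findings. The threshold values, the 0-to-1 confidence scale, and the name `applyFloors` are assumptions chosen for illustration; the text above only says the floors exist and that the confidence floor is configurable.

```javascript
const SEVERITY_RANK = { low: 1, medium: 2, high: 3, critical: 4 };

// Hypothetical defaults; the actual thresholds are not stated in this document.
const DEFAULTS = { minSeverity: 'medium', minConfidence: 0.7 };

function applyFloors(findings, options = DEFAULTS) {
  const surfaced = [];
  const held = [];
  for (const finding of findings) {
    // Stage 1: drop anything under the severity floor (unknown labels rank 0).
    const rank = SEVERITY_RANK[finding.severity] || 0;
    if (rank < SEVERITY_RANK[options.minSeverity]) continue;
    // Stage 2: hold (keep, but do not show) anything under the confidence floor.
    if (finding.confidence < options.minConfidence) held.push(finding);
    else surfaced.push(finding);
  }
  return { surfaced, held };
}

const result = applyFloors([
  { id: 'a', severity: 'critical', confidence: 0.9 },
  { id: 'b', severity: 'low', confidence: 0.95 },
  { id: 'c', severity: 'high', confidence: 0.4 },
]);
console.log(result.surfaced.map((f) => f.id)); // [ 'a' ]
console.log(result.held.map((f) => f.id));     // [ 'c' ]
```

Note the asymmetry: a low-severity finding is discarded outright, while a low-confidence finding is retained in a held list.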

Stage 3 — Dedup

No duplicate noise

Before posting, PullLight checks the existing PR comment thread (via GitHub API) for findings already commented on by CodeRabbit, Copilot, Sonar, Semgrep, etc. If a comment with matching file:line + similar description already exists, the finding is suppressed.
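One simple way to implement "matching file:line + similar description" is a token-overlap score, sketched below. Jaccard similarity and the 0.5 threshold are stand-ins chosen for illustration, not a description of PullLight's actual matcher; the `path`, `line`, and `body` fields follow the shape of GitHub review comments.

```javascript
// Tokenize a description into a set of lowercase words.
function tokens(text) {
  return new Set(text.toLowerCase().match(/[a-z0-9_]+/g) || []);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, a value between 0 and 1.
function similarity(a, b) {
  const setA = tokens(a);
  const setB = tokens(b);
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  for (const word of setA) if (setB.has(word)) shared++;
  return shared / (setA.size + setB.size - shared);
}

// A finding is a duplicate when an existing comment sits on the same file and
// line and its text is similar enough. The threshold is an arbitrary example.
function isDuplicate(finding, existingComments, threshold = 0.5) {
  return existingComments.some(
    (c) => c.path === finding.file && c.line === finding.line &&
      similarity(c.body, finding.description) >= threshold
  );
}

const finding = { file: 'src/app.js', line: 48, description: 'User input passed to doc.save without path sanitization' };
const existing = [{ path: 'src/app.js', line: 48, body: 'User input passed to doc.save without sanitization' }];
console.log(isDuplicate(finding, existing)); // true
console.log(isDuplicate({ ...finding, line: 90 }, existing)); // false
```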

Stage 4 — Suppression rules

Generated code, deps, and tests excluded

Findings are dropped if the code matches patterns like *.min.js, dist/, node_modules/, vendor/, test/, *.lock, __generated__/. Test fixtures, vendored deps, and compiled output never surface as findings.
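A sketch of how such path rules could be checked follows. The directory names and suffixes are taken from the list above; matching a directory at any depth in the path is our interpretation, and `isSuppressed` is a hypothetical name.

```javascript
// Directory names and file suffixes taken from the patterns listed above.
const SUPPRESSED_DIRS = new Set(['dist', 'node_modules', 'vendor', 'test', '__generated__']);
const SUPPRESSED_SUFFIXES = ['.min.js', '.lock'];

function isSuppressed(filePath) {
  const segments = filePath.split('/');
  const fileName = segments.pop();
  // A suppressed directory anywhere in the path matches, not only at the root.
  if (segments.some((segment) => SUPPRESSED_DIRS.has(segment))) return true;
  return SUPPRESSED_SUFFIXES.some((suffix) => fileName.endsWith(suffix));
}

console.log(isSuppressed('src/modules/fileloading.js')); // false
console.log(isSuppressed('node_modules/qs/lib/parse.js')); // true
console.log(isSuppressed('public/app.min.js')); // true
```

Comparing whole path segments rather than substrings avoids accidentally suppressing a directory such as src/latest/ just because its name contains "test".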

Result: findings that reach /reviews are actionable, novel, and non-duplicative. You get fewer comments. They mean more.

A worked example: jsPDF path traversal

Here's exactly what PullLight saw and produced for a real CVE — from PR diff to /reviews comment.

The PR diff PullLight analyzed

// src/modules/fileloading.js (vulnerable route handler)
import { jsPDF } from 'jspdf';

// User uploads a filename from client, server-side PDF generation
app.post('/api/generate-report', (req, res) => {
  const filename = req.body.filename; // <-- User-supplied, no validation
  const doc = new jsPDF();
  doc.text('Report generated: ' + new Date().toISOString(), 10, 20);
  doc.save(filename); // <-- Pass-through to writeFile, no path sanitization
  res.json({ status: 'ok' });
});

Context PullLight fetched

The full file contents, the package.json showing "jspdf": "^3.0.4" (vulnerable version range), and the surrounding writeFile() function definition showing that doc.save() calls writeFile directly without path resolution.

What PullLight flagged (simplified prompt taxonomy excerpt)

CWE-22 Path Traversal: File paths constructed from user input without path.resolve() validation. Attacker supplies "../" sequences to escape the intended directory. Example vulnerable pattern: fs.writeFile(userControlledFilename).

Raw model output (structured JSON)

{
  "severity": "critical",
  "cwe": "CWE-22",
  "file": "src/modules/fileloading.js",
  "line": 48,
  "description": "The `filename` parameter from `req.body.filename` is passed directly to `doc.save()` without any path sanitization. An attacker can supply `../../app/config/evil.js` to write files outside the intended output directory. In server-side Node.js deployments, this enables arbitrary file write → RCE. CVSS 9.2.",
  "fix": "Use path.resolve() and validate the resolved path stays within process.cwd(). Reject filenames with '..' or absolute path components."
}
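Because model output is free text until proven otherwise, a finding of this shape would normally be validated before it enters the filter stages. The sketch below shows one way to do that; the validation rules and the names `validateFinding` and `parseFinding` are assumptions, with field names taken from the JSON above.

```javascript
const SEVERITIES = ['critical', 'high', 'medium', 'low'];

// Returns a list of problems; an empty list means the finding is well-formed.
function validateFinding(finding) {
  if (finding === null || typeof finding !== 'object') return ['finding is not an object'];
  const problems = [];
  if (!SEVERITIES.includes(finding.severity)) problems.push('unknown severity');
  if (!/^CWE-\d+$/.test(finding.cwe)) problems.push('cwe must look like CWE-<number>');
  if (typeof finding.file !== 'string' || finding.file === '') problems.push('file is required');
  if (!Number.isInteger(finding.line) || finding.line < 1) problems.push('line must be a positive integer');
  for (const field of ['description', 'fix']) {
    if (typeof finding[field] !== 'string' || finding[field].trim() === '') {
      problems.push(`${field} is required`);
    }
  }
  return problems;
}

// Parse raw model text and validate it in one step.
function parseFinding(rawText) {
  let parsed;
  try {
    parsed = JSON.parse(rawText);
  } catch (err) {
    return { ok: false, problems: ['invalid JSON'] };
  }
  const problems = validateFinding(parsed);
  return problems.length === 0 ? { ok: true, finding: parsed } : { ok: false, problems };
}

console.log(parseFinding('{"severity":"urgent"}').ok); // false
```

Rejecting malformed findings early means a truncated or off-schema response never reaches a reviewer.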

After all four filtering stages

Severity: Critical → passes severity floor. Confidence: High (strong model flag on unvalidated user input → file write) → passes confidence threshold. Dedup: no existing comment at this file:line → not suppressed. Suppression rules: not in node_modules/, test/, or dist/ → passes.

The final /reviews item

🔴 CRITICAL / CWE-22 src/modules/fileloading.js line ~48

The filename parameter from req.body.filename is passed directly to doc.save() without any path sanitization. An attacker can supply ../../app/config/evil.js to write files outside the intended output directory. In server-side Node.js deployments, this enables arbitrary file write → RCE.

Suggested fix:

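const path = require('path');

// Resolve against the working directory and require the result to stay inside it.
const baseDir = process.cwd();
const resolvedPath = path.resolve(baseDir, filename);
if (!resolvedPath.startsWith(baseDir + path.sep)) {
  throw new Error('Invalid filename: path traversal detected');
}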
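For readers who want to try the containment check in isolation, here is a slightly more defensive variant as a standalone sketch. It uses `path.relative` instead of a string prefix comparison, since a plain prefix test can be fooled by sibling directories (for example /srv/app-evil when the base is /srv/app). The function name `resolveInside` and the /srv/reports directory are hypothetical.

```javascript
const path = require('path');

// Returns the resolved absolute path when `userPath` stays inside `baseDir`,
// otherwise throws. The base directory itself is also rejected because it is
// not a file inside the directory.
function resolveInside(baseDir, userPath) {
  const base = path.resolve(baseDir);
  const resolved = path.resolve(base, userPath);
  const relative = path.relative(base, resolved);
  const escapes =
    relative === '' ||
    relative === '..' ||
    relative.startsWith('..' + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) throw new Error('Invalid filename: path traversal detected');
  return resolved;
}

for (const candidate of ['q1-report.pdf', '../../app/config/evil.js', '/etc/passwd']) {
  try {
    resolveInside('/srv/reports', candidate);
    console.log(candidate, '-> allowed');
  } catch (err) {
    console.log(candidate, '-> rejected');
  }
}
```

Absolute inputs are rejected too, because resolving them ignores the base directory entirely.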

After you approve in /reviews, this posts as a GitHub PR review comment.

See the full jsPDF case study →

What we don't do

Explicitly. Senior engineers always ask.

What we DON'T do	Why it matters
We don't train on your code	API access — Anthropic's commercial terms contractually prohibit training on customer data. Your diffs are never stored beyond the current session and are never used to fine-tune any model.
We don't auto-merge	PullLight posts comments only. The Check Run reports a risk score (0–100) but conclusion stays neutral. You decide what to do with findings.
We don't approve PRs	Same as above — comments only, no approval state changes on GitHub.
We don't post style nits	No formatting, no lint complaints, no variable naming feedback unless it's a security risk (e.g., `eval(userVariable)`).
We don't read your full repo	Diff + on-demand file reads for referenced functions only. No clone, no branch scan, no git history traversal.
We don't post without your approval	Every finding sits in `/reviews` until you explicitly approve. No auto-publish.

Why PullLight beats the competition on the benchmark

The benchmark: 8 real CVEs, all publicly confirmed and fixed before PullLight's testing. PullLight caught all 8. Every competitor (CodeRabbit, Greptile, Copilot PR Review, Qodo) caught 0.

Why general-purpose reviewers miss these bugs

CodeRabbit, Copilot PR Review, and Qodo are built for general code quality — readability, style, correctness. They are not security-purpose-built. They lack the bug-class taxonomy prior that makes PullLight effective on known vulnerability patterns.

When CodeRabbit reviews a diff like doc.save(req.body.filename), it sees a normal API call. The vulnerability lives in the origin of filename (user-supplied HTTP body) — not in the save call itself. A generic reviewer has no model of "user input → file write = path traversal." PullLight does, because the taxonomy prior encodes exactly this pattern.

The three advantages that close the gap

1

Bug-class-aware prompting

The taxonomy prior biases the model toward known vulnerability patterns. It's the difference between "here's a PDF library call" and "here's a file write from user input — check for path traversal."

2

Broader context

PullLight fetches surrounding code and config files, not just the diff hunk. Knowing that package.json declares "jspdf": "^3.0.4" (a vulnerable version range) is part of that context. A reviewer that sees only the diff hunk misses this.

3

Post-filter dedup

Many competitors post confident but wrong findings, cluttering the PR thread. PullLight's four-stage filter ensures only novel, high-confidence, high-severity findings surface, so you don't learn to ignore the bot.

See the full benchmark (8 CVEs, auditable) →
