Summary
CodeLens has a basic check command for CI but lacks baseline diff scanning, strict-mode thresholds, a manifest-driven test suite, and golden snapshot regression for rules. Add a full CI/CD integration suite so CodeLens can serve as a quality gate in GitHub Actions, GitLab CI, and Jenkins.
Worker consensus (7 reports)
| Worker | Source | Contribution |
|---|---|---|
| CodeGraph | update!/CodeLens_CodeGraph_Upgrade_Analysis.md #11 | Git sync hooks (post-commit, post-merge, post-checkout) running codelens scan --incremental in background. Marker-fenced install. |
| CodeGraph | same file #20 | Agent benchmark harness: 7 real-world codebase fixtures (VS Code, Excalidraw, Django, Tokio, OkHttp, Gin, Alamofire). claude -p headless with --strict-mcp-config. CI regression fails if CodeLens+CodeLens result worse than baseline. |
| Opengrep | update!/CodeLens_Opengrep_Upgrade_Analysis.md #47 | codelens ci command: auto-detect CI env, --baseline-commit SHA, --diff-depth N (transitive dependents), --error-on-severity <level>, auto-upload SARIF to GitHub. codelens install-ci generates workflow file. |
| Semgrep | update!/CodeLens_Upgrade_Issues_from_Semgrep.md CL-010 | --baseline-commit <SHA> + --diff-scan flags to scan/secrets/dataflow/vuln-scan/smell/complexity/dead-code/taint. Output: {new_findings, preexisting_findings, total}. GitHub Actions codelens-pr-check.yml auto-sets baseline to PR base SHA. |
| Semgrep | same file CL-012 | Strict mode: --strict (exit non-zero on warning), --error (exit non-zero if severity ≥ high), --severity-threshold <level>, --max-findings N. |
| UBS | update!/CodeLens_UBS_Upgrade_Analysis.md #7 | Comparison/baseline delta scan: --comparison=<baseline.json>, compute delta per severity/command/language, --new-only / --show-resolved flags. SARIF automationDetails.guid. |
| UBS | same file #8 | --staged (scan only git diff --cached --name-only --diff-filter=ACMR files), --diff / --git-diff (working tree vs HEAD), --diff-vs=<ref>. Target <1s for <50 changed files. |
| UBS | same file #9 | Manifest-driven test suite: JSON schema with expect block (exit_code, totals, require_substrings, forbid_substrings). Port UBS run_manifest.py runner (~300 LOC). |
| UBS | same file #10 | Rule quality harness with golden snapshot regression: track 3 scopes × 4 metrics, compare vs golden, fail with diff. --update-goldens. |
| OpenTaint | update!/CodeLens_vs_OpenTaint_Upgrade_Analysis.md B1 | Rule test harness with @PositiveRuleSample / @NegativeRuleSample annotations. test-result.json per rule. CI workflow .github/workflows/codelens-rule-tests.yml. |
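The changed-file modes in the UBS #8 row could be collected roughly as follows. This is a minimal sketch, not the UBS implementation: the function names and the mode strings are hypothetical, and comparing `--diff-vs` against the merge base (`<ref>...HEAD`) is an assumption about the intended semantics.

```python
import subprocess
from typing import List, Optional


def changed_files_argv(mode: str, ref: Optional[str] = None) -> List[str]:
    """Build the git argv for a mode: 'staged', 'diff', or 'diff-vs' (hypothetical names)."""
    base = ["git", "diff", "--name-only", "--diff-filter=ACMR"]
    if mode == "staged":
        # Files added to the index, as in the --staged description.
        return base + ["--cached"]
    if mode == "diff":
        # Working tree compared with HEAD.
        return base + ["HEAD"]
    if mode == "diff-vs":
        if not ref:
            raise ValueError("diff-vs requires a ref")
        # Assumption: diff from the merge base of <ref> and HEAD.
        return base + [f"{ref}...HEAD"]
    raise ValueError(f"unknown mode: {mode}")


def list_changed_files(mode: str, ref: Optional[str] = None, cwd: str = ".") -> List[str]:
    """Run git and return the changed paths, one per output line."""
    result = subprocess.run(
        changed_files_argv(mode, ref),
        cwd=cwd,
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```

Paths containing unusual characters are quoted by git in this output format; a production wrapper would likely switch to `-z` and split on NUL bytes.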
Proposed phased scope
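Phase 1 — Baseline + diff scan (P1, 1-2 weeks)
- --baseline-commit <SHA> flag (merge UBS #7 + Semgrep CL-010)
- --diff-scan / --staged / --diff-vs=<ref> flags (UBS #8)
- Output: {new_findings, preexisting_findings, total_findings, delta_per_severity}
- SARIF automationDetails.guid for grouping CI runs
- Finding key: (rule_id, file, line, severity)
- Baseline file: .codelens/baseline_<SHA>.json
- New scripts/git_integration.py wrapping git diff --name-only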
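One way to compute the baseline split, assuming each finding is identified by a (rule_id, file, line, severity) tuple. The `Finding` type and function name are hypothetical, and treating delta_per_severity as the count of new findings per severity is an assumption; the issue does not define it.

```python
from typing import Dict, Iterable, NamedTuple


class Finding(NamedTuple):
    rule_id: str
    file: str
    line: int
    severity: str


def diff_against_baseline(current: Iterable[Finding], baseline: Iterable[Finding]) -> dict:
    """Split current findings into new and preexisting, using the tuple itself as the key."""
    baseline_keys = set(baseline)
    current_list = list(current)
    new = [f for f in current_list if f not in baseline_keys]
    preexisting = [f for f in current_list if f in baseline_keys]

    # Assumed meaning: number of new findings per severity level.
    delta: Dict[str, int] = {}
    for finding in new:
        delta[finding.severity] = delta.get(finding.severity, 0) + 1

    return {
        "new_findings": new,
        "preexisting_findings": preexisting,
        "total_findings": len(current_list),
        "delta_per_severity": delta,
    }
```

Because the key includes the line number, a finding that merely moves when unrelated lines are inserted above it would be reported as new; a more robust key is outside the scope of this sketch.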
Phase 2 — Strict mode + thresholds (P1, 3 days)
- --strict (exit non-zero on warning)
- --error (exit non-zero if severity ≥ high)
- --severity-threshold <level>
- --max-findings N (CI gate)
- Exit code evaluator at end of command execution
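The exit code evaluator might look like the sketch below. The severity scale (`info < warning < high < critical`) and the function signature are assumptions made for illustration; the issue does not specify the level names beyond "warning" and "high".

```python
from typing import Iterable, Optional

# Assumed severity scale, lowest to highest.
SEVERITY_ORDER = ["info", "warning", "high", "critical"]


def evaluate_exit_code(
    severities: Iterable[str],
    strict: bool = False,
    error: bool = False,
    severity_threshold: Optional[str] = None,
    max_findings: Optional[int] = None,
) -> int:
    """Return 1 if any enabled gate is tripped by the findings, else 0."""
    ranks = [SEVERITY_ORDER.index(s) for s in severities]

    def any_at_least(level: str) -> bool:
        floor = SEVERITY_ORDER.index(level)
        return any(rank >= floor for rank in ranks)

    if strict and any_at_least("warning"):
        return 1
    if error and any_at_least("high"):
        return 1
    if severity_threshold is not None and any_at_least(severity_threshold):
        return 1
    if max_findings is not None and len(ranks) > max_findings:
        return 1
    return 0
```

Running the evaluator once, after all findings are collected, keeps the gating logic in one place instead of scattering exit calls across commands.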
Phase 3 — codelens ci orchestration command (P1, 1-2 weeks)
- Auto-detect CI env from env vars (GITHUB_ACTIONS, GITLAB_CI, JENKINS_URL, BITBUCKET_BUILD_NUMBER)
- --baseline-commit SHA (default: PR base SHA)
- --diff-depth N (include transitive dependents; reuse dependents_engine.py)
- --error-on-severity <level>
- Auto-upload SARIF to GitHub code scanning (if in GitHub Actions)
- codelens install-ci generates workflow file (.github/workflows/codelens.yml, .gitlab-ci.yml, Jenkinsfile, bitbucket-pipelines.yml)
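CI auto-detection from the four environment variables named in Phase 3 can be sketched as a simple lookup. The function name and the provider labels it returns are hypothetical.

```python
import os
from typing import Mapping, Optional

# Environment variable -> provider label (labels are illustrative).
CI_ENV_MARKERS = {
    "GITHUB_ACTIONS": "github",
    "GITLAB_CI": "gitlab",
    "JENKINS_URL": "jenkins",
    "BITBUCKET_BUILD_NUMBER": "bitbucket",
}


def detect_ci(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first provider whose marker variable is set and non-empty, else None."""
    if env is None:
        env = os.environ
    for variable, provider in CI_ENV_MARKERS.items():
        if env.get(variable):
            return provider
    return None
```

Accepting the environment as a parameter makes the detection testable without mutating `os.environ`.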
Phase 4 — Manifest-driven test suite (P1, 1-2 weeks)
- New benchmarks/manifest.json schema with per-test expect block
- Port UBS run_manifest.py runner (~300 LOC, MIT license compatible)
- --case, --list, --fail-fast, --tag, --verbose flags
- Capture artifacts per case (benchmarks/artifacts/<case_id>/)
- Migrate existing fixtures, target 200+ cases in 3 months
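To illustrate how a runner could evaluate an expect block with the four fields named in the UBS report (exit_code, totals, require_substrings, forbid_substrings), here is a hedged sketch. It is not the UBS run_manifest.py code; the function name and the shape of `totals` are assumptions.

```python
from typing import Dict, List


def check_expectations(expect: dict, exit_code: int, totals: Dict[str, int], output: str) -> List[str]:
    """Return a list of human-readable failures; an empty list means the case passed."""
    failures: List[str] = []

    if "exit_code" in expect and exit_code != expect["exit_code"]:
        failures.append(f"exit_code: expected {expect['exit_code']}, got {exit_code}")

    # Only the totals named in the expect block are compared.
    for key, wanted in expect.get("totals", {}).items():
        actual = totals.get(key)
        if actual != wanted:
            failures.append(f"totals.{key}: expected {wanted}, got {actual}")

    for needle in expect.get("require_substrings", []):
        if needle not in output:
            failures.append(f"missing required substring: {needle!r}")

    for needle in expect.get("forbid_substrings", []):
        if needle in output:
            failures.append(f"found forbidden substring: {needle!r}")

    return failures
```

Returning every failure instead of stopping at the first gives a useful per-case artifact; a --fail-fast flag would then apply across cases rather than within one.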
Phase 5 — Rule quality harness + golden snapshots (P1, 1 week)
- New benchmarks/rule_quality_harness.py + benchmarks/goldens/rule_coverage.json
- Track 3 scopes (all, campaign, smoke) × 4 metrics per scope
- --update-goldens for intentional rule changes
- CI workflow codelens-quality-gate.yml
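A golden snapshot gate could be as small as the sketch below, assuming the golden file is a JSON object of the form {scope: {metric: value}}. The function names are hypothetical, and the four metric names are not given in the issue, so none are hard-coded here.

```python
import json
from pathlib import Path
from typing import Dict, List

Metrics = Dict[str, Dict[str, float]]


def compare_to_golden(current: Metrics, golden: Metrics) -> List[str]:
    """List every scope.metric whose value differs from (or is missing in) the golden."""
    diffs: List[str] = []
    for scope in sorted(set(current) | set(golden)):
        cur_metrics = current.get(scope, {})
        gold_metrics = golden.get(scope, {})
        for metric in sorted(set(cur_metrics) | set(gold_metrics)):
            cur_value = cur_metrics.get(metric)
            gold_value = gold_metrics.get(metric)
            if cur_value != gold_value:
                diffs.append(f"{scope}.{metric}: golden={gold_value!r} current={cur_value!r}")
    return diffs


def run_gate(current: Metrics, golden_path: Path, update_goldens: bool = False) -> int:
    """Return a process exit code: 0 on match or update, 1 when the snapshot drifted."""
    if update_goldens:
        golden_path.parent.mkdir(parents=True, exist_ok=True)
        golden_path.write_text(json.dumps(current, indent=2, sort_keys=True) + "\n")
        return 0
    golden = json.loads(golden_path.read_text())
    diffs = compare_to_golden(current, golden)
    for line in diffs:
        print(line)
    return 1 if diffs else 0
```

Writing the golden with sorted keys keeps the file diff small when a rule change is intentionally accepted with --update-goldens.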
Phase 6 — Git sync hooks (P2, 1 week, optional)
- 3 hook types: post-commit, post-merge, post-checkout
- Run codelens scan --incremental in background (via nohup ... & disown, never blocking git)
- Marker-fenced install: codelens install --git-hooks / codelens uninstall --git-hooks
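Marker-fenced install means the installer only ever touches the text between two sentinel comments, so existing user hooks survive install and uninstall. A minimal sketch of that text manipulation follows; the marker strings and function names are hypothetical.

```python
# Hypothetical sentinel comments delimiting the managed block.
BEGIN = "# >>> codelens >>>"
END = "# <<< codelens <<<"
# Background invocation; output is discarded so git is never blocked.
HOOK_BODY = "nohup codelens scan --incremental >/dev/null 2>&1 &"


def remove_block(text: str) -> str:
    """Strip the managed block, leaving any user-written hook lines intact."""
    kept = []
    skipping = False
    for line in text.splitlines():
        if line.strip() == BEGIN:
            skipping = True
            continue
        if line.strip() == END:
            skipping = False
            continue
        if not skipping:
            kept.append(line)
    if not kept:
        return ""
    return "\n".join(kept).rstrip("\n") + "\n"


def install_block(text: str) -> str:
    """Append the managed block; re-installing replaces the old block instead of duplicating it."""
    base = remove_block(text) or "#!/bin/sh\n"
    return f"{base}{BEGIN}\n{HOOK_BODY}\n{END}\n"
```

The sketch omits `disown` because it is a shell builtin in bash and zsh rather than part of POSIX sh; a hook that relies on it would need a bash shebang. Writing the result to the hook file and marking it executable is left out here.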
Phase 7 — Agent benchmark harness (P2, 3-4 weeks, optional)
- 7 real-world codebase fixtures (VS Code TS, Excalidraw, Django, Tokio, OkHttp, Gin, Alamofire)
- Each codebase has 1 canonical architecture question
- claude -p headless with --strict-mcp-config, 4 runs per arm, median reported
- CI integration via codelens-benchmark.yml on release tag
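The "4 runs per arm, median reported" comparison can be sketched as below. The score metric is not defined in the issue, so the function takes plain numbers and a `higher_is_better` switch; both the name and the switch are assumptions.

```python
from statistics import median
from typing import Sequence


def regression_detected(
    with_codelens: Sequence[float],
    baseline: Sequence[float],
    higher_is_better: bool = True,
) -> bool:
    """Compare the per-arm medians; True means the CodeLens arm is worse than baseline."""
    codelens_median = median(with_codelens)
    baseline_median = median(baseline)
    if higher_is_better:
        return codelens_median < baseline_median
    return codelens_median > baseline_median
```

Using the median rather than the mean limits the effect of a single outlier run, which matters with only four samples per arm.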
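Acceptance criteria
- codelens scan --baseline-commit <SHA> reports only new findings
- codelens check --strict --error exits non-zero on any high-severity finding
- codelens ci auto-detects GitHub Actions and uploads SARIF
- codelens install-ci generates valid workflow file for GitHub/GitLab/Jenkins/Bitbucket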
Files
- New: scripts/git_integration.py, scripts/commands/ci.py, scripts/commands/install_ci.py, scripts/commands/rule_test.py (already proposed in rule validation issue), benchmarks/manifest.json, benchmarks/run_manifest.py, benchmarks/rule_quality_harness.py, benchmarks/goldens/rule_coverage.json, scripts/templates/codelens-ci-{github,gitlab,jenkins}.yml.tmpl, .github/workflows/codelens-pr-check.yml, .github/workflows/codelens-quality-gate.yml
- Update: scripts/codelens.py (new flags), scripts/commands/check.py, scripts/pre_commit_hook.py