Badges: pass, production-simulated, RAG, LLM-last, agentic, repo-aware

DevPulse Evidence Dashboard

DevPulse is a production-simulated RAG + agentic developer change-intelligence system. This dashboard presents the repo artifacts as a single project view covering PRD completion, RAG quality, goal-mode execution, repo-aware risk, patch simulation, and the PR-ready review package.

Truth boundary: no real production SaaS, no live users, no live package registry, no real GitHub PR generation, no autonomous merge, and no production deployment.

Metric              Value  Detail
PRD status          pass   v3.0 final validation
Evidence artifacts  32     EA-01 to EA-32
Failure scenarios   19     F-01 to F-19
Wrong-version rate  0.0    version-safe RAG gate

System Flow

1. Query Mode
Parse, extract version, route, retrieve.
2. Conflict Layer
Detect deprecated, stale, contradictory docs.
3. LLM-last Synthesis
Only synthesize after deterministic gates.
4. Goal Mode
Plan dependency migration tasks.
5. Repo-aware Scan
Map dependency risk to callsites.
6. PR Simulation
Generate patch, tests, triage, review bundle.
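
The "LLM-last" ordering in steps 1 to 3 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the gate rule (version mismatch blocks, any conflict marks the answer risky), and the chunk field `source` are hypothetical and are not taken from the DevPulse code. The only behavior drawn from this dashboard is that synthesis runs after deterministic gates and is suppressed for BLOCKED verdicts.

```python
from typing import Callable, Dict, List, Optional


def decide_verdict(conflicts: List[str], version_matched: bool) -> str:
    """Deterministic gate. Hypothetical rule, not DevPulse's actual policy."""
    if not version_matched:
        return "BLOCKED"
    if conflicts:
        return "RISKY"
    return "SAFE"


def answer(
    query: str,
    chunks: List[Dict[str, str]],
    conflicts: List[str],
    version_matched: bool,
    synthesize: Callable[[str, List[Dict[str, str]]], str],
) -> Dict[str, object]:
    verdict = decide_verdict(conflicts, version_matched)
    # LLM-last: the synthesizer is only called once the gates have run,
    # and it is skipped entirely when the verdict is BLOCKED.
    text: Optional[str] = None
    if verdict != "BLOCKED":
        text = synthesize(query, chunks)
    # Citations are assembled from chunk metadata, not from model output.
    citations = [chunk["source"] for chunk in chunks]
    return {"verdict": verdict, "answer": text, "citations": citations}
```

Passing the synthesizer in as a callable keeps the gate logic testable without any model call.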

RAG Eval Hardening

pass

Metric             Value
Eval queries       180
Hybrid Recall@5    0.94
Reranker Recall@5  0.97
Conflict Macro F1  0.966
37-day queries     2479
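
For readers unfamiliar with the two metric families in this table, the sketch below shows one common way to compute Recall@k and macro-averaged F1. It is an assumption that DevPulse uses these exact definitions; some evaluations report Recall@k as a hit rate (at least one relevant chunk in the top k) instead of the fraction shown here. The function names are illustrative.

```python
from typing import List, Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the relevant ids that appear in the top-k ranked ids."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


def macro_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of per-label F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0
```

A per-query Recall@k would then be averaged over the evaluation set to give a single figure like the ones reported above.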

Repo-aware Risk

BLOCKED

Metric            Value
Callsites found   10
Risky callsites   10
High-risk deps    4
Medium-risk deps  1
Low-risk deps     0
Readiness         BLOCKED
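
A rough sketch of how dependency risk might be mapped onto callsites is shown below. The regex-based scan, the `symbol_risk` mapping, the `READY` label, and the rule "any high-risk callsite blocks readiness" are all assumptions for illustration; the dashboard does not state how DevPulse derives its readiness value.

```python
import re
from typing import Dict, List


def find_callsites(source: str, symbol_risk: Dict[str, str]) -> List[Dict[str, object]]:
    """Return one record per line that calls a tracked symbol.

    symbol_risk maps a function name to a risk level such as "high".
    """
    hits: List[Dict[str, object]] = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for symbol, risk in symbol_risk.items():
            # Match the symbol as a whole word followed by an opening parenthesis.
            if re.search(r"\b" + re.escape(symbol) + r"\s*\(", line):
                hits.append({"line": lineno, "symbol": symbol, "risk": risk})
    return hits


def readiness(callsites: List[Dict[str, object]]) -> str:
    """Hypothetical aggregation: any high-risk callsite blocks the repo."""
    if any(site["risk"] == "high" for site in callsites):
        return "BLOCKED"
    return "READY"
```

A text scan like this misses aliased imports and dynamic calls; an AST-based pass would be more robust but is longer than a sketch allows.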

Patch + PR Simulation

pass

Metric                Value
Patch changes         4
Apply recommendation  DO_NOT_APPLY_WITHOUT_REVIEW
Before tests          pass
After patch           review_blocked

Goal-mode Final Plan

BLOCKED

Field               Value
Total tasks         5
Safe tasks          1
Risky tasks         2
Blocked tasks       0
Escalated tasks     2
Recommended action  Do not proceed

Patch Risk Gates

DO_NOT_APPLY_WITHOUT_REVIEW

  • Confirm auth-sdk v3 authenticate signature and credential source.
  • Confirm profile-sdk replacement API getUserProfile is correct.
  • Confirm logging-lib structured payload shape against current docs.
  • Run real package install and test suite before merge.
  • Do not merge while aggregate repo readiness is BLOCKED.
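
The checklist above can be read as a simple all-or-nothing gate. The sketch below is one way to express that, assuming each gate is tracked as a boolean "confirmed" flag; the function name and the `REVIEWED_OK_TO_APPLY` label are hypothetical, while `DO_NOT_APPLY_WITHOUT_REVIEW` and `BLOCKED` are the values shown on this dashboard.

```python
from typing import Dict, List, Tuple


def evaluate_patch_gates(gates: Dict[str, bool], repo_readiness: str) -> Tuple[str, List[str]]:
    """Return a recommendation plus the list of gates still outstanding."""
    outstanding = [name for name, confirmed in gates.items() if not confirmed]
    if repo_readiness == "BLOCKED":
        outstanding.append("aggregate repo readiness is BLOCKED")
    if outstanding:
        return "DO_NOT_APPLY_WITHOUT_REVIEW", outstanding
    # Hypothetical label for the case where every gate is confirmed.
    return "REVIEWED_OK_TO_APPLY", outstanding
```

Returning the outstanding items alongside the recommendation lets a reviewer see exactly which confirmations are missing.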

Core Demo Reports

Query Mode Demo

EA-24 devpulse_demo_report

=== DevPulse Query Mode Demo ===
Status: pass
Evidence artifacts EA-01 to EA-24: present
Failure scenarios F-01 to F-10: represented
Version correctness: wrong_version_answer_rate = 0.0
Conflict detection: 9/9 conflict types covered in deterministic scenario catalog
Migration verdicts: SAFE, RISKY, BLOCKED
LLM-last: synthesis suppressed for BLOCKED verdicts
Citation assembly: programmatic citations from chunk metadata
Truth boundary: production-simulated, not production SaaS
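
The report does not define `wrong_version_answer_rate`. One plausible reading, sketched here as an assumption, is the share of answered queries whose answer was grounded in a different version than the one requested. The record fields `requested_version` and `answered_version` are hypothetical names.

```python
from typing import Dict, List, Optional


def wrong_version_answer_rate(records: List[Dict[str, Optional[str]]]) -> float:
    """Share of answered queries whose answer version differs from the request.

    Records with answered_version set to None (suppressed answers) are excluded.
    """
    answered = [r for r in records if r["answered_version"] is not None]
    if not answered:
        return 0.0
    wrong = sum(1 for r in answered if r["answered_version"] != r["requested_version"])
    return wrong / len(answered)
```

Under this definition a rate of 0.0 means no answered query drew on a mismatched version; it says nothing about queries whose answers were suppressed.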

Agentic Demo

EA-32 agentic_demo_report

=== DevPulse Goal Mode / Agentic Demo ===
Status: pass
Goal: Assess migration safety for this repo from SDK v2 to SDK v3
Parsed dependencies: 5
Dependency deltas: 5
Planned tasks: 5
Task runs: 6
Recovery actions: 3
Aggregate verdict: BLOCKED
Recommended action: Do not proceed

Task priority order:
- P1 auth-sdk | major | major version jump
- P1 logging-lib | major | major version jump
- P1 profile-sdk | major | major version jump
- P4 unknown-dep | unknown | missing or ambiguous target version
- P6 analytics-sdk | patch | patch or low-risk update

Recovery actions:
- auth-sdk | skip_and_escalate | escalated | capped=False
- profile-sdk | skip_and_escalate | escalated | capped=False
- unknown-dep | retry_with_related_version_evidence | success | capped=False

Staged recommendation:
{
  "safe_tasks_can_proceed_independently": [
    "analytics-sdk"
  ],
  "risky_tasks_require_caveats_and_reviewer_approval": [
    "logging-lib",
    "unknown-dep"
  ],
  "blocked_or_escalated_tasks_block_full_migration": [
    "auth-sdk",
    "profile-sdk"
  ]
}

Truth boundary: production-simulated, controlled registry, no live package registry, no real GitHub PR generation.
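
The staged recommendation in this report groups tasks by outcome. A minimal sketch of that grouping follows, assuming each task ends with one of four status strings (`safe`, `risky`, `blocked`, `escalated`); the status vocabulary, function names, and aggregation rule are illustrative assumptions, while the three bucket keys are copied from the report.

```python
from typing import Dict, List

SAFE = "safe_tasks_can_proceed_independently"
RISKY = "risky_tasks_require_caveats_and_reviewer_approval"
BLOCKED = "blocked_or_escalated_tasks_block_full_migration"

# Hypothetical mapping from a per-task status to a report bucket.
BUCKET_FOR_STATUS = {"safe": SAFE, "risky": RISKY, "blocked": BLOCKED, "escalated": BLOCKED}


def stage(task_status: Dict[str, str]) -> Dict[str, List[str]]:
    """Group task names into the three buckets, in alphabetical order."""
    staged: Dict[str, List[str]] = {SAFE: [], RISKY: [], BLOCKED: []}
    for name in sorted(task_status):
        staged[BUCKET_FOR_STATUS[task_status[name]]].append(name)
    return staged


def aggregate_verdict(staged: Dict[str, List[str]]) -> str:
    """Worst bucket wins: any blocked or escalated task blocks the migration."""
    if staged[BLOCKED]:
        return "BLOCKED"
    if staged[RISKY]:
        return "RISKY"
    return "SAFE"
```

Fed the five tasks from this report, with auth-sdk and profile-sdk escalated, logging-lib and unknown-dep risky, and analytics-sdk safe, this grouping reproduces the JSON above and yields an aggregate verdict of BLOCKED.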

Final Completion

=== DevPulse PRD v3.0 Completion Report ===
status: pass
query_mode_artifacts: 24
goal_mode_artifacts: 8
total_evidence_artifacts: 32
failure_recovery_scenarios: 19
wrong_version_answer_rate: 0.0
agentic_eval_status: pass
aggregate_goal_verdict: BLOCKED

Truth boundary: production-simulated, non-production, controlled registry, no live users, no live package registry, no real GitHub PR generation.

Final verdict: DevPulse v3.0 repo-evidence implementation is complete.

Artifact Inventory

Generated artifacts discovered under outputs/.

Category Artifact Size (bytes)
evidence outputs/evidence/adversarial_trap_results.json 37418
evidence outputs/evidence/agent_goal_parse_sample.json 1509
evidence outputs/evidence/agent_task_execution_trace.json 4365
evidence outputs/evidence/agent_task_plan.json 2770
evidence outputs/evidence/agentic_demo_report.txt 1200
evidence outputs/evidence/agentic_eval_results.json 1610
evidence outputs/evidence/bm25_index_stats.txt 98
evidence outputs/evidence/chunk_metadata_sample.json 4604
evidence outputs/evidence/citation_assembly_sample.json 3498
evidence outputs/evidence/conflict_alerts_schema.sql 342
evidence outputs/evidence/conflict_detection_report.json 1983
evidence outputs/evidence/cost_latency_report.json 242
evidence outputs/evidence/dependency_delta_report.json 2878
evidence outputs/evidence/devpulse_demo_report.txt 508
evidence outputs/evidence/embedding_swap_log.txt 141
evidence outputs/evidence/fallback_events_log.json 2567
evidence outputs/evidence/freshness_report.json 245
evidence outputs/evidence/goal_mode_core_probe.json 13567
evidence outputs/evidence/goal_mode_failure_scenarios_f11_f19.json 1898
evidence outputs/evidence/golden_eval_results.json 1049
evidence outputs/evidence/hybrid_retrieval_report.json 22912
evidence outputs/evidence/ingest_summary.json 372
evidence outputs/evidence/langfuse_trace_export.json 389
evidence outputs/evidence/migration_decision_samples.json 21505
evidence outputs/evidence/pgvector_index_stats.txt 110
evidence outputs/evidence/plan_summary_report.json 1161
evidence outputs/evidence/query_audit_log_sample.json 6231
evidence outputs/evidence/query_mode_core_probe.json 26779
evidence outputs/evidence/query_mode_failure_scenarios_f01_f10.json 1918
evidence outputs/evidence/recovery_decision_log.json 1669
evidence outputs/evidence/retrieval_traces_sample.json 2734
evidence outputs/evidence/sentry_error_summary.txt 205
evidence outputs/evidence/simple_query_results.json 11675
evidence outputs/evidence/synthesis_grounding_report.json 196
evidence outputs/evidence/version_coverage_matrix.json 217
evidence outputs/evidence/version_filter_audit.json 6379
patches outputs/patches/proposed_file_changes.json 2678
patches outputs/patches/proposed_migration_patch.diff 1582
pr_simulation outputs/pr_simulation/pr_body.md 1208
pr_simulation outputs/pr_simulation/pr_diff.patch 1582
pr_simulation outputs/pr_simulation/pr_title.txt 74
pr_simulation outputs/pr_simulation/reviewer_checklist.md 519
pr_simulation outputs/pr_simulation/rollback_plan.md 773
rag_eval outputs/rag_eval/conflict_confusion_matrix.json 2602
rag_eval outputs/rag_eval/corpus_perturbation_report.json 1577
rag_eval outputs/rag_eval/rag_eval_hardening_summary_v35.json 901
rag_eval outputs/rag_eval/reranker_simulation_report.json 1129
rag_eval outputs/rag_eval/retrieval_ablation_report.json 10740
rag_eval outputs/rag_eval/traffic_backtest_37_day_report.json 11611
repo_aware outputs/repo_aware/dependency_usage_map.json 6715
repo_aware outputs/repo_aware/repo_aware_extension_summary.json 790
repo_aware outputs/repo_aware/repo_inspection_report.json 1237
repo_aware outputs/repo_aware/risky_callsite_report.json 6810
reports outputs/reports/devpulse_final_demo_report.txt 473
reports outputs/reports/devpulse_prd_completion_report_v3.json 9461
reports outputs/reports/patch_pr_simulation_summary_v35.json 925
reports outputs/reports/patch_risk_report.json 1323
test_simulation outputs/test_simulation/after_patch_tests_report.json 914
test_simulation outputs/test_simulation/before_tests_report.json 470
test_simulation outputs/test_simulation/test_failure_triage_report.json 1054
validation outputs/validation/goal_mode_core_validation.json 2685
validation outputs/validation/goal_mode_lifecycle_summary.json 367
validation outputs/validation/goal_mode_lifecycle_validation.json 2675
validation outputs/validation/patch_pr_simulation_validation_v35.json 2225
validation outputs/validation/query_mode_core_validation.json 1540
validation outputs/validation/query_mode_lifecycle_summary.json 378
validation outputs/validation/query_mode_lifecycle_validation.json 4368
validation outputs/validation/rag_eval_hardening_validation_v35.json 2491
validation outputs/validation/repo_aware_extension_validation_v35.json 1931
validation outputs/validation/repo_foundation_validation.json 1274
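
An inventory like the one above can be regenerated with a short directory walk. This is a sketch under the assumption that the category is simply the first directory level below `outputs/`; the function name is illustrative, and this is not necessarily how the dashboard builds its table.

```python
from pathlib import Path
from typing import List, Tuple


def inventory(root: str = "outputs") -> List[Tuple[str, str, int]]:
    """Return (category, path, size in bytes) for every file under root."""
    base = Path(root)
    rows: List[Tuple[str, str, int]] = []
    for path in sorted(p for p in base.rglob("*") if p.is_file()):
        # Category is the first path component below the root directory.
        category = path.relative_to(base).parts[0]
        rows.append((category, path.as_posix(), path.stat().st_size))
    return rows


if __name__ == "__main__":
    for category, path, size in inventory():
        print(category, path, size)
```

Files placed directly under the root would get their own file name as the category, so a real implementation might want a fallback label for that case.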