pass
production-simulated
RAG
LLM-last
agentic
repo-aware
DevPulse Evidence Dashboard
DevPulse is a production-simulated RAG + agentic developer change-intelligence system. This dashboard turns the repo artifacts into a visual project cockpit covering PRD completion, RAG quality, goal-mode execution, repo-aware risk, patch simulation, and the PR-ready review package.
Truth boundary: no real production SaaS, no live users, no live package registry, no real GitHub PR generation, no autonomous merge, and no production deployment.
PRD status
pass
v3.0 final validation
Evidence artifacts
32
EA-01 to EA-32
Failure scenarios
19
F-01 to F-19
Wrong-version rate
0.0
version-safe RAG gate
System Flow
1. Query Mode
Parse, extract version, route, retrieve.
2. Conflict Layer
Detect deprecated, stale, contradictory docs.
3. LLM-last Synthesis
Only synthesize after deterministic gates.
4. Goal Mode
Plan dependency migration tasks.
5. Repo-aware Scan
Map dependency risk to callsites.
6. PR Simulation
Generate patch, tests, triage, review bundle.
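The LLM-last step in the flow above can be illustrated with a minimal sketch. It assumes the deterministic gates produce one of the verdicts SAFE, RISKY, or BLOCKED and a list of evidence chunks; the function name `llm_last_answer`, the `synthesize` callable, and the empty-evidence check are hypothetical and not taken from the DevPulse code.

```python
from typing import Callable, Optional


def llm_last_answer(
    verdict: str,
    evidence: list[str],
    synthesize: Callable[[list[str]], str],
) -> Optional[str]:
    """Call the synthesizer only after the deterministic gates allow it.

    `verdict` is assumed to come from deterministic checks that ran earlier.
    `synthesize` stands in for whatever LLM call the system uses (hypothetical).
    """
    if verdict == "BLOCKED":
        # Synthesis is suppressed for BLOCKED verdicts.
        return None
    if not evidence:
        # Assumption: with no retrieved evidence there is nothing to ground on.
        return None
    return synthesize(evidence)
```

Keeping the model call behind the verdict check means a BLOCKED result never reaches the synthesizer, so no generated text can soften it.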
RAG Eval Hardening
pass
| Metric | Value |
|---|---|
| Eval queries | 180 |
| Hybrid Recall@5 | 0.94 |
| Reranker Recall@5 | 0.97 |
| Conflict Macro F1 | 0.966 |
| 37-day queries | 2479 |
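For readers unfamiliar with the Recall@5 rows, the sketch below shows one common way to compute the metric. It assumes recall is measured per query as the fraction of relevant documents that appear in the top k results, then averaged across queries; the dashboard does not state which variant DevPulse uses, and the function names are hypothetical.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the relevant document IDs found in the top-k retrieved IDs."""
    if not relevant:
        raise ValueError("recall is undefined for a query with no relevant documents")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(runs: list[tuple[list[str], set[str]]], k: int = 5) -> float:
    """Average recall@k over (retrieved, relevant) pairs, one pair per query."""
    if not runs:
        raise ValueError("need at least one query")
    scores = [recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs]
    return sum(scores) / len(scores)
```

Under this definition, a reranker raises Recall@5 only when it pulls relevant chunks from below rank 5 into the top 5.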
Repo-aware Risk
BLOCKED
| Metric | Value |
|---|---|
| Callsites found | 10 |
| Risky callsites | 10 |
| High-risk deps | 4 |
| Medium-risk deps | 1 |
| Low-risk deps | 0 |
| Readiness | BLOCKED |
Patch + PR Simulation
pass
| Metric | Value |
|---|---|
| Patch changes | 4 |
| Apply recommendation | DO_NOT_APPLY_WITHOUT_REVIEW |
| Before tests | pass |
| After patch | review_blocked |
Goal-mode Final Plan
BLOCKED
| Field | Value |
|---|---|
| Total tasks | 5 |
| Safe tasks | 1 |
| Risky tasks | 2 |
| Blocked tasks | 0 |
| Escalated tasks | 2 |
| Recommended action | Do not proceed |
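The BLOCKED status in this table is consistent with a simple worst-case rollup, sketched below. The rule is an assumption inferred from the reported counts and from the staged recommendation, where blocked or escalated tasks block the full migration; the function name and lowercase status strings are hypothetical.

```python
def aggregate_verdict(task_statuses: list[str]) -> str:
    """Roll per-task statuses up into one verdict, worst status wins.

    Assumed rule: any blocked or escalated task blocks the whole plan,
    otherwise any risky task makes the plan RISKY, otherwise it is SAFE.
    """
    if any(status in ("blocked", "escalated") for status in task_statuses):
        return "BLOCKED"
    if "risky" in task_statuses:
        return "RISKY"
    return "SAFE"


# Counts from the table: 1 safe, 2 risky, 0 blocked, 2 escalated.
statuses = ["safe", "risky", "risky", "escalated", "escalated"]
print(aggregate_verdict(statuses))
```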
Patch Risk Gates
DO_NOT_APPLY_WITHOUT_REVIEW
- Confirm auth-sdk v3 authenticate signature and credential source.
- Confirm profile-sdk replacement API getUserProfile is correct.
- Confirm logging-lib structured payload shape against current docs.
- Run real package install and test suite before merge.
- Do not merge while aggregate repo readiness is BLOCKED.
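The gates above behave like a checklist that must be fully confirmed, with aggregate readiness acting as a hard stop. A minimal sketch under that assumption follows; the `Gate` class and `may_merge` function are hypothetical names for illustration and do not describe DevPulse's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Gate:
    """One manual review gate (hypothetical structure)."""
    description: str
    confirmed: bool = False


def may_merge(gates: list[Gate], repo_readiness: str) -> bool:
    """Allow a merge only if readiness is not BLOCKED and every gate is confirmed."""
    if repo_readiness == "BLOCKED":
        return False
    return all(gate.confirmed for gate in gates)


gates = [
    Gate("Confirm auth-sdk v3 authenticate signature and credential source."),
    Gate("Confirm profile-sdk replacement API getUserProfile is correct."),
    Gate("Confirm logging-lib structured payload shape against current docs."),
    Gate("Run real package install and test suite before merge."),
]
print(may_merge(gates, "BLOCKED"))
```

Checking readiness first means that confirming every individual gate still cannot unlock a merge while the repo-level verdict is BLOCKED.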
Core Demo Reports
Query Mode Demo
EA-24 devpulse_demo_report
=== DevPulse Query Mode Demo ===
Status: pass
Evidence artifacts EA-01 to EA-24: present
Failure scenarios F-01 to F-10: represented
Version correctness: wrong_version_answer_rate = 0.0
Conflict detection: 9/9 conflict types covered in deterministic scenario catalog
Migration verdicts: SAFE, RISKY, BLOCKED
LLM-last: synthesis suppressed for BLOCKED verdicts
Citation assembly: programmatic citations from chunk metadata
Truth boundary: production-simulated, not production SaaS
Agentic Demo
EA-32 agentic_demo_report
=== DevPulse Goal Mode / Agentic Demo ===
Status: pass
Goal: Assess migration safety for this repo from SDK v2 to SDK v3
Parsed dependencies: 5
Dependency deltas: 5
Planned tasks: 5
Task runs: 6
Recovery actions: 3
Aggregate verdict: BLOCKED
Recommended action: Do not proceed
Task priority order:
- P1 auth-sdk | major | major version jump
- P1 logging-lib | major | major version jump
- P1 profile-sdk | major | major version jump
- P4 unknown-dep | unknown | missing or ambiguous target version
- P6 analytics-sdk | patch | patch or low-risk update
Recovery actions:
- auth-sdk | skip_and_escalate | escalated | capped=False
- profile-sdk | skip_and_escalate | escalated | capped=False
- unknown-dep | retry_with_related_version_evidence | success | capped=False
Staged recommendation:
{
"safe_tasks_can_proceed_independently": [
"analytics-sdk"
],
"risky_tasks_require_caveats_and_reviewer_approval": [
"logging-lib",
"unknown-dep"
],
"blocked_or_escalated_tasks_block_full_migration": [
"auth-sdk",
"profile-sdk"
]
}
Truth boundary: production-simulated, controlled registry, no live package registry, no real GitHub PR generation.
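The task priority order in the report above can be reproduced with a small sort. The sketch assumes priority depends only on the update type, using the three mappings visible in the report (major to P1, unknown to P4, patch to P6), with ties broken alphabetically by dependency name. Other update types are not shown in the report and are deliberately left out; `PRIORITY` and `order_tasks` are hypothetical names.

```python
# Priorities as they appear in the report; other update types are not shown there.
PRIORITY = {"major": 1, "unknown": 4, "patch": 6}


def order_tasks(tasks: list[tuple[str, str]]) -> list[str]:
    """Sort (dependency, update_type) pairs by priority, then by name."""
    ordered = sorted(tasks, key=lambda task: (PRIORITY[task[1]], task[0]))
    return [f"P{PRIORITY[kind]} {name}" for name, kind in ordered]


tasks = [
    ("analytics-sdk", "patch"),
    ("unknown-dep", "unknown"),
    ("profile-sdk", "major"),
    ("auth-sdk", "major"),
    ("logging-lib", "major"),
]
for line in order_tasks(tasks):
    print(line)
```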
Final Completion
=== DevPulse PRD v3.0 Completion Report ===
status: pass
query_mode_artifacts: 24
goal_mode_artifacts: 8
total_evidence_artifacts: 32
failure_recovery_scenarios: 19
wrong_version_answer_rate: 0.0
agentic_eval_status: pass
aggregate_goal_verdict: BLOCKED
Truth boundary: production-simulated, non-production, controlled registry, no live users, no live package registry, no real GitHub PR generation.
Final verdict: DevPulse v3.0 repo-evidence implementation is complete.
Artifact Inventory
Generated artifacts discovered under outputs/.