
QuartumSE Strategic Review & Next Steps

Date: 2025-10-30
Reviewer: Claude (AI Assistant)
Context: In-depth project review at Phase 1 midpoint
Target Audience: Project leadership and technical leads


Executive Summary

TL;DR: QuartumSE has solid technical foundations and clear strategic vision, but Phase 1 execution is significantly behind schedule. CI/CD infrastructure is now robust, but actual quantum experiments and data collection are minimal. The project needs to shift from infrastructure work to experimental execution immediately to meet Phase 1 exit criteria by Nov 2025.

Current Status: ⚠️ Phase 1 at risk - Infrastructure ✅ Complete, Experiments ❌ Behind

Key Findings

| Area | Status | Risk Level |
|------|--------|------------|
| Infrastructure & CI/CD | ✅ Robust | 🟢 Low |
| Documentation | ✅ Comprehensive | 🟢 Low |
| Code Quality | ⚠️ 23% coverage | 🟡 Medium |
| Phase 1 Experiments | ❌ Incomplete | 🔴 High |
| Hardware Validation | ❌ Not started | 🔴 Critical |
| Timeline | ⚠️ 1 month to deadline | 🔴 Critical |

Part 1: Current State Assessment

1.1 What's Working Well ✅

Infrastructure Excellence

  • CI/CD Pipeline: Expanded from 1→9 jobs (6 test configs, 3 integration platforms)
  • Documentation: 27 markdown files, professional MkDocs site at quartumse.com
  • Custom Domain: CNAME properly configured and persisting across deployments
  • Sphinx API Docs: 0 warnings after working a regression back down (44→17→10→0)
  • Lessons Learned: Comprehensive documentation of debugging processes
  • Git Workflow: 195 commits this month, active development

Verdict: Infrastructure is production-ready for a Phase 1 R&D project.

Strategic Clarity

  • Vision: Clear positioning as "vendor-neutral quantum observability layer"
  • Roadmap: Detailed 5-phase plan (2025-2026) with explicit gates
  • Metrics: Well-defined success criteria (SSR, RMSE@$, CI coverage)
  • Competitive Analysis: Strong differentiation from Mitiq, Q-CTRL, vendor SDKs
  • IP Strategy: Patent themes identified, publication plan in place

Verdict: Strategic direction is sound and defensible.

1.2 Critical Gaps ❌

Experimental Execution (CRITICAL)

Phase 1 Task Checklist Status (from phase1_task_checklist.md):

| Task Category | Checked | Unchecked | Completion |
|---------------|---------|-----------|------------|
| Infrastructure | 4/4 | 0 | 100% |
| S-T01/T02 Extended GHZ | 0/5 | 5 | 0% |
| S-BELL Parallel Bell | 0/4 | 4 | 0% |
| S-CLIFF Random Clifford | 0/4 | 4 | 0% |
| S-ISING Ising Chain | 0/4 | 4 | 0% |
| S-CHEM H₂ Energy | 0/4 | 4 | 0% |
| Cross-experiment reporting | 0/2 | 2 | 0% |
| Hardware validation | 0/3 | 3 | 0% |
| C/O/B/M starters | 0/2 | 2 | 0% |
| TOTAL | 4/36 | 32 | 11% |

Reality check:
- ✅ 1 experiment has results: S-T01 smoke test (Oct 22, 2025) on ibm_torino
- ❌ 0 validation datasets in the validation_data/ directory
- ❌ 0 experiment results stored in experiments/shadows/*/results/
- ❌ 32 unchecked tasks with 1 month until the Phase 1 deadline (Nov 2025)

Verdict: CRITICAL BLOCKER - The project has excellent infrastructure but minimal experimental data.

Code Coverage (MEDIUM PRIORITY)

Total Coverage: 23%
Lines: 2,455 valid, 562 covered, 1,893 uncovered

Most Undercovered Modules:
- utils/metrics.py: 21% (404 lines, 319 uncovered) - Analysis utilities
- utils/runtime_monitor.py: 21% (214 lines, 169 uncovered) - IBM Runtime monitoring
- reporting/manifest_io.py: 15% (91 lines, 77 uncovered) - Manifest loading/saving
- shadows/v0_baseline.py: 16% (80 lines, 67 uncovered) - Core algorithm!
- shadows/v1_noise_aware.py: 16% (73 lines, 61 uncovered) - Core algorithm!

Why this matters:
- Core shadow algorithms (v0/v1) are barely tested (16% coverage)
- Analysis utilities are largely untested (21% coverage)
- Risk of bugs in production experiments

Verdict: Code quality is adequate for R&D but needs improvement before Phase 3 (public beta).

1.3 Resource Allocation Analysis

October 2025 Activity Breakdown (estimated from commits):

| Activity | Commits | % Time | Value |
|----------|---------|--------|-------|
| CI/CD fixes | ~40 | 20% | Infrastructure |
| Documentation | ~50 | 26% | Quality |
| Sphinx debugging | ~25 | 13% | Infrastructure |
| Test expansion | ~30 | 15% | Quality |
| Experiments | ~20 | 10% | Core Mission |
| Other (deps, config) | ~30 | 16% | Maintenance |

Problem: Only ~10% of effort went to actual experiments (the core Phase 1 deliverable).

Opportunity cost:
- 40 commits on CI/CD ≈ 2 weeks of development time
- That time could have completed 3-4 experiment campaigns
- Infrastructure is necessary, but over-invested relative to experimental progress


Part 2: Phase 1 Exit Criteria Analysis

2.1 Roadmap Phase 1 Goals (from roadmap.md)

Exit Criteria:
1. ✅ End-to-end run from notebook → manifest → report on Aer
2. ⚠️ + at least one IBM free-tier backend (partial: 1 smoke test only)
3. ❌ SSR ≥ 1.2× on Shadows-Core (sim) and ≥ 1.1× (IBM)
4. ❌ CI coverage ≥ 80% (current: 23%)
5. ❌ Zero critical issues, reproducible seeds & manifests
6. ❌ Patent themes shortlist (top-3) + experiment data to support novelty

Status: 1/6 complete (17%)

2.2 Experiment Deliverables Gap

Required (from roadmap.md):
- S-T01: GHZ(3-5), ⟨Zᵢ⟩, ⟨ZᵢZⱼ⟩, purity, SSR ≥ 1.2 (sim), ≥ 1.1 (IBM)
- S-T02: Calibrate inverse channel, variance reduction comparison
- C-T01: H₂@STO-3G VQE, energy error ≤ 50 mHa (sim), ≤ 80 mHa (IBM)
- O-T01: QAOA on 5-node ring, shot-frugal optimizer comparison
- B-T01: 1-3 qubit RB, XEB, log to manifest
- M-T01: GHZ(3-4) phase sensing, CI coverage ≥ 0.8

Actual:
- ✅ S-T01 smoke test (1 run, Oct 22, ibm_torino, SSR ≥ 1.2)
- ❌ S-T02, C-T01, O-T01, B-T01, M-T01: not started

Data Gap:
- Need: ≥10 hardware trials per experiment for statistical validation
- Have: 1 smoke test
- Gap: ~55 more hardware runs needed (6 experiments × 10 trials - 1 done)

2.3 Timeline Pressure

Days remaining: ~31 days (Oct 30 → Nov 30, 2025)

Required work:
- 5 experiments × (design + implement + 10 hardware runs + analysis) = ~125 person-hours
- Patent theme shortlist + supporting data documentation = ~20 hours
- Code coverage improvement (23% → 80%) = ~80 hours
- Total: ~225 person-hours in 31 days ≈ 7.3 hours/day (aggressive but achievable)

Risk: If current pace continues (10% on experiments), Phase 1 will slip by 2-3 months.


Part 3: Strategic Priorities & Recommendations

3.1 Immediate Actions (Next 7 Days) 🚨

Priority 1: Shift to Experimental Execution

STOP:
- ❌ Further CI/CD enhancements (already production-ready)
- ❌ Documentation polish (already comprehensive)
- ❌ Infrastructure work (defer to Phase 2)

START:
- ✅ Daily hardware runs: schedule IBM Quantum jobs systematically
- ✅ Experiment execution sprints: focused blocks of 3-4 hours on single experiments
- ✅ Results-first mindset: get data now, analyze further later if needed

Priority 2: Execute S-T01/S-T02 Extended Validation

Actionable Steps:
1. Day 1-2: Run S-T01 extended (GHZ 4-5 qubits, 10 trials on ibm_torino or ibm_brisbane) - see the batch-runner sketch after this list
   - Use existing experiments/shadows/S_T01_ghz_baseline.py
   - Add a --trials=10 flag and batch execution
   - Store results in validation_data/s_t01/
2. Day 3-4: Implement the S-T02 noise-aware comparison
   - Add MEM calibration to the S-T01 script
   - Run with/without MEM (10 trials each)
   - Document the variance reduction
3. Day 5-7: Analysis and reporting
   - Compute SSR, CI coverage, MAE across all trials
   - Generate manifests and HTML reports
   - Write discussion notes for patent themes
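
To make the Day 1-2 step concrete, here is a minimal batch-runner sketch. It assumes only qiskit and qiskit-aer; the GHZ construction, the parity estimate, and the manifest fields are illustrative placeholders rather than the actual S_T01_ghz_baseline.py interface or QuartumSE manifest schema, and a real campaign would swap the Aer simulator used here for dev/test for an IBM Runtime backend (ibm_torino or ibm_brisbane).

```python
# Minimal batch-runner sketch for S-T01 extended validation (illustrative only).
# Assumes qiskit + qiskit-aer. The manifest fields below are placeholders, not
# QuartumSE's actual manifest schema; hardware runs would swap AerSimulator for
# an IBM Runtime backend such as ibm_torino or ibm_brisbane.
import json
from datetime import datetime, timezone
from pathlib import Path

from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator


def ghz_circuit(n_qubits: int) -> QuantumCircuit:
    """Prepare an n-qubit GHZ state and measure every qubit."""
    qc = QuantumCircuit(n_qubits)
    qc.h(0)
    for q in range(n_qubits - 1):
        qc.cx(q, q + 1)
    qc.measure_all()
    return qc


def run_trials(n_qubits: int = 4, trials: int = 10, shots: int = 4096,
               out_dir: str = "validation_data/s_t01") -> None:
    """Run `trials` independent GHZ jobs and write one JSON manifest per trial."""
    backend = AerSimulator()
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for trial in range(trials):
        job = backend.run(ghz_circuit(n_qubits), shots=shots, seed_simulator=trial)
        counts = job.result().get_counts()
        # Estimate <Z...Z> from bitstring parities (ideally +1 for even-n GHZ).
        parity = sum((-1) ** bits.count("1") * c for bits, c in counts.items()) / shots
        manifest = {
            "experiment": "S-T01-extended",
            "backend": "aer_simulator",
            "n_qubits": n_qubits,
            "trial": trial,
            "shots": shots,
            "seed": trial,
            "zz_parity": parity,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
        (out / f"trial_{trial:02d}.json").write_text(json.dumps(manifest, indent=2))


if __name__ == "__main__":
    run_trials()
```

The same loop structure carries over to hardware runs; only the backend handle and job submission change, while the per-trial manifest discipline stays identical.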

Deliverables:
- 20+ hardware runs with manifests
- SSR validation data for the Phase 1 gate
- First material for a patent provisional draft

Priority 3: Skeleton Implementations for C/O/B/M

Don't aim for perfection - aim for data:

  • C-T01 (Chemistry):
      • Use existing qiskit.opflow VQE example
      • Run H₂ with 2-3 parameter settings
      • Compare shadow vs grouped readout
      • Target: 3 runs × 2 methods = 6 data points
  • O-T01 (Optimization) - a minimal sketch follows this section's rationale:
      • QAOA p=1 on 5-node MAX-CUT
      • Fixed parameters, just compare shot efficiency
      • Target: 2 runs × 2 shot budgets = 4 data points
  • B-T01 (Benchmarking):
      • 1-2 qubit RB (use qiskit.ignis)
      • Log to manifest, compare to IBM calibration
      • Target: 2 RB sequences
  • M-T01 (Metrology):
      • GHZ(3) with Z-rotation parameter estimation
      • Target: 1 proof-of-concept run

Time estimate: 2 days per workstream = 8 days total

Rationale: Phase 1 needs breadth (all workstreams touched) more than depth (perfect implementations). Get first data drops to validate end-to-end workflow.
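
As a concrete example of the "skeleton, not perfect" spirit, here is a minimal O-T01-style sketch: QAOA at p=1 on the 5-node ring, fixed (untuned) angles, two shot budgets, run on Aer for dev/test. The angles, shot counts, and structure are illustrative assumptions, not project-validated settings.

```python
# Minimal O-T01-style skeleton: QAOA p=1 on a 5-node ring MAX-CUT with fixed
# parameters, comparing two shot budgets. Assumes qiskit + qiskit-aer; the
# parameter values are illustrative placeholders, not tuned settings.
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator

EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]  # 5-node ring


def qaoa_p1_circuit(gamma: float, beta: float) -> QuantumCircuit:
    qc = QuantumCircuit(5)
    qc.h(range(5))                      # uniform superposition
    for i, j in EDGES:                  # cost layer: exp(-i * gamma * Z_i Z_j)
        qc.cx(i, j)
        qc.rz(2 * gamma, j)
        qc.cx(i, j)
    for q in range(5):                  # mixer layer: exp(-i * beta * X_q)
        qc.rx(2 * beta, q)
    qc.measure_all()
    return qc


def expected_cut(counts: dict) -> float:
    """Average MAX-CUT value over sampled bitstrings."""
    total = sum(counts.values())
    value = 0.0
    for bits, freq in counts.items():
        bits = bits[::-1]               # qiskit bit order: qubit 0 is rightmost
        cut = sum(1 for i, j in EDGES if bits[i] != bits[j])
        value += cut * freq / total
    return value


if __name__ == "__main__":
    backend = AerSimulator()
    gamma, beta = 0.7, 0.4              # fixed, untuned parameters
    for shots in (256, 2048):           # compare shot budgets
        counts = backend.run(qaoa_p1_circuit(gamma, beta), shots=shots).result().get_counts()
        print(f"shots={shots:5d}  mean cut value={expected_cut(counts):.3f}")
```

A starter at this level of polish is enough to exercise the manifest and reporting workflow end-to-end; shot-frugal optimizer comparisons can be layered on later.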

3.2 Medium-Term Actions (Next 2-3 Weeks)

Iterative Hardware Campaigns

Week 1 (Nov 1-7):
- S-T01/S-T02 extended validation (20 runs)
- C-T01 chemistry starter (6 runs)
- Document results continuously

Week 2 (Nov 8-14):
- O-T01 optimization starter (4 runs)
- B-T01 benchmarking starter (2 runs)
- M-T01 metrology starter (1 run)

Week 3 (Nov 15-21):
- Repeat high-value experiments for statistical power
- Analysis sprint: compute all Phase 1 metrics
- Patent theme drafting with experimental evidence

Week 4 (Nov 22-30):
- Cross-experiment reporting
- Phase 1 gate review preparation
- Phase 2 planning

Code Quality Improvements

Defer large refactors, but do:
1. Add integration tests for each new experiment (adds ~10% coverage each)
2. Test analysis utilities as experiments generate data
3. Focus on experiment reproducibility tests (most valuable for Phase 1) - a sketch follows below

Goal: Reach 40-50% coverage (realistic for R&D phase) rather than 80% (defer to Phase 3).
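
For item 3, a sketch of what an experiment-reproducibility test could look like. The import path and the run_s_t01 signature are hypothetical stand-ins for whatever entry point the experiment script actually exposes; the pattern (same seed produces identical estimates, and every run emits a manifest) is the part worth copying.

```python
# Sketch of the reproducibility tests referenced above. `run_s_t01` and its
# signature are hypothetical stand-ins for the experiment's actual entry point;
# the assertion pattern is what matters, not the exact API.
from experiments.shadows.S_T01_ghz_baseline import run_s_t01  # hypothetical import


def test_s_t01_is_seed_deterministic():
    first = run_s_t01(n_qubits=3, shots=1024, seed=42, backend="aer")
    second = run_s_t01(n_qubits=3, shots=1024, seed=42, backend="aer")
    assert first.estimates == second.estimates, "same seed should reproduce results"


def test_s_t01_writes_a_manifest(tmp_path):
    run_s_t01(n_qubits=3, shots=1024, seed=7, backend="aer", out_dir=tmp_path)
    assert list(tmp_path.glob("*.json")), "every run should emit a manifest"
```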

3.3 Risks & Mitigation

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| IBM Runtime quota exhaustion | High | Critical | Use quartumse runtime-status weekly; batch jobs; prioritize S-T01/S-T02 |
| Experiments fail (low SSR) | Medium | High | Accept partial results; document learnings; adjust Phase 2 |
| Phase 1 deadline missed | High | Medium | Negotiate 1-month extension; reframe as "Phase 1.5" |
| Hardware access issues | Medium | High | Pre-book time windows; keep Aer as a backup for dev/test |
| Burnout from pressure | Medium | Critical | Work sustainably; 7 hrs/day, not 12 hrs/day |

Part 4: Technical Debt & Code Quality

4.1 Current Technical Debt

High Priority (address in Phase 2):
1. Shadow algorithms barely tested (16% coverage)
   - Core v0/v1 implementations need unit tests
   - Edge cases (empty observables, invalid configs) untested
2. Reporting pipeline fragile
   - 15-20% coverage on manifest I/O
   - Error handling for corrupted manifests missing
3. No integration tests for hardware
   - All integration tests use Aer
   - Real IBM backend behavior untested in CI

Medium Priority (address in Phase 3):
1. Type coverage incomplete
   - Many functions lack type annotations
   - Mypy is likely missing issues
2. Experiment reproducibility
   - Seeds not fully deterministic
   - Manifest replay not tested end-to-end
3. Error messages not user-friendly
   - Stack traces instead of actionable guidance
   - No troubleshooting docs for common issues

Low Priority (defer to Phase 4+):
1. Performance optimization
   - No profiling done
   - Shadow estimation could be vectorized
2. Logging consistency
   - Mix of print(), logging, and rich.console
   - No structured logging

4.2 Code Quality Quick Wins

If time permits (don't prioritize over experiments):
1. Add docstring examples to the top 5 public APIs (illustration below)
2. Write tests for utils/args.py (already in active use, so it should be solid)
3. Add error handling to manifest loading
4. Create a troubleshooting guide for common setup issues
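
For quick win 1, this is the kind of doctest-style example that could be added; mean_absolute_error is a hypothetical placeholder for whichever public helpers the project actually exposes. Doctest examples can then be run in CI via `pytest --doctest-modules`, so the documentation doubles as a cheap regression test.

```python
# Illustration of quick win #1: a doctest-style usage example in a public API
# docstring. `mean_absolute_error` and its location are hypothetical; the point
# is that `pytest --doctest-modules` can exercise the example for free.
def mean_absolute_error(estimates, references):
    """Return the mean absolute error between estimates and reference values.

    Example
    -------
    >>> mean_absolute_error([0.5, 1.5], [1.0, 1.0])
    0.5
    """
    pairs = list(zip(estimates, references))
    return sum(abs(e - r) for e, r in pairs) / len(pairs)
```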


Part 5: Resource & Workload Planning

5.1 Realistic Capacity Assessment

Assumptions:
- 1 developer (you + Claude assist)
- 4-6 effective hours/day (sustainable pace)
- 31 days remaining in Phase 1

Available capacity: 124-186 hours

Required work (revised estimates):

| Task | Hours | Priority |
|------|-------|----------|
| S-T01/T02 extended validation | 40 | P0 |
| C/O/B/M starter experiments | 50 | P0 |
| Analysis & reporting | 30 | P0 |
| Patent theme drafting | 15 | P1 |
| Code coverage to 40% | 30 | P2 |
| Phase 1 gate review prep | 10 | P1 |
| P0+P1 Total | 145 | Fits in capacity ✅ |

Verdict: Phase 1 exit criteria achievable if focus shifts to experiments immediately.

Going forward (next 31 days):
- 70% on experiments (hardware runs, analysis, reporting)
- 15% on documentation (results write-up, patent themes)
- 10% on code quality (tests for new code only)
- 5% on infrastructure (bug fixes only, no enhancements)

Compared to the current allocation: 10% on experiments → 70% on experiments (a 7× increase) ✅


Part 6: Strategic Positioning & Competitive Analysis

6.1 Market Positioning Strengths

Differentiation is clear:
1. Vendor-neutrality: real advantage as IBM/AWS/IonQ compete
2. Provenance: unique offering in the quantum space
3. Cost-for-accuracy metrics: novel framing (RMSE@$)
4. Multi-observable reuse: the classical shadows USP

Patent themes (from project_bible.md):
1. Variance-Aware Adaptive Classical Shadows (VACS) ✅ Strong
2. Shadow-VQE readout integration ✅ Strong
3. Shadow-Benchmarking workflow ✅ Moderate

Recommendation: Focus on VACS for first provisional - most defensible IP.

6.2 Competitive Threats

| Competitor | Threat Level | Mitigation |
|------------|--------------|------------|
| Mitiq (Unitary Fund) | Medium | They do mitigation, not shadows; different scope |
| Q-CTRL Fire Opal | Low | Commercial, expensive; we're open + auditable |
| IBM Estimator Primitive | High | If IBM adds shadows to Qiskit, we lose the USP |
| Academic preprints | Medium | Publish fast; file patents before arXiv |

Urgency for IP protection: HIGH - shadows are a hot research area; file provisionals in Phase 2.

6.3 Publication Strategy

Target venues (from roadmap.md):
- PRX Quantum (high impact)
- npj Quantum Information (fast turnaround)
- Quantum journal (open access)

Timing:
- Phase 2 (Dec 2025): arXiv preprints
- Phase 3 (Q1 2026): journal submissions

Critical: Need strong experimental data from Phase 1 to support papers.


Part 7: Phase 2 Planning Considerations

7.1 Phase 2 Objectives (from roadmap.md)

Focus: Hardware-first iteration & patent drafts (Nov-Dec 2025)

Key deliverables:
- Shadows v2 (Fermionic) for 2-RDM
- Shadows v3 (Adaptive/Derandomized)
- MEM + RC + ZNE combinations
- IBM hardware campaign #1 dataset
- Provisional patent draft(s)
- Two arXiv preprints

Exit criteria:
- SSR ≥ 1.3× on IBM
- Provisional patent(s) filed
- arXiv preprints ready

If Phase 1 extends to Dec 2025 (likely):
- Merge Phase 1/2 into "Phase 1 Extended" (Nov-Dec)
- Defer Shadows v2/v3 to Phase 2 (Jan-Feb 2026)
- Keep patent/publication targets in Phase 2

Rationale: Better to complete Phase 1 properly than rush into Phase 2 with incomplete data.

7.2 Phase Boundary Flexibility

Option A: Strict Gates (current plan)
- Phase 1 → Phase 2 requires all 6 exit criteria
- Risk: may never advance if the criteria are too strict

Option B: Flexible Gates (recommended)
- Phase 1 → Phase 2 requires core criteria only:
  1. S-T01/S-T02 validation complete (SSR ≥ 1.1 on IBM)
  2. All workstreams touched (C/O/B/M starters exist)
  3. Manifest + reporting workflow proven
- Defer: perfect CI coverage, zero issues, complete patent themes

Recommendation: Adopt Option B - focus on mission-critical experiments, accept some technical debt.


Part 8: Next Actions (Prioritized)

Immediate (This Week)

1. 📋 Create Experiment Execution Sprint Plan
   - [ ] Schedule S-T01 extended runs (10 trials)
   - [ ] Book IBM Quantum time windows (check runtime-status)
   - [ ] Set up validation_data/ directories for results
   - [ ] Create an experiment execution checklist

2. 🔬 Start S-T01 Extended Validation
   - [ ] Run GHZ(4) 10 times on an IBM backend
   - [ ] Store manifests in validation_data/s_t01/
   - [ ] Analyze SSR, CI coverage, MAE
   - [ ] Write discussion notes

3. 📊 Baseline Current Phase 1 Status
   - [ ] Update phase1_task_checklist.md with accurate checkboxes
   - [ ] Document what's actually complete vs planned
   - [ ] Estimate hours required for remaining tasks

Short-Term (Next 2 Weeks)

1. 🧪 Execute C/O/B/M Starters
   - [ ] C-T01: H₂ VQE (skeleton implementation)
   - [ ] O-T01: QAOA MAX-CUT (skeleton)
   - [ ] B-T01: RB sequences (skeleton)
   - [ ] M-T01: GHZ phase sensing (skeleton)

2. 📝 Patent Theme Drafting
   - [ ] Review S-T01 data for VACS evidence
   - [ ] Draft a provisional patent outline
   - [ ] Consult an IP attorney (if available)

3. 📈 Analysis & Reporting Sprint
   - [ ] Compute Phase 1 metrics across all experiments
   - [ ] Generate HTML/PDF reports for all runs
   - [ ] Write a cross-experiment comparison

Medium-Term (Next 3-4 Weeks)

1. 🔁 Iterative Validation
   - [ ] Repeat high-SSR experiments for statistical power
   - [ ] Run ablation studies (with/without MEM, etc.)
   - [ ] Document failure modes and lessons

2. ✅ Phase 1 Gate Review
   - [ ] Self-assessment against exit criteria
   - [ ] Decide: proceed to Phase 2 or extend Phase 1?
   - [ ] Update the roadmap based on learnings

3. 🏗️ Phase 2 Planning
   - [ ] Design Shadows v2/v3 implementations
   - [ ] Draft the Phase 2 hardware campaign plan
   - [ ] Identify Phase 2 quick wins

Part 9: Success Metrics & KPIs

9.1 Phase 1 Completion Metrics

Primary (must achieve):
- [ ] S-T01/S-T02: ≥10 hardware runs, SSR ≥ 1.1 on IBM
- [ ] C/O/B/M: ≥1 run each with manifest + report
- [ ] Patent themes: top 3 identified with supporting data
- [ ] Hardware validation: complete with documented results

Secondary (nice to have):
- [ ] Code coverage: 40%+ (up from 23%)
- [ ] CI coverage: 60%+ (vs target 80%)
- [ ] Cross-provider: at least 1 AWS Braket run

9.2 Weekly Progress Tracking

Proposed weekly check-ins:
- Mondays: plan the week's experiments, check IBM runtime status
- Wednesdays: mid-week progress review, adjust if needed
- Fridays: data analysis, update the checklist, plan next week

Metrics to track:
- Hardware runs completed (target: 2-3/week)
- Manifests generated (target: match hardware runs) - a tally sketch follows below
- Tasks checked off (target: 2-3/week)
- SSR measurements (target: ≥1.1 on all tests)
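
Since every run already writes a manifest, weekly tracking can be pulled straight from the stored files rather than maintained by hand. A minimal sketch, assuming manifests live under validation_data/ as JSON files with a "timestamp" field (as in the batch-runner sketch earlier); the field names and layout are assumptions, not the project's fixed schema.

```python
# Sketch of a weekly progress tally computed directly from stored manifests.
# Assumes JSON manifests under validation_data/ containing a "timestamp" field;
# these names are illustrative, not QuartumSE's fixed schema.
import json
from collections import Counter
from datetime import datetime
from pathlib import Path


def weekly_run_counts(root: str = "validation_data") -> Counter:
    """Count manifests per ISO week, e.g. Counter({'2025-W45': 12, ...})."""
    counts: Counter = Counter()
    for path in Path(root).rglob("*.json"):
        manifest = json.loads(path.read_text())
        ts = datetime.fromisoformat(manifest["timestamp"])
        year, week, _ = ts.isocalendar()
        counts[f"{year}-W{week:02d}"] += 1
    return counts


if __name__ == "__main__":
    for week, n in sorted(weekly_run_counts().items()):
        print(f"{week}: {n} runs with manifests")
```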


Part 10: Conclusion & Recommendations

10.1 Overall Assessment

Grade: B (Good infrastructure, behind on experiments)

Strengths:
- ✅ Excellent strategic vision and roadmap
- ✅ Professional CI/CD and documentation
- ✅ Clear competitive differentiation
- ✅ Solid technical foundation

Weaknesses:
- ❌ Experimental execution significantly behind schedule
- ❌ Code coverage low for core algorithms
- ❌ Resource allocation skewed toward infrastructure

Opportunity:
- ⚡ A 1-month sprint can get Phase 1 back on track
- ⚡ The infrastructure is ready - it just needs to be used
- ⚡ Shifting effort on experiments from 10% → 70% is a 7× increase in experimental throughput

10.2 Critical Path Forward

The next 31 days:
1. Focus obsessively on experiments (70% of time)
2. Execute S-T01/S-T02 extended validation (20 runs)
3. Get first data drops from C/O/B/M (skeleton implementations)
4. Draft patent themes with experimental evidence
5. Prepare the Phase 1 gate review

Key principle: "Done is better than perfect" for Phase 1. Get experimental data, even if implementations are basic.

10.3 Risk-Adjusted Recommendations

Recommended Plan:
1. Accept a 1-month Phase 1 extension (Nov → Dec 2025)
   - More realistic given current progress
   - Maintains quality without excessive pressure
2. Redefine Phase 1 exit criteria (flexible gates)
   - Focus on experiments, accept some technical debt
   - 40% code coverage instead of 80%
   - Provisional patent outline instead of a filed provisional
3. Merge Phase 1/2 objectives (Nov-Dec 2025)
   - Run Phase 1 experiments and start Phase 2 planning in parallel
   - Begin patent drafting during experimental data collection
4. Reassess in January 2026
   - Formal Phase 2 start after the holidays
   - Fresh roadmap based on Phase 1 learnings

10.4 One-Sentence Recommendation

Stop polishing infrastructure, start running experiments daily, and produce 20+ hardware runs with manifests in the next month to validate the QuartumSE value proposition with real data.


Appendices

Appendix A: Repository Health Metrics

Total Source Files: 26
Total Tests: 104
Total Documentation: 27 files
Commits (Oct 2025): 195
Code Coverage: 23%
CI Jobs: 9 (test×6 + integration×3)
Documentation Site: quartumse.com

Appendix B: Experiment Status Matrix

| Experiment | Hardware Runs |
|------------|---------------|
| S-T01 GHZ | ⚠️ 1/10 |
| S-T02 Noise | ⚠️ ❌ 0/10 |
| S-BELL | ❌ 0/10 |
| S-CLIFF | ❌ 0/10 |
| S-ISING | ❌ 0/10 |
| S-CHEM (H₂) | ❌ 0/6 |
| C-T01 VQE | ❌ 0/3 |
| O-T01 QAOA | ❌ 0/2 |
| B-T01 RB | ❌ 0/2 |
| M-T01 Phase | ❌ 0/1 |

Summary: 10 experiments planned, 1 partially complete (10%), 54 hardware runs needed.

Appendix C: Lessons Learned Summary

From recent CI/CD debugging (docs/ops/lessons_learned_sphinx_ci.md):
1. Don't mix Sphinx and MkDocs documentation
2. Use suppress_warnings correctly (nitpick_ignore ≠ suppress_warnings) - see the conf.py sketch below
3. Test documentation builds in CI
4. Document debugging processes for future reference
5. Follow your own lessons-learned docs (PR #62 regression)
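
To make lesson 2 concrete, a minimal conf.py fragment contrasting the two settings. The specific reference target and warning category are illustrative examples, not the project's actual Sphinx configuration.

```python
# conf.py fragment illustrating lesson 2 (entries are examples, not the
# project's actual configuration).

# nitpick_ignore silences *specific missing cross-reference targets* when
# nitpicky = True; it does nothing for other kinds of warnings.
nitpicky = True
nitpick_ignore = [
    ("py:class", "qiskit.circuit.QuantumCircuit"),  # example external target
]

# suppress_warnings silences whole *warning categories* by name; it is not a
# substitute for fixing (or nitpick-ignoring) broken references.
suppress_warnings = [
    "ref.python",  # example category
]
```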

Meta-lesson: Technical debt (Sphinx regressions) can consume significant time. Prevention > firefighting.


Document Status: Draft for review
Next Update: After Phase 1 sprint (Nov 7, 2025)
Feedback: Please update this document as decisions are made and progress is tracked.