Hardware Smoke Test - Rationale

Experiment ID: SMOKE-HW
Workstream: S (Shadows)
Status: Completed (Nov 3, 2025)
Phase: Phase 1 Foundation & R&D

Overview

The Hardware Smoke Test validates QuartumSE's classical shadows implementation on real IBM quantum hardware for the first time. Following successful simulator validation (SMOKE-SIM with SSR=17.37×), this experiment transitions to noisy intermediate-scale quantum (NISQ) devices to verify hardware integration, characterize performance degradation under realistic noise, and validate provenance capture from IBM Quantum runtime.

Scientific Rationale

Why This Experiment?

Hardware Integration Validation: Verify that QuartumSE's backend abstraction correctly interfaces with IBM Quantum Runtime, handling job submission, queue management, and result retrieval.
Noise Characterization: Establish baseline performance metrics under realistic hardware noise (gate errors, decoherence, readout errors) to set expectations for extended validation campaigns.
Provenance Under Real Conditions: Test that calibration data capture, backend snapshots, and manifest generation work correctly with live quantum processors (vs. idealized simulators).
Queue & Resource Management: Validate runtime budget tracking, backend selection logic, and operational procedures before committing to expensive multi-experiment campaigns.
Risk Mitigation for Phase 1: Quick validation run (100 shots, 3-qubit GHZ) minimizes cost and queue time while confirming readiness for extended experiments (S-T01, S-T02).
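To make the estimation task concrete, here is a minimal, noise-free sketch of random-Pauli classical shadows on a 3-qubit GHZ state, written in plain NumPy. It is an illustration under stated assumptions, not QuartumSE's implementation: the function names (ghz_state, sample_shadow, estimate_pauli), the use of uniformly random X/Y/Z bases per qubit, and the five example observables are all choices made here, since the text above does not list them.

```python
import numpy as np

# Basis-change unitaries applied before a computational-basis readout.
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S_DAG = np.array([[1, 0], [0, -1j]], dtype=complex)
ROTATIONS = {"X": H, "Y": H @ S_DAG, "Z": np.eye(2, dtype=complex)}


def ghz_state(n):
    """Statevector of (|0...0> + |1...1>)/sqrt(2) on n qubits."""
    psi = np.zeros(2**n, dtype=complex)
    psi[0] = psi[-1] = 1 / np.sqrt(2)
    return psi


def sample_shadow(psi, n, shots, rng):
    """Measure each shot in an independent, uniformly random Pauli basis per qubit."""
    records = []
    for _ in range(shots):
        bases = rng.choice(list("XYZ"), size=n)
        unitary = np.array([[1.0 + 0j]])
        for basis in bases:
            unitary = np.kron(unitary, ROTATIONS[basis])
        probs = np.abs(unitary @ psi) ** 2
        outcome = rng.choice(2**n, p=probs / probs.sum())
        # Qubit 0 is the most significant bit, matching the kron order above.
        bits = [(int(outcome) >> (n - 1 - q)) & 1 for q in range(n)]
        records.append(("".join(bases), bits))
    return records


def estimate_pauli(records, pauli):
    """Average the single-shot shadow estimates of a Pauli string such as 'ZZI'."""
    total = 0.0
    for bases, bits in records:
        value = 1.0
        for basis, bit, p in zip(bases, bits, pauli):
            if p == "I":
                continue
            if basis != p:
                value = 0.0  # wrong basis: this shot carries no information on p
                break
            value *= 3.0 * (1 - 2 * bit)  # inverse-channel factor 3 times the +/-1 outcome
        total += value
    return total / len(records)


# Illustrative run at the smoke-test budget (100 shots); observables are assumed.
rng = np.random.default_rng(seed=7)
shadow = sample_shadow(ghz_state(3), n=3, shots=100, rng=rng)
for observable in ["ZII", "IZI", "IIZ", "ZZI", "IZZ"]:
    print(observable, round(estimate_pauli(shadow, observable), 3))
```

On an ideal GHZ state the single-qubit Z expectations are 0 and the ZZ expectations are 1, so any deviation in this sketch comes from finite sampling alone. On hardware, gate and readout errors add to that spread.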

Connection to Larger Research Plan

This experiment is a critical gate for Phase 1 progression:

Prerequisite for S-T01/S-T02: Extended GHZ and noise-aware experiments require validated hardware access
Informs Mitigation Strategy: Hardware noise characterization guides MEM and ZNE parameter tuning
Unblocks Cross-Workstream: Chemistry (C-T01), Optimization (O-T01), and other workstreams depend on proven hardware pipeline
Phase 1 Exit Criterion: "End-to-end run on at least one IBM backend" - this experiment satisfies that requirement

Relationship to SMOKE-SIM:
- SMOKE-SIM: Ideal performance (SSR=17.37×)
- SMOKE-HW: Realistic performance (expected SSR=1.1-2×)
- Gap analysis informs v1 noise-aware shadows development
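The size of the simulator-to-hardware gap can be worked out directly from these figures. The sketch below reads SSR as a shot-savings ratio, which is an assumption because the acronym is not expanded here, and uses one common way to compute such a ratio: baseline shots over shadow shots needed for the same standard error, assuming the error scales as 1/sqrt(shots). The function names are hypothetical and QuartumSE's exact formula may differ.

```python
def shot_savings_ratio(baseline_shots, baseline_stderr, shadow_shots, shadow_stderr):
    """Baseline shots divided by shadow shots needed to reach the same standard error.

    Assumes the standard error of both estimators scales as 1/sqrt(shots).
    """
    values = (baseline_shots, baseline_stderr, shadow_shots, shadow_stderr)
    if any(v <= 0 for v in values):
        raise ValueError("shots and standard errors must be positive")
    return (baseline_shots * baseline_stderr**2) / (shadow_shots * shadow_stderr**2)


def degradation_factor(sim_ssr, hw_ssr):
    """How many times smaller the hardware SSR is than the simulator SSR."""
    return sim_ssr / hw_ssr


SIM_SSR = 17.37  # reported for SMOKE-SIM
for hw_ssr in (1.1, 2.0):  # expected SMOKE-HW range
    print(f"hardware SSR {hw_ssr}: gap factor {degradation_factor(SIM_SSR, hw_ssr):.1f}")
```

With the expected hardware range of 1.1-2×, the simulator figure is roughly 9 to 16 times larger, which is the gap that the v1 noise-aware work would need to narrow.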

Relevant Literature

- IBM Quantum Backend Properties (IBM Quantum Documentation, 2024)
  - Calibration data interpretation (T1, T2, gate/readout errors)
  - Backend selection best practices for free-tier access
  - Queue management and runtime optimization
- Huang, H.-Y., et al. (2021). "Efficient estimation of Pauli observables by derandomization." Physical Review Letters, 127(3), 030503.
  - Discusses practical shadow implementation on NISQ hardware
  - Addresses finite-sampling effects and noise robustness
- Temme, K., Bravyi, S., & Gambetta, J. M. (2017). "Error mitigation for short-depth quantum circuits." Physical Review Letters, 119(18), 180509.
  - Foundational work on error mitigation by zero-noise extrapolation (ZNE) and probabilistic error cancellation
  - Informs S-T02 mitigation strategy design
- Chen, S., et al. (2021). "Robust shadow estimation." PRX Quantum, 2(3), 030348.
  - Noise-aware shadows theory (v1 implementation for S-T02)
  - Predicts variance reduction from inverse channel correction

Expected Outcomes and Success Criteria

Primary Success Criteria

Successful Hardware Execution: Job completes without errors on IBM backend
Manifest Capture: Complete provenance including IBM calibration snapshot
Observable Estimation: Obtain estimates with confidence intervals for 3-qubit GHZ observables
Runtime Compliance: Stay within 10-minute free-tier window

Secondary Success Criteria

SSR Characterization: Quantify performance gap vs. simulator (expect 1.1-2× vs. 17×)
Noise Impact Analysis: Compare observed vs. expected values to characterize hardware errors
Calibration Metadata: Capture T1, T2, gate errors, readout errors for future mitigation
Queue Time Tracking: Document submission-to-completion time for operational planning

Quantitative Targets

Metric	Target	Rationale
Execution Success	100%	Must complete without errors
Manifest Completeness	All fields	Provenance validation
Observable Count	5 (3-qubit GHZ)	Same as SMOKE-SIM
Runtime	< 10 minutes	Free-tier compliance
SSR (estimated)	1.1-2×	Realistic noise-limited target
Queue Wait	< 30 minutes	Operational efficiency
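The targets in this table lend themselves to an automated pass/fail check after the run. The following is a minimal sketch: SmokeRunSummary and check_targets are hypothetical names, not part of QuartumSE, and treating the SSR target as a lower bound of 1.1 (so that a result above 2× is not flagged as a failure) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SmokeRunSummary:
    """Hypothetical container for the numbers a finished run would report."""
    completed_without_error: bool
    observable_count: int
    runtime_minutes: float
    ssr: float
    queue_wait_minutes: float
    missing_manifest_fields: List[str] = field(default_factory=list)


def check_targets(run: SmokeRunSummary) -> Dict[str, bool]:
    """Map each metric in the targets table to True (met) or False (missed)."""
    return {
        "Execution Success": run.completed_without_error,
        "Manifest Completeness": not run.missing_manifest_fields,
        "Observable Count": run.observable_count == 5,
        "Runtime": run.runtime_minutes < 10,
        # Assumption: only the lower end of the 1.1-2x range is enforced.
        "SSR (estimated)": run.ssr >= 1.1,
        "Queue Wait": run.queue_wait_minutes < 30,
    }


# Placeholder values for illustration only; these are not measured results.
example = SmokeRunSummary(
    completed_without_error=True,
    observable_count=5,
    runtime_minutes=4.0,
    ssr=1.3,
    queue_wait_minutes=12.0,
)
for metric, met in check_targets(example).items():
    print(f"{metric}: {'met' if met else 'missed'}")
```

Keeping the result as a per-metric dictionary, rather than a single boolean, makes it easy to record which target was missed alongside the manifest.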

Known Limitations

Single Backend Test: Only one IBM backend tested (ibm_fez or ibm_torino), not exhaustive
Small Shot Budget: 100 shots to minimize queue time (insufficient for high-precision SSR)
No Mitigation: v0 baseline only, no MEM/ZNE (deferred to S-T02)
Limited Observables: 3-qubit GHZ Z/ZZ only, not comprehensive state validation
Single Trial: One execution per backend, no statistical replication
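The small-shot-budget limitation can be quantified with a short calculation. Assuming the v0 baseline uses uniformly random single-qubit Pauli bases and ignoring hardware noise (both assumptions), the single-shot estimate of a weight-k Pauli observable is either 0 or ±3^k, so its variance is 3^k minus the squared expectation value. The helper names below are hypothetical.

```python
import math


def shadow_variance(weight, expectation):
    """Single-shot variance of a random-Pauli shadow estimate of a weight-k Pauli."""
    return 3**weight - expectation**2


def shadow_stderr(weight, expectation, shots):
    """Standard error of the mean over `shots` independent snapshots."""
    return math.sqrt(shadow_variance(weight, expectation) / shots)


def shots_for_stderr(weight, expectation, target_stderr):
    """Smallest shot count whose standard error does not exceed the target."""
    return math.ceil(shadow_variance(weight, expectation) / target_stderr**2)


SHOTS = 100  # smoke-test budget
print("Z  (ideal GHZ value 0):", round(shadow_stderr(1, 0.0, SHOTS), 3))
print("ZZ (ideal GHZ value 1):", round(shadow_stderr(2, 1.0, SHOTS), 3))
print("shots for ZZ stderr 0.05:", shots_for_stderr(2, 1.0, 0.05))
```

At 100 shots the ideal-case standard errors are about 0.17 for a single-qubit Z and about 0.28 for a ZZ correlator, which is why this run can confirm the pipeline works but cannot pin down SSR precisely.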

Next Steps After Completion

Upon successful completion:
1. S-T01 Extended GHZ: Scale up to ≥10 trials, larger shadow budgets
2. S-T02 Noise-Aware: Add MEM and compare v0 vs. v1 on same backend
3. Cross-Workstream Launches: Unblock C-T01, O-T01, B-T01, M-T01 with validated hardware access
4. Backend Selection Refinement: Use queue/performance data to optimize future backend choices

Upon failure or unexpected results:
1. Debug hardware integration issues before extended campaigns
2. Adjust shadow budgets or observable selection based on noise levels
3. Consider alternative backends if selected backend underperforms

Part of Phase 1 Research Plan

This experiment is the hardware validation gate in the Phase 1 execution chain:

SMOKE-SIM ──✓──> SMOKE-HW ──────> Extended Validation
(simulator)      (this)                        │
                                               ├─> S-T01 (≥10 trials)
                                               ├─> S-T02 (MEM + v1)
                                               └─> C/O/B/M starters

Phase 1 Status:
- ✅ SMOKE-SIM: Passed (SSR=17.37×)
- 🔄 SMOKE-HW: In progress (this experiment)
- ⏳ S-T01/S-T02: Awaiting SMOKE-HW completion
- ⏳ Cross-workstream: Awaiting SMOKE-HW completion

Risk Mitigation: Keeping this experiment small (3 qubits, 100 shots, single trial) minimizes cost if unexpected issues arise, while still providing sufficient validation to proceed with confidence.

Additional Considerations

Backend Selection Criteria

For this smoke test, select IBM backend based on:
1. Low queue depth (prefer < 100 pending jobs)
2. Recent calibration (< 24 hours old)
3. Good qubit quality (readout error < 5%, T1 > 50 μs on target qubits)
4. Free-tier access (ibm_fez, ibm_marrakesh, ibm_torino, ibm_brisbane)

As of Nov 3, 2025: ibm_fez (77 pending jobs) is optimal.
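The selection criteria above can be sketched as a filter followed by a ranking. This is one possible reading, not QuartumSE's backend selection logic: BackendSnapshot, meets_hard_criteria, and pick_backend are hypothetical names. Calibration age, qubit quality, and free-tier access are treated as hard requirements, while queue depth is treated as a preference, which is an assumption. Apart from the 77 pending jobs on ibm_fez, every number in the example data is a made-up placeholder.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

FREE_TIER_BACKENDS = {"ibm_fez", "ibm_marrakesh", "ibm_torino", "ibm_brisbane"}


@dataclass(frozen=True)
class BackendSnapshot:
    """Hypothetical summary of one backend at selection time."""
    name: str
    pending_jobs: int
    calibration_age_hours: float
    worst_readout_error: float  # fraction, worst case over the target qubits
    shortest_t1_us: float       # microseconds, worst case over the target qubits


def meets_hard_criteria(backend: BackendSnapshot) -> bool:
    """Criteria 2-4: recent calibration, good qubit quality, free-tier access."""
    return (
        backend.name in FREE_TIER_BACKENDS
        and backend.calibration_age_hours < 24
        and backend.worst_readout_error < 0.05
        and backend.shortest_t1_us > 50
    )


def pick_backend(snapshots: Iterable[BackendSnapshot]) -> Optional[BackendSnapshot]:
    """Criterion 1: among eligible backends, prefer queues under 100 jobs, then the shortest."""
    candidates = [b for b in snapshots if meets_hard_criteria(b)]
    if not candidates:
        return None
    return min(candidates, key=lambda b: (b.pending_jobs >= 100, b.pending_jobs))


# Placeholder data for illustration; only ibm_fez's queue depth comes from the text.
snapshots = [
    BackendSnapshot("ibm_fez", 77, 6.0, 0.02, 120.0),
    BackendSnapshot("ibm_torino", 240, 3.0, 0.015, 150.0),
    BackendSnapshot("ibm_brisbane", 40, 30.0, 0.03, 200.0),  # calibration too old
]
choice = pick_backend(snapshots)
print(choice.name if choice else "no eligible backend")
```

Returning None when nothing qualifies leaves the decision to wait or relax a threshold to the operator instead of silently picking a poor backend.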

Operational Learning Goals

Beyond scientific validation, this experiment tests operational workflows:
- quartumse runtime-status CLI for queue monitoring
- Backend descriptor parsing (provider:name syntax)
- Calibration reuse vs. refresh logic
- Manifest storage and retrieval patterns
- Webhook notifications for job completion (if configured)

These operational learnings inform Phase 2 hardware campaign planning.
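As an illustration of the provider:name descriptor syntax mentioned above, a parser could look like the sketch below. The names BackendDescriptor and parse_backend_descriptor are hypothetical, the example string "ibm:ibm_fez" is an assumed instance of the syntax, and QuartumSE's own parser may accept or reject different inputs.

```python
from typing import NamedTuple


class BackendDescriptor(NamedTuple):
    """Hypothetical parsed form of a 'provider:name' string."""
    provider: str
    name: str


def parse_backend_descriptor(text: str) -> BackendDescriptor:
    """Split 'provider:name' at the first colon and reject incomplete descriptors."""
    provider, separator, name = text.strip().partition(":")
    provider, name = provider.strip(), name.strip()
    if not separator or not provider or not name:
        raise ValueError(f"expected 'provider:name', got {text!r}")
    return BackendDescriptor(provider, name)


descriptor = parse_backend_descriptor("ibm:ibm_fez")  # assumed example descriptor
print(descriptor.provider, descriptor.name)
```

Failing loudly on a malformed descriptor is the safer choice for a smoke test, since a silently misread backend name would only surface later as a failed or misrouted job.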