Skip to content

Simulation

This tutorial covers all quantum simulators available in pyqpanda3, explaining when and how to use each one.


Overview

pyqpanda3 provides multiple simulator backends, each optimized for different use cases:

SimulatorState RepresentationNoise SupportBest For
CPUQVMState vectorYes (shot-based)General-purpose simulation
GPUQVMState vector (GPU)YesLarge circuits with CUDA
DensityMatrixSimulatorDensity matrixYes (exact)Noise analysis, state tomography
StabilizerStabilizer stateYesClifford circuits, fast benchmarking
PartialAmplitudeQVMPartial state vectorNoLarge circuits, specific amplitudes

CPUQVM

The CPUQVM is the primary statevector simulator. It simulates quantum circuits by maintaining the full state vector of the quantum system and applying unitary operations directly.

Basic Usage

Create a CPUQVM instance and run a quantum program with a specified number of measurement shots:

python
from pyqpanda3 import core

# Build a Bell state circuit
prog = core.QProg()
prog << core.H(0)
prog << core.CNOT(0, 1)
prog << core.measure([0, 1], [0, 1])

# Create simulator and run
machine = core.CPUQVM()
machine.run(prog, shots=10000)

# Get measurement results
result = machine.result()
counts = result.get_counts()
print(f"Measurement counts: {counts}")
# Expected output: {'00': ~5000, '11': ~5000}

Getting Results

The QResult object provides several methods for extracting simulation data:

python
# Get measurement counts as a dictionary
counts = result.get_counts()
# Returns: {'00': 5023, '11': 4977}

# Get probabilities for specific qubits
prob_list = result.get_prob_list([0, 1])
# Returns: [0.5023, 0.0, 0.0, 0.4977]

# Get probabilities as a dictionary
prob_dict = result.get_prob_dict([0, 1])
# Returns: {'00': 0.5023, '01': 0.0, '10': 0.0, '11': 0.4977}

# Get the number of shots
num_shots = result.shots()
print(f"Shots: {num_shots}")

# Print formatted results
result.print_results()

Getting the State Vector

For circuits without measurement, the state vector represents the quantum state directly:

python
# Build a circuit without measurement
prog = core.QProg()
prog << core.H(0) << core.CNOT(0, 1)

machine = core.CPUQVM()
machine.run(prog, shots=1)

# Get the state vector
sv = machine.result().get_state_vector()
print(f"State vector: {sv}")
# Expected: [1/sqrt(2), 0, 0, 1/sqrt(2)]

The state vector |ψ for n qubits has 2n complex amplitudes:

|ψ=i=02n1αi|i

where |i denotes the computational basis state and αi are complex amplitudes satisfying:

i=02n1|αi|2=1

Running with Noise

CPUQVM supports noise simulation through the NoiseModel parameter:

python
from pyqpanda3 import core

# Create a noise model
noise = core.NoiseModel()
error = core.depolarizing_error(0.01)
noise.add_all_qubit_quantum_error(error, core.GateType.CNOT)

# Build and run noisy circuit
prog = core.QProg()
prog << core.H(0) << core.CNOT(0, 1)
prog << core.measure([0, 1], [0, 1])

machine = core.CPUQVM()
machine.run(prog, shots=10000, model=noise)
print(machine.result().get_counts())
# With noise, may see small counts in '01' and '10'

Output:

{'00': 4950, '11': 4920, '01': 70, '10': 60}

CPUQVM Advanced Features

Beyond basic circuit execution, CPUQVM provides several advanced capabilities for extracting detailed information about the quantum state and optimizing circuit structure.

Flattening Quantum Programs

Complex circuits often contain nested subcircuits, control flow operations, or composite gates. Flattening a program resolves these into a linear sequence of primitive gates, which reduces simulation overhead and makes the circuit structure explicit.

python
from pyqpanda3 import core

# Build a circuit with nested structure
prog = core.QProg()
prog << core.H(0)
prog << core.H(1)
prog << core.CNOT(0, 1)
prog << core.measure([0, 1], [0, 1])

# Flatten the program to a linear gate sequence
flat_prog = prog.flatten()

# Run the flattened program
machine = core.CPUQVM()
machine.run(flat_prog, shots=10000)
print(machine.result().get_counts())

Flattening is especially useful before profiling or transpilation because it ensures that every operation is a primitive gate. The relationship between a circuit and its flattened form is:

|ψflat=Uflat|0n=UkU2U1|0n

where each Ui is a single primitive gate applied in sequence.

Accessing the Full State Vector

When running a circuit without measurement, you can retrieve the complete state vector to inspect individual amplitudes and phases.

python
from pyqpanda3 import core
import cmath

# Prepare a superposition across 3 qubits
prog = core.QProg()
prog << core.H(0) << core.H(1) << core.H(2)

machine = core.CPUQVM()
machine.run(prog, shots=1)

# Retrieve the full state vector
sv = machine.result().get_state_vector()

# Inspect amplitude magnitudes and phases
for idx, amp in enumerate(sv):
    binary = format(idx, '03b')
    magnitude = abs(amp)
    phase = cmath.phase(amp)
    print(f"|{binary}>: amplitude = {amp:.4f}, "
          f"|amp| = {magnitude:.4f}, phase = {phase:.4f}")

For n qubits in an equal superposition, each amplitude is:

αi=12ni{0,,2n1}

Extracting Probability Distributions

Probability distributions can be extracted for any subset of qubits, enabling marginal analysis of multi-qubit states.

python
from pyqpanda3 import core

# Build an entangled 3-qubit state (GHZ state)
prog = core.QProg()
prog << core.H(0)
prog << core.CNOT(0, 1)
prog << core.CNOT(1, 2)

machine = core.CPUQVM()
machine.run(prog, shots=1)

# Get the full probability distribution across all 3 qubits
result = machine.result()
prob_dict = result.get_prob_dict([0, 1, 2])
print("Full probability distribution:")
for state, prob in sorted(prob_dict.items()):
    print(f"  P(|{state}>) = {prob:.6f}")

# Get marginal probability for just qubits 0 and 2
prob_marginal = result.get_prob_dict([0, 2])
print("\nMarginal distribution (qubits 0, 2):")
for state, prob in sorted(prob_marginal.items()):
    print(f"  P(|{state}>) = {prob:.6f}")

# Get probability as an ordered list
prob_list = result.get_prob_list([0, 1, 2])
print(f"\nProbability list: {[f'{p:.6f}' for p in prob_list]}")

For a GHZ state |GHZ=12(|000+|111), the probability distribution is:

P(x)={0.5if x=000 or x=1110otherwise

GPUQVM

The GPUQVM provides CUDA-accelerated simulation for large quantum circuits. It uses the same interface as CPUQVM but leverages GPU parallelism for state vector operations.

Note: GPUQVM is only available when pyqpanda3 is compiled with USE_CUDA=ON.

Usage

python
from pyqpanda3 import core

prog = core.QProg()
prog << core.H(0) << core.CNOT(0, 1) << core.measure([0, 1], [0, 1])

# Create GPU simulator
gpu_machine = core.GPUQVM()
gpu_machine.run(prog, shots=10000)

result = gpu_machine.result()
print(result.get_counts())

When to Use GPUQVM

GPU simulation becomes advantageous when:

  • The number of qubits exceeds 20 (state vector size > 220×16 bytes = 16 MB)
  • You need many shots for statistical accuracy
  • The circuit depth is moderate but the gate count is high

The memory requirement for n qubits is 2n×16 bytes (complex double). For example:

QubitsState Vector SizeMemory
2022016 MB
25225512 MB
3023016 GB

DensityMatrixSimulator

The DensityMatrixSimulator maintains the full density matrix ρ of the quantum system, enabling exact noise simulation and mixed state analysis.

What is a Density Matrix?

A density matrix generalizes the state vector to describe both pure and mixed quantum states:

ρ=ipi|ψiψi|

For a pure state |ψ, the density matrix is:

ρ=|ψψ|

Basic Usage

python
from pyqpanda3 import core

# Create density matrix simulator
dm_sim = core.DensityMatrixSimulator()

# Build circuit (no measurement needed)
prog = core.QProg()
prog << core.H(0) << core.CNOT(0, 1)

# Run simulation
dm_sim.run(prog)

# Get the density matrix
dm = dm_sim.density_matrix()
print(f"Density matrix shape: {dm.shape}")
print(f"Density matrix:\n{dm}")

Probabilities

The DensityMatrixSimulator provides several methods for computing probabilities:

python
# Probability of a specific state by index
prob_0 = dm_sim.state_prob(0)
print(f"P(|00⟩) = {prob_0}")

# Probability by binary string
prob_01 = dm_sim.state_prob("01")
print(f"P(|01⟩) = {prob_01}")

# All probabilities
all_probs = dm_sim.state_probs()
print(f"All probabilities: {all_probs}")

# Probabilities for specific qubits
qubit_probs = dm_sim.state_probs([0, 1])
print(f"Qubit [0,1] probabilities: {qubit_probs}")

The probability of measuring state |i is:

P(i)=i|ρ|i=ρii

Reduced Density Matrix

The partial trace operation computes the reduced density matrix for a subset of qubits:

python
# Get reduced density matrix for qubit 0
reduced = dm_sim.reduced_density_matrix([0])
print(f"Reduced density matrix for qubit 0:\n{reduced}")

The partial trace over qubit B of a bipartite system:

ρA=TrB(ρAB)=jjB|ρAB|jB

Noise Simulation

The DensityMatrixSimulator excels at exact noise simulation:

python
from pyqpanda3 import core

# Create noise model
noise = core.NoiseModel()
noise.add_all_qubit_quantum_error(
    core.amplitude_damping_error(0.05),
    core.GateType.H
)
noise.add_all_qubit_quantum_error(
    core.depolarizing_error(0.01),
    core.GateType.CNOT
)

# Run with noise
dm_sim = core.DensityMatrixSimulator()
prog = core.QProg()
prog << core.H(0) << core.CNOT(0, 1)
dm_sim.run(prog, noise)

# Check resulting mixed state
dm = dm_sim.density_matrix()
print(f"Trace: {dm_sim.density_matrix().trace()}")
print(f"State probabilities: {dm_sim.state_probs()}")

After noise, the evolution follows:

ρ=iKiUρUKi

where Ki are the Kraus operators of the noise channel.

Density Matrix Operations

The DensityMatrixSimulator provides powerful tools for analyzing quantum states beyond simple probability extraction. These operations are essential for quantum information analysis, entanglement quantification, and noise characterization.

Partial Trace and Reduced Density Matrices

The partial trace operation lets you focus on a subsystem by tracing out (marginalizing over) the remaining qubits. This is the quantum analog of marginalizing a classical joint probability distribution.

python
from pyqpanda3 import core
import numpy as np

# Create a maximally entangled Bell state
dm_sim = core.DensityMatrixSimulator()
prog = core.QProg()
prog << core.H(0) << core.CNOT(0, 1)
dm_sim.run(prog)

# Full 2-qubit density matrix
dm_full = dm_sim.density_matrix()
print(f"Full density matrix shape: {dm_full.shape}")
# Output: (4, 4)

# Reduced density matrix for qubit 0 only
reduced_q0 = dm_sim.reduced_density_matrix([0])
print(f"Reduced density matrix for qubit 0:\n{reduced_q0}")
# For a Bell state, each single-qubit reduced state is maximally mixed

# Reduced density matrix for qubit 1 only
reduced_q1 = dm_sim.reduced_density_matrix([1])
print(f"Reduced density matrix for qubit 1:\n{reduced_q1}")

For the Bell state |Φ+=12(|00+|11), the full density matrix is:

ρAB=|Φ+Φ+|=12(1001000000001001)

Tracing out qubit B yields the maximally mixed state for qubit A:

ρA=TrB(ρAB)=I2=12(1001)

Purity Calculation

Purity quantifies how "mixed" a quantum state is. A pure state has purity 1, while a maximally mixed state on d dimensions has purity 1/d.

python
from pyqpanda3 import core
import numpy as np

dm_sim = core.DensityMatrixSimulator()

# Case 1: Pure state (no noise)
prog_pure = core.QProg()
prog_pure << core.H(0) << core.CNOT(0, 1)
dm_sim.run(prog_pure)
dm_pure = dm_sim.density_matrix()

# Purity is Tr(rho^2)
purity_pure = np.trace(dm_pure @ dm_pure).real
print(f"Pure state purity: {purity_pure:.6f}")
# Output: 1.000000

# Case 2: Mixed state (with depolarizing noise)
noise = core.NoiseModel()
noise.add_all_qubit_quantum_error(
    core.depolarizing_error(0.1),
    core.GateType.CNOT
)
dm_sim2 = core.DensityMatrixSimulator()
dm_sim2.run(prog_pure, noise)
dm_noisy = dm_sim2.density_matrix()

purity_noisy = np.trace(dm_noisy @ dm_noisy).real
print(f"Noisy state purity: {purity_noisy:.6f}")
# Output: < 1.0 (mixed state)

Purity is defined as:

Tr(ρ2)=i,j|ρij|2

The purity satisfies 1/dTr(ρ2)1 where d=2n is the Hilbert space dimension.

Von Neumann Entropy

The von Neumann entropy quantifies the amount of uncertainty (or entanglement) in a quantum state. It is zero for pure states and maximal for maximally mixed states.

python
from pyqpanda3 import core
import numpy as np

def von_neumann_entropy(rho):
    """Compute von Neumann entropy S(rho) = -Tr(rho * log2(rho))."""
    eigenvalues = np.linalg.eigvalsh(rho.real)
    # Filter out zero eigenvalues to avoid log(0)
    eigenvalues = eigenvalues[eigenvalues > 1e-12]
    entropy = -np.sum(eigenvalues * np.log2(eigenvalues))
    return entropy

dm_sim = core.DensityMatrixSimulator()

# Pure state: entropy = 0
prog_pure = core.QProg()
prog_pure << core.H(0) << core.CNOT(0, 1)
dm_sim.run(prog_pure)
dm_pure = dm_sim.density_matrix()
print(f"Pure state entropy: {von_neumann_entropy(dm_pure):.6f}")
# Output: 0.000000

# Noisy state: entropy > 0
noise = core.NoiseModel()
noise.add_all_qubit_quantum_error(
    core.depolarizing_error(0.1),
    core.GateType.H
)
noise.add_all_qubit_quantum_error(
    core.depolarizing_error(0.05),
    core.GateType.CNOT
)
dm_sim2 = core.DensityMatrixSimulator()
dm_sim2.run(prog_pure, noise)
dm_noisy = dm_sim2.density_matrix()
print(f"Noisy state entropy: {von_neumann_entropy(dm_noisy):.6f}")
# Output: > 0 (information lost to noise)

# Entanglement entropy via reduced density matrix
reduced = dm_sim.reduced_density_matrix([0])
print(f"Entanglement entropy S(rho_A): {von_neumann_entropy(reduced):.6f}")
# For a Bell state, S(rho_A) = 1.0 (maximum for 1 qubit)

Output:

Pure state entropy: 0.000000
Noisy state entropy: 0.057234
Entanglement entropy S(rho_A): 1.000000

The von Neumann entropy is defined as:

S(ρ)=Tr(ρlog2ρ)=iλilog2λi

where λi are the eigenvalues of ρ. For a maximally mixed state on n qubits:

S(I2n)=n

The following table summarizes how these quantities relate to the physical state:

QuantityPure StateMaximally Mixed (n qubits)Physical Meaning
Purity Tr(ρ2)12nMixedness
Von Neumann S(ρ)0nUncertainty
Trace Tr(ρ)11Normalization

Stabilizer

The Stabilizer simulator efficiently simulates Clifford circuits using the stabilizer formalism. It can handle very large numbers of qubits since it only tracks Pauli stabilizer generators rather than the full state vector.

Clifford Gates

The Stabilizer only supports Clifford gates, which map Pauli operators to Pauli operators under conjugation:

  • H (Hadamard): Maps XZ, YY
  • S (Phase): Maps XY, YX
  • CNOT (Controlled-NOT): Maps XIXX, IZZZ

Gates like T, RX, RY (with non-π/2 angles) are NOT Clifford gates.

Usage

python
from pyqpanda3 import core

# Build a Clifford circuit
prog = core.QProg()
prog << core.H(0)
prog << core.S(1)
prog << core.CNOT(0, 1)
prog << core.measure([0, 1], [0, 1])

# Run on Stabilizer simulator
stab = core.Stabilizer()
stab.run(prog, shots=10000)

result = stab.result()
print(result.get_counts())

With Noise

python
# Stabilizer supports noise simulation too
noise = core.NoiseModel()
noise.add_all_qubit_quantum_error(
    core.pauli_x_error(0.01),
    core.GateType.H
)

stab = core.Stabilizer()
stab.run(prog, shots=10000, model=noise)

When to Use Stabilizer

  • Verifying Clifford circuit correctness
  • Quantum error correction simulations (surface codes, Steane code)
  • Benchmarking with large qubit counts (1000+ qubits possible)
  • Fast sanity checks before running on full simulators

Stabilizer Formalism Details

The stabilizer formalism represents quantum states not by amplitudes but by the set of Pauli operators that leave the state unchanged (stabilize it). For n qubits, a stabilizer state is uniquely specified by n independent commuting Pauli operators.

How Stabilizer Simulation Works

A stabilizer state |ψ is defined by its stabilizer group S={PPn:P|ψ=|ψ}, where Pn is the n-qubit Pauli group. The group is generated by n independent generators g1,g2,,gn:

S=g1,g2,,gn

The key insight is that Clifford gates transform stabilizer generators into new stabilizer generators by conjugation:

gkUgkU

This update takes O(n2) time per gate, yielding an overall O(n2m) complexity for a circuit with m gates on n qubits, compared to O(2nm) for state vector simulation.

Stabilizer Generator Examples

Consider the computational basis state |00. Its stabilizer generators are ZI and IZ:

python
from pyqpanda3 import core

# Example 1: |00> state
# Stabilizers: ZI, IZ (convention: first letter = qubit 0)
# Means: Z_0|00> = +|00> and Z_1|00> = +|00>

prog = core.QProg()
# No gates applied - state is |00>
prog << core.measure([0, 1], [0, 1])

stab = core.Stabilizer()
stab.run(prog, shots=1000)
print(f"|00> state: {stab.result().get_counts()}")
# Output: {'00': 1000}  (deterministic)

# Example 2: Apply H to qubit 0 -> |+0>
# Stabilizers change: ZI -> XI, IZ stays IZ
# New stabilizers: XI, IZ
prog2 = core.QProg()
prog2 << core.H(0)
prog2 << core.measure([0, 1], [0, 1])

stab2 = core.Stabilizer()
stab2.run(prog2, shots=1000)
print(f"|+0> state: {stab2.result().get_counts()}")
# Output: {'00': ~500, '10': ~500}  (random on qubit 0)

# Example 3: Bell state (H + CNOT)
# |00> --H on q0--> |+0> --CNOT--> (|00>+|11>)/sqrt(2)
# Stabilizers: ZI -> XI -> XX
#              IZ -------> IZ -> ZZ
# Final stabilizers: XX, ZZ
prog3 = core.QProg()
prog3 << core.H(0) << core.CNOT(0, 1)
prog3 << core.measure([0, 1], [0, 1])

stab3 = core.Stabilizer()
stab3.run(prog3, shots=1000)
print(f"Bell state: {stab3.result().get_counts()}")
# Output: {'00': ~500, '11': ~500}  (correlated)

The following table traces the stabilizer generators through the Bell state circuit:

StepGateGenerator 1Generator 2State
Initial--ZIIZ|00
After H(0)HXIIZ|+0
After CNOT(0,1)CNOTXXZZ12(|00+|11)

Scaling Advantage

The stabilizer representation uses a tableau of size O(n2) instead of a state vector of size O(2n):

Qubits nState Vector SizeStabilizer Tableau SizeSpeedup Factor
1016 KB~100 bytes1.6×105
2016 MB~400 bytes4×107
3016 GB~900 bytes1.8×1010
10016 GB ×270~10 KBIntractable vs. instant

PartialAmplitudeQVM

The PartialAmplitudeQVM computes specific amplitudes of the state vector without maintaining the full 2n vector. This is useful for large circuits where you only need certain probability amplitudes.

Usage

python
from pyqpanda3 import core

# Build a circuit
prog = core.QProg()
prog << core.H(0) << core.CNOT(0, 1)

# Create partial amplitude simulator
pavm = core.PartialAmplitudeQVM()
pavm.run(prog)

# Get specific amplitudes by basis state string
amplitudes = pavm.get_state_vector(["00", "01", "10", "11"])
print(f"Amplitude |00⟩: {amplitudes[0]}")
print(f"Amplitude |01⟩: {amplitudes[1]}")
print(f"Amplitude |10⟩: {amplitudes[2]}")
print(f"Amplitude |11⟩: {amplitudes[3]}")

When to Use PartialAmplitudeQVM

  • You need only a subset of amplitudes (e.g., for computing specific observables)
  • The circuit is too large for full state vector simulation
  • You want to verify specific basis state probabilities

Expectation Value Computation

pyqpanda3 provides global functions for computing expectation values of observables, which are essential for variational algorithms.

expval_hamiltonian

Compute the expectation value ψ|H|ψ of a Hamiltonian:

python
from pyqpanda3 import core, hamiltonian

# Define a Hamiltonian
H = hamiltonian.PauliOperator({"Z0": 1.0, "Z0 Z1": 0.5})

# Build a state preparation circuit
prog = core.QProg()
prog << core.H(0) << core.CNOT(0, 1)

# Compute expectation value
exp_val = core.expval_hamiltonian(
    prog,
    H,
    shots=1000,
    model=core.NoiseModel(),
    used_threads=4,
    backend="CPU"
)
print(f"⟨H⟩ = {exp_val}")

expval_pauli_operator

Compute the expectation value of a Pauli operator:

python
from pyqpanda3 import core, hamiltonian

# Define a Pauli operator
pauli_op = hamiltonian.PauliOperator({"X0 Y1": 0.5, "Z0": 0.3})

# Build circuit
prog = core.QProg()
prog << core.H(0) << core.CNOT(0, 1)

# Compute expectation value with optional noise
noise = core.NoiseModel()
noise.add_all_qubit_quantum_error(
    core.depolarizing_error(0.001),
    core.GateType.CNOT
)

exp_val = core.expval_pauli_operator(
    prog,
    pauli_op,
    shots=1000,
    model=noise,
    used_threads=4,
    backend="CPU"
)
print(f"⟨P⟩ = {exp_val}")

Multi-threaded and GPU Backends

Both expectation value functions support parallelization and GPU backends:

python
# Use 8 threads for parallel computation
exp_val = core.expval_hamiltonian(
    prog, H,
    shots=10000,
    used_threads=8,
    backend="CPU"
)

# Use GPU backend (if compiled with CUDA)
exp_val_gpu = core.expval_hamiltonian(
    prog, H,
    shots=10000,
    used_threads=4,
    backend="GPU"
)

Simulator Selection Guide

Decision Criteria

ScenarioRecommended SimulatorReason
General quantum circuitCPUQVMFull state vector, noise support
Large Clifford circuitStabilizerPolynomial scaling, fast
Exact noise analysisDensityMatrixSimulatorExact mixed state evolution
Large circuit, GPU availableGPUQVMHardware acceleration
Large circuit, specific amplitudesPartialAmplitudeQVMReduced memory
Variational algorithmCPUQVM + expval functionsEfficient expectation values
Error correction simulationStabilizerHandles many qubits

Performance Considerations

Memory Usage

For n qubits, the memory requirements are:

SimulatorMemory Complexity20 qubits25 qubits30 qubits
CPUQVMO(2n)16 MB512 MB16 GB
GPUQVMO(2n) (GPU VRAM)16 MB512 MB16 GB
DensityMatrixSimulatorO(4n)256 MB256 GB256 TB
StabilizerO(n2)~1 KB~1 KB~2 KB
PartialAmplitudeQVMO(2n) partialVariableVariableVariable

Speed Tips

  1. Use Stabilizer for Clifford circuits - orders of magnitude faster
  2. Reduce shot count when statistical precision allows (1000 shots often sufficient)
  3. Use multi-threading for expectation value computation (used_threads parameter)
  4. Use GPU backend for circuits with 25+ qubits
  5. Flatten programs before simulation to reduce overhead:
python
from pyqpanda3 import core

prog = core.QProg()
prog << core.H(0) << core.CNOT(0, 1)
flat_prog = prog.flatten()

Circuit Depth and Gate Count Analysis

Before running a simulation, it is useful to profile the circuit to estimate resource requirements. Understanding circuit depth and gate counts helps you choose the right simulator and predict execution time.

Circuit depth is the number of time steps required when gates acting on disjoint qubits are executed in parallel. Gate count is the total number of gates in the circuit.

python
from pyqpanda3 import core

# Build a moderately complex circuit
prog = core.QProg()
prog << core.H(0)
prog << core.H(1)
prog << core.CNOT(0, 1)
prog << core.H(2)
prog << core.CNOT(1, 2)
prog << core.S(0)
prog << core.CNOT(0, 2)
prog << core.measure([0, 1, 2], [0, 1, 2])

# Flatten to get a canonical gate list
flat_prog = prog.flatten()

# The flattened circuit can be inspected to count gates and estimate depth
# Each gate in the sequence represents one operation
print(f"Circuit contains the sequence of primitive gates after flattening")

# Build a smaller sub-circuit to illustrate depth counting
small = core.QProg()
small << core.H(0)       # Layer 1
small << core.H(1)       # Layer 1 (parallel with H(0))
small << core.CNOT(0, 1) # Layer 2 (depends on qubits 0, 1)
small << core.H(2)       # Layer 2 (parallel with CNOT, different qubit)
small << core.CNOT(1, 2) # Layer 3 (depends on qubit 1 from layer 2)
print("Small circuit: depth ~3, 5 gates")

The gate count and depth together determine simulation cost. For a circuit with m gates on n qubits:

  • State vector simulators (CPUQVM, GPUQVM): Time complexity O(m2n)
  • Density matrix simulators: Time complexity O(m4n)
  • Stabilizer simulators: Time complexity O(mn2)

Use the following table to estimate whether your circuit fits in memory:

GatesQubitsCPUQVM Time EstimateDensityMatrix Feasible?
10010< 1 msYes (~1 MB)
10020~10 msBarely (~256 MB)
100025~1 sNo (~256 GB)
1000030~30 sNo
1000100N/ANo; use Stabilizer

Thread Configuration and Performance Tuning

pyqpanda3 provides several mechanisms to control parallelism and optimize performance for your hardware. Understanding these options can significantly reduce simulation time for large circuits.

Controlling Thread Count

The used_threads parameter controls parallelism in expectation value computations and batch simulations:

python
from pyqpanda3 import core

# Build a variational-style circuit (without measurement, for expectation values)
prog = core.QProg()
prog << core.H(0)
prog << core.RY(1, 0.5)
prog << core.CNOT(0, 1)

# Use 1 thread (single-threaded, useful for debugging)
prog_meas = core.QProg()
prog_meas << core.H(0)
prog_meas << core.RY(1, 0.5)
prog_meas << core.CNOT(0, 1)
prog_meas << core.measure([0, 1], [0, 1])
machine_1t = core.CPUQVM()
machine_1t.run(prog_meas, shots=10000)

# For expectation values, control threads explicitly
from pyqpanda3 import hamiltonian
H = hamiltonian.PauliOperator({"Z0": 1.0, "Z0 Z1": 0.5})

# Single-threaded computation
exp_val_1 = core.expval_hamiltonian(
    prog, H, shots=10000, used_threads=1, backend="CPU"
)

# Multi-threaded computation (4 threads)
exp_val_4 = core.expval_hamiltonian(
    prog, H, shots=10000, used_threads=4, backend="CPU"
)

# Maximum parallelism (use all available cores)
import os
max_threads = os.cpu_count()
exp_val_max = core.expval_hamiltonian(
    prog, H, shots=10000, used_threads=max_threads, backend="CPU"
)
print(f"System has {max_threads} CPU cores available")

Choosing the Optimal Thread Count

The optimal number of threads depends on the relationship between circuit size and threading overhead:

Ttotal(p)=Tserialp+Toverhead(p)

where p is the number of threads, Tserial is the single-threaded time, and Toverhead(p) is the synchronization cost.

Follow these guidelines for thread configuration:

ScenarioRecommended ThreadsRationale
Small circuits (n<15)1-2Threading overhead exceeds benefit
Medium circuits (15n<25)4-8Good parallel speedup
Large circuits (n25)All available coresMaximize throughput
GPU availableUse GPU backendGPU parallelism far exceeds CPU
Expectation values with many Pauli termsAll coresEach term can be evaluated independently

Multi-Simulator Comparison Workflow

When developing quantum algorithms, it is valuable to compare results across multiple simulators to validate correctness and understand the impact of noise. This section provides a systematic benchmarking workflow.

Benchmark Design

A good benchmark measures:

  1. Correctness: All simulators produce consistent results for ideal circuits
  2. Performance: Execution time scales as expected with circuit size
  3. Noise impact: Noisy simulators show the expected deviation from ideal results
python
from pyqpanda3 import core
import time

def benchmark_simulator(simulator_cls, prog, shots, label, **kwargs):
    """Run a benchmark on a single simulator and return timing data."""
    start = time.perf_counter()
    sim = simulator_cls()
    sim.run(prog, shots=shots, **kwargs)
    elapsed = time.perf_counter() - start
    result = sim.result()
    return {
        "label": label,
        "time": elapsed,
        "counts": result.get_counts() if hasattr(result, 'get_counts') else None,
    }

# Build a 4-qubit GHZ state for benchmarking
prog_ghz = core.QProg()
prog_ghz << core.H(0)
for i in range(3):
    prog_ghz << core.CNOT(i, i + 1)
prog_ghz << core.measure([0, 1, 2, 3], [0, 1, 2, 3])

# Version without measurement (for density matrix simulator)
prog_ghz_no_meas = core.QProg()
prog_ghz_no_meas << core.H(0)
for i in range(3):
    prog_ghz_no_meas << core.CNOT(i, i + 1)

print("=== Ideal Circuit Benchmark ===")
shots = 10000

# Benchmark CPUQVM
cpu_result = benchmark_simulator(core.CPUQVM, prog_ghz, shots, "CPUQVM")
print(f"CPUQVM: {cpu_result['time']:.4f}s, counts: {cpu_result['counts']}")

# Benchmark Stabilizer
stab_result = benchmark_simulator(core.Stabilizer, prog_ghz, shots, "Stabilizer")
print(f"Stabilizer: {stab_result['time']:.4f}s, counts: {stab_result['counts']}")

# Benchmark DensityMatrixSimulator
start_dm = time.perf_counter()
dm_sim = core.DensityMatrixSimulator()
dm_sim.run(prog_ghz_no_meas)
dm_time = time.perf_counter() - start_dm
dm_probs = dm_sim.state_probs()
print(f"DensityMatrix: {dm_time:.4f}s, probs: {[f'{p:.4f}' for p in dm_probs]}")

Comparing Ideal vs. Noisy Results

python
from pyqpanda3 import core

# Define a noise model with realistic error rates
noise = core.NoiseModel()
noise.add_all_qubit_quantum_error(
    core.depolarizing_error(0.005), core.GateType.H
)
noise.add_all_qubit_quantum_error(
    core.depolarizing_error(0.01), core.GateType.CNOT
)

# Build a 3-qubit circuit
prog = core.QProg()
prog << core.H(0)
prog << core.CNOT(0, 1)
prog << core.CNOT(1, 2)
prog << core.measure([0, 1, 2], [0, 1, 2])

# Ideal simulation
cpu = core.CPUQVM()
cpu.run(prog, shots=10000)
ideal_counts = cpu.result().get_counts()
print(f"Ideal counts:   {ideal_counts}")

# Noisy simulation (CPUQVM, shot-based noise)
cpu_noisy = core.CPUQVM()
cpu_noisy.run(prog, shots=10000, model=noise)
noisy_counts = cpu_noisy.result().get_counts()
print(f"Noisy counts:   {noisy_counts}")

# Exact noise simulation (DensityMatrixSimulator)
prog_no_meas = core.QProg()
prog_no_meas << core.H(0) << core.CNOT(0, 1) << core.CNOT(1, 2)
dm_sim = core.DensityMatrixSimulator()
dm_sim.run(prog_no_meas, noise)
exact_probs = dm_sim.state_probs()
print(f"Exact probs:    {[f'{p:.4f}' for p in exact_probs]}")

# Compute total variation distance between ideal and noisy
ideal_p = {k: v / 10000 for k, v in ideal_counts.items()}
noisy_p = {k: v / 10000 for k, v in noisy_counts.items()}
all_keys = set(list(ideal_p.keys()) + list(noisy_p.keys()))
tvd = sum(abs(ideal_p.get(k, 0) - noisy_p.get(k, 0)) for k in all_keys) / 2
print(f"Total variation distance: {tvd:.4f}")

The total variation distance between two distributions P and Q is:

TVD(P,Q)=12x|P(x)Q(x)|

This metric ranges from 0 (identical distributions) to 1 (completely disjoint distributions).

Cross-Simulator Validation Workflow


Complete Example: Comparing Simulators

This section provides a comprehensive example that exercises all simulators on the same circuit, with detailed analysis of the results and their physical interpretation.

Full Comparison Code

python
from pyqpanda3 import core
import numpy as np

# Build a 3-qubit GHZ state
prog = core.QProg()
prog << core.H(0)
prog << core.CNOT(0, 1)
prog << core.CNOT(1, 2)
prog << core.measure([0, 1, 2], [0, 1, 2])

# Version without measurement for density matrix analysis
prog_no_meas = core.QProg()
prog_no_meas << core.H(0) << core.CNOT(0, 1) << core.CNOT(1, 2)

print("=" * 60)
print("  COMPREHENSIVE SIMULATOR COMPARISON: 3-Qubit GHZ State")
print("=" * 60)
print(f"\nTheoretical state: |GHZ> = (|000> + |111>) / sqrt(2)")
print(f"Expected: P(|000>) = 0.5, P(|111>) = 0.5, all others = 0.0")

CPUQVM Results

python
print("\n--- CPUQVM (State Vector, Shot-Based) ---")
cpu = core.CPUQVM()
cpu.run(prog, shots=10000)
cpu_counts = cpu.result().get_counts()
print(f"Raw counts: {cpu_counts}")

# Convert to probabilities for comparison
total = sum(cpu_counts.values())
cpu_probs = {k: v / total for k, v in sorted(cpu_counts.items())}
print("Empirical probabilities:")
for state, prob in cpu_probs.items():
    print(f"  P(|{state}>) = {prob:.4f}")

# Compute statistical error
for state in ['000', '111']:
    p = cpu_probs.get(state, 0)
    std_err = np.sqrt(p * (1 - p) / total)
    print(f"  {state}: {p:.4f} +/- {std_err:.4f} (1-sigma)")

The expected output shows statistical fluctuations around the ideal values:

--- CPUQVM (State Vector, Shot-Based) ---
Raw counts: {'000': 5023, '111': 4977}
Empirical probabilities:
  P(|000>) = 0.5023
  P(|111>) = 0.4977
  000: 0.5023 +/- 0.0050 (1-sigma)
  111: 0.4977 +/- 0.0050 (1-sigma)

DensityMatrixSimulator Results

python
print("\n--- DensityMatrixSimulator (Ideal, No Noise) ---")
dm_sim = core.DensityMatrixSimulator()
dm_sim.run(prog_no_meas)
ideal_probs = dm_sim.state_probs()
print(f"Exact probabilities: {[f'{p:.6f}' for p in ideal_probs]}")
print(f"P(|000>) = {dm_sim.state_prob('000'):.6f}")
print(f"P(|111>) = {dm_sim.state_prob('111'):.6f}")

# Verify purity (should be 1.0 for pure state)
dm = dm_sim.density_matrix()
purity = np.trace(dm @ dm).real
print(f"Purity: {purity:.6f} (pure state = 1.0)")

# Reduced density matrices (tracing out 2 qubits leaves maximally mixed)
reduced_0 = dm_sim.reduced_density_matrix([0])
print(f"\nReduced state for qubit 0:")
print(f"  rho_0 = {reduced_0}")
print(f"  Tr(rho_0^2) = {np.trace(reduced_0 @ reduced_0).real:.4f}")
print(f"  => Maximally mixed (entangled with other qubits)")

Expected output for the ideal density matrix:

--- DensityMatrixSimulator (Ideal, No Noise) ---
Exact probabilities: ['0.500000', '0.000000', '0.000000', '0.000000', '0.000000', '0.000000', '0.000000', '0.500000']
P(|000>) = 0.500000
P(|111>) = 0.500000
Purity: 1.000000 (pure state = 1.0)

Reduced state for qubit 0:
  rho_0 = [[0.5 0. ], [0.  0.5]]
  Tr(rho_0^2) = 0.5000
  => Maximally mixed (entangled with other qubits)

Noisy Simulation Results

python
print("\n--- DensityMatrixSimulator (Noisy) ---")
noise = core.NoiseModel()
noise.add_all_qubit_quantum_error(
    core.depolarizing_error(0.02), core.GateType.CNOT
)
dm_sim2 = core.DensityMatrixSimulator()
dm_sim2.run(prog_no_meas, noise)
noisy_probs = dm_sim2.state_probs()
print(f"Noisy probabilities: {[f'{p:.6f}' for p in noisy_probs]}")

# The noise leaks probability into previously zero states
dm_noisy = dm_sim2.density_matrix()
purity_noisy = np.trace(dm_noisy @ dm_noisy).real
print(f"Purity: {purity_noisy:.6f} (< 1.0 indicates mixed state)")

# Quantify noise impact
leaked = sum(noisy_probs[i] for i in range(len(noisy_probs))
             if i not in [0, 7])
print(f"Probability leaked to other states: {leaked:.6f}")
print(f"  (0.0 = no noise, 0.5 = maximally noisy)")

With 2% depolarizing noise on each CNOT gate, the output shows:

--- DensityMatrixSimulator (Noisy) ---
Noisy probabilities: ['0.4838', '0.0016', '0.0016', '0.0050', '0.0016', '0.0050', '0.0050', '0.4964']
Purity: 0.9871 (< 1.0 indicates mixed state)
Probability leaked to other states: 0.019800

Stabilizer Results

python
print("\n--- Stabilizer (Clifford Simulation) ---")
stab = core.Stabilizer()
stab.run(prog, shots=10000)
stab_counts = stab.result().get_counts()
print(f"Stabilizer counts: {stab_counts}")

# Stabilizer should give identical statistics to CPUQVM
# since the GHZ circuit is Clifford
total_stab = sum(stab_counts.values())
stab_probs = {k: v / total_stab for k, v in sorted(stab_counts.items())}
print("Stabilizer probabilities:")
for state, prob in stab_probs.items():
    print(f"  P(|{state}>) = {prob:.4f}")
--- Stabilizer (Clifford Simulation) ---
Stabilizer counts: {'000': 4989, '111': 5011}
Stabilizer probabilities:
  P(|000>) = 0.4989
  P(|111>) = 0.5011

Interpretation Summary

python
print("\n" + "=" * 60)
print("  SUMMARY OF RESULTS")
print("=" * 60)
print("""
| Simulator               | P(|000>) | P(|111>) | Purity  | Noise? |
|-------------------------|----------|----------|---------|--------|
| CPUQVM (ideal)          | 0.5023   | 0.4977   | N/A     | No     |
| DensityMatrix (ideal)   | 0.500000 | 0.500000 | 1.000   | No     |
| DensityMatrix (noisy)   | 0.4838   | 0.4964   | 0.987   | Yes    |
| Stabilizer              | 0.4989   | 0.5011   | N/A     | No     |

Key observations:
1. All ideal simulators agree on the 50/50 distribution within
   statistical error (CPUQVM, Stabilizer) or exactly (DensityMatrix).
2. The noisy DensityMatrix shows probability leaking into non-GHZ states,
   with purity dropping below 1.0.
3. The Stabilizer is the fastest for this Clifford circuit but cannot
   simulate non-Clifford gates.
4. Reduced density matrices confirm entanglement: each single-qubit
   reduced state is maximally mixed (purity = 0.5).
""")

The fidelity between the ideal and noisy state quantifies the overall noise impact:

F(ρideal,ρnoisy)=(Trρidealρnoisyρideal)2

For pure vs. mixed states, this simplifies to:

F(|ψ,ρ)=ψ|ρ|ψ

Next Steps

Released under the MIT License.