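Quantum Volume Benchmark

Measure the overall performance of a quantum computer with the Quantum Volume (QV) metric in pyqpanda3.

Problem

Quantum computers are usually advertised by qubit count alone, but qubit count tells only part of the story. A 100-qubit machine with high error rates and limited connectivity may be able to run fewer useful circuits than a 20-qubit machine with high-fidelity gates and all-to-all connectivity. Engineers need a single number that captures the combined effect of gate fidelity, qubit connectivity, compiler efficiency, and achievable circuit depth.

Quantum Volume (QV) is a holistic benchmark from IBM, formalized in 2019, that fills this gap. Rather than measuring hardware properties one at a time, QV tests whether a quantum computer can run random circuits of a given size and return the correct results with high probability. The metric is defined as:

$$V_{Q} = 2^{n_{Q}}$$

where $n_{Q}$ is the largest number of qubits for which the device can reliably execute random circuits of depth $n_{Q}$ with a heavy output probability above $2 / 3$.

The key insight is that QV is more than a qubit count. A device with 50 qubits but a QV of only $2^{5} = 32$ can reliably handle only the circuits that a perfect 5-qubit machine could. The bottleneck can be any combination of:

Gate fidelity: two-qubit gate errors accumulate with circuit depth
Connectivity: SWAP overhead when the hardware topology does not match the interactions the circuit needs
Coherence time: decoherence limits how deep a circuit can run
Compilation quality: how efficiently the compiler maps abstract circuits onto native hardware gates
Measurement errors: readout errors corrupt the final result

This makes QV a practical predictor of real-world performance: as a rule of thumb, if your application needs circuits of width and depth $n$, you need a machine with a QV of at least $2^{n}$.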
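As a small illustration of the definition, the sketch below turns a table of per-width pass/fail results into a QV value and applies the rule of thumb to a target circuit. The helper names `quantum_volume` and `required_quantum_volume` are hypothetical and not part of pyqpanda3, and the pass/fail table is made-up example data.

```python
from typing import Dict


def quantum_volume(passed_by_width: Dict[int, bool]) -> int:
    """Return 2**n for the largest width n marked as passed (1 if none passed)."""
    passing = [n for n, ok in passed_by_width.items() if ok]
    return 2 ** max(passing) if passing else 1


def required_quantum_volume(width: int, depth: int) -> int:
    """Rule of thumb: a width x depth circuit fits inside a square QV circuit
    of side max(width, depth)."""
    return 2 ** max(width, depth)


# Made-up results: the device passes widths 2..5 and fails width 6.
results = {2: True, 3: True, 4: True, 5: True, 6: False}
print("Quantum Volume:", quantum_volume(results))
print("QV needed for an 8-qubit, depth-5 circuit:", required_quantum_volume(8, 5))
```

The first value matches the 50-qubit example: passing up to width 5 gives a QV of 32 no matter how many physical qubits the device has.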

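Solution

The Quantum Volume benchmark works by testing random SU(4) circuits of increasing width. For each width $n$, the protocol is:

Generate random circuits. Create an $n$-qubit circuit of depth $n$ in which every layer consists of random SU(4) unitaries applied to pairs of qubits. The pairing in each layer comes from a random permutation of the qubits.
Run on a classical simulator. Compute the ideal output probability distribution $p (x)$ over all $2^{n}$ bit strings. Identify the heavy outputs, the bit strings whose ideal probability exceeds the median:

$$\text{Heavy outputs} = \{x : p (x) > \operatorname{median} (\{p (x^{'})\}_{x^{'} \in \{0, 1\}^{n}})\}$$

Run on the target device. Execute the same circuit many times (shots) and count how often a heavy output appears. The heavy output fraction is:

$$h_{n} = \frac{\text{number of shots that produced a heavy output}}{\text{total number of shots}}$$

Check the threshold. The device passes width $n$ if $h_{n} > 2 / 3$ with sufficient statistical confidence: the estimate, averaged over many random circuits, must sit at least $2 σ$ above $2 / 3$. For reference, a device that outputs uniformly random bit strings scores $h_{n} = 1 / 2$, because the heavy set contains half of all bit strings.
Find the maximum. The Quantum Volume is the largest $2^{n}$ that the device passes.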
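The median rule and the heavy output fraction are easiest to see on a tiny hand-made example. The sketch below uses only the Python standard library; the 2-qubit distribution and the shot counts are invented for illustration and do not come from any device or simulator.

```python
from statistics import median


def heavy_outputs(ideal_probs):
    """Bit strings whose ideal probability is strictly above the median."""
    cutoff = median(ideal_probs.values())
    return {x for x, p in ideal_probs.items() if p > cutoff}


def heavy_output_fraction(ideal_probs, counts):
    """Fraction of shots that landed on a heavy output."""
    heavy = heavy_outputs(ideal_probs)
    total = sum(counts.values())
    return sum(c for x, c in counts.items() if x in heavy) / total


# Invented ideal distribution over 2 qubits; the median is (0.20 + 0.30) / 2 = 0.25.
ideal_probs = {"00": 0.40, "01": 0.10, "10": 0.30, "11": 0.20}
# Invented device counts over 1000 shots.
counts = {"00": 380, "01": 130, "10": 270, "11": 220}

heavy = heavy_outputs(ideal_probs)
hof = heavy_output_fraction(ideal_probs, counts)
print("Heavy outputs:", sorted(heavy))
print(f"Heavy output fraction: {hof:.3f}")
print("Above 2/3?", hof > 2 / 3)
```

Here the heavy set is {"00", "10"} and 650 of the 1000 shots fall in it, so this hypothetical device would score 0.65 and miss the $2 / 3$ threshold even though it clearly beats random guessing.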

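pyqpanda3 provides two key functions for this workflow:

core.QV(num_qubit, depth, seed) generates a QV circuit (QCircuit) built from layers of random SU(4) oracles, following the standard protocol. Each layer randomly permutes the qubits, groups them into $⌊ n / 2 ⌋$ pairs, and applies a random SU(4) unitary to each pair through a QOracle.
core.random_qcircuit(qubits, depth, gate_type) generates a general random circuit from a specified gate set, for custom benchmarks outside the standard QV protocol.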
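To picture the permute-and-pair step, here is a minimal standard-library sketch of how the qubit pairs of each layer could be chosen. It is not pyqpanda3's implementation: the function name `qv_layer_pairs`, the use of Python's `random.Random`, and the pairing order are all assumptions made for illustration, so the pairs will not match what core.QV produces for the same seed.

```python
import random


def qv_layer_pairs(num_qubits, depth, seed):
    """Return, for each layer, the list of qubit pairs that receive an SU(4)."""
    rng = random.Random(seed)
    layers = []
    for _ in range(depth):
        perm = list(range(num_qubits))
        rng.shuffle(perm)  # random permutation of the qubits for this layer
        # Pair neighbours in the permutation; with odd n the last entry is idle.
        pairs = [(perm[i], perm[i + 1]) for i in range(0, num_qubits - 1, 2)]
        layers.append(pairs)
    return layers


for index, pairs in enumerate(qv_layer_pairs(num_qubits=5, depth=5, seed=42)):
    print(f"layer {index}: {pairs}")
```

With 5 qubits every layer has $⌊ 5 / 2 ⌋ = 2$ pairs and one idle qubit, so a width-5, depth-5 circuit contains 10 SU(4) blocks in total.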

Code

Run a basic Quantum Volume test

Use core.QV() to generate a standard QV circuit and evaluate it on the ideal simulator.

python

"""Basic Quantum Volume test on an ideal simulator."""
from pyqpanda3 import core

# Parameters for the QV test
num_qubits = 4
depth = num_qubits  # QV uses depth == width
seed = 42

# Generate a QV circuit: random SU(4) unitaries on permuted qubit pairs
qv_circuit = core.QV(num_qubits, depth, seed)

# Wrap in a program and add measurements
prog = core.QProg()
prog.append(qv_circuit)
for q in range(num_qubits):
    prog.append(core.measure(q, q))

# Run on the ideal CPU simulator
qvm = core.CPUQVM()
shots = 10000
qvm.run(prog, shots)
result = qvm.result()

# Get measurement probabilities
prob_dict = result.get_prob_dict()
print("Ideal output distribution (top 10 outcomes):")
sorted_probs = sorted(prob_dict.items(), key=lambda x: -x[1])[:10]
for bitstring, prob in sorted_probs:
    print(f"  {bitstring}: {prob:.6f}")

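Compute the heavy output fraction by hand

The heavy output fraction is the core statistic of the QV test. Here is the calculation step by step.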

python

"""Compute the heavy output fraction for a QV circuit."""
from pyqpanda3 import core
import numpy as np


def compute_heavy_outputs(prob_dict: dict) -> set:
    """Identify heavy outputs: bit strings above the median probability.

    Args:
        prob_dict: Dictionary mapping bit strings to ideal probabilities.

    Returns:
        Set of heavy output bit strings.
    """
    probabilities = np.array(list(prob_dict.values()))
    median_prob = np.median(probabilities)
    heavy = {
        bitstring
        for bitstring, prob in prob_dict.items()
        if prob > median_prob
    }
    return heavy


def heavy_output_fraction(prob_dict: dict, counts_dict: dict) -> float:
    """Compute the heavy output fraction from ideal and sampled distributions.

    Args:
        prob_dict: Ideal probability distribution.
        counts_dict: Sampled measurement counts.

    Returns:
        Fraction of shots that produced heavy outputs.
    """
    heavy = compute_heavy_outputs(prob_dict)
    total_shots = sum(counts_dict.values())
    heavy_count = sum(
        count for bitstring, count in counts_dict.items()
        if bitstring in heavy
    )
    return heavy_count / total_shots


# Run QV test
num_qubits = 4
qv_circuit = core.QV(num_qubits, num_qubits, seed=123)

prog = core.QProg()
prog.append(qv_circuit)
for q in range(num_qubits):
    prog.append(core.measure(q, q))

# Get ideal distribution (high shot count for accurate probabilities)
qvm = core.CPUQVM()
qvm.run(prog, 100000)
ideal_probs = qvm.result().get_prob_dict()

# Get sampled distribution (fewer shots simulates a real device)
qvm.run(prog, 1000)
sampled_counts = qvm.result().get_counts()

# Compute heavy output fraction
hof = heavy_output_fraction(ideal_probs, sampled_counts)
print(f"QV-{num_qubits} heavy output fraction: {hof:.4f}")
print(f"Threshold for passing: 0.6667")
print(f"Result: {'PASS' if hof > 2 / 3 else 'FAIL'}")

Sweep QV widths to find the maximum

In practice, you test increasing widths until the device fails. On an ideal simulator, every width should pass.

python

"""Sweep QV widths to determine the maximum Quantum Volume on an ideal simulator."""
from pyqpanda3 import core
import numpy as np


def compute_heavy_outputs(prob_dict: dict) -> set:
    probabilities = np.array(list(prob_dict.values()))
    median_prob = np.median(probabilities)
    return {
        bs for bs, p in prob_dict.items() if p > median_prob
    }


def run_qv_test(num_qubits: int, shots: int = 10000, seed: int = 0) -> float:
    """Run a single QV test and return the heavy output fraction."""
    qv_circuit = core.QV(num_qubits, num_qubits, seed)
    prog = core.QProg()
    prog.append(qv_circuit)
    for q in range(num_qubits):
        prog.append(core.measure(q, q))

    qvm = core.CPUQVM()

    # Ideal probabilities for identifying heavy outputs
    qvm.run(prog, 100000)
    ideal_probs = qvm.result().get_prob_dict()
    heavy = compute_heavy_outputs(ideal_probs)

    # Sampled distribution
    qvm.run(prog, shots)
    counts = qvm.result().get_counts()

    total = sum(counts.values())
    heavy_count = sum(c for bs, c in counts.items() if bs in heavy)
    return heavy_count / total


# Sweep widths from 2 to 7
print(f"{'Width':>6} {'Depth':>6} {'HOF':>8} {'Pass?':>6} {'QV':>8}")
print("-" * 40)

max_passing_width = 0
for n in range(2, 8):
    # Use multiple seeds for statistical robustness
    hofs = []
    for seed in range(10):
        hof = run_qv_test(n, shots=5000, seed=seed * 100 + n)
        hofs.append(hof)
    avg_hof = np.mean(hofs)
    passed = avg_hof > 2 / 3
    if passed:
        max_passing_width = n
    print(f"{n:>6} {n:>6} {avg_hof:>8.4f} {'PASS' if passed else 'FAIL':>6} {2**n:>8}")

quantum_volume = 2 ** max_passing_width if max_passing_width > 0 else 1
print(f"\nQuantum Volume (ideal simulator): {quantum_volume}")

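Generate custom random circuits with random_qcircuit

For benchmarks outside the standard QV protocol, use core.random_qcircuit() to generate circuits from a specific gate set. Supported gate type strings include: "X", "Y", "Z", "H", "S", "T", "RX", "RY", "RZ", "U1", "U2", "U3", "U4", "P", "I", "ISWAP", "SQISWAP", "CPHASE", "RPHI", "CU", "SWAP", "X1", "Y1", "Z1", "RZZ", "RYY", "RXX", "RZX", "ECHO", "IDLE", "CNOT", "CZ", "MS". When the gate type list is empty, all types are used by default.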

python

"""Generate random circuits with a custom gate set."""
from pyqpanda3 import core

# Define qubits and a custom gate set
qubits = list(range(5))
depth = 20
gate_set = ["H", "X", "RX", "RY", "RZ", "CNOT", "SWAP"]

# Generate a random circuit
circuit = core.random_qcircuit(qubits, depth, gate_set)

# Build a program with measurements
prog = core.QProg()
prog.append(circuit)
for q in qubits:
    prog.append(core.measure(q, q))

# Run and get results
qvm = core.CPUQVM()
qvm.run(prog, 5000)
result = qvm.result()
prob_dict = result.get_prob_dict()
print(f"Number of possible outcomes: {len(prob_dict)}")

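Run QV with noise simulation

Real devices are noisy. Simulate the effect of noise on QV performance by attaching depolarizing errors to gates through a noise model.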

python

"""Quantum Volume test with a depolarizing noise model."""
from pyqpanda3 import core
import numpy as np


def compute_heavy_outputs(prob_dict: dict) -> set:
    probabilities = np.array(list(prob_dict.values()))
    median_prob = np.median(probabilities)
    return {bs for bs, p in prob_dict.items() if p > median_prob}


# --- Build QV circuit ---
num_qubits = 4
qv_circuit = core.QV(num_qubits, num_qubits, seed=42)

prog = core.QProg()
prog.append(qv_circuit)
for q in range(num_qubits):
    prog.append(core.measure(q, q))

# --- Ideal reference (no noise) ---
qvm_ideal = core.CPUQVM()
qvm_ideal.run(prog, 100000)
ideal_probs = qvm_ideal.result().get_prob_dict()
heavy = compute_heavy_outputs(ideal_probs)

# --- Noisy simulation ---
error_rate = 0.02  # 2% depolarizing error per gate
noise_model = core.NoiseModel()
dep_error = core.depolarizing_error(error_rate)
noise_model.add_all_qubit_quantum_error(dep_error, core.GateType.CNOT)

# Single-qubit gate noise
dep_error_1q = core.depolarizing_error(error_rate * 0.5)
noise_model.add_all_qubit_quantum_error(dep_error_1q, core.GateType.H)
noise_model.add_all_qubit_quantum_error(dep_error_1q, core.GateType.X)

qvm_noisy = core.CPUQVM()
qvm_noisy.run(prog, 10000, noise_model)
noisy_counts = qvm_noisy.result().get_counts()

total = sum(noisy_counts.values())
heavy_count = sum(c for bs, c in noisy_counts.items() if bs in heavy)
hof_noisy = heavy_count / total

print(f"Error rate:    {error_rate:.1%}")
print(f"HOF (noisy):   {hof_noisy:.4f}")
print(f"Threshold:     0.6667")
print(f"Result:        {'PASS' if hof_noisy > 2 / 3 else 'FAIL'}")

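Sweep noise levels to find the QV failure threshold

Determine the largest error rate the QV test tolerates before it fails. This matters for hardware engineering: it tells you the gate fidelity target the device must reach.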

python

"""Sweep noise levels to find the QV failure threshold."""
from pyqpanda3 import core
import numpy as np


def compute_heavy_outputs(prob_dict: dict) -> set:
    probabilities = np.array(list(prob_dict.values()))
    median_prob = np.median(probabilities)
    return {bs for bs, p in prob_dict.items() if p > median_prob}


def qv_hof_with_noise(num_qubits: int, error_rate: float, seed: int) -> float:
    """Run a QV test with depolarizing noise and return HOF."""
    qv_circuit = core.QV(num_qubits, num_qubits, seed)
    prog = core.QProg()
    prog.append(qv_circuit)
    for q in range(num_qubits):
        prog.append(core.measure(q, q))

    qvm = core.CPUQVM()

    # Ideal reference
    qvm.run(prog, 100000)
    heavy = compute_heavy_outputs(qvm.result().get_prob_dict())

    # Noisy execution
    noise = core.NoiseModel()
    if error_rate > 0:
        dep_2q = core.depolarizing_error(error_rate)
        noise.add_all_qubit_quantum_error(dep_2q, core.GateType.CNOT)
        dep_1q = core.depolarizing_error(error_rate * 0.5)
        noise.add_all_qubit_quantum_error(dep_1q, core.GateType.H)

    qvm.run(prog, 10000, noise)
    counts = qvm.result().get_counts()

    total = sum(counts.values())
    return sum(c for bs, c in counts.items() if bs in heavy) / total


# Sweep error rates for QV-4
num_qubits = 4
error_rates = np.arange(0, 0.12, 0.01)
num_trials = 5

print(f"QV-{num_qubits} noise threshold sweep")
print(f"{'Error Rate':>12} {'Avg HOF':>10} {'Std Dev':>10} {'Pass?':>6}")
print("-" * 44)

threshold = None
for rate in error_rates:
    hofs = [
        qv_hof_with_noise(num_qubits, rate, seed=seed * 7 + num_qubits)
        for seed in range(num_trials)
    ]
    avg = np.mean(hofs)
    std = np.std(hofs)
    passed = avg > 2 / 3
    if not passed and threshold is None:
        threshold = rate
    print(f"{rate:>11.2%} {avg:>10.4f} {std:>10.4f} {'PASS' if passed else 'FAIL':>6}")

if threshold is not None:
    print(f"\nNoise threshold: ~{threshold:.2%} depolarizing error")
else:
    print("\nDevice passes at all tested noise levels.")

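Analysis

Mathematical definition of Quantum Volume

The formal definition established by Cross et al. (IBM, 2019) is:

$$\log_{2} V_{Q} = \arg\max_{n} \min (n, d (n)) \quad \text{subject to} \quad {\tilde{h}}_{n} > \frac{2}{3}$$

where:

$n$ is the number of qubits (circuit width)
$d (n)$ is the circuit depth achievable at width $n$
${\tilde{h}}_{n}$ is the estimated heavy output probability, which must exceed $2 / 3$ with at least $2 σ$ confidence

In the standard protocol $d = n$ (depth equals width), so the formula simplifies to:

$$V_{Q} = 2^{n_{Q}}, \qquad n_{Q} = \max \{n : h_{n} > \frac{2}{3}\}$$

The choice of $2 / 3$ as the threshold is deliberate. Because the heavy set contains exactly half of all possible outcomes, a device that produces uniformly random output scores $h = 1 / 2$. The $2 / 3$ threshold gives enough separation from random behavior while leaving room for the statistical fluctuations of finite sampling.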
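It also helps to know where a perfect device lands relative to $1 / 2$ and $2 / 3$. For deep random circuits the ideal output probabilities are well approximated by an exponential (Porter-Thomas) distribution, for which the heavy output probability tends to $(1 + \ln 2) / 2 \approx 0.85$. The sketch below checks that value numerically with the standard library only; it samples synthetic probabilities rather than simulating a circuit, and the function name is hypothetical.

```python
import math
import random
from statistics import median


def ideal_heavy_output_probability(num_outcomes, seed=0):
    """Heavy output probability of a synthetic Porter-Thomas distribution."""
    rng = random.Random(seed)
    # Exponentially distributed weights, normalized into a probability vector.
    weights = [rng.expovariate(1.0) for _ in range(num_outcomes)]
    total = sum(weights)
    probs = [w / total for w in weights]
    cutoff = median(probs)
    # Total ideal probability carried by the outcomes above the median.
    return sum(p for p in probs if p > cutoff)


estimate = ideal_heavy_output_probability(2 ** 16)
asymptote = (1 + math.log(2)) / 2
print(f"Sampled estimate:  {estimate:.4f}")
print(f"(1 + ln 2) / 2:    {asymptote:.4f}")
```

So the usable range for a real device runs from about 0.85 (noiseless) down to 0.5 (fully depolarized), and the $2 / 3$ threshold sits roughly halfway between the two.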

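The Heavy Output Generation (HOG) problem

The QV benchmark is related to a computational complexity problem known as Heavy Output Generation (HOG). Informally:

Given a random quantum circuit, produce outputs whose ideal probability lies above the median ("heavy" outputs) more than $2 / 3$ of the time.

Sampling heavy outputs of random quantum circuits is believed to be hard for classical computers: the known approaches amount to simulating the circuit, which takes time exponential in the number of qubits. A quantum device that reliably produces heavy outputs is therefore performing a task that becomes classically expensive as circuits grow.

This link to computational complexity gives QV a stronger theoretical footing than a purely empirical benchmark. The hardness is asymptotic, however: at the widths where QV is measured, the protocol itself relies on simulating every circuit classically to find its heavy outputs. Reaching $V_{Q} = 2^{n}$ therefore shows that the device runs circuits of size $n$ faithfully, not that it has outpaced classical simulation at that size.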

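Why QV captures more than qubit count

A device's Quantum Volume can be limited by any one of several bottlenecks, which is exactly what makes it a useful holistic metric.

Gate fidelity bottleneck. Consider a 20-qubit device with $98 %$ two-qubit gate fidelity. A QV circuit of width $n$ contains $⌊ n / 2 ⌋ \cdot n$ SU(4) blocks. Counting each block as a single two-qubit gate (an optimistic estimate), width 5 has 10 blocks and a cumulative success rate of ${0.98}^{10} \approx 0.82$, and width 6 has 18 blocks for ${0.98}^{18} \approx 0.70$. At width 7, 21 blocks give ${0.98}^{21} \approx 0.65$, and at width 8, 32 blocks give ${0.98}^{32} \approx 0.52$. Taking a cumulative success rate of about $2 / 3$ as a rough pass mark, the device stalls at width 6: despite its 20 physical qubits, its QV would be $2^{6} = 64$.

Connectivity bottleneck. On a linear-chain device, applying an SU(4) to an arbitrary pair requires SWAP gates; each SWAP costs 3 extra CNOTs and adds to the accumulated error. An all-to-all connected device runs the same circuit with fewer gates.

Compiler bottleneck. Compilers differ in how efficiently they decompose SU(4) unitaries into native gates. Better compilation yields shorter circuits, which directly raises the heavy output fraction. QV therefore measures software quality as well as hardware capability.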
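The gate-fidelity estimate can be reproduced with a few lines of arithmetic. This sketch rests on two loudly stated assumptions: each SU(4) block is counted as one two-qubit gate (a real decomposition needs up to 3 CNOTs, so this is optimistic), and a cumulative success rate of $2 / 3$ is used as a rough pass mark rather than the actual heavy-output test. The function names are hypothetical.

```python
def su4_block_count(width):
    """Number of SU(4) blocks in a square QV circuit: floor(n/2) per layer, n layers."""
    return (width // 2) * width


def cumulative_success(width, two_qubit_fidelity):
    """Crude success estimate: fidelity raised to the number of SU(4) blocks."""
    return two_qubit_fidelity ** su4_block_count(width)


def estimated_max_width(num_qubits, two_qubit_fidelity, cutoff=2 / 3):
    """Largest width whose crude success estimate stays above the cutoff."""
    best = 0
    for n in range(2, num_qubits + 1):
        if cumulative_success(n, two_qubit_fidelity) > cutoff:
            best = n
        else:
            break
    return best


for n in range(2, 9):
    print(f"width {n}: {su4_block_count(n):>2} blocks, "
          f"success ~ {cumulative_success(n, 0.98):.3f}")

width = estimated_max_width(num_qubits=20, two_qubit_fidelity=0.98)
print(f"Estimated QV: 2^{width} = {2 ** width}")
```

Raising the fidelity argument shows how steeply the estimate responds: the block count grows roughly as $n^{2} / 2$, so each extra unit of width demands a noticeably better gate.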

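Relation to circuit layer fidelity

A useful way to understand QV is through the layer fidelity $e^{- λ}$, where $λ$ is the total error per layer. In a simplified model, the expected heavy output fraction of a QV circuit of width and depth $n$ is approximately:

$$E [h_{n}] \approx \frac{1}{2} (1 + e^{- n λ})$$

Setting $E [h_{n}] = 2 / 3$ and solving:

$$\frac{2}{3} = \frac{1}{2} (1 + e^{- n λ}) ⟹ e^{- n λ} = \frac{1}{3} ⟹ n λ = \ln 3$$

So the device passes QV- $n$ when its effective per-layer error satisfies $λ < \ln 3 / n$. The tolerable error per layer is inversely proportional to circuit width, which explains why QV is such a demanding benchmark.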
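The bound $λ < \ln 3 / n$ is easy to tabulate. The sketch below evaluates the simplified model from this section for a few widths; the function names are hypothetical, and the numbers inherit every limitation of the model itself.

```python
import math


def expected_hof(width, layer_error):
    """Simplified model: E[h_n] ~ (1 + exp(-n * lambda)) / 2."""
    return 0.5 * (1.0 + math.exp(-width * layer_error))


def max_layer_error(width):
    """Largest per-layer error that still gives E[h_n] = 2/3: lambda = ln(3) / n."""
    return math.log(3) / width


print(f"{'n':>3} {'max lambda':>12} {'layer fidelity':>16}")
for n in range(2, 9):
    lam = max_layer_error(n)
    print(f"{n:>3} {lam:>12.4f} {math.exp(-lam):>16.4f}")
```

Reading the table from top to bottom shows the required layer fidelity climbing toward 1 as the width grows, while in practice the per-layer error itself tends to increase with width because each layer contains more gates.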

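QV circuit structure

The core.QV(num_qubit, depth, seed) function generates circuits with the specific structure defined by the IBM QV protocol:

In each layer:

Randomly permute the $n$ qubits with a shuffle controlled by seed
Group the qubits into $⌊ n / 2 ⌋$ pairs: $(\mathrm{perm} [0], \mathrm{perm} [1])$, $(\mathrm{perm} [2], \mathrm{perm} [3])$, and so on
Apply a random SU(4) unitary (a $4 \times 4$ special unitary matrix) to each pair through a QOracle

If $n$ is odd, the last qubit in each layer's permutation stays idle. The random SU(4) matrices are generated as follows:

Draw a random $4 \times 4$ complex matrix $A$
Compute its singular value decomposition: $A = U Σ V^{†}$
Take $U$ as the random unitary
Normalize the determinant to 1 (making it SU(4) rather than just U(4))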
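The last two steps (obtain a unitary, then rescale its determinant to 1) can be sketched with the standard library alone. To stay dependency-free, this sketch swaps the SVD for Gram-Schmidt orthonormalization of the columns of a complex Gaussian matrix, which also yields a random unitary. It is an illustration under that substitution, not pyqpanda3's code; `random_su4` and `determinant` are hypothetical helpers.

```python
import cmath
import random


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) == 0:
            return 0j
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def random_su4(seed):
    """Random 4x4 special unitary via Gram-Schmidt on complex Gaussian columns."""
    rng = random.Random(seed)
    dim = 4
    columns = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
               for _ in range(dim)]
    basis = []
    for v in columns:
        for u in basis:  # remove the components along earlier basis vectors
            overlap = sum(a.conjugate() * b for a, b in zip(u, v))
            v = [b - overlap * a for a, b in zip(u, v)]
        norm = sum(abs(b) ** 2 for b in v) ** 0.5
        basis.append([b / norm for b in v])
    unitary = [[basis[c][r] for c in range(dim)] for r in range(dim)]
    # A global phase exp(-i*theta/4) turns det = exp(i*theta) into det = 1.
    phase = cmath.exp(-1j * cmath.phase(determinant(unitary)) / dim)
    return [[phase * x for x in row] for row in unitary]


u = random_su4(seed=7)
print("det =", determinant(u))
```

The determinant fix only changes a global phase, so it has no effect on measurement statistics; it matters when the matrix must be a member of SU(4) in the strict sense.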

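Practical interpretation of QV values

QV	Meaning
$2^{1} = 2$	Can reliably run 1-qubit circuits of depth 1
$2^{3} = 8$	Equivalent to a perfect 3-qubit device
$2^{5} = 32$	Can handle the circuits of a perfect 5-qubit machine
$2^{10} = 1024$	Equivalent to a perfect 10-qubit machine; the scale reported for leading devices in the early 2020s
$2^{20} = 1048576$	Requires high-fidelity gates sustained across 20 qubits and 20 layers

When choosing hardware for your application:

Optimization algorithms (QAOA): as a rough guide, you need a QV of at least $2^{n}$, where $n$ is the problem size. A 10-variable QAOA needs QV $\geq 2^{10}$.
Variational algorithms (VQE): more forgiving because they use shallow circuits, but a higher QV still improves solution quality.
Quantum error correction: needs a QV far above the number of logical qubits, because QEC circuits involve many ancilla qubits and deep circuits.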

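Multi-circuit trials and statistical confidence

In a formal QV certification, you run many circuits per width (typically 100 to 200) and compute a one-sided confidence bound on the mean heavy output fraction. The device passes when the lower end of the $2 σ$ interval exceeds $2 / 3$:

$$\bar{h} - 2 \cdot \frac{σ_{h}}{\sqrt{K}} > \frac{2}{3}$$

where $K$ is the number of circuits and $σ_{h}$ is the standard deviation of the per-circuit heavy output fractions. This ensures that a pass is not the result of luck on a single circuit.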

python

"""Multi-trial QV test with confidence intervals."""
from pyqpanda3 import core
import numpy as np


def compute_heavy_outputs(prob_dict: dict) -> set:
    probabilities = np.array(list(prob_dict.values()))
    median_prob = np.median(probabilities)
    return {bs for bs, p in prob_dict.items() if p > median_prob}


def single_qv_hof(num_qubits: int, seed: int, shots: int = 10000) -> float:
    qv_circuit = core.QV(num_qubits, num_qubits, seed)
    prog = core.QProg()
    prog.append(qv_circuit)
    for q in range(num_qubits):
        prog.append(core.measure(q, q))

    qvm = core.CPUQVM()
    qvm.run(prog, 100000)
    heavy = compute_heavy_outputs(qvm.result().get_prob_dict())

    qvm.run(prog, shots)
    counts = qvm.result().get_counts()
    total = sum(counts.values())
    return sum(c for bs, c in counts.items() if bs in heavy) / total


# Official-style QV test with confidence intervals
num_qubits = 4
num_circuits = 50
seeds = [i * 31 + num_qubits for i in range(num_circuits)]

hofs = [single_qv_hof(num_qubits, s) for s in seeds]
mean_hof = np.mean(hofs)
std_hof = np.std(hofs, ddof=1)
ci_lower = mean_hof - 2 * std_hof / np.sqrt(num_circuits)

print(f"QV-{num_qubits} over {num_circuits} circuits:")
print(f"  Mean HOF:    {mean_hof:.4f}")
print(f"  Std dev:     {std_hof:.4f}")
print(f"  2-sigma CI:  [{ci_lower:.4f}, {mean_hof + 2 * std_hof / np.sqrt(num_circuits):.4f}]")
print(f"  CI lower > 2/3? {'PASS' if ci_lower > 2 / 3 else 'FAIL'}")
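When planning a run it is useful to invert the confidence bound and ask how many circuits are needed. Rearranging $\bar{h} - z σ_{h} / \sqrt{K} > t$ for a threshold $t$ gives $K > (z σ_{h} / (\bar{h} - t))^{2}$. The sketch below is a hypothetical planning helper: the threshold is passed in explicitly, and the mean and standard deviation in the example are invented pilot-run numbers.

```python
import math


def circuits_needed(mean_hof, std_hof, threshold, z=2.0):
    """Smallest K with mean - z * std / sqrt(K) > threshold, or None if impossible."""
    margin = mean_hof - threshold
    if margin <= 0:
        return None  # the mean itself is not above the threshold
    return math.floor((z * std_hof / margin) ** 2) + 1


# Invented pilot estimates: mean HOF 0.70 with a per-circuit spread of 0.08.
k = circuits_needed(mean_hof=0.70, std_hof=0.08, threshold=2 / 3)
print("Circuits needed:", k)
print("No amount of data helps here:", circuits_needed(0.60, 0.08, threshold=2 / 3))
```

The quadratic dependence on the margin is the practical lesson: a device that clears the threshold by a hair needs far more circuits to certify than one that clears it comfortably.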

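Comparing compilers

Quantum Volume is not only a hardware metric. It also measures compiler quality, because the same logical circuit can be decomposed into native gates in many different ways. A better compiler emits fewer physical gates, which reduces accumulated error and raises the heavy output fraction.

You can use QV to quantify compiler efficiency by generating the same logical QV circuits and comparing the results under different transpilation strategies. The key point is that the logical circuit (the sequence of SU(4) unitaries) is fixed; only its decomposition into native gates changes.

Consider three compilation strategies. The script that follows does not run three real compilers; it models each strategy as a lower effective gate error rate.

Naive decomposition: every SU(4) is decomposed with a fixed gate sequence (for example 3 CNOTs plus single-qubit rotations). Simple, but possibly not optimal for the target topology.
Topology-aware mapping: the compiler inserts SWAP gates to match the circuit's connectivity to the hardware coupling graph. Fewer SWAPs mean fewer physical gates.
Optimized synthesis: an advanced compiler uses approximate synthesis, gate cancellation, and swap-aware routing to minimize the total gate count.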

python

"""Compare compiler efficiency using QV circuits under different transpilation strategies."""
from pyqpanda3 import core
import numpy as np


def compute_heavy_outputs(prob_dict: dict) -> set:
    probabilities = np.array(list(prob_dict.values()))
    median_prob = np.median(probabilities)
    return {bs for bs, p in prob_dict.items() if p > median_prob}


def count_gates(circuit) -> dict:
    """Walk the circuit and count gate types.

    Returns a dictionary mapping gate type names to occurrence counts.
    """
    gate_counts = {}
    # Use the circuit's built-in gate counting if available,
    # otherwise parse the circuit description
    info = circuit.count_ops()
    for gate_name, count in info.items():
        gate_counts[gate_name] = gate_counts.get(gate_name, 0) + count
    return gate_counts


def run_qv_with_noise(num_qubits: int, seed: int,
                      two_qubit_error: float, one_qubit_error: float,
                      shots: int = 10000) -> float:
    """Run a QV test with specified noise levels and return HOF."""
    qv_circuit = core.QV(num_qubits, num_qubits, seed)
    prog = core.QProg()
    prog.append(qv_circuit)
    for q in range(num_qubits):
        prog.append(core.measure(q, q))

    qvm = core.CPUQVM()

    # Ideal reference for heavy output identification
    qvm.run(prog, 100000)
    heavy = compute_heavy_outputs(qvm.result().get_prob_dict())

    # Build noise model
    noise = core.NoiseModel()
    if two_qubit_error > 0:
        dep_2q = core.depolarizing_error(two_qubit_error)
        noise.add_all_qubit_quantum_error(dep_2q, core.GateType.CNOT)
    if one_qubit_error > 0:
        dep_1q = core.depolarizing_error(one_qubit_error)
        noise.add_all_qubit_quantum_error(dep_1q, core.GateType.H)

    qvm.run(prog, shots, noise)
    counts = qvm.result().get_counts()
    total = sum(counts.values())
    return sum(c for bs, c in counts.items() if bs in heavy) / total


# Compare three simulated "compiler" strategies by varying effective
# two-qubit gate counts through different noise scaling.
# Strategy A: baseline (1x two-qubit gate cost)
# Strategy B: improved routing (0.7x two-qubit gate cost via fewer SWAPs)
# Strategy C: optimized synthesis (0.5x two-qubit gate cost)
num_qubits = 5
base_2q_error = 0.03
base_1q_error = 0.005
seeds = [10, 20, 30, 40, 50]

strategies = {
    "Naive decomposition": (base_2q_error, base_1q_error),
    "Topology-aware routing": (base_2q_error * 0.7, base_1q_error * 0.8),
    "Optimized synthesis": (base_2q_error * 0.5, base_1q_error * 0.6),
}

print(f"Compiler comparison for QV-{num_qubits} (base 2Q error: {base_2q_error:.1%})")
print(f"{'Strategy':<25} {'Eff. 2Q Error':>14} {'Avg HOF':>10} {'Pass?':>6}")
print("-" * 60)

for name, (err_2q, err_1q) in strategies.items():
    hofs = [
        run_qv_with_noise(num_qubits, s, err_2q, err_1q)
        for s in seeds
    ]
    avg_hof = np.mean(hofs)
    print(f"{name:<25} {err_2q:>13.2%} {avg_hof:>10.4f} {'PASS' if avg_hof > 2/3 else 'FAIL':>6}")

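The output shows how compiler improvements translate directly into QV performance. Halving the effective two-qubit gate error (as optimized synthesis does in this model) can move a device from failing to passing a given QV level. This is why quantum software teams invest heavily in compiler optimization.

Key observations for compiler benchmarking with QV:

Gate count matters more than circuit depth. A wider but shallower circuit can outperform a narrow, deep one if its total gate count is lower. QV captures this because accumulated gate error determines the heavy output fraction.
SWAP overhead is the dominant cost. On hardware with limited connectivity, routing can require 3 to 10 times as many two-qubit gates as the logical circuit. Measuring QV with and without SWAP-aware compilation quantifies this overhead directly.
Single-qubit gate cancellation gives diminishing returns. Most modern compilers already remove adjacent inverse gates. The remaining gains come from cross-layer optimization, which QV can quantify by comparing the HOF before and after optimization.
Use multiple seeds. A single QV circuit may favor one compiler by chance. Average over at least 10 to 20 random circuits for a reliable comparison.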
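The SWAP-overhead observation can be made concrete with a deliberately naive cost model for one QV layer on a linear chain. The assumptions are stated in the code: 3 CNOTs per SU(4), 3 CNOTs per SWAP, and one qubit of each pair walked next to its partner without accounting for how those SWAPs displace other qubits. A real router would do better or worse depending on the layout, and `naive_line_cnot_cost` is a hypothetical helper.

```python
def naive_line_cnot_cost(pairs, cnots_per_su4=3, cnots_per_swap=3):
    """Return (logical CNOTs, routed CNOTs) for one layer on a linear chain.

    pairs: list of (a, b) chain positions that must interact.
    Assumption: bringing a pair together costs |a - b| - 1 SWAPs, and the
    effect of those SWAPs on other pairs is ignored.
    """
    logical = len(pairs) * cnots_per_su4
    swaps = sum(abs(a - b) - 1 for a, b in pairs)
    return logical, logical + swaps * cnots_per_swap


# Nearest-neighbour pairing: no routing needed.
print(naive_line_cnot_cost([(0, 1), (2, 3), (4, 5)]))

# Every pair is three sites apart on a 6-qubit chain.
logical, routed = naive_line_cnot_cost([(0, 3), (1, 4), (2, 5)])
print(f"logical={logical}, routed={routed}, overhead={routed / logical:.1f}x")
```

Even in this small case the routed layer costs three times the logical one, which is the low end of the 3 to 10 times range quoted above.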

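Application-specific benchmarking

The standard QV protocol uses random SU(4) unitaries, which exercise the full gate set uniformly. Real quantum applications, however, tend to use specific gate patterns: variational algorithms rely on parameterized rotations, error-correction circuits are Clifford-heavy, and quantum chemistry uses many controlled rotations. You can use core.random_qcircuit() to benchmark gate sets that match your actual workload.

This approach answers a different question from standard QV. Instead of asking "what is the largest general circuit this device can handle?", it asks "how well does this device run the kind of circuit my application actually uses?"

Common application-specific gate sets include:

Clifford circuits: ["H", "S", "CNOT"]. Suited to error correction, state preparation, and classical simulation baselines. Clifford circuits can be simulated efficiently on classical computers (the Gottesman-Knill theorem), so they test the hardware without involving any claim of quantum advantage.
NISQ variational circuits: ["H", "RX", "RY", "RZ", "CNOT"]. The main gate set of VQE and QAOA, mixing parameterized rotations with entangling gates.
T-gate-heavy circuits: ["H", "T", "CNOT"]. Suited to fault-tolerant compilation, where the T-gate count drives the resource requirements.
Hardware-native gates: use a gate set that matches the device's native operations (for example IBM's ["RZ", "SX", "CNOT"] or Google's ["RX", "RZ", "ISWAP"]) to measure raw hardware capability without compilation overhead.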

python

"""Benchmark different gate sets using custom random circuits."""
from pyqpanda3 import core
import numpy as np


def compute_heavy_outputs(prob_dict: dict) -> set:
    probabilities = np.array(list(prob_dict.values()))
    median_prob = np.median(probabilities)
    return {bs for bs, p in prob_dict.items() if p > median_prob}


def benchmark_gate_set(qubits: list, depth: int, gate_set: list,
                       model=None, num_trials: int = 10,
                       shots: int = 10000) -> dict:
    """Run random circuit benchmarking for a specific gate set.

    Args:
        qubits: List of qubit indices to use.
        depth: Circuit depth.
        gate_set: List of gate type strings.
        model: Optional noise model for realistic simulation.
        num_trials: Number of random circuits to average over.
        shots: Measurement shots per circuit.

    Returns:
        Dictionary with benchmark results.
    """
    hofs = []
    circuit_depths = []

    for seed in range(num_trials):
        circuit = core.random_qcircuit(qubits, depth, gate_set)

        prog = core.QProg()
        prog.append(circuit)
        for q in qubits:
            prog.append(core.measure(q, q))

        qvm = core.CPUQVM()

        # Ideal reference
        qvm.run(prog, 100000)
        ideal_probs = qvm.result().get_prob_dict()
        heavy = compute_heavy_outputs(ideal_probs)

        # Run with or without noise
        if model is not None:
            qvm.run(prog, shots, model)
        else:
            qvm.run(prog, shots)
        counts = qvm.result().get_counts()

        total = sum(counts.values())
        hof = sum(c for bs, c in counts.items() if bs in heavy) / total
        hofs.append(hof)

    return {
        "gate_set": gate_set,
        "num_qubits": len(qubits),
        "depth": depth,
        "mean_hof": np.mean(hofs),
        "std_hof": np.std(hofs),
        "min_hof": np.min(hofs),
        "max_hof": np.max(hofs),
    }


# Define gate sets representing different application domains
gate_sets = {
    "Clifford": ["H", "S", "CNOT"],
    "Variational (NISQ)": ["H", "RX", "RY", "RZ", "CNOT"],
    "T-gate heavy": ["H", "T", "CNOT"],
    "Full rotation set": ["H", "RX", "RY", "RZ", "CNOT", "SWAP"],
    "Hardware-native (IBM-like)": ["RZ", "H", "CNOT"],
}

# Benchmark parameters
num_qubits = 4
depth = 20
num_trials = 15

# Create a noise model to see differentiation between gate sets
noise = core.NoiseModel()
dep_2q = core.depolarizing_error(0.02)
noise.add_all_qubit_quantum_error(dep_2q, core.GateType.CNOT)
dep_1q = core.depolarizing_error(0.005)
noise.add_all_qubit_quantum_error(dep_1q, core.GateType.H)
noise.add_all_qubit_quantum_error(dep_1q, core.GateType.RX)

qubits = list(range(num_qubits))

print(f"Application-specific benchmarking ({num_qubits} qubits, depth {depth})")
print(f"With depolarizing noise: 2Q=2.0%, 1Q=0.5%")
print(f"{'Gate Set':<28} {'Mean HOF':>10} {'Std':>8} {'Min':>8} {'Max':>8}")
print("-" * 68)

for name, gate_set in gate_sets.items():
    result = benchmark_gate_set(
        qubits, depth, gate_set,
        model=noise, num_trials=num_trials
    )
    print(f"{name:<28} {result['mean_hof']:>10.4f} {result['std_hof']:>8.4f} "
          f"{result['min_hof']:>8.4f} {result['max_hof']:>8.4f}")

# Also run without noise to confirm all gate sets produce valid heavy outputs
print(f"\n--- Ideal (no noise) ---")
print(f"{'Gate Set':<28} {'Mean HOF':>10} {'Std':>8}")
print("-" * 48)

for name, gate_set in gate_sets.items():
    result = benchmark_gate_set(
        qubits, depth, gate_set,
        model=None, num_trials=num_trials
    )
    print(f"{name:<28} {result['mean_hof']:>10.4f} {result['std_hof']:>8.4f}")

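Interpreting the application-specific benchmark results:

Clifford circuits tend to produce sharper (lower-entropy) output distributions, because they map stabilizer states to stabilizer states. The heavy set is then more concentrated, and the HOF at a given depth and noise level is usually higher than for generic random circuits. One caveat: a stabilizer state's measurement distribution is uniform over its support, so when the support covers every bit string no outcome lies above the median, the heavy set is empty, and the HOF is 0 regardless of device quality. If your application is Clifford-based, your device may perform better than the standard QV number suggests.
T-gate-heavy circuits are sensitive to single-qubit gate fidelity, because the T gate needs a precise rotation angle. A device with good two-qubit gates but poor single-qubit gates will show a lower HOF on the T-heavy benchmark than its standard QV result suggests.
Variational gate sets (with parameterized rotations) are the most representative of NISQ algorithm performance. They exercise single-qubit and two-qubit gates together, which makes the HOF a good predictor of convergence quality for variational algorithms.
Hardware-native gate sets remove compilation overhead entirely. Comparing the HOF of hardware-native circuits with that of compiled circuits reveals the cost of compilation. If the gap is small, your compiler is efficient; if it is large, there is room for improvement.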
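Before trusting a HOF number from a Clifford gate set, it is worth checking how the median rule behaves on the flat distributions that stabilizer states produce. The sketch below applies the rule to three hand-written ideal distributions (uniform over all outcomes, a GHZ-like state, and a generic non-flat case); the distributions are illustrative and were not generated by pyqpanda3.

```python
from statistics import median


def heavy_outputs(ideal_probs):
    """Bit strings whose ideal probability is strictly above the median."""
    cutoff = median(ideal_probs.values())
    return {x for x, p in ideal_probs.items() if p > cutoff}


def ideal_hof(ideal_probs):
    """Heavy output probability a noiseless device would achieve."""
    return sum(ideal_probs[x] for x in heavy_outputs(ideal_probs))


outcomes = [format(i, "03b") for i in range(8)]

# H on every qubit: uniform over all 8 bit strings, nothing is above the median.
uniform = {x: 1 / 8 for x in outcomes}
# GHZ-like stabilizer state: all weight on 000 and 111.
ghz = {x: (0.5 if x in ("000", "111") else 0.0) for x in outcomes}
# A generic non-flat distribution for comparison.
generic = dict(zip(outcomes, [0.30, 0.02, 0.18, 0.05, 0.22, 0.03, 0.12, 0.08]))

for name, dist in [("uniform", uniform), ("GHZ-like", ghz), ("generic", generic)]:
    print(f"{name:<9} heavy={sorted(heavy_outputs(dist))} ideal HOF={ideal_hof(dist):.2f}")
```

The uniform case scores 0 and the GHZ-like case scores 1 even on a perfect device, so averages over Clifford circuits mix these two extremes and should not be compared directly with the $2 / 3$ threshold of standard QV.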

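Summary

The Quantum Volume benchmark provides a single, hardware-independent metric that reflects what a quantum computer can actually do. By generating standard test circuits with core.QV() and running custom benchmarks with core.random_qcircuit(), you can evaluate quantum hardware systematically and identify the bottlenecks that limit its performance.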
