CLI - FlashInfer-Bench

Entry Points

FlashInfer-Bench provides two equivalent command-line entry points:

flashinfer-bench --help
python -m flashinfer_bench --help

Use --help on any subcommand to inspect all available flags:

flashinfer-bench run --help
flashinfer-bench report --help
flashinfer-bench report summary --help

Run Benchmarks

Run benchmarks against a local FlashInfer-Trace dataset:

flashinfer-bench run --local /path/to/flashinfer-trace

This is equivalent to:

python -m flashinfer_bench run --local /path/to/flashinfer-trace

Tune warmup runs, iterations, trial count, numerical tolerances (--rtol, --atol), and the timeout:

flashinfer-bench run --local /path/to/flashinfer-trace \
  --warmup-runs 10 \
  --iterations 100 \
  --num-trials 5 \
  --rtol 1e-3 \
  --atol 1e-3 \
  --timeout 300
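The flag names --rtol and --atol suggest relative and absolute tolerances, but this page does not state the exact comparison FlashInfer-Bench performs. As an illustration only, the sketch below assumes the common NumPy/PyTorch convention, where a value passes when |actual - expected| <= atol + rtol * |expected|. The helper name within_tolerance is hypothetical and not part of FlashInfer-Bench.

```python
import math


def within_tolerance(actual, expected, rtol=1e-3, atol=1e-3):
    """Return True if |actual - expected| <= atol + rtol * |expected|.

    Assumption: this mirrors the numpy.isclose convention. The check that
    FlashInfer-Bench actually applies may differ.
    """
    return math.fabs(actual - expected) <= atol + rtol * math.fabs(expected)


# With rtol=1e-3 and atol=1e-3, a reference value of 100.0 tolerates an
# absolute error of up to 1e-3 + 1e-3 * 100.0 = 0.101.
print(within_tolerance(100.05, 100.0))  # True
print(within_tolerance(100.2, 100.0))   # False
```

Under this convention, atol dominates for reference values near zero, while rtol dominates for large magnitudes.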

Run only selected definitions or solutions:

flashinfer-bench run --local /path/to/flashinfer-trace \
  --definitions gemm_n5120_k2048 rmsnorm_h128 \
  --solutions solution_name_1 solution_name_2

Resume an interrupted run:

flashinfer-bench run --local /path/to/flashinfer-trace --resume
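For long unattended runs, --resume can be wrapped in a retry loop. The sketch below is a minimal illustration, not part of FlashInfer-Bench: it assumes the CLI exits with a non-zero status when a run is interrupted or fails, which this page does not confirm. The helper name run_with_resume and its parameters are hypothetical.

```python
import subprocess


def run_with_resume(trace_dir, max_attempts=3, runner=subprocess.run):
    """Run the benchmark, retrying with --resume after a failed attempt.

    Assumption: a non-zero exit code means the run did not finish.
    Returns the attempt number that succeeded.
    """
    base = ["flashinfer-bench", "run", "--local", trace_dir]
    for attempt in range(1, max_attempts + 1):
        # The first attempt is a fresh run; later attempts add --resume.
        cmd = base if attempt == 1 else base + ["--resume"]
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
    raise RuntimeError(f"benchmark did not finish after {max_attempts} attempts")


# Example call (requires flashinfer-bench on PATH):
# run_with_resume("/path/to/flashinfer-trace")
```

The runner parameter exists only so the loop can be exercised with a stub instead of launching real benchmarks.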

Use a YAML config file to set evaluation parameters per op type or per definition:

flashinfer-bench run --local /path/to/flashinfer-trace --config my_config.yaml

Use the isolated runner instead of the default persistent runner:

flashinfer-bench run --local /path/to/flashinfer-trace --use-isolated-runner

Run The Benchmark Server

Start an HTTP benchmark server against a local trace dataset:

flashinfer-bench serve \
  --local /path/to/flashinfer-trace \
  --host 0.0.0.0 \
  --port 8000

Use --devices to pin specific CUDA devices, or omit it to use all available CUDA devices. For endpoint details and request/response examples, see Benchmark Server API.

Inspect Results

Summarize pass/fail counts and author rankings by average speedup:

flashinfer-bench report summary --local /path/to/flashinfer-trace

Show the best solution for each definition:

flashinfer-bench report best --local /path/to/flashinfer-trace

Merge multiple local datasets into one output directory:

flashinfer-bench report merge \
  --local /path/to/trace-a \
  --local /path/to/trace-b \
  --output /path/to/merged-trace
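When the list of datasets comes from a script, the repeated --local flags can be generated rather than typed by hand. A minimal sketch using only the flags shown above; merge_command is a hypothetical helper, not part of FlashInfer-Bench.

```python
def merge_command(trace_dirs, output_dir):
    """Build the argv list for `flashinfer-bench report merge`.

    Emits one --local flag per input dataset, followed by --output.
    """
    cmd = ["flashinfer-bench", "report", "merge"]
    for trace_dir in trace_dirs:
        cmd += ["--local", str(trace_dir)]
    cmd += ["--output", str(output_dir)]
    return cmd


print(merge_command(["/path/to/trace-a", "/path/to/trace-b"], "/path/to/merged-trace"))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to execute the merge.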

Render a console-oriented visualization of results:

flashinfer-bench report visualize --local /path/to/flashinfer-trace
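The run and report commands can be chained so that a benchmark is followed immediately by its summary. The sketch below is one possible way to script this with the commands shown on this page. The function names are hypothetical, and check=True relies on the assumption that each command signals failure through a non-zero exit code.

```python
import subprocess


def benchmark_and_report_commands(trace_dir):
    """Return the argv lists for a run followed by two report views."""
    prefix = ["flashinfer-bench"]
    return [
        prefix + ["run", "--local", trace_dir],
        prefix + ["report", "summary", "--local", trace_dir],
        prefix + ["report", "best", "--local", trace_dir],
    ]


def run_pipeline(trace_dir, runner=subprocess.run):
    """Execute each step in order, stopping at the first failing command."""
    for cmd in benchmark_and_report_commands(trace_dir):
        # check=True makes subprocess.run raise CalledProcessError on failure.
        runner(cmd, check=True)


# Example call (requires flashinfer-bench on PATH):
# run_pipeline("/path/to/flashinfer-trace")
```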

Notes

The CLI supports local datasets via --local.
Log verbosity is controlled with --log-level {DEBUG,INFO,WARNING,ERROR} on supported commands.
The flashinfer-bench console script and python -m flashinfer_bench share the same implementation and behavior.
