Entry Points
FlashInfer-Bench provides two equivalent command-line entry points: the flashinfer-bench console script and python -m flashinfer_bench.
Use --help on any subcommand to inspect all available flags.
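As a rough illustration of how the two entry points can be used interchangeably from a script, the sketch below builds a help invocation. It prefers the flashinfer-bench console script when it is on PATH and otherwise falls back to python -m flashinfer_bench. The helper name help_command is hypothetical, and the sketch assumes that --help is also accepted at the top level, which the text above does not state explicitly.

```python
import shutil
import sys


def help_command(*subcommand: str) -> list[str]:
    """Build an argv list that requests --help from the CLI.

    `subcommand` is passed through unchanged; no subcommand names are
    assumed here. Top-level --help (no subcommand) is an assumption.
    """
    if shutil.which("flashinfer-bench"):
        # Console script is installed and visible on PATH.
        base = ["flashinfer-bench"]
    else:
        # Fall back to the module form, using the current interpreter.
        base = [sys.executable, "-m", "flashinfer_bench"]
    return [*base, *subcommand, "--help"]


# Print the command instead of executing it, so the sketch runs
# even where FlashInfer-Bench is not installed.
print(" ".join(help_command()))
```

The returned list can be handed to subprocess.run to execute the command in an environment where FlashInfer-Bench is installed.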
Run Benchmarks
Run benchmarks against a local FlashInfer-Trace dataset.
Run The Benchmark Server
Start an HTTP benchmark server against a local trace dataset.
Use --devices to pin specific CUDA devices, or omit it to use all available CUDA devices.
For endpoint details and request/response examples, see Benchmark Server API.
Inspect Results
Summarize pass/fail counts and author rankings by average speedup.
Notes
- The CLI supports local datasets via --local.
- Log verbosity is controlled with --log-level {DEBUG,INFO,WARNING,ERROR} on supported commands.
- The flashinfer-bench console script and python -m flashinfer_bench share the same implementation and behavior.
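To make the summary described under Inspect Results more concrete, here is a minimal sketch of that kind of aggregation: pass/fail counts plus authors ranked by average speedup. It is not the tool's implementation. The record fields (author, passed, speedup), the sample values, and the choice to average only over passing runs are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical result records; the real trace schema may differ.
records = [
    {"author": "alice", "passed": True, "speedup": 1.0},
    {"author": "alice", "passed": True, "speedup": 2.0},
    {"author": "bob", "passed": True, "speedup": 3.0},
    {"author": "bob", "passed": False, "speedup": None},
]


def summarize(records):
    """Count passes and failures, then rank authors by mean speedup."""
    passed = sum(1 for r in records if r["passed"])
    failed = len(records) - passed

    # Assumption: failed runs carry no speedup and are excluded
    # from the per-author average.
    speedups = defaultdict(list)
    for r in records:
        if r["passed"]:
            speedups[r["author"]].append(r["speedup"])

    # Highest average speedup first.
    ranking = sorted(
        ((author, mean(values)) for author, values in speedups.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return {"passed": passed, "failed": failed, "ranking": ranking}


summary = summarize(records)
print(summary["passed"], summary["failed"])
for author, avg in summary["ranking"]:
    print(f"{author}: {avg:.2f}x")
```

Excluding failed runs keeps an author's average from being distorted by missing measurements. A real report may weight or penalize failures differently.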

