flashinfer_bench.tracing
flashinfer_bench.tracing provides tools for tracing kernel executions during LLM inference
and collecting workload traces for the FlashInfer Trace database. This module enables:
Workload Collection: Capture kernel inputs and execution patterns at runtime
Configurable Tracing: Control what data to collect and how to deduplicate or filter traces
Filter Policies: Apply policies to reduce redundant traces and manage dataset size
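
To make the deduplication and filter-policy ideas above concrete, here is a minimal, self-contained sketch. It does not use the flashinfer_bench.tracing API: the names WorkloadRecord, shape_signature, KeepFirstKPolicy, and collect are hypothetical and exist only for illustration. It assumes that two kernel calls count as redundant when they share a kernel name and input shapes, which is one plausible deduplication key among several.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Hashable, Iterable, List, Tuple


@dataclass(frozen=True)
class WorkloadRecord:
    """Hypothetical record of one traced kernel call (not a library type)."""

    kernel: str
    input_shapes: Tuple[Tuple[int, ...], ...]


def shape_signature(record: WorkloadRecord) -> Hashable:
    """Deduplication key: kernel name plus the shapes of all inputs."""
    return (record.kernel, record.input_shapes)


class KeepFirstKPolicy:
    """Hypothetical filter policy: keep at most k records per signature."""

    def __init__(self, k: int = 1) -> None:
        if k < 1:
            raise ValueError("k must be at least 1")
        self.k = k
        self._seen: Dict[Hashable, int] = defaultdict(int)

    def accept(self, record: WorkloadRecord) -> bool:
        key = shape_signature(record)
        if self._seen[key] >= self.k:
            return False  # redundant: this signature is already covered
        self._seen[key] += 1
        return True


def collect(records: Iterable[WorkloadRecord], policy: KeepFirstKPolicy) -> List[WorkloadRecord]:
    """Apply the policy in arrival order and return the retained records."""
    return [r for r in records if policy.accept(r)]


# Example stream: the same GEMM shape recurs, as it would across decode steps.
gemm = WorkloadRecord("gemm", ((16, 4096), (4096, 4096)))
norm = WorkloadRecord("rmsnorm", ((16, 4096),))
attn = WorkloadRecord("attention", ((16, 32, 128),))
stream = [gemm, gemm, norm, gemm, norm, attn]

kept_once = collect(stream, KeepFirstKPolicy(k=1))
kept_twice = collect(stream, KeepFirstKPolicy(k=2))
```

Keying on shapes rather than tensor contents keeps the policy cheap to evaluate, but it would discard calls whose values differ in ways that matter for a benchmark. Raising k, or choosing a richer key, trades dataset size for coverage.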