flashinfer_bench.data.TraceSet

class flashinfer_bench.data.TraceSet

Stores a FlashInfer Trace dataset containing definitions, solutions, workloads, and traces.

TraceSet serves as a centralized data warehouse for managing FlashInfer benchmark data. It provides efficient lookup and filtering capabilities for definitions, solutions, and execution traces organized by definition names.

The data is stored in dictionaries keyed by definition name for fast lookup. Each key maps either to a single Definition or to a list of the associated solutions, workloads, or traces.

root: Path | None = None

The root path of the TraceSet. If None, all add() or get() operations will be performed in-memory.

definitions: Dict[str, Definition]

The definitions in the database. Map from definition name to Definition object.

solutions: Dict[str, List[Solution]]

The solutions in the database. Map from definition name to all the solutions for that definition.

workloads: Dict[str, List[Trace]]

The workload traces in the database. Map from definition name to all workload traces for that definition.

traces: Dict[str, List[Trace]]

The traces in the database. Map from definition name to all traces for that definition.

classmethod from_path(path: str | None = None) → TraceSet

Load a TraceSet from a directory structure.

Loads a complete TraceSet by scanning the directory structure for:

- definitions/: JSON files containing Definition objects
- solutions/: JSON files containing Solution objects
- workloads/: JSONL files containing Trace objects (workload traces)
- traces/: JSONL files containing Trace objects (execution traces)

Parameters:

path (Optional[str], optional) – Root directory path containing the dataset structure. If None, the value of the environment variable FIB_DATASET_PATH is used. If FIB_DATASET_PATH is not set either, ~/.cache/flashinfer_bench/dataset is used.

Returns:

A new TraceSet instance populated with data from the directory.

Return type:

TraceSet

Raises:

ValueError – If duplicate definition names or solution names are found.
FileNotFoundError – If the specified path doesn’t exist.
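
The fallback order described for path can be sketched as below. This is a minimal illustration rather than the library's implementation: resolve_dataset_root is a hypothetical helper name, and treating an empty FIB_DATASET_PATH as unset is an assumption.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_dataset_root(path: Optional[str] = None) -> Path:
    """Pick the dataset root using the documented fallback order (hypothetical helper)."""
    # 1. An explicit argument always wins.
    if path is not None:
        return Path(path)
    # 2. Otherwise fall back to the environment variable.
    #    Assumption: an empty value is treated the same as an unset variable.
    env_path = os.environ.get("FIB_DATASET_PATH")
    if env_path:
        return Path(env_path)
    # 3. Finally use the default cache location under the home directory.
    return Path.home() / ".cache" / "flashinfer_bench" / "dataset"
```

With the real class, the resolved directory would then be the one passed to or chosen by TraceSet.from_path.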

to_dict() → Dict[str, Any]

Convert the TraceSet to a Python dict.

Returns:

A dictionary representation of the TraceSet.

Return type:

Dict[str, Any]

get_solution(name: str) → Solution | None

Get a solution by name from all loaded solutions.

Uses an O(1) index lookup for fast retrieval. Since solution names are unique across the entire dataset, this returns at most one solution.

Parameters:

name (str) – The name of the solution to retrieve.

Returns:

The solution with the given name, or None if not found.

Return type:

Optional[Solution]
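
One way to get the constant-time lookup described above is a flat name-to-solution dictionary built from the per-definition lists. The sketch below assumes this approach; the Solution stand-in and build_solution_index are hypothetical and much simpler than the real classes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Solution:
    # Hypothetical stand-in; the real Solution class carries more fields.
    name: str
    definition: str


def build_solution_index(solutions: Dict[str, List[Solution]]) -> Dict[str, Solution]:
    """Flatten the per-definition lists into a single name -> Solution index."""
    index: Dict[str, Solution] = {}
    for sols in solutions.values():
        for sol in sols:
            # Names must be unique across the whole dataset for the index to work.
            if sol.name in index:
                raise ValueError(f"duplicate solution name: {sol.name}")
            index[sol.name] = sol
    return index


solutions = {
    "gemm": [Solution("gemm_ref", "gemm"), Solution("gemm_fast", "gemm")],
    "rmsnorm": [Solution("rmsnorm_ref", "rmsnorm")],
}
index = build_solution_index(solutions)
found: Optional[Solution] = index.get("gemm_fast")  # average-case O(1) dict lookup
missing: Optional[Solution] = index.get("nope")     # None when the name is unknown
```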

filter_traces(def_name: str, atol: float = 0.01, rtol: float = 0.01) → List[Trace]

Filter traces for a definition based on error bounds.

Returns only successful traces that meet the specified absolute and relative error tolerance criteria. This is useful for finding high-quality implementations that produce numerically accurate results.

Parameters:

def_name (str) – Name of the definition to filter traces for.
atol (float, optional) – Maximum absolute error tolerance, by default 1e-2.
rtol (float, optional) – Maximum relative error tolerance, by default 1e-2.

Returns:

List of traces that passed evaluation and meet error criteria. Empty list if no traces match the criteria.

Return type:

List[Trace]
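
The filtering rule can be illustrated with a small stand-alone sketch. TraceStub and its flat fields are hypothetical (the real Trace class is structured differently), and the inclusive comparison against the tolerances is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TraceStub:
    # Hypothetical flat stand-in for a Trace; field names are illustrative.
    solution: str
    status: str           # "PASSED", or any other value for a failed run
    max_abs_error: float
    max_rel_error: float


def filter_by_error(traces: List[TraceStub], atol: float = 1e-2, rtol: float = 1e-2) -> List[TraceStub]:
    """Keep passing traces whose errors are within both tolerances."""
    # Assumption: a trace exactly at the tolerance is still accepted.
    return [
        t for t in traces
        if t.status == "PASSED" and t.max_abs_error <= atol and t.max_rel_error <= rtol
    ]


traces = [
    TraceStub("fast_kernel", "PASSED", 1e-3, 5e-3),
    TraceStub("sloppy_kernel", "PASSED", 5e-2, 1e-3),       # absolute error too large
    TraceStub("broken_kernel", "RUNTIME_ERROR", 0.0, 0.0),  # did not pass evaluation
]
kept = filter_by_error(traces)
```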

get_best_trace(def_name: str, axes: Dict[str, int] | None = None, max_abs_error: float = 0.01, max_rel_error: float = 0.01) → Trace | None

Get the best performing trace for a definition based on speedup factor.

Finds the trace with the highest speedup factor among those that meet the specified criteria including axis constraints and error tolerances.

Parameters:

def_name (str) – Name of the definition to find the best trace for.
axes (Optional[Dict[str, int]], optional) – Dictionary of axis name to value pairs for exact matching. Only traces with exactly matching axis values will be considered. If None, all axis configurations are considered.
max_abs_error (float, optional) – Maximum absolute error tolerance, by default 1e-2.
max_rel_error (float, optional) – Maximum relative error tolerance, by default 1e-2.

Returns:

The best performing trace meeting all criteria, or None if no traces match the requirements.

Return type:

Optional[Trace]
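
The selection logic can be sketched as a filter followed by a maximum over the speedup factor. Everything below is illustrative: TraceStub and best_trace are hypothetical names, and the sketch assumes that only the axes named in the filter are compared, which the description above does not spell out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TraceStub:
    # Hypothetical flat stand-in for a Trace; field names are illustrative.
    solution: str
    axes: Dict[str, int]
    speedup: float
    max_abs_error: float = 0.0
    max_rel_error: float = 0.0
    passed: bool = True


def best_trace(
    traces: List[TraceStub],
    axes: Optional[Dict[str, int]] = None,
    max_abs_error: float = 1e-2,
    max_rel_error: float = 1e-2,
) -> Optional[TraceStub]:
    """Return the eligible trace with the highest speedup, or None."""

    def eligible(t: TraceStub) -> bool:
        if not t.passed:
            return False
        if t.max_abs_error > max_abs_error or t.max_rel_error > max_rel_error:
            return False
        # Assumption: only the axes named in the filter are compared.
        return axes is None or all(t.axes.get(k) == v for k, v in axes.items())

    return max((t for t in traces if eligible(t)), key=lambda t: t.speedup, default=None)


traces = [
    TraceStub("a", {"batch": 8, "seq": 128}, 1.8),
    TraceStub("b", {"batch": 8, "seq": 128}, 2.4),
    TraceStub("c", {"batch": 16, "seq": 128}, 3.1),
    TraceStub("d", {"batch": 8, "seq": 128}, 9.0, max_abs_error=0.5),  # too inaccurate
]
best_for_batch_8 = best_trace(traces, {"batch": 8})
best_overall = best_trace(traces)
```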

summary(baseline_author: str = 'baseline', op_type: str | None = None, definition_name: str | None = None) → TraceSetSummary

Get aggregate trace counts and author rankings for the current dataset.

Returns:

Trace counts (total, passed, failed) and author rankings sorted by avg_speedup descending.

Return type:

TraceSetSummary

Parameters:

baseline_author (str)
op_type (str | None)
definition_name (str | None)

get_solution_score(solution_name: str, baseline_author: str = 'baseline') → SpeedupMetrics | None

Get the score for a single solution against the baseline.

The baseline is determined by baseline_author, which must map to exactly one solution in the same definition (0 or >1 raises ValueError). The baseline solution itself is not scored (returns None).

For each workload (matched by workload.uuid), the speedup is baseline_latency / solution_latency. When a (solution, workload) pair has multiple traces, the lowest-latency passing trace is used. Only workloads where the baseline also has a passing trace participate.

If the solution has any failed trace (status != PASSED), the entire score is 0. Otherwise the score is the mean speedup across workloads.

Parameters:

solution_name (str) – Solution name to score.
baseline_author (str) – Author name to use as baseline (default: ‘baseline’).

Returns:

Score result, or None if the solution is unknown, is the baseline itself, or has no comparable workloads.

Return type:

SpeedupMetrics | None

Raises:

ValueError – If the baseline author has zero or more than one solution for the definition, or if a PASSED trace has invalid performance data.
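
The scoring rules above can be traced through a small stand-alone sketch. It is not the library's code: TraceStub, best_latency, and solution_score are hypothetical, the score is reduced to a plain float instead of a SpeedupMetrics object, the baseline is identified by solution name rather than by author, and the order in which the zero-score and None cases are checked is an assumption.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class TraceStub:
    # Hypothetical flat stand-in for a Trace; field names are illustrative.
    solution: str
    workload_uuid: str
    status: str        # "PASSED", or any other value for a failed run
    latency_ms: float


def best_latency(traces: List[TraceStub], solution: str) -> Dict[str, float]:
    """Lowest passing latency per workload uuid for one solution."""
    best: Dict[str, float] = {}
    for t in traces:
        if t.solution == solution and t.status == "PASSED":
            best[t.workload_uuid] = min(t.latency_ms, best.get(t.workload_uuid, float("inf")))
    return best


def solution_score(traces: List[TraceStub], solution: str, baseline: str) -> Optional[float]:
    """Mean of baseline_latency / solution_latency over comparable workloads."""
    if solution == baseline:
        return None  # the baseline itself is not scored
    # Assumption: the failed-trace rule is checked before looking for workloads.
    if any(t.solution == solution and t.status != "PASSED" for t in traces):
        return 0.0
    base = best_latency(traces, baseline)
    mine = best_latency(traces, solution)
    # Only workloads where the baseline also has a passing trace participate.
    speedups = [base[w] / mine[w] for w in mine if w in base]
    return mean(speedups) if speedups else None


traces = [
    TraceStub("ref", "w1", "PASSED", 10.0),
    TraceStub("ref", "w2", "PASSED", 20.0),
    TraceStub("fast", "w1", "PASSED", 5.0),
    TraceStub("fast", "w1", "PASSED", 4.0),   # lowest latency wins for w1
    TraceStub("fast", "w2", "PASSED", 10.0),
    TraceStub("flaky", "w1", "PASSED", 1.0),
    TraceStub("flaky", "w2", "FAILED", 0.0),  # one failure zeroes the score
]
fast_score = solution_score(traces, "fast", baseline="ref")    # mean of 10/4 and 20/10
flaky_score = solution_score(traces, "flaky", baseline="ref")
```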

get_author_score(author: str, baseline_author: str = 'baseline', op_type: str | None = None, definition_name: str | None = None) → SpeedupMetrics | None

Get the score for a single author within the selected scope.

Within each scoped definition, the author may own multiple solutions. Only the best-scoring one (highest avg_speedup) is kept per definition. The author’s final score is the mean of these per-definition best scores.

Parameters:

author (str) – Author name to score.
baseline_author (str) – Author name to use as baseline (default: ‘baseline’).
op_type (Optional[str]) – Operation type to score within.
definition_name (Optional[str]) – Definition name to score within.

Returns:

Author score, or None if the author has no scorable solutions in scope.

Return type:

SpeedupMetrics | None

Raises:

KeyError – If definition_name is provided but not found.
ValueError – If both op_type and definition_name are provided.
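
The aggregation step (best solution per definition, then the mean across definitions) can be sketched with plain floats. The author_score helper and its input layout are hypothetical; the real method returns a SpeedupMetrics object and computes the per-solution scores itself.

```python
from statistics import mean
from typing import Dict, Optional


def author_score(per_definition: Dict[str, Dict[str, float]]) -> Optional[float]:
    """Aggregate one author's scores.

    per_definition maps definition name -> {solution name: avg speedup}
    for the solutions that author owns (hypothetical input layout).
    """
    # Keep only the best-scoring solution within each definition.
    best = [max(scores.values()) for scores in per_definition.values() if scores]
    # The final score is the mean of those per-definition bests.
    return mean(best) if best else None


alice = {
    "gemm": {"alice_gemm_v1": 1.5, "alice_gemm_v2": 2.5},
    "rmsnorm": {"alice_rmsnorm": 1.5},
}
alice_score = author_score(alice)  # mean of 2.5 and 1.5
```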

rank_authors(baseline_author: str = 'baseline', op_type: str | None = None, definition_name: str | None = None) → List[Tuple[str, SpeedupMetrics]]

Rank all authors by average speedup within the selected scope.

Collects all distinct authors from solutions in the scoped definitions, scores each via get_author_score, filters out None results (e.g. baseline author), and sorts by avg_speedup descending.

Parameters:

baseline_author (str) – Author name to use as baseline (default: ‘baseline’).
op_type (Optional[str]) – Operation type to rank within.
definition_name (Optional[str]) – Definition name to rank within.

Returns:

(author_name, score) pairs sorted by avg_speedup descending, then by author name.

Return type:

List[Tuple[str, SpeedupMetrics]]

Raises:

KeyError – If definition_name is provided but not found.
ValueError – If both op_type and definition_name are provided.
ValueError – If any included definition is missing a baseline solution from baseline_author.
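
The ordering rule (avg_speedup descending, ties broken by author name) maps onto a single sort with a composite key. In this sketch the scores are plain floats and rank is a hypothetical helper; the real method works with SpeedupMetrics objects.

```python
from typing import Dict, List, Optional, Tuple


def rank(scores: Dict[str, Optional[float]]) -> List[Tuple[str, float]]:
    """Drop unscored authors, then sort by speedup descending and name ascending."""
    scored = [(author, s) for author, s in scores.items() if s is not None]
    # Negating the speedup sorts it descending while the name stays ascending.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


ranking = rank({"bob": 2.0, "alice": 2.0, "carol": 3.0, "baseline": None})
```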

backup_traces() → None

Back up the traces directory to a new directory. This is useful for keeping the old traces for reference.

Return type:

None

add_traces(traces: List[Trace]) → None

Add traces to the TraceSet, and store the traces to disk.

Parameters:

traces (List[Trace]) – The traces to add to the TraceSet.

Return type:

None

add_workload_blob_tensor(def_name: str, workload_uuid: str, tensors: Dict[str, torch.Tensor]) → str

Store a dict of workload blob tensors to a safetensors file, and return the saved file path relative to the TraceSet root.

Parameters:

def_name (str) – Name of the definition the tensors belong to.
workload_uuid (str) – UUID of the workload the tensors belong to.
tensors (Dict[str, torch.Tensor]) – The dict of tensors to store.

Returns:

The path of the saved safetensors file, relative to the TraceSet root.

Return type:

str

Raises:

ValueError – If the root path is not set, or the constructed tensor path already exists.
OSError – If writing to disk fails.

get_workload_blob_tensor(path_str: str) → Dict[str, torch.Tensor]

Load the tensors of a workload blob from disk onto the CPU.

Parameters:

path_str (str) – The file path of the tensor file relative to the TraceSet root.

Returns:

The dict of tensors from the file.

Return type:

Dict[str, torch.Tensor]

add_workload_traces(traces: List[Trace]) → None

Add workload traces to the TraceSet, and store the traces to disk.

Parameters:

traces (List[Trace]) – The traces to add to the TraceSet.

Return type:

None