TraceSet¶
- class flashinfer_bench.data.TraceSet¶
Stores a FlashInfer Trace dataset containing definitions, solutions, workloads, and traces.
TraceSet serves as a centralized data warehouse for managing FlashInfer benchmark data. It provides efficient lookup and filtering capabilities for definitions, solutions, and execution traces organized by definition names.
The data structure is optimized for fast lookups: each collection is a dictionary keyed by definition name. definitions maps each name to a single Definition object, while solutions, workloads, and traces map each name to a list of associated objects.
- root: Path | None = None¶
The root path of the TraceSet. If None, all add() or get() operations will be performed in-memory.
- definitions: Dict[str, Definition]¶
The definitions in the database. Map from definition name to Definition object.
- solutions: Dict[str, List[Solution]]¶
The solutions in the database. Map from definition name to all the solutions for that definition.
- workloads: Dict[str, List[Trace]]¶
The workload traces in the database. Map from definition name to all workload traces for that definition.
- traces: Dict[str, List[Trace]]¶
The traces in the database. Map from definition name to all traces for that definition.
- classmethod from_path(path: str | None = None) TraceSet¶
Load a TraceSet from a directory structure.
Loads a complete TraceSet by scanning the directory structure for:
  - definitions/: JSON files containing Definition objects
  - solutions/: JSON files containing Solution objects
  - workloads/: JSONL files containing Trace objects (workload traces)
  - traces/: JSONL files containing Trace objects (execution traces)
- Parameters:
path (Optional[str], optional) – Root directory path containing the dataset structure. If None, the global environment variable FIB_DATASET_PATH is used. If FIB_DATASET_PATH is not set, ~/.cache/flashinfer_bench/dataset is used.
- Returns:
A new TraceSet instance populated with data from the directory.
- Return type:
TraceSet
- Raises:
ValueError – If duplicate definition names or solution names are found.
FileNotFoundError – If the specified path doesn’t exist.
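The fallback order described for path can be sketched with the standard library alone. The helper name resolve_dataset_path is hypothetical and not part of flashinfer_bench; it only illustrates the documented precedence (explicit argument, then FIB_DATASET_PATH, then ~/.cache/flashinfer_bench/dataset). Treating an empty environment variable as unset is an assumption.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_dataset_path(path: Optional[str] = None) -> Path:
    """Hypothetical helper mirroring the documented fallback order."""
    # 1. An explicit argument always wins.
    if path is not None:
        return Path(path)
    # 2. Otherwise fall back to the environment variable
    #    (assumption: an empty value counts as "not set").
    env_path = os.environ.get("FIB_DATASET_PATH")
    if env_path:
        return Path(env_path)
    # 3. Finally use the documented default cache location.
    return Path.home() / ".cache" / "flashinfer_bench" / "dataset"
```

With this ordering, a caller that passes no argument picks up whatever dataset the environment points at, and an explicit path overrides it.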
- to_dict() Dict[str, Any]¶
Convert the TraceSet to a Python dict.
- Returns:
A dictionary representation of the TraceSet.
- Return type:
Dict[str, Any]
- get_solution(name: str) Solution | None¶
Get a solution by name from all loaded solutions.
Searches across all solutions in the TraceSet to find one with the specified name. Since solution names are unique across the entire dataset, this returns at most one solution.
- Parameters:
name (str) – The name of the solution to retrieve.
- Returns:
The solution with the given name, or None if not found.
- Return type:
Optional[Solution]
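Because solutions are stored per definition, a lookup by solution name has to walk every list. The following is a minimal sketch of that search, not the library's implementation: SolutionStub and find_solution are hypothetical stand-ins, and the definition and solution names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SolutionStub:
    """Hypothetical stand-in for Solution with only the fields used here."""
    name: str
    definition: str


def find_solution(
    solutions: Dict[str, List[SolutionStub]], name: str
) -> Optional[SolutionStub]:
    """Return the first solution with the given name, or None."""
    # Walk every per-definition list; names are assumed unique,
    # so the first match is the only match.
    for candidates in solutions.values():
        for solution in candidates:
            if solution.name == name:
                return solution
    return None


# Illustrative data shaped like the documented `solutions` mapping.
solutions = {
    "gemm": [SolutionStub("gemm_triton", "gemm"), SolutionStub("gemm_cuda", "gemm")],
    "rmsnorm": [SolutionStub("rmsnorm_torch", "rmsnorm")],
}
found = find_solution(solutions, "gemm_cuda")
missing = find_solution(solutions, "does_not_exist")
```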
- filter_traces(def_name: str, atol: float = 0.01, rtol: float = 0.01) List[Trace]¶
Filter traces for a definition based on error bounds.
Returns only successful traces that meet the specified absolute and relative error tolerance criteria. This is useful for finding high-quality implementations that produce numerically accurate results.
- Parameters:
def_name (str) – Name of the definition to filter traces for.
atol (float, optional) – Maximum absolute error tolerance, by default 1e-2.
rtol (float, optional) – Maximum relative error tolerance, by default 1e-2.
- Returns:
List of traces that passed evaluation and meet error criteria. Empty list if no traces match the criteria.
- Return type:
List[Trace]
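The filtering rule can be illustrated without the library. This is a sketch under stated assumptions: TraceStub is a hypothetical, flattened stand-in for Trace (the real class is structured differently), and whether the real comparison is inclusive (<=) or strict (<) is not stated in this reference, so the inclusive form below is a guess.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TraceStub:
    """Hypothetical flattened stand-in for a Trace's evaluation result."""
    solution: str
    passed: bool
    max_abs_error: float
    max_rel_error: float


def filter_by_error(
    traces: List[TraceStub], atol: float = 1e-2, rtol: float = 1e-2
) -> List[TraceStub]:
    """Keep traces that passed and stay within both error bounds."""
    return [
        t
        for t in traces
        # Assumption: bounds are inclusive.
        if t.passed and t.max_abs_error <= atol and t.max_rel_error <= rtol
    ]


traces = [
    TraceStub("accurate", passed=True, max_abs_error=1e-4, max_rel_error=1e-3),
    TraceStub("loose", passed=True, max_abs_error=5e-2, max_rel_error=1e-3),
    TraceStub("crashed", passed=False, max_abs_error=0.0, max_rel_error=0.0),
]
kept = filter_by_error(traces)
kept_loose = filter_by_error(traces, atol=0.1)
```

Loosening atol admits the second trace, while the failed trace is excluded regardless of its error values.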
- get_best_trace(def_name: str, axes: Dict[str, int] | None = None, max_abs_error: float = 0.01, max_rel_error: float = 0.01) Trace | None¶
Get the best performing trace for a definition based on speedup factor.
Finds the trace with the highest speedup factor among those that meet the specified criteria including axis constraints and error tolerances.
- Parameters:
def_name (str) – Name of the definition to find the best trace for.
axes (Optional[Dict[str, int]], optional) – Dictionary of axis name to value pairs for exact matching. Only traces with exactly matching axis values will be considered. If None, all axis configurations are considered.
max_abs_error (float, optional) – Maximum absolute error tolerance, by default 1e-2.
max_rel_error (float, optional) – Maximum relative error tolerance, by default 1e-2.
- Returns:
The best performing trace meeting all criteria, or None if no traces match the requirements.
- Return type:
Optional[Trace]
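The selection logic can be sketched as a filter followed by a maximum over the speedup factor. Everything named here is hypothetical: TraceStub flattens the fields a real Trace keeps elsewhere, and two details are assumptions rather than documented behavior: that only passing traces are considered, and that "exact matching" means every axis given in axes must equal the trace's value for that axis.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TraceStub:
    """Hypothetical flattened stand-in for a Trace."""
    solution: str
    axes: Dict[str, int]
    speedup: float
    max_abs_error: float = 0.0
    max_rel_error: float = 0.0
    passed: bool = True


def best_trace(
    traces: List[TraceStub],
    axes: Optional[Dict[str, int]] = None,
    max_abs_error: float = 1e-2,
    max_rel_error: float = 1e-2,
) -> Optional[TraceStub]:
    """Return the highest-speedup trace meeting all constraints, or None."""

    def matches(t: TraceStub) -> bool:
        if not t.passed:  # assumption: failed traces are never "best"
            return False
        if t.max_abs_error > max_abs_error or t.max_rel_error > max_rel_error:
            return False
        # Assumption: every requested axis must match exactly.
        return axes is None or all(t.axes.get(k) == v for k, v in axes.items())

    # max(..., default=None) covers the "no traces match" case.
    return max((t for t in traces if matches(t)), key=lambda t: t.speedup, default=None)


traces = [
    TraceStub("a", {"M": 128}, speedup=1.5),
    TraceStub("b", {"M": 128}, speedup=2.0, max_abs_error=0.5),  # too inaccurate
    TraceStub("c", {"M": 256}, speedup=3.0),
]
best_overall = best_trace(traces)
best_m128 = best_trace(traces, axes={"M": 128})
best_none = best_trace(traces, axes={"M": 512})
```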
- summary() Dict[str, Any]¶
Get a comprehensive summary of all traces in the TraceSet.
Computes aggregate statistics across all execution traces including success rates, latency statistics, and overall dataset size metrics.
- Returns:
Dictionary containing the following keys:
  - total: Total number of traces
  - passed: Number of traces with successful evaluation
  - failed: Number of traces with failed evaluation
  - min_latency_ms: Minimum latency among successful traces (None if no successful traces)
  - max_latency_ms: Maximum latency among successful traces (None if no successful traces)
  - avg_latency_ms: Average latency among successful traces (None if no successful traces)
- Return type:
Dict[str, Any]
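The shape of the returned dictionary can be illustrated with a small stand-alone sketch. TraceStub and summarize are hypothetical names, and counting every non-passing trace as failed is an assumption; only the key names come from the description above.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class TraceStub:
    """Hypothetical stand-in: pass/fail flag plus measured latency."""
    passed: bool
    latency_ms: Optional[float] = None


def summarize(traces: List[TraceStub]) -> Dict[str, Any]:
    """Aggregate counts and latency statistics over successful traces."""
    latencies = [t.latency_ms for t in traces if t.passed and t.latency_ms is not None]
    passed = sum(1 for t in traces if t.passed)
    return {
        "total": len(traces),
        "passed": passed,
        # Assumption: anything that did not pass counts as failed.
        "failed": len(traces) - passed,
        # Latency statistics are None when nothing succeeded.
        "min_latency_ms": min(latencies) if latencies else None,
        "max_latency_ms": max(latencies) if latencies else None,
        "avg_latency_ms": sum(latencies) / len(latencies) if latencies else None,
    }


stats = summarize([TraceStub(True, 1.0), TraceStub(True, 3.0), TraceStub(False)])
empty_stats = summarize([])
```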
- backup_traces() None¶
Back up the traces directory to a new directory. This is useful for keeping the old traces for reference.
- Return type:
None
- add_traces(traces: List[Trace]) None¶
Add traces to the TraceSet, and store the traces to disk.
- Parameters:
traces (List[Trace]) – The traces to add to the TraceSet.
- Return type:
None
- add_workload_blob_tensor(def_name: str, workload_uuid: str, tensors: Dict[str, torch.Tensor]) str¶
Store a dict of workload blob tensors to a safetensors file, and return the saved file path relative to the TraceSet root.
- Parameters:
def_name (str) – The name of the definition the tensors belong to.
workload_uuid (str) – The UUID of the workload the tensors belong to.
tensors (Dict[str, torch.Tensor]) – The dict of tensors to store.
- Returns:
The path of the saved safetensors file, relative to the TraceSet root.
- Return type:
str
- Raises:
ValueError – If the root path is not set, or the constructed tensor path already exists.
OSError – If writing to disk fails.
- get_workload_blob_tensor(path_str: str) Dict[str, torch.Tensor]¶
Load a dict of workload blob tensors from a file on disk onto the CPU.
- Parameters:
path_str (str) – The file path of the tensor relative to the TraceSet root.
- Returns:
The dict of tensors from the file.
- Return type:
Dict[str, torch.Tensor]