flashinfer_bench.agents

flashinfer_bench.agents provides the following tools for kernel agent development and debugging:

Profiling Tools: Run NVIDIA Nsight Compute (NCU) and Compute Sanitizer on solutions
FFI Prompts: Provide context about the FlashInfer Bench API for LLM agents

The package can also export JSON Schema descriptions of these tools through function_to_schema() and get_all_tool_schemas().

flashinfer_bench.agents.flashinfer_bench_run_ncu(solution: Solution | str, workload: Workload | str, *, device: str = 'cuda:0', trace_set_path: str | None = None, set: str = 'detailed', sections: List[str] | None = None, kernel_name: str | None = None, page: str = 'details', ncu_path: str = 'ncu', timeout: int = 60, tmpdir: str | None = None, max_lines: int | None = None) → str

Run NCU profiling on a solution with a specific workload.

This function analyzes the performance of a solution using NVIDIA Nsight Compute.

All inputs and outputs are JSON-serializable, making it suitable as a target for LLM agent tool calling.

Uses the FIB_DATASET_PATH environment variable when trace_set_path is not provided.

Parameters:

solution (Solution or str) – The solution to profile. Can be a Solution object or a path to a JSON file.
workload (Workload or str) – The workload configuration specifying input dimensions and data. Can be a Workload object or a path to a JSON file.
device (str, optional) – CUDA device to run on. Default is “cuda:0”.
trace_set_path (str, optional) – Path to the trace set. If not provided, uses the FIB_DATASET_PATH environment variable.
set (str, optional) – NCU section set to collect. Use flashinfer_bench_list_ncu_options to see available sets. Default is “detailed”.
sections (List[str], optional) – Additional sections to collect beyond the set. Use flashinfer_bench_list_ncu_options to see available sections.
kernel_name (str, optional) – Filter to profile only kernels matching this name (supports regex).
page (str, optional) – NCU output page format. One of: “raw”, “details”, “source”. Default is “details”.
ncu_path (str, optional) – Path to the NCU executable. Default is “ncu”.
timeout (int, optional) – Timeout in seconds for NCU profiling. Default is 60.
tmpdir (str, optional) – Temporary directory for NCU. If not provided, uses system default.
max_lines (int, optional) – Maximum number of lines in output. If None, returns full output.

Returns:

NCU profiling results as text, or error message starting with “ERROR:”.

Return type:

str

Raises:

None – All errors are returned as strings.
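Because failures come back as a string beginning with "ERROR:" rather than as an exception, a caller usually needs to check that prefix before treating the result as profiler output. A minimal sketch of such a check follows; split_tool_result is a hypothetical helper and not part of flashinfer_bench.

```python
from typing import Tuple

ERROR_PREFIX = "ERROR:"  # prefix documented for failed tool calls


def split_tool_result(result: str) -> Tuple[bool, str]:
    """Return (ok, text) for a string returned by one of the agent tools.

    Hypothetical helper for illustration; not part of flashinfer_bench.
    """
    if result.startswith(ERROR_PREFIX):
        # Drop the prefix and surrounding whitespace from the message.
        return False, result[len(ERROR_PREFIX):].strip()
    return True, result
```

An agent loop could then branch on the first element, for example retrying with a larger timeout when the call failed, and pass the second element on to the model either way.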

Examples

>>> from flashinfer_bench.agents import flashinfer_bench_run_ncu
>>> result = flashinfer_bench_run_ncu(solution, workload, set="detailed")
>>> print(result)

flashinfer_bench.agents.flashinfer_bench_list_ncu_options(ncu_path: str = 'ncu') → str

List available NCU sets and sections for profiling configuration.

This function queries the NCU executable to get the available profiling sets and sections. Use this to discover valid options for the set and sections parameters of flashinfer_bench_run_ncu.

Parameters:

ncu_path (str, optional) – Path to the NCU executable. Default is “ncu”.

Returns:

Combined output of ncu --list-sets and ncu --list-sections.

Return type:

str

Examples

>>> from flashinfer_bench.agents import flashinfer_bench_list_ncu_options
>>> print(flashinfer_bench_list_ncu_options())

flashinfer_bench.agents.flashinfer_bench_run_sanitizer(solution: Solution | str, workload: Workload | str, *, device: str = 'cuda:0', trace_set_path: str | None = None, sanitizer_types: List[Literal['memcheck', 'racecheck', 'initcheck', 'synccheck']] | None = None, sanitizer_path: str = 'compute-sanitizer', timeout: int = 300, tmpdir: str | None = None, max_lines: int | None = None) → str

Run compute-sanitizer checks on a solution with a specific workload: memcheck, racecheck, initcheck, synccheck.

Parameters:

solution (Solution or str) – The solution to check. Can be a Solution object or a path to a JSON file.
workload (Workload or str) – The workload configuration specifying input dimensions and data. Can be a Workload object or a path to a JSON file.
device (str, optional) – CUDA device to run on. Default is “cuda:0”.
trace_set_path (str, optional) – Path to the trace set. If not provided, uses the FIB_DATASET_PATH environment variable.
sanitizer_types (List[SanitizerType], optional) – List of sanitizer tools to run. Default runs all: memcheck, racecheck, initcheck, synccheck.
sanitizer_path (str, optional) – Path to the compute-sanitizer executable. Default is “compute-sanitizer”.
timeout (int, optional) – Timeout in seconds for each sanitizer check. Default is 300.
tmpdir (str, optional) – Temporary directory. If not provided, uses system default.
max_lines (int, optional) – Maximum number of lines in output. If None, returns full output.

Returns:

Sanitizer results as text, or error message starting with “ERROR:”.

Return type:

str
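When sanitizer_types comes from an LLM tool call, it can help to validate the names against the four documented values before starting a run that may take up to the timeout per check. The sketch below mirrors the Literal from the signature; normalize_sanitizer_types and ALL_SANITIZERS are hypothetical names introduced for illustration, not part of flashinfer_bench.

```python
from typing import List, Literal, Optional, get_args

# Mirrors the Literal in the documented signature.
SanitizerType = Literal["memcheck", "racecheck", "initcheck", "synccheck"]
ALL_SANITIZERS: List[str] = list(get_args(SanitizerType))


def normalize_sanitizer_types(requested: Optional[List[str]]) -> List[str]:
    """Validate sanitizer names; None selects every tool.

    Hypothetical helper for illustration; not part of flashinfer_bench.
    """
    if requested is None:
        # Documented default: run all four checks.
        return list(ALL_SANITIZERS)
    unknown = [name for name in requested if name not in ALL_SANITIZERS]
    if unknown:
        raise ValueError(f"unknown sanitizer types: {unknown}")
    return list(requested)
```

The validated list can then be passed as the sanitizer_types keyword argument of flashinfer_bench_run_sanitizer.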

flashinfer_bench.agents.extract_solution_to_files(solution: Solution, base_path: str) → str

Extract a Solution object to a directory of files.

Creates a directory containing:

- All source files from the solution
- A metadata file (SOLUTION.md) with build specs and metadata

Parameters:

solution (Solution) – Solution object to extract.
base_path (str) – Base directory path where files will be created.

Returns:

Path to the created solution directory.

Return type:

str

flashinfer_bench.agents.pack_solution_from_files(path: str, spec: BuildSpec, name: str, definition: str, author: str, description: str = '') → Solution

Pack a directory of files into a Solution object.

Only includes source code files (.py, .cu, .cuh, .cpp, .c, .h, .hpp).

Parameters:

path (str) – Path to directory containing solution source files.
spec (BuildSpec) – BuildSpec object specifying build configuration.
name (str) – Solution name.
definition (str) – Definition name.
author (str) – Author name.
description (str, optional) – Solution description. Default is “”.

Returns:

Solution object constructed from files.

Return type:

Solution
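The extension filter described above can be pictured with a short sketch that gathers matching files from a directory. This is not the library's implementation: collect_source_files is a hypothetical helper, and both the recursion into subdirectories and the path-to-content mapping are assumptions made for illustration.

```python
from pathlib import Path
from typing import Dict

# Extensions listed in the documentation for pack_solution_from_files.
SOURCE_SUFFIXES = {".py", ".cu", ".cuh", ".cpp", ".c", ".h", ".hpp"}


def collect_source_files(path: str) -> Dict[str, str]:
    """Map relative file path -> file content for source files under `path`.

    Hypothetical helper; recursing into subdirectories is an assumption.
    """
    root = Path(path)
    sources: Dict[str, str] = {}
    for file in sorted(root.rglob("*")):
        # Keep only regular files with one of the listed extensions.
        if file.is_file() and file.suffix in SOURCE_SUFFIXES:
            sources[file.relative_to(root).as_posix()] = file.read_text()
    return sources
```

Running a filter like this before packing shows which files would be left out, for example build scripts or notes with other extensions.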

flashinfer_bench.agents.function_to_schema(func: Callable) → dict

Generate OpenAI/Anthropic compatible tool schema from a function.

This function extracts information from:

- Function name
- Type hints for parameter types
- Numpy-style docstring for descriptions and parameter docs

Parameters:

func (Callable) – The function to generate schema for. Must have type hints and numpy-style docstring.

Returns:

A tool schema compatible with OpenAI/Anthropic function calling format.

Return type:

dict

Examples

>>> from flashinfer_bench.agents.schema import function_to_schema
>>> from flashinfer_bench.agents import flashinfer_bench_run_ncu
>>> schema = function_to_schema(flashinfer_bench_run_ncu)
>>> print(schema["name"])
flashinfer_bench_run_ncu
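To illustrate the general idea of deriving a tool schema from a function's name, type hints, and docstring, here is a simplified sketch built on the standard inspect and typing modules. It is not the implementation of function_to_schema: sketch_schema is a hypothetical function, the type mapping is deliberately small, and the output layout beyond the "name" key shown above is an assumption.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

# Assumed mapping from simple Python types to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_schema(func: Callable) -> Dict[str, Any]:
    """Build a simplified tool schema (hypothetical; layout is an assumption)."""
    hints = get_type_hints(func)
    properties: Dict[str, Any] = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Types outside the small mapping fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        "description": doc.split("\n")[0],  # first docstring line as summary
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }
```

A real generator would also need to handle unions, lists, and Literal types, and to pull per-parameter descriptions out of the numpy-style docstring.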

flashinfer_bench.agents.get_all_tool_schemas() → List[dict]

Get schemas for all agent tools in flashinfer_bench.

Returns:

List of tool schemas compatible with OpenAI/Anthropic function calling.

Return type:

List[dict]

Examples

>>> from flashinfer_bench.agents.schema import get_all_tool_schemas
>>> schemas = get_all_tool_schemas()
>>> for s in schemas:
...     print(s["name"])
flashinfer_bench_list_ncu_options
flashinfer_bench_run_ncu
flashinfer_bench_run_sanitizer
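Once the schemas have been handed to a model, each tool call it emits has to be routed back to the matching Python function. The sketch below shows one possible way to do that with a name-to-function table; dispatch_tool_call is a hypothetical helper, and the assumption that arguments arrive as a JSON-encoded object depends on the client library in use.

```python
import json
from typing import Callable, Dict


def dispatch_tool_call(
    tools: Dict[str, Callable[..., str]], name: str, arguments_json: str
) -> str:
    """Run the named tool with JSON-encoded keyword arguments.

    Hypothetical dispatcher; failures are reported with the same
    "ERROR:" prefix convention the tools themselves use.
    """
    if name not in tools:
        return f"ERROR: unknown tool {name!r}"
    try:
        kwargs = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return f"ERROR: invalid arguments: {exc}"
    return tools[name](**kwargs)
```

In practice the tools table would map the three names listed above to the functions imported from flashinfer_bench.agents. Since those functions take and return JSON-serializable values, the result string can be sent back to the model unchanged.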

Prompt templates for TVM FFI API documentation used by agents.

flashinfer_bench.agents.ffi_prompt.FFI_PROMPT_SIMPLE: Simplified TVM FFI API documentation with essential methods and a basic example.

flashinfer_bench.agents.ffi_prompt.FFI_PROMPT: Comprehensive TVM FFI API documentation with full method signatures and multiple examples.