Overview
A binding (binding.yaml) maps a contract’s abstract tables to concrete data sources. It tells the resolver how to fetch each table: which backend, which MCP tool, what arguments, and how to parse the response.
Bindings are optional. Without one, the resolver attempts automatic resolution via the knowledge graph using semantic column tags. With a binding, you get explicit control over data sourcing.
Binding Structure
Source Binding Fields
Each entry under source_bindings maps one contract table to a data source:
| Field | Type | Default | Description |
|---|---|---|---|
| backend | string | — | Data backend name ("elasticsearch", "vendor", "mcp", etc.) |
| config_key | string | — | Credentials/config lookup key (e.g., "vendor_mcp", "elastic") |
| fetch_mode | string | "mcp" | How data is fetched (see Fetch Modes) |
| mcp_tool | string | — | MCP tool to call (e.g., "search", "tabular_text") |
| mcp_server | string | — | MCP server name (e.g., "elastic", "vendor") |
| mcp_args | map | — | Static tool arguments; supports {{param}} interpolation |
| query_template | string | — | Query string with {{param}} placeholders |
| index | string | — | Backend-specific index/table selector (e.g., ES index pattern) |
| field_map | map | — | Maps contract column names to source field names |
| items_path | string | — | Dot-path to items array in JSON response (e.g., "data", "results.items") |
| single_item | bool | false | Response is one object, not an array |
| max_rows | int | 10000 | Hard cap on rows to stage |
| timeout | string | "30s" | Per-call timeout (e.g., "120s", "5m") |
| response_format | string | "json" | Response format: "json", "csv", "ndjson" |
| response_adapter | string | — | Tool-specific parser name (e.g., "tabular_text") |
| pagination | object | — | Pagination configuration (see Pagination) |
Validation rules
- response_format and response_adapter are mutually exclusive: set one or the other, not both.
- response_format must be one of "json", "csv", or "ndjson" (or omitted for the JSON default). Invalid values like "cvs" are rejected at parse time.
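The two rules can be sketched in a few lines of Python. This is an illustration of the checks as described, not the resolver's actual code; the function name `validate_response_settings` and the use of `ValueError` are assumptions made for the example.

```python
from typing import Optional

VALID_RESPONSE_FORMATS = {"json", "csv", "ndjson"}


def validate_response_settings(binding: dict) -> Optional[str]:
    """Return the effective response_format, or None when an adapter is set.

    Hypothetical helper: raises ValueError for the two invalid cases
    described in the validation rules.
    """
    fmt = binding.get("response_format")
    adapter = binding.get("response_adapter")

    if fmt is not None and adapter is not None:
        raise ValueError("response_format and response_adapter are mutually exclusive")
    if adapter is not None:
        return None  # parsing is delegated to the named adapter
    if fmt is None:
        return "json"  # an omitted format means the JSON default
    if fmt not in VALID_RESPONSE_FORMATS:
        raise ValueError(f"invalid response_format: {fmt!r}")
    return fmt
```

With this sketch, a binding that sets `response_format: cvs` fails immediately instead of surfacing later as an empty or malformed table.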
Fetch Modes
The fetch_mode field controls who fetches the data and how:
| Mode | Who fetches | Response handling | When to use |
|---|---|---|---|
| mcp_client / fracta_mcp_gateway | Go MCPFetcher | Parsed by Go (JSON/CSV/NDJSON/adapter) | MCP tools with structured, parseable responses |
| mcp | Agent | Agent calls tool, stages via strategy_stage | Complex responses needing agent logic |
| native / strategy_native | Strategy Python code | Strategy writes to DuckDB at runtime | Strategies that compute or fetch inline |
The default is "mcp" (agent-driven staging).
api and direct were legacy Go loader-plugin modes and are no longer supported. Use fracta_mcp_gateway with an MCP server binding instead.
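As a rough illustration of how these names relate, the sketch below normalizes a fetch_mode value. It assumes that the names sharing a table row are interchangeable aliases; the canonical names chosen here and the function `normalize_fetch_mode` are hypothetical, not part of the product.

```python
# Alias groups follow the fetch mode table; which name counts as
# "canonical" is this sketch's own choice.
_ALIASES = {
    "mcp_client": "mcp_client",
    "fracta_mcp_gateway": "mcp_client",
    "mcp": "mcp",
    "native": "native",
    "strategy_native": "native",
}
_LEGACY = {"api", "direct"}


def normalize_fetch_mode(value=None):
    """Map a fetch_mode string to one of: mcp_client, mcp, native."""
    if value is None:
        return "mcp"  # documented default: agent-driven staging
    if value in _LEGACY:
        raise ValueError(
            f"fetch_mode {value!r} is no longer supported; "
            "use fracta_mcp_gateway with an MCP server binding"
        )
    if value not in _ALIASES:
        raise ValueError(f"unknown fetch_mode: {value!r}")
    return _ALIASES[value]
```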
Choosing a fetch mode
Use mcp_client when:
- The MCP tool returns JSON, CSV, or NDJSON
- You want automatic Parquet staging without agent intervention
- You need pagination support
Use mcp when:
- The tool returns complex or unpredictable output
- The agent needs to inspect/filter results before staging
- You want maximum flexibility
Use native when:
- The strategy fetches its own data (e.g., direct API calls from Python)
- No external staging is needed
Response Formats
When using fetch_mode: mcp_client, response parsing is controlled by response_format and response_adapter.
Built-in formats (response_format)
| Format | Parser | Description |
|---|---|---|
| json (default) | json.Unmarshal + items_path navigation | Standard JSON array or object with nested items |
| csv | RFC 4180 CSV reader | First row = headers, subsequent rows = data |
| ndjson | Line-by-line JSON | One JSON object per line (newline-delimited) |
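
To make the three built-in formats concrete, here is a small Python sketch of how a raw response body could become a list of rows. The real parser is the Go MCPFetcher; the names `walk_path` and `parse_rows` are invented for this example, and the handling of items_path and single_item follows the field descriptions only loosely.

```python
import csv
import io
import json


def walk_path(doc, dot_path):
    """Follow a dot-path such as "results.items" through nested dicts."""
    for key in dot_path.split("."):
        doc = doc[key]
    return doc


def parse_rows(body, response_format="json", items_path=None, single_item=False):
    """Turn a raw response body (str) into a list of rows."""
    if response_format == "csv":
        # DictReader treats the first row as headers; all values stay strings.
        return list(csv.DictReader(io.StringIO(body)))
    if response_format == "ndjson":
        # One JSON object per non-empty line.
        return [json.loads(line) for line in body.splitlines() if line.strip()]
    if response_format == "json":
        doc = json.loads(body)
        if items_path:
            doc = walk_path(doc, items_path)
        # single_item wraps a lone object so callers always get a list.
        return [doc] if single_item else list(doc)
    raise ValueError(f"unsupported response_format: {response_format!r}")
```

For example, `parse_rows(body, items_path="results.items")` pulls the array nested under `results.items`, while `parse_rows(body, "csv")` yields one dict per data row keyed by the header row.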
Tool-specific adapters (response_adapter)
For tools that return non-standard output, use a named adapter instead of a format. See Response Adapters for the full adapter reference.
Field Mapping
The field_map maps contract column names (the map keys) to source field names (the map values).
Use field_overrides at the top level for semantic-based mapping that works across multiple tables.
field_map takes precedence over field_overrides when both match the same column.
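The precedence rule can be sketched as a simple merge. This simplifies heavily: field_overrides is described as semantic-based and its exact shape is not shown in this section, so the example treats it as a plain column-to-field map. The helper names, the sample field names, and the choice to keep only mapped columns are all assumptions.

```python
def resolve_field_map(field_map=None, field_overrides=None):
    """Merge both maps; per-table field_map entries win on conflicts."""
    merged = dict(field_overrides or {})
    merged.update(field_map or {})
    return merged


def rename_row(source_row, mapping):
    """Build a contract-shaped row: contract column <- source field.

    Assumption: unmapped source fields are dropped and missing source
    fields become None. The real resolver may behave differently.
    """
    return {
        contract_col: source_row.get(source_field)
        for contract_col, source_field in mapping.items()
    }


mapping = resolve_field_map(
    field_map={"ts": "@timestamp"},
    field_overrides={"ts": "event_time", "host": "host.name"},
)
row = rename_row({"@timestamp": "t0", "host.name": "h1", "extra": 1}, mapping)
```

Here `ts` resolves to `@timestamp` because the table-level field_map overrides the top-level entry, while `host` still comes from field_overrides.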
Pagination
For large datasets, configure pagination to fetch data page by page. All pages are written to a single Parquet file.
Offset mode
The offset advances by page_size each iteration: from=0, from=10000, from=20000, …
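A minimal Python sketch of an offset loop follows. `call_tool` is a hypothetical callable that takes the tool arguments and returns one page of rows; the parameter name "from" mirrors the example values, while "size" is assumed. The sketch stops on an empty or partial page or when max_rows is reached, and leaves out the total time budget and per-page timeout.

```python
def fetch_offset_pages(call_tool, base_args, page_size=100,
                       offset_param="from", limit_param="size",
                       max_rows=10000):
    """Collect rows page by page, advancing the offset by page_size."""
    rows = []
    offset = 0
    while len(rows) < max_rows:
        args = dict(base_args)
        args[offset_param] = offset
        args[limit_param] = page_size
        page = call_tool(args)
        rows.extend(page)
        if len(page) < page_size:
            break  # empty or partial page: nothing more to fetch
        offset += page_size
    return rows[:max_rows]  # enforce the hard row cap
```

With 250 available rows and `page_size=100`, this issues requests at offsets 0, 100, and 200, then stops because the third page is partial.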
Cursor mode
Cursor mode cannot be combined with response_format: csv, response_format: ndjson, or any response_adapter, because the cursor lives in a JSON envelope that non-JSON formats don't have. This is enforced at runtime.
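The sketch below shows why cursor mode depends on a JSON envelope: each response has to be decoded so that the next cursor can be read from it by dot-path. `call_tool` is a hypothetical callable returning the raw JSON body as a string, and the helper names and the envelope layout in the usage note are assumptions for illustration.

```python
import json


def dig(doc, dot_path):
    """Follow a dot-path; return None if any segment is missing."""
    for key in dot_path.split("."):
        if not isinstance(doc, dict) or key not in doc:
            return None
        doc = doc[key]
    return doc


def fetch_cursor_pages(call_tool, base_args, cursor_param,
                       next_cursor_path, items_path, max_rows=10000):
    """Follow next-cursor tokens until a page is empty or the cursor is null."""
    rows = []
    cursor = None
    while len(rows) < max_rows:
        args = dict(base_args)
        if cursor is not None:
            args[cursor_param] = cursor
        envelope = json.loads(call_tool(args))  # cursor and items share one JSON envelope
        page = dig(envelope, items_path) or []
        rows.extend(page)
        cursor = dig(envelope, next_cursor_path)
        if not page or cursor is None:
            break
    return rows[:max_rows]
```

For an envelope shaped like `{"data": [...], "meta": {"next": "abc"}}`, the call would use `items_path="data"` and `next_cursor_path="meta.next"`. A CSV or NDJSON body has no equivalent place to carry the token.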
Pagination fields
| Field | Type | Modes | Description |
|---|---|---|---|
| mode | string | all | "offset" or "cursor" |
| page_size | int | all | Results per page (default: 100) |
| offset_param | string | offset | Query arg name for offset value |
| limit_param | string | both | Query arg name for page size |
| cursor_param | string | cursor | Query arg name for cursor token |
| next_cursor_path | string | cursor | Dot-path to next cursor in response |
| total_path | string | offset | Dot-path to total count (optional, for logging) |
Pagination limits
- Total budget: 5 minutes for the entire paginated fetch
- Per-page timeout: Controlled by timeout (default: 30s)
- Row cap: Controlled by max_rows (default: 10,000)
- Termination: Empty page, partial page, max_rows reached, or null cursor
Parameter Interpolation
Both mcp_args and query_template support {{param}} placeholders that are resolved from strategy parameters at runtime. Interpolation is recursive: a param value can itself contain {{other_param}}.
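A minimal Python sketch of recursive placeholder expansion is shown below. It is not the resolver's implementation: the function name `interpolate`, the depth limit used to stop cyclic references, and the KeyError raised for an unknown parameter are all assumptions.

```python
import re

_PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def interpolate(value, params, max_depth=10):
    """Expand {{param}} placeholders in strings, including nested maps and lists."""
    if isinstance(value, dict):
        return {k: interpolate(v, params, max_depth) for k, v in value.items()}
    if isinstance(value, list):
        return [interpolate(v, params, max_depth) for v in value]
    if not isinstance(value, str):
        return value  # numbers, booleans, None pass through unchanged
    for _ in range(max_depth):
        # An unknown parameter name raises KeyError (sketch behavior).
        expanded = _PLACEHOLDER.sub(lambda m: str(params[m.group(1)]), value)
        if expanded == value:
            return expanded  # nothing left to expand
        value = expanded
    raise ValueError("placeholder expansion did not settle; possible cycle")


params = {"index": "logs-{{env}}", "env": "prod", "days": 7}
query = interpolate("{{index}}-*", params)
args = interpolate({"query": "last {{days}}d", "size": 5}, params)
```

In this example `{{index}}` first expands to `logs-{{env}}` and then to `logs-prod`, which is the recursive case described above.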

