Overview

A contract (contract.yaml) declares what a strategy needs to run: its parameters, required tables with column schemas, graph dependencies, and optional discovery hints. Contracts are the interface between strategy authors and the data-staging layer: they describe what data is needed without specifying how to fetch it.

Contracts live alongside strategy code:
strategies/security/hunt/my_hunt/
  contract.yaml
  binding.yaml      # optional — see docs/bindings.md
  strategy.py

Contract Fields

Top-level

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | yes | Strategy identifier (e.g., "credential-stealer-hunt") |
| version | string | no | Semantic version (e.g., "2.0.0") |
| description | string | yes | Human-readable purpose |
| tags | []string | yes | At least one tag for classification (e.g., [hunt, macos]) |
| params | map | no | Input parameters the strategy accepts |
| requires | object | no | Runtime dependencies (graph, tables, sources) |
| pinned_backend | string | no | Constrain resolution to a specific backend (e.g., "elasticsearch") |
| discovery | object | no | Pre-staging guidance for orchestrators |

params

Each key is a parameter name; the value is a ParamSpec:
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | yes | "str", "int", "float", "bool", or "list" |
| required | bool | no | Whether the caller must provide this parameter |
| default | any | no | Default value if not provided |
| description | string | no | Parameter purpose (shown in strategy_describe) |

params:
  days_back:
    type: int
    required: false
    default: 30
    description: "Number of days to look back"
  severity_filter:
    type: str
    required: false
    default: "HIGH,CRITICAL"
    description: "Comma-separated severity levels to include"
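
To illustrate how the ParamSpec fields interact, here is a minimal Python sketch of parameter resolution. It is not the project's actual loader: `resolve_params` and `PARAM_TYPES` are hypothetical names, and the precedence it uses (caller value, then `default`, otherwise an error when `required` is true) is an assumption based on the field descriptions above. The specs are given as an already-parsed dict, so no YAML library is needed.

```python
from __future__ import annotations

from typing import Any

# Hypothetical mapping from ParamSpec type names to Python types.
PARAM_TYPES = {"str": str, "int": int, "float": float, "bool": bool, "list": list}


def resolve_params(specs: dict[str, dict[str, Any]], supplied: dict[str, Any]) -> dict[str, Any]:
    """Merge caller-supplied values with ParamSpec defaults (illustrative only)."""
    unknown = set(supplied) - set(specs)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    resolved: dict[str, Any] = {}
    for name, spec in specs.items():
        if name in supplied:
            value = supplied[name]
        elif spec.get("required", False):
            raise ValueError(f"missing required parameter: {name}")
        elif "default" in spec:
            value = spec["default"]
        else:
            continue  # optional with no default: leave the parameter unset

        expected = PARAM_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it unless "bool" was declared.
        if isinstance(value, bool) and expected is not bool:
            raise TypeError(f"{name}: expected {spec['type']}, got bool")
        # Assumption: an int is acceptable where a float is declared.
        if expected is float and isinstance(value, int):
            value = float(value)
        if not isinstance(value, expected):
            raise TypeError(f"{name}: expected {spec['type']}, got {type(value).__name__}")
        resolved[name] = value
    return resolved


# The params block above, as a parsed dict.
PARAMS = {
    "days_back": {"type": "int", "required": False, "default": 30},
    "severity_filter": {"type": "str", "required": False, "default": "HIGH,CRITICAL"},
}

print(resolve_params(PARAMS, {"days_back": 7}))
# → {'days_back': 7, 'severity_filter': 'HIGH,CRITICAL'}
```

Rejecting unknown parameter names is a design choice of this sketch; a real loader might ignore them instead.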

requires

| Field | Type | Description |
| --- | --- | --- |
| graph | bool | Whether the strategy needs knowledge graph access |
| duckdb | bool | Whether the strategy needs DuckDB (implied by tables) |
| tables | map | Required staging tables with column schemas |
| sources | []string | Advisory list of expected LogSource names |

requires.tables

Each key is a table name; the value is a TableSpec:
| Field | Type | Description |
| --- | --- | --- |
| description | string | What data this table holds |
| optional | bool | Strategy works without this table if true |
| volume_hint | string | Expected cardinality: "low", "medium", "high" |
| columns | map | Column definitions (see below) |
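
As a rough illustration of the `optional` flag, the Python sketch below separates tables that block a run from tables that can be skipped. The function name `missing_tables`, the `geo_lookup` table, and the assumption that `optional` defaults to false when omitted are all hypothetical and not taken from the contract specification.

```python
from __future__ import annotations

from typing import Any


def missing_tables(
    tables: dict[str, dict[str, Any]], staged: set[str]
) -> tuple[list[str], list[str]]:
    """Return (required_missing, optional_missing) for a set of staged table names."""
    required_missing: list[str] = []
    optional_missing: list[str] = []
    for name, spec in tables.items():
        if name in staged:
            continue
        # Assumption: a table with no "optional" key is treated as required.
        if spec.get("optional", False):
            optional_missing.append(name)
        else:
            required_missing.append(name)
    return required_missing, optional_missing


# Illustrative requires.tables section as a parsed dict.
TABLES = {
    "auth_events": {"description": "Authentication events", "volume_hint": "medium"},
    "geo_lookup": {"description": "Hypothetical enrichment table", "optional": True},
}

blocking, skippable = missing_tables(TABLES, staged=set())
print(blocking, skippable)
# → ['auth_events'] ['geo_lookup']
```

Under this reading, an orchestrator would refuse to run the strategy while `blocking` is non-empty and proceed with a warning when only `skippable` tables are absent.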

Column definitions

Each key is a column name; the value is a ColumnSpec:
| Field | Type | Description |
| --- | --- | --- |
| type | string | DuckDB type: VARCHAR, INT64, FLOAT64, BOOLEAN, TIMESTAMP |
| semantic | string | Semantic type linking to the knowledge graph (e.g., "ip_address", "identity_arn") |
| description | string | Column purpose |

The semantic value enables automatic field resolution via the graph: the resolver matches a column to data sources that expose the same semantic type.
requires:
  graph: true
  tables:
    auth_events:
      description: "Authentication events with geolocation"
      volume_hint: medium
      columns:
        identity_arn:
          type: VARCHAR
          semantic: identity_arn
        source_ip:
          type: VARCHAR
          semantic: ip_address
        timestamp:
          type: TIMESTAMP
          semantic: timestamp
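
The matching idea can be sketched in a few lines of Python. This is not the real resolver, which works against the knowledge graph: `match_columns` is a hypothetical helper, and the source field names in `SOURCE_FIELDS` are illustrative stand-ins for whatever a log source actually exposes.

```python
from __future__ import annotations

from typing import Any, Optional


def match_columns(
    columns: dict[str, dict[str, Any]], source_fields: dict[str, str]
) -> dict[str, Optional[str]]:
    """Map each contract column to the first source field with the same semantic type."""
    # Index source fields by their semantic type.
    by_semantic: dict[str, list[str]] = {}
    for field, semantic in source_fields.items():
        by_semantic.setdefault(semantic, []).append(field)

    matches: dict[str, Optional[str]] = {}
    for column, spec in columns.items():
        semantic = spec.get("semantic")
        candidates = by_semantic.get(semantic, []) if semantic else []
        # Columns without a semantic type, or without a match, resolve to None.
        matches[column] = candidates[0] if candidates else None
    return matches


# Columns from the auth_events table above, as a parsed dict.
COLUMNS = {
    "identity_arn": {"type": "VARCHAR", "semantic": "identity_arn"},
    "source_ip": {"type": "VARCHAR", "semantic": "ip_address"},
    "timestamp": {"type": "TIMESTAMP", "semantic": "timestamp"},
}

# Hypothetical source catalog: field name -> semantic type.
SOURCE_FIELDS = {
    "userIdentity.arn": "identity_arn",
    "sourceIPAddress": "ip_address",
    "eventTime": "timestamp",
}

print(match_columns(COLUMNS, SOURCE_FIELDS))
```

Taking the first candidate is a simplification; a real resolver would need a policy for sources that expose several fields with the same semantic type.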

discovery

Optional hints for orchestrators that auto-stage data:
| Field | Type | Description |
| --- | --- | --- |
| description | string | Prose explanation of how to stage data |
| mcp_hints | []object | Suggested MCP tools |

Each mcp_hint:

| Field | Type | Description |
| --- | --- | --- |
| tool | string | MCP tool name (e.g., "elasticsearch.search") |
| purpose | string | Why this tool is needed |
| stage_as | string | Table name to store results as |

discovery:
  description: "Fetch auth events from Elasticsearch, stage as auth_events"
  mcp_hints:
    - tool: "elasticsearch.search"
      purpose: "Fetch authentication events with geo data"
      stage_as: "auth_events"
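
One way an orchestrator might consume these hints is sketched below in Python. `staging_plan` is a hypothetical helper, not part of any documented API, and it assumes every hint carries both `tool` and `stage_as`. It lists the suggested tool calls and reports declared tables that no hint stages.

```python
from __future__ import annotations

from typing import Any


def staging_plan(contract: dict[str, Any]) -> tuple[list[tuple[str, str]], list[str]]:
    """Return (tool, stage_as) pairs plus declared tables that no hint stages."""
    tables = contract.get("requires", {}).get("tables", {})
    hints = contract.get("discovery", {}).get("mcp_hints", [])

    plan = [(hint["tool"], hint["stage_as"]) for hint in hints]
    staged = {stage_as for _, stage_as in plan}
    uncovered = [name for name in tables if name not in staged]
    return plan, uncovered


# A parsed contract fragment combining the requires and discovery examples above.
CONTRACT = {
    "requires": {"tables": {"auth_events": {"volume_hint": "medium"}}},
    "discovery": {
        "description": "Fetch auth events from Elasticsearch, stage as auth_events",
        "mcp_hints": [
            {
                "tool": "elasticsearch.search",
                "purpose": "Fetch authentication events with geo data",
                "stage_as": "auth_events",
            }
        ],
    },
}

plan, uncovered = staging_plan(CONTRACT)
print(plan, uncovered)
# → [('elasticsearch.search', 'auth_events')] []
```

Because discovery is optional guidance, a non-empty `uncovered` list is only a signal that the orchestrator must find another way to stage those tables, not a contract error.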

Validation

The following are enforced at parse time:
  • name must be non-empty
  • description must be non-empty
  • tags must contain at least one element
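
The three rules above can be expressed as a short Python check. This is a sketch, not the actual parser: `validate_contract` is a hypothetical function, and treating whitespace-only strings as empty is an assumption.

```python
from __future__ import annotations

from typing import Any


def validate_contract(contract: dict[str, Any]) -> list[str]:
    """Return a list of rule violations; an empty list means the contract passes."""
    errors: list[str] = []
    # Assumption: a missing key or a whitespace-only string counts as empty.
    if not str(contract.get("name") or "").strip():
        errors.append("name must be non-empty")
    if not str(contract.get("description") or "").strip():
        errors.append("description must be non-empty")
    tags = contract.get("tags")
    if not isinstance(tags, list) or len(tags) == 0:
        errors.append("tags must contain at least one element")
    return errors


print(validate_contract({"name": "", "description": "Example", "tags": []}))
# → ['name must be non-empty', 'tags must contain at least one element']
```

Collecting all violations rather than stopping at the first lets a strategy author fix a contract in one pass.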

Examples

Minimal contract (graph-only, no tables)

name: "correlate-ip-across-sources"
version: "1.0.0"
description: "Query the graph for all log sources with IP fields"
tags: [correlation, ip, multi-source]
params:
  ip:
    type: str
    required: true
    description: "IP address to investigate"
requires:
  graph: true
  sources: [CloudTrail, VPCFlowLogs, IdentitySystemLog]

Full contract with tables and discovery

name: "impossible-travel-detect"
version: "1.0.0"
description: "Detect impossible travel from authentication events"
tags: [detection, impossible-travel, geolocation]
params:
  identity_arn:
    type: str
    required: true
    description: "Identity ARN to analyze"
  max_speed_kmh:
    type: int
    required: false
    default: 900
    description: "Maximum plausible travel speed in km/h"
requires:
  graph: true
  tables:
    auth_events:
      description: "Authentication events with geolocation data"
      volume_hint: medium
      columns:
        identity_arn:
          type: VARCHAR
          semantic: identity_arn
        source_ip:
          type: VARCHAR
          semantic: ip_address
        latitude: { type: FLOAT64 }
        longitude: { type: FLOAT64 }
        timestamp:
          type: TIMESTAMP
          semantic: timestamp
  sources: [CloudTrail, IdentitySystemLog]
discovery:
  description: "Fetch auth events from Elasticsearch"
  mcp_hints:
    - tool: "elasticsearch.search"
      purpose: "Fetch authentication events with geo data"
      stage_as: "auth_events"

Native-fetch contract (no tables)

name: "credential-stealer-native"
version: "1.0.0"
description: "Self-contained credential stealer hunt (strategy fetches its own data)"
tags: [hunt, credential-theft, native]
params:
  days_back: { type: int, required: false, default: 30 }
  max_alerts: { type: int, required: false, default: 100 }
requires:
  graph: true
  duckdb: true
  # No tables — strategy fetches directly at runtime

Pinned-backend contract

name: "splunk-field-survey"
version: "1.0.0"
description: "Survey Splunk indexes and extracted fields per sourcetype"
tags: [enrichment, splunk, schema-discovery]
params:
  index_filter: { type: str, required: false, default: "*" }
pinned_backend: splunk
requires:
  graph: false
  tables:
    splunk_indexes:
      description: "Index metadata"
      columns:
        index: { type: VARCHAR }
        event_count: { type: VARCHAR }
        size_bytes: { type: VARCHAR }