Integration Recipes

For: a developer wrapping an external security tool (Falco, Suricata, Wazuh, osquery, anything that already emits events) as an InnerWarden collector.

InnerWarden can ingest events from any external tool by reading that tool's output and turning each line into an Event. You could hand-write that collector, but if the tool already does its own detection and only emits alerts, there is a faster way: write a recipe.

A recipe is a small TOML file that describes the integration precisely enough that a person, or an AI assistant, can generate a working collector from it without reading the tool's source. It captures the contract: how the tool emits data, what one event looks like, how its fields and severities map onto InnerWarden's, and how to pull out entities (the IP, the user, the container). The recipe is a spec; the collector is the implementation it drives.

This page documents the recipe format through one full worked example. For hand-writing a collector from scratch (and the Event rules a generated collector must follow), see Write a Module and Module Reference.

How a recipe becomes a collector

recipe.toml
    ├── describes ─► input mechanism (file_tail / subprocess_stdout / unix_socket / http_poll / grpc_stream)
    ├── describes ─► what one raw event looks like (field names, an example)
    ├── describes ─► field mappings (tool field -> Event field)
    ├── describes ─► severity mapping (tool's labels -> InnerWarden Severity)
    ├── describes ─► entity extraction (how to find the IP / user / container)
    └── describes ─► the config surface (what the operator sets)
              │
              ▼
   a person or an AI reads the recipe + the collector pattern
              │
              ▼
   FooCollector (Rust) + module.toml + docs/README.md
              │
              ▼
   innerwarden module validate ./modules/foo-integration

The recipe contains no Rust. It is the description; the collector author (or an AI assistant) writes the code from it, following the collector rules in Module Reference.

Generating a collector with an AI assistant

Give the assistant, in one prompt, the context it needs:

This page (the recipe format).
Module Reference (the collector rules: fail-open, async, Event field names).
Your recipe.toml for the tool.
The Event and EntityRef type definitions (field names are in Data Formats; the live source is in the repo).
A reference collector from the repo to match the house pattern (any existing subprocess- or file-tailing collector under crates/sensor/src/collectors/).

A prompt that works:

Using the recipe in recipe.toml and following the InnerWarden collector pattern in the reference file, generate the collector, modules/<id>-integration/module.toml, config/sensor.example.toml, and docs/README.md. The collector must be fail-open, async, send Event structs over the mpsc channel, and match every field and severity mapping in the recipe exactly.

The recipe is structured so that this prompt usually produces a correct first draft with little iteration.

Recipe format

A recipe lives at integrations/<tool>/recipe.toml. The sections below are a full Falco recipe, top to bottom.

`[recipe]` - metadata

[recipe]
id          = "falco"                   # recipe id and module prefix (<id>-integration)
name        = "Falco Runtime Security"
description = "Short description shown in the module listing"
tool        = "falco"
tool_url    = "https://falco.org"
tool_version_min = "0.36"               # oldest tested version
collector_id = "falco_log"              # sensor config key: [collectors.falco_log]
author      = ""
license     = "Apache-2.0"

`[input]` - how the tool emits data

[input]
mechanism       = "file_tail"           # see the mechanisms table below
path_config_key = "path"                # the config key holding the file path
path_default    = "/var/log/falco/falco.log"
format          = "jsonl"               # jsonl | text | binary
restart_on_eof  = true                  # re-tail after EOF (survives log rotation)
reconnect_secs  = 5                     # seconds to wait before re-opening the input

Mechanisms:

Value	What it is
`file_tail`	Tail a log file line by line.
`subprocess_stdout`	Spawn a subprocess and read its stdout.
`unix_socket`	Connect to a Unix domain socket (JSON stream).
`http_poll`	Poll an HTTP endpoint on an interval.
`grpc_stream`	Consume a gRPC streaming API.
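
To make the file_tail settings concrete, here is a minimal Python sketch of one polling step. It is not InnerWarden code: the real collector is async Rust, and the function name `read_new_lines` and the "file shrank, so it was rotated" heuristic are assumptions made for illustration.

```python
import os

def read_new_lines(path, offset):
    """Return (complete_lines, new_offset) for bytes appended since `offset`."""
    try:
        if os.path.getsize(path) < offset:
            # Shorter than what was already read: rotated or truncated, start over.
            offset = 0
        with open(path, "rb") as f:
            f.seek(offset)
            chunk = f.read()
    except OSError:
        # Fail-open: an unreadable file yields no events instead of an error.
        return [], offset
    # Consume only up to the last newline so a half-written line is retried.
    end = chunk.rfind(b"\n") + 1
    lines = [raw.decode("utf-8", "replace")
             for raw in chunk[:end].splitlines() if raw.strip()]
    return lines, offset + end
```

A caller would invoke this in a loop, parse each returned line as one JSON event, and sleep for reconnect_secs between polls. Comparing sizes misses a rotation where the new file is already longer than the old offset; tracking the inode as well would be more robust.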

`[event_schema]` - what one raw event looks like

Describe the fields. Use dot notation for nested ones.

[event_schema]
ts_field       = "time"                 # the timestamp field
ts_format      = "rfc3339"              # rfc3339 | unix_sec | unix_ms | strptime:<fmt>
summary_field  = "output"               # human-readable description
severity_field = "priority"             # the tool's severity label
kind_field     = "rule"                 # event type / rule name
tags_field     = "tags"                 # optional: array of string tags
extra_fields   = ["source", "output_fields"]   # additional fields -> Event.details
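
The ts_format values can be checked against a sample timestamp before any collector exists. The sketch below is one possible reading of the four formats in Python; `parse_ts` is a hypothetical helper, and treating unix timestamps as UTC is an assumption.

```python
import re
from datetime import datetime, timezone

RFC3339 = re.compile(
    r"(\d{4}-\d{2}-\d{2})[Tt ](\d{2}:\d{2}:\d{2})(?:\.(\d+))?([Zz]|[+-]\d{2}:\d{2})"
)

def parse_ts(value, ts_format):
    """Parse a raw timestamp according to the recipe's ts_format."""
    if ts_format == "unix_sec":
        return datetime.fromtimestamp(float(value), tz=timezone.utc)
    if ts_format == "unix_ms":
        return datetime.fromtimestamp(float(value) / 1000.0, tz=timezone.utc)
    if ts_format.startswith("strptime:"):
        return datetime.strptime(str(value), ts_format[len("strptime:"):])
    if ts_format == "rfc3339":
        m = RFC3339.fullmatch(str(value))
        if m is None:
            raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
        date, clock, frac, zone = m.groups()
        # Falco writes nanoseconds; Python datetimes hold microseconds, so
        # pad or truncate the fraction to exactly six digits.
        micros = ((frac or "") + "000000")[:6]
        zone = "+00:00" if zone in ("Z", "z") else zone
        return datetime.fromisoformat(f"{date}T{clock}.{micros}{zone}")
    raise ValueError(f"unknown ts_format: {ts_format!r}")
```

Running `parse_ts` on the `time` value of a captured event is a quick way to confirm that the ts_format you wrote matches what the tool really emits.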

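Include a real example from the tool's actual output. The sample below is wrapped for readability; with format = "jsonl" the tool writes each event on a single line: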

[event_schema.example]
raw = '''
{"output":"15:00:00 A shell was spawned in a container","priority":"Warning",
 "rule":"Terminal shell in container","source":"syscall",
 "tags":["container","shell","mitre_execution"],
 "time":"2026-03-15T15:00:00.000000000Z",
 "output_fields":{"container.id":"abc123def456","proc.name":"bash","user.name":"root"}}
'''
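
A quick consistency check between the schema and the example catches typos in field names early. The following Python sketch is an illustration only: `missing_schema_fields` is a hypothetical helper, it looks at top-level fields and does not follow dot notation, and it treats tags_field as optional because the recipe marks it so.

```python
import json

RAW = '''
{"output":"15:00:00 A shell was spawned in a container","priority":"Warning",
 "rule":"Terminal shell in container","source":"syscall",
 "tags":["container","shell","mitre_execution"],
 "time":"2026-03-15T15:00:00.000000000Z",
 "output_fields":{"container.id":"abc123def456","proc.name":"bash","user.name":"root"}}
'''

SCHEMA = {
    "ts_field": "time",
    "summary_field": "output",
    "severity_field": "priority",
    "kind_field": "rule",
    "tags_field": "tags",
    "extra_fields": ["source", "output_fields"],
}

def missing_schema_fields(schema, event):
    """List schema-named fields that the example event does not contain."""
    names = [schema[key] for key in
             ("ts_field", "summary_field", "severity_field", "kind_field")]
    names += schema.get("extra_fields", [])
    return [name for name in names if name not in event]

event = json.loads(RAW)  # JSON allows newlines between tokens, so the wrapped sample parses
```

For the Falco schema and example on this page, `missing_schema_fields(SCHEMA, event)` returns an empty list.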

`[severity_map]` - translate the tool's severities

[severity_map]
"Emergency"     = "critical"
"Alert"         = "critical"
"Critical"      = "critical"
"Error"         = "high"
"Warning"       = "high"
"Notice"        = "medium"
"Informational" = "low"
"Debug"         = "debug"

Valid InnerWarden severities are debug, info, low, medium, high, critical. Map every label the tool emits; anything unmapped defaults to info.
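
The lookup rule described above, including the fallback to info, can be sketched in a few lines of Python. The helper names are hypothetical, and exact, case-sensitive label matching is an assumption; the page does not say how case is handled.

```python
VALID_SEVERITIES = ("debug", "info", "low", "medium", "high", "critical")

SEVERITY_MAP = {
    "Emergency": "critical",
    "Alert": "critical",
    "Critical": "critical",
    "Error": "high",
    "Warning": "high",
    "Notice": "medium",
    "Informational": "low",
    "Debug": "debug",
}

def map_severity(label, severity_map=SEVERITY_MAP):
    """Translate a tool label; anything unmapped falls back to "info"."""
    return severity_map.get(label, "info")

def invalid_targets(severity_map):
    """Return mapped values that are not valid InnerWarden severities."""
    return sorted({v for v in severity_map.values() if v not in VALID_SEVERITIES})
```

Here `map_severity("Warning")` gives `"high"` and an unknown label such as `"Trace"` gives `"info"`. Running `invalid_targets` over a new recipe's map is a cheap way to spot a misspelled severity before review.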

`[kind_format]` - build the `Event.kind` string

[kind_format]
# Tokens: {source} {rule} {rule_slug} {priority}
# rule_slug = rule, lowercased, with spaces and special chars turned into underscores.
template = "falco.{rule_slug}"

Result for the example event above: falco.terminal_shell_in_container. A Wazuh recipe with the template wazuh.{rule_slug} would produce kinds such as wazuh.sshd_brute_force.
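
The slug rule and the template can be tried out with a short Python sketch. It follows the comment in the recipe literally: every character outside a-z and 0-9 becomes one underscore. Whether runs of underscores are collapsed, and how non-ASCII letters are treated, is not specified on this page, so both are assumptions here; `rule_slug` and `render_kind` are hypothetical names.

```python
import re

def rule_slug(rule):
    """Lowercase the rule name and replace every other character with "_"."""
    return re.sub(r"[^a-z0-9]", "_", rule.lower())

def render_kind(template, event):
    """Fill the {source} {rule} {rule_slug} {priority} tokens of a kind template."""
    rule = event.get("rule", "")
    tokens = {
        "source": event.get("source", ""),
        "rule": rule,
        "rule_slug": rule_slug(rule),
        "priority": event.get("priority", ""),
    }
    return template.format_map(tokens)
```

With the example event, `render_kind("falco.{rule_slug}", event)` produces `falco.terminal_shell_in_container`.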

`[[entity_extraction.rules]]` - pull out the entities

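Entities are what the correlation engine pivots on. Add a rule for every entity field the tool can emit, and mark a rule optional when you are unsure of the field or it is not present in every event. In the Falco example, the keys inside output_fields are themselves dotted (for example container.id), so a path such as output_fields.container.id has to resolve to the literal key container.id under output_fields.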

[[entity_extraction.rules]]
field       = "output_fields.fd.sip"    # dot-path into the raw event
entity_type = "ip"                       # ip | user | container | path | service
optional    = true                       # if true, skip silently when absent
transform   = "none"                     # none | short_id_12 | trim_whitespace

[[entity_extraction.rules]]
field       = "output_fields.user.name"
entity_type = "user"
optional    = true

[[entity_extraction.rules]]
field       = "output_fields.container.id"
entity_type = "container"
optional    = true
transform   = "short_id_12"
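
The rules above can be exercised against a sample event with a small Python sketch. Everything here is illustrative: `resolve` tries the longest literal key first so that dotted Falco keys like container.id are found, `short_id_12` is assumed to mean "first 12 characters", and reporting missing required fields in a list (instead of raising) is an assumption chosen to stay fail-open.

```python
def resolve(obj, path):
    """Walk a dot-path, preferring the longest literal key at each level."""
    parts = path.split(".")

    def walk(node, i):
        if i == len(parts):
            return node
        if not isinstance(node, dict):
            return None
        for j in range(len(parts), i, -1):
            key = ".".join(parts[i:j])
            if key in node:
                found = walk(node[key], j)
                if found is not None:
                    return found
        return None

    return walk(obj, 0)

TRANSFORMS = {
    "none": lambda v: v,
    "short_id_12": lambda v: v[:12],       # assumed: keep the first 12 characters
    "trim_whitespace": lambda v: v.strip(),
}

def extract_entities(rules, event):
    """Return (entities, missing_required_fields) for one raw event."""
    entities, missing = [], []
    for rule in rules:
        value = resolve(event, rule["field"])
        if value is None:
            if not rule.get("optional", False):
                missing.append(rule["field"])
            continue
        transform = TRANSFORMS[rule.get("transform", "none")]
        entities.append((rule["entity_type"], transform(str(value))))
    return entities, missing

RULES = [
    {"field": "output_fields.fd.sip", "entity_type": "ip", "optional": True},
    {"field": "output_fields.user.name", "entity_type": "user", "optional": True},
    {"field": "output_fields.container.id", "entity_type": "container",
     "optional": True, "transform": "short_id_12"},
]
```

Applied to the example event, which has no fd.sip field, this yields a user entity `root` and a container entity `abc123def456`, and the optional ip rule is skipped silently.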

`[tags]` - event tags

[tags]
static             = ["falco", "kernel", "ebpf"]   # always present
dynamic_from_event = true                          # also copy tags from tags_field
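
One way to read these two settings is "static tags first, then the event's own tags". The Python sketch below shows that reading; the name `build_tags`, the ordering, and the removal of duplicates are assumptions, not documented behavior.

```python
def build_tags(static, dynamic_from_event, event, tags_field="tags"):
    """Combine the recipe's static tags with tags copied from the raw event."""
    tags = list(static)
    if dynamic_from_event:
        value = event.get(tags_field)
        if isinstance(value, list):
            # Ignore anything in the array that is not a string.
            tags.extend(t for t in value if isinstance(t, str))
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(tags))
```

For the example event this gives `["falco", "kernel", "ebpf", "container", "shell", "mitre_execution"]`.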

`[[config_schema.fields]]` - what the operator configures

[[config_schema.fields]]
key         = "path"
type        = "string"
required    = false
default     = "/var/log/falco/falco.log"
description = "Path to the Falco JSON log file"

[[config_schema.fields]]
key         = "enabled"
type        = "bool"
required    = false
default     = "false"
description = "Enable the Falco collector"

These map to keys under [collectors.<collector_id>] in the sensor config. The TOML reference is owned by Configuration.
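
As a sketch of how a generated collector might combine these defaults with what the operator wrote under [collectors.falco_log], consider the Python below. The helper names are hypothetical, and coercing the string default "false" into a boolean for a field of type bool is an assumption about how defaults are interpreted.

```python
FIELDS = [
    {"key": "path", "type": "string", "required": False,
     "default": "/var/log/falco/falco.log"},
    {"key": "enabled", "type": "bool", "required": False, "default": "false"},
]

def coerce(value, type_name):
    """Turn a recipe default into a typed value (assumed rule, see above)."""
    if type_name == "bool":
        if isinstance(value, bool):
            return value
        return str(value).strip().lower() == "true"
    if type_name == "string":
        return str(value)
    return value

def effective_config(fields, operator):
    """Operator values win; otherwise fall back to the recipe default."""
    config = {}
    for field in fields:
        key = field["key"]
        if key in operator:
            config[key] = operator[key]
        elif "default" in field:
            config[key] = coerce(field["default"], field["type"])
        elif field.get("required", False):
            raise ValueError(f"missing required config key: {key}")
    return config
```

With an operator section that only sets `enabled = true`, the result is the default path plus `enabled` set to `True`; with an empty section the collector stays disabled.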

`[[prerequisites]]` - what must exist first

[[prerequisites]]
kind   = "binary_exists"
value  = "/usr/bin/falco"
reason = "Falco must be installed"

[[prerequisites]]
kind   = "file_readable"
value  = "{config.path}"
reason = "Falco's log file must be readable"
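
The two prerequisite kinds shown above, together with the {config.path} placeholder, could be evaluated roughly as in this Python sketch. The function names are hypothetical, only the two kinds from this page are handled, and treating an unknown kind as a failure is an assumption.

```python
import os
import re

def expand(value, config):
    """Replace {config.<key>} placeholders with values from the operator config."""
    return re.sub(r"\{config\.(\w+)\}",
                  lambda m: str(config.get(m.group(1), m.group(0))), value)

def failed_prerequisites(prerequisites, config):
    """Return one human-readable message per prerequisite that is not met."""
    failures = []
    for item in prerequisites:
        target = expand(item["value"], config)
        if item["kind"] == "binary_exists":
            ok = os.path.isfile(target) and os.access(target, os.X_OK)
        elif item["kind"] == "file_readable":
            ok = os.path.isfile(target) and os.access(target, os.R_OK)
        else:
            ok = False  # unknown kind: report it instead of guessing
        if not ok:
            failures.append(f"{item['reason']} ({item['kind']}: {target})")
    return failures
```

An empty list means every prerequisite passed; otherwise each entry carries the recipe's own reason text, which is what an installer would show to the operator.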

`[setup_notes]` - instructions shown at install time

[setup_notes]
required_tool_config = """
Add to /etc/falco/falco.yaml:
  json_output: true
  json_include_output_property: true
  file_output:
    enabled: true
    keep_alive: false
    filename: /var/log/falco/falco.log
"""

`[module_manifest]` - how this becomes a module

[module_manifest]
module_id            = "falco-integration"
module_tier          = "open"
incident_passthrough = true

When incident_passthrough = true, the generated collector (or a thin agent-side shim) promotes high and critical events straight to incidents without running them through an InnerWarden detector. That is the right choice for a tool that already does its own detection (Falco, Wazuh, Suricata): the tool found the threat, so re-detecting it adds nothing. A tool that emits raw telemetry, not verdicts, should leave this false and let an InnerWarden detector decide.
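
The promotion rule in the paragraph above reduces to a single predicate. The Python sketch below states it explicitly; `promote_to_incident` is a hypothetical name, and the severity passed in is the InnerWarden severity after the severity_map has been applied.

```python
PROMOTED_SEVERITIES = ("high", "critical")

def promote_to_incident(incident_passthrough, severity):
    """True when an event should become an incident without an InnerWarden detector."""
    return bool(incident_passthrough) and severity in PROMOTED_SEVERITIES
```

With the Falco severity map on this page, a Warning event maps to high and is promoted, while a Notice event maps to medium and is left for normal processing.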

Contributing a recipe

You do not need to write any Rust to contribute a recipe. A recipe alone is valuable: it lets anyone (or an AI) generate the collector later.

Create integrations/<tool>/recipe.toml in the format above.
Include a real [event_schema.example] from the tool's actual output.
Map every severity label the tool emits.
Cover every entity-extraction path you know about; mark unknown ones optional = true.
Open a PR. Recipes-without-code are low-friction contributions.

If you go further and generate the collector too, validate and test it before the PR:

innerwarden module validate ./modules/<id>-integration
make test

The command surface is owned by CLI Reference; the collector rules a generated collector must satisfy are in Module Reference. After merge, maintainers publish the module so others can run innerwarden module install <id>-integration.

Bundled recipes: there are currently no external-tool recipes shipped in the box. The integrations/ directory ships only a README describing this format; you author your own recipe from the format on this page.
