Integration Recipes
For: a developer wrapping an external security tool (Falco, Suricata, Wazuh, osquery, anything that already emits events) as an InnerWarden collector.
InnerWarden can ingest events from any external tool by reading that tool's output and turning each line into an Event. You could hand-write that collector. But if the tool already does its own detection and just spits out alerts, there is a faster way: write a recipe.
A recipe is a small TOML file that describes the integration precisely enough that a person, or an AI assistant, can generate a working collector from it without reading the tool's source. It captures the contract: how the tool emits data, what one event looks like, how its fields and severities map onto InnerWarden's, and how to pull out entities (the IP, the user, the container). The recipe is a spec; the collector is the implementation it drives.
This page is the recipe format with one full worked example. For hand-writing a collector from scratch (and the Event rules a generated collector must follow), see Write a Module and Module Reference.
recipe.toml
├── describes ─► input mechanism (file_tail / subprocess / unix_socket / http / grpc)
├── describes ─► what one raw event looks like (field names, an example)
├── describes ─► field mappings (tool field -> Event field)
├── describes ─► severity mapping (tool's labels -> InnerWarden Severity)
├── describes ─► entity extraction (how to find the IP / user / container)
└── describes ─► the config surface (what the operator sets)
│
▼
a person or an AI reads the recipe + the collector pattern
│
▼
FooCollector (Rust) + module.toml + docs/README.md
│
▼
innerwarden module validate ./modules/foo-integration
The recipe contains no Rust. It is the description; the collector author (or an AI assistant) writes the code from it, following the collector rules in Module Reference.
Give the assistant, in one prompt, the context it needs:
- This page (the recipe format).
- Module Reference (the collector rules: fail-open, async, `Event` field names).
- Your `recipe.toml` for the tool.
- The `Event` and `EntityRef` type definitions (field names are in Data Formats; the live source is in the repo).
- A reference collector from the repo to match the house pattern (any existing subprocess- or file-tailing collector under `crates/sensor/src/collectors/`).
A prompt that works:
Using the recipe in `recipe.toml` and following the InnerWarden collector pattern in the reference file, generate the collector, `modules/<id>-integration/module.toml`, `config/sensor.example.toml`, and `docs/README.md`. The collector must be fail-open, async, send `Event` structs over the mpsc channel, and match every field and severity mapping in the recipe exactly.
The recipe is structured so that prompt produces a correct first draft with little iteration.
A recipe lives at integrations/<tool>/recipe.toml. The sections below are a full Falco recipe, top to bottom.
```toml
[recipe]
id               = "falco"             # used as collector_id and module prefix
name             = "Falco Runtime Security"
description      = "Short description shown in the module listing"
tool             = "falco"
tool_url         = "https://falco.org"
tool_version_min = "0.36"              # oldest tested version
collector_id     = "falco_log"         # sensor config key: [collectors.falco_log]
author           = ""
license          = "Apache-2.0"
```

```toml
[input]
mechanism       = "file_tail"          # see the mechanisms table below
path_config_key = "path"               # the config key holding the file path
path_default    = "/var/log/falco/falco.log"
format          = "jsonl"              # jsonl | text | binary
restart_on_eof  = true                 # re-tail after EOF (survives log rotation)
reconnect_secs  = 5
```

Mechanisms:
| Value | What it is |
|---|---|
| `file_tail` | Tail a log file line by line. |
| `subprocess_stdout` | Spawn a subprocess and read its stdout. |
| `unix_socket` | Connect to a Unix domain socket (JSON stream). |
| `http_poll` | Poll an HTTP endpoint on an interval. |
| `grpc_stream` | Consume a gRPC streaming API. |
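A generated collector has to turn the `mechanism` string into a code path. As a minimal sketch, assuming a hypothetical `Mechanism` enum (the type name and the `String` error are illustrative, not taken from the InnerWarden source), the five values in the table could be parsed like this:

```rust
use std::str::FromStr;

/// Hypothetical enum mirroring the `mechanism` values in the table above.
#[derive(Debug, PartialEq, Eq)]
enum Mechanism {
    FileTail,
    SubprocessStdout,
    UnixSocket,
    HttpPoll,
    GrpcStream,
}

impl FromStr for Mechanism {
    type Err = String;

    /// Accept only the exact strings the recipe format documents.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "file_tail" => Ok(Mechanism::FileTail),
            "subprocess_stdout" => Ok(Mechanism::SubprocessStdout),
            "unix_socket" => Ok(Mechanism::UnixSocket),
            "http_poll" => Ok(Mechanism::HttpPoll),
            "grpc_stream" => Ok(Mechanism::GrpcStream),
            other => Err(format!("unknown input mechanism: {other}")),
        }
    }
}
```

With this in place, `"file_tail".parse::<Mechanism>()` yields `Ok(Mechanism::FileTail)`, and a typo in the recipe surfaces as an error when the recipe is read rather than when the collector starts.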
Describe the fields. Use dot notation for nested ones.
```toml
[event_schema]
ts_field       = "time"                        # the timestamp field
ts_format      = "rfc3339"                     # rfc3339 | unix_sec | unix_ms | strptime:<fmt>
summary_field  = "output"                      # human-readable description
severity_field = "priority"                    # the tool's severity label
kind_field     = "rule"                        # event type / rule name
tags_field     = "tags"                        # optional: array of string tags
extra_fields   = ["source", "output_fields"]   # additional fields -> Event.details
```

Include a real example from the tool's actual output:
```toml
[event_schema.example]
raw = '''
{"output":"15:00:00 A shell was spawned in a container","priority":"Warning",
"rule":"Terminal shell in container","source":"syscall",
"tags":["container","shell","mitre_execution"],
"time":"2026-03-15T15:00:00.000000000Z",
"output_fields":{"container.id":"abc123def456","proc.name":"bash","user.name":"root"}}
'''
```

```toml
[severity_map]
"Emergency"     = "critical"
"Alert"         = "critical"
"Critical"      = "critical"
"Error"         = "high"
"Warning"       = "high"
"Notice"        = "medium"
"Informational" = "low"
"Debug"         = "debug"
```

Valid InnerWarden severities are `debug`, `info`, `low`, `medium`, `high`, `critical`. Map every label the tool emits; anything unmapped defaults to `info`.
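In a generated collector the severity map typically becomes a single lookup with an `info` fallback. The sketch below assumes a hypothetical `Severity` enum standing in for InnerWarden's real type (whose definition is not reproduced on this page) and assumes labels are matched case-sensitively:

```rust
/// Hypothetical stand-in for InnerWarden's severity type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Debug,
    Info,
    Low,
    Medium,
    High,
    Critical,
}

/// Map a Falco `priority` label using the `[severity_map]` table.
/// Any label the table does not list falls back to `Info`.
fn map_falco_priority(label: &str) -> Severity {
    match label {
        "Emergency" | "Alert" | "Critical" => Severity::Critical,
        "Error" | "Warning" => Severity::High,
        "Notice" => Severity::Medium,
        "Informational" => Severity::Low,
        "Debug" => Severity::Debug,
        _ => Severity::Info, // unmapped labels default to info
    }
}
```

The wildcard arm is the reason to map every label explicitly: a label that is missing from the recipe is silently reported as `info`, whatever the tool meant by it.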
```toml
[kind_format]
# Tokens: {source} {rule} {rule_slug} {priority}
# rule_slug = rule, lowercased, with spaces and special chars turned into underscores.
template = "falco.{rule_slug}"
```

Result: `falco.terminal_shell_in_container` for the example event; an analogous Wazuh recipe would yield `wazuh.sshd_brute_force`, and so on.
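The template expansion is simple enough to sketch. The function names below are hypothetical, and two details are assumptions because the recipe format does not spell them out: "special chars" is read as anything that is not an ASCII letter or digit, and consecutive underscores are not collapsed.

```rust
/// Lowercase the rule name and turn every character that is not an
/// ASCII letter or digit into an underscore (assumed reading of "special chars").
fn rule_slug(rule: &str) -> String {
    rule.chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() {
                c.to_ascii_lowercase()
            } else {
                '_'
            }
        })
        .collect()
}

/// Expand the four documented tokens in a `[kind_format]` template.
fn render_kind(template: &str, source: &str, rule: &str, priority: &str) -> String {
    template
        .replace("{rule_slug}", &rule_slug(rule))
        .replace("{rule}", rule)
        .replace("{source}", source)
        .replace("{priority}", priority)
}
```

Calling `render_kind("falco.{rule_slug}", "syscall", "Terminal shell in container", "Warning")` on the example event returns `falco.terminal_shell_in_container`.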
Entities are what the correlation engine pivots on. Map every one you can find; mark the rest optional.
```toml
[[entity_extraction.rules]]
field       = "output_fields.fd.sip"     # dot-path into the raw event
entity_type = "ip"                       # ip | user | container | path | service
optional    = true                       # if true, skip silently when absent
transform   = "none"                     # none | short_id_12 | trim_whitespace

[[entity_extraction.rules]]
field       = "output_fields.user.name"
entity_type = "user"
optional    = true

[[entity_extraction.rules]]
field       = "output_fields.container.id"
entity_type = "container"
optional    = true
transform   = "short_id_12"
```

```toml
[tags]
static             = ["falco", "kernel", "ebpf"]   # always present
dynamic_from_event = true                          # also copy tags from tags_field
```

```toml
[[config_schema.fields]]
key         = "path"
type        = "string"
required    = false
default     = "/var/log/falco/falco.log"
description = "Path to the Falco JSON log file"

[[config_schema.fields]]
key         = "enabled"
type        = "bool"
required    = false
default     = "false"
description = "Enable the Falco collector"
```

These map to keys under `[collectors.<collector_id>]` in the sensor config. The TOML reference is owned by Configuration.
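Returning to the entity-extraction rules above: one subtlety is that Falco's `output_fields` keys contain dots themselves (`"container.id"`), so a dot-path such as `output_fields.container.id` cannot be resolved by naively splitting on every dot. The sketch below is one possible approach, not InnerWarden's actual implementation. Everything in it is hypothetical: the `Value` enum is a minimal stand-in for a parsed JSON value, the longest-key-first lookup is an assumed resolution strategy, `short_id_12` is assumed to mean "first 12 characters", and returning an error for a missing non-optional field is also an assumption.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a parsed JSON value (a real collector would use a JSON library).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Obj(BTreeMap<String, Value>),
}

/// Resolve a dot-path, trying the longest literal key first so that
/// keys containing dots ("container.id") still match.
fn lookup<'a>(value: &'a Value, path: &str) -> Option<&'a Value> {
    if path.is_empty() {
        return Some(value);
    }
    let Value::Obj(map) = value else { return None };
    if let Some(v) = map.get(path) {
        return Some(v); // the whole remaining path is a literal key
    }
    let mut end = path.len();
    while let Some(idx) = path[..end].rfind('.') {
        let (head, tail) = (&path[..idx], &path[idx + 1..]);
        if let Some(found) = map.get(head).and_then(|child| lookup(child, tail)) {
            return Some(found);
        }
        end = idx; // try a shorter prefix next
    }
    None
}

/// Apply one of the documented transforms (semantics assumed from the names).
fn apply_transform(raw: &str, transform: &str) -> String {
    match transform {
        "short_id_12" => raw.chars().take(12).collect(),
        "trim_whitespace" => raw.trim().to_string(),
        _ => raw.to_string(), // "none" or unset
    }
}

/// One `[[entity_extraction.rules]]` entry.
struct Rule {
    field: &'static str,
    entity_type: &'static str,
    optional: bool,
    transform: &'static str,
}

/// Return (entity_type, value) pairs; optional rules are skipped silently when absent.
fn extract(event: &Value, rules: &[Rule]) -> Result<Vec<(String, String)>, String> {
    let mut out = Vec::new();
    for rule in rules {
        match lookup(event, rule.field) {
            Some(Value::Str(s)) => {
                out.push((rule.entity_type.to_string(), apply_transform(s, rule.transform)))
            }
            _ if rule.optional => {}
            _ => return Err(format!("required field missing: {}", rule.field)),
        }
    }
    Ok(out)
}
```

Run against the example event with the three Falco rules, this yields a `user` entity (`root`) and a `container` entity (`abc123def456`); the `ip` rule is skipped because `fd.sip` is absent and the rule is optional.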
```toml
[[prerequisites]]
kind   = "binary_exists"
value  = "/usr/bin/falco"
reason = "Falco must be installed"

[[prerequisites]]
kind   = "file_readable"
value  = "{config.path}"
reason = "Falco's log file must be readable"
```

```toml
[setup_notes]
required_tool_config = """
Add to /etc/falco/falco.yaml:
  json_output: true
  json_include_output_property: true
  file_output:
    enabled: true
    keep_alive: false
    filename: /var/log/falco/falco.log
"""
```

```toml
[module_manifest]
module_id            = "falco-integration"
module_tier          = "open"
incident_passthrough = true
```

When `incident_passthrough = true`, the generated collector (or a thin agent-side shim) promotes `high` and `critical` events straight to incidents without running them through an InnerWarden detector. That is the right choice for a tool that already does its own detection (Falco, Wazuh, Suricata): the tool found the threat, so re-detecting it adds nothing. A tool that emits raw telemetry, not verdicts, should leave this `false` and let an InnerWarden detector decide.
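The promotion rule reduces to a threshold check. As a hedged sketch, with a hypothetical `Severity` enum ordered from least to most severe and a hypothetical function name (neither is taken from the InnerWarden source):

```rust
/// Hypothetical stand-in for InnerWarden's severity type.
/// Variant order matters: the derived `Ord` follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Debug,
    Info,
    Low,
    Medium,
    High,
    Critical,
}

/// Promote an event straight to an incident only when passthrough is
/// enabled and the mapped severity is `High` or `Critical`.
fn should_promote(incident_passthrough: bool, severity: Severity) -> bool {
    incident_passthrough && severity >= Severity::High
}
```

Under this reading, a Falco `Warning` (mapped to `high`) becomes an incident immediately, while a `Notice` (mapped to `medium`) stays an ordinary event even with passthrough on.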
You do not need to write any Rust to contribute a recipe. A recipe alone is valuable: it lets anyone (or an AI) generate the collector later.
- Create `integrations/<tool>/recipe.toml` in the format above.
- Include a real `[event_schema.example]` from the tool's actual output.
- Map every severity label the tool emits.
- Cover every entity-extraction path you know about; mark unknown ones `optional = true`.
- Open a PR. Recipes-without-code are low-friction contributions.
If you go further and generate the collector too, validate and test it before the PR:
```
innerwarden module validate ./modules/<id>-integration
make test
```

The command surface is owned by CLI Reference; the collector rules a generated collector must satisfy are in Module Reference. After merge, maintainers publish the module so others can `innerwarden module install <id>-integration`.
Bundled recipes: there are currently no external-tool recipes shipped in the box. The `integrations/` directory ships only a README describing this format; you author your own recipe from the format on this page.
- Hand-writing a collector and the `Event` rules it must follow: Module Reference
- The ten-minute first-module path: Write a Module
- What the entities and severities mean across the product: Data Formats
- Collector config keys: Configuration