Implements first version of modular capability system #15
This adds a first implementation of a modular capability system as well as the corresponding dependency injection / parameter resolution.
This means that if you define an `@experiment` class (and it is included somewhere from the `main.py` script), all parameters of its `__init__` method are filled from parameters (command line arguments, the `.env` file, environment variables, or Python function default values). This includes recursively building parameters that expose a `__parameters__` field (such as those created by the `@capability` annotation on a class).

As an example, the existing `wintermute.py` script has been re-implemented and can now be executed from `main.py` with the experiment names `linux_privesc_gpt35turbo`, `linux_privesc_gpt4` and `linux_privesc_gpt4turbo` (which, as their names indicate, use different versions of the GPT API).

To see which parameters are available, you can call it as e.g. `python3 main.py linux_privesc_gpt4turbo -h` and get the following list of parameters:

While the help output probably needs to be improved (not all of these parameters are mandatory, and some are not even sensible to change without knowing what you are doing), it shows everything that you can configure in this experiment.
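To make the resolution step more concrete, here is a minimal sketch of how `__init__` parameters could be filled from several sources. It is not the code in this PR: `resolve_init_kwargs` and `DemoExperiment` are hypothetical names, the sources are passed in as plain dictionaries, and the precedence order (command line, then `.env` file, then environment variables, then defaults) is an assumption made for illustration.

```python
import inspect
from collections import ChainMap


def resolve_init_kwargs(cls, cli_args, env_file, environ):
    """Fill each __init__ parameter of cls from the first source that defines it.

    Assumed precedence (not confirmed by the PR):
    command line > .env file > environment variables > default value.
    """
    sources = ChainMap(cli_args, env_file, environ)
    kwargs = {}
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self":
            continue
        if name in sources:
            value = sources[name]
            # All sources deliver strings, so convert via the annotation
            # when it is a plain type other than str (e.g. int).
            if isinstance(param.annotation, type) and param.annotation is not str:
                value = param.annotation(value)
            kwargs[name] = value
        elif param.default is not inspect.Parameter.empty:
            kwargs[name] = param.default
        else:
            raise ValueError(f"missing required parameter: {name}")
    return kwargs


class DemoExperiment:
    """Hypothetical experiment used only to demonstrate the lookup."""

    def __init__(self, tag: str, max_rounds: int = 10, model: str = "demo-model"):
        self.tag = tag
        self.max_rounds = max_rounds
        self.model = model


kwargs = resolve_init_kwargs(
    DemoExperiment,
    cli_args={"max_rounds": "3"},
    env_file={"tag": "from-dotenv"},
    environ={"tag": "from-env"},
)
experiment = DemoExperiment(**kwargs)
```

In this sketch `tag` comes from the `.env` dictionary because it is consulted before the environment, `max_rounds` is converted from the command line string, and `model` falls back to its default.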
This has been automatically generated from the following dependencies / options (`@dataclass` has been used here to automatically put all fields into an `__init__` function and have them properly assigned):

Comparing the two, you can also see how the parameter names were built. All that start with e.g. `llm.` are parameters of the GPT35Turbo capability.

If you now, for example, set `ssh.password` and `ssh.username` in the environment variables, and `llm.api_key` and `log_db.connection_string` in your `.env` file, you can execute the command using

This PR currently contains quite a bit of duplicated code, which has been left in on purpose to ease the comparison between the old and new version, and to allow better regression testing.
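The dotted parameter names described above (such as `llm.api_key` or `ssh.username`) can be illustrated with a small sketch that walks nested dataclasses. This is only an approximation under stated assumptions: it recurses on dataclass field types instead of the `__parameters__` field used in this PR, and `DemoLLM`, `DemoSSH`, `DemoPrivesc`, `parameter_names` and `build` are hypothetical names.

```python
from dataclasses import MISSING, dataclass, fields, is_dataclass


@dataclass
class DemoLLM:  # hypothetical stand-in for an LLM capability
    api_key: str
    model: str = "demo-model"


@dataclass
class DemoSSH:  # hypothetical stand-in for an SSH connection
    username: str
    password: str


@dataclass
class DemoPrivesc:  # hypothetical experiment composed of the two above
    llm: DemoLLM
    ssh: DemoSSH
    max_rounds: int = 10


def parameter_names(cls, prefix=""):
    """List flat parameter names, prefixing nested fields with 'field.'."""
    names = []
    for f in fields(cls):
        if is_dataclass(f.type):
            names.extend(parameter_names(f.type, f"{prefix}{f.name}."))
        else:
            names.append(f"{prefix}{f.name}")
    return names


def build(cls, values, prefix=""):
    """Recursively construct cls from a flat dict keyed by dotted names."""
    kwargs = {}
    for f in fields(cls):
        key = f"{prefix}{f.name}"
        if is_dataclass(f.type):
            kwargs[f.name] = build(f.type, values, key + ".")
        elif key in values:
            kwargs[f.name] = values[key]
        elif f.default is MISSING and f.default_factory is MISSING:
            raise ValueError(f"missing required parameter: {key}")
        # Otherwise the dataclass default is used.
    return cls(**kwargs)


names = parameter_names(DemoPrivesc)
experiment = build(
    DemoPrivesc,
    {"llm.api_key": "placeholder", "ssh.username": "user", "ssh.password": "secret"},
)
```

Here `names` contains one entry per configurable leaf value, and every entry that starts with `llm.` belongs to the nested `DemoLLM` object, mirroring how the prefixes in the help output map to capabilities.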