Commit 9983243

Merge pull request #1 from ipa-lab/v2
update main branch
2 parents c54a608 + 2d090b9 commit 9983243

15 files changed

+245
-121
lines changed

‎.env.example‎

Lines changed: 2 additions & 1 deletion
@@ -1,5 +1,6 @@
 OPENAI_KEY="your-openai-key"
-MODEL="gpt-3.5-turbo"
+MODEL="gpt-4"
+CONTEXT_SIZE=7000
 
 # exchange with the IP of your target VM
 TARGET_IP='enter-the-private-ip-of-some-vm.local'

‎README.md‎

Lines changed: 20 additions & 4 deletions
@@ -4,7 +4,7 @@
 
 This is a small python script that I use to prototype some potential use-cases when integrating large language models, such as GPT-3, with security-related tasks.
 
-What is it doing? More or less it creates an SSH connection to a configured virtual machine (I am using vulnerable VMs for that on purpose) and then asks GPT-3 to find security vulnerabilities (which it often executes). It evokes a bit of an eerie feeling in me.
+What is it doing? More or less it creates an SSH connection to a configured virtual machine (I am using vulnerable VMs for that on purpose) and then asks LLMs such as GPT-3.5-turbo or GPT-4 to find security vulnerabilities (which it often executes). It evokes a bit of an eerie feeling in me.
 
 ### Vision Paper
 

@@ -29,7 +29,23 @@ series = {ESEC/FSE 2023}
 }
 ~~~
 
-# Example run
+# Example runs
+
+## updated version using GPT-4
+
+This happened during a recent run:
+
+![Example wintermute run](example_run_gpt4.png)
+
+Some things to note:
+
+- the panel labeled 'my new fact list' is generated by the LLM. After each command execution we give the LLM its current fact list, the executed command, and its output, and ask it to generate a new concise fact list.
+- the table contains all executed commands. The columns 'success?' and 'reason' are populated by asking the LLM whether the executed command (and its output) helps with getting root access, and to reason about the command's output.
+- at the bottom you see the last executed command (`/tmp/bash -p`) and its output.
+
+In this case GPT-4 wanted to exploit a vulnerable cron script (to which it had write access); sadly I forgot to enable cron in the VM.
+
+## initial version (tagged as fse23-ivr) using gpt-3.5-turbo
 
 This happened during a recent run:
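The fact-list update described in the first bullet above can be sketched as a small helper. This is only an illustration of the mechanism, not the committed code; `ask_llm` and the prompt wording are hypothetical stand-ins (the real prompts live in `templates/`):

```python
# Sketch of the fact-list cycle: after each command, the LLM gets its
# previous fact list plus the new command/output and returns a condensed
# fact list. `ask_llm` is a hypothetical stand-in for the real LLM call.
def update_fact_list(ask_llm, facts, cmd, output):
    prompt = (
        "Known facts:\n" + "\n".join(facts) + "\n\n"
        + "Executed command: " + cmd + "\n"
        + "Output: " + output + "\n\n"
        + "Return a new concise fact list, one fact per line."
    )
    return ask_llm(prompt).splitlines()

# stubbed LLM for illustration; the fact strings are made up
facts = update_fact_list(
    lambda p: "user www-data has sudo rights\ncron is disabled",
    ["target is Linux"],
    "sudo -l",
    "(ALL) NOPASSWD: ALL",
)
```

Keeping the fact list concise is what bounds prompt growth over long runs.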

@@ -50,9 +66,9 @@ So, what is actually happening when executing wintermute?
 
 ## High-Level Description
 
-This tool uses SSH to connect to a (presumably) vulnerable virtual machine and then asks OpenAI GPT-3 to suggest Linux commands that could be used for finding security vulnerabilities or privilege escalation. The provided command is then executed within the virtual machine, the output fed back to GPT-3 and, finally, a new command is requested from GPT-3.
+This tool uses SSH to connect to a (presumably) vulnerable virtual machine and then asks OpenAI GPT to suggest Linux commands that could be used for finding security vulnerabilities or privilege escalation. The provided command is then executed within the virtual machine, the output fed back to the LLM and, finally, a new command is requested from it.
 
-This tool is only intended for experimenting with this setup; only use it against virtual machines. Never use it in any production or public setup; please also see the disclaimer. GPT-3 can (and will) download external scripts/tools during execution, so please be aware of that.
+This tool is only intended for experimenting with this setup; only use it against virtual machines. Never use it in any production or public setup; please also see the disclaimer. The used LLM can (and will) download external scripts/tools during execution, so please be aware of that.
 
 ## Setup
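The suggest/execute/feed-back loop from the high-level description can be sketched in a few lines. `ask_llm` and `run_on_target` are hypothetical stand-ins for the OpenAI call and the SSH connection; the prompt text is made up:

```python
# Minimal sketch of the loop: ask the LLM for a command, run it on the
# target, feed command + output back into the next prompt.
def escalation_loop(ask_llm, run_on_target, rounds=3):
    history = []
    for _ in range(rounds):
        prompt = "Suggest the next Linux privilege-escalation command.\n"
        prompt += "".join("$ " + c + "\n" + o + "\n" for c, o in history)
        cmd = ask_llm(prompt)
        history.append((cmd, run_on_target(cmd)))
    return history

# stubbed backends for illustration
log = escalation_loop(lambda p: "id", lambda c: "uid=1000(lowpriv)", rounds=2)
```

The real tool additionally truncates the history to fit the model's context window.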

‎config.py‎

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+import os
+
+from dotenv import load_dotenv
+
+def check_config():
+    load_dotenv()
+
+def model():
+    return os.getenv("MODEL")
+
+def context_size():
+    return int(os.getenv("CONTEXT_SIZE"))
+
+def target_ip():
+    return os.getenv('TARGET_IP')
+
+def target_password():
+    return os.getenv("TARGET_PASSWORD")
+
+def target_user():
+    return os.getenv('TARGET_USER')
+
+def openai_key():
+    return os.getenv('OPENAI_KEY')
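One thing to watch in the new `config.py`: `int(os.getenv("CONTEXT_SIZE"))` raises a `TypeError` if the variable is unset. A defensive variant (a sketch, not the committed code; the default value is an assumption) would fall back instead:

```python
import os

# Defensive variant of config.context_size(): return a default instead of
# crashing with TypeError when CONTEXT_SIZE is not set in the environment.
def context_size(default=4096):
    value = os.getenv("CONTEXT_SIZE")
    return int(value) if value is not None else default

os.environ.pop("CONTEXT_SIZE", None)
fallback = context_size()            # unset -> default
os.environ["CONTEXT_SIZE"] = "7000"
configured = context_size()          # set -> parsed value
```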

‎example_run_gpt4.png‎

130 KB

‎history.py‎

Lines changed: 26 additions & 4 deletions
@@ -1,19 +1,27 @@
 import tiktoken
+import os
+
+from rich.table import Table
 
 def num_tokens_from_string(string: str) -> int:
     """Returns the number of tokens in a text string."""
-    encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
+    model = os.getenv("MODEL")
+    encoding = tiktoken.encoding_for_model(model)
     return len(encoding.encode(string))
 
 
 class ResultHistory:
     def __init__(self):
         self.data = []
 
-    def append(self, cmd, result):
+    def append(self, think_time, cmd_type, cmd, result, success, reasoning):
         self.data.append({
             "cmd": cmd,
-            "result": result
+            "result": result,
+            "think_time": think_time,
+            "cmd_type": cmd_type,
+            "success": success,
+            "reasoning": reasoning
         })
 
     def get_full_history(self):
@@ -42,4 +50,18 @@ def get_history(self, limit=3072):
                "result" : itm["result"][:(rest-size_cmd-2)] + ".."
            })
                return list(reversed(result))
-        return list(reversed(result))
+        return list(reversed(result))
+
+    def create_history_table(self):
+        table = Table(show_header=True, show_lines=True)
+        table.add_column("Type", style="dim", width=7)
+        table.add_column("ThinkTime", style="dim")
+        table.add_column("To_Execute")
+        table.add_column("Resp. Size", justify="right")
+        table.add_column("success?", width=8)
+        table.add_column("reason")
+
+        for itm in self.data:
+            table.add_row(itm["cmd_type"], itm["think_time"], itm["cmd"], str(len(itm["result"])), itm["success"], itm["reasoning"])
+
+        return table
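The extended `append` signature now carries timing and the LLM's own judgment alongside the command. A self-contained sketch of how a caller feeds it (the record values below are made up for illustration):

```python
# Sketch of feeding the extended ResultHistory after each executed command.
# Field order matches the new signature:
#   append(think_time, cmd_type, cmd, result, success, reasoning)
class ResultHistory:
    def __init__(self):
        self.data = []

    def append(self, think_time, cmd_type, cmd, result, success, reasoning):
        self.data.append({
            "cmd": cmd, "result": result, "think_time": think_time,
            "cmd_type": cmd_type, "success": success, "reasoning": reasoning,
        })

history = ResultHistory()
history.append("1.3", "cmd", "sudo -l", "(ALL) NOPASSWD: ALL",
               "true", "passwordless sudo means root is one command away")
record = history.data[0]
```

These per-record fields are exactly what `create_history_table` renders into the 'success?' and 'reason' columns of the example-run screenshot.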

‎llms/openai.py‎

Lines changed: 7 additions & 15 deletions
@@ -1,20 +1,12 @@
 import openai
-import os
+import config
 
-openapi_model : str = ''
-
-def openai_config():
-    global openapi_model
-
-    api_key = os.getenv('OPENAI_KEY')
-    model = os.getenv('MODEL')
+def get_openai_response(cmd):
 
-    if api_key != '' and model != '':
-        openai.api_key = api_key
-        openapi_model = model
-    else:
+    if config.model() == '' and config.openai_key() == '':
         raise Exception("please set OPENAI_KEY and MODEL through environment variables!")
 
-def get_openai_response(cmd):
-    completion = openai.ChatCompletion.create(model=openapi_model, messages=[{"role": "user", "content" : cmd}])
-    return completion.choices[0].message.content
+    openai.api_key = config.openai_key()
+
+    completion = openai.ChatCompletion.create(model=config.model(), messages=[{"role": "user", "content" : cmd}])
+    return completion.choices[0].message.content

‎llms/openai_rest.py‎

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+import config
+import requests
+
+
+def get_openai_response(cmd):
+    if config.model() == '' and config.openai_key() == '':
+        raise Exception("please set OPENAI_KEY and MODEL through environment variables!")
+    openapi_key = config.openai_key()
+    openapi_model = config.model()
+
+    headers = {"Authorization": f"Bearer {openapi_key}"}
+    data = {'model': openapi_model, 'messages': [{'role': 'user', 'content': cmd}]}
+    response = requests.post('https://api.openai.com/v1/chat/completions', headers=headers, json=data).json()
+
+    print(str(response))
+    return response['choices'][0]['message']['content']
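The REST variant builds the Chat Completions payload by hand instead of going through the `openai` package. The request shape it sends looks like this (a sketch with a placeholder key; no network call is made here):

```python
# Shape of the Chat Completions request that llms/openai_rest.py sends.
# The key value is a placeholder, not a real credential.
openapi_key = "sk-placeholder"
headers = {"Authorization": f"Bearer {openapi_key}"}
data = {
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "id"}],
}
# the module then does, roughly:
# requests.post('https://api.openai.com/v1/chat/completions',
#               headers=headers, json=data).json()
```

Dropping the `openai` dependency in favor of plain `requests` matches the slimmed-down requirements.txt in this commit.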

‎prompt_helper.py‎

Lines changed: 20 additions & 17 deletions
@@ -1,26 +1,29 @@
 import logging
+import json
+import time
 
-from colorama import Fore, Style
 from datetime import datetime
 from mako.template import Template
 
-from llms.openai import get_openai_response
+class LLM:
+    def __init__(self, llm_connection):
+        self.connection = llm_connection
 
-log = logging.getLogger()
-filename = datetime.now().strftime('logs/run_%Y%m%d%m-%H%M.log')
-log.addHandler(logging.FileHandler(filename))
+        # prepare logging
+        self.log = logging.getLogger()
+        filename = datetime.now().strftime('logs/run_%Y%m%d%m-%H%M.log')
+        self.log.addHandler(logging.FileHandler(filename))
+        self.get_openai_response = llm_connection
 
-def output_log(kind, msg):
-    print("[" + Fore.RED + kind + Style.RESET_ALL +"]: " + msg)
-    log.warning("[" + kind + "] " + msg)
+    # helper for generating and executing LLM prompts from a template
+    def create_and_ask_prompt(self, template_file, log_prefix, **params):
 
-# helper for generating and executing LLM prompts from a template
-def create_and_ask_prompt(template_file, log_prefix, **params):
-    global logs
+        template = Template(filename='templates/' + template_file)
+        prompt = template.render(**params)
+        self.log.warning("[" + log_prefix + "-prompt] " + prompt)
+        tic = time.perf_counter()
+        result = self.get_openai_response(prompt)
+        toc = time.perf_counter()
+        self.log.warning("[" + log_prefix + "-answer] " + result)
 
-    template = Template(filename='templates/' + template_file)
-    prompt = template.render(**params)
-    output_log(log_prefix + "-prompt", prompt)
-    result = get_openai_response(prompt)
-    output_log(log_prefix + "-answer", result)
-    return result
+        return json.loads(result), str(toc-tic)
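The refactor turns the module-level helpers into an `LLM` class that receives its backend as a constructor argument, so tests can inject a stub instead of calling OpenAI. A sketch of that injection pattern, without the Mako template machinery (class and method names here are illustrative, not the committed ones):

```python
import json

# Sketch of the dependency-injection pattern the new LLM class enables:
# the backend callable is passed in, so a stub can stand in for OpenAI.
class StubbableLLM:
    def __init__(self, llm_connection):
        self.get_openai_response = llm_connection

    def ask(self, prompt):
        result = self.get_openai_response(prompt)
        return json.loads(result)  # the refactored code also json-decodes answers

llm = StubbableLLM(lambda prompt: '{"cmd": "id", "reasoning": "check user"}')
answer = llm.ask("next command?")
```

Note that `json.loads` on the raw answer means the prompt templates must coax the model into emitting valid JSON; a malformed answer raises `json.JSONDecodeError`.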

‎requirements.txt‎

Lines changed: 18 additions & 21 deletions
@@ -1,28 +1,25 @@
-aiohttp==3.8.4
-aiosignal==1.3.1
-async-timeout==4.0.2
-attrs==23.1.0
 bcrypt==4.0.1
-certifi==2022.12.7
+certifi==2023.7.22
 cffi==1.15.1
-charset-normalizer==3.1.0
-colorama==0.4.6
-cryptography==40.0.2
-fabric==3.0.0
-frozenlist==1.3.3
+charset-normalizer==3.2.0
+cryptography==41.0.3
+decorator==5.1.1
+Deprecated==1.2.14
+fabric==3.2.2
 idna==3.4
-invoke==2.0.0
+invoke==2.2.0
 Mako==1.2.4
-MarkupSafe==2.1.2
-multidict==6.0.4
-openai==0.27.4
-paramiko==3.1.0
+markdown-it-py==3.0.0
+MarkupSafe==2.1.3
+mdurl==0.1.2
+paramiko==3.3.1
 pycparser==2.21
+Pygments==2.16.1
 PyNaCl==1.5.0
 python-dotenv==1.0.0
-regex==2023.3.23
-requests==2.28.2
-tiktoken==0.3.3
-tqdm==4.65.0
-urllib3==1.26.15
-yarl==1.9.2
+regex==2023.8.8
+requests==2.31.0
+rich==13.5.2
+tiktoken==0.4.0
+urllib3==2.0.4
+wrapt==1.15.0

‎targets/ssh.py‎

Lines changed: 2 additions & 6 deletions
@@ -1,12 +1,7 @@
-import os
-
 from fabric import Connection
 from invoke import Responder
 
-def get_ssh_connection():
-    ip = os.getenv('TARGET_IP')
-    user = os.getenv('TARGET_USER')
-    password = os.getenv('TARGET_PASSWORD')
+def get_ssh_connection(ip, user, password):
 
     if ip != '' and user != '' and password != '':
        return SSHHostConn(ip, user, password)
@@ -31,6 +26,7 @@ def connect(self):
            connect_kwargs={"password": self.password},
        )
        self.conn=conn
+       self.conn.open()
 
    def run(self, cmd):
        sudopass = Responder(
