README.md: 6 additions & 6 deletions

@@ -41,7 +41,7 @@ series = {ESEC/FSE 2023}
# Example runs
- - more can be seen at [history notes](https://github.com/ipa-lab/hackingBuddyGPT/blob/v3/history_notes.md)
+ - more can be seen at [history notes](https://github.com/ipa-lab/hackingBuddyGPT/blob/v3/docs/history_notes.md)
## updated version using GPT-4
@@ -51,11 +51,11 @@ This happened during a recent run:
Some things to note:
- - the panel labeled 'my new fact list' is generated by the LLM. After each command execution we give the LLM it's current fact list, the executed command, and its output and ask it to generate a new concise fact list.
- - the tabel contains all executed commands. The columns 'success?' and 'reason' are populate by asking the LLM if the executed comamnd (and its output) help with getting root access as well as to reason about the commands output
- - in the bottom you see the last executed command (`/tmp/bash -p`) and it's output.
-
- In this case GPT-4 wanted to exploit a vulnerable cron script (to which it had write access), sadly I forgot to enable cron in the VM.
+ - initially the current configuration is output. Yay, so many colors!
+ - "Got command from LLM" shows the generated command, while the panel after it has the given command as title and the command's output as content.
+ - the table contains all executed commands. ThinkTime denotes the time that was needed to generate the command (Tokens shows the token count for the prompt and its response). StateUpdTime shows the time that was needed to generate a new state (the next column also gives the token count).
+ - "What does the LLM know about the system?" gives an LLM-generated list of system facts. To generate it, the LLM is given the latest executed command (and its output) as well as the current list of system facts. This is the operation whose time/token usage is shown in the overview table as StateUpdTime/StateUpdTokens. As the state update takes a long time, it is disabled by default and has to be enabled through a command-line switch.
+ - Then the next round starts. The next given command (`sudo tar`) will lead to a pwn'd system, BTW.
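The state-update step described above (feed the LLM its current fact list plus the last command and its output, ask for a new concise fact list) can be sketched as one prompt round trip. `query_llm` and the prompt wording are hypothetical stand-ins for whatever LLM driver and prompt the tool actually configures:

```python
def update_fact_list(query_llm, facts: str, command: str, output: str) -> str:
    """Ask the LLM to merge the last command's result into a concise fact list.

    `query_llm` is any callable that sends a prompt string to the model and
    returns its text response (hypothetical; the real driver is pluggable).
    """
    prompt = (
        "You are assisting with a penetration test.\n"
        f"Current fact list about the target system:\n{facts}\n\n"
        f"Executed command: {command}\n"
        f"Command output:\n{output}\n\n"
        "Generate a new, concise fact list about the target system."
    )
    return query_llm(prompt)

# usage with a stub in place of a real model
new_facts = update_fact_list(
    lambda p: "- user lowpriv has write access to a root cron script",
    facts="- target is a Debian VM",
    command="ls -la /etc/cron.d",
    output="-rw-rw-r-- 1 root lowpriv ... backup.sh",
)
```

Because this extra round trip runs after every command, it is what shows up as StateUpdTime/StateUpdTokens and why it is off by default.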
```python
# for defaults we are using .env but allow overwrite through cli arguments
parser = argparse.ArgumentParser(description='Run an LLM vs a SSH connection.')
parser.add_argument('--enable-explanation', help="let the LLM explain each round's result", action="store_true")
parser.add_argument('--enable-update-state', help='ask the LLM to keep a multi-round state with findings', action="store_true")
parser.add_argument('--log', type=str, help='sqlite3 db for storing log files', default=os.getenv("LOG_DESTINATION") or ':memory:')
parser.add_argument('--target-ip', type=str, help='ssh hostname to use to connect to target system', default=os.getenv("TARGET_IP") or '127.0.0.1')
parser.add_argument('--target-hostname', type=str, help='safety: what hostname to expect at the target IP', default=os.getenv("TARGET_HOSTNAME") or "debian")
parser.add_argument('--target-user', type=str, help='ssh username to use to connect to target system', default=os.getenv("TARGET_USER") or 'lowpriv')
parser.add_argument('--target-password', type=str, help='ssh password to use to connect to target system', default=os.getenv("TARGET_PASSWORD") or 'trustno1')
# apply the fallback before int() so an unset env var doesn't raise via int(None)
parser.add_argument('--max-rounds', type=int, help='how many cmd-rounds to execute at max', default=int(os.getenv("MAX_ROUNDS") or 10))
parser.add_argument('--llm-connection', type=str, help='which LLM driver to use', choices=get_potential_llm_connections(), default=os.getenv("LLM_CONNECTION") or "openai_rest")
parser.add_argument('--target-os', type=str, help='What is the target operating system?', choices=["linux", "windows"], default="linux")
parser.add_argument('--model', type=str, help='which LLM to use', default=os.getenv("MODEL") or "gpt-3.5-turbo")
parser.add_argument('--llm-server-base-url', type=str, help='which LLM server to use', default=os.getenv("LLM_SERVER_BASE_URL") or "https://api.openai.com")
parser.add_argument('--tag', type=str, help='tag run with string', default="")
parser.add_argument('--context-size', type=int, help='model context size to use', default=int(os.getenv("CONTEXT_SIZE") or 4096))
parser.add_argument('--hints', type=argparse.FileType('r', encoding='latin-1'), help='json file with a hint per tested hostname', default=None)
```
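The `.env`-with-CLI-override pattern above boils down to using `os.getenv(NAME) or fallback` as each argparse default: an environment variable beats the hard-coded fallback, and an explicit CLI flag beats both. A minimal, self-contained sketch of that pattern (flag names reused from the block above; the helper is mine, not the tool's):

```python
import argparse
import os

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description='env-or-CLI defaults sketch')
    # env var wins over the hard-coded fallback; a CLI flag wins over both
    parser.add_argument('--target-ip', default=os.getenv("TARGET_IP") or '127.0.0.1')
    # for ints, apply the fallback *before* int(): int(os.getenv(...)) or 10
    # would raise TypeError whenever the variable is unset
    parser.add_argument('--max-rounds', type=int,
                        default=int(os.getenv("MAX_ROUNDS") or 10))
    return parser

# make the demo deterministic regardless of the caller's environment
os.environ.pop("TARGET_IP", None)
os.environ.pop("MAX_ROUNDS", None)

defaults = build_parser().parse_args([])            # fallbacks apply
override = build_parser().parse_args(['--max-rounds', '3'])
```

One quirk of `or`-based fallbacks: an env var set to the empty string is treated as unset, which is usually what you want for `.env` files.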
docs/history_notes.md: 14 additions & 0 deletions

@@ -1,3 +1,17 @@
+ ## updated version using GPT-4 (approx. End of August 2023)
+
+ This happened during a recent run:
+
+ 
+
+ Some things to note:
+
+ - the panel labeled 'my new fact list' is generated by the LLM. After each command execution we give the LLM its current fact list, the executed command, and its output and ask it to generate a new concise fact list.
+ - the table contains all executed commands. The columns 'success?' and 'reason' are populated by asking the LLM whether the executed command (and its output) helps with getting root access, and to reason about the command's output.
+ - at the bottom you see the last executed command (`/tmp/bash -p`) and its output.
+
+ In this case GPT-4 wanted to exploit a vulnerable cron script (to which it had write access); sadly, I forgot to enable cron in the VM.
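The 'success?'/'reason' columns described in the notes above amount to a second judgment prompt per round. A hedged sketch, with `query_llm` again a hypothetical stand-in for the configured model driver and the JSON shape my own choice for illustration:

```python
import json

def judge_command(query_llm, command: str, output: str) -> dict:
    """Ask the LLM whether a command helped toward root access, and why.

    `query_llm` is any callable mapping a prompt string to a model response;
    requesting JSON keeps the 'success?' and 'reason' columns easy to fill.
    """
    prompt = (
        f"Executed command: {command}\n"
        f"Command output:\n{output}\n\n"
        "Does this command (and its output) help with getting root access?\n"
        'Answer as JSON: {"success": true/false, "reason": "..."}'
    )
    return json.loads(query_llm(prompt))

# usage with a stub in place of a real model
verdict = judge_command(
    lambda p: '{"success": true, "reason": "SUID bash shell obtained"}',
    command="/tmp/bash -p",
    output="bash-5.1#",
)
```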
# initial version (tagged as fse23-ivr) using gpt-3.5-turbo