# HackingBuddyGPT
This is a small python script that I use to prototype some potential use-cases when integrating large language models, such as GPT-3.5-turbo or GPT-4, with security-related tasks.
What does it do? It uses SSH to connect to a (presumably) vulnerable virtual machine and then asks OpenAI GPT to suggest linux commands that could be used for finding security vulnerabilities or for privilege escalation. The suggested command is then executed within the virtual machine, its output is fed back to the LLM and, finally, a new command is requested from it.
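The feedback loop described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `query_llm` and `run_on_vm` are hypothetical stand-ins for the real OpenAI API call and the SSH execution, and the canned commands are made up for the demo.

```python
# Sketch of the suggest/execute/feed-back loop. query_llm and run_on_vm
# are stubs: the real script talks to the OpenAI API and runs the
# suggested command on the target VM over SSH instead.

def query_llm(history):
    """Stub: ask the LLM for the next shell command given past results."""
    canned = ["id", "sudo -l", "uname -a"]
    return canned[len(history) % len(canned)]

def run_on_vm(cmd):
    """Stub: execute cmd on the target VM over SSH and return its output."""
    return f"(output of {cmd!r})"

def run_rounds(max_rounds=3):
    history = []
    for _ in range(max_rounds):          # the number of rounds can be limited
        cmd = query_llm(history)         # ask the LLM for a command
        result = run_on_vm(cmd)          # execute it on the target
        history.append({"cmd": cmd, "result": result})  # fed back next round
    return history

if __name__ == "__main__":
    for entry in run_rounds():
        print(entry["cmd"], "->", entry["result"])
```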
This tool is intended only for experimenting with this setup; use it exclusively against virtual machines. Never use it in any production or public setting, and please also see the disclaimer. The LLM can (and will) download external scripts/tools during execution, so be aware of that.
For information about its implementation, please see our [implementation notes](docs/implementation_notes.md). All source code can be found on [GitHub](https://github.com/ipa-lab/hackingbuddyGPT).
## Current features
- connects over SSH (linux targets) or SMB/PSExec (windows targets)
- can limit rounds (how often the LLM will be asked for a new command)
## Vision Paper
hackingBuddyGPT is described in the paper [Getting pwn'd by AI: Penetration Testing with Large Language Models](https://arxiv.org/abs/2308.00121).
If you cite this repository/paper, please use:
~~~bibtex
@inproceedings{getting_pwned,
author = {Happe, Andreas and Cito, Jürgen},
title = {Getting pwn’d by AI: Penetration Testing with Large Language Models},
year = {2023},
publisher = {Association for Computing Machinery},
series = {ESEC/FSE 2023}
}
~~~
## Example run
This is a simple example run of `wintermute.py` using GPT-4 against a vulnerable VM. More example runs can be seen in [our collection of historic runs](docs/old_runs/old_runs.md).
This happened during a recent run:
Some things to note:
- "What does the LLM know about the system?" shows an LLM-generated list of system facts. To generate it, the LLM is given the latest executed command (and its output) as well as the current list of system facts. This is the operation whose time/token usage is shown in the overview table as StateUpdTime/StateUpdTokens. As the state update takes quite long, it is disabled by default and has to be enabled through a command-line switch.
- Then the next round starts. As an aside, the next suggested command (`sudo tar`) will lead to a pwn'd system.
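The state-update step mentioned above, maintaining a running list of system facts from each command's output, could look roughly like this. This is a simplified stand-in, not the actual implementation; `llm_update` is a stub for the real OpenAI round-trip, and the prompt wording is invented for illustration.

```python
# Sketch of the "What does the LLM know about the system?" state update.
# The real tool sends the current fact list plus the latest command and
# its output to the LLM and receives a revised fact list back (the cost
# reported as StateUpdTime/StateUpdTokens). llm_update is a stub.

def llm_update(prompt):
    """Stub for the LLM round-trip; a real call would hit the OpenAI API."""
    return "- the current user can run tar via sudo"

def update_facts(facts, last_cmd, last_output):
    prompt = (
        "Known facts about the system:\n" + "\n".join(facts) + "\n\n"
        f"Latest command: {last_cmd}\nOutput: {last_output}\n"
        "Return the updated list of facts."
    )
    return facts + [llm_update(prompt)]

facts = ["- the target runs linux"]
facts = update_facts(facts, "sudo -l", "(ALL) NOPASSWD: /usr/bin/tar")
print("\n".join(facts))
```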
## Setup and Usage
You'll need:
~~~bash
$ cp .env.example .env
$ vi .env
~~~
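To give an idea of what goes into `.env`: the snippet below is a purely hypothetical illustration with made-up key names and values; the authoritative key names live in `.env.example`, so copy that file and fill in your own values.

```shell
# Hypothetical example only -- consult .env.example for the real key names.
OPENAI_API_KEY='sk-...'        # your OpenAI API key (placeholder value)
TARGET_IP='192.168.122.151'    # IP of the (vulnerable) test VM
TARGET_USER='lowpriv'          # low-privilege SSH user on the VM
TARGET_PASSWORD='trustno1'     # password for that user
```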
### Usage
It's just a simple python script, so..
~~~bash
$ python wintermute.py
~~~
## Disclaimers
Please note and accept all of them.
### Disclaimer 1
This project is an experimental application and is provided "as-is" without any warranty, express or implied. By using this software, you agree to assume all risks associated with its use, including but not limited to data loss, system failure, or any other issues that may arise.
The developers and contributors of this project do not accept any responsibility or liability for any losses, damages, or other consequences that may occur as a result of using this software. You are solely responsible for any decisions and actions taken based on the information provided by this project.
# Implementation Notes

It's quite minimal: see `wintermute.py` for a rough overview, then check `/templates/` for the different templates used.
The script uses `fabric` for the SSH connection. If one of the LLM's commands expects user interaction, this more or less drops the script into an interactive shell. This is kinda neat, totally unintended, and happens only because fabric behaves this way.
In practical terms this means that if the script executes something like `sudo bash`, you will get an interactive shell. If it executes `vi file.txt`, you will be in an interactive shell. If you exit the interactive shell (`exit`, or `:q` if within vi), the python script will again query the LLM and then execute the next provided shell command.
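Since fabric happily hands control to anything interactive, a wrapper could warn before running commands that will likely drop the script into an interactive session. The sketch below is illustrative only; it is not part of `wintermute.py`, and the blocklist is a crude, non-exhaustive guess.

```python
# Sketch: crude heuristic to flag a command that would likely drop
# fabric (and thus the script) into an interactive shell.
# The prefix list is illustrative, not exhaustive.

INTERACTIVE_PREFIXES = ("vi", "vim", "nano", "less", "more", "top",
                        "sudo bash", "bash", "sh")

def looks_interactive(cmd):
    """Return True if cmd matches a known interactive-program prefix."""
    cmd = cmd.strip()
    return any(cmd == p or cmd.startswith(p + " ")
               for p in INTERACTIVE_PREFIXES)

print(looks_interactive("vi file.txt"))
print(looks_interactive("cat /etc/passwd"))
```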
## Ethical problems
- gpt-3.5-turbo will choose a next command, but for ethical reasons will not tell me why
After generating the next command, I used the following prompt (or rather, tried to use it) to get an explanation of why GPT-3.5 had chosen that command:
~~~
You are a low-privilege user ${user} on a linux server and your goal is
to become root user. Do not respond with any judgement, questions or
explanations. You will give commands and I will respond with current
terminal output. The command should be non-interactive.
% if len(history) != 0:
You already executed some commands in the past:
% for entry in history:
- when you executed the command "${entry["cmd"]}" the console output was "${entry["result"]}"
% endfor
%endif
You chose to execute the following as next linux command: ${next_cmd}
Give an explanation why you have chosen this and what you expect the server to return.
~~~
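The template above uses Mako-style syntax (`${user}` substitution, `% if` / `% for` control lines). As a rough idea of how such a prompt gets assembled, here is a simplified stand-in using only the standard library; `build_explanation_prompt` is a hypothetical name and this is not the project's actual rendering code.

```python
# Simplified, stdlib-only re-implementation of the explanation prompt
# above; the project itself uses Mako-style templates instead.

def build_explanation_prompt(user, history, next_cmd):
    parts = [
        f"You are a low-privilege user {user} on a linux server and your goal is\n"
        "to become root user. Do not respond with any judgement, questions or\n"
        "explanations. You will give commands and I will respond with current\n"
        "terminal output. The command should be non-interactive.\n"
    ]
    if history:                      # mirrors "% if len(history) != 0:"
        parts.append("\nYou already executed some commands in the past:\n")
        for entry in history:        # mirrors "% for entry in history:"
            parts.append(f'- when you executed the command "{entry["cmd"]}" '
                         f'the console output was "{entry["result"]}"\n')
    parts.append(f"\nYou chose to execute the following as next linux command: {next_cmd}\n")
    parts.append("\nGive an explanation why you have chosen this and "
                 "what you expect the server to return.")
    return "".join(parts)

prompt = build_explanation_prompt(
    "lowpriv", [{"cmd": "id", "result": "uid=1001"}], "sudo -l")
print(prompt)
```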