- Project Proposal
- Presenting Our Proposal
- A Research Summary
- Our Interim Results: [Project Presentation] [Project Report]
To work with LaTeX locally in your IDE, follow the steps below:
- Install MiKTeX as the LaTeX distribution.
- Install Perl.
- Install a VS Code extension for previewing LaTeX as a PDF, such as LaTeX Workshop.
- Previewing and exporting will then be available from the `.tex` file.
- Create a virtual environment with this command in a terminal: `py -m venv marl_env`.
- Update pip to its latest version with this command: `py -m pip install --upgrade pip`.
- From the virtual environment, install all the dependencies with this command: `pip install -r requirements.txt`. If you use conda, `conda list -e > requirements.txt` exports the packages of the active environment to `requirements.txt`. Note: Use a full path for conda.
- Or you may find creating a virtual environment useful if you keep facing dependency issues. Just run the following commands before installing libraries from pip: `python -m venv marl_env`, then `source marl_env/bin/activate` (on Windows: `marl_env\Scripts\activate`).
- Atari is available via Gymnasium: https://www.gymlibrary.dev/environments/atari/index.html
- Clone the repository https://github.com/damat-le/gym-simplegrid.git to get the code for a simple grid-based environment that you can customize.
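Before installing dependencies, it can help to confirm that the virtual environment from the steps above is actually active. The following is a minimal sketch using only the standard library. The helper name `in_virtualenv` is hypothetical, and the check only recognizes environments created with `venv` or `virtualenv`; a conda environment is not detected this way.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


status = "active" if in_virtualenv() else "not active"
print(f"Virtual environment is {status}: {sys.prefix}")
```

If the script reports "not active", activate `marl_env` first and run it again.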
In `rgb_array` mode, the rendered image has the shape `(480, 640, 4)`, which gives the dimensions and color channels of the rendered image of the environment. Each dimension represents the following:

- Height (480): the first dimension is the height of the image in pixels.
- Width (640): the second dimension is the width of the image in pixels.
- Color channels (4): the third dimension is the number of color channels. Here there are 4 channels, which typically correspond to the RGBA color model:
  - R: Red channel
  - G: Green channel
  - B: Blue channel
  - A: Alpha channel (transparency)

The height and width determine the resolution of the rendered image, here 640x480 pixels (width x height). The 4 color channels (RGBA) describe the color and transparency of each pixel. The alpha channel allows for transparency effects, which can be useful for rendering overlapping objects or semi-transparent elements.
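To make the shape concrete, here is a small sketch that unpacks an array with the same shape and separates the alpha channel from the color channels. The `frame` array is a zero-filled stand-in created with NumPy, not an actual render of the environment.

```python
import numpy as np

# Stand-in for a rendered frame: height 480, width 640, 4 channels (RGBA).
frame = np.zeros((480, 640, 4), dtype=np.uint8)
frame[..., 3] = 255  # make every pixel fully opaque

height, width, channels = frame.shape

# Split the color information from the transparency information.
rgb = frame[..., :3]   # shape (480, 640, 3)
alpha = frame[..., 3]  # shape (480, 640)

print(height, width, channels)
print(rgb.shape, alpha.shape)
```

Dropping the alpha channel like this is a common step when a model or an image writer expects plain RGB input.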
When the environment is rendered in ansi mode, the render method generates a string that represents the current state of the environment. This string typically includes information such as the current step, the agent's position, the reward obtained, whether the episode has ended, and the agent's action.
Example:
Step: 5, Agent Position: (2, 3), Reward: -1, Done: False, Action: (1, 0)
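A string in this format could be assembled as sketched below. The function `format_ansi_state` and its parameters are hypothetical and only illustrate the layout of the example line; the environment's own `render` method may build the string differently.

```python
def format_ansi_state(step, position, reward, done, action):
    """Build a one-line text summary of the environment state (hypothetical helper)."""
    return (
        f"Step: {step}, Agent Position: {position}, "
        f"Reward: {reward}, Done: {done}, Action: {action}"
    )


line = format_ansi_state(step=5, position=(2, 3), reward=-1, done=False, action=(1, 0))
print(line)
```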
View more states from a simulated game in this directory: `Simulations/`.
You might need to run the script `runtime-environment.py` to see how a game runs.
- Distance to the nearest obstacle (obs_dist): int or float
- Relative x position of the goal (xg): int, -1 if the goal is outside the partially observable area.
- Relative y position of the goal (yg): int, -1 if the goal is outside the partially observable area.
- Whether the path is clear or blocked (path_blocked): 0/1 int
- Leader's action (action): int
- Whether the leader can observe the follower (follower_visibility): 0/1 int
- Leader's distance to the follower (follower_dist): float
- Leader's suggested action in x direction (action_dx): int
- Leader's suggested action in y direction (action_dy): int
- Leader's current x position (x): int
- Leader's current y position (y): int
- Sample: [-1, 1, 1.0, 0, 0, 0, 2, 2]
| Layer (type) | Output Shape | Param # |
|---|---|---|
| input_layer (InputLayer) | (None, 8) | 0 |
| reshape (Reshape) | (None, 1, 8) | 0 |
| gru (GRU) | (None, 1, 64) | 14,208 |
| gru_1 (GRU) | (None, 32) | 9,408 |
- Total params: 23,616 (92.25 KB)
- Trainable params: 23,616 (92.25 KB)
- Non-trainable params: 0 (0.00 B)
- Prediction: Outputs an array of 32 values, representing the leader's encoded message to the follower agents.
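The parameter counts in the table above can be checked by hand. The sketch below assumes the Keras GRU formulation with `reset_after=True`, where each of the three gates has an input kernel, a recurrent kernel, and two bias vectors; under that assumption the arithmetic reproduces the reported numbers. The function name `gru_param_count` is hypothetical.

```python
def gru_param_count(input_dim: int, units: int) -> int:
    """Parameter count of a GRU layer, assuming Keras-style reset_after=True."""
    # Three gates (update, reset, candidate). Each gate has an input kernel
    # (input_dim x units), a recurrent kernel (units x units), and two bias
    # vectors of length units (one for the input, one for the recurrence).
    return 3 * (units * (input_dim + units) + 2 * units)


gru_params = gru_param_count(input_dim=8, units=64)     # 14,208
gru_1_params = gru_param_count(input_dim=64, units=32)  # 9,408
encoder_total = gru_params + gru_1_params               # 23,616

# float32 weights take 4 bytes each.
encoder_size_kb = encoder_total * 4 / 1024              # 92.25

print(gru_params, gru_1_params, encoder_total, encoder_size_kb)
```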
| Layer (type) | Output Shape | Param # |
|---|---|---|
| input_layer_1 (InputLayer) | (None, 32) | 0 |
| repeat_vector (RepeatVector) | (None, 1, 32) | 0 |
| gru_2 (GRU) | (None, 1, 64) | 18,816 |
| gru_3 (GRU) | (None, 64) | 24,960 |
| dense (Dense) | (None, 8) | 520 |
- Total params: 44,296 (173.03 KB)
- Trainable params: 44,296 (173.03 KB)
- Non-trainable params: 0 (0.00 B)
- Prediction: Outputs an array of 8 values, representing the probabilities of each possible action.
- Evaluates the best move for an agent.
| Layer (type) | Output Shape | Param # |
|---|---|---|
| input_layer_27 (InputLayer) | (None, 8) | 0 |
| reshape_26 (Reshape) | (None, 1, 8) | 0 |
| dense_71 (Dense) | (None, 1, 64) | 576 |
| dense_72 (Dense) | (None, 1, 64) | 4,160 |
| dense_73 (Dense) | (None, 1, 9) | 585 |
| reshape_27 (Reshape) | (None, 9) | 0 |
- Total params: 5,321 (20.78 KB)
- Trainable params: 5,321 (20.78 KB)
- Non-trainable params: 0 (0.00 B)
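The dense layer counts follow from one weight per input-output pair plus one bias per output unit. A short sketch that reproduces the total above, with the hypothetical helper `dense_param_count`:

```python
def dense_param_count(input_dim: int, units: int) -> int:
    """Parameter count of a fully connected layer: weights plus biases."""
    return input_dim * units + units


# (input_dim, units) for the three dense layers in the table: 8 -> 64 -> 64 -> 9.
layer_shapes = [(8, 64), (64, 64), (64, 9)]
dense_counts = [dense_param_count(i, u) for i, u in layer_shapes]
dense_total = sum(dense_counts)

print(dense_counts, dense_total)
```

The reshape layers contribute no parameters, so the three dense layers account for the whole model.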
| Layer (type) | Output Shape | Param # |
|---|---|---|
| input_layer_28 (InputLayer) | (None, 2, 8) | 0 |
| global_average_pooling1d_11 (GlobalAveragePooling1D) | (None, 8) | 0 |
| dense_74 (Dense) | (None, 64) | 576 |
| dense_75 (Dense) | (None, 64) | 4,160 |
| dense_76 (Dense) | (None, 9) | 585 |
- Total params: 5,321 (20.78 KB)
- Trainable params: 5,321 (20.78 KB)
- Non-trainable params: 0 (0.00 B)
- The input of the model is a combination of the leader's message and the agent's own observation of the grid. The leader's message is encoded and compressed into an array of 8 values.
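The `(None, 2, 8)` input and the pooling step can be illustrated in plain Python. This sketch assumes the two rows of the input are the 8-value leader message and the 8-value observation, which the shapes suggest but the summary does not state. The message values are made up for illustration, and `global_average_pool` is a hypothetical stand-in for the pooling layer.

```python
def global_average_pool(rows):
    """Average a (steps, features) list of rows over the steps axis."""
    return [sum(column) / len(column) for column in zip(*rows)]


# Illustrative values only: an 8-value leader message and an 8-value observation.
leader_message = [0.0, 1.0, 0.0, 0.5, 0.0, 0.0, 1.0, 0.0]
own_observation = [-1, 1, 1.0, 0, 0, 0, 2, 2]

stacked = [leader_message, own_observation]  # shape (2, 8)
pooled = global_average_pool(stacked)        # shape (8,)

print(len(stacked), len(stacked[0]), len(pooled))
print(pooled)
```

Averaging treats both rows symmetrically, so the dense layers that follow see a single 8-value vector regardless of row order.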
- Run the `evaluation.py` script to plot graphs based on the metrics we recorded during training.
- In your terminal, change directory into `training/`, and then run `tensorboard --logdir=logs`. Open `http://localhost:6006/`, or whichever port it reports, to view GPU consumption statistics.
- Run unit tests: in your terminal, change directory into `tests` with `cd tests`. Then run this command: `coverage run -m unittest discover`.
- Inspect the unit test coverage report at `./tests/htmlcov/`.
- Use tmux to keep long-running executions alive while the terminal is idle or detached.
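For the unit test step above, `unittest discover` collects test cases from files whose names match `test*.py`. The following is a minimal sketch of such a module; the file name `tests/test_observation.py` and the class name are hypothetical, and the only data it uses is the sample observation listed earlier.

```python
import unittest

# Sample observation vector from the observation list earlier in this document.
SAMPLE_OBSERVATION = [-1, 1, 1.0, 0, 0, 0, 2, 2]


class TestObservationVector(unittest.TestCase):
    def test_has_eight_features(self):
        # The models take inputs of shape (None, 8).
        self.assertEqual(len(SAMPLE_OBSERVATION), 8)

    def test_features_are_numeric(self):
        for value in SAMPLE_OBSERVATION:
            self.assertIsInstance(value, (int, float))
```

Saved under `tests/`, a module like this is picked up by `coverage run -m unittest discover` without any extra configuration.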

