This repository is part of the artifact evaluation of the paper "FlowChronicle: Synthetic Network Flow Generation through Pattern Set Mining", accepted at the CoNEXT'24 conference in Los Angeles.
The reader will find in this repository everything needed to reproduce the training and generation of the FlowChronicle model, as well as the different data and functions required to evaluate it.
Please follow these instructions carefully to reproduce the content of the paper.
First, clone the repository into your working environment using your preferred method (ssh, gh, etc.):
git clone git@github.com:joschac/FlowChronicleCoNEXT.git
Download the two archives from the following link, then unpack the content of each archive into its respective folder: https://drive.google.com/drive/folders/1M4737El_lPQVX8k5YXkLKkTnk3wo73-R
To facilitate the reproducibility of our findings, our application and its dependencies are embedded inside a Docker container. Build the Docker image by running the following command inside the FlowChronicle repository:
docker build -f setup/Dockerfile -t flwchncl .
If using Docker is not possible, create a virtual environment and install the dependencies listed in setup/requirements.txt:
cd setup
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
In this case, make sure that your Python version is higher than 3.10.
If you are using Docker, once the image is built, run the container to get into our /app/ directory as root:
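A quick way to check the interpreter version before installing the dependencies is sketched below. The helper name `python_is_supported` is hypothetical and not part of this repository. The sketch reads the requirement strictly (3.11 and later), which is an assumption; change the comparison to `>=` if Python 3.10 itself turns out to work.

```python
import sys

# Assumption: the version requirement is read strictly, i.e. 3.11 and later.
MINIMUM_EXCLUSIVE = (3, 10)


def python_is_supported(version_info=sys.version_info):
    """Return True if (major, minor) is strictly newer than 3.10."""
    return tuple(version_info[:2]) > MINIMUM_EXCLUSIVE


if __name__ == "__main__":
    status = "OK" if python_is_supported() else "too old"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```

Running this with the same interpreter used to create the virtual environment shows whether that environment is likely to be usable.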
docker run -it -p 8888:8888 flwchncl
Inside /app/, run our_train_and_generate.py to train a model and generate new network flows:
python3 our_train_and_generate.py
Training a new model from scratch can take a very long time and is computationally intensive. To only generate new data from an existing model, run:
python3 generate_from_model.py
If training or generating new data is not possible, you can still inspect the data that we generated ourselves.
The network flows generated by our model are at data/our_syn.csv.
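For a first look at the generated flows without training anything, a minimal sketch using only the Python standard library is given below. The helper name `summarize_csv` is hypothetical and not part of this repository. The sketch assumes only that data/our_syn.csv is a comma-separated file with a header row, and makes no assumption about the column names.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # assumption: the first row is the header
        row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return header, row_count


if __name__ == "__main__":
    csv_path = Path("data/our_syn.csv")  # path given in this README
    if csv_path.exists():
        columns, rows = summarize_csv(csv_path)
        print(f"{rows} rows, columns: {columns}")
    else:
        print(f"{csv_path} not found; run this from the directory containing data/")
```

Reading the file row by row keeps memory use low even if the generated file is large.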
Regardless of what was done in the previous step, the evaluations conducted in Section 7 of the paper can be reproduced with the experiments.ipynb notebook.
Inside the /app/ directory of the container, run:
jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root
The notebook server is now running inside the container. Click on one of the URLs it prints (usually one of them should be http://localhost:8888/ if you are working locally) and open experiments.ipynb inside the server.
The notebook contains all the instructions for reproducing the results of the experiments of Section 7.
If you want to recreate the actual tables and figures of the paper, feel free to use our own generated data for both FlowChronicle and the baselines; it can be found inside the results/ directory.
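If the notebook page does not load, it can help to check from the host that port 8888 (published by the `-p 8888:8888` option above) accepts connections. The sketch below is a generic TCP check using the standard library; the helper name `port_is_open` is hypothetical and not part of this repository.

```python
import socket


def port_is_open(host="localhost", port=8888, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if port_is_open() else "not reachable"
    print(f"localhost:8888 is {state}")
```

A "not reachable" result suggests that the container is not running, the server has not been started inside it, or the port was not published.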