4 changes: 2 additions & 2 deletions .env.sample
@@ -1,8 +1,8 @@
PORT=8000
OPENAI_API_KEY=
VERBOSE_LLM=True
DIALOG_DATA_PATH=./sample_data/data.csv
PROJECT_CONFIG=./sample_data/prompt.toml
DATABASE_URL=postgresql://talkdai:talkdai@127.0.0.1:5432/talkdai
STATIC_FILE_LOCATION=static
DEBUG=false
4 changes: 3 additions & 1 deletion .gitignore
@@ -164,4 +164,6 @@ requirements.txt
*.csv
*.toml
!src/tests/fixtures/*.csv
!src/tests/fixtures/*.toml
!sample_data/*.csv
!sample_data/*.toml
44 changes: 36 additions & 8 deletions README.md
@@ -17,17 +17,45 @@ We assume you are familiar with [Docker](https://www.docker.com/), if you are no
```bash
docker-compose up
```
It will start two services:
- `db`: where the PostgreSQL database runs to support chat history and document retrieval for [RAG](https://en.wikipedia.org/wiki/Prompt_engineering#Retrieval-augmented_generation);
- `dialog`: the service with the API.

## Quick Start

If you are new to the project and want to get started quickly with some sample data and a simple prompt configuration, follow the steps below:

1. Clone the repository:

```bash
git clone https://github.com/talkdai/dialog.git
```

2. Create a `.env` file based on the `.env.sample` file:

```bash
cp .env.sample .env
```

3. Set the `OPENAI_API_KEY` value in the `.env` file:

```
OPENAI_API_KEY=your-openai-api-key
```

4. Build and start the services with docker:

```bash
docker-compose up --build
```
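
Once the containers are up, you can run a quick sanity check against the API. This sketch assumes the service listens on the `PORT` set in your `.env` (8000 by default) and that the interactive API docs are enabled:

```bash
# Expect an HTTP 200 if the dialog service is up and serving its docs page
curl -I http://localhost:8000/docs
```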

### Customizing prompts and data

To customize this project, you need a `.csv` file with your knowledge base and a `.toml` file with your prompt configuration.

We recommend creating a folder inside this project called `data` to store your CSV and TOML files. The `data` folder is already in the `.gitignore` file, so you can store your data without worrying about it being pushed to the repository.
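
For example, a minimal setup using the bundled sample files (the folder layout and file names here are only a suggestion):

```bash
# Create the local data folder and copy the sample files into it
mkdir -p data
cp sample_data/data.csv sample_data/prompt.toml data/
```

Then point `DIALOG_DATA_PATH=./data/data.csv` and `PROJECT_CONFIG=./data/prompt.toml` in your `.env`.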

#### `.csv` knowledge base

The knowledge base requires the following columns:

@@ -43,12 +71,12 @@ category,subcategory,question,content
faq,promotions,loyalty-program,"The company XYZ has a loyalty program when you refer new customers you get a discount on your next purchase, ..."
```

When the `dialog` service starts, it loads the knowledge base into the database, so make sure the database is up and the paths are correct (see the [environment variables](#environment-variables) section). Alternatively, inside the `src` folder, run `make load-data path="<path-to-your-knowledge-base>.csv"`, as shown below.
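
For example, to load the bundled sample data (a sketch; the relative path assumes you invoke `make` from inside the `src` folder):

```bash
cd src
make load-data path="../sample_data/data.csv"
```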

See [our documentation](https://dialog.talkd.ai/settings#csv-knowledge-base) for more options about the knowledge base, including embedding more columns together.


#### `.toml` prompt configuration

The `[prompt.header]`, `[prompt.suggested]`, and `[fallback.prompt]` fields are mandatory and are used for processing the conversation and connecting to the LLM.

@@ -69,7 +97,7 @@ qualified service to high-end customers. Be brief in your answers, without being
and objective in your responses. Never say that you are a model (AI), always answer as Avelino.
Be polite and friendly!"""

suggested = "Here is some possible content
that could help the user in a better way."

fallback = "I'm sorry, I couldn't find a relevant answer for your question."
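Before starting the services, you can verify that your prompt configuration parses as valid TOML. This is a quick check assuming Python 3.11+ for the built-in `tomllib` module; adjust the path to wherever your file lives:

```bash
# Parse the TOML file and print OK if it is syntactically valid
python -c 'import tomllib; tomllib.load(open("sample_data/prompt.toml", "rb")); print("OK")'
```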
3 changes: 3 additions & 0 deletions sample_data/data.csv
@@ -0,0 +1,3 @@
category,subcategory,question,content
faq,football,"Whats your favorite soccer team","My favorite soccer team is Palmeiras, from Brazil."
faq,football,"Whats your favorite soccer player","My favorite soccer player is Neymar, from Brazil."
8 changes: 8 additions & 0 deletions sample_data/prompt.toml
@@ -0,0 +1,8 @@
[model]
model_name = "gpt-4o"
temperature = 0.1

[prompt]
prompt = """
You are a nice bot; say something nice to the user and try to help them with their question, but also tell them that you don't fully know the content they asked about.
"""