100 % local, privacy-first Knowledge Base – chat with your Joplin vault offline and under your full control.

Replace cloud Retrieval-Augmented Generation (RAG) services with a 100 % local stack:

- **No vendor lock-in** – all components are open-source, containerised, and easily swappable.
- **Respect your data** – nothing leaves your machine unless you enable the optional Telegram bridge.
- **Optimised for commodity hardware** – runs smoothly on an 8 GB RAM laptop, scales down to a Raspberry Pi 4 (via the fast CLI) and up to a GPU workstation.
- **Transparent plumbing** – plain Python, Docker, and YAML; every step is auditable.
```
──────────────────── ① ingestion ────────────────────
  Joplin Export – Markdown / Attachments / Resources
                  `joplin_sync.py`
─────────────────────────┬───────────────────────────
                         │ batches of chunks
                         ▼
             ┌───────────────────────┐
             │       Weaviate        │  ② retrieval
             │   (BM25 + vectors)    │◄───────────────┐
             └───────────────────────┘                │
                         ▲                            │
                         │ GraphQL                    │
                         │ ③ context                  │
                         ▼                            │
             ┌───────────────────────┐                │
             │        Ollama         │  ④ answer      │
             │  (local LLM/adapter)  │────────────────┘
             └───────────────────────┘
                         ▲
                         │
   ┌───────────────┬─────┴───────────┬─────────────────────────┐
   │ Terminal CLI  │ Telegram (opt.) │ Element / Matrix (WIP)  │
   └───────────────┴────────┬────────┴─────────────────────────┘
                            │
                       Conversation
```

(A minimal Flask gateway is planned for v2025-Q3.)
- **Ingestion** – `joplin_sync.py` scans export folders, slices Markdown into ~500-token chunks, OCRs images/PDFs, computes embeddings (dimension 384), and uploads JSON batches (see the sketch below).
- **Storage** – chunks live in Weaviate together with the original note path, resource hash, timestamps, and tag list.
- **Retrieval** – the CLIs perform hybrid search (BM25 + cosine) with a configurable α blend, rerank hits with recency and ownership heuristics, and craft the final prompt.
- **Generation** – a local Ollama model (e.g. `llama3:8b-instruct`) produces the answer, including citations back to note source paths.
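For orientation, here is a minimal sketch of the ingestion step. It assumes the v3 `weaviate-client` API and `sentence-transformers`; the property names (`content`, `path`, `tags`) and the naive whitespace chunker are illustrative stand-ins for what `joplin_sync.py` actually does:

```python
# Ingestion sketch: chunk -> embed (384-dim) -> batch upload to Weaviate.
import weaviate
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
client = weaviate.Client("http://localhost:8080")

def split_chunks(text: str, size: int = 500) -> list[str]:
    # Naive whitespace splitter standing in for the real ~500-token slicer.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def ingest(note_path: str, text: str, tags: list[str]) -> None:
    chunks = split_chunks(text)
    vectors = model.encode(chunks)          # one 384-dim vector per chunk
    with client.batch as batch:             # v3 client batching
        for chunk, vec in zip(chunks, vectors):
            batch.add_data_object(
                {"content": chunk, "path": note_path, "tags": tags},
                class_name="Documents",
                vector=vec.tolist(),
            )
```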
| Name | Highlights | Language / deps |
|---|---|---|
| `joplin_sync.py` | Multithreaded crawler (I/O bound) • OCR via Tesseract • Incremental cache keyed by SHA-256 • Automatic language detection for OCR • Hybrid schema bootstrapper • Search playground (`--search`) | Python 3.10, Tesseract ≥ 5.0 |
| `rag_query.py` | Intent classifier (EN/ES) • Sliding-window memory (8 turns) • Ownership/procedural/temporal stages • Structured logs (structlog) • Rich colour console | Python 3.10, LangChain 0.2, Ollama |
| `rag_query-fast.py` | ≤ 2 s cold start • No memory, direct retrieval • YAML pattern matcher • Ideal for Raspberry Pi | Same |
| `docker-compose.yml` | Read-only Weaviate vectorizer • Single-node Raft with persistent shards • Tweaked limits (`QUERY_DEFAULTS_LIMIT=25`) | Docker ≥ 20.10 |
| `telegram_rag_bot.py` | One-user hard lock • Markdown rendering • `/summary`, `/reset` commands | python-telegram-bot 21 |
| `sync_and_upload.sh` | CRON-friendly wrapper for delta sync & upload | Bash |
| `scripts/migrations/` | Future schema migrations (placeholder) | Python |
| Category | Minimum | Recommended | Notes |
|---|---|---|---|
| OS | Linux x86-64 / ARM64, macOS, Windows 11 WSL2 | Ubuntu 22.04 LTS | On macOS, macFUSE may boost FS performance |
| Python | 3.9 | 3.10 | Tested with CPython |
| CPU | 2 cores | 4–8 cores | Heavy OCR benefits from more cores |
| RAM | 4 GB | 8–16 GB | Embeddings cached in RAM |
| Storage | 5 GB | 20 GB+ | Weaviate grows with vault size |
| GPU | optional | 8 GB VRAM | Accelerates Ollama (CUDA / Metal) |
| Tesseract | 5.0 | 5.3 + tesseract-lang | Add eng, spa, deu, … |
```bash
git clone https://github.com/luisriverag/joplin_weaviate_ollama.git
cd joplin_weaviate_ollama
python -m venv .venv
source .venv/bin/activate    # Windows: .venv\Scripts\activate
pip install --upgrade pip
pip install -r requirements.txt
```
- **Debian/Ubuntu**

  ```bash
  sudo apt update
  sudo apt install tesseract-ocr libtesseract-dev poppler-utils
  sudo apt install tesseract-ocr-eng tesseract-ocr-spa   # language packs
  ```

- **Arch**

  ```bash
  sudo pacman -S tesseract tesseract-data-eng tesseract-data-spa
  ```

- **macOS (Homebrew)**

  ```bash
  brew install tesseract poppler
  brew install tesseract-lang   # add packs as needed
  ```

- **Windows**
  - Install Tesseract and add it to `PATH`.
  - WSL2 users: run Weaviate inside WSL or in a Docker Desktop Linux container.

For GPU acceleration, install CUDA 12 + cuDNN 8 (or the Apple Metal plugins), then restart Ollama. `ollama pull llama3:8b` will auto-detect the GPU and quantise accordingly.
Copy the sample env file and tweak:

```bash
cp sample.env .env
nano .env
```

| Var | Default | Description |
|---|---|---|
| `MD_FOLDERS` | – | Comma-separated paths to Joplin exports (`.md` + `_resources`) |
| `WEAVIATE_URL` | `http://localhost:8080` | Use `https://` behind a reverse proxy |
| `WEAVIATE_INDEX` | `Documents` | Each export profile can map to a unique index |
| `EMBEDDING_MODEL` | `sentence-transformers/all-MiniLM-L6-v2` | 384-dim embeddings |
| `OLLAMA_MODEL` | `llama3:8b` | Any Ollama label works (`mistral:7b-instruct`, …) |
| `TELEGRAM_BOT_TOKEN` | (unset) | Enable bot when set |
| `TELEGRAM_USER_ID` | (unset) | Numeric ID accepted by the bot |
| `TZ` | `UTC` | Affects temporal boosts; set to your timezone |
| `LOG_LEVEL` | `INFO` | `DEBUG` prints request bodies & queries |
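A quick sketch of how a script can consume these variables (assuming `python-dotenv`; the project's actual loading code may differ):

```python
# Sketch: load .env settings into the process environment.
import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads KEY=value pairs from ./.env

md_folders   = os.environ["MD_FOLDERS"].split(",")               # required
weaviate_url = os.getenv("WEAVIATE_URL", "http://localhost:8080")
index_name   = os.getenv("WEAVIATE_INDEX", "Documents")
ollama_model = os.getenv("OLLAMA_MODEL", "llama3:8b")
```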
```bash
docker compose up -d
# wait for readiness
until docker compose logs --tail 5 weaviate | grep -q "Startup complete"; do sleep 2; done
```

```bash
ollama serve &
ollama pull llama3:8b   # first time only
```

Tip: put `ollama serve` in a systemd user service for autostart.
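If you prefer a Python readiness probe to the grep loop above, Weaviate serves a standard readiness endpoint; a minimal sketch with `requests`:

```python
# Poll Weaviate's readiness endpoint instead of grepping container logs.
import time
import requests

def wait_for_weaviate(url: str = "http://localhost:8080", timeout: int = 120) -> None:
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            # /v1/.well-known/ready returns HTTP 200 once queries are accepted.
            if requests.get(f"{url}/v1/.well-known/ready", timeout=2).ok:
                return
        except requests.ConnectionError:
            pass  # container still starting
        time.sleep(2)
    raise TimeoutError("Weaviate did not become ready in time")
```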
| Goal | Command |
|---|---|
| Smoke test (100 notes) | `python joplin_sync.py --sync --upload --test 100 --progress` |
| Full sync (all CPUs) | `python joplin_sync.py --sync --upload --workers $(nproc) --progress` |
| Incremental delta (CRON) | `./sync_and_upload.sh` |
| Re-OCR only (skip upload) | `python joplin_sync.py --sync --workers 4 --no-upload` |
| Ad-hoc hybrid search | `python joplin_sync.py --search "license plate" --alpha 0.7 --top-k 5` |
| Flag | Meaning |
|---|---|
| `--workers N` | Threads for OCR/embeddings (default = CPU count) |
| `--batch-size N` | Upload batch size (notes) |
| `--index NAME` | Override target Weaviate class |
| `--timeout N` | Per-image OCR timeout (sec) |
| `--alpha 0–1` | Weight of vector vs BM25 for hybrid search |
| `--cache-info` | Print cache statistics |
| `--clean-cache [--force]` | Remove orphaned cache entries |
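The `--alpha` blend maps directly onto Weaviate's hybrid query. A rough equivalent of `--search "license plate" --alpha 0.7 --top-k 5`, again assuming the v3 `weaviate-client` and illustrative field names:

```python
# Hybrid search sketch: alpha=1 is pure vector, alpha=0 is pure BM25.
import weaviate

client = weaviate.Client("http://localhost:8080")
result = (
    client.query
    .get("Documents", ["content", "path"])   # field names are assumptions
    .with_hybrid(query="license plate", alpha=0.7)
    .with_limit(5)
    .do()
)
for hit in result["data"]["Get"]["Documents"]:
    print(hit["path"], "->", hit["content"][:80])
```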
```bash
python rag_query.py
```

```
🧠 > ¿Tengo la factura de la lavadora?   ("Do I have the washing machine invoice?")
```

| Shortcut | Effect |
|---|---|
| `debug:<query>` | Show top docs, classification, scores |
| `no-analysis:<query>` | Skip ownership / procedural heuristics |
| `summary` | 3-line recap of conversation memory |
| `reset` | Clear memory buffer |
| `help` | Quick in-CLI cheat sheet |
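Under the hood, the final generation stage is conceptually simple. A minimal sketch using the `ollama` Python package – the prompt wording and chunk shape here are assumptions, and the real CLI adds intent classification, memory, and reranking on top:

```python
# Generation sketch: stitch retrieved chunks into a prompt, ask local Ollama.
import ollama  # pip install ollama

def answer(question: str, chunks: list[dict]) -> str:
    # chunks: retrieved hits, assumed shaped like {"path": ..., "content": ...}
    context = "\n\n".join(f"[{c['path']}]\n{c['content']}" for c in chunks)
    prompt = (
        "Answer using only the notes below and cite their source paths.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
    resp = ollama.chat(model="llama3:8b",
                       messages=[{"role": "user", "content": prompt}])
    return resp["message"]["content"]
```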
Ideal for constrained hardware; same usage minus memory & reranking:
```bash
python rag_query-fast.py --config patterns.yaml
```

Hot-reload config: `:reload` inside the CLI re-reads the YAML without a restart.
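The pattern matcher could look roughly like this – the real `patterns.yaml` schema isn't documented here, so this sketch assumes a flat regex-to-template mapping:

```python
# Illustrative YAML pattern matcher (assumed schema: regex -> answer template).
import re
import yaml

def load_patterns(path: str = "patterns.yaml") -> list[tuple[re.Pattern, str]]:
    with open(path) as fh:
        return [(re.compile(pattern, re.I), template)
                for pattern, template in yaml.safe_load(fh).items()]

def match(query: str, patterns: list[tuple[re.Pattern, str]]) -> str | None:
    for regex, template in patterns:
        if regex.search(query):
            return template
    return None  # no shortcut hit: fall back to direct retrieval
```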
```bash
python telegram_rag_bot.py &
```

- Set `TELEGRAM_BOT_TOKEN` and `TELEGRAM_USER_ID` in `.env`.
- Send `/start` to your bot.
- Use Markdown or plain text; answers cite note paths like `notebooks/Finanzas/2023-IRPF.md`.

Security note: the bot rejects messages from any ID except the whitelisted one.
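The single-user lock takes only a few lines with python-telegram-bot 21; this sketch is illustrative, not the bot's actual code:

```python
# Sketch of the one-user hard lock: filters.User drops every update whose
# sender is not the whitelisted numeric ID.
import os
from telegram import Update
from telegram.ext import ApplicationBuilder, ContextTypes, MessageHandler, filters

ALLOWED_ID = int(os.environ["TELEGRAM_USER_ID"])

async def on_message(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    reply = "..."  # call the RAG pipeline here
    await update.message.reply_text(reply)

app = ApplicationBuilder().token(os.environ["TELEGRAM_BOT_TOKEN"]).build()
app.add_handler(MessageHandler(filters.TEXT & filters.User(user_id=ALLOWED_ID),
                               on_message))
app.run_polling()
```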
Note: `ragbot_elementmatrix.py` is untested; pull requests welcome.
| Symptom | Likely cause | Remedy |
|---|---|---|
| Connection refused to Weaviate | Container not ready | `docker compose logs -f weaviate` |
| "Phantom shard" error | Dirty shutdown | Add `RAFT_ENABLE_ONE_NODE_RECOVERY: true`, restart once |
| OCR stalls on TIFF | Bad scan | Lower `--timeout`; the file is skipped & logged |
| Answers too generic | Using fast CLI | Switch to the full CLI or enlarge `top-k` |
| Model OOM on GPU | VRAM too small | Pull a 4-bit quant (`llama3:8b-q4_0`) or use CPU |
Logs:

- `sync_errors.log` – ingestion issues
- `~/.ollama/logs` – LLM server
- `weaviate-data/logs` – DB warnings
| Lever | Default | Alternative | Notes |
|---|---|---|---|
| Embeddings model | MiniLM-L6-v2 | all-mpnet-base-v2 (768 dim) | Higher recall, slower |
| Batch size | 32 notes | 128 | RAM-bound |
| OCR lang packs | eng + spa | exact language subset | Fewer dictionaries = faster |
| `--alpha` | 0.7 (vector-heavy) | 0.5 | Lower values weight BM25 more heavily |
| LLM quant | q4_0 | q2_K | Lower VRAM, slower |
| Weaviate cache | off | `search.cache` plug-in | Enterprise feature |
- **Network isolation** – Weaviate listens on `localhost` by default. Use firewall rules or a Docker network to restrict access.
- **Encrypted volume** – store `weaviate-data/` on an encrypted partition if note content is sensitive.
- **No telemetry** – set `ENABLE_TELEMETRY=false` in the compose file.
- **Telegram bridge** – remember that all traffic goes through Telegram's servers; don't run the Telegram RAG bot if strict privacy is required.
- **Why two CLIs?** – `rag_query.py` aims for maximal context awareness; `rag_query-fast.py` boots in seconds, fitting IoT / Pi devices.
- **Does any data leave my machine?** – No, unless you enable the Telegram bot or point `WEAVIATE_URL` at a remote host.
- **How do I wipe and rebuild the index?** – `docker compose down -v` to drop the shards, delete `note_cache.json`, then run a fresh sync.
- **Can I disable OCR?** – Yes: `--no-ocr` skips image/PDF text extraction.
MIT