MatOllama is a powerful, feature-rich command-line interface for managing and interacting with local large language models through Ollama. Built for developers, researchers, and AI enthusiasts who demand speed, clarity, and robust functionality in their terminal workflows.
Professional terminal UX - Context-aware switching - Session persistence
- Rich UI Components: Beautiful tables, panels, progress bars, and color-coded output
- Concurrent Operations: Multi-layer progress bars for model downloads/uploads with real-time speed & ETA
- Adaptive Layouts: Dynamic table widths that adjust to terminal size and content length
- Real-time Feedback: Streaming chat responses with first-token responsiveness
- Intelligent Display: Handles "thinking" models with dimmed reasoning steps
- Context-Aware Switching: Seamlessly switch between models while preserving conversation history
- Comprehensive Operations: Pull, run, copy, rename, create, push, and remove models with confirmation prompts
- Smart Selection: Interactive model picker with arrow key navigation
- Resource Control: Unload models, monitor running processes, and manage GPU/CPU usage
- Intelligent Deletion: Auto-refresh model list after deletions to prevent index confusion
- Native Pass-through: Automatically passes unknown commands (like `signin`) to the official Ollama binary
- Persistent Sessions: Save/load conversations with metadata and version tracking
- Export Capabilities: Export chats in JSON (for datasets) or text formats
- Organized Storage: Automatic directory structure (Sessions/, Exports/, config.json)
- Theme Persistence: Customizable color themes that survive restarts
- Auto-Bootstrapping: Automatically manages virtual environments and installs dependencies
- Graceful Interruption: Ctrl+C handling with context-aware responses
- Input Blocking: Prevents commands during long operations (model copying/renaming)
- Confirmation Dialogs: Safety prompts for destructive actions
- Memory Management: Automatic model unloading before deletion to prevent failures
- Python 3.9+ (check with `python3 --version`)
- Ollama installed and running (Download Ollama)

- Clone the repository:

  ```bash
  git clone https://github.com/maternion/MatOllama.git
  cd MatOllama
  ```

- Run MatOllama:

  ```bash
  # The script is a self-contained installer.
  # It will automatically create a venv, install dependencies, and run.
  ./MatOllama.py
  ```

  Note: Ensure the script is executable (`chmod +x MatOllama.py`).
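For reference, the self-bootstrapping behaviour described above amounts roughly to the following manual steps. The `.venv` location and the dependency list are assumptions (inferred from the Acknowledgments section), not MatOllama's actual internals:

```bash
# Approximate manual equivalent of MatOllama's first-run bootstrap (illustrative only)
python3 -m venv .venv                                # assumed venv path
.venv/bin/pip install rich prompt-toolkit inquirer   # dependencies inferred from Acknowledgments
.venv/bin/python MatOllama.py
```

In practice you never need these steps; running `./MatOllama.py` handles the equivalent automatically.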
Start the Ollama server, launch MatOllama, and try a few commands:

```bash
# Start the Ollama server (in a separate terminal, if not already running)
ollama serve

# Launch MatOllama
./MatOllama.py
```

```
# Pull a model (now with concurrent download bars!)
pull qwen3

# Run by name or index
run qwen3
# or
run 1

# One-shot prompt
run qwen3 "Explain quantum computing in simple terms"
```

Once a model is loaded:
- Type messages directly
- Use `/switch 2` to change models while preserving context
- Use `/unload` to unload the current model and exit chat mode
- Use `/exit` to leave chat mode
- Use `stop` or Ctrl+C to halt generation
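Putting these together, a short in-chat session might look like the following (the model index and prompts are placeholders):

```
> Explain the difference between a process and a thread
# ... streaming response ...

/switch 2        # continue the same conversation on model #2
> Summarize our discussion so far
stop             # or Ctrl+C to halt a long generation
/unload          # free the model and return to command mode
```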
| Command | Description | Examples |
|---|---|---|
| `list`, `ls` | Display all models with index, name, size, and modified date | `list`, `ls` |
| `select` | Interactive model picker with arrow keys | `select` |
| `pull <model>` | Download model from Ollama registry (multi-layer progress) | `pull codellama:7b` |
| `run <model\|#> [prompt]` | Start chat session or execute single prompt | `run 2`, `run gpt-oss:20b "Hello"` |
| `show <model>` | Display detailed model information | `show qwen3:4b` |
| Command | Description | Examples |
|---|---|---|
| `rm <model\|#>` | Remove model (shows updated list after deletion) | `rm 3`, `rm old-model` |
| `copy <src> <dest>` | Duplicate model with new name | `copy llama3.1 my-llama` |
| `rename <old> <new>` | Rename model (copy + delete original) | `rename 1 better-name` |
| `create <name> [file]` | Create custom model from Modelfile | `create my-bot ./Modelfile` |
| `push <model\|#>` | Upload model to registry (multi-layer progress) | `push my-model` |
| Command | Description | Examples |
|---|---|---|
| `ps` | Show currently running models with resource usage | `ps` |
| `unload [model\|#]` | Free model from memory | `unload`, `unload 2` |
| `stop` | Halt active generation | `stop` |
| `version` | Display CLI and Ollama version info | `version` |
| `*` | Unknown commands passed to Ollama binary | `signin`, `cp` |
| Command | Description | Examples |
|---|---|---|
| `save [filename]` | Save current session to Sessions/ | `save`, `save my-session` |
| `load <filename>` | Load session from Sessions/ | `load session_20250101.json` |
| `export [format] [file]` | Export conversation to Exports/ | `export json`, `export text chat.txt` |
| `theme [color]` | Set persistent theme color | `theme blue`, `theme` |
| `temp [0.0-2.0]` | Show/set temperature (persistent) | `temp 0.8`, `temp` |
| `system [prompt]` | Set system prompt for current session | `system "You are a helpful coding assistant"` |
| Command | Description | Examples |
|---|---|---|
| `history` | Display conversation with timestamps | `history` |
| `clear` | Clear conversation history (with confirmation) | `clear` |
| `help` | Show comprehensive command help | `help` |
| `exit`, `quit` | Exit application (prompts to save) | `exit` |
When in chat mode (after running a model):
| Command | Description | Examples |
|---|---|---|
| `/switch <model\|#>` | Switch models preserving context | `/switch 2`, `/switch gpt-4o` |
| `/unload` | Unload current model and exit chat mode | `/unload` |
| `/set verbose <true\|false>` | Toggle detailed API debugging | `/set verbose true` |
| `/set think <true\|false>` | Toggle thinking mode for reasoning models | `/set think false` |
| `/exit` | Exit chat mode, return to command mode | `/exit` |
| `/help` | Show in-chat help | `/help` |
```
# Start conversation with one model
run qwen3:30b
> "Let's discuss machine learning basics"

# Switch to specialized model while keeping context
/switch qwen3-coder:30b
# Conversation history is preserved and transferred
```

```
# Remove models with automatic list refresh
rm 1
# ✓ Model deleted
# Updated model list automatically displayed
# No more index confusion!

# Also works with model names
rm old-experimental-model
```

```
# In chat mode, cleanly exit and unload
/unload
# ✓ Model unloaded, exited chat mode
# Memory freed, ready for next model

# Quick model switching with context
/switch 3
# ✓ Previous conversation transferred to new model
```

```
# Set theme (survives restarts)
theme magenta

# Configure temperature (persistent)
temp 0.9
# Settings stored in config.json
```

```
# Rename models efficiently
rename 1 production-model
rename 2 dev-model

# Export conversations in different formats
export json training-data.json
export text readable-chat.txt
```

```
# Save current work
save important-research

# Load previous session
load important-research.json
# Restores: model, history, settings, theme

# Export for sharing
export text research-summary.txt
```

```bash
./MatOllama.py --help
```
```
Options:
  --host TEXT      Ollama server URL (default: http://localhost:11434)
  --timeout FLOAT  Request timeout in seconds (default: 300.0)
  --version        Show version and exit
```

MatOllama creates organized directories:

```
MatOllama/
├── MatOllama.py       # Main script (Self-contained installer)
├── config.json        # Persistent settings
├── Sessions/          # Saved chat sessions
├── Exports/           # Exported conversations
└── .ollama_history    # Command history
```
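The exact config.json schema isn't documented here; based on the settings described as persistent (theme and temperature), a saved configuration can be expected to look roughly like this. The field names are assumptions for illustration only:

```json
{
  "theme": "cyan",
  "temperature": 0.7
}
```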
```bash
# Custom Ollama host
export OLLAMA_HOST=http://192.168.1.100:11434

# Run with custom settings
./MatOllama.py --host $OLLAMA_HOST --timeout 600
```

```
# Pull research-focused model
pull deepseek-coder:33b

# Start research session
run deepseek-coder:33b

# Chat about complex topics, then switch for different perspective
/switch llama3.1:70b

# Save valuable conversation
save research-session-$(date +%Y%m%d)

# Export for sharing
export text research-summary.txt
```
```
# Interactive model selection
select
# Use arrow keys to choose

# Quick code generation
run codellama "Write a Python function for binary search"

# Switch to general model for documentation
/switch llama3.1
"Now write documentation for that function"

# Export as training data
export json code-examples.json
```
```
# List all models with clear formatting
list
# or use shorthand
ls

# Clean up old models (with auto-refresh)
rm old-model-v1
rm 3   # Remove by index, shows updated list

# Create production copies
copy experimental-model production-v1
rename development-model staging-v2

# Monitor system resources
ps

# Clean unload when switching contexts
/unload               # In chat mode
unload unused-model   # In command mode
```
```bash
# Check Ollama server status
curl http://localhost:11434/api/version

# Test with custom host
./MatOllama.py --host http://localhost:11434

# Verify Ollama is running
ollama serve
```
```
# Check running models
ps

# Unload unused models
unload old-model

# Clean exit from chat with unload
/unload   # Frees memory and exits chat

# Monitor during operations
run llama3 --verbose
```
```
# Models not deleting? They might be loaded
ps                         # Check what's running
unload problematic-model   # Unload first
rm problematic-model       # Then delete

# Use the auto-refresh feature
rm 1   # Automatically shows updated list after deletion
```
```
# Terminal too narrow?
# Resize terminal - tables auto-adjust to width

# Color issues?
theme cyan   # Try different theme

# Text corruption?
clear        # Clear terminal buffer

# Table formatting issues?
ls           # New responsive tables adapt to terminal size
```
```
# Model not found?
pull missing-model

# Corrupted model?
rm problematic-model
pull problematic-model   # Re-download

# Out of memory?
/unload   # In chat mode - frees memory and exits
unload    # In command mode
ps        # Check what's running
```
- Use `/unload` in chat mode to cleanly free memory and exit
- Use the `unload` command to free specific models from memory
- Monitor with `ps` to track resource usage
- Set appropriate `temp` values (0.7 default, 0.3 for factual, 1.2 for creative)
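For instance, temperature can be tuned per task right from the command prompt; the values persist in config.json (the model and prompts below are placeholders):

```
temp 0.3    # tighter, more factual responses
run qwen3 "List the SI base units"

temp 1.2    # looser, more creative output
run qwen3 "Write a haiku about terminal UIs"
```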
- Use `select` for visual model picking
- Use `ls` as shorthand for the `list` command
- Save important sessions before switching models
- Use `/unload` instead of `/exit` when you want to free model memory
- Export conversations regularly for backup
- Use `/switch` to compare model responses on the same topic
- The `rm` command now shows the updated model list automatically
- Delete by name to avoid index confusion: `rm old-model-name`
- Always check `ps` to see what's actually running
- Use `/unload` in chat mode for clean memory management
- Use descriptive names when saving sessions
- Export training conversations as JSON
- Keep Sessions/ organized by date or topic
- Regular cleanup of old exports and models using the improved `rm` command
Built with love for the Ollama community. MatOllama leverages:
- Ollama - The foundation for local LLM serving
- Rich - Beautiful terminal formatting
- prompt-toolkit - Enhanced input handling
- inquirer - Interactive selections
MatOllama - Professional CLI for Local LLMs
⭐ Star this repo if MatOllama enhances your AI workflow!