The Gold Standard for AI Testing: Ethical, Fair, and Compliant
Open-source AI testing framework for comprehensive security, bias detection, compliance, and ethical AI evaluation. Production-ready testing for LLMs and AI agents.
Detect vulnerabilities before attackers do.
- Prompt injection attacks - Block instruction override attempts
- Jailbreak detection - Prevent safety guideline bypasses
- Adaptive red-teaming - AI-powered dynamic attack generation
- Multi-modal security - Test image injection and visual jailbreaks
- Tool use safety - Validate agent tool-calling security
- Token smuggling - Detect encoding-based attacks
β See Attack Engine Documentation
Eliminate hallucinations and ensure factual accuracy.
- Hallucination detection - Identify fabricated information
- Groundedness checking - Verify claims match source material
- RAG system evaluation - Full RAG Triad (Context, Groundedness, Answer Relevance)
- Consistency testing - Ensure reliable responses
- Semantic similarity - Real embedding-based analysis
β See Truth Engine Documentation
Meet regulatory requirements automatically.
- EU AI Act compliance - Articles 9-15 & 52 coverage
- GDPR compliance - Data privacy and protection
- NIST AI RMF - Risk management framework
- SOC 2 & ISO 42001 - Enterprise standards
- Auto-generated guardrails - Export NeMo Guardrails configs
- Custom policy engine - Enforce company-specific rules
β See Governance Engine Documentation
Eliminate algorithmic discrimination with research-backed metrics.
- 15 fairness metrics - Demographic parity, equalized odds, disparate impact
- Standard benchmarks - Adult, COMPAS, German Credit datasets
- LLM-native testing - Auto-generate demographic variants
- Interpretability layer - Plain-English bias explanations
- Legal compliance - EEOC 80% rule validation
- Hiring & lending testing - Domain-specific thresholds
β See Fairness Engine Documentation
Test for cultural bias and value alignment.
- Decolonization score - 5-dimensional cultural bias testing
- Epistemic bias (knowledge systems)
- Linguistic bias (communication styles)
- Historical bias (narrative perspectives)
- Cultural bias (norm assumptions)
- Stereotyping (representation quality)
- Political bias detection - Measure ideological skew
- Values alignment - Human rights, ethics, inclusivity
β See Values Engine Documentation
Sophisticated AI-powered testing, not brittle keyword matching.
- Uses GPT-4, Claude, or local LLMs (Ollama, LM Studio) as evaluators
- Contextual understanding of refusals vs. compliance
- Nuanced detection of hallucinations and policy violations
- Supports OpenAI, Anthropic, or fully offline local models
evaluator:
provider: "openai"
model: "gpt-4o"
api_key: "${OPENAI_API_KEY}"Dynamic attacks that evolve based on your agent's responses.
- Attacker Agent observes target responses
- Generates new exploits targeting discovered weaknesses
- Multi-turn interrogation vs. static attack datasets
- Powered by GPT-4, Claude, or local LLMs
Research-backed algorithmic fairness testing.
- 15 peer-reviewed fairness metrics
- Formal mathematical definitions
- Industry-standard benchmarks (Adult, COMPAS, German Credit)
- Interpretability layer with plain-English explanations
Test both text and vision-language models.
- Image injection attacks
- QR code exploits
- Steganography detection
- Visual jailbreak testing
AI testing that doesn't feel like a chore.
- Nyan Progress Display - Rainbow-trailing progress animations
- Nyan Alignment Score - Unified 0-100 ethical metric
- Automated PDF/JSON/Markdown reports
- 3D embedding visualizations
| Feature | indoctrine.ai | Alternatives |
|---|---|---|
| Open Source | β MIT License | β Proprietary |
| Privacy-First | β Runs locally | β Cloud-only |
| Comprehensive | β 5-layer testing | |
| Production-Ready | β CI/CD integration | |
| Research-Backed | β 15 fairness metrics | |
| Cultural Equity | β Decolonization testing | β Not available |
| Auto-Remediation | β Guardrail export | β Detection only |
pip install indoctrine-aifrom agent_indoctrination import Indoctrinator
indo = Indoctrinator("config.yaml")
results = indo.run_full_suite(my_agent)
indo.generate_report(results, "report.pdf")
print(f"Nyan Alignment Score: {results['overall_score']}/100")Output:
π [ββββββββββββββββββββ] 100% Complete
β
Security: 92/100 | β
Accuracy: 88/100 | β
Compliance: 95/100
Nyan Alignment Score: 91/100
| Industry | What We Test | Why It Matters |
|---|---|---|
| AI/ML Teams | Security, hallucinations, consistency | Catch bugs before production |
| Compliance Officers | EU AI Act, GDPR, SOC 2 | Automated regulatory audits |
| Red Teams | Adversarial attacks, jailbreaks | Identify security vulnerabilities |
| HR/Hiring | Fairness metrics, bias detection | Avoid discrimination lawsuits |
| Finance/Lending | Disparate impact, EEOC compliance | Fair lending requirements |
| Healthcare | HIPAA, bias, hallucinations | Patient safety & equity |
| Enterprise AI | Governance, security, fairness | Comprehensive AI risk management |
- Getting Started - Install and run your first test in 5 minutes
- Configuration - Complete configuration reference
- Testing Engines - Deep dive into all 5 testing capabilities
- Examples - Real-world usage patterns (RAG, tools, CI/CD)
- Troubleshooting - Common issues and solutions
- Best Practices - Optimization and workflow guidelines
- Advanced Topics - Observability, distributed testing, custom engines
- Installation Guide
- First Test Tutorial
- LLM Provider Setup
- CI/CD Integration
- Fairness Metrics Reference
- Custom Attack Development
β
Prompt injection & jailbreak detection
β
Adaptive AI-powered red-teaming
β
Multi-modal security testing (images, QR codes)
β
Hallucination & groundedness checking
β
RAG Triad evaluation (Context, Groundedness, Answer Relevance)
β
EU AI Act, GDPR, NIST AI RMF compliance
β
15 objective fairness metrics
β
Decolonization testing (5 cultural dimensions)
β
Auto-generated guardrails (NeMo)
β
LLM-as-a-Judge evaluation
β
OpenAI, Anthropic, Ollama, LM Studio support
β
CI/CD integration (GitHub Actions, GitLab)
β
PDF/JSON/Markdown reports
β
Nyan Progress Display π
# config.yaml - Works with OpenAI, Anthropic, or local LLMs
evaluator:
provider: "openai"
model: "gpt-4o"
api_key: "${OPENAI_API_KEY}"
# Or use local LLMs (free, offline)
evaluator:
provider: "openai"
endpoint: "http://localhost:11434/v1"
model: "llama3"
api_key: "ollama"
# Enable testing engines
attack:
enabled: true
adaptive: true # AI-powered attacks
truth:
enabled: true
enable_rag_triad: true
governance:
enabled: true
frameworks:
- eu_ai_act
- gdpr
fairness:
enabled: true
use_case: "hiring" # EEOC thresholds
values:
enabled: true# .github/workflows/ai-testing.yml
name: AI Safety Testing
on: [pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- run: pip install indoctrine-ai
- name: Run AI tests
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
run: indoctrinate run --config config.yaml --agent my_agent.py
- name: Check thresholds
run: indoctrinate validate --results results.json --fail-on-criticalβ CI/CD Examples
We welcome contributions! See CONTRIBUTING.md for guidelines.
- π Report bugs - GitHub Issues
- π‘ Suggest features - Discussions
- π Submit PRs - Follow the
devbranch workflow - β Star the repo - Help us reach more AI developers!
MIT License - see LICENSE for details.
- Documentation: docs/
- GitHub Issues: https://github.com/16246541-corp/indoctrine.ai/issues
- Discussions: https://github.com/16246541-corp/indoctrine.ai/discussions
Built for safer, fairer, and more compliant AI π