Flit: End-to-End Analytics Platform 🚀

Comprehensive data platform demonstrating production-ready analytics, experimentation, and ML systems for modern e-commerce.

🏢 Business Context

Flit is a (hypothetical) fast-growing e-commerce logistics company that needed to move from gut-feeling decisions to data-driven ones. This platform demonstrates how a modern data team can build end-to-end analytics infrastructure that scales from startup to enterprise.

Business Impact Delivered:

🔍 Data Foundation: Unified analytics reducing analyst query time by xx%
📊 Smart Experimentation: A/B testing framework with CUPED variance reduction
🤖 Predictive Intelligence: LTV and churn models driving targeted campaigns
💬 AI-Powered Operations: RAG documentation assistant reducing support tickets

🏗️ Platform Architecture

graph TB
    subgraph "Data Sources"
        B[E-commerce Transactions]
        C[Synthetic Experiments]
    end
    
    subgraph "Data Platform"
        D[BigQuery Warehouse]
        E[dbt Transformations] 
        F[Cost Optimization]
    end
    
    subgraph "Analytics Layer"
        G[Experiment Framework]
        H[ML Models & APIs]
        I[AI Documentation Assistant]
    end
    
    subgraph "User Interfaces"
        J[Executive Dashboard]
        K[Experiment Results]
        L[Model Monitoring]
    end


    B --> D  
    C --> D
    D --> E
    E --> F
    E --> G
    E --> H
    E --> I
    G --> K
    H --> L
    I --> J
    K --> J
    L --> J

🗂️ Repository Structure

Repository	Purpose	Tech Stack	Status
flit-data-platform	Core data warehouse & transformations	dbt, BigQuery, SQL	🔄
flit-experiments	A/B testing & advanced experimentation	Python, Streamlit, Statistics	🔄
flit-ml-api	ML models, APIs & MLOps	FastAPI, XGBoost, Docker, GCP	⏳
flit-ai-assistant	RAG-powered documentation bot	LangChain, ChromaDB, OpenAI/Llama	⏳

🚀 Quick Start

Prerequisites

GCP Account with BigQuery API enabled
Python 3.9+ with pip
Docker for containerized deployment
Git with SSH keys configured

1. Clone All Repositories

# Main orchestration repo
git clone https://github.com/whitehackr/flit-main.git
cd flit-main

# Register the component repos as submodules
# (only needed once; skip if .gitmodules already lists them)
git submodule add https://github.com/whitehackr/flit-data-platform.git data-platform
git submodule add https://github.com/whitehackr/flit-experiments.git experiments
git submodule add https://github.com/whitehackr/flit-ml-api.git ml-api
git submodule add https://github.com/whitehackr/flit-ai-assistant.git ai-assistant

# Fetch and check out all submodules
git submodule update --init --recursive

2. Environment Setup

# Copy environment template
cp .env.example .env

# Edit with your GCP credentials
nano .env

Required Environment Variables:

# GCP Configuration
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_APPLICATION_CREDENTIALS=path/to/service-account.json

# BigQuery Datasets  
FLIT_RAW_DATASET=flit_raw
FLIT_STAGING_DATASET=flit_staging
FLIT_MARTS_DATASET=flit_marts

# API Configuration
ML_API_BASE_URL=https://your-api-url.run.app
OPENAI_API_KEY=your-openai-key

# Streamlit Configuration
STREAMLIT_SHARING_MODE=true
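
Before deploying, it can help to confirm that the variables above are actually set. The following is a minimal sketch and not part of the repository: the `missing_vars` helper is hypothetical, and the variable list is simply copied from the template above. It assumes the variables have been exported into the current environment, since Python does not read a `.env` file on its own.

```python
import os

# Variable names copied from the template above; adjust if your .env differs.
REQUIRED_VARS = [
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_APPLICATION_CREDENTIALS",
    "FLIT_RAW_DATASET",
    "FLIT_STAGING_DATASET",
    "FLIT_MARTS_DATASET",
    "ML_API_BASE_URL",
    "OPENAI_API_KEY",
]


def missing_vars(env=None):
    """Return the required variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```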

3. Full Platform Deployment

# Deploy entire platform with Docker Compose
docker-compose up -d

# Or deploy components individually
make deploy-data-platform
make deploy-experiments  
make deploy-ml-api
make deploy-ai-assistant

4. Initialize Data Pipeline

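# Run each step from the flit-main root; the subshells keep the working directory unchanged

# Run the initial dbt build (full refresh)
(cd data-platform && dbt run --full-refresh)

# Generate sample experiments
(cd experiments && python generate_sample_data.py)

# Train initial ML models
(cd ml-api && python train_models.py)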

5. Access Live Demos

📊 Main Dashboard: http://localhost:8501
🧪 Experiment Results: http://localhost:8502
🤖 ML API Docs: http://localhost:8000/docs
💬 AI Assistant: http://localhost:8503

📊 Platform Components

Data Platform (dbt + BigQuery)

Repository: flit-data-platform

Raw Data Ingestion: e-commerce transactions, synthetic experiments
Transformation Pipeline: Staging → Intermediate → Marts architecture
Cost Optimization: CC% reduction in BigQuery spend through query optimization
Data Quality: Comprehensive testing and monitoring with Great Expectations

Key Deliverables:

Unified customer 360° view
Real-time experiment exposure tracking
ML-ready data
Executive KPI dashboards

Experimentation Framework

Repository: flit-experiments

Classical A/B Testing: Hypothesis testing with power analysis
CUPED Implementation: Variance reduction using pre-experiment data
Advanced Methods: Multi-armed bandits and sequential testing
Interactive Apps: Streamlit dashboards for live experiment analysis
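
To illustrate the CUPED idea mentioned above, here is a minimal sketch using only the standard library. It is not taken from flit-experiments: the `cuped_adjust` function and the synthetic spend data are assumptions made for illustration. Each metric value is shifted by theta times the centered pre-experiment covariate, where theta = cov(metric, covariate) / var(covariate), which lowers the variance without changing the mean.

```python
import random
import statistics


def cuped_adjust(metric, covariate):
    """Return CUPED-adjusted metric values (hypothetical helper).

    theta = cov(metric, covariate) / var(covariate); each value is shifted by
    theta * (covariate - mean(covariate)), which leaves the mean unchanged.
    """
    mean_x = statistics.fmean(covariate)
    mean_y = statistics.fmean(metric)
    cov_xy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(covariate, metric)
    ) / (len(metric) - 1)
    theta = cov_xy / statistics.variance(covariate)
    return [y - theta * (x - mean_x) for x, y in zip(covariate, metric)]


# Synthetic data (made up): pre-period spend partly predicts in-experiment spend.
rng = random.Random(42)
pre_spend = [rng.gauss(100, 20) for _ in range(2000)]
spend = [0.8 * x + rng.gauss(0, 10) for x in pre_spend]

adjusted = cuped_adjust(spend, pre_spend)
raw_var = statistics.variance(spend)
adj_var = statistics.variance(adjusted)
print(f"variance before: {raw_var:.1f}, after CUPED: {adj_var:.1f}")
```

In a real experiment, theta would typically be estimated on data pooled across treatment and control so that the adjustment does not bias the comparison between arms.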

Key Deliverables:

XX% reduction in required sample sizes with CUPED
Bayesian stopping rules preventing p-hacking
Thompson Sampling for multi-variant optimization
Comprehensive statistical test suite
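
As a rough illustration of Thompson Sampling for multi-variant optimization, the sketch below simulates a Beta-Bernoulli bandit with the standard library. The function name, the uniform Beta(1, 1) prior, and the conversion rates are all assumptions for illustration; they are not drawn from flit-experiments.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson Sampling; return pulls per variant."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        # Draw one sample from each variant's Beta(1 + successes, 1 + failures) posterior.
        draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
        arm = draws.index(max(draws))
        pulls[arm] += 1
        # Simulated conversion; in practice this would be an observed outcome.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


# Hypothetical conversion rates for three variants.
pulls = thompson_sampling([0.05, 0.08, 0.15], rounds=5000)
print(pulls)
```

Because each round plays the variant with the highest posterior draw, traffic shifts toward the better-performing variant as evidence accumulates, while weaker variants still receive occasional exploration.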

ML Platform & MLOps

Repository: flit-ml-api

Predictive Models: Customer LTV and churn prediction
Production APIs: FastAPI endpoints with automatic documentation
MLOps Pipeline: CI/CD with GitHub Actions and Cloud Run deployment
Model Monitoring: Drift detection and performance tracking
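
The source does not say which drift metric flit-ml-api uses. One common choice is the Population Stability Index (PSI), sketched below with the standard library as an assumption rather than a description of the actual implementation. The function name, the equal-width binning, and the synthetic feature values are all illustrative.

```python
import math
import random


def population_stability_index(reference, current, bins=10):
    """PSI between two samples, using equal-width bins over the reference range.

    Assumes the reference sample is not constant (otherwise the bin width is zero).
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Synthetic feature values (made up): training data, a fresh sample, and a shifted sample.
rng = random.Random(7)
train = [rng.gauss(50, 10) for _ in range(5000)]
same = [rng.gauss(50, 10) for _ in range(5000)]
shifted = [rng.gauss(60, 10) for _ in range(5000)]

psi_same = population_stability_index(train, same)
psi_shifted = population_stability_index(train, shifted)
print(f"PSI (no drift): {psi_same:.3f}, PSI (shifted): {psi_shifted:.3f}")
```

A commonly cited rule of thumb treats PSI below 0.1 as stable and above 0.25 as significant drift, though suitable thresholds depend on the feature and the model.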

Key Deliverables:

YY% accuracy in churn prediction
LTV model driving $ZM+ in targeted campaigns
Automated retraining pipeline
SHAP explainability for model interpretability

AI Documentation Assistant

Repository: flit-ai-assistant

RAG Architecture: Retrieval-augmented generation for internal docs
Vector Database: Semantic search with ChromaDB
Chat Interface: Natural language querying of company knowledge
Integration: Embedded in main platform dashboard
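
The retrieval step of a RAG pipeline can be sketched without any external services. The example below is a simplified stand-in, not the flit-ai-assistant implementation: it replaces learned embeddings and ChromaDB with bag-of-words vectors and cosine similarity, and the documents, ids, and function names are made up for illustration.

```python
import math
import re
from collections import Counter

# Toy knowledge base (made-up documents for illustration).
DOCS = {
    "dbt-guide": "How to run dbt models and refresh the marts layer",
    "experiments": "Setting up an A/B test and reading CUPED results",
    "ml-api": "Calling the churn prediction endpoint of the ML API",
}


def vectorize(text):
    """Bag-of-words term counts; a stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, k=1):
    """Return the ids of the k documents most similar to the query."""
    q = vectorize(query)
    ranked = sorted(
        DOCS, key=lambda doc_id: cosine(q, vectorize(DOCS[doc_id])), reverse=True
    )
    return ranked[:k]


def build_prompt(query):
    """Assemble an LLM prompt that names the retrieved source for citation."""
    doc_id = retrieve(query)[0]
    return f"Answer using only this source [{doc_id}]: {DOCS[doc_id]}\n\nQuestion: {query}"


print(build_prompt("how do I call the churn prediction endpoint"))
```

Passing the source id into the prompt is one simple way to let the generated answer cite where its context came from.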

Key Deliverables:

BB% reduction in internal support tickets
Instant access to technical documentation
Context-aware responses with source citations
Scalable knowledge base ingestion

🔧 Development Workflow

Local Development

# Setup development environment
make setup-dev

# Run tests across all components
make test-all

# Lint and format code
make lint-fix

# Start development servers
make dev-start

Data Pipeline Management

# dbt workflow
cd data-platform
dbt deps
