chonzadaniel/README.md

👋 Hi there, I'm Emmanuel Daniel Chonza

🚀 Data Scientist | Generative AI Practitioner | LLM/RAG Engineer

I build end-to-end AI systems with real-world impact, ranging from LLM fine-tuning and image classification apps to retrieval-augmented generation (RAG) pipelines and AI-assisted job search agents. My work integrates machine learning, deep learning, and natural language processing (NLP) with tooling such as OpenAI, HuggingFace, Streamlit, CrewAI, and ChromaDB.


🧠 What I Do

  • ⚙️ Train and fine-tune LLMs for domain-specific tasks (e.g., sentiment analysis, resumes, instructions).
  • 🤖 Develop computer vision applications using ResNet, VGG, and EfficientNet.
  • 💻 Train and deploy supervised machine learning models, including regression and other predictive models.
  • 🔍 Build advanced RAG pipelines using ChromaDB, FAISS, and OpenAI APIs.
  • 🧪 Experiment with PEFT techniques (LoRA, QLoRA, IA3) and preference tuning (DPO) on real-world datasets.
  • 📊 Design data science workflows: MLflow tracking, feature engineering, and model evaluation.
  • 🌐 Deploy AI apps with Streamlit, FastAPI, Slack bots, and RESTful APIs.
  • ䷒ Design and build monitoring and evaluation (M&E) systems driven by a results-based management approach: craft Theories of Change, Results Frameworks, and M&E Plans; analyze and visualize data; and build dynamic dashboards.

🔭 Current Work

  • 🔬 Fine-tuning and evaluating LLMs on domain-specific sentiment and intent classification.
  • 🧱 Implementing MLOps/LLMOps pipelines for scalable experimentation.
  • 📈 Improving fairness in ML models trained on imbalanced datasets.
  • 🧠 Prompt engineering for grounded AI output with fewer hallucinations.
  • 🎯 Deploying AI apps with Streamlit frontends backed by LangChain and LlamaIndex.

🧩 Featured Projects

Streamlit-powered GenAI App to retrieve and summarize research papers (PDFs) using a multi-vector retriever, ChromaDB, GPT-4o, and web-augmented generation.
PDF Ingestion → Chunking → Embedding → Retrieval → Generation → UI
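
The chunking, embedding, retrieval, and generation stages above can be illustrated with a minimal, standard-library-only sketch. It is not the app's actual code: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for ChromaDB, and the function names and the chunk sizes are assumptions chosen for illustration. The prompt it builds would be passed to an LLM in a real pipeline.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative assumptions)."""
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def embed(text):
    """Toy bag-of-words vector; a real pipeline would call an embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query, context_chunks):
    """Assemble the grounded prompt that would be sent to the LLM."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```

Keeping each stage as a separate function makes it straightforward to swap the toy pieces for a vector store and a hosted model without touching the rest of the flow.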


Fine-tuned ResNet50 model (transfer learning) trained on the 120 breeds of the Stanford Dogs dataset, reaching >80% validation accuracy. The Streamlit app UI predicts the breed from an uploaded .jpeg/.png image.


💼 [Resume & Job Application Advisor]

Agentic Streamlit app powered by CrewAI and open-source LLMs. It guides users through:

  • Resume feedback.
  • Tailored job openings.
  • Cover letter generation.
  • Interview Q&A.

💳 [Credit Card Fraud Detector]

Robust ML pipeline for highly imbalanced datasets, including:

  • Stratified train/test splitting.
  • Oversampling (SMOTE).
  • GridSearch + XGBoost.
  • ROC-AUC, confusion matrix.
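
Two of the steps listed above, stratified splitting and ROC-AUC, can be sketched in plain Python. This is an illustrative sketch, not the project's code (which presumably relies on library implementations); the function names are hypothetical, and SMOTE and the XGBoost grid search are deliberately left out.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with every class split in the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one example of each class in the test set.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 5% positive (fraud) class.
labels = [0] * 95 + [1] * 5
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx), sum(labels[i] for i in test_idx))
```

When oversampling is added, it is usually applied to the training indices only, after the split, so that synthetic or duplicated minority examples do not leak into the test set.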

🐦 [Racist Tweet Classifier]

NLP workflow with:

  • SymSpell spell correction.
  • Stratified cross-validation.
  • Oversampling.
  • Streamlit UI for public demo.
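
The list above does not say which oversampling method is used. As one possibility, random oversampling (duplicating minority-class examples until the classes are balanced) can be sketched as follows; the function name and the sample tweets are hypothetical.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class examples at random until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_texts, out_labels = list(texts), list(labels)
    for label, count in counts.items():
        pool = [t for t, l in zip(texts, labels) if l == label]
        for _ in range(target - count):
            out_texts.append(rng.choice(pool))
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical, tiny imbalanced dataset: label 1 is the minority class.
texts = ["tweet a", "tweet b", "tweet c", "tweet d", "tweet e"]
labels = [0, 0, 0, 0, 1]
balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

In a cross-validated workflow this would typically run inside each training fold rather than on the full dataset, so the validation folds keep the original class distribution.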

🚗 [Used Car Price Prediction]

Regression pipeline using XGBoost, feature engineering, and marketplace data (brand, model, mileage, engine size, etc.).


🧪 [Parameter Efficient Fine-Tuning (PEFT)]

Experiments with LoRA, QLoRA, IA3, and DPO on binary sentiment tasks using HuggingFace Transformers + bitsandbytes.
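
To show why LoRA is parameter-efficient, the short calculation below compares trainable parameter counts for a single linear layer. LoRA freezes the original `d_out × d_in` weight and trains two low-rank factors of shapes `d_out × r` and `r × d_in`. The layer size and rank here are assumed values for illustration, not settings taken from these experiments.

```python
def full_params(d_in, d_out):
    """Trainable weights when fine-tuning the whole matrix (bias ignored)."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Trainable weights for LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_in + d_out)


# Assumed values for illustration: a 768 x 768 projection with rank 8.
d_in = d_out = 768
rank = 8

full = full_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)
print(f"LoRA: {lora:,} vs full: {full:,} ({lora / full:.2%})")
# prints: LoRA: 12,288 vs full: 589,824 (2.08%)
```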


🖼️ [FoodVision & DogVision]

Custom CNN and pretrained ResNet models trained on:

  • 🍣 Food101 (sushi, pizza, steak...).
  • πŸ• Stanford Dog Breeds (with label mapping & confidence overlay).

📦 Coming Soon

  • 💬 Multi-turn chatbot with memory + web search + RAG.
  • 🧑‍💼 Job Application Assistant v2 (LangGraph-powered).
  • 🛰️ LLM inference microservices (FastAPI + LangServe).
  • 🧬 BGE-Large + Llama3 RAG for scientific documents.

📫 Reach Me


🛠️ Tech Stack

Languages: Python, R, SQL, Markdown

Algorithms: LLMs, ML, NLP, Transformers, CNNs, ANNs, RNNs, LSTMs, GANs

Frameworks & Tools: PyTorch, scikit-learn, Transformers, Streamlit, MLflow, FastAPI, LangChain, LlamaIndex, ChromaDB, OpenAI, HuggingFace, Plotly, Matplotlib, seaborn, CrewAI, crewai-tools, APIs

MLOps: MLflow, wandb, Docker, Conda, Git, Kaggle, AWS

Deployment: Hugging Face Spaces, Streamlit Cloud, Slack, local API, Render, AWS

IDEs and Editors: Jupyter, Google Colab, PyCharm, Visual Studio Code, Kaggle, Sublime Text, Thonny


✨ Motto

“Build. Evaluate. Iterate. Deploy. Share.”

Let’s collaborate on AI that matters. Feel free to explore my work or reach out!

Popular repositories

  1. ChatGPT-repository (Public)

  2. ChatGPT (Public)

  3. notebook (Public)

    Forked from jupyter/notebook

    Jupyter Interactive Notebook

    Language: Jupyter Notebook

  4. MLproject (Public)

    Project Coding

    Language: Jupyter Notebook

  5. khu-FinalProject (Public)

    Language: Jupyter Notebook

  6. Credit-card-FraudDetection (Public)

    Submission of Project

    Language: Jupyter Notebook