junaidzeb123/README.md

⚑ About Me

I am an AI/ML Engineer and Computer Science undergrad at FAST-NUCES. I specialize in the full lifecycle of AI systems, from training custom models to deploying them on constrained hardware.

  • 🔭 I’m currently working at Cowlar Design Studio (YC W17), optimizing industrial vision systems.
  • 🌱 I’m currently researching Mobile Edge Inference and Quantization (TFLite/NNAPI) for my FYP, PokeVision.
  • ⚙️ I love GPU Optimization, writing custom CUDA kernels, and minimizing inference latency.
  • 💬 Ask me about TensorRT, Triton Inference Server, FastAPI, and Mobile AI.

🛠️ The Arsenal

Domain | Technologies
🧠 AI & ML | PyTorch, TensorFlow, OpenCV, Scikit-Learn
🚀 Inference & Ops | TensorRT, Triton, Docker, MLflow, DVC
📱 Edge & Mobile | TFLite, Flutter, Android
🖥️ Backend | FastAPI, NestJS, Postgres, Redis
⚑ Languages | Python, C++, TypeScript

🏆 Featured Projects

Real-time Intelligent Mobile Dashcam System

  • Tech: TFLite, NNAPI, Python, Flutter, FastAPI
  • Impact: Achieved real-time lane detection (UFLDv2) on mobile via INT8 quantization and a hybrid cloud/edge architecture.
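To illustrate what INT8 quantization means in general, here is a minimal pure-Python sketch of affine (scale plus zero-point) quantization. It is not the project's pipeline and does not use the TFLite API; the function names `quantize_int8` and `dequantize_int8` are hypothetical, and deriving the range from the minimum and maximum of the input is an assumption made for the example.

```python
def quantize_int8(values):
    """Affine quantization of a list of floats to the int8 range (illustrative)."""
    qmin, qmax = -128, 127
    # Extend the observed range so that 0.0 is always representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to a scale of 1.0 when every value is zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    # The zero point is the integer that represents the real value 0.0.
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    quantized = [
        max(qmin, min(qmax, int(round(v / scale)) + zero_point))
        for v in values
    ]
    return quantized, scale, zero_point


def dequantize_int8(quantized, scale, zero_point):
    """Map int8 codes back to approximate float values."""
    return [(q - zero_point) * scale for q in quantized]
```

Each value is stored in one byte instead of four, at the cost of a rounding error on the order of one quantization step (`scale`).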

Low-level GPU Optimization Implementation

  • Tech: C++, CUDA, Matrix Math
  • Impact: Wrote custom kernels for forward/backward propagation on MNIST, achieving a 40x speedup over CPU.
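For readers unfamiliar with what such kernels compute, the following is a hedged CPU reference sketch of a dense-layer forward pass in plain Python. It is not the repository's CUDA code; the names `dense_forward`, `relu`, and `softmax` are hypothetical. The point is that each output element is an independent dot product, which is the kind of work that can be assigned to one GPU thread.

```python
import math


def dense_forward(x, weights, bias):
    """Compute y = W x + b for a single input vector.

    Each pass of the outer loop is independent of the others, so on a GPU
    every output element could be computed by its own thread.
    """
    out = []
    for row, b in zip(weights, bias):
        acc = b
        for w, xi in zip(row, x):
            acc += w * xi
        out.append(acc)
    return out


def relu(v):
    """Element-wise max(0, a)."""
    return [max(0.0, a) for a in v]


def softmax(v):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]
```

A reference like this is also a common way to check the output of a GPU kernel against known-good values.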

📊 GitHub Stats

streak graph

Profile views

🐍 Contributions

Snake animation

Pinned

  1. multimodal-rag-system Public

    Multimodal RAG system for PDFs using CLIP, FAISS, and Groq. Enables unified semantic search across both text and images with a Streamlit interface.

    Python

  2. MNIST-Acceleration/MNIST-Accleration Public

    Accelerating a neural network with CUDA (Project for HPC, Spring 2025)

    Cuda 1

  3. Canny-Edge-Detection-Optimization-using-CUDA Public

    This repository contains optimized CUDA code for Canny edge detection.

    Cuda

  4. financial-sentiment-analysis Public

    Developed an end-to-end financial sentiment analysis system using FinBERT and Mistral-7B. Implemented Retrieval-Augmented Generation (RAG) with FAISS to boost LLM zero-shot performance and fine-tun…

    Jupyter Notebook

  5. trash-detection-yolo-unet Public

    YOLOv8 object detection + U-Net semantic segmentation for waste classification. Features custom and transfer learning implementations with 73% performance improvement on TACO dataset.

    Jupyter Notebook

  6. Semantic-Product-Search-and-Rank--ing Public

    Comparative analysis of three semantic search architectures (TF-IDF, Word2Vec, BERT Bi-Encoder) on the Amazon ESCI dataset. Includes FAISS indexing and a Streamlit demo.

    Jupyter Notebook
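
As a rough illustration of the simplest architecture named in item 6 above, the sketch below ranks documents against a query using TF-IDF vectors and cosine similarity in plain Python. It is not taken from the repository: the function names, the whitespace tokenizer, and the smoothed IDF formula are assumptions chosen for brevity, and it uses neither FAISS nor a neural encoder.

```python
import math
from collections import Counter


def build_tfidf(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumption for this sketch).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()}
        for toks in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def search(query, docs, top_k=3):
    """Rank docs by cosine similarity to the query; returns (doc, score) pairs."""
    vectors, idf = build_tfidf(docs)
    counts = Counter(query.lower().split())
    # Terms never seen in the corpus are ignored.
    q_vec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = sorted(
        ((cosine(q_vec, v), i) for i, v in enumerate(vectors)),
        reverse=True,
    )
    return [(docs[i], score) for score, i in scored[:top_k]]
```

A dense-embedding approach replaces the sparse dicts with fixed-length vectors and the linear scan with an index, but the ranking step (score every candidate, keep the top `top_k`) is conceptually the same.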