
Hi, I’m Nevil 👋

I’m an Applied AI / MLOps Engineer with nearly six years of experience building and operating production-grade AI systems. My work sits at the intersection of backend engineering, distributed systems, and applied machine learning, with a focus on turning research and experimentation into reliable, scalable platforms.

I’ve worked across LLMs, multimodal pipelines, computer vision, and GPU-accelerated inference, with a strong emphasis on system reliability, performance, and cost efficiency in real production environments.


What I work on

  • AI Platforms & Infrastructure

    • LLM and multimodal inference pipelines
    • Low-latency serving (vLLM, TensorRT, GPU optimization)
    • Vector search and retrieval systems (Milvus, Elasticsearch)
    • Kubernetes-based deployment and scaling
  • MLOps & Reliability

    • Automated evaluation and regression frameworks for LLMs
    • Observability, tracing, and quality gates for AI systems
    • Distributed task orchestration (Celery, async pipelines)
    • Cost and latency optimization under production constraints
  • Applied ML

    • Vision-language models and OCR pipelines
    • RAG systems for text, image, video, and audio
    • Domain-specific model fine-tuning (LoRA, VLMs)

Current focus

  • Building scalable AI platform primitives (evaluation, orchestration, retrieval)
  • Designing governance-friendly AI workflows that are measurable and safe to iterate on
  • Improving latency, throughput, and cost efficiency of GPU-backed systems
  • Exploring agentic and multi-step AI workflows in production settings

Tech stack (frequently used)

  • Languages: Python, Bash
  • ML / AI: PyTorch, HuggingFace, TensorRT, vLLM
  • Infra: Docker, Kubernetes, Triton, KEDA
  • Data & Retrieval: Milvus, Elasticsearch, Redis, SurrealDB
  • Backend: FastAPI, Django, Celery

Background

  • Software Development Engineer (Applied AI / MLOps / Backend)
  • Founding AI Engineer (startup experience)
  • Deep Learning Engineer (computer vision & real-time systems)
  • Academic background in Physics (BS–MS, IISER Mohali)
  • Published in Monthly Notices of the Royal Astronomical Society (MNRAS)

What you’ll find here

This GitHub profile contains:

  • Experiments and prototypes around AI systems and tooling
  • Infrastructure and backend utilities
  • Evaluation frameworks and ML pipelines
  • Notes and references from applied research and engineering work

(Some production work lives in private repositories.)


Get in touch

I’m currently open to roles focused on AI platforms, MLOps, and applied AI engineering, particularly in the UAE / Middle East region.

Popular repositories

  1. HashingDeepLearning (Public)

    Forked from keroro824/HashingDeepLearning

    Codebase for "SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems"

    C++

  2. ARFF (Public)

    Forked from teju85/ARFF

    ARFF formatted file reader in C++

    C++

  3. deep-learning-coursera (Public)

    Forked from Kulbear/deep-learning-coursera

    Deep Learning Specialization by Andrew Ng on Coursera.

    Jupyter Notebook

  4. numpy-posit (Public)

    Forked from xman/numpy-posit

    posit (unum type III) integrated Numpy

    C

  5. PySigmoid (Public)

    Forked from mightymercado/pysigmoid

    A Python Implementation of Posits and Quires (Drop-in replacement for IEEE Floats)

    Python

  6. HackerEarth_Gala_Image_Challenge (Public)

    Jupyter Notebook