I’m an Applied AI / MLOps Engineer with nearly six years of experience building and operating production-grade AI systems. My work sits at the intersection of backend engineering, distributed systems, and applied machine learning, with a focus on turning research and experimentation into reliable, scalable platforms.
I’ve worked across LLMs, multimodal pipelines, computer vision, and GPU-accelerated inference, with a strong emphasis on system reliability, performance, and cost efficiency in real production environments.
AI Platforms & Infrastructure
- LLM and multimodal inference pipelines
- Low-latency serving (vLLM, TensorRT, GPU optimization)
- Vector search and retrieval systems (Milvus, Elasticsearch)
- Kubernetes-based deployment and scaling
MLOps & Reliability
- Automated evaluation and regression frameworks for LLMs
- Observability, tracing, and quality gates for AI systems
- Distributed task orchestration (Celery, async pipelines)
- Cost and latency optimization under production constraints
Applied ML
- Vision-language models and OCR pipelines
- RAG systems for text, image, video, and audio
- Domain-specific model fine-tuning (LoRA, VLMs)
Current Focus
- Building scalable AI platform primitives (evaluation, orchestration, retrieval)
- Designing governance-friendly AI workflows that are measurable and safe to iterate
- Improving latency, throughput, and cost efficiency of GPU-backed systems
- Exploring agentic and multi-step AI workflows in production settings
Tech Stack
- Languages: Python, Bash
- ML / AI: PyTorch, HuggingFace, TensorRT, vLLM
- Infra: Docker, Kubernetes, Triton, KEDA
- Data & Retrieval: Milvus, Elasticsearch, Redis, SurrealDB
- Backend: FastAPI, Django, Celery
Experience
- Software Development Engineer (Applied AI / MLOps / Backend)
- Founding AI Engineer (startup experience)
- Deep Learning Engineer (computer vision & real-time systems)
Education & Publications
- Academic background in Physics (BS–MS, IISER Mohali)
- Published in Monthly Notices of the Royal Astronomical Society (MNRAS)
This GitHub profile contains:
- Experiments and prototypes around AI systems and tooling
- Infrastructure and backend utilities
- Evaluation frameworks and ML pipelines
- Notes and references from applied research and engineering work
(Some production work lives in private repositories.)
- 📧 Email: nevilshah235@gmail.com
- 💼 LinkedIn: https://linkedin.com/in/nevilshah235
I’m currently open to roles focused on AI platforms, MLOps, and applied AI engineering, particularly in the UAE / Middle East region.


