I’m an AI researcher specializing in LLM training (SFT, RL, IFT, GRPO, DPO) and distributed training (PyTorch, Flash Attention, vLLM, TensorFlow), with a focus on model behaviour and LLM security.
Recently, I trained an open-source model that achieved 5% accuracy on ARC-AGI-2, a benchmark for testing fluid intelligence in LLMs. This work earned $2,500 in compute credits to iterate on and improve the technique.
I thrive at the intersection of cutting-edge research and real-world impact. I enjoy collaborating with cross-functional teams to translate novel AI techniques into scalable, trustworthy solutions. Let’s connect if you’re interested in AI alignment, reinforcement learning, or next-generation language models.
