- New York University
- Brooklyn, New York City
- https://nagharjun17.github.io/
- in/nagharjun-mathi-mariappan-b61499169
Pinned
- MCP-Ollama-Client (Public): Lightweight MCP client that uses a local Ollama LLM to query multiple MCP servers defined in config.json
- CUDA-Custom-Kernels (Public): My CUDA kernel implementations and benchmarks, such as tiled matrix multiplication, written for learning.
Cuda
- ECE-GY-9143---High-Performance-Machine-Learning (Public): Laboratory and project work for the course ECE-GY 9143 - High Performance Machine Learning
- Flash-Attention-Triton (Public): Codebase for a Flash Attention implementation in Triton.
Python
- MLIR-to-PTX-CUDA (Public): An MLIR dialect that fuses addition and ReLU, lowers to NVVM and LLVM IR, and generates PTX to run the kernel on a CUDA GPU
C++
- Multimodal-Architecture-Optimisation-on-RTX3060-using-TVM (Public): Codebase for optimizing a vision-to-text model on a target RTX 3060 device using Apache TVM
Python
