“Working with Chaitanya has been an absolute pleasure. He is an outstanding Data Scientist with deep expertise in GenAI, Agentic AI, and full-stack data science. His professionalism, ownership mindset, and consistent reliability make him the go-to person for complex or high-priority tasks. He thinks beyond the immediate requirement, often identifying long-term scalable solutions that create real business impact. He is also a collaborative and supportive teammate who elevates the entire team’s capability. I wholeheartedly recommend Chaitanya — any organisation would be fortunate to have someone with his mindset and technical depth.”
About
Experienced Data Scientist and Machine Learning…
Experience & Education
Licenses & Certifications
-
OSHA 10-Hour Outreach Training Program - Construction
International Association for Continuing Education & Training (IACET)
Issued, Credential ID 36-005992810 -
Solidworks Professional - Mechanical Design
[M]atrix CAD Academy
Issued -
ANSYS PROFESSIONAL (Innovent-ANSYS Certified)
[M]atrix CAD Academy
Issued -
-
Rio+20 Certification program: A short term course on ‘Sustainable Development’
United Nations Conference on Sustainable Development (UNCSD)
Projects
-
Customer Segmentation Using RFM Analysis for Online Retail
-
This repository showcases RFM (Recency, Frequency, and Monetary) segmentation to analyze customer behavior and provide insights for targeted marketing campaigns. By classifying customers based on their purchasing patterns, strategies can be tailored to improve customer retention, drive growth, and maximize the lifetime value of each customer.
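As an illustration of the RFM idea described above, here is a minimal sketch in plain Python. It is not the repository's code: the function names, the three-bin rank-based scoring, and the sample transactions are all assumptions made for this example.

```python
from datetime import date


def rfm_table(transactions, as_of):
    """Aggregate (customer_id, purchase_date, amount) rows into per-customer R, F, M."""
    table = {}
    for customer_id, purchase_date, amount in transactions:
        days_since = (as_of - purchase_date).days
        row = table.setdefault(
            customer_id, {"recency": days_since, "frequency": 0, "monetary": 0.0}
        )
        row["recency"] = min(row["recency"], days_since)  # days since latest purchase
        row["frequency"] += 1                             # number of purchases
        row["monetary"] += amount                         # total spend
    return table


def score(values, higher_is_better=True, bins=3):
    """Rank-based score from 1 (worst) to `bins` (best); ties keep insertion order."""
    ordered = sorted(values, key=lambda k: values[k], reverse=not higher_is_better)
    n = len(ordered)
    return {key: 1 + (rank * bins) // n for rank, key in enumerate(ordered)}


def rfm_scores(table, bins=3):
    """Return {customer_id: (R, F, M)}; a lower recency earns a higher R score."""
    r = score({c: v["recency"] for c, v in table.items()}, higher_is_better=False, bins=bins)
    f = score({c: v["frequency"] for c, v in table.items()}, bins=bins)
    m = score({c: v["monetary"] for c, v in table.items()}, bins=bins)
    return {c: (r[c], f[c], m[c]) for c in table}


# Hypothetical sample data, for illustration only.
sample = [
    ("A", date(2024, 1, 30), 100.0),
    ("A", date(2024, 1, 10), 50.0),
    ("A", date(2024, 1, 20), 50.0),
    ("B", date(2023, 12, 1), 20.0),
    ("C", date(2024, 1, 15), 500.0),
    ("C", date(2024, 1, 5), 100.0),
]
table = rfm_table(sample, as_of=date(2024, 1, 31))
scores = rfm_scores(table)
```

Customers with high scores on all three dimensions are candidates for loyalty campaigns, while a low recency score paired with high monetary value can flag valuable customers at risk of lapsing.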
-
HR Policy Query Resolution System using Retrieval-Augmented Generation (RAG) pipeline
-
This project aims to design a Retrieval-Augmented Generation (RAG) system for efficient, context-aware resolution of HR policy queries, using document retrieval and text generation to extract and synthesize relevant information from policy documents.
-
Cross Platform Product Mapping Algorithm for Products
-
This repository contains a product ID mapping solution using TF-IDF vectorizer for weighted text vectors, Facebook AI Similarity Search (FAISS) for coarse filtering with cosine similarity, and Levenshtein distance for refined matching against the Blinkit catalog. Achieved 11.45% match for Zepto and 11.48% for Instamart.
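The two-stage matching described above (coarse filtering by cosine similarity over TF-IDF vectors, then refinement by Levenshtein distance) can be sketched in plain Python. This is an assumption-laden illustration, not the repository's code: a brute-force cosine ranking stands in for FAISS, the TF-IDF weighting is a simple smoothed variant, and the function names and product names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return L2-normalised TF-IDF vectors (as dicts) for a list of strings."""
    tokenised = [doc.lower().split() for doc in docs]
    df = Counter(tok for toks in tokenised for tok in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenised:
        tf = Counter(toks)
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two L2-normalised sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def levenshtein(s, t):
    """Edit distance via the standard two-row dynamic programme."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cs != ct)))
        prev = cur
    return prev[-1]


def match(query, catalog, top_k=2):
    """Coarse filter: top_k catalog items by cosine; refine: smallest edit distance."""
    vectors = tfidf_vectors(catalog + [query])
    q = vectors[-1]
    ranked = sorted(range(len(catalog)), key=lambda i: cosine(q, vectors[i]), reverse=True)
    candidates = [catalog[i] for i in ranked[:top_k]]
    return min(candidates, key=lambda name: levenshtein(query.lower(), name.lower()))


# Hypothetical catalog entries, for illustration only.
catalog = ["Amul Butter 100 g", "Amul Cheese Slices 200 g", "Tata Salt 1 kg", "Britannia Bread 400 g"]
best = match("amul butter 100g", catalog)
```

The coarse stage keeps the expensive edit-distance comparison to a handful of candidates per query, which is the role an approximate nearest-neighbour index such as FAISS would play at catalog scale.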
-
Autistic Spectrum Disorder (ASD) Detection using Machine Learning
-
This project aims to develop a robust classification model using the test-taker's demographics data and questionnaire responses from the ASD screening dataset to accurately identify individuals with Autistic Spectrum Disorder (ASD) through optimization of performance metrics.
-
Deploy and Monitor a Machine Learning Workflow for Image Classification using Amazon SageMaker
-
The primary objective of this project was to build and deploy an image classification model for 'Scones Unlimited', a scone-delivery-focused logistics company, using AWS SageMaker and associated AWS Cloud Services.
-
Predict Bike Sharing Demand with AutoGluon (AWS's open-source AutoML Library)
-
This project uses AutoGluon, AWS's open-source AutoML library, to predict bike sharing demand on the Kaggle Bike Sharing Demand dataset.
-
Multi-Input Multi-Output MNIST Image Digit and Summed Output Classification
-
• Developed a deep neural network that takes an MNIST handwritten digit image (0-9) and a random digit (0-9) as inputs and returns two outputs: the predicted class label (0-9) for the image and the label for the sum of that digit and the random digit (range 0-18).
• The custom-built multi-input multi-output neural network achieved test accuracies of 99.54% on digit image classification and 99.54% on summed-output label classification. -
Landmark Detection and Tracking using Graph SLAM
-
• Utilized feature detection and key-point descriptors to build a map of a 2-dimensional environment with Graph SLAM
• Implemented a robust method for tracking an object over time using elements of probability, motion models, and linear algebra -
Automated Image Captioning (CNN + RNN model)
-
Built a CNN-RNN encoder-decoder architecture, fine-tuned and trained on the COCO (Common Objects in Context) dataset, to automatically generate captions for a given image
-
Facial Key-points Detection
-
• Incorporated OpenCV's pre-trained Haar Cascade classifier to detect faces in an image
• Designed, fine-tuned, and trained a custom CNN, simple relative to SOTA model architectures, that reached a SmoothL1Loss close to zero. The model accurately detects facial key points (such as the eyes, corners of the mouth, and nose) in the detected face regions -
Fraud Analytics – Credit Card Fraud Detection
-
• Conducted EDA on an anonymized, imbalanced credit card transactions dataset and addressed class imbalance with up-sampling techniques such as Random Oversampling, SMOTE, and ADASYN
• Developed, fine-tuned, and trained several ML models (Logistic Regression, Decision Tree, Random Forest, XGBoost, SVM, and KNN classifiers) on the up-sampled datasets to predict fraudulent credit card transactions
• Of all the models trained, the random forest model trained on a SMOTE-balanced dataset with stratified K-fold cross-validation achieved the highest ROC AUC and recall on the test dataset: 98.54% and 84.7%, respectively -
Maximizing profit of cab drivers using Deep Reinforcement Learning
-
-
Trained an agent to play numerical Tic-Tac-Toe
-
Trained an RL agent using Q-learning such that it beats its opponent every time it plays numerical Tic-Tac-Toe
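For readers unfamiliar with Q-learning, the core of the method is a single tabular update rule. The sketch below shows that rule only; it is not the project's code, and the state and action encodings, the function name, and the default learning rate and discount factor are assumptions chosen for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new Q-value.

    q            -- mapping from (state, action) to estimated return
    next_actions -- legal actions in next_state (empty if the game is over)
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: states and actions are plain strings here.
q = defaultdict(float)
first = q_update(q, "s0", "a", reward=1.0, next_state="s1", next_actions=[])
q[("s1", "b")] = 2.0
second = q_update(q, "s0", "a", reward=0.0, next_state="s1", next_actions=["b"], alpha=0.5, gamma=0.5)
```

In a two-player game such as numerical Tic-Tac-Toe, `next_state` would typically be the board the agent sees after the opponent has replied, so the opponent's behaviour is folded into the environment.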
-
Hand Gesture Recognition using Deep Learning (Video Classification)
-
• Designed a highly accurate CNN that could be used in a smart TV to recognize five different user gestures, letting users control the TV without a remote control
• Among all the configurations tried (3D CNN, CNN-RNN stack, and transfer-learning-based models), a MobileNet + LSTM model performed best, with training and validation accuracies of 95.24% and 93.75%, respectively -
Built a Restaurant Search Chatbot (integrated with Zomato API) using RASA
-
• Developed a restaurant search chatbot with RASA using the Zomato API to help customers locate restaurants across several Indian cities based on their preferences, such as location, cuisine, and budget
• Deployed this restaurant search chatbot on Slack and WhatsApp Messenger using Slack API and Twilio, respectively -
Syntactic Processing Part-of-Speech Tagger for tagging unknown words
-
• Built multiple Hidden Markov Model (HMM)-based Part-of-Speech (POS) taggers, implementing the Viterbi algorithm to assign POS tags to unknown words in a corpus
• Developed a highly accurate HMM-Viterbi POS tagger with a tagging accuracy of 95.81% on the validation dataset. It combines multiple taggers: Lexicon, Affix, Rule-Based, and default taggers -
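The Viterbi decoding step mentioned above can be sketched as follows. This is a toy illustration under stated assumptions, not the project's tagger: the probabilities are made up, and unknown words are handled with a small constant emission probability (`unknown_p`) rather than the lexicon, affix, and rule-based back-off taggers the project describes.

```python
def viterbi(words, tags, start_p, trans_p, emit_p, unknown_p=1e-6):
    """Return the most probable tag sequence for `words` under a first-order HMM."""
    # best[t][tag] = (probability of the best path ending in `tag` at position t,
    #                 previous tag on that path)
    best = [{tag: (start_p[tag] * emit_p[tag].get(words[0], unknown_p), None) for tag in tags}]
    for word in words[1:]:
        column = {}
        for tag in tags:
            # Pick the predecessor tag that maximises the path probability.
            prob, prev = max((best[-1][p][0] * trans_p[p][tag], p) for p in tags)
            column[tag] = (prob * emit_p[tag].get(word, unknown_p), prev)
        best.append(column)
    # Backtrack from the most probable final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


# Hypothetical toy model, for illustration only.
tags = ["NOUN", "VERB"]
start_p = {"NOUN": 0.8, "VERB": 0.2}
trans_p = {"NOUN": {"NOUN": 0.3, "VERB": 0.7}, "VERB": {"NOUN": 0.8, "VERB": 0.2}}
emit_p = {"NOUN": {"dogs": 0.5, "cats": 0.4}, "VERB": {"chase": 0.6, "dogs": 0.05}}

known = viterbi(["dogs", "chase", "cats"], tags, start_p, trans_p, emit_p)
unknown = viterbi(["dogs", "chase", "zebras"], tags, start_p, trans_p, emit_p)
```

With a flat emission probability for unseen words, the tag of an unknown word is decided by the transition probabilities alone, which is why a dedicated unknown-word tagger can improve accuracy.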
Telecom Churn Analysis
-
• Identified the top churn predictors from customer-level data (from a leading telecom operator in India and Southeast Asia) using EDA, feature engineering, feature selection (RFE), and dimensionality reduction (PCA)
• Mitigated class imbalance in the training dataset using the hybrid ‘SMOTE and Tomek Links’ technique and retained significant variables using PCA to build multiple logistic regression and tree-based ML models (such as Random Forest and XGBoost)
• Proposed a churn prediction solution by fine-tuning the Logistic Regression (with PCA) model, which achieved ROC AUC, accuracy, and recall of 89.62%, 84%, and 80% on the test dataset, respectively -
House Price Prediction Model for a US-based Housing Company
-
• Conducted EDA to analyze and garner insights into the house price data acquired from a US-based housing company
• Built regularized regression models (Ridge and Lasso regression) with R² scores of approximately 90%
• Employed Recursive Feature Elimination (RFE) to shortlist the top 10 features that would assist the company in buying the houses at a lower price before flipping them at a higher price -
Car Price Prediction Model for an Automobile Consulting Firm
-
• Performed feature engineering and EDA to transform car price data and extract insights from it
• Eliminated multicollinearity by dropping dependent features and built a good-fit linear regression model achieving R-squared scores of 90.88% and 89.93% on the train and test sets, respectively -
Lending Club - Data Analysis for a Consumer Finance Company
-
• Conducted data mining to analyze the loan data gathered from a consumer finance company
• Identified and reported the consumer and loan attributes that influence the likelihood of loan default -
Investment Analysis for an Asset Management Company
-
Analyzed data for an asset management firm, and determined where the company should invest based on the firm’s prescribed constraints
-
Lean ways to minimize food wastage in household, and eat healthy on budget, USC
-
• Implemented lean principles to develop a faster and cost-efficient food management system
• Conducted a seven-week experimental study to gather feedback on the system and analyzed it on a weekly basis
• Incorporated lean tools to develop the existing system and reduced the total food costs/person/household by 49.28%
• Achieved 21.49% net reduction in process time, and 25% net reduction in lead time in the new system -
Bike Theft Prevention System Design, USC
-
• Designed and developed the concept of a traditional bike rack system using the ‘Innovative Design Thinking’ (IDT) framework
• Identified, prioritized, and translated the stated and unstated customer requirements into product design using the Quality Function Deployment (QFD) approach and the TRIZ problem-solving method
-
Material Selection for the design of a Light, Stiff deck panel of a skateboard, USC
-
• Surveyed and screened materials using the CES EduPack software and analyzed conventional materials
• Proposed a replacement that met the design requirements and cost constraints
-
Application of Electromagnetism in contemporary engines to eliminate Internal Combustion, BVCoE
-
Honors & Awards
-
Star Performer (Best Debutant) Award
Reliance Retail
-
Scholarship Recipient - 2022 AWS Machine Learning Engineer Nanodegree Scholarship Program
Amazon Web Services (AWS)
Granted Full Scholarship towards AWS x Udacity’s 2022 Machine Learning Engineer Nanodegree Scholarship Program
-
Top Performer Award - M.Sc. Machine Learning
Liverpool John Moores University
Awarded for academic excellence in M.Sc. (Research Thesis) - Artificial Intelligence and Machine Learning program
-
Top Performer Award (PGD- ML & AI)
International Institute of Information Technology, Bangalore (IIIT-B) in collaboration with upGrad
Awarded for academic excellence in Post Graduate Diploma in Machine Learning and Artificial Intelligence program
-
Hydraulic Machinery - Course Topper
Bharati Vidyapeeth College of Engineering-(University of Mumbai)
-
Machine Design-1 - Course Topper
Bharati Vidyapeeth College of Engineering-(University of Mumbai)
-
Mechanical Measurements and Metrology - Course Topper
Bharati Vidyapeeth College of Engineering-(University of Mumbai)
-
Mechatronics - Course Topper
Bharati Vidyapeeth College of Engineering-(University of Mumbai)
-
Production Process - Course Topper
Bharati Vidyapeeth College of Engineering-(University of Mumbai)
-
Ratan Tata Scholarship
Sir Ratan Tata Trust
Awarded for Academic Excellence in the Sophomore year (2011-12)
-
Ratan Tata Scholarship
Sir Ratan Tata Trust
Awarded for Academic Excellence in the Junior year (2012-13)
Languages
-
English
Full professional proficiency
-
Marathi
Native or bilingual proficiency
-
Hindi
Native or bilingual proficiency
Organizations
-
M.E.S.A (Mechanical Engineering Students Association)
Treasurer
-
Recommendations received
8 people have recommended Chaitanya