San Francisco Bay Area
474 followers 454 connections

About

Results-driven Data Scientist with 4+ years of experience transforming massive-scale…

Experience & Education

  • SEIU-UHW West & Joint Employer Education Fund

Courses

  • Advanced SQL

    -

  • Big Data Technologies and Applications

    Data 228

  • Data Mining

    Data 240

  • Data Structure

    -

  • Data Visualization

    Data 230

  • Database Systems for Analysis

    Data 225

  • Deep Learning

    Data 255

  • Machine Learning

    Data 245

  • Mathematical Methods for Data Analysis

    Data 220

  • Object Oriented Programming with C++

    -

  • Object Oriented Programming with JAVA

    -

Projects

  • AI Data Quality Copilot

    -

    An open-source platform that monitors any DuckDB warehouse for schema drift, statistical anomalies, and data quality issues — then uses Claude AI to explain what broke and how to fix it.

    Key features:

    Z-score based anomaly detection on null %, row count, mean, and std dev
    Schema drift detection (renamed, dropped, type-changed columns)
    LLM root-cause analysis with structured JSON output and severity ratings
    Auto-generated dbt-compatible data quality tests from 30-day profiles
    Slack alerts with colour-coded severity and fix suggestions
    Interactive pipeline lineage graph (NetworkX + Pyvis)
    Streamlit dashboard + CLI tool (dq_copilot.py)
    YAML-based config — works with any DuckDB database, no code changes needed
    Tech: Python, DuckDB, Pandas, SciPy, Anthropic Claude API, Streamlit, Slack Webhooks, NetworkX, Pytest (68 tests)

    GitHub: github.com/nirvishagarara/ai-data-quality-copilot
    Demo: https://ai-data-quality-copilot-dashboard.streamlit.app
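
    The z-score check listed among the key features can be illustrated with a minimal sketch. This is not the project's code: the function name, the 3.0 threshold, and the sample metric values are assumptions chosen for illustration, and only the Python standard library is used.

```python
from statistics import mean, stdev

def zscore_anomalies(history, latest, threshold=3.0):
    """Flag metrics in `latest` whose z-score against `history` exceeds threshold.

    history: dict mapping metric name -> list of past profile values
    latest:  dict mapping metric name -> most recent value
    (Both shapes are assumptions made for this sketch.)
    """
    flagged = {}
    for metric, values in history.items():
        if metric not in latest or len(values) < 2:
            continue  # not enough data to estimate a spread
        mu = mean(values)
        sigma = stdev(values)
        if sigma == 0:
            # Constant baseline: the z-score is undefined, so treat any change as anomalous.
            if latest[metric] != mu:
                flagged[metric] = float("inf")
            continue
        z = (latest[metric] - mu) / sigma
        if abs(z) > threshold:
            flagged[metric] = z
    return flagged

# Hypothetical profile history for one table
history = {
    "row_count": [1000, 1010, 990, 1005, 995],
    "null_pct": [0.010, 0.012, 0.011, 0.009, 0.010],
}
latest = {"row_count": 400, "null_pct": 0.011}
anomalies = zscore_anomalies(history, latest)
```

    Here only `row_count` is flagged, because its latest value sits far outside the historical spread while `null_pct` stays within it.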

  • Weapon Detection using Deep Learning

    -

    Built VGG and AlexNet convolutional neural network (CNN) models to distinguish weapons such as guns and knives from other objects, and compared the two models' accuracy in detecting weapons.
    Collected image data from Google and labeled it into three classes with a Python script. Converted the images to NumPy arrays, normalized them, and split the data into training, validation, and test sets.
    Performed hyperparameter tuning with ReLU activation and cross-entropy loss, achieving 88% and 90% accuracy in classifying weapons with VGG and AlexNet respectively.
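
    The normalization and train/validation/test split can be sketched as follows. This is an illustration under stated assumptions, not the project's pipeline: plain Python lists stand in for the NumPy arrays, and the 70/15/15 split fractions, the function names, and the toy data are all hypothetical.

```python
import random

def normalize(image):
    """Scale 8-bit pixel values (0-255) into the range [0.0, 1.0]."""
    return [[pixel / 255.0 for pixel in row] for row in image]

def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle samples and split them into train, validation, and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, val, test

# Hypothetical toy data: 20 tiny 2x2 grayscale "images" with class labels 0, 1, 2
samples = [([[0, 255], [128, 64]], i % 3) for i in range(20)]
normalized = [(normalize(img), label) for img, label in samples]
train, val, test = split_dataset(normalized)
```

    Shuffling before splitting keeps the three classes mixed across the sets; a stratified split would be a reasonable refinement when classes are imbalanced.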

  • Cryptocurrency Price Prediction using Machine Learning

    -

    - Predicts cryptocurrency prices for the next 30 days and estimates future price fluctuations.
    - Uses live data from CryptoCompare’s API to fetch updated cryptocurrency prices every day.
    - Developed the ML model using Long Short-Term Memory (LSTM), a type of Recurrent Neural Network (RNN).
    - Performed hyperparameter tuning to improve the model, with ReLU as the activation function and Mean Squared Error (MSE) as the loss function.
    - Used Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE) to validate predictions.
    - Used Python pandas for data cleaning, scikit-learn and TensorFlow for the ML model, and Matplotlib for visualization.
    - Used MinMaxScaler to scale the larger values in the dataset so that accuracy would not be affected.
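
    The scaling and validation metrics named above can be written out in a few lines. This is a standard-library sketch of the formulas only, not the project's code, which used scikit-learn and TensorFlow; the function names and the sample prices are hypothetical.

```python
import math

def min_max_scale(values):
    """Rescale values linearly into [0, 1]; also return the (min, max) used."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values], (lo, hi)  # constant series
    return [(v - lo) / span for v in values], (lo, hi)

def regression_errors(actual, predicted):
    """Return MSE, RMSE, and MAE for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    diffs = [a - p for a, p in zip(actual, predicted)]
    mse = sum(d * d for d in diffs) / len(diffs)
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}

# Hypothetical closing prices and model predictions, for illustration only
prices = [100.0, 150.0, 200.0]
scaled, (lo, hi) = min_max_scale(prices)
errors = regression_errors([100.0, 150.0, 200.0], [110.0, 150.0, 190.0])
```

    Keeping the `(min, max)` pair makes it possible to map scaled predictions back to price units before computing errors.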

  • Cryptocurrency Price Prediction using Machine Learning

    -

    Predicted customers' responses to an organization's marketing campaigns based on multiple features.
    Performed data engineering on marketing campaign data with 30+ features/dimensions. Used Random Forest and regression-based feature selection to pick the top 10 most influential features.
    Built XGBoost, Random Forest, and KNN classification models with 91%, 90%, and 87% accuracy respectively in predicting customers' responses to the marketing campaign.

  • Big Data Analysis on New York Taxi Data

    -

    - Worked on dynamic optimization strategies for on-demand rides at popular pick-up and drop-off locations. Built Tableau visualizations of the busiest pick-up and drop-off hours of the day and of trip fares across areas for long and short trips.
    - Used AWS Glue to perform ETL on the data and AWS CloudFormation to provision the necessary AWS services. Used JavaScript to combine trip location and trip time from different files.
    - Ran queries in the Redshift query editor and connected Redshift to Tableau to visualize the key findings.
    - Notable finding: passengers dropped off at the airport in a timely manner gave the highest tips.

  • Data Analysis on Covid Vaccine Adverse Effect Data

    -

    - Used Excel and Python pandas for data cleansing, MySQL Workbench for data modeling, and Postgres for the physical database schema.
    - Used pgAdmin 4 to run queries and analyze the data for meaningful information.
    - Connected Tableau to Postgres for data visualization.

  • Data Analysis on San Francisco city employee data

    -

    - Used Excel and Python pandas to remove null and duplicate values. Used MySQL Connector/Python to create the MySQL database, import the data, and run queries against it.
    - Prepared reports on the highest and lowest salaries for different job titles, average salary by age group, and salary growth over five years using MySQL Connector/Python.

  • Car Parking system - Android app

    -

    A private parking reservation system for Android users. Used Material Design for the UI to meet Google's Android app standards. Hosted backend services on Firebase. Integrated Google Maps to route users to the parking spots.

    Other creators
    • Nidhi Kundariya
  • Lexical Analyzer in C/C++

    -

    Implemented a lexical analyzer that tokenizes source code. The resulting token stream serves as input to the parser for compilation.
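
    The idea of turning source text into a token stream can be shown with a short sketch. The original was written in C/C++; this Python version is only an illustration, and the token classes and keyword list are assumptions for a small C-like language.

```python
import re

# Assumed token classes. Order matters: earlier alternatives win,
# so KEYWORD must come before IDENT.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("KEYWORD", r"\b(?:int|float|if|else|while|return)\b"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"==|!=|<=|>=|[+\-*/=<>]"),
    ("PUNCT", r"[(){};,]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Yield (kind, text) pairs for each token in source, skipping whitespace."""
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        text = match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {text!r} at offset {match.start()}")
        yield kind, text

tokens = list(tokenize("int x = 42;"))
```

    A parser would consume these pairs one at a time, which is why the sketch yields them lazily instead of building the whole list up front.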

  • Multi-threaded Web server on Linux in C/C++

    -

    Designed and implemented a multi-threaded module using a thread pool library to handle multiple requests on a single server. Requests were served concurrently, with no busy waiting. Race conditions and deadlocks were handled. Throughput improved significantly compared with a single-threaded web server.
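
    The thread pool pattern described here can be sketched without any networking. The original was a C/C++ server; this Python version is only an illustration, and the handler, the counter class, the pool size of 4, and the request paths are all hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class HitCounter:
    """Shared state guarded by a lock so concurrent increments cannot race."""

    def __init__(self):
        self._lock = threading.Lock()
        self.count = 0

    def increment(self):
        with self._lock:
            self.count += 1

def handle_request(path, counter):
    """Hypothetical handler: record the hit and build a response string."""
    counter.increment()
    return f"200 OK {path}"

counter = HitCounter()
paths = [f"/page/{i}" for i in range(50)]

# A fixed pool of worker threads. Idle workers block on the pool's internal
# queue rather than spinning, so there is no busy waiting.
with ThreadPoolExecutor(max_workers=4) as pool:
    responses = list(pool.map(lambda p: handle_request(p, counter), paths))
```

    Bounding the pool keeps the number of threads fixed regardless of load, and the lock around the shared counter is the kind of guard that prevents the race conditions mentioned above.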

Test Scores

  • GRE

    Score: 305
