Instructors: Rav Ahuja, Alex Aklson, Aije Egwaikhide, Svetlana Levitan, Romeo Kienzler, Polong Lin, Joseph Santarcangelo, Azim Hirjani, Hima Vasudevan, Saishruthi Swaminathan, Saeed Aghabozorgi, Yan Luo
This repository contains the hands-on projects I developed as part of the IBM Data Science Professional Certificate on Coursera. The specialization provides a comprehensive foundation in data science, from data analysis to machine learning.
| # | Course | Link |
|---|---|---|
| 01 | What is Data Science? | View |
| 02 | Tools for Data Science | View |
| 03 | Data Science Methodology | View |
| 04 | Python for Data Science, AI & Development | View |
| 05 | Python Project for Data Science | View |
| 06 | Databases and SQL for Data Science | View |
| 07 | Data Analysis with Python | View |
| 08 | Data Visualization with Python | View |
| 09 | Machine Learning with Python | View |
| 10 | Applied Data Science Capstone | View |
From Course 5. In this project, I used Python to scrape and visualize stock data, aiming to create an interactive dashboard.
Tools: pandas, requests, bs4, html5lib, lxml, plotly, yfinance
From Course 6. Created and populated a relational database using IBM Db2 SQL, then analyzed Chicago city data using Python.
Tools: IBM Db2, SQL, Jupyter Notebooks, CSVs
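The create, populate, and query workflow can be sketched in a few lines. This is a minimal illustration, not the project's code: it swaps in Python's built-in `sqlite3` for IBM Db2 so it runs anywhere, and the `community_areas` table, its columns, the `areas_above_hardship` helper, and the rows are all hypothetical placeholders rather than the real Chicago datasets.

```python
import sqlite3

# Made-up rows for illustration only; these are NOT values from the Chicago datasets.
rows = [
    ("Area A", 24000, 39.0),
    ("Area B", 65000, 3.0),
    ("Area C", 12000, 94.0),
]

# Assumption: an in-memory SQLite database stands in for IBM Db2.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE community_areas ("
    "name TEXT PRIMARY KEY, per_capita_income INTEGER, hardship_index REAL)"
)
# Parameterized inserts avoid building SQL strings by hand.
conn.executemany("INSERT INTO community_areas VALUES (?, ?, ?)", rows)
conn.commit()


def areas_above_hardship(connection, threshold):
    """Return area names whose hardship index exceeds `threshold`, highest first."""
    cursor = connection.execute(
        "SELECT name FROM community_areas "
        "WHERE hardship_index > ? ORDER BY hardship_index DESC",
        (threshold,),
    )
    return [name for (name,) in cursor.fetchall()]


high_hardship = areas_above_hardship(conn, 30)
print(high_hardship)
```

With Db2 the connection step would differ, but the SQL statements would look much the same.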
From Course 7. Built regression models to predict housing prices based on property features.
Tools: pandas, numpy, matplotlib, seaborn, scikit-learn
From Course 8. Built a flight performance dashboard using alternative tools due to technical limitations with IBM's internal platform.
Tools: jupyter_dash, plotly, Google Colab
From Course 9. Compared several classification algorithms on a loan dataset to identify the best-performing model.
Tools: Logistic Regression, SVM, Decision Tree, KNN
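A comparison like this usually comes down to fitting each model on the same split and scoring it on the same held-out data. The sketch below assumes scikit-learn and uses a synthetic dataset from `make_classification` as a stand-in for the loan data; the hyperparameters and the choice of F1 as the tie-breaking metric are illustrative assumptions, not the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic data stands in for the loan dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Scale features for the distance- and margin-based models; trees do not need it.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Fit every model on the same training split and score it on the same test split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, predictions),
        "f1": f1_score(y_test, predictions),
    }

# Pick the model with the highest F1 score on the held-out data.
best_model = max(scores, key=lambda name: scores[name]["f1"])
for name, metrics in scores.items():
    print(f"{name}: accuracy={metrics['accuracy']:.3f}, f1={metrics['f1']:.3f}")
print("Best by F1:", best_model)
```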
The final capstone: predicting the success of SpaceX launches.
Steps involved:
- Data collection via SpaceX API and Wikipedia
- Data wrangling and visualization (SQL, Plotly, Folium)
- Feature engineering + One-hot encoding
- ML with GridSearchCV to optimize model parameters
- Models: Logistic Regression, SVM, Decision Tree, KNN
Accuracy: ~83.33% across all models
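The one-hot encoding and `GridSearchCV` steps above can be sketched roughly as follows. This is an illustration under stated assumptions: the `launches` frame is randomly generated, the column names (`payload_mass`, `orbit`, `landed`) are hypothetical, and only one of the four models is tuned over a small, arbitrary grid. None of the numbers it prints relate to the accuracy reported above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: randomly generated rows stand in for the real launch records.
rng = np.random.default_rng(0)
n = 80
launches = pd.DataFrame({
    "payload_mass": rng.uniform(500, 10000, size=n),
    "orbit": rng.choice(["LEO", "GTO", "ISS"], size=n),
})
# Synthetic label: lighter payloads "land" more often, with some noise.
launches["landed"] = (
    launches["payload_mass"] + rng.normal(0, 2000, size=n) < 6000
).astype(int)

# One-hot encode the categorical column; dtype=float keeps the frame fully numeric.
features = pd.get_dummies(
    launches[["payload_mass", "orbit"]], columns=["orbit"], dtype=float
)
target = launches["landed"]

X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.2, random_state=2, stratify=target
)

# Scaling lives inside the pipeline so each CV fold is scaled on its own training part.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
param_grid = {"clf__C": [0.01, 0.1, 1, 10]}  # illustrative grid only

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print("Best parameters:", search.best_params_)
print(f"Held-out accuracy: {test_accuracy:.3f}")
```

The same pattern extends to SVM, Decision Tree, and KNN by swapping the `clf` step and its parameter grid.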
This was my favorite and most challenging project, the real kickstart to my data science journey.
Greatness in small beginnings.
Feel free to check out each project individually. Feedback is welcome!