Vishxnu/Life-Expectancy-Prediction

🧬 Life Expectancy Prediction using Multiple Linear Regression 🌍

📊 A Data Science Project by Vishnu Raj


🌟 Project Overview

Life expectancy is a key indicator of a nation's overall health and development. This project aims to predict life expectancy using Multiple Linear Regression (MLR) based on socio-economic and health factors. Using the WHO Life Expectancy Dataset, we explore correlations, clean the data, visualize relationships, and build regression models for prediction.


🎯 Project Objectives

  • ✅ Identify significant socio-economic and health predictors of life expectancy
  • ✅ Build and evaluate a Multiple Linear Regression model
  • ✅ Compare performance with Ridge and Lasso Regression
  • ✅ Visualize patterns through EDA for clear interpretation


⚙️ Tools & Libraries Used

| Purpose | Libraries |
|---|---|
| Data Handling | pandas, numpy |
| Visualization | matplotlib, seaborn |
| Modeling | scikit-learn |
| Evaluation | r2_score, RMSE, MAE |
| Development | Jupyter Notebook |

📁 Dataset Information

  • 📄 Dataset: Life Expectancy (WHO) – Kaggle
  • 🌍 Records: 2,930
  • 🧩 Features: 22 (GDP, BMI, Schooling, Status, etc.)
  • 🎯 Target Variable: Life Expectancy


🧹 Data Preprocessing Steps

  • ✔ Replaced missing values using median imputation
  • ✔ Encoded categorical variables (Status: Developed/Developing)
  • ✔ Handled outliers using IQR
  • ✔ Split data into 70% Train and 30% Test using train_test_split
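
The four steps above could look roughly like the sketch below. It is not the notebook's actual code: the column names (`Life expectancy`, `Status`), the choice to clip outliers rather than drop rows, and `random_state=42` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split


def preprocess(df: pd.DataFrame, target: str = "Life expectancy"):
    """Encode Status, median-impute, clip IQR outliers, and split 70/30."""
    df = df.copy()

    # Encode the categorical Status column (assumed values: Developed / Developing).
    df["Status"] = (df["Status"] == "Developed").astype(int)

    # Median imputation for every numeric column with missing values.
    numeric_cols = df.select_dtypes(include="number").columns
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

    # IQR rule on the predictors: clip to [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
    # Clipping instead of dropping rows is an assumption of this sketch.
    feature_cols = [c for c in numeric_cols if c not in (target, "Status")]
    q1 = df[feature_cols].quantile(0.25)
    q3 = df[feature_cols].quantile(0.75)
    iqr = q3 - q1
    df[feature_cols] = df[feature_cols].clip(
        lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr, axis=1
    )

    X = df[feature_cols + ["Status"]]
    y = df[target]
    # 70% train / 30% test, as described above.
    return train_test_split(X, y, test_size=0.30, random_state=42)
```

Calling `X_train, X_test, y_train, y_test = preprocess(df)` on the loaded dataset returns the four splits. Non-numeric columns other than `Status` (for example a country name) are left out of `X` in this sketch.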


📊 Exploratory Data Analysis (EDA)

Explored patterns between health and economic indicators:

  • Correlation heatmaps 📈
  • Pairplots for relationships 👥
  • Schooling and Diphtheria Immunization identified as key drivers
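
As an illustration of how such drivers can be read off a correlation matrix, here is a minimal sketch. The helper names `rank_correlations` and `plot_heatmap` and the target column name are hypothetical, not taken from the notebook.

```python
import pandas as pd


def rank_correlations(df: pd.DataFrame, target: str = "Life expectancy") -> pd.Series:
    """Pearson correlation of each numeric column with the target, strongest first."""
    corr = df.select_dtypes(include="number").corr()[target].drop(target)
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


def plot_heatmap(df: pd.DataFrame):
    """Draw an annotated correlation heatmap of the numeric columns."""
    # Imported lazily so rank_correlations works without the plotting stack.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(10, 8))
    corr = df.select_dtypes(include="number").corr()
    sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", ax=ax)
    ax.set_title("Correlation heatmap")
    return fig
```

`rank_correlations(df).head()` lists the features with the largest absolute correlation to the target. A high correlation suggests a useful predictor but does not by itself establish a causal driver.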


🧮 Model Development

| Model | R² | RMSE | MAE |
|---|---|---|---|
| Linear Regression | 0.8316 | 3.9104 | 2.8904 |
| Ridge Regression | 0.8316 | 3.9104 | 2.8902 |
| Lasso Regression | 0.8316 | 3.9104 | 2.8897 |
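
A comparison like the one in the table can be produced with a loop over the three scikit-learn estimators. This is a sketch under assumptions: the `alpha` values are placeholders (the README does not state the ones used), and `compare_models` is a hypothetical helper name.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def compare_models(X_train, X_test, y_train, y_test):
    """Fit the three regressors and return {name: (R2, RMSE, MAE)} on the test set."""
    models = {
        "Linear Regression": LinearRegression(),
        "Ridge Regression": Ridge(alpha=1.0),   # alpha is an assumed value
        "Lasso Regression": Lasso(alpha=0.01),  # alpha is an assumed value
    }
    results = {}
    for name, model in models.items():
        pred = model.fit(X_train, y_train).predict(X_test)
        # RMSE is the square root of the mean squared error.
        rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
        results[name] = (
            r2_score(y_test, pred),
            rmse,
            mean_absolute_error(y_test, pred),
        )
    return results
```

With small penalty strengths, Ridge and Lasso coefficients stay close to the ordinary least squares solution, which would be consistent with the near-identical metrics reported above.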

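🧠 Key Insights

  • 💡 Higher education levels and income are associated with higher life expectancy.
  • 💡 Schooling, Health Expenditure, and Diphtheria Immunization correlate strongly with life expectancy.
  • 💡 Ridge and Lasso regularization gave results nearly identical to plain linear regression (R² of 0.8316 for all three models), so the fit is stable across the three methods.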


🧾 Requirements

pandas
numpy
matplotlib
seaborn
scikit-learn

🏁 Conclusion

  • ✅ Achieved R² ≈ 0.83, meaning the model explains about 83% of the variance in life expectancy on the test set
  • ✅ Identified education, economic, and healthcare indicators as strongly associated with longer lives
  • ✅ Demonstrated MLR's simplicity and interpretability on social datasets


🙌 Author

  • 👨‍💻 Vishnu Raj
  • 🎓 Data Science Project
  • 💼 GitHub | LinkedIn | 📧 vishnuskillx@gmail.com


⭐ If you found this project helpful, please give it a star! ⭐
