📊Data-Warehouse-Project1

Welcome to the Data Warehouse and Analytics Project repository! 🚀 This project demonstrates a comprehensive data warehousing and analytics solution, from building a data warehouse to generating actionable insights. Designed as a portfolio project, it also highlights industry best practices in data engineering and analytics that I have learnt so far.

🏗️ Data Architecture

The data architecture for this project follows the Medallion Architecture, with Bronze, Silver, and Gold layers:

Bronze Layer: Stores raw data as-is from the source systems. Data is ingested from CSV files into a SQL Server database.
Silver Layer: This layer includes data cleansing, standardization, and normalization processes to prepare data for analysis.
Gold Layer: Houses business-ready data modeled into a star schema required for reporting and analytics.
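
The three layers above can be sketched end to end in a few lines. This is only an illustration, not the project's actual scripts: it uses Python's built-in `sqlite3` as a stand-in for SQL Server, table-name prefixes (`bronze_`, `silver_`, `gold_`) in place of real schemas, and a made-up customer extract whose columns are assumptions.

```python
import csv
import io
import sqlite3

# Hypothetical raw CSV extract; the column names are illustrative only.
RAW_CSV = """customer_id,first_name,country
1, Alice ,DE
2,Bob,de
2,Bob,de
3,,US
"""


def build_layers(raw_csv: str) -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")

    # Bronze: land the rows exactly as they arrive (everything stays text).
    conn.execute(
        "CREATE TABLE bronze_customers (customer_id TEXT, first_name TEXT, country TEXT)"
    )
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    conn.executemany(
        "INSERT INTO bronze_customers VALUES (:customer_id, :first_name, :country)",
        rows,
    )

    # Silver: trim whitespace, standardize country codes, fill blanks, drop duplicates.
    conn.execute("""
        CREATE TABLE silver_customers AS
        SELECT DISTINCT
            CAST(customer_id AS INTEGER)                  AS customer_id,
            COALESCE(NULLIF(TRIM(first_name), ''), 'n/a') AS first_name,
            UPPER(TRIM(country))                          AS country
        FROM bronze_customers
    """)

    # Gold: a business-ready view on top of the cleaned data.
    conn.execute("""
        CREATE VIEW gold_dim_customers AS
        SELECT customer_id, first_name, country
        FROM silver_customers
    """)
    return conn


conn = build_layers(RAW_CSV)
bronze_count = conn.execute("SELECT COUNT(*) FROM bronze_customers").fetchone()[0]
gold_rows = conn.execute(
    "SELECT * FROM gold_dim_customers ORDER BY customer_id"
).fetchall()
print(bronze_count, gold_rows)
```

The bronze table keeps all four raw rows, including the duplicate, while the gold view exposes three cleaned customers.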

📖 Project Overview

This project involves:

Data Architecture: Designing a modern data warehouse using the Medallion Architecture, with Bronze, Silver, and Gold layers.
ETL Pipelines: Extracting, transforming, and loading data from source systems into the warehouse.
Data Modeling: Developing fact and dimension tables optimized for analytical queries.
Analytics & Reporting: Creating SQL-based reports and dashboards for actionable insights.
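
As a rough illustration of the data modeling step, the sketch below builds one dimension table and one fact table and runs a typical analytical query over them. It assumes SQLite (via Python's `sqlite3`) instead of SQL Server, and the table names, columns, and sample rows are hypothetical rather than taken from the project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension: one row per product, with a warehouse-generated surrogate key.
    CREATE TABLE dim_products (
        product_key  INTEGER PRIMARY KEY AUTOINCREMENT,
        product_id   TEXT NOT NULL,   -- business key from the source system
        product_name TEXT NOT NULL
    );

    -- Fact: one row per order line, linked to the dimension by surrogate key.
    CREATE TABLE fact_sales (
        order_number TEXT    NOT NULL,
        product_key  INTEGER NOT NULL REFERENCES dim_products (product_key),
        quantity     INTEGER NOT NULL,
        sales_amount REAL    NOT NULL
    );

    INSERT INTO dim_products (product_id, product_name) VALUES
        ('P-100', 'Helmet'),
        ('P-200', 'Gloves');

    INSERT INTO fact_sales VALUES
        ('SO-1', 1, 2, 60.0),
        ('SO-2', 1, 1, 30.0),
        ('SO-2', 2, 3, 45.0);
""")

# Typical star-schema query: aggregate the fact, describe it with the dimension.
sales_by_product = conn.execute("""
    SELECT p.product_name, SUM(f.quantity), SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_products AS p ON p.product_key = f.product_key
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()
print(sales_by_product)
```

Keeping measures in the fact table and descriptive attributes in the dimension is what lets reports group and filter by product without touching the raw source tables.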

🛠️ Important Links & Tools:

Everything is free!

Datasets: Access to the project dataset (CSV files).
SQL Server Express: Lightweight server for hosting your SQL database.
SQL Server Management Studio (SSMS): GUI for managing and interacting with databases.
Git Repository: Set up a GitHub account and repository to manage, version, and collaborate on your code efficiently.
DrawIO: Design data architecture, models, flows, and diagrams.
Notion: Get the Project Template from Notion
Notion Project Steps: Access to All Project Phases and Tasks.

🚀 Project Requirements

Building the Data Warehouse (Data Engineering)

Objective

Develop a modern data warehouse using SQL Server to consolidate sales data, enabling analytical reporting and informed decision-making.

Specifications

Data Sources: Import data from two source systems (ERP and CRM) provided as CSV files.
Data Quality: Cleanse and resolve data quality issues prior to analysis.
Integration: Combine both sources into a single, user-friendly data model designed for analytical queries.
Scope: Focus on the latest dataset only; historization of data is not required.
Documentation: Provide clear documentation of the data model to support both business stakeholders and analytics teams.
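
To make the data quality and integration requirements more concrete, here is a minimal sketch of combining a CRM table and an ERP table into one cleaned result. Everything specific in it is an assumption for illustration: the table and column names, the mismatched key format (`AW-001` vs `AW001`), the gender codes, and the use of SQLite through Python's `sqlite3` rather than SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source tables, one per system.
    CREATE TABLE crm_customers (cst_key TEXT, first_name TEXT, gender TEXT);
    CREATE TABLE erp_locations (cid TEXT, country TEXT);

    INSERT INTO crm_customers VALUES
        ('AW001', ' Anna', 'F'),
        ('AW002', 'Ben ',  'M'),
        ('AW003', 'Cara',  NULL);

    INSERT INTO erp_locations VALUES
        ('AW-001', 'Germany'),
        ('AW-002', 'USA');
""")

integrated = conn.execute("""
    SELECT
        c.cst_key                   AS customer_id,
        TRIM(c.first_name)          AS first_name,   -- remove stray whitespace
        CASE c.gender                                -- expand coded values
            WHEN 'F' THEN 'Female'
            WHEN 'M' THEN 'Male'
            ELSE 'n/a'
        END                         AS gender,
        COALESCE(l.country, 'n/a')  AS country       -- keep customers without ERP data
    FROM crm_customers AS c
    LEFT JOIN erp_locations AS l
        ON REPLACE(l.cid, '-', '') = c.cst_key       -- align the two key formats
    ORDER BY c.cst_key
""").fetchall()
print(integrated)
```

The `LEFT JOIN` keeps every CRM customer even when the ERP system has no matching record, so missing data shows up as `n/a` instead of silently dropping rows.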

BI: Analytics & Reporting (Data Analysis)

Objective

Develop SQL-based analytics to deliver detailed insights into:

Customer Behavior
Product Performance
Sales Trends

These insights empower stakeholders with key business metrics, enabling strategic decision-making.
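
The three focus areas above map naturally onto simple aggregate queries. The sketch below is illustrative only: it uses a flattened, made-up `fact_sales` table and SQLite through Python's `sqlite3`, whereas the project itself targets SQL Server and a star schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, flattened sales data for illustration.
    CREATE TABLE fact_sales (
        order_date   TEXT,
        customer     TEXT,
        product      TEXT,
        sales_amount REAL
    );
    INSERT INTO fact_sales VALUES
        ('2024-01-05', 'Anna', 'Helmet', 60.0),
        ('2024-01-20', 'Ben',  'Gloves', 15.0),
        ('2024-02-03', 'Anna', 'Gloves', 30.0),
        ('2024-02-11', 'Anna', 'Helmet', 30.0);
""")


def total_sales_by(group_expr: str) -> list:
    """Total sales per group. group_expr must be a trusted, hard-coded SQL expression."""
    sql = (
        f"SELECT {group_expr} AS grp, SUM(sales_amount) "
        "FROM fact_sales GROUP BY grp ORDER BY grp"
    )
    return conn.execute(sql).fetchall()


sales_trend = total_sales_by("substr(order_date, 1, 7)")  # sales trends by month
by_customer = total_sales_by("customer")                  # customer behavior
by_product = total_sales_by("product")                    # product performance
print(sales_trend, by_customer, by_product)
```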

📂 Repository Structure

data-warehouse-project/
│
├── dataset/                            # Raw datasets used for the project (ERP and CRM data)
│
├── docs/                               # Project documentation and architecture details
│   ├── ETL.png                         # Draw.io diagram showing the different ETL techniques and methods
│   ├── Architecture of DWH.png         # Draw.io diagram showing the project's architecture
│   ├── data_catalog.md                 # Catalog of datasets, including field descriptions and metadata
│   ├── Data_Flow.png                   # Draw.io file for the data flow diagram
│   ├── data_models.png                 # Draw.io file for data models (star schema)
│   ├── naming_conventions.md           # Consistent naming guidelines for tables, columns, and files
│
├── scripts/                            # SQL scripts for ETL and transformations
│   ├── init_database.sql               # Script that initializes the database (`datawarehouse`) and the bronze, silver, and gold schemas
│   ├── bronze/                         # Scripts for extracting and loading raw data
│   │   ├── ddl_bronze.sql              # Script that defines the metadata (schema) of the tables the files are loaded into
│   │   ├── procedure_load_bronze.sql   # Script that bulk-loads the data from the local machine and prints debugging messages and execution times
│   ├── silver/                         # Scripts for cleaning and transforming data
│   │   ├── ddl_silver.sql              # Script that defines the metadata (schema) of the tables the data is loaded into
│   │   ├── procedure_load_silver.sql   # Script that cleans and transforms the bronze data and loads it into the silver layer; also prints debugging messages and execution times
│   ├── gold/                           # Scripts for creating analytical models
│   │   ├── ddl_gold.sql                # Script that creates views over the silver layer for analytics
│
├── tests/                              # Test scripts and quality files
│
├── README.md                           # Project overview and instructions
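
The load procedures listed under `scripts/` are described as bulk-loading data while reporting debugging information and execution times. The sketch below shows that general pattern and is not the contents of those scripts: it is written in Python against SQLite, the function name `load_bronze` and the table are hypothetical, and the delete-then-insert full reload is an assumption based on the project only needing the latest dataset.

```python
import csv
import io
import sqlite3
import time


def load_bronze(conn: sqlite3.Connection, table: str, csv_text: str):
    """Fully reload one bronze table; returns (rows_loaded, seconds).

    `table` must be a trusted, hard-coded name because it is placed in the SQL text.
    """
    start = time.perf_counter()
    try:
        rows = list(csv.reader(io.StringIO(csv_text)))
        header, data = rows[0], rows[1:]
        placeholders = ", ".join("?" for _ in header)
        with conn:  # one transaction: commit on success, roll back on error
            conn.execute(f"DELETE FROM {table}")  # full reload, no history kept
            conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", data)
    except (sqlite3.Error, IndexError) as exc:
        # Report which table failed, then let the caller decide what to do.
        print(f"load of {table} failed: {exc}")
        raise
    elapsed = time.perf_counter() - start
    print(f"loaded {len(data)} rows into {table} in {elapsed:.4f}s")
    return len(data), elapsed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bronze_sales (order_number TEXT, amount TEXT)")
csv_text = "order_number,amount\nSO-1,60\nSO-2,45\n"

load_bronze(conn, "bronze_sales", csv_text)
loaded, seconds = load_bronze(conn, "bronze_sales", csv_text)  # rerun replaces the rows
row_count = conn.execute("SELECT COUNT(*) FROM bronze_sales").fetchone()[0]
```

Because each run empties the table before inserting, running the load twice leaves the same two rows rather than four.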

🌟 About Me

Hi there! I'm Ayushman Bhargav. My fields of interest include the Internet of Things (IoT), Data Science, and Machine Learning (AI/ML). I am working towards upgrading my skills and knowledge by combining theoretical concepts with practical projects.

Let's stay in touch! Feel free to connect with me via LinkedIn:
