A wholesale distributor operating in Portugal has recorded the annual spending of 440 large retailers on six product varieties across three regions (Lisbon, Oporto, and Other) and two sales channels (Hotel, Retail).
Here are the key features of the dataset:
- Number of Retailers: 440 large retailers
- Product Varieties: 6 different types
- Regions: Lisbon, Oporto, and Other
- Sales Channels: Hotel, Retail
This data can be used for various analyses, such as understanding spending patterns, identifying regional differences in product demand, and optimizing supply chain management for different sales channels.
- Buyer/Spender: IDs of the customers
- Region: region of the distributor
- Fresh: spending on fresh vegetables
- Milk: spending on milk
- Grocery: spending on grocery
- Frozen: spending on frozen food
- Detergents_paper: spending on detergents and toilet paper
- Delicatessen: spending on instant foods
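As a rough illustration of how the columns above fit together, the sketch below builds a tiny table with the same layout and aggregates spending by region. The rows are toy values invented for this example, not records from the real dataset, and the `Channel` column name is an assumption (the feature list does not name a column for the sales channel). With the real file, the table would typically be loaded with `pd.read_csv` instead.

```python
import pandas as pd

# Toy rows for illustration only; NOT taken from the real dataset.
# "Channel" is an assumed column name for the sales channel.
df = pd.DataFrame({
    "Buyer/Spender": [1, 2, 3, 4],
    "Channel": ["Retail", "Hotel", "Retail", "Hotel"],
    "Region": ["Lisbon", "Oporto", "Other", "Lisbon"],
    "Fresh": [100, 200, 300, 400],
    "Milk": [50, 60, 70, 80],
    "Grocery": [30, 40, 50, 60],
    "Frozen": [10, 20, 30, 40],
    "Detergents_paper": [5, 10, 15, 20],
    "Delicatessen": [1, 2, 3, 4],
})

product_cols = ["Fresh", "Milk", "Grocery", "Frozen",
                "Detergents_paper", "Delicatessen"]

# Total annual spending per customer across the six product varieties.
df["Total"] = df[product_cols].sum(axis=1)

# Spending per product variety, summed within each region.
region_totals = df.groupby("Region")[product_cols].sum()

print(df[["Buyer/Spender", "Region", "Total"]])
print(region_totals)
```

The same `groupby` call with `"Channel"` in place of `"Region"` would compare the Hotel and Retail channels.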
In this project, we implement a method to extract, transform, and analyze transactional sales data. The main goal is to group customers into distinct segments (clusters) based on their spending patterns. This helps businesses understand their customer base better and tailor their marketing and sales strategies accordingly.
- Jupyter Notebook or Google Colab: Used as the primary environment for developing and executing the code.
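The project description does not name a clustering algorithm, so the following is only a minimal sketch of one common choice, k-means, written with NumPy. The function name `kmeans`, its parameters, and the two-column toy matrix are all assumptions made for illustration; they are not part of the project code.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain k-means (hypothetical helper): returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct rows of X chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Euclidean distance from every point to every centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its points; keep it if empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Toy spending matrix (two made-up features, six made-up customers).
X = np.array([
    [1.0, 1.0], [1.2, 0.9], [0.9, 1.1],   # low spenders
    [8.0, 8.0], [8.2, 7.9], [7.9, 8.1],   # high spenders
])
labels, centroids = kmeans(X, k=2)
print(labels)
print(centroids)
```

On real spending data the columns have very different scales, so standardizing the features before clustering is a common preliminary step.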
- Numpy: Utilized for efficient numerical computations and data manipulation.
- Pandas: For data manipulation and transformation.
- Seaborn: For data visualization and exploratory data analysis.
- Matplotlib: For creating static, interactive, and animated visualizations.
Exploratory Data Analysis (EDA) was conducted in Python to study, explore, and visualize the data and derive important insights.
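A first EDA pass often starts with summary statistics and pairwise correlations. The sketch below shows that step with pandas on a few toy values (made up for this example, not taken from the dataset); which statistics the project actually inspected is not stated, so treat this as one plausible starting point.

```python
import pandas as pd

# Toy values for illustration only; NOT taken from the real dataset.
df = pd.DataFrame({
    "Fresh": [10, 20, 30, 40],
    "Milk": [1, 2, 3, 4],
    "Frozen": [40, 30, 20, 10],
})

# Count, mean, std, min, quartiles, and max for each spending column.
summary = df.describe()

# Pearson correlation between every pair of spending columns.
corr = df.corr()

print(summary)
print(corr)
```

The correlation table can then be passed to a plotting call such as Seaborn's `heatmap` to view the relationships at a glance.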
Data cleaning was performed to address missing values and data anomalies.
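The text does not say how missing values and anomalies were handled, so the sketch below assumes two common techniques: median imputation for missing values and the 1.5 × IQR rule for flagging unusual values. The numbers are toy values invented for the example.

```python
import numpy as np
import pandas as pd

# Toy values for illustration only; NOT taken from the real dataset.
df = pd.DataFrame({
    "Fresh": [10.0, 12.0, np.nan, 11.0, 500.0],
    "Milk": [5.0, 6.0, 7.0, np.nan, 8.0],
})

# Count missing entries in each column.
missing_per_column = df.isna().sum()

# Assumed strategy: fill each gap with that column's median.
cleaned = df.fillna(df.median())

# Assumed anomaly rule: flag values outside 1.5 * IQR of the quartiles.
q1 = cleaned.quantile(0.25)
q3 = cleaned.quantile(0.75)
iqr = q3 - q1
is_outlier = (cleaned < q1 - 1.5 * iqr) | (cleaned > q3 + 1.5 * iqr)

print(missing_per_column)
print(cleaned)
print(is_outlier)
```

Whether flagged rows should be dropped, capped, or kept depends on the analysis; large spenders can be genuine customers rather than errors.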