If your data isn't structurally modelled, you will constantly run into engineering issues 🛑 And I see this in so many different organisations… Here’s a breakdown of four popular data modelling approaches, each with its unique strengths and considerations: 1. 𝐊𝐢𝐦𝐛𝐚𝐥𝐥 ⭐ Approach: Star schema design with a central fact table connected to multiple dimensional tables. Pros: Simple to understand and implement, efficient for querying, and works well with BI tools. Cons: Requires more storage due to denormalization, less flexible for complex relationships, and can be challenging to update and maintain. 2. 𝐈𝐧𝐦𝐨𝐧 🏢 Approach: Top-down, centralized, normalized data warehouse serving as the authoritative data source. Pros: Reduces redundancy, organized by subject area, flexible for changing business requirements. Cons: Complex, time-consuming, resource-intensive, requiring strong engineering support and higher maintenance. 3. 𝐃𝐚𝐭𝐚 𝐕𝐚𝐮𝐥𝐭 🔗 Approach: Modular design combining aspects of Inmon and Kimball, with a hub-and-spoke architecture. Pros: Highly scalable and flexible, excellent for metadata tracking, transparency, and auditability. Cons: Intricate structure, harder to understand and manage, challenging to set up and maintain. 4. 𝐎𝐧𝐞 𝐁𝐢𝐠 𝐓𝐚𝐛𝐥𝐞 (𝐎𝐁𝐓) 🗂️ Approach: Consolidates all data into a single, flat, denormalized table. Pros: Simplifies data retrieval, easy to implement, ideal for simple reporting needs. Cons: Significant data redundancy, storage inefficiencies, poor scalability, and performance issues for large datasets. Being flexible with how you use each one and implement it is crucial to success as well. Don’t just use star schema and call it a day, understand what is needed for your organization and how you may take multiple approaches to get the best result Check out my Data Modelling article from last summer, which goes into what you need to know about the domain from a high-level (subscription link in the comments). And honestly, I think understanding this area is probably one of the most useful things you can learn as a data professional
Kimball is widely used in enterprise but lacks flexibility for sure
Great article, I read it before couple of times, specially like the new layer you defined as business model ..
You forgot the best one mate. FIWDIL modeling. "F*** It we'll do it live" - which seems to be the default since 2015.
Dylan Anderson that must be the best summary around!
Good summary Dylan Anderson - I'm guessing these are Analytics orientated right (OLAP etc.)? as there are whole classes of OLTP modelling approaches - https://blog.erwin.com/blog/data-modeling-101-oltp-data-modeling-design-and-normalization-for-the-cloud/
Would you agree there is a world where you can adopt a hybrid Kimball and inmon approach. The Kimball star schema models serve as a foundational and heavily governed layer. Inmon approach to having a datamart layer for quicker discovery/analysis that reads from the Kimball layer or source. It helps with speed to immediate value + foundation. Also, being intentional to move a datamart dataset to a Kimball approach once kinks have been ironed out
If you don’t have a data model in the warehouse you actual have a model - one that is constantly breaking, failing on any upstream changes and forcing you to constant repairs and patching of failing ETLs 😵💫
ah, I 🩷 data modeling. I'm usually running with a mix of kimball (core layer), OBT (marts layer) and - depending on the business model - activity schema. Works really well for scale-ups building their first data infra.
I love Data Vault for enterprise data warehousing for changing data, like customer data, which comes from many systems. It's not easy to implement but dbt has some packages to help. It's conceptually so elegant.
Bridging the gap between data and strategy ✦ Head of Data Strategy @ Profusion ✦ Author of The Data Ecosystem newsletter ✦ R Programmer ✦ Policy Nerd
1mo📌 check out the article on this from last summer - https://open.substack.com/pub/thedataecosystem/p/issue-14-the-forgotten-guiding-role?r=8frny&utm_medium=ios