There’s a time you don’t want to be DISTINCT.
As a data analyst, using DISTINCT in SQL seems like the easy fix. But in reality? It often puts a giant spotlight on sloppy joins.
If you find yourself slapping DISTINCT on every query, it’s usually a red flag:
🚫 Your join logic is broken
🚫 You’re pulling way more data than needed
🚫 You’re masking the real issue, not solving it
Instead, break down the problem cleanly with CTEs. Focus on deduping before the join, not after.
Here's a cleaner way to avoid DISTINCT disasters:
[Image not reproduced: a query that uses CTEs to keep only the latest row per order before joining]
Why this matters:
✅ No unnecessary table bloat: avoids re-reading the entire orders table
✅ Zero post-join deduping: you only ever process the single “latest” row per order
✅ Self-documenting steps: each CTE has a clear purpose, making maintenance a breeze
Readable. Maintainable. Faster. And no need to hide behind DISTINCT.
What's a small SQL habit you've had to unlearn? I would love to hear it! 👇
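Since the post's image is not available here, the following is only a sketch of the "dedupe before the join" idea, not the author's actual query. The tables `customers` and `order_updates` and all their columns are hypothetical, and Python's built-in `sqlite3` module is used purely so the SQL can be run as-is (window functions need SQLite 3.25 or newer).

```python
import sqlite3

# Hypothetical schema and data: these names are illustrative assumptions,
# not taken from the original post's image.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE order_updates (order_id INTEGER, customer_id INTEGER,
                                status TEXT, updated_at TEXT);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO order_updates VALUES
        (100, 1, 'placed',  '2024-01-01'),
        (100, 1, 'shipped', '2024-01-03'),
        (101, 2, 'placed',  '2024-01-02');
""")

query = """
WITH ranked_updates AS (
    -- Number each order's rows, newest first.
    SELECT order_id, customer_id, status,
           ROW_NUMBER() OVER (PARTITION BY order_id
                              ORDER BY updated_at DESC) AS rn
    FROM order_updates
),
latest_update AS (
    -- Keep exactly one row per order: the most recent one.
    SELECT order_id, customer_id, status
    FROM ranked_updates
    WHERE rn = 1
)
-- The join now sees one row per order, so no DISTINCT is needed.
SELECT c.name, l.order_id, l.status
FROM latest_update AS l
JOIN customers AS c ON c.customer_id = l.customer_id
ORDER BY l.order_id;
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Because the duplicates are removed inside `latest_update`, the final join cannot fan out, which is the point the post makes about deduping before the join rather than after.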
Why Use CTEs for Cleaner Code
Summary
Common table expressions (CTEs) are temporary result sets in SQL that help organize complex queries into clearer, easier-to-read steps. Using CTEs can make your code more maintainable and understandable, especially when dealing with layered logic or large datasets.
- Structure your queries: Break down complex SQL logic into smaller, manageable parts by defining each step with a separate CTE.
- Improve team readability: Make your query flow easier to follow for colleagues by writing CTEs that act like a story, clarifying what each section does.
- Reduce redundant code: Reference results from CTEs multiple times within the same query to avoid repeating calculations or logic.
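As a rough illustration of the last point, the sketch below defines one CTE and references it twice: once as the row source and once to compute an average. The `sales` table and its columns are assumptions made for the example, and SQLite (through Python's `sqlite3`) is used only to make the query runnable.

```python
import sqlite3

# Hypothetical data: the table and column names are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (customer_id INTEGER, amount REAL);
    INSERT INTO sales VALUES (1, 100), (1, 50), (2, 30), (3, 300);
""")

query = """
WITH customer_totals AS (
    -- Defined once...
    SELECT customer_id, SUM(amount) AS total_spent
    FROM sales
    GROUP BY customer_id
)
SELECT customer_id, total_spent
FROM customer_totals                        -- ...used here as the row source
WHERE total_spent > (SELECT AVG(total_spent)
                     FROM customer_totals)  -- ...and here for the average
ORDER BY customer_id;
"""
above_average = conn.execute(query).fetchall()
print(above_average)
```

Without the CTE, the `GROUP BY` aggregation would have to be written out twice, once in each place.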
-
Are CTEs a game changer in writing SQL?
I am a great advocate of CTEs. Common Table Expressions (CTEs) can indeed be considered a game changer in writing SQL. They offer several advantages that make complex queries more manageable and readable.
Key Benefits
1. Improved Readability and Maintainability:
- **Clarity**: CTEs allow you to break down complex queries into simpler, more understandable parts. By defining intermediate results, you can create a clear, step-by-step approach to your query logic.
- **Reusability**: You can reference the same CTE multiple times within a query, reducing redundancy and making the query easier to maintain.
2. Enhanced Modularity:
- CTEs can be used to structure SQL queries in a modular fashion. You can define and use multiple CTEs in a single query, each performing a specific task, which can then be combined for the final result.
3. Recursive Queries:
- CTEs support recursive queries, allowing you to perform operations such as hierarchical data processing. This is particularly useful for querying data with parent-child relationships, such as organizational structures or tree-like data.
4. Optimization Potential:
- A simpler query structure is easier to reason about and tune, though any performance gain depends on the engine. Some optimizers inline CTEs and others materialize them, so check the execution plan before assuming a speedup.
5. Temporary Result Sets:
- CTEs provide a way to create temporary result sets that exist only for the duration of the query execution. This can be useful for operations that don’t require persistent data storage.
Example
**Without CTE:**
SELECT e.employee_id, e.first_name, e.last_name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
JOIN (
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
) avg_dept ON d.department_id = avg_dept.department_id
WHERE e.salary > avg_dept.avg_salary;
**With CTE:**
WITH avg_dept AS (
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
)
SELECT e.employee_id, e.first_name, e.last_name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
JOIN avg_dept ON d.department_id = avg_dept.department_id
WHERE e.salary > avg_dept.avg_salary;
Conclusion: CTEs can greatly enhance the way SQL queries are written and understood, making them a valuable tool for any database professional. Their ability to simplify complex queries, improve readability, and support recursive operations makes them a game changer in SQL development.
#sql #dataanalyst #sqldeveloper
Follow Vishal Jaiswal, PMP® for concepts and to crack technical interviews
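The recursive-query benefit listed in this post has no example of its own, so here is a minimal sketch of one. It assumes a hypothetical `staff` table with a `manager_id` column (not the `employees` table used above) and runs on SQLite through Python's `sqlite3` so the result can be checked.

```python
import sqlite3

# Hypothetical org chart: the staff table and its columns are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (employee_id INTEGER PRIMARY KEY,
                        name TEXT, manager_id INTEGER);
    INSERT INTO staff VALUES
        (1, 'Dana', NULL),
        (2, 'Eli',  1),
        (3, 'Fay',  1),
        (4, 'Gus',  2);
""")

query = """
WITH RECURSIVE org_chart AS (
    -- Anchor member: the person with no manager sits at depth 0.
    SELECT employee_id, name, 0 AS depth
    FROM staff
    WHERE manager_id IS NULL
    UNION ALL
    -- Recursive member: attach each person to their already-found manager.
    SELECT s.employee_id, s.name, o.depth + 1
    FROM staff AS s
    JOIN org_chart AS o ON s.manager_id = o.employee_id
)
SELECT name, depth
FROM org_chart
ORDER BY depth, name;
"""
hierarchy = conn.execute(query).fetchall()
print(hierarchy)
```

The recursion stops on its own once a pass finds no new rows, which here happens after the deepest report has been added.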
-
🚀 Why CTEs (Common Table Expressions) Still Matter in Data Engineering
In the world of complex SQL and large-scale data transformations, one simple tool continues to make a massive difference:
👉 CTEs (WITH clauses)
🔹 What is a CTE?
A Common Table Expression is a temporary result set that you can reference within a SQL query.
WITH sales_cte AS (
    SELECT customer_id, SUM(amount) AS total_spent
    FROM sales
    GROUP BY customer_id
)
SELECT *
FROM sales_cte
WHERE total_spent > 1000;
🔹 Why Data Engineers love CTEs
✔️ Improves readability of complex queries
✔️ Breaks down logic into manageable steps
✔️ Reusable within a single query
✔️ Ideal for transformations in ETL/ELT pipelines
✔️ Works seamlessly with tools like Snowflake, BigQuery, Redshift
🔹 CTE vs Subquery
CTE | Subquery
More readable | Can get messy
Easier debugging | Harder to trace
Modular logic | Nested complexity
🔹 When to use CTEs
✔️ Multi-step transformations
✔️ Data cleaning pipelines
✔️ Aggregations before joins
✔️ Recursive queries (hierarchies, graphs)
🔹 Real-world impact
In modern data stacks (dbt, Snowflake, Databricks SQL), CTEs are often the backbone of:
👉 Data modeling
👉 Feature engineering
👉 Business logic layering
💡 Pro Tip: Don’t overuse deeply nested CTEs; balance readability with performance.
CTEs aren’t just syntax: they’re a design pattern for clean, scalable SQL.
#DataEngineering #SQL #CTE #BigData #Snowflake #Databricks #AnalyticsEngineering #ETL #DataPipeline
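To make the "aggregations before joins" use case concrete, here is a small sketch that extends the post's `sales_cte` query with a join. The `customers` table, the second CTE `big_spenders`, and the sample rows are assumptions added for illustration. SQLite via Python's `sqlite3` stands in for a warehouse engine.

```python
import sqlite3

# Hypothetical data: customers and the sample rows are assumptions;
# sales_cte mirrors the query quoted in the post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO sales VALUES (1, 700), (1, 600), (2, 200), (3, 1500);
""")

query = """
WITH sales_cte AS (
    -- Step 1: collapse sales to one row per customer.
    SELECT customer_id, SUM(amount) AS total_spent
    FROM sales
    GROUP BY customer_id
),
big_spenders AS (
    -- Step 2: filter on the aggregate.
    SELECT customer_id, total_spent
    FROM sales_cte
    WHERE total_spent > 1000
)
-- Step 3: join the already-aggregated rows, so the join cannot multiply them.
SELECT c.name, b.total_spent
FROM big_spenders AS b
JOIN customers AS c ON c.customer_id = b.customer_id
ORDER BY c.name;
"""
big_spenders = conn.execute(query).fetchall()
print(big_spenders)
```

Each CTE does one job, which is the "multi-step transformation" shape the post describes.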
-
🥁 🥁 CTEs are not just about clean code: they unlock the full potential of SQL
Common Table Expressions (CTEs) are a game-changer in SQL, enhancing query readability and modularity. Here are the essential rules for data engineers:
🎷 Definition: Defined using the WITH clause, CTEs make complex queries simpler and more readable.
🎸 Scope: CTEs exist only for the duration of the query they are part of. They are not stored in the database, making them lightweight and temporary.
🎺 Multiple CTEs: You can define multiple CTEs in a single query, separated by commas. This allows for modular and step-by-step query building.
🎻 Referencing: CTEs can reference other CTEs defined earlier in the same WITH clause, enabling complex query logic and transformations.
🥁 Recursive CTEs: Use the RECURSIVE keyword for self-referencing CTEs, perfect for hierarchical and recursive data structures. (PostgreSQL and MySQL require the keyword; SQL Server uses plain WITH.)
🎹 Filtering Window Function Results: Window functions cannot appear in a WHERE clause, so calculate them within a CTE and filter the results in an outer query.
🎸 DML Operations: In engines that support it, CTEs can be used in INSERT, UPDATE, DELETE, and MERGE statements, making your data manipulation tasks more readable and maintainable.
By mastering these CTE rules, we can craft more efficient, maintainable, and powerful SQL queries.
Hence CTEs are not just about clean code: they unlock the full potential of SQL 😃 !!!
#DataEngineering #SQL #CTE #Database #TechTips #DataScience Zach Wilson
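The DML rule is the one most often left without an example, so here is a minimal sketch of a CTE feeding a DELETE. The `users` table and its data are hypothetical, and the statement is run on SQLite (which accepts WITH in front of DELETE) through Python's `sqlite3`. Other engines may need slightly different syntax.

```python
import sqlite3

# Hypothetical table with duplicate emails: names and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    INSERT INTO users VALUES
        (1, 'a@example.com'),
        (2, 'b@example.com'),
        (3, 'a@example.com'),
        (4, 'a@example.com');
""")

conn.execute("""
WITH keepers AS (
    -- For each email, keep the row with the smallest id.
    SELECT MIN(id) AS id
    FROM users
    GROUP BY email
)
DELETE FROM users
WHERE id NOT IN (SELECT id FROM keepers);
""")

remaining = conn.execute("SELECT id, email FROM users ORDER BY id").fetchall()
print(remaining)
```

Naming the set of rows to keep (`keepers`) makes the intent of the DELETE readable at a glance, which is the maintainability benefit the post points to.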
-
CTE vs. subquery: which one should a data analyst use?
When I started writing SQL queries, I used to nest subqueries inside subqueries... and trust me, by the end, I’d forget what I was even trying to do 😅
That’s when I discovered CTEs (Common Table Expressions), and it completely changed how I think about SQL.
Here’s how I see it 👇
Use CTEs when:
You want your query to read like a story
You need to break logic into clean, readable steps
You might reuse the same result again in your query
Example:
WITH sales_cte AS (
    SELECT region, SUM(sales) AS total_sales
    FROM sales_data
    GROUP BY region
)
SELECT region
FROM sales_cte
WHERE total_sales > 50000;
Much cleaner, right?
Use subqueries when:
You just need a quick calculation or filter
The logic is simple and used only once
SELECT *
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
Both are powerful, but here’s the truth 👇
💡 CTEs make your logic easier to explain, review, and debug, especially in real-world analytics projects.
As a data analyst, writing readable SQL is just as important as writing efficient SQL.
💬 Which one do you use more, CTE or subquery? I’d love to hear how you approach complex queries!
And if you’re trying to level up your SQL skills or build cleaner logic for Power BI dashboards, you can connect with me on Topmate for 1:1 sessions 💬
👉 https://lnkd.in/gWSkyyiv
#SQL #DataAnalytics #CTE #Subquery #DataAnalyst
-
How I refactor messy code for readability and maintainability
1. Start with your legacy SQL, warts and all.
2. Add import CTEs at the top of the query to import your source tables.
3. Reference the import CTEs throughout the query instead of direct references to the sources.
4. Choose a refactor strategy: in place or alongside.
5. Implement clean CTEs for readability: source_, logic_, aggregate_, filter_, etc.
6. Centralize transformation logic into distinct steps (layers).
7. End with a final CTE named result or final, followed by a final query that reads select * from result.
8. Audit the output along the way.
If you do one thing on this list: add a set of "select * from" import CTEs for your sources and reference those CTEs throughout your query.
Clean code. Clear results.
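The layout these steps lead to might look something like the sketch below. It is one possible reading of the list, not the author's own template. The source tables `raw_orders` and `raw_customers`, their columns, and the specific CTE names after each prefix are all hypothetical. SQLite via Python's `sqlite3` is used only so the output can be audited, as step 8 suggests.

```python
import sqlite3

# Hypothetical source tables: names, columns, and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_customers (customer_id INTEGER, name TEXT);
    CREATE TABLE raw_orders (order_id INTEGER, customer_id INTEGER,
                             amount INTEGER, status TEXT);
    INSERT INTO raw_customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO raw_orders VALUES
        (10, 1, 40, 'complete'),
        (11, 1, 60, 'complete'),
        (12, 2, 25, 'cancelled'),
        (13, 2, 75, 'complete');
""")

query = """
WITH
-- Import CTEs: the only places that touch the source tables (steps 2-3).
source_orders AS (SELECT * FROM raw_orders),
source_customers AS (SELECT * FROM raw_customers),

-- Filter layer (step 5).
filter_completed_orders AS (
    SELECT order_id, customer_id, amount
    FROM source_orders
    WHERE status = 'complete'
),

-- Aggregate layer (steps 5-6).
aggregate_customer_revenue AS (
    SELECT customer_id, SUM(amount) AS revenue
    FROM filter_completed_orders
    GROUP BY customer_id
),

-- Final CTE (step 7).
result AS (
    SELECT c.name, a.revenue
    FROM aggregate_customer_revenue AS a
    JOIN source_customers AS c ON c.customer_id = a.customer_id
)
SELECT * FROM result ORDER BY name;
"""
revenue_by_customer = conn.execute(query).fetchall()
print(revenue_by_customer)
```

To audit an intermediate step, swap the last line for a select from any earlier CTE, for example `SELECT * FROM filter_completed_orders`, and compare it with the legacy query's output.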