SQL remains essential in data science landscape

SCASA - Southern California Chapter, American Statistical Association

71 followers

2mo

Do we still need something as old-school as SQL? In the rapidly shifting terrain of data science, where Large Language Models (LLMs) often steal the spotlight, it is easy to assume that the “old guard” of technology—like SQL—is on its way to retirement. The reality, however, is quite the opposite. SQL remains the bedrock of the data science landscape, even as that landscape is reshaped by artificial intelligence. https://lnkd.in/gp5ix3F4

SQL in the Age of Generative AI : The Foundation Behind Retrieval Augmented Generation

https://www.youtube.com/

More Relevant Posts

Daniel So
2mo
Is SQL still important in the age of LLMs? Alex makes a strong case that the answer is "yes," particularly for RAG applications. Throughout the RAG pipeline—from data preprocessing to retrieval to response evaluation—SQL databases can provide significant advantages for organizing metadata, applying filters, tracking data provenance, and supporting analytics. When a system combines both SQL and vector capabilities, it becomes even more powerful: a unified platform that stores structured metadata alongside vector embeddings, enabling efficient hybrid retrieval by combining semantic search with traditional filtering. Curious which SQL-plus-vector systems work best for your use cases?
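The hybrid retrieval described above (filter on structured metadata, then rank by semantic similarity) can be sketched in a few lines. This is only an illustration under stated assumptions: the `chunks` table, the `hybrid_search` helper, and the hand-written three-dimensional "embeddings" are all hypothetical, and SQLite stores the vectors as JSON text here purely for convenience. A real deployment would more likely use a database with native vector support.

```python
import json
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical schema: structured metadata and the embedding live in one table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chunks (id INTEGER PRIMARY KEY, source TEXT, year INTEGER, "
    "body TEXT, embedding TEXT)"
)
rows = [
    ("handbook.pdf", 2024, "Refund policy for annual plans", [0.9, 0.1, 0.0]),
    ("handbook.pdf", 2021, "Old refund policy (superseded)", [0.88, 0.12, 0.0]),
    ("blog.html", 2024, "Team offsite photos", [0.0, 0.2, 0.9]),
]
conn.executemany(
    "INSERT INTO chunks (source, year, body, embedding) VALUES (?, ?, ?, ?)",
    [(s, y, b, json.dumps(e)) for s, y, b, e in rows],
)


def hybrid_search(query_vec, min_year, k=2):
    """Filter with SQL on metadata, then rank the survivors by cosine similarity."""
    # Step 1: traditional filtering. Stale chunks never reach the ranking step.
    candidates = conn.execute(
        "SELECT source, year, body, embedding FROM chunks WHERE year >= ?",
        (min_year,),
    ).fetchall()
    # Step 2: semantic ranking over the filtered candidates only.
    scored = [
        (cosine(query_vec, json.loads(emb)), source, year, body)
        for source, year, body, emb in candidates
    ]
    scored.sort(key=lambda r: r[0], reverse=True)
    return scored[:k]


# Toy query vector standing in for an embedded question about refunds.
results = hybrid_search([1.0, 0.0, 0.0], min_year=2023)
for score, source, year, body in results:
    print(f"{score:.3f}  {source} ({year}): {body}")
```

The 2021 chunk is nearly as similar to the query as the 2024 one, but the `WHERE` clause removes it before ranking. That is the kind of provenance and freshness control the metadata filter is meant to provide.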

Chong Ho Alex Yu

Professor and Program Director of Data Science at Hawaii Pacific University
2mo

Do we still need something as old-school as SQL? In the rapidly shifting terrain of data science, where Large Language Models (LLMs) often steal the spotlight, it is easy to assume that the “old guard” of technology—like SQL—is on its way to retirement. The reality, however, is quite the opposite. SQL remains the bedrock of the data science landscape, even as that landscape is reshaped by artificial intelligence. https://lnkd.in/gi8achpg

SQL in the Age of Generative AI : The Foundation Behind Retrieval Augmented Generation

https://www.youtube.com/
Dor Attias
1mo
🚨 We discovered a critical vulnerability (CVSS 9.8) in a library used by 87% of Fortune 1000 🚨
Today, at Cyera Research, we uncovered a critical vulnerability (CVE-2025-64712, CVSS 9.8) in the unstructured.io library, an ETL library used by most Fortune 1000 companies as an enabler for their AI transformation. The vulnerability we found is an Arbitrary File Write via Path Traversal bug that allows threat actors to execute code on the machine running the unstructured library and ultimately take over the machine. Link to the full blog post in the first comment 👇
Vladimir Tokarev
1mo
Do yourself a favor and go read this research. This vuln is very cool. Like “drop what you’re doing and open the link” cool.
Shay Cohen
1mo
Yesterday, Cyera Research found a critical vulnerability (CVSS 9.8) in the unstructured.io library, an ETL library used by 87% of Fortune 1000 companies. "...The vulnerability we found is a classic path traversal bug that leads to arbitrary file write. In simple words, it allows a threat actor to write a file of any type, with any content, to anywhere on the file system of the machine running the unstructured library..." Read the full article here: https://lnkd.in/eiFP9_Yx
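For readers unfamiliar with the bug class, here is a minimal sketch of how a path traversal can turn into an arbitrary file write, and one common way to guard against it. This is not the code from the unstructured library and says nothing about how CVE-2025-64712 is actually triggered; `unsafe_target` and `safe_target` are hypothetical helpers written only to illustrate the general pattern.

```python
import os
import tempfile
from pathlib import Path


def unsafe_target(base_dir, filename):
    """Vulnerable pattern: join an untrusted file name onto a base directory.

    A name such as '../escaped.txt' (or an absolute path) walks out of base_dir.
    """
    return os.path.normpath(os.path.join(base_dir, filename))


def safe_target(base_dir, filename):
    """Resolve the final path and refuse anything that lands outside base_dir."""
    base = Path(base_dir).resolve()
    target = (base / filename).resolve()
    if base not in target.parents:
        raise ValueError(f"path escapes {base}: {filename!r}")
    return target


with tempfile.TemporaryDirectory() as tmp:
    attachments = os.path.join(tmp, "attachments")
    os.mkdir(attachments)

    # Attacker-controlled name, e.g. taken from a document's attachment metadata.
    evil_name = "../escaped.txt"

    # The naive join points outside the attachments directory.
    print("unsafe:", unsafe_target(attachments, evil_name))

    # The guarded version rejects the same input.
    try:
        safe_target(attachments, evil_name)
    except ValueError as exc:
        print("blocked:", exc)

    # Ordinary names still resolve inside the directory.
    print("ok:", safe_target(attachments, "report.txt"))
```

The key design choice is to validate the resolved path rather than the raw string, since filtering for `..` alone misses absolute paths and symlinks.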
Data For Science

386 followers
1mo
The latest edition of the Data Science Briefing is now out! This week’s theme is moving from “cool model” to “reliable system.” We highlight an end-to-end RLHF-from-scratch deep dive, practical guidance on writing higher-quality code with AI (without sacrificing correctness), and why production-grade agents need operational memory—not just chat history. We also include a solid Bayesian on-ramp to causal discovery, plus fresh academic reads on where LLM “reasoning” breaks down and what better RL objectives might look like. Check it out! 👉 https://lnkd.in/eAGv8cEf

Data Science Briefing #305 data4science.kit.com

Roger Hoerl
2mo
The rapid enhancement of AI's ability to write computer code has led to considerable angst in the data science and computer science communities. This obviously leads to the critical question of how these groups can redefine their unique value add. I have previously argued (https://lnkd.in/g-sbra7i) for the statistics profession to accept and "own" the challenge of data quality, and recently worked with Willis Jensen to make the same suggestion to the data science community. (https://lnkd.in/ga7-nTjx) In my view, the challenge of data quality is a "jump ball" for whoever has the will to accept it. Enjoy!

Why Data Quality Is the New Competitive Edge For Data Scientists – Real World Data Science realworlddatascience.net

Michael O'Donnell
2mo
After years in the relational world, I’m officially moving into Graph Tech. 🚀 Relational databases are great, but Graph is the "secret sauce" for the world's most complex data challenges. It’s the difference between seeing dots and seeing the whole picture. Current status: Reached "Oxymoron Level 1" - this is not a thing: Contextual Knowledge Graph. It's like Cold Ice. We’re building a Composite AI Investigation Platform to turn fragmented data into real-world intelligence. The future isn't in rows - it's in the connections. #Graph+AI-Platform #KnowledgeGraph #CompositeAI #DecisionIntelligence #InvestigationIntelligence DataWalk

Georgeta Cristea

Employer Brand Specialist • Creator of Zaba_TheCuteMonster • Illustrator • Marketer • Art lover
1mo
Data science may appear to be all about intricate models and AI wizardry, but it’s actually a collaborative, business-oriented, and iterative process. Solomiia Leno, a Data Scientist at SoftServe, sheds light on several misunderstandings and provides insight into the true nature of this role in action. Learn more about what it’s like at SoftServe: https://sftsrv.com/UJrMIA #DataScience #SoftServe
Vaida Leela Rajesh
2mo
Decision Trees and Why Ensembles Are Necessary

A single Decision Tree is like that one friend who is incredibly logical but also incredibly rigid. They have a 'Rule' for everything: 'If the temperature is > 30 and the day is Sunday, then everyone will buy ice cream.' It sounds smart, but it’s fragile. The rule was memorized from a handful of past Sundays, so if a storm hits on a hot Sunday (a rare event), the tree will still confidently tell you to buy ice cream. This is called 'Overfitting.'

To build a production system, you don't need one 'Logical Friend.' You need a 'Jury.'

--- THE ENSEMBLE: THE WISDOM OF THE CROWD ---

An Ensemble method like Random Forest or XGBoost is just a collection of many decision trees.

- Random Forest (Bagging): You train 100 trees, each on a different bootstrap sample of your data (and on random subsets of the features). Then you let them vote. If 90 trees say 'Spam' and 10 say 'Not Spam,' the answer is Spam. The 'crowd' averages out the individual errors of each tree.
- Boosting (XGBoost): You train one tree. It makes mistakes. Then you train a second tree specifically to *fix* the mistakes of the first one. You repeat this until the error on held-out data stops improving.

--- WHY TREES STILL BEAT DEEP LEARNING (FOR TABULAR DATA) ---

If your data lives in a spreadsheet (CSV, SQL, Excel), Ensembles of Trees often match or beat Neural Networks.

1. Many implementations (XGBoost, for example) handle 'Missing Data' natively.
2. They don't care about the 'Scale' of your numbers (you don't need to normalize).
3. They are usually faster to train and simpler to deploy.

--- THE SARCASTIC BOTTOM LINE ---

Stop trying to build a 'Neural Network' for your tabular business data. It’s like using a telescope to read a book. XGBoost or Random Forest is the workhorse of the modern enterprise. It’s not 'sexy,' but it’s what actually powers the models that detect fraud, predict churn, and determine insurance rates across the world.

Are you a 'Deep Learning snob' or do you acknowledge the power of the mighty Random Forest?
👇 #MachineLearning #RandomForest #XGBoost #DataScience #AIEngineering #SoftwareEngineering #EnsembleLearning #TabularData
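The 'Jury' idea in the post above can be sketched with the standard library alone. This is a toy, not a real Random Forest: each 'tree' is a one-split stump on a single made-up feature, the forest uses bagging only (a real random forest also randomizes the features considered at each split), and `fit_stump`, `fit_forest`, and the data are all hypothetical names and values chosen for illustration.

```python
import random
from collections import Counter


def majority_vote(labels):
    """Return the most common label among the votes."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(points):
    """Fit a one-split 'tree' that predicts 'spam' when x > threshold.

    points is a list of (x, label) pairs. The threshold is picked from the
    observed x values so that it makes the fewest mistakes on those points.
    """
    best_t, best_errors = None, None
    for t in sorted({x for x, _ in points}):
        errors = sum(("spam" if x > t else "not spam") != y for x, y in points)
        if best_errors is None or errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


def fit_forest(points, n_trees=100, seed=0):
    """Bagging: train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(points) for _ in points]
        forest.append(fit_stump(sample))
    return forest


def predict(forest, x):
    """Let every stump vote, then return the majority label."""
    votes = ["spam" if x > t else "not spam" for t in forest]
    return majority_vote(votes)


# Toy data: one feature (say, number of links in an email); 10 or more is spam.
data = [(x, "spam" if x >= 10 else "not spam") for x in range(20)]
data[3] = (3, "spam")  # one mislabeled point that a single tree could latch onto

forest = fit_forest(data)
print("thresholds learned:", sorted(set(forest)))
print("x=15 ->", predict(forest, 15.0))
print("x=2  ->", predict(forest, 2.0))
```

Because each stump sees a slightly different sample, the learned thresholds vary, and the vote smooths over the stumps that were thrown off by the mislabeled point.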