Datastrato

Software Development

San Mateo, CA 1,404 followers

Original creator of Apache Gravitino. Unified metadata platform for AI - multi-cloud, multi-engine and multi-modal.

View all 21 employees

About us

Datastrato is building the open data fabric platform to accelerate trusted AI. The company is the original creator of Apache Gravitino, a unified metadata platform for AI - multi-cloud, multi-engine and multi-modal.

Website: https://datastrato.ai/
Industry: Software Development
Company size: 11-50 employees
Headquarters: San Mateo, CA
Type: Privately Held
Founded: 2023
Specialties: Apache Gravitino, Universal Data Catalog, Federated Metadata Lake, Unified Metadata Management, Iceberg REST Catalog, Agentic Data Architecture, Data Agents, AI Infrastructure, Model Registry, Multi-Modal Data Management, AI Data Fabric, Unstructured Data Management, Data Governance, Data Lakehouse, Single Source of Truth, Multi-cloud Data Management, Open Source Software, AI-Native Data Lakehouse, and Modern Data System

Locations

Primary

San Mateo, CA 94402, US

Palo Alto, US

Updates

Datastrato

1,404 followers
1w Edited
Data platforms shouldn’t rely on reactive firefighting. 🚫🔥 That’s why we’re excited to introduce Apache Gravitino 1.2.0. This release brings:
• Automated lakehouse maintenance with Table Maintenance Service
• ClickHouse catalog for real-time analytics governance
• End-to-end UDF management
• Scan planning offload for faster queries
• Expanded ecosystem across Trino, Apache Flink, and Delta Lake
We are building the open metadata layer for the modern data stack. 🏗️
🔗 Check out the full release notes here: https://lnkd.in/gdyiENq5
#ApacheGravitino #OpenSource #DataGovernance #ModernDataStack #DataEngineering #Lakehouse #BigData #Metadata #ClickHouse #Apache
Datastrato

1,404 followers
2w
Most data systems were built for humans, not AI. That’s why agents hallucinate and fail in production. Datastrato has just unveiled ADP (Agentic Data Protocol) — the "Missing Layer" that gives AI agents the eyes to understand business context and the governed interfaces to act safely. The Agentic Data era starts now. 🚀

Introducing ADP: The Missing Layer for Agentic AI Datastrato on LinkedIn

Datastrato

1,404 followers
2w
It’s 2:00 AM. A critical pipeline breaks. Who fixes it? 🛠️ Traditionally, an engineer wakes up. In the "Agentic" future, an AI agent detects, analyzes, and resolves the issue before you even wake up. But for this to work, our data systems need a radical redesign. In his latest article, our founder Junping Du explores how the intersection of OpenClaw and open data architecture is paving the way for autonomous data operations. Discover why metadata and a unified control plane (like Apache Gravitino) are the keys to unlocking the next decade of data innovation. Read the full vision here: 🔗 https://lnkd.in/grNsSi5w #AI #DataEngineering #OpenSource #GenerativeAI #Datastrato #ApacheGravitino #AgenticAI

When OpenClaw Met Data: The Beginning of Agentic Data Infrastructure Junping Du on LinkedIn

Datastrato

1,404 followers
2w
Streaming data management just got a whole lot easier. Learn how Apache Gravitino meets Apache Flink to empower your streaming pipelines with seamless metadata access and unified control. Shout out to FANNG 1 for this deep dive into streamlining the streaming stack! 🙌 If you're building modern data platforms, this is a must-read. Check it out: https://lnkd.in/gXgwbvcT #Datastrato #MetadataManagement #RealTimeData #DataEngineering #OSS

Apache Gravitino

289 followers
2w

Empower Your Streaming Pipelines: Apache Gravitino meets Apache Flink! 🚀
🔗 https://lnkd.in/gXgwbvcT
Managing metadata across diverse sources in streaming architectures can be a major headache. In our latest Gravitino 101 series, we dive deep into how Apache Gravitino integrates with Apache Flink to simplify streaming data management.
What’s inside:
✅ Seamlessly managing Flink catalogs with Gravitino.
✅ Enabling unified metadata access for real-time pipelines.
✅ Step-by-step guide to streamlining your streaming stack.
Check out the full blog post to see how you can bring better governance to your streaming data!
#ApacheGravitino #ApacheFlink #DataInfrastructure #Streaming #OpenSource #DataEngineering

Using Gravitino with Apache Flink for Streaming dev.to

Datastrato

1,404 followers
3w
What can a 1989 episode of Star Trek teach us about 2026 AI? 🛰️ Datastrato CEO Junping Du explains why the secret to successful AI Agent systems lies in the balance between:
🤖 Data: Computation, Precision, Automation
🧠 Picard: Judgment, Ethics, Governance
It’s not Human vs. Machine; it’s Human + Machine.
#AIAgents #DataScience #OpenClaw #StarTrek #Datastrato

What Star Trek Taught Us About AI Agents Junping Du on LinkedIn

Datastrato

1,404 followers
1mo
👏The Linux Foundation Member Summit wrapped up yesterday in Napa. 🥂We’re grateful to have taken part in the conversations alongside engineers, maintainers, and community leaders from across the Bay Area and beyond. For us, showing up matters — listening, learning, and engaging in thoughtful discussions around open collaboration, governance, and long-term sustainability in open source. 🎉Thank you to everyone who shared insights and perspectives. We value the opportunity to participate and look forward to continuing the dialogue. #OpenSource #LFMemberSummit #BayAreaTech
Datastrato

1,404 followers
1mo Edited
Everyone talks about data lakes. No one talks about catalog sprawl. That’s the real bottleneck. Apache Gravitino introduces a new layer: A control plane across multimodal catalogs. Not another engine. A catalog of catalogs. 🎥 Watch the short: https://lnkd.in/gmuphqtR 💻 Explore the project on GitHub: https://lnkd.in/g6YpuiGu #AIAgents #AgenticAI #AIInfrastructure #AIDataStack #Metadata #DataGovernance #OpenSource

Datastrato

1,404 followers
1mo
🚀 New Deployment Guide is Live! We’ve published a step-by-step guide on deploying Iceberg REST Catalog with access control using Apache Gravitino.
👉 Read the full article on DEV: https://lnkd.in/gnKKubMn
Learn how to:
• Enable RBAC for Iceberg tables
• Configure Gravitino + REST service
• Integrate with Spark
• Implement table-level permissions
If you're running Apache Iceberg in production, this guide is for you!
#datacatalog #ApacheGravitino #ApacheIceberg #Lakehouse #DataGovernance #OpenSource #DataPlatform
Datastrato

1,404 followers
1mo
💡A quick walkthrough on how Apache Gravitino can simplify metadata management for Apache Spark–based ETL pipelines.

Apache Gravitino

289 followers
1mo Edited

🚀 Apache Gravitino 101: A New Tutorial on Apache Gravitino + Apache Spark for ETL
🔗 Read now: https://lnkd.in/gatnZvbq
ETL pipelines are only as good as the metadata behind them. In our latest Apache Gravitino 101 tutorial, we explore how Apache Gravitino integrates with Apache Spark to enable more consistent, scalable ETL workflows.
💡 In this tutorial, you’ll learn how to:
- Integrate Apache Gravitino with Apache Spark for ETL workloads
- Use a centralized catalog to manage schemas and tables
- Reduce metadata inconsistency across Spark jobs
- Design ETL pipelines with governance in mind
We’ll continue sharing more hands-on Apache Gravitino tutorials focused on real-world data platform challenges. Feel free to share your thoughts or questions in the comments 👇
#ApacheGravitino #OpenSource #ApacheSpark #ETL #DataEngineering #Metadata #DataPlatform

Datastrato

1,404 followers
1mo
📣 Join Us at Iceberg Summit 2026
Iceberg Summit 2026 is just around the corner, and we’d love to see you there.
👉 Register here: https://lnkd.in/gpmY4gAr
📍 San Francisco
📅 April 8–9, 2026
This year, Datastrato will be attending the Summit to connect with data engineers, architects, and industry leaders, and to exchange ideas on modern data platforms and real-world production use cases. Whether you’re looking to learn, share experiences, or expand your network, Iceberg Summit is a great place to engage with the data community and explore what’s next. Special thanks to the Iceberg Summit team and the Apache Iceberg community for bringing everyone together.
#IcebergSummit #ApacheIceberg #DataEngineering #DataInfrastructure #OpenData #ModernDataStack #Datastrato

Employees at Datastrato

Andrew D.

Datastrato•15K followers

Tom Tan

2K followers

Mark Hoerth

Datastrato•940 followers

Shi Shao Feng

Datastrato•570 followers
