Every stage of the ML pipeline has a different data access pattern. 📥 Data ingestion needs high write throughput. 🔁 Preprocessing often needs mixed read and write performance. 🧠 Training needs high read throughput to keep GPUs busy. ⚡ Model deployment and inference need low latency and high concurrency. A single storage strategy rarely fits all of these needs. That is why AI infrastructure teams need to understand how data is accessed across the full pipeline, including file size, file count, access mode, data format, and latency requirements. Alluxio provides a unified data access layer between AI workloads and storage systems, helping teams serve data faster without creating unnecessary copies or moving data every time compute changes location. The result is a more flexible architecture for large-scale AI training, deployment, and inference. Read the white paper: https://lnkd.in/gzDG9tJJ #AIInfrastructure #DataInfrastructure #MLOps #MachineLearning #AI
Alluxio
Software Development
San Mateo, California 4,610 followers
High-performance distributed caching built for large-scale AI workloads.
About us
Alluxio accelerates data access at every stage of the AI lifecycle – from model training to deployment and inference cold starts to feature store queries – all without replacing your storage or changing your code. Alluxio customers achieve sub-millisecond time-to-first-byte (TTFB) latency and push more than 1 TB/s of throughput when accessing AI data stored in the cloud. Alluxio deploys as a lightweight, distributed cache between your AI compute workloads (training jobs, feature stores, inference servers) and wherever your AI data is persistently stored (for example, cloud storage such as S3, data lakes, HDFS, or NFS).
- Website
- https://www.alluxio.io/
- Industry
- Software Development
- Company size
- 51-200 employees
- Headquarters
- San Mateo, California
- Type
- Privately Held
- Founded
- 2015
Locations
- Primary: 1825 S Grant St, Suite 800, San Mateo, California 94402, US
Updates
In ML-driven trading, infrastructure performance directly affects decision time. Blackout Power Trading operates in North American power markets, where thousands of models need to run during a critical 15-minute daily trading window. Their feature data lived in S3, which provided the durability and cost benefits they wanted. But as the number of models grew, S3 latency began to limit how far the ML platform could scale. Alluxio helped remove that bottleneck. By adding a low-latency distributed caching layer between compute and S3, Blackout Power Trading was able to serve feature data and model artifacts much faster without redesigning their storage architecture.
The impact:
▸ Scaled from 5,000 to 100,000+ ML models
▸ Reduced training large-join query latency by 22–37×
▸ Reduced inference large-join query latency by 37–83×
▸ Preserved S3 as the durable storage layer
For Blackout Power Trading, faster feature access means more time for risk analysis and better trading decisions. Read the full case study: https://lnkd.in/gS3C4rbz #AIInfrastructure #MachineLearning #DataInfrastructure #ObjectStorage #MLOps
Tail latency, not average latency, determines how long a synchronous checkpoint step takes, because the slowest rank gates the entire training job. This is why we focused on P99 in benchmarking Alluxio AI 3.9's POSIX Write Cache:
➡️ Single-node cluster: 7.6 GiB/s peak write throughput, P99 under 2 ms
➡️ Three-node cluster: 20 GiB/s peak write throughput, P99 under 2 ms
Throughput scales near-linearly with worker count. P99 stays flat. Capacity grows with the compute layer instead of bottlenecking at the storage backend. That's what makes checkpoint cycles predictable at GPU-cluster scale. https://lnkd.in/gSb6Hf8F #AIInfrastructure #GPU #Benchmarks
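To make the "slowest rank gates the job" point concrete, here is a minimal simulation sketch in Python. Everything in it is an illustrative assumption rather than Alluxio benchmark data: the rank count, the 1% stall rate, the 50 ms stall, and the function names are all made up for the example.

```python
import math
import random
import statistics


def simulate_checkpoint_steps(num_ranks, num_steps, seed=0):
    """Return (per-write latencies, per-step durations) in milliseconds.

    Assumed model: most writes take about 1 ms, but 1% of writes hit a
    50 ms stall. A synchronous step ends only when its slowest rank ends.
    """
    rng = random.Random(seed)
    writes, steps = [], []
    for _ in range(num_steps):
        rank_latencies = [
            50.0 if rng.random() < 0.01 else rng.uniform(0.8, 1.2)
            for _ in range(num_ranks)
        ]
        writes.extend(rank_latencies)
        steps.append(max(rank_latencies))  # slowest rank gates the step
    return writes, steps


def percentile(values, pct):
    """Nearest-rank percentile, with pct in (0, 100]."""
    ordered = sorted(values)
    index = max(0, math.ceil(pct / 100 * len(ordered)) - 1)
    return ordered[index]


if __name__ == "__main__":
    writes, steps = simulate_checkpoint_steps(num_ranks=256, num_steps=200)
    print(f"mean write latency: {statistics.mean(writes):.2f} ms")
    print(f"P99 write latency:  {percentile(writes, 99):.2f} ms")
    print(f"mean step duration: {statistics.mean(steps):.2f} ms")
```

Under these assumptions the mean write latency stays close to 1 ms, yet with 256 ranks roughly 92% of steps (1 − 0.99^256) contain at least one stalled write, so the mean step duration sits near the stall latency. As a side calculation on the figures quoted in the post, 20 GiB/s across three nodes against 7.6 GiB/s on one node works out to 20 / (3 × 7.6) ≈ 0.88 of perfectly linear scaling.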
S3 is a natural place to keep large datasets. The harder part is what happens when AI and analytics jobs need to read from it constantly. Training, inference, RAG, and feature workloads all depend on fast data and metadata access. For many teams, moving everything into faster storage is expensive and hard to operate. Alluxio gives teams another option. It sits between applications and S3, caching frequently accessed data closer to compute while keeping S3 as the source of truth.
The architecture is straightforward:
→ Data stays in S3
→ Hot data is cached near applications
→ Workloads get faster access without a large migration
Read more: https://lnkd.in/gQErDMHC #AIInfrastructure #S3 #ObjectStorage #MLOps #DataInfrastructure
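The pattern described here (data stays in the object store, hot data is served from a cache near compute) is a read-through cache. The sketch below is a generic, single-process illustration in Python, not Alluxio's implementation or API. The class name, the `fetch` callable standing in for an object-store GET, and the LRU eviction policy are all assumptions chosen for the example.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Tiny LRU read-through cache in front of a slower backing store.

    `fetch` is a hypothetical callable standing in for an object-store GET.
    """

    def __init__(self, fetch, capacity):
        self._fetch = fetch
        self._capacity = capacity
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        value = self._fetch(key)  # backing store remains the source of truth
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return value


if __name__ == "__main__":
    # A dict plays the role of the bucket in this toy example.
    bucket = {"train/part-0": b"aaa", "train/part-1": b"bbb"}
    cache = ReadThroughCache(fetch=bucket.__getitem__, capacity=1)
    for key in ["train/part-0", "train/part-0", "train/part-1"]:
        cache.read(key)
    print(f"hits={cache.hits} misses={cache.misses}")
```

Because the cache only ever copies data out of the backing store on a miss, nothing has to be migrated: dropping the cache loses performance, not data.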
For many AI teams, the goal is simple: ☁️ Keep data in object storage. 📍 Serve hot data closer to compute. That architecture helps teams avoid unnecessary migration while improving access for training, inference, feature stores, and RAG workloads. Alluxio sits between AI applications and object storage, caching frequently accessed data on high-performance storage close to compute. The result is a faster data path without changing the source of truth. Read more: https://lnkd.in/g3yap_a2 #AIInfrastructure #DataInfrastructure #MLOps #ObjectStorage
PyTorch training performance depends on more than the model. Slow data loading, remote storage, CPU preprocessing, and memory transfers can all leave GPUs waiting.
A few places to look first:
→ I/O throughput
→ DataLoader time
→ CPU preprocessing
→ GPU utilization
→ Memory copy overhead
For teams scaling AI training, the full pipeline needs to keep up. Read the guide: https://lnkd.in/g2nDQSqZ #PyTorch #AIInfrastructure #MLOps #DataInfrastructure
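One simple way to check the "DataLoader time" item is to split each iteration's wall time into time spent waiting for the next batch and time spent in the training step. The Python sketch below is an assumption-laden illustration, not taken from the linked guide: `profile_loop`, `slow_batches`, and the `train_step` callable are hypothetical names, and the loader is simulated with `time.sleep` so the example runs without PyTorch.

```python
import time


def profile_loop(batches, train_step):
    """Split wall time into 'waiting for data' and 'compute'.

    `batches` can be any iterable (for example a PyTorch DataLoader);
    `train_step` is a hypothetical callable that consumes one batch.
    """
    data_seconds = 0.0
    compute_seconds = 0.0
    iterator = iter(batches)
    while True:
        start = time.perf_counter()
        try:
            batch = next(iterator)
        except StopIteration:
            break
        fetched = time.perf_counter()
        train_step(batch)
        done = time.perf_counter()
        data_seconds += fetched - start
        compute_seconds += done - fetched
    total = data_seconds + compute_seconds
    return {
        "data_seconds": data_seconds,
        "compute_seconds": compute_seconds,
        "data_fraction": data_seconds / total if total else 0.0,
    }


def slow_batches(count, delay_seconds):
    """Stand-in for a storage-bound loader: each batch takes `delay_seconds`."""
    for index in range(count):
        time.sleep(delay_seconds)
        yield index


if __name__ == "__main__":
    stats = profile_loop(slow_batches(5, 0.02), train_step=lambda batch: None)
    print(f"time waiting on data: {stats['data_fraction']:.0%}")
```

A high `data_fraction` suggests the input pipeline, not the model, is the bottleneck. On a GPU, CUDA work is launched asynchronously, so `train_step` would need to synchronize (for example with `torch.cuda.synchronize()`) before the compute timing is meaningful.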
Modern data lake environments are rarely simple. For Geely, data lived across public clouds, private clouds, HDFS, and OSS, supporting different workloads across the business.
That created familiar challenges:
→ Complex data movement between clusters
→ Heavy synchronization work
→ Kerberos and Ranger configuration across environments
→ Slower migration from on-prem to cloud storage
With Alluxio, Geely built a more efficient data lake architecture based on object storage. Alluxio provides a unified namespace across storage systems, simplifies HDFS-to-OSS access, and improves performance through distributed caching. Read more: https://lnkd.in/gV4Evu4S #DataLake #HybridCloud #DataInfrastructure #CloudStorage
A global top 10 e-commerce company was training search and recommendation models across multiple AWS regions and an on-prem data center. Their training data lived in S3 and had grown to hundreds of petabytes. The challenge was not model architecture. It was the data path. Training jobs faced storage and network bottlenecks, high S3 API and egress costs, and low GPU utilization.
With Alluxio AI, the company achieved:
✦ Over 50% reduction in AWS S3 API and egress charges
✦ 20% improvement in GPU utilization
✦ Less operational complexity in the on-prem data center
For AI infrastructure teams, this is a practical example of why data locality matters. Read the white paper: https://lnkd.in/gxYTKaw5 #AIInfrastructure #DataInfrastructure #GPU #MachineLearning
GenAI is not only changing how teams build models. It is also changing how people interact with enterprise data. For Uptycs, that meant enabling users to analyze large-scale telemetry data through natural language queries, powered by a GenAI text-to-SQL experience. But at that scale, the user experience still depends on the data layer underneath. Alluxio helps Uptycs accelerate access to data across S3 and HDFS, supporting faster analytics over massive operational datasets without requiring major changes to the existing architecture. Read the story: https://lnkd.in/g3DPyUNS #GenAI #DataInfrastructure #AIInfrastructure #Analytics
Checkpointing is the hidden tax on large-scale training. Most large training jobs checkpoint every few hundred to a few thousand steps. When checkpoint writes are synchronous and the backend is remote, every cycle stalls the entire job on the slowest writer, and the GPUs wait. Alluxio AI 3.9, launching today, addresses this directly.
→ POSIX Write Cache: write-back caching on the POSIX path used by every major training framework. 7.6 GiB/s per node, 20 GiB/s across three nodes, sub-2 ms P99.
→ RDMA support for read I/O: 92.8% of 200G InfiniBand link capacity, 99.0% of 400G NDR, sub-100 µs P99 on 4 KB reads.
The throughline from Alluxio AI 3.8: faster writes, faster reads, no migration, no API changes. 💥 https://lnkd.in/gBEtmE6w #AIInfrastructure #GPU #MLOps #Checkpointing
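For readers unfamiliar with write-back caching, the general idea is that a write is acknowledged once it lands in a fast local buffer, and persistence to the slow backend happens in the background. The Python sketch below illustrates only that general idea in a single process. It is not how Alluxio's POSIX Write Cache is implemented, and `WriteBackBuffer` and the `persist` callable are hypothetical names introduced for the example.

```python
import queue
import threading


class WriteBackBuffer:
    """Acknowledge writes immediately, persist them in the background.

    `persist` is a hypothetical callable standing in for a slow remote write.
    """

    def __init__(self, persist):
        self._persist = persist
        self._pending = queue.Queue()
        self.errors = []
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def write(self, name, payload):
        # Returns as soon as the write is queued, without waiting on the backend.
        self._pending.put((name, payload))

    def flush(self):
        # Block until every queued write has been handed to the backend.
        self._pending.join()

    def _drain(self):
        while True:
            name, payload = self._pending.get()
            try:
                self._persist(name, payload)
            except Exception as exc:  # record the failure, keep draining
                self.errors.append((name, exc))
            finally:
                self._pending.task_done()


if __name__ == "__main__":
    remote = {}  # a dict plays the role of remote storage
    buffer = WriteBackBuffer(persist=remote.__setitem__)
    buffer.write("step-1000.ckpt", b"weights")
    buffer.flush()
    print(sorted(remote))
```

The trade-off in any write-back design is durability: until `flush` returns, a checkpoint exists only in the buffer, so a production system needs failure handling that this sketch leaves out.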