👍My colleagues at Intel Corporation, Debendra Das Sharma, Gerald Pasdast, Sathya Tiagaraj, and Kemal Aygün published a paper titled "High-Performance, Power-Efficient Three-Dimensional System-in-Package Designs with Universal Chiplet Interconnect Express (#UCIe)" in Nature Portfolio. Excerpts:

📝The UCIe 1.0 specification defines interoperability using standard and advanced packaging technologies with planar interconnects. Here we examine the development of UCIe as bump pitches shrink with advances in packaging technologies for 3D integration of #chiplets. We report a die-to-die (#D2D) solution for the continuum of package bump pitches down to 1μm.

📝Our analysis suggests that—contrary to trends seen in traditional signaling interfaces—the most power-efficient performance for these architectures is achieved by reducing the frequency as bump pitches go down. Our architectural approach provides power, performance, and reliability characteristics approaching or exceeding those of a monolithic system-on-chip design as bump pitches approach 1μm.

📝One recent key trend—especially for 3D packaging technologies such as hybrid bonding (HB)—has been the aggressive shrinking of bump pitches between chiplets and the consequent reduction of the corresponding #interconnect distances and their associated electrical parasitics. As bump pitches decrease, the area under each bump shrinks, and the number of wires for a given area increases as the square of the bump-pitch reduction. With orders of magnitude of wire-density increase and area reduction, an architectural approach completely different from UCIe 1.0 should be pursued. When architected correctly, interconnected chiplets at these low bump pitches will offer better latency and power characteristics than large monolithic dies and will deliver the same benefits that Moore's Law has provided through reduced transistor sizes for 50+ years.

📝The UCIe-3D approach is amenable to synthesis and automatic place-and-route tools and adaptable to a wide range of floor plans. It is highly desirable to enable static timing analysis for timing closure of the D2D crossing; to facilitate that, we suggest specifying timing at the HB bump boundary and continuing with the forwarded-clock architecture of UCIe-S and UCIe-A to establish a set of clock-to-data specifications at the bump pins.

📝In this UCIe-3D architecture, each #chiplet can be connected to the chiplet above or below it in a face-to-face, face-to-back, back-to-face, or back-to-back configuration. In non-face-to-face scenarios, signals must travel through silicon vias. Further exploration is required into silicon-via manufacturing and assembly technologies that can scale with this bump-pitch range.

🔍Observation: TSV isn't going anywhere. Until TGV comes.

Additional reading:
🏷️Full paper: https://lnkd.in/gR5euCER
🏷️Heart of Glass: https://lnkd.in/gaDhG5fK
🏷️Chiplet (VII): https://lnkd.in/gyr6ZrV6

#IAmIntel #SiP #AI
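The square-law wire-density claim above is easy to sanity-check with grid geometry. A minimal sketch in Python (pitch values are illustrative; 45μm sits in the UCIe 1.0 advanced-package range, 1μm is the paper's hybrid-bonding endpoint):

```python
# Quick check of the square-law claim: on a regular grid, shrinking the
# bump pitch by a factor k multiplies connections per unit area by k^2.

def bumps_per_mm2(pitch_um: float) -> float:
    """Bump sites per mm^2 on a square grid with the given pitch (um)."""
    per_mm = 1000.0 / pitch_um      # bump sites along one millimeter
    return per_mm * per_mm

for pitch_um in (45.0, 25.0, 9.0, 1.0):
    print(f"{pitch_um:5.1f} um pitch -> {bumps_per_mm2(pitch_um):12,.0f} bumps/mm^2")

# 45 um -> ~494/mm^2; 1 um -> 1,000,000/mm^2: a ~2000x (45^2) density
# gain, matching the 'orders of magnitude' statement in the excerpt.
```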
Chiplet Technology Developments
Summary
Chiplet technology developments represent a shift in semiconductor design, where smaller, modular chips—called chiplets—are combined within a package to deliver performance and flexibility beyond traditional single-chip approaches. By connecting multiple chiplets through advanced packaging methods like hybrid bonding and universal interconnects, this approach enables faster innovation and scalability for demanding applications such as AI and high-performance computing.
- Embrace modular design: Consider breaking down large chip functions into smaller, specialized chiplets to boost performance and allow for easier upgrades or repairs.
- Prioritize interoperability: Use standardized interconnects like UCIe to ensure seamless communication between chiplets from different vendors and support plug-and-play integration.
- Focus on testing and reliability: Build robust test strategies that check each chiplet and its connections throughout manufacturing and deployment to maintain system health and identify faults early.
-
HYBRID BONDING

Hybrid bonding is rapidly emerging as a game-changer in advanced semiconductor packaging. It enables ultra-short vertical connections between dies, delivering major benefits in bandwidth, power efficiency, and scaling—especially for high-performance applications like AI chiplets and HBM. Despite its promise, mass adoption faces hurdles: high cost, front-end-level assembly precision, particle control, and thermal challenges. Still, the move from monolithic SoCs to chiplet-based designs is accelerating, thanks to hybrid bonding's ability to integrate diverse technologies efficiently. As we push past power and bandwidth limits, hybrid bonding will be key to unlocking next-gen performance.

1. Hybrid Bonding Advantages:
- Enables submicron interconnect pitches
- Improves bandwidth, power efficiency, and thermal/electrical performance
- Provides better scalability than solder-bump connections

2. Adoption Challenges:
- High cost limits mass adoption
- Requires front-end-level precision in assembly (e.g., die placement)
- Needs improvements in defect control, die alignment, copper dishing, and particle management

3. Manufacturing Complexities:
- Hybrid bonding integrates front-end and back-end processes
- Testing is more difficult than with traditional bumped devices
- Speed binning and pre-sorting are required in DRAM stacks

4. Market Drivers:
- Strong push from AI chiplets, DRAM, HBM, 3D NAND, and image sensors
- Enables disaggregated SoC design using chiplets on different process nodes
- Supports customization and cost efficiency in advanced packaging

5. Power Management Needs:
- Growing thermal and power density (up to 500 W/cm²) requires innovative solutions (see the sketch below)
- Shorter interconnects reduce resistance and improve power delivery
- Integrated power management and high-voltage DC/DC conversion are key solutions

6. Future Outlook:
- Transition from hybrid bonding to sequential 3D integration
- Fusion bonding emerging as an alternative for certain applications
- Hybrid bonding seen as critical for next-gen chip architectures

Fine-pitch hybrid bonding, even with backside power distribution, leads to high heat concentration that requires improved heat sinks.

Source: imec
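The 500 W/cm² figure in point 5 deserves a back-of-envelope look. A minimal sketch of the temperature rise across a thermal interface, assuming illustrative interface resistances (not imec data):

```python
# Temperature rise across an interface is roughly heat flux times the
# area-normalized thermal resistance: dT = q'' * R''_th.
# The R''_th values below are assumptions chosen for illustration.

def delta_t(flux_w_cm2: float, r_th_cm2k_per_w: float) -> float:
    """Temperature rise (K) across an interface for a given heat flux."""
    return flux_w_cm2 * r_th_cm2k_per_w

for name, r_th in [("good interface", 0.05), ("poor interface", 0.20)]:
    print(f"{name}: {delta_t(500.0, r_th):6.1f} K rise at 500 W/cm^2")

# Even a modest 0.05 cm^2*K/W interface costs 25 K at that flux, which
# is why interface quality dominates stacked-die thermal budgets.
```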
-
We're no longer designing chips. We're engineering ecosystems—across die, data, and dimension. From AMD's Zen 5-based 3D V-Cache to UCIe 2.0 and TSMC's AI-powered 3Dblox workflows, the chiplet era isn't just here—it's evolving fast. Here's how the game is shifting:

🔹 Vertical isn't just about stacking—it's about performance density.
The Ryzen 9800X3D isn't just faster—it's architecturally smarter.
• +500 MHz base clocks
• 3x L3 cache via vertical die
• Uniform latency from equidistant cache layers
Result? 15–23% uplift in CPU-bound gaming without increasing power draw. This isn't just adding cache. It's about bringing it closer to intent—matching compute paths to workloads.

🔹 UCIe 2.0 is making chiplets truly modular.
Forget proprietary socket dances—this is plug-and-play at silicon scale.
• <1μm bump pitch = 82% lower latency
• Unified DFx = seamless cross-vendor integration
• FLIT-based links = 3x energy efficiency (a rough power model follows below)
Hybrid bonding, protocol-agnostic transport, and thermal/power telemetry are the real infrastructure for composable computing.

🔹 AI is now a co-designer.
With 3Dblox 2.1, TSMC is running electrothermal-stress convergence during layout.
• 19% thermal improvement in early floorplanning
• 12–15°C lower hotspots—before tapeout
This means AI isn't optimizing for benchmarks. It's optimizing for reliability, yield, and lifecycle from day zero.

🔹 This all converges on one truth: 200B+ transistor designs can't scale with human heuristics alone.
- You need AI.
- You need interoperability.
- You need an abstraction that respects physics.

So here's the real challenge for engineers today: Are you designing for specs? Or for systems? From TSMC to AMD, we're moving from "how fast this chip can go" to "how robust this stack is at scale." If you're in silicon, architecture, or AI-hardware convergence, this moment isn't optional. It's defining.

Curious: Where do you see the biggest bottlenecks in multi-die design today? Thermals, testing, yield, integration? Let's trade notes.
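A rough way to see where "energy efficiency multiples" like the FLIT claim above come from: link power is bandwidth times energy per bit. The pJ/bit figures below are assumptions for illustration, not UCIe 2.0 specifications:

```python
# Link power = bits per second * joules per bit. Shorter wires at each
# packaging step drop pJ/bit, which is where efficiency multiples arise.

def link_power_w(bandwidth_gbps: float, energy_pj_per_bit: float) -> float:
    """Power (W) for a link moving bandwidth_gbps at energy_pj_per_bit."""
    return bandwidth_gbps * 1e9 * energy_pj_per_bit * 1e-12

bw = 1000.0  # 1 Tb/s of die-to-die traffic, arbitrary example
for name, pj in [("off-package SerDes ", 5.0),
                 ("2.5D interposer link", 0.5),
                 ("3D hybrid-bond link ", 0.05)]:
    print(f"{name}: {link_power_w(bw, pj):6.2f} W at {bw:.0f} Gb/s")

# 5.00 W -> 0.50 W -> 0.05 W: each step shortens the channel by roughly
# an order of magnitude in pJ/bit (assumed values).
```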
-
4 Reasons Driving the Shift Toward Advanced Packaging

1. Moore's Law Slowdown
For decades, the industry relied on shrinking transistors (Moore's Law) to double performance every 18–24 months. But as we approach sub-3nm nodes, scaling becomes costlier and more complex, and yields drop. It's no longer economically viable to put everything into one monolithic chip.
➤ Example: Intel and TSMC now integrate multiple smaller chips (chiplets) instead of one giant die. This allows them to continue performance gains without relying solely on node shrinkage.
➤ Analogy: Think of trying to build a mansion on a tiny plot of land — it gets harder and more expensive to squeeze more rooms (transistors) in. Advanced packaging is like building several smaller houses (chiplets) and connecting them with efficient roads (interconnects).

2. Need for Higher Performance and Energy Efficiency
Modern applications — especially AI, 5G, AR/VR, and autonomous vehicles — require rapid data transfer between chips, low latency, and reduced power consumption. Advanced packaging allows chips (e.g., logic, memory, I/O) to be placed closer together, reducing signal travel distance, improving speed, and cutting power use.
➤ Example: NVIDIA's H100 GPU uses HBM3 memory stacked close to the GPU with advanced packaging, which massively boosts bandwidth and energy efficiency.
➤ Analogy: It's like relocating your kitchen, dining, and living areas closer together — less time and effort moving between them means faster and more efficient daily operations.

3. Demand from AI, HPC, and Data Centers
AI training models (like ChatGPT), high-performance computing, and hyperscale data centers need massive processing and memory bandwidth — beyond what traditional packaging can deliver. Advanced packaging enables multi-die systems that behave like a single chip but are customized and scalable.
➤ Example: AMD's EPYC processors use a chiplet architecture — separate core and I/O dies — to scale efficiently while reducing manufacturing cost and complexity.
➤ Analogy: Imagine one person trying to carry everything in a big suitcase (monolithic die). Instead, using multiple backpacks (chiplets) shared across a team (multi-die system) lets you carry more, faster, and more efficiently.

4. Rise of Chiplet-Based Architectures to Reduce Cost and Improve Yield
Instead of building a large, expensive chip with everything on it (which might fail in testing), companies now split the functions into smaller "chiplets," manufactured separately and assembled into one package. This improves yield (less waste), flexibility (reusable components), and time-to-market. (A toy yield calculation follows below.)
➤ Example: Intel's Meteor Lake uses chiplets built on different process nodes (e.g., TSMC for GPU, Intel for CPU), stitched together using Foveros 3D stacking.
➤ Analogy: It's like assembling a laptop from modular parts (screen, keyboard, battery) — if one part fails, you can replace or improve just that part, rather than scrapping the entire system.
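The yield argument in reason 4 can be made concrete with a standard Poisson defect model. A minimal sketch; the defect density and die areas are assumed for illustration:

```python
import math

# Poisson defect model: yield = exp(-area * defect_density).
# Large monolithic dies lose yield exponentially faster than chiplets.

def die_yield(area_mm2: float, d0_per_mm2: float) -> float:
    """Expected fraction of good dies: Y = exp(-A * D0)."""
    return math.exp(-area_mm2 * d0_per_mm2)

D0 = 0.002  # defects per mm^2 (assumed)
print(f"monolithic 800 mm^2 die : {die_yield(800, D0):.1%} yield")
print(f"single 200 mm^2 chiplet : {die_yield(200, D0):.1%} yield")

# ~20% vs ~67%: with known-good-die testing before assembly, a defect
# costs one small chiplet instead of one enormous monolithic die.
```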
-
Why DFT will matter even more as chiplets, 3D stacking, and AI silicon take over

I grew up in test. And that lens never left.

Chiplets break the "one die" mindset.
→ You are shipping a tiny system, not a part. DFT has to prove each die alone, then the links, then the entire stack, all at-speed.

3D stacking adds new fault surfaces.
→ You need pre-bond, mid-bond, and post-bond test access.
→ Simple JTAG at the package edge won't cut it.
→ Die wrappers and a clean test network inside the stack (IEEE 1838) become table stakes.

AI accelerators raise the bar.
→ Wide fabrics. Hot power profiles. Fast clocks.
→ You need built-in self-test (BIST) that runs at-speed.
→ You need margin monitors, droop sensors, and thermal eyes that report in the field.

Yield math turns ruthless.
→ One weak die can sink an entire stack.
→ DFT earns its keep with smart binning, repair, and graceful degrade.
→ Partial good is revenue when the use case allows.

Supply chains are changing.
→ You will buy third-party chiplets.
→ Trust is now a test problem too.
→ Secure test access, data guards, and provenance checks must be designed in.

Interfaces are the new fault lines.
→ Die-to-die links need loopbacks, PRBS, and eye checks you can trigger on demand (a PRBS sketch follows below).
→ Not just in the lab. In production and in the field.

Standards give you a head start.
→ IEEE 1838 for 3D access.
→ IEEE 1687 for on-chip instruments.
→ UCIe features for link test and health.

The mindset shift is simple.
→ Treat test as a product feature.
→ Expose health as telemetry.
→ Measure time to isolate a fault, not just coverage.
→ Feed real silicon data back into design every week.

If you are building for chiplets or 3D today, start here:
→ Make every die a known good citizen.
→ Design a test backbone that reaches every layer.
→ Put eyes on the links.
→ Plan for in-field checks from day one.

Want me to unpack a chiplet-ready DFT plan with examples?
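For the loopback/PRBS point above, here is what such a pattern generator looks like in miniature. A sketch of a generic PRBS-7 LFSR, not the UCIe-defined test sequence:

```python
# Minimal PRBS-7 generator (x^7 + x^6 + 1 LFSR), the style of pattern a
# die-to-die loopback test drives across a lane to count bit errors.

def prbs7(seed: int = 0x7F):
    """Yield one full 127-bit period of PRBS-7 from a 7-bit LFSR."""
    state = seed & 0x7F
    for _ in range(127):
        bit = ((state >> 6) ^ (state >> 5)) & 1   # taps: bits 7 and 6
        yield bit
        state = ((state << 1) | bit) & 0x7F

sent = list(prbs7())
received = sent[:]                 # pretend the far die looped it back cleanly
errors = sum(s != r for s, r in zip(sent, received))
print(f"period: {len(sent)} bits, ones: {sum(sent)}, bit errors: {errors}")

# A maximal-length PRBS-7 period is 127 bits with 64 ones; any nonzero
# error count localizes a fault to this specific lane.
```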
-
🧇Die-to-Wafer Hybrid Bonding — Key Enabler for the 3D Integration Era SemiVision: As pitch scaling approaches <10 μm and thin die stacking reaches up to 20 layers (30–50 μm each), traditional micro-bump interconnects are reaching their physical limits. This is where Die-to-Wafer (D2W) hybrid bonding steps in — enabling direct copper-to-copper connections for ultra-fine pitch and high-density integration. Tokyo Electron’s latest research, “Die-to-Wafer Advanced Packaging: Challenges for Integration, Yield, Placement Accuracy and Metrology”, explores the critical process innovations required for chiplet disaggregation and vertical stacking, from surface preparation and bonding precision to yield optimization and metrology. By combining Cu hybrid bonding and bump-less stacking, D2W technology paves the way for next-generation HBM, 3D SoC, and chiplet architectures, pushing semiconductor integration beyond the limits of traditional packaging. #AdvancedPackaging #3DIntegration #HybridBonding #TEL #TCB #Chiplet #HBM #Semiconductor #AI
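The numbers in the post imply a useful back-of-envelope: total stack height and the pad-density gain from finer pitch. A minimal sketch, with 40μm taken as an assumed microbump baseline:

```python
# Rough arithmetic behind the D2W stacking claims: 20 dies at 30-50 um
# each, and the pad-density gain of <10 um hybrid bonding over microbumps.

def stack_height_um(layers: int, die_um: float) -> float:
    """Total silicon height of a stack of thinned dies."""
    return layers * die_um

def pads_per_mm2(pitch_um: float) -> float:
    """Pad sites per mm^2 on a square grid with the given pitch."""
    return (1000.0 / pitch_um) ** 2

print(f"20 x 30 um dies : {stack_height_um(20, 30):5.0f} um of silicon")
print(f"20 x 50 um dies : {stack_height_um(20, 50):5.0f} um of silicon")
print(f"40 um microbumps: {pads_per_mm2(40):9,.0f} pads/mm^2")
print(f"10 um hybrid    : {pads_per_mm2(10):9,.0f} pads/mm^2")

# A 20-high stack is only ~0.6-1.0 mm of silicon, while moving from
# 40 um bumps (assumed) to 10 um pads gives a 16x pad-density gain.
```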
-
Packaging bottlenecks for chiplets, heterogeneous integration, 2.5D/3D packaging, interposer and substrate design.

Core packaging bottlenecks
- Die-to-die interconnect: Bandwidth density, latency, power per bit, equalization at fine pitches; UCIe vs AIB/BoW interoperability and PHY maturity.
- Power delivery and IR drop: PDN co-design across dies/interposer/substrate; decap placement limits; simultaneous switching noise. (A toy IR-drop budget follows below.)
- Thermals and warpage: Hotspots from asymmetric workloads; buried-die heat removal; CTE mismatch across silicon/organic/glass; assembly-induced stress.
- Yield multiplication: KGD insufficiency; "known good system" remains hard; redundancy/spare lanes and repair needed.
- Capacity and cost: Advanced packaging tool/OSAT constraints

2.5D packaging (interposers/bridges)
- Silicon interposers (CoWoS/SoIC/EMIB): Fine-pitch RDL for HBM and chiplets, but high cost, TSV-induced stress, interposer yield, and reticle-stitching complexity.
- Bridges (EMIB/Si-bridge): Localized high-density links reduce full-interposer cost but add routing/placement constraints and SI/PI discontinuities.
- Glass interposers: Lower loss and better CTE vs organic; immature supply chain, via/RDL processes, and reliability data.
- Active vs passive interposers: Active aids retiming/voltage regulation but adds heat, complexity, and new failure domains.

3D stacking
- Vertical interconnect: Micro-bumps vs hybrid bonding (Cu–Cu) trade-offs in pitch, parasitics, yield; TSV keep-out zones hurt area.
- Thermal limits: Stacked logic/HBM create heat-removal barriers; need heat vias, thermal TSVs, microfluidics, or die thinning.
- Power integrity: Tier-to-tier IR drop and resonances; backside power delivery helps but complicates the thermal path and process flow.
- Assembly/yield: Wafer-to-wafer vs die-to-wafer choices; binning alignment; reworkability is low.

Interposer and substrate design
- Signal integrity: Loss/crosstalk at multi-GHz; channel uniformity, impedance control, return paths; accurate S-parameter extraction.
- PDN architecture: Multi-domain power islands, via farms, ground meshes; placement of on-interposer decaps and IVRs.
- Routing density: Fine L/S on interposer RDL vs limits of organic substrates; escape routing for HBM channels and wide UCIe links.
- Material choices: Organic (HDI) for cost, silicon for density, glass for low loss/CTE; reliability under temperature/humidity and power cycling.
- EM isolation: RF/analog coexistence with high-speed digital; guard rings, stitching vias, shielding layers, substrate noise control.

Heterogeneous integration pain points
- Mixed nodes/materials: RF/analog on mature nodes with advanced-node logic; isolation from digital switching noise and supply ripple.
- Co-packaged optics: Thermal and mechanical co-design; fiber-attach tolerances; contamination risk during assembly.
- Memory proximity: HBM bandwidth vs footprint/thermals; future NVRAM/3D SRAM integration challenges.

Please reach out if you are facing any of these challenges.
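For the IR-drop bullet under core bottlenecks, a toy budget check makes the stakes visible. All supply, current, and resistance values below are assumptions:

```python
# IR drop is load current times effective PDN resistance; at low core
# voltages the allowable droop budget shrinks to tens of millivolts.

def ir_drop_mv(current_a: float, r_pdn_mohm: float) -> float:
    """IR drop in mV for a load current (A) and PDN resistance (mOhm)."""
    return current_a * r_pdn_mohm   # A * mOhm = mV

vdd_mv = 750.0          # assumed core supply
budget = 0.05 * vdd_mv  # ~5% droop budget (assumption)
for i_load in (50.0, 100.0, 200.0):   # amps drawn by the tile
    drop = ir_drop_mv(i_load, 0.2)    # 0.2 mOhm effective PDN (assumed)
    flag = "OK" if drop <= budget else "FAIL"
    print(f"{i_load:5.0f} A -> {drop:5.1f} mV drop ({flag}, budget {budget:.1f} mV)")

# At 200 A even a 0.2 mOhm path exceeds the 37.5 mV budget, which is
# why decap placement and via farms dominate 2.5D/3D PDN work.
```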
-
CHIPLET CPUs AS FIELD-CONDITIONED SYSTEMS: WHY ARROW LAKE CHANGES THE RULES OF PERFORMANCE TUNING

Modern chiplet CPUs behave less like isolated compute islands and more like field-conditioned networks. Their performance no longer depends on how fast a single core can toggle, but on how coherently the entire interconnect fabric can sustain that activity. It's the same shift we see in quantum-adaptive materials, where geometry and field structure—not just local energy scales—govern the effective dynamics.

Intel's Arrow Lake (Robert Hallock) makes this transition impossible to ignore. Overclocking is no longer a single-variable exercise where raising core frequency automatically lifts the rest of the system. These processors are distributed systems packaged as a single device, and once you adopt chiplets, you inherit the full complexity of distributed computing: multiple latency domains, fabric synchronization challenges, inter-die bandwidth ceilings, and clock-domain boundaries. Arrow Lake simply exposes this reality to the end user.

In earlier monolithic designs, increasing core frequency implicitly accelerated the internal fabric, cache paths, and memory-controller interactions because everything lived on the same slab of silicon. Arrow Lake breaks that assumption. The compute tile, the SoC tile, and the cross-die interconnect each operate in their own frequency and voltage domains. If any one of them lags behind, it becomes the bottleneck, even when the cores themselves remain perfectly stable at higher clocks. This is the core of Robert Hallock's message: core frequency is no longer the dominant variable.

The inter-tile link is where this becomes most visible. The compute tile and the SoC tile communicate through a physical connection that behaves more like a miniature on-package network than a traditional on-die bus. When core clocks rise but the inter-tile link remains at stock settings, the system enters a mismatch where cores request data faster than the fabric can deliver it. The result is simple: stalls, not speed. This is why some Arrow Lake overclocks show no performance gain despite higher multipliers. The bottleneck has merely shifted.

Even the mechanical design reflects this new era. Arrow Lake introduces two non-functional support dies under the heat spreader. They do not compute anything, but they ensure uniform mechanical pressure and eliminate large internal voids that would otherwise disrupt thermal behavior. As chiplets shrink and packages become more heterogeneous, maintaining predictable thermal and mechanical characteristics becomes harder. These support dies stabilize the environment so that cooling solutions behave consistently, which matters when pushing voltage and frequency margins.

Understanding the geometry of the package is now as important as understanding the cores themselves. The structure of the system defines what the system can become.

https://lnkd.in/evjq2TYg
Video: How Overclocking Really Works on Intel CPUs | The Blueprint (YouTube)
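The "stalls, not speed" argument reduces to a min() over domains: delivered throughput is capped by the slowest one. A minimal sketch with purely illustrative numbers:

```python
# Delivered throughput is bounded by the slowest domain, so raising
# core clocks past the fabric's delivery rate buys nothing.

def delivered(core_demand: float, fabric_supply: float) -> float:
    """Effective throughput = min(what cores request, what the fabric feeds)."""
    return min(core_demand, fabric_supply)

fabric = 100.0  # data the inter-tile link can feed, arbitrary units
for core_clock_ghz in (5.0, 5.4, 5.8):
    demand = core_clock_ghz * 20.0   # demand scales with core clock (assumed)
    print(f"{core_clock_ghz:.1f} GHz cores -> delivered "
          f"{delivered(demand, fabric):.0f} (demand {demand:.0f}, fabric {fabric:.0f})")

# Above 5.0 GHz the cores just wait on the link: the overclock shifted
# the bottleneck to the fabric instead of adding performance.
```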
-
*** Chiplet-Based SoCs' Bandwidth Problem ***

Chiplets are a game changer—but they come with an often-overlooked tradeoff: bandwidth scaling isn't free.

In a monolithic SoC, data moves through ultra-high-bandwidth, low-latency on-die interconnects. When that same traffic crosses chiplet boundaries, several things change:
o Serialization overhead adds latency.
o Interconnect power consumption increases per bit transferred.
o Scaling bandwidth requires more package-level interconnect density.
(A toy cost model for this crossing follows below.)

This is why chiplet success hinges on interconnect innovation:
o UCIe aims to make chiplet integration as seamless as possible.
o Intel's EMIB and Foveros use bridges and stacking to mitigate latency.
o TSMC's CoWoS and SoIC push the limits of bandwidth density.

The challenge? There's no one-size-fits-all solution. A chiplet interconnect optimized for AI acceleration won't work for high-performance computing, and vice versa.
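A toy cost model for the boundary crossing described above: serialization latency plus per-bit energy. Lane counts, rates, and pJ/bit values are assumptions, not vendor figures:

```python
# Moving a burst across a link costs base latency plus serialization
# time (bits / aggregate Gb/s = ns), and energy of pJ-per-bit * bits.

def transfer_cost(bits: float, lane_gbps: float, lanes: int,
                  base_latency_ns: float, pj_per_bit: float):
    """Return (latency_ns, energy_uj) for moving `bits` across a link."""
    serialization_ns = bits / (lane_gbps * lanes)
    return base_latency_ns + serialization_ns, bits * pj_per_bit * 1e-6

payload = 64 * 8 * 1024.0  # a 64 KiB burst, in bits
for name, gbps, lanes, lat, pj in [
        ("on-die fabric    ", 16.0, 512, 2.0, 0.05),
        ("D2D standard pkg ", 16.0, 64, 8.0, 0.50)]:
    t, e = transfer_cost(payload, gbps, lanes, lat, pj)
    print(f"{name}: {t:7.1f} ns, {e:6.3f} uJ")

# Same burst, ~8x the latency and 10x the energy once it leaves the
# die: exactly the serialization/power tradeoff listed in the post.
```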
-
Advanced Packaging is the New Materials Battleground

We've moved past monolithic chips. Today's performance gains come from chiplet-based processors mixing CPUs, GPUs, accelerators, and memory in one package. But that leap hinges on materials breakthroughs we still haven't mastered.

→ Interposers under fire. Organic build-up films (ABF) warp at tight pitches and sap signal integrity. Glass and ceramic-core interposers promise flatter, lower-loss alternatives—yet scaling them and matching their CTE to silicon is a steep climb.

→ Die-attach dilemma. Standard solders and epoxies crack under 3D stacking's thermal/mechanical stress. We need die-attach materials that cure at low temperature but stand up to 125°C+ cycles without delaminating.

→ TIM bottleneck. Three-dimensional stacks can push heat flux above 500 W/cm². Liquid-infused nanocomposite TIMs and graphene-enhanced interfaces look great in the lab, but integrating them into wafer-level packaging without voids is a nightmare.

→ Through-silicon vias & wafer packaging. Embedding TSVs demands dielectric liners that don't fracture under thermal cycling. Ultra-thin wafers only make the mismatch worse.

The engineering community is racing on glass interposers, novel underfills, and nano-TIMs. But until these materials scale reliably, packaging—not transistors—will throttle tomorrow's computing power. (A quick CTE-mismatch strain estimate follows below.)

Are materials scientists ready to fill these gaps? Or will advanced packaging remain the Achilles' heel of chiplet performance?

#AdvancedPackaging #HeterogeneousIntegration #ThermalManagement
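The die-attach and interposer stress points above follow from a one-line relation: mismatch strain is ΔCTE × ΔT. A minimal sketch with textbook-ballpark CTE values and an assumed thermal cycle:

```python
# Thermal mismatch strain between two bonded materials is
# |CTE_a - CTE_b| * delta_T. CTE values are approximate ballparks;
# the 150 K cycle range is an assumption.

def thermal_strain(cte_a_ppm: float, cte_b_ppm: float, delta_t_k: float) -> float:
    """Mismatch strain (unitless) for two bonded materials over a cycle."""
    return abs(cte_a_ppm - cte_b_ppm) * 1e-6 * delta_t_k

SI, ORGANIC, GLASS = 2.6, 17.0, 3.5   # CTE in ppm/K (approximate)
dT = 150.0                            # e.g., a -40 to +110 C cycle (assumed)
print(f"Si vs organic substrate: {thermal_strain(SI, ORGANIC, dT):.2%} strain")
print(f"Si vs glass interposer : {thermal_strain(SI, GLASS, dT):.2%} strain")

# ~0.22% vs ~0.01%: the order-of-magnitude gap is why glass cores are
# attractive, even though scaling them remains the hard part.
```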