Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

1 - 15 of 11064 publications
    CrossCheck: Input Validation for WAN Control Systems
    Rishabh Iyer
    Isaac Keslassy
    Sylvia Ratnasamy
    Networked Systems Design and Implementation (NSDI) (2026) (to appear)
    Abstract: We present CrossCheck, a system that validates inputs to the Software-Defined Networking (SDN) controller in a Wide Area Network (WAN). By detecting incorrect inputs—often stemming from bugs in the SDN control infrastructure—CrossCheck alerts operators before they trigger network outages. Our analysis at a large-scale WAN operator identifies invalid inputs as a leading cause of major outages, and we show how CrossCheck would have prevented those incidents. We deployed CrossCheck as a shadow validation system for four weeks in a production WAN, during which it accurately detected the single incident of invalid inputs that occurred while sustaining a 0% false positive rate under normal operation, hence imposing little additional burden on operators. In addition, we show through simulation that CrossCheck reliably detects a wide range of invalid inputs (e.g., detecting demand perturbations as small as 5% with 100% accuracy) and maintains a near-zero false positive rate for realistic levels of noisy, missing, or buggy telemetry data (e.g., sustaining zero false positives with up to 30% of corrupted telemetry data).
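The flavor of check described — flagging controller inputs whose demands deviate from telemetry-derived estimates beyond a relative threshold (the 5% perturbations mentioned above) — can be illustrated with a toy sketch. The function name, data layout, and threshold handling here are our own assumptions, not taken from the paper:

```python
def validate_demands(reported, estimated, threshold=0.05):
    """Return indices of demand entries whose relative deviation from
    the telemetry-derived estimate exceeds `threshold` (5% by default).
    A non-empty result suggests the controller input should be held
    for operator review rather than pushed to the network."""
    flagged = []
    for i, (r, e) in enumerate(zip(reported, estimated)):
        denom = max(abs(e), 1e-9)  # guard against zero estimates
        if abs(r - e) / denom > threshold:
            flagged.append(i)
    return flagged

# Flow 1 is perturbed by 10% (flagged); flow 2 by only 1% (passes).
flags = validate_demands([100.0, 220.0, 50.5], [100.0, 200.0, 50.0])
```

A real validator would of course weigh many signals besides point demands; this only shows the shape of a threshold-based shadow check.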
    Abstract: How many T gates are needed to approximate an arbitrary n-qubit quantum state to within a given precision ε? Improving prior work of Low, Kliuchnikov, and Schaeffer, we show that the optimal asymptotic scaling is Θ(√(2^n log(1/ε)) + log(1/ε)) if we allow an unlimited number of ancilla qubits. We also show that this is the optimal T-count for implementing an arbitrary diagonal n-qubit unitary to within error ε. We describe an application to batched synthesis of single-qubit unitaries: we can approximate a tensor product of m = O(log log(1/ε)) arbitrary single-qubit unitaries to within error ε with the same asymptotic T-count as is required to approximate just one single-qubit unitary.
    Productionizing Quantum Mass Production
    Bill Huggins
    Nathan Wiebe
    arXiv (2026) (to appear)
    Abstract: For many practical applications of quantum computing, the slowest and most costly steps involve coherently accessing classical data. We help address this challenge by applying mass production techniques, which can sometimes allow us to perform an operation many times in parallel for a cost comparable to that of a single execution [1-3]. We combine existing mass-production results with modern approaches for loading classical data using "quantum read-only memory." We show that quantum mass production techniques offer no benefit when we consider a cost model that focuses purely on the number of non-Clifford gates. However, analyzing the constant factors in a more nuanced cost model, we find that it may be possible to obtain a cost reduction of an order of magnitude or more for a variety of reasonably sized fault-tolerant quantum algorithms. We present several applications of quantum mass-production techniques beyond naive parallelization, including a strategy for reducing the cost of serial calls to the same data loading step.
    Abstract: Semantic data models express high-level business concepts and metrics, capturing the business logic needed to query a database correctly. Most data modeling solutions are built as layers above SQL query engines, with bespoke query languages or APIs. The layered approach means that semantic models can’t be used directly in SQL queries. This paper focuses on an open problem in this space – can we define semantic models in SQL, and make them naturally queryable in SQL? In parallel, graph query is becoming increasingly popular, including in SQL. SQL/PGQ extends SQL with an embedded subset of the GQL graph query language, adding property graph views and making graph traversal queries easy. We explore a surprising connection: semantic data models are graphs, and defining graphs is a data modeling problem. In both domains, users start by defining a graph model, and need query language support to easily traverse edges in the graph, which means doing joins in the underlying data. We propose some useful SQL extensions that make it easier to use higher-level data model abstractions in queries. Users can define a “semantic data graph” view of their data, encapsulating the complex business logic required to query the underlying tables correctly. Then they can query that semantic graph model easily with SQL. Our SQL extensions are useful independently, simplifying many queries – particularly, queries with joins. We make declared foreign key relationships usable for joins at query time – a feature that seems obvious but is notably missing in standard SQL. In combination, these extensions provide a practical approach to extend SQL incrementally, bringing semantic modeling and graph query together with the relational model and SQL.
    FreshBrew: A Benchmark for Evaluating AI Agents on Java Code Migration
    Diganta Misra
    Yanqi Luo
    Anjali Sridhar
    Justine Gehring
    Silvio Soares Ribeiro Junior
    2026
    Abstract: AI coding assistants are rapidly becoming integral to modern software development. A key challenge in this space is the continual need to migrate and modernize codebases in response to evolving software ecosystems. Traditionally, such migrations have relied on rule-based systems and human intervention. With the advent of powerful large language models (LLMs), AI-driven agentic frameworks offer a promising alternative—but their effectiveness remains underexplored. In this paper, we introduce FreshBrew, a novel benchmark for evaluating AI-based agentic frameworks on project-level Java migrations. We benchmark several such frameworks, powered by state-of-the-art LLMs, and compare their performance against established rule-based tools. Our evaluation of AI agents on this benchmark of 228 repositories shows that the top-performing model, Gemini 2.5 Flash, can successfully migrate 56.5% of projects to JDK 17. Our empirical analysis reveals novel insights into the critical strengths and limitations of current agentic approaches, offering actionable insights into their real-world applicability. By releasing FreshBrew publicly upon acceptance, we aim to facilitate rigorous, reproducible evaluation and catalyze progress in AI-driven codebase modernization.
    Abstract: Soil moisture is a key variable for a range of hydrological and ecological processes, yet capturing its small-scale variability and preferential flow phenomena remains challenging. Recent advancements in deep learning have demonstrated potential in predicting hydrological variables, but conventional data-driven models often struggle to represent small-scale variability effectively. In this study, we integrate Long Short-Term Memory (LSTM) networks and Gaussian Mixture Models (GMMs) to simulate soil moisture dynamics while explicitly quantifying its associated variability. Unlike deterministic approaches, our probabilistic framework accounts for nonlinear relationships between inputs and outputs while modeling the inherent small-scale variability in soil moisture. We apply this methodology to an in-situ soil moisture dataset from the Attert experimental basin, where the experimental design clusters three soil moisture profiles for each location and depth, providing an ideal framework for examining small-scale variability of soil moisture across distinct geological and pedological settings. Our results demonstrate that the proposed model effectively reproduces soil moisture dynamics across multiple depths and scales, achieving an average Kling-Gupta Efficiency (KGE) of 0.52, a rank correlation of 0.72, and a root mean squared error of 0.036 m³ m⁻³, while also capturing the key aspects of small-scale variability and sensor uncertainty. Furthermore, the modeled distributions offer new insights into the spatiotemporal structure of soil moisture and underscore the value of probabilistic modeling in hydrological approaches. By explicitly incorporating small-scale variability into the modeling process, our approach enhances both the interpretability and reliability of soil moisture predictions. While LSTMs effectively capture temporal dynamics, our findings underscore the necessity of incorporating variability quantification to improve model accuracy and generalization. This study highlights the potential of probabilistic deep learning frameworks in hydrological modeling and supports their broader application for improved soil moisture estimation and variability assessment.
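The probabilistic ingredient described above — a mixture-density output trained against a Gaussian-mixture likelihood — can be sketched in a few lines. This is our own minimal illustration of the loss such a head minimizes; the paper's actual LSTM+GMM architecture and parameters are not reproduced here:

```python
import math

def gmm_nll(y, weights, means, sigmas):
    """Negative log-likelihood of a scalar observation y under a 1-D
    Gaussian mixture with given component weights, means, and standard
    deviations — the training objective of a mixture-density head."""
    density = 0.0
    for w, mu, s in zip(weights, means, sigmas):
        density += w * math.exp(-0.5 * ((y - mu) / s) ** 2) / (s * math.sqrt(2 * math.pi))
    return -math.log(density)

# Two components: a sharp mode near 0.30 and a broad one near 0.20
# (hypothetical volumetric soil-moisture values). An observation near
# the sharp mode is far more likely than one out in the tail.
params = ([0.6, 0.4], [0.30, 0.20], [0.01, 0.05])
nll_near = gmm_nll(0.30, *params)
nll_far = gmm_nll(0.45, *params)
```

Predicting a mixture rather than a point estimate is what lets the model represent sensor-to-sensor spread at a single location and depth.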
    On the Rational Degree of Boolean Functions and Applications
    Vishnu Iyer
    Siddhartha Jain
    Matt Kovacs-Deak
    Vinayak Kumar
    Luke Schaeffer
    Daochen Wang
    Michael Whitmeyer
    (2025)
    Abstract: We study a natural complexity measure of Boolean functions known as the (exact) rational degree. For total functions f, it is conjectured that rdeg(f) is polynomially related to deg(f), where deg(f) is the Fourier degree. Towards this conjecture, we show that symmetric functions have rational degree at least deg(f)/2 and monotone functions have rational degree at least deg(f). We observe that both of these lower bounds are tight. In addition, we show that all read-once depth-d Boolean formulae have rational degree at least Ω(deg(f)^(1/d)). Furthermore, we show that almost every Boolean function on n variables has rational degree at least n/2 − O(√n). In contrast to total functions, we exhibit partial functions that witness unbounded separations between rational and approximate degree, in both directions. As a consequence, we show that for quantum computers, post-selection and bounded-error are incomparable resources in the black-box model.
    Abstract: Transformers, while powerful, suffer from quadratic computational complexity and the ever-growing key-value (KV) cache of the attention mechanism. This paper introduces Trellis, a novel Transformer architecture with bounded memory that learns how to compress its key-value memory dynamically at test time. Trellis replaces the standard KV cache with a fixed-size memory and trains a two-pass recurrent compression mechanism to store new keys and values in that memory. To achieve this, it leverages an online gradient descent procedure with a forget gate, enabling the compressed memory to be updated recursively while learning to retain important contextual information from incoming tokens at test time. Extensive experiments on language modeling, common-sense reasoning, recall-intensive tasks, and time series show that the proposed architecture outperforms strong baselines. Notably, its performance gains increase as the sequence length increases, highlighting its potential for long-context applications.
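A toy sketch in the spirit of the mechanism described above — a fixed-size linear associative memory updated by one online gradient step per incoming (key, value) pair, with a scalar forget gate decaying old content. The parameterization, learning rate, and forget value here are our own illustration, not the paper's:

```python
def update_memory(M, key, value, lr=0.5, forget=0.9):
    """One online-gradient update of a fixed-size d_v x d_k memory M
    (nested lists): decay old content with a forget gate, then step
    against the squared retrieval error ||M k - v||^2."""
    d_v, d_k = len(M), len(M[0])
    pred = [sum(M[i][j] * key[j] for j in range(d_k)) for i in range(d_v)]
    err = [pred[i] - value[i] for i in range(d_v)]
    # Gradient of the squared error wrt M is the outer product err * key^T.
    return [[forget * M[i][j] - lr * err[i] * key[j] for j in range(d_k)]
            for i in range(d_v)]

def retrieve(M, key):
    return [sum(row[j] * key[j] for j in range(len(key))) for row in M]

# The memory stays the same size no matter how many tokens are written.
M = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(50):
    M = update_memory(M, key=[1.0, 0.0], value=[2.0, -1.0])
```

With a forget gate below 1 the memory converges to a slightly damped version of the stored value rather than the value itself — the price of being able to overwrite stale content.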
    LLMs achieve adult human performance on higher-order theory of mind tasks
    Oliver Siy
    Geoff Keeling
    Benjamin Barnett
    Michael McKibben
    Tatenda Kanyere
    Robin I.M. Dunbar
    Frontiers in Human Neuroscience (2025)
    Abstract: This paper examines the extent to which large language models (LLMs) are able to perform tasks which require higher-order theory of mind (ToM): the human ability to reason about multiple mental and emotional states in a recursive manner (e.g., I think that you believe that she knows). This paper builds on prior work by introducing a handwritten test suite, Multi-Order Theory of Mind Q&A, and using it to compare the performance of five LLMs of varying sizes and training paradigms to a newly gathered adult human benchmark. We find that GPT-4 and Flan-PaLM reach adult-level and near adult-level performance on our ToM tasks overall, and that GPT-4 exceeds adult performance on 6th-order inferences. Our results suggest that there is an interplay between model size and finetuning for higher-order ToM performance, and that the linguistic abilities of large models may support more complex ToM inferences. Given the important role that higher-order ToM plays in group social interaction and relationships, these findings have significant implications for the development of a broad range of social, educational and assistive LLM applications.
    Approximate Optimal Mechanism Design via the Interim Relaxation
    Alex Psomas
    Marios Mertzanidis
    Divyarthi Mohan
    NeurIPS (2025)
    Abstract: We study revenue maximization for agents with additive preferences, subject to downward-closed constraints on the set of feasible allocations. In seminal work, Alaei [Ala14] introduced a powerful multi-to-single agent reduction based on an ex-ante relaxation of the multi-agent problem. This reduction employs a rounding procedure which is an online contention resolution scheme (OCRS) in disguise, a now widely-used method for rounding fractional solutions in online Bayesian and stochastic optimization problems. In this paper, we leverage our vantage point, 10 years after the work of Alaei, with a rich OCRS toolkit and modern approaches to analyzing multi-agent mechanisms; we introduce a general framework for designing non-sequential and sequential multi-agent, revenue-maximizing mechanisms, capturing a wide variety of problems Alaei’s framework could not address. Our framework uses an interim relaxation, that is rounded to a feasible mechanism using what we call a two-level OCRS, which allows for some structured dependence between the activation of its input elements. For a wide family of constraints, we can construct such schemes using existing OCRSs as a black box; for other constraints, such as knapsack, we construct such schemes from scratch. We demonstrate numerous applications of our framework, including a sequential mechanism that guarantees a 2e/(e−1) ≈ 3.16 approximation to the optimal revenue for the case of additive agents subject to matroid feasibility constraints. The simplicity of our developed two-level CRSs and OCRSs highlights the strength of our framework: even with a simple analysis, it yields state-of-the-art approximation guarantees across a wide range of settings. Finally, we show how it naturally extends to multi-parameter procurement auctions.
    Quantum algorithm for linear matrix equations
    Rolando Somma
    Guang Hao Low
    Dominic Berry
    arXiv (2025)
    Abstract: We describe an efficient quantum algorithm for solving the linear matrix equation AX+XB=C, where A, B, and C are given complex matrices and X is unknown. This is known as the Sylvester equation, a fundamental equation with applications in control theory and physics. Our approach constructs the solution matrix X/x in a block-encoding, where x is a rescaling factor needed for normalization. This allows us to obtain certain properties of the entries of X exponentially faster than would be possible from preparing X as a quantum state. The query and gate complexities of the quantum circuit that implements this block-encoding are almost linear in a condition number that depends on A and B, and depend logarithmically on the dimension and inverse error. We show how our quantum circuits can solve BQP-complete problems efficiently, discuss potential applications and extensions of our approach and its connection to the Riccati equation, and comment on open problems.
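For context, the classical baseline is direct: column-stacking vectorization turns AX + XB = C into the linear system (I ⊗ A + Bᵀ ⊗ I) vec(X) = vec(C), solvable in time polynomial in the full dimension — the scaling the quantum algorithm aims to beat. A sketch (function name ours):

```python
import numpy as np

def solve_sylvester_dense(A, B, C):
    """Solve AX + XB = C classically by vectorization:
    (I_m ⊗ A + B^T ⊗ I_n) vec(X) = vec(C), with column-stacking vec.
    Cost is O((nm)^3) in the dense case."""
    n, m = A.shape[0], B.shape[0]
    K = np.kron(np.eye(m), A) + np.kron(B.T, np.eye(n))
    x = np.linalg.solve(K, C.flatten(order="F"))  # 'F' = column-stacking
    return x.reshape((n, m), order="F")

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((2, 2))
C = rng.standard_normal((3, 2))
X = solve_sylvester_dense(A, B, C)
```

(The system is solvable whenever A and −B share no eigenvalue; a random instance satisfies this with probability 1.)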
    Abstract: First-order optimization methods tend to inherently favor certain solutions over others when minimizing an underdetermined training objective that has multiple global optima. This phenomenon, known as implicit bias, plays a critical role in understanding the generalization capabilities of optimization algorithms. Recent research has revealed that in separable binary classification tasks, gradient-descent-based methods exhibit an implicit bias for the L2-maximal margin classifier. Similarly, generic optimization methods, such as mirror descent and steepest descent, have been shown to converge to maximal margin classifiers defined by alternative geometries. While gradient-descent-based algorithms provably achieve fast implicit bias rates, corresponding rates in the literature for generic optimization methods are relatively slow. To address this limitation, we present a series of state-of-the-art implicit bias rates for mirror descent and steepest descent algorithms. Our primary technique involves transforming a generic optimization algorithm into an online optimization dynamic that solves a regularized bilinear game, providing a unified framework for analyzing the implicit bias of various optimization methods. Our accelerated rates are derived by leveraging the regret bounds of online learning algorithms within this game framework. We then show the flexibility of this framework by analyzing the implicit bias in adversarial training, and again obtain significantly improved convergence rates.
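The underlying phenomenon — gradient descent on a separable problem drifting toward the L2 max-margin direction — can be reproduced on a deliberately simple toy instance (our own illustration; the paper's mirror- and steepest-descent analysis is not reproduced here):

```python
import math

def gd_exp_loss(points, steps=500, lr=0.1):
    """Gradient descent on the exponential loss sum_i exp(-y_i w·x_i)
    for a linearly separable 2-D set. The loss is driven to zero while
    the direction w/||w|| drifts toward the L2 max-margin separator."""
    w = [0.0, 0.0]
    for _ in range(steps):
        g = [0.0, 0.0]
        for x, y in points:
            e = math.exp(-y * (w[0] * x[0] + w[1] * x[1]))
            g[0] -= y * x[0] * e
            g[1] -= y * x[1] * e
        w = [w[0] - lr * g[0], w[1] - lr * g[1]]
    return w

# Symmetric toy set whose max-margin direction is (2, 1)/sqrt(5).
pts = [((2.0, 1.0), 1), ((-2.0, -1.0), -1)]
w = gd_exp_loss(pts)
cos = (2.0 * w[0] + 1.0 * w[1]) / (math.sqrt(5.0) * math.hypot(w[0], w[1]))
```

Note that ||w|| grows without bound while only the direction converges — which is exactly why the *rate* of this directional convergence, the paper's subject, is the delicate quantity.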
    Permission Rationales in the Web Ecosystem: An Exploration of Rationale Text and Design Patterns
    Yusra Elbitar
    Soheil Khodayari
    Marian Harbach
    Gianluca De Stefano
    Balazs Engedy
    Giancarlo Pellegrino
    Sven Bugiel
    CHI 2025, ACM
    Abstract: Modern web applications rely on features like camera and geolocation for personalized experiences, requiring user permission via browser prompts. To explain these requests, applications provide rationales—contextual information on why permissions are needed. Despite their importance, little is known about how rationales appear on the web or their influence on user decisions. This paper presents the first large-scale study of how the web ecosystem handles permission rationales, covering three areas: (i) identifying webpages that use permissions, (ii) detecting and classifying permission rationales, and (iii) analyzing their attributes to understand their impact on user decisions. We examined over 770K webpages from Chrome telemetry, finding 3.6K unique rationale texts and 749 rationale UIs across 85K pages. We extracted key rationale attributes and assessed their effect on user behavior by cross-referencing them with Chrome telemetry data. Our findings reveal nine key insights, providing the first evidence of how different rationales affect user decisions.
    Applying multimodal AI to physiological waveforms improves genetic prediction of cardiovascular traits
    Yuchen Zhou
    Mahantesh I. Biradar
    Jacqueline Shreibati
    Dongbing Lai
    Tae-Hwi Schwantes-An
    Robert Luben
    Zachary R. McCaw
    Jorgen Engmann
    Rui Providencia
    Amand Floriaan Schmidt
    Patricia B. Munroe
    Howard Yang
    Andrew Carroll
    Anthony Khawaja
    Babak Behsaz
    American Journal of Human Genetics, 112 (2025), pp. 1562 - 1579
    Abstract: Electronic health records, biobanks, and wearable biosensors enable the collection of multiple health modalities from many individuals. Access to multimodal health data provides a unique opportunity for genetic studies of complex traits because different modalities relevant to a single physiological system (e.g., circulatory system) encode complementary and overlapping information. We propose a multimodal deep learning method, multimodal representation learning for genetic discovery on low-dimensional embeddings (M-REGLE), for discovering genetic associations from a joint representation of complementary electrophysiological waveform modalities. M-REGLE jointly learns a lower-dimensional representation (i.e., latent factors) of multimodal physiological waveforms using a convolutional variational autoencoder, performs genome-wide association studies (GWASs) on each latent factor, then combines the results to study the genetics of the underlying system. To validate the advantages of M-REGLE and multimodal learning, we apply it to common cardiovascular modalities (photoplethysmogram [PPG] and electrocardiogram [ECG]) and compare its results to unimodal learning methods in which representations are learned from each data modality separately but are statistically combined for downstream genetic comparison. M-REGLE identifies 19.3% more loci on the 12-lead ECG dataset, 13.0% more loci on the ECG lead I + PPG dataset, and its genetic risk score significantly outperforms the unimodal risk score at predicting cardiac phenotypes, such as atrial fibrillation (Afib), in multiple biobanks.
    Abstract: Electrocardiograms (ECGs) are fundamental to cardiac diagnostics, providing noninvasive insights into cardiovascular conditions. Recent advancements in deep learning have led to foundation models (FMs) capable of learning powerful representations of ECG signals. However, these models often fail to fully exploit the periodic nature and diagnostic frequency bands of ECGs, leading to inefficiencies in computational cost and interpretability. We propose a novel ECG foundation model that learns nested embeddings, where each subset of dimensions encodes progressively higher-frequency information. By explicitly modeling frequency structures and applying a correlation penalty, the method achieves compact, high-rank representations that reduce model size without sacrificing performance. We evaluate our approach on two large-scale datasets for embedding redundancy and prediction performance on downstream clinical tasks such as arrhythmia classification and cardiac condition detection. We observe similar prediction performance (AUROC scores) with lower embedding redundancy, offering a computationally efficient and interpretable framework for ECG analysis. Finally, the representations obtained from our model in UK Biobank data capture known cardiovascular variants and detect novel loci, which can be applied to drug discovery.