Online Learning to Transport via the Minimal Selection Principle

Motivated by robust dynamic resource allocation in operations research, we study the Online Learning to Transport (OLT) problem, in which the decision variable is a probability measure, an infinite-dimensional object. We draw connections between online learning, optimal transport, and partial differential equations through an insight called the minimal selection principle, originally studied in the Wasserstein gradient flow setting by Ambrosio et al. (2005).

Detecting Weak Distribution Shifts via Displacement Interpolation

Detecting weak, systematic distribution shifts and quantitatively modeling individual, heterogeneous responses to policies or incentives are finding growing empirical application in the social and economic sciences. We propose a model for weak distribution shifts via displacement interpolation, drawing on optimal transport theory.
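
For readers unfamiliar with displacement interpolation, the following is a minimal one-dimensional sketch, not the model proposed in the paper. It relies on the standard fact that the optimal transport map between two distributions on the real line is monotone, so for two equal-sized samples it pairs sorted values. The function name `displacement_interpolation_1d`, the sample sizes, and the Gaussian toy data are all illustrative assumptions.

```python
import numpy as np


def displacement_interpolation_1d(x, y, t):
    """Samples from the displacement interpolant between two 1D empirical
    measures at time t in [0, 1].

    Assumes equal sample sizes. In one dimension the optimal transport map
    is monotone, so it pairs the i-th smallest x with the i-th smallest y.
    """
    x_sorted = np.sort(np.asarray(x, dtype=float))
    y_sorted = np.sort(np.asarray(y, dtype=float))
    if x_sorted.shape != y_sorted.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    # Each point moves a fraction t of the way along its transport segment.
    return (1.0 - t) * x_sorted + t * y_sorted


# Toy data (assumed for illustration): a baseline population and a shifted one.
rng = np.random.default_rng(0)
baseline = rng.normal(loc=0.0, scale=1.0, size=1000)
shifted = rng.normal(loc=0.5, scale=1.0, size=1000)

# A small t gives a "weak" shift: every unit moves only slightly.
weak_shift = displacement_interpolation_1d(baseline, shifted, t=0.1)
```

A small value of `t` moves every sample only a short distance along its own transport path, which is one way to picture a weak, systematic shift with heterogeneous individual responses.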

Denoising Diffusions with Optimal Transport: Localization, Curvature, and Multi-Scale Complexity

Adding noise is easy; what about denoising? Diffusion is easy; what about reverting a diffusion? We provide a fine-grained analysis of the diffuse-then-denoise process. We discover a notion of multi-scale curvature complexity that collectively determines the success or failure mode of probabilistic diffusion models.

No-Regret Generative Modeling via Parabolic Monge-Ampère PDE

We introduce a novel generative modeling framework called the parabolic Monge-Ampère PDE sampler. We establish theoretical guarantees for generative modeling through the lens of no-regret analysis, demonstrating that the iterates converge to the optimal Brenier map under a variety of step-size schedules. We derive a new Evolution Variational Inequality connecting geometry, transportation cost, and regret.

Distributional Shrinkage I: Universal Denoiser Beyond Tweedie's Formula

Empirical Bayes tends to produce overly aggressive shrinkage as a denoiser. We introduce new denoisers that optimally shrink the distribution toward the true signal distribution with order-of-magnitude improvements. Unlike the empirical Bayes denoiser, our denoisers are universal and agnostic to the signal and noise distributions. One immediate application of our distributional shrinkage theory is to enhance generative modeling: we can replace the stochastic backward diffusion process with optimal deterministic denoisers to achieve higher-order accuracy.
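
The over-shrinkage claim can be seen in a Gaussian toy case. The sketch below is an illustration under stated assumptions, not the denoisers introduced in the paper: the signal is N(0, tau^2), the noise is N(0, sigma^2), and the score of the noisy marginal is therefore known in closed form. Tweedie's formula gives the posterior mean y + sigma^2 * score(y), while the linear rescaling used for comparison is the standard optimal transport map between two centered one-dimensional Gaussians. The names `tweedie_denoiser`, `tau`, and `sigma` are chosen here for illustration.

```python
import numpy as np


def tweedie_denoiser(y, score, sigma):
    """Posterior mean E[X | Y = y] under Gaussian noise via Tweedie's formula."""
    return y + sigma**2 * score(y)


# Assumed toy model: X ~ N(0, tau^2), Y = X + N(0, sigma^2).
tau, sigma = 1.0, 1.0
rng = np.random.default_rng(0)
x = rng.normal(0.0, tau, size=100_000)
y = x + rng.normal(0.0, sigma, size=x.size)

# In this model the noisy marginal is N(0, tau^2 + sigma^2), so its score is linear.
marginal_var = tau**2 + sigma**2
score = lambda v: -v / marginal_var

posterior_mean = tweedie_denoiser(y, score, sigma)

# Comparison: the 1D optimal transport map from N(0, tau^2 + sigma^2) to
# N(0, tau^2) is a rescaling by the ratio of standard deviations.
transported = y * tau / np.sqrt(marginal_var)

print("signal std:        ", np.std(x))
print("posterior-mean std:", np.std(posterior_mean))
print("transported std:   ", np.std(transported))
```

In this toy model the posterior-mean outputs have standard deviation tau^2 / sqrt(tau^2 + sigma^2), which is strictly smaller than tau, so their distribution is narrower than the signal distribution. The transport rescaling matches the signal distribution exactly, at the cost of a larger pointwise mean-squared error.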

Distributional Shrinkage II: Higher-Order Scores Encode Brenier Map

We revisit the classic signal denoising problem through the lens of optimal transport. We introduce a hierarchy of denoisers that are agnostic to the signal distribution, depending only on higher-order score functions of the noisy observations. Each denoiser is progressively refined using higher-order score functions, achieving better denoising quality measured by the Wasserstein metric. The limiting denoiser identifies the optimal transport map for signal denoising. Our results connect information geometry, optimal transport, and advanced combinatorics.