
Questions tagged [diffusion]

0 votes · 0 answers · 14 views

Why does the DDPM noise predictor model require both the image and time step as input?
In DDPM (Denoising Diffusion Probabilistic Models), the model predicts noise in the denoising ...
by 459zyt
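
A minimal sketch of the usual answer (assuming PyTorch; TinyEpsPredictor, the linear time embedding, and all sizes are made up for illustration): one shared network must undo many different noise levels, so the step $t$ is embedded and fed in alongside $x_t$.

    import torch
    import torch.nn as nn

    class TinyEpsPredictor(nn.Module):
        """Toy eps_theta(x_t, t): t tells the shared network which noise level to undo."""
        def __init__(self, dim=784, t_dim=64, T=1000):
            super().__init__()
            self.T = T
            self.t_embed = nn.Sequential(nn.Linear(1, t_dim), nn.SiLU(), nn.Linear(t_dim, t_dim))
            self.net = nn.Sequential(nn.Linear(dim + t_dim, 512), nn.SiLU(), nn.Linear(512, dim))

        def forward(self, x_t, t):
            te = self.t_embed(t.float().unsqueeze(-1) / self.T)  # embed the (normalized) step
            return self.net(torch.cat([x_t, te], dim=-1))        # condition on both inputs

    eps_hat = TinyEpsPredictor()(torch.randn(8, 784), torch.randint(0, 1000, (8,)))
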
0 votes · 0 answers · 51 views

In diffusion models, the forward process is $q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_{t};\sqrt{1-\beta_t} x_{t-1},\beta_t I\big)$, and the reverse model is parameterized as $p_\theta(x_{t-1}\mid x_t)=\...
by user24200147
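
For reference, composing these Gaussian transitions gives the closed-form marginal that most derivations rely on, with $\alpha_t = 1-\beta_t$ and $\bar\alpha_t=\prod_{s=1}^t \alpha_s$:

$$q(x_t \mid x_0) = \mathcal N\big(x_t;\ \sqrt{\bar\alpha_t}\,x_0,\ (1-\bar\alpha_t) I\big)$$
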
7 votes · 1 answer · 182 views

Let $X$ follow an inverse Gaussian distribution, and $Y\mid X$ a Gaussian distribution. $$X \sim IG\left( \frac{\alpha}{v_X}, \frac{\alpha^2}{2D_X} \right)$$ $$Y \mid X = x \sim \mathcal N(...
by Sextus Empiricus
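
A quick way to probe such a hierarchical model is Monte Carlo. The sketch below uses SciPy; the constants are hypothetical, and since the excerpt truncates the conditional, $Y\mid X=x \sim \mathcal N(x, 1)$ is an assumption made purely for illustration.

    import numpy as np
    from scipy.stats import invgauss, norm

    alpha, v_X, D_X = 1.0, 0.5, 0.2             # hypothetical constants
    m, lam = alpha / v_X, alpha**2 / (2 * D_X)  # IG mean and shape

    # SciPy's parameterization: IG(mean=m, shape=lam) == invgauss(mu=m/lam, scale=lam)
    x = invgauss.rvs(mu=m / lam, scale=lam, size=100_000, random_state=0)
    y = norm.rvs(loc=x, scale=1.0, random_state=1)  # assumes Y | X=x ~ N(x, 1)
    print(y.mean(), y.var())                    # Monte Carlo moments of Y's marginal
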
1 vote · 1 answer · 80 views

When training consistency models with distillation, the loss is designed to drive the model to produce similar outputs on two consecutive points of the discretized probability flow ODE trajectory (eq. ...
by Andrea Allais
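
A schematic of that loss (a sketch, not the paper's code: f_student, f_ema, and teacher_score are placeholder callables, and a single Euler step of the VE probability flow ODE stands in for the paper's Heun solver):

    import torch

    def consistency_distillation_loss(f_student, f_ema, teacher_score, x0, t_next, t_cur):
        """Pull the student's outputs together on two consecutive PF-ODE trajectory points."""
        eps = torch.randn_like(x0)
        x_next = x0 + t_next.view(-1, 1) * eps                    # noisy sample at t_{n+1} (sigma = t)
        d = -t_next.view(-1, 1) * teacher_score(x_next, t_next)   # PF-ODE drift dx/dt = -t * score
        x_cur = x_next + (t_cur - t_next).view(-1, 1) * d         # Euler step back to t_n
        target = f_ema(x_cur, t_cur).detach()                     # EMA-weighted consistency target
        return ((f_student(x_next, t_next) - target) ** 2).mean()
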
0 votes · 1 answer · 115 views

I'm trying to derive the ELBO (Evidence Lower Bound) based loss function used for training diffusion models. The following equations are from arXiv:2208.11970. Eq. 43 is written as follows: $$ \...
by x.projekt
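
For orientation, the three-term form that this derivation (arXiv:2208.11970) works toward is

$$\log p(x_0) \ge \mathbb E_{q(x_1\mid x_0)}\big[\log p_\theta(x_0\mid x_1)\big] - D_{\mathrm{KL}}\big(q(x_T\mid x_0)\,\|\,p(x_T)\big) - \sum_{t=2}^{T}\mathbb E_{q(x_t\mid x_0)}\Big[D_{\mathrm{KL}}\big(q(x_{t-1}\mid x_t,x_0)\,\|\,p_\theta(x_{t-1}\mid x_t)\big)\Big],$$

the reconstruction, prior matching, and denoising matching terms, respectively.
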
0 votes · 0 answers · 39 views

I've been reading about score matching and I have a very basic question about how one would (naively) implement the algorithm via gradient descent. Say I have some sort of neural network that ...
by Vasting
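
A naive training loop for denoising score matching (a sketch assuming PyTorch; the network, the standard-normal stand-in data, and the single noise level are made up for illustration):

    import torch
    import torch.nn as nn

    net = nn.Sequential(nn.Linear(2, 128), nn.SiLU(), nn.Linear(128, 2))  # s_theta(x)
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    sigma = 0.1

    for step in range(1000):
        x = torch.randn(256, 2)              # stand-in for data samples
        noise = torch.randn_like(x) * sigma
        x_tilde = x + noise
        # DSM target: grad_x log N(x_tilde; x, sigma^2 I) = -(x_tilde - x) / sigma^2
        target = -noise / sigma**2
        loss = ((net(x_tilde) - target) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
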
1 vote · 1 answer · 229 views

I am reading through the DDPM paper, and I am trying to understand the following. Imagine that $\epsilon_{\theta}(x_t,t)$ is our noise predictor. Further imagine that it is fully expressive, i.e., $\...
by Max
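
One fact that often resolves questions of this kind: under the squared-error objective, a fully expressive noise predictor converges to the conditional expectation of the noise, which Tweedie's formula ties to the marginal score:

$$\epsilon_\theta^\ast(x_t,t) = \mathbb E[\epsilon \mid x_t] = -\sqrt{1-\bar\alpha_t}\;\nabla_{x_t}\log q(x_t)$$
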
0 votes · 1 answer · 309 views

I am trying to understand how the linear relationship between the diffusion noise prediction model $\epsilon_\theta(x_t)$, which predicts the noise added to a sample, and the score function is derived: $$\...
by JustBlaze
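
The usual derivation is short: since $q(x_t\mid x_0)=\mathcal N\big(\sqrt{\bar\alpha_t}\,x_0,(1-\bar\alpha_t)I\big)$ and $x_t=\sqrt{\bar\alpha_t}\,x_0+\sqrt{1-\bar\alpha_t}\,\epsilon$,

$$\nabla_{x_t}\log q(x_t\mid x_0) = -\frac{x_t-\sqrt{\bar\alpha_t}\,x_0}{1-\bar\alpha_t} = -\frac{\epsilon}{\sqrt{1-\bar\alpha_t}},$$

and replacing $\epsilon$ with its predictor gives $\nabla_{x_t}\log p_t(x_t)\approx-\epsilon_\theta(x_t)/\sqrt{1-\bar\alpha_t}$.
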
4 votes · 1 answer · 303 views

I am going through the derivation for Denoising Diffusion Probabilistic Models (DDPMs) based on Calvin Luo's diffusion tutorial, where he finally develops the reconstruction term, the prior matching ...
by randomforest42
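
A step worth keeping in mind from that tutorial's setup: when both distributions in the denoising matching term are Gaussians with the same fixed covariance $\sigma_t^2 I$, each KL collapses to a squared distance between means,

$$D_{\mathrm{KL}}\big(q(x_{t-1}\mid x_t,x_0)\,\|\,p_\theta(x_{t-1}\mid x_t)\big) = \frac{1}{2\sigma_t^2}\,\big\|\tilde\mu_t(x_t,x_0)-\mu_\theta(x_t,t)\big\|_2^2.$$
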
2 votes · 1 answer · 231 views

Can anyone help me understand how $\tilde{\beta}_t$ and $\tilde\mu_t(x_t, x_0)$ are derived? It seems to me that the exponent is a second-order polynomial and it doesn't really ...
by kaizerbox
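
The exponent is indeed quadratic in $x_{t-1}$; completing the square in $x_{t-1}$ (with $\alpha_t=1-\beta_t$ and $\bar\alpha_t=\prod_{s=1}^t\alpha_s$) lets you read off

$$\tilde\mu_t(x_t,x_0)=\frac{\sqrt{\alpha_t}\,(1-\bar\alpha_{t-1})\,x_t+\sqrt{\bar\alpha_{t-1}}\,\beta_t\,x_0}{1-\bar\alpha_t},\qquad \tilde\beta_t=\frac{1-\bar\alpha_{t-1}}{1-\bar\alpha_t}\,\beta_t.$$
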
4 votes · 2 answers · 1k views

As pointed out in the DDPM paper, we can choose to reparameterize the prediction of the mean as a prediction of the total noise: "$\epsilon_\theta$ is a function approximator intended to predict $\epsilon$ from $x$" (...
by Daniel Mendoza
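
Concretely, substituting $x_0=\big(x_t-\sqrt{1-\bar\alpha_t}\,\epsilon\big)/\sqrt{\bar\alpha_t}$ into $\tilde\mu_t(x_t,x_0)$ yields the noise-prediction parameterization of the mean:

$$\mu_\theta(x_t,t)=\frac{1}{\sqrt{\alpha_t}}\left(x_t-\frac{\beta_t}{\sqrt{1-\bar\alpha_t}}\,\epsilon_\theta(x_t,t)\right)$$
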
2 votes · 0 answers · 163 views

I have read several papers about diffusion models in the context of deep learning, especially this one. As explained in the paper, by learning the score function $\nabla\log p_t(x)$, the probability flow ODE ...
by saleh
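
For context, the probability flow ODE referred to here (from the score-based SDE framework) is the deterministic process that shares the marginals $p_t$ of the forward SDE $dx=f(x,t)\,dt+g(t)\,dw$:

$$\frac{dx}{dt}=f(x,t)-\tfrac{1}{2}\,g(t)^2\,\nabla_x\log p_t(x)$$
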
2 votes · 1 answer · 334 views

In DDPM, ${\tilde\mu}_t$ is the mean of the conditional distribution $q(x_{t-1}|x_t,x_0)$, while the neural network $\mu_\theta$ is modeling a different conditional distribution $p_\theta(x_{t-1}|x_t)$....
by Daniel Mendoza
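
One way to reconcile the two: averaging the per-$x_0$ KL terms over $q(x_0\mid x_t)$ turns the fit into a regression, whose optimum (for a fixed shared variance) is the posterior average of the $x_0$-conditioned mean:

$$\mu_\theta^\ast(x_t,t)=\mathbb E_{q(x_0\mid x_t)}\big[\tilde\mu_t(x_t,x_0)\big]$$
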
2 votes · 1 answer · 155 views

I've seen many tutorials on diffusion models refer to the distribution of the latent variables induced by the forward process as "ground truth", and I wonder why. What we can actually see is ...
by Daniel Mendoza
2 votes · 2 answers · 197 views

The ELBO is a lower bound, and it only matches the true likelihood when the q-distribution/encoder we choose equals the true posterior distribution. Are there any guarantees that maximizing the ELBO indeed ...
by Daniel Mendoza
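
The relevant identity (standard for any latent-variable model) shows what maximizing the ELBO buys: the bound plus a non-negative KL gap equals the likelihood, so pushing the bound up either raises $\log p_\theta(x)$ or closes the gap:

$$\log p_\theta(x)=\mathrm{ELBO}(\theta,\phi;x)+D_{\mathrm{KL}}\big(q_\phi(z\mid x)\,\|\,p_\theta(z\mid x)\big)\ \ge\ \mathrm{ELBO}(\theta,\phi;x)$$
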
