VAR

Created on March 25, 2025

2025 · Autoregressive, VAE · Generative

VAR: AR via Next-Scale Prediction

Next-Scale Prediction

VAR reframes autoregressive modeling of images by shifting from “next-token prediction” to “next-scale prediction”.

The autoregressive unit is an entire token map, rather than a single token.
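Written as a likelihood, with $r_1, \dots, r_K$ denoting the token maps from coarsest to finest (the symbol $K$ for the number of scales is notation assumed here), the factorization is over whole maps rather than over individual tokens:

$$
p(r_1, r_2, \dots, r_K) = \prod_{k=1}^{K} p\left(r_k \mid r_1, r_2, \dots, r_{k-1}\right)
$$

Each factor predicts every token of $r_k$ at once, conditioned on all coarser maps.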

interpolate → resizes the feature map $f$ and the embedding $z_k$ to resolution ($h_k, w_k$)
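To make the resize step concrete, here is a minimal pure-Python sketch. It uses nearest-neighbor sampling purely for simplicity, which is an assumption of this example and not necessarily the interpolation mode VAR uses; the function name `interpolate` and the toy 4×4 grid are illustrative only.

```python
def interpolate(grid, h, w):
    """Nearest-neighbor resize of a 2D list `grid` to h rows by w columns."""
    src_h, src_w = len(grid), len(grid[0])
    return [
        [grid[(i * src_h) // h][(j * src_w) // w] for j in range(w)]
        for i in range(h)
    ]


# Toy single-channel "feature map" (hypothetical values).
f = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

down = interpolate(f, 2, 2)   # coarse view at (h_k, w_k) = (2, 2)
up = interpolate(down, 4, 4)  # coarse view brought back to full resolution

print(down)  # [[1, 3], [9, 11]]
print(up[0])  # [1, 1, 3, 3]
```

Downsampling gives the coarse view at scale $k$; upsampling brings a coarse map back to a finer grid so maps at different scales can be combined.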

VAR Transformer

$r_k$: token map (each $r_k$ can be thought of as containing a group of tokens)
- e.g. $r_3$ has 9 tokens (all 9 tokens are predicted in parallel)
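The sketch below counts tokens per map and builds a block-wise causal attention mask, in which a token in $r_k$ may attend to every token in $r_1, \dots, r_k$. The schedule `sides = [1, 2, 3]` and the helper names are hypothetical, chosen so that $r_3$ has 9 tokens as in the example above; a real model would use a longer schedule.

```python
def scale_ids(side_lengths):
    """Return, for every token in the flattened sequence, the index k of its map r_k."""
    ids = []
    for k, side in enumerate(side_lengths, start=1):
        ids.extend([k] * (side * side))  # r_k holds side * side tokens
    return ids


def blockwise_causal_mask(side_lengths):
    """mask[q][t] is True when query token q may attend to token t."""
    ids = scale_ids(side_lengths)
    return [[q_scale >= t_scale for t_scale in ids] for q_scale in ids]


sides = [1, 2, 3]  # hypothetical side lengths of r_1, r_2, r_3
ids = scale_ids(sides)
mask = blockwise_causal_mask(sides)

print(ids.count(3))   # 9 tokens in r_3
print(len(ids))       # 14 tokens in total (1 + 4 + 9)
print(sum(mask[1]))   # 5: a token in r_2 sees r_1 and all of r_2
```

Because all tokens of one map share the same visible prefix, the whole map can be produced in a single forward pass, so the number of autoregressive steps equals the number of scales, not the number of tokens.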

Time Complexity
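A sketch of the usual argument, under assumptions not stated in this note: the final token map is $n \times n$, the side lengths grow geometrically as $n_k = a^{k-1}$ for some constant $a > 1$ (so $K = \log_a n + 1$), and a step that attends over $m$ tokens costs $O(m^2)$. Next-token prediction over $n^2$ tokens then costs

$$
\sum_{i=1}^{n^2} i^2 = \frac{n^2 (n^2 + 1)(2n^2 + 1)}{6} = O(n^6),
$$

while next-scale prediction attends over $\sum_{i \le k} n_i^2$ tokens at step $k$, giving

$$
\sum_{k=1}^{K} \left( \sum_{i=1}^{k} a^{2(i-1)} \right)^2
= \sum_{k=1}^{K} \left( \frac{a^{2k} - 1}{a^2 - 1} \right)^2
= O\left(a^{4K}\right) = O(n^4).
$$

The last step uses $a^{4K} = a^4 \cdot n^4$, with $a$ treated as a constant.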
