Alibaba's Qwen-Image-2.0 doubles compression and cuts generation steps from 40 to 4

The Decoder

Jonathan Kemper

May 14, 2026, 09:17 AM

Alibaba's technical report on Qwen-Image-2.0 breaks down how the image model compresses images twice as aggressively as most competitors, stabilizes training with a reworked transformer, and uses a dedicated module that automatically expands short user input into detailed prompts. A distilled version needs just four denoising steps instead of 40. On LMArena, a platform where users run blind comparisons, Qwen-Image-2.0 currently ranks 9th.
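
For a rough sense of scale, the two headline numbers can be turned into back-of-the-envelope arithmetic. The sketch below is illustrative only and rests on assumptions the summary does not confirm: that "twice as aggressively" means doubling the per-side spatial downsampling factor of the image autoencoder, and that a typical baseline factor is 8. The function name and the 1024x1024 resolution are hypothetical choices for the example.

```python
def latent_positions(height: int, width: int, factor: int) -> int:
    """Count spatial latent positions after downsampling each side by `factor`."""
    return (height // factor) * (width // factor)


# Assumed values for illustration; not taken from the technical report.
BASELINE_FACTOR = 8                    # assumed per-side downsampling of a typical model
DOUBLED_FACTOR = 2 * BASELINE_FACTOR   # assumed reading of "twice as aggressive"

base = latent_positions(1024, 1024, BASELINE_FACTOR)
doubled = latent_positions(1024, 1024, DOUBLED_FACTOR)

# Doubling the per-side factor quarters the number of latent positions.
position_ratio = base / doubled

# Going from 40 to 4 denoising steps means ten times fewer denoiser passes.
step_ratio = 40 / 4

print(f"latent positions: {base} vs {doubled} ({position_ratio:.0f}x fewer)")
print(f"denoising passes: 40 vs 4 ({step_ratio:.0f}x fewer)")
```

Under these assumptions the two savings are independent: one shrinks the work per denoising pass, the other reduces how many passes are run. Real speedups depend on the architecture and hardware and need not match these ratios.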