
Qwen-Image-Lightning

We are excited to release the distilled version of Qwen-Image. It preserves the capability of complex text rendering.

🔥 Latest News

📑 Todo List

  • Qwen-Image-Lightning-8steps-V1.1
  • Qwen-Image-Lightning-8steps-V1.0
  • Qwen-Image-Lightning-4steps-V1.0
  • ComfyUI Workflow
  • Improve Quality

📑 Performance Report

To characterize the strengths and limitations of the distilled models, we compare three models (Qwen-Image, Qwen-Image-Lightning-8steps-V1.1, and Qwen-Image-Lightning-4steps-V1.0) across several scenarios. The results can be reproduced by following the "Run Evaluation and Test" section below.

- Quality and Speed

Compared to the base model, the distilled models (8-step and 4-step) deliver a 12–25× speed improvement, measured in number of function evaluations (NFE: 100 for the base model versus 8 and 4), with no significant loss in quality in most cases.
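The 12–25× figure can be checked with a little arithmetic. The sketch below is illustrative only: the helper `nfe` is hypothetical, and it assumes that a cfg value above 1.0 costs two forward passes per step (conditional and unconditional) while cfg 1.0 costs one. The step counts and cfg values are the ones used in the commands of the "Run Evaluation and Test" section.

```python
def nfe(steps: int, cfg: float) -> int:
    """Number of function evaluations for one image.

    Assumption: classifier-free guidance with cfg > 1.0 needs two forward
    passes per step; cfg == 1.0 needs only one.
    """
    passes_per_step = 2 if cfg > 1.0 else 1
    return steps * passes_per_step


base = nfe(steps=50, cfg=4.0)   # base model settings
eight = nfe(steps=8, cfg=1.0)   # 8-step distilled model
four = nfe(steps=4, cfg=1.0)    # 4-step distilled model

print(base, eight, four)          # 100 8 4
print(base / eight, base / four)  # 12.5 25.0
```

This ratio only counts model evaluations; measured wall-clock speedup may differ because text encoding, VAE decoding, and LoRA overhead are not included.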

| Prompt | Base NFE=100 | 8steps-V1.1 NFE=8 | 4steps-V1.0 NFE=4 |
| --- | --- | --- | --- |
| 一个会议室,墙上写着"3.14159265-358979-32384626-4338327950",一个小陀螺在桌上转动。 | [image 111] | [image 112] | [image 113] |
| 宫崎骏的动漫风格。平视角拍摄,阳光下的古街热闹非凡。一个穿着青衫、手里拿着写着“阿里云”卡片的逍遥派弟子站在中间。旁边两个小孩惊讶的看着他。左边有一家店铺挂着“云存储”的牌子,里面摆放着发光的服务器机箱,门口两个侍卫守护者。右边有两家店铺,其中一家挂着“云计算”的牌子,一个穿着旗袍的美丽女子正看着里面闪闪发光的电脑屏幕;另一家店铺挂着“云模型”的牌子,门口放着一个大酒缸,上面写着“千问”,一位老板娘正在往里面倒发光的代码溶液。 | [image 121] | [image 122] | [image 123] |
| 一副典雅庄重的对联悬挂于厅堂之中,房间是个安静古典的中式布置,桌子上放着一些青花瓷,对联上左书“义本生知人机同道善思新”,右书“通云赋智乾坤启数高志远”, 横批“智启通义”,字体飘逸,中间挂在一着一副中国风的画作,内容是岳阳楼。 | [image 131] | [image 132] | [image 133] |
| A movie poster. The first row is the movie title, which reads “Imagination Unleashed”. The second row is the movie subtitle, which reads “Enter a world beyond your imagination”. The third row reads “Cast: Qwen-Image”. The fourth row reads “Director: The Collective Imagination of Humanity”. The central visual features a sleek, futuristic computer from which radiant colors, whimsical creatures, and dynamic, swirling patterns explosively emerge, filling the composition with energy, motion, and surreal creativity. The background transitions from dark, cosmic tones into a luminous, dreamlike expanse, evoking a digital fantasy realm. At the bottom edge, the text “Launching in the Cloud, August 2025” appears in bold, modern sans-serif font with a glowing, slightly transparent effect, evoking a high-tech, cinematic aesthetic. The overall style blends sci-fi surrealism with graphic design flair—sharp contrasts, vivid color grading, and layered visual depth—reminiscent of visionary concept art and digital matte painting, 32K resolution, ultra-detailed. | [image 141] | [image 142] | [image 143] |
| 一张企业级高质量PPT页面图像,整体采用科技感十足的星空蓝为主色调,背景融合流动的发光科技线条与微光粒子特效,营造出专业、现代且富有信任感的品牌氛围;页面顶部左侧清晰展示橘红色Alibaba标志,色彩鲜明、辨识度高。主标题位于画面中央偏上位置,使用大号加粗白色或浅蓝色字体写着“通义千问视觉基础模型”,字体现代简洁,突出技术感;主标题下方紧接一行楷体中文文字:“原生中文·复杂场景·自动布局”,字体柔和优雅,形成科技与人文的融合。下方居中排布展示了四张与图片,分别是:一幅写实与水墨风格结合的梅花特写,枝干苍劲、花瓣清雅,背景融入淡墨晕染与飘雪效果,体现坚韧不拔的精神气质;上方写着黑色的楷体"梅傲"。一株生长于山涧石缝中的兰花,叶片修长、花朵素净,搭配晨雾缭绕的自然环境,展现清逸脱俗的文人风骨;上方写着黑色的楷体"兰幽"。一组迎风而立的翠竹,竹叶随风摇曳,光影交错,背景为青灰色山岩与流水,呈现刚柔并济、虚怀若谷的文化意象;上方写着黑色的楷体"竹清"。一片盛开于秋日庭院的菊花丛,花色丰富、层次分明,配以落叶与古亭剪影,传递恬然自适的生活哲学;上方写着黑色的楷体"菊淡"。所有图片采用统一尺寸与边框样式,呈横向排列。页面底部中央用楷体小字写明“2025年8月,敬请期待”,排版工整、结构清晰,整体风格统一且细节丰富,极具视觉冲击力与品牌调性。 | [image 151] | [image 152] | [image 153] |

- Dense or Small Text Rendering

In scenarios involving dense or small text, the base model tends to produce better results than the distilled models.

| Prompt | Base NFE=100 | 8steps-V1.1 NFE=8 | 4steps-V1.0 NFE=4 |
| --- | --- | --- | --- |
| 一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。她身后的玻璃板上手写体写着 “一、Qwen-Image的技术路线: 探索视觉生成基础模型的极限,开创理解与生成一体化的未来。二、Qwen-Image的模型特色:1、复杂文字渲染。支持中英渲染、自动布局; 2、精准图像编辑。支持文字编辑、物体增减、风格变换。三、Qwen-Image的未来愿景:赋能专业内容创作、助力生成式AI发展。” | [image 211] | [image 212] | [image 213] |

- Hair-like Details

In scenes containing hair-like details, the base model demonstrates superior rendering fidelity, whereas the distilled models may yield outputs that appear either noticeably blurred or excessively sharpened.

| Prompt | Base NFE=100 | 8steps-V1.1 NFE=8 | 4steps-V1.0 NFE=4 |
| --- | --- | --- | --- |
| A capybara wearing a suit holding a sign that reads Hello World. | [image 311] | [image 312] | [image 313] |

- Highly Complex Scenes

In highly complex scenes, all three models may fail to produce satisfactory results.

| Prompt | Base NFE=100 | 8steps-V1.1 NFE=8 | 4steps-V1.0 NFE=4 |
| --- | --- | --- | --- |
| "A vibrant, warm neon-lit street scene in Hong Kong at the afternoon, with a mix of colorful Chinese and English signs glowing brightly. The atmosphere is lively, cinematic, and rain-washed with reflections on the pavement. The colors are vivid, full of pink, blue, red, and green hues. Crowded buildings with overlapping neon signs. 1980s Hong Kong style. Signs include: "龍鳳冰室" "金華燒臘" "HAPPY HAIR" "鴻運茶餐廳" "EASY BAR" "永發魚蛋粉" "添記粥麵" "SUNSHINE MOTEL" "美都餐室" "富記糖水" "太平館" "雅芳髮型屋" "STAR KTV" "銀河娛樂城" "百樂門舞廳" "BUBBLE CAFE" "萬豪麻雀館" "CITY LIGHTS BAR" "瑞祥香燭莊" "文記文具" "GOLDEN JADE HOTEL" "LOVELY BEAUTY" "合興百貨" "興旺電器" And the background is warm yellow street and with all stores' lights on. | [image 411] | [image 412] | [image 413] |

- Inconsistencies in Model Rankings Across Test Cases

Test results may vary across different cases. In certain test instances, the base model may perform better, whereas in others, the distilled models may achieve superior results. Even for the same prompt at different resolutions, the relative performance ranking of the models may differ substantially.

| Prompt | Base NFE=100 | 8steps-V1.1 NFE=8 | 4steps-V1.0 NFE=4 |
| --- | --- | --- | --- |
| A young girl wearing school uniform stands in a classroom, writing on a chalkboard. The text "Introducing Qwen-Image, a foundational image generation model that excels in complex text rendering and precise image editing" appears in neat white chalk at the center of the blackboard. Soft natural light filters through windows, casting gentle shadows. The scene is rendered in a realistic photography style with fine details, shallow depth of field, and warm tones. The girl's focused expression and chalk dust in the air add dynamism. Background elements include desks and educational posters, subtly blurred to emphasize the central action. Ultra-detailed 32K resolution, DSLR-quality, soft bokeh effect, documentary-style composition. | [image 511] | [image 512] | [image 513] |
| A coffee shop entrance features a chalkboard sign reading "Qwen Coffee 😊 $2 per cup," with a neon light beside it displaying "通义千问". Next to it hangs a poster showing a beautiful Chinese woman, and beneath the poster is written "π≈3.1415926-53589793-23846264-33832795-02384197". | [image 611] | [image 612] | [image 613] |
| A coffee shop entrance features a chalkboard sign reading "Qwen Coffee 😊 $2 per cup," with a neon light beside it displaying "通义千问". Next to it hangs a poster showing a beautiful Chinese woman, and beneath the poster is written "π≈3.1415926-53589793-23846264-33832795-02384197". | [image 621] | [image 622] | [image 623] |

🚀 Run Evaluation and Test

Installation

Follow the Qwen-Image instructions to install the Python environment and download the base model.

Model Download

Download models using huggingface-cli:

    pip install "huggingface_hub[cli]"
    huggingface-cli download lightx2v/Qwen-Image-Lightning --local-dir ./Qwen-Image-Lightning
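After the download finishes, it can be useful to confirm that the LoRA files referenced later in this README are actually present. The following is a minimal sketch, not part of the repository: the helper `missing_lora_files` is hypothetical, and it assumes the two V1.0 `.safetensors` files sit at the top level of the `--local-dir` used above.

```python
from pathlib import Path

# File names taken from the run commands in this README.
EXPECTED_LORA_FILES = (
    "Qwen-Image-Lightning-8steps-V1.0.safetensors",
    "Qwen-Image-Lightning-4steps-V1.0.safetensors",
)


def missing_lora_files(local_dir, expected=EXPECTED_LORA_FILES):
    """Return the expected file names that are not found in local_dir.

    Assumption: the files live directly under local_dir, not in a subfolder.
    """
    root = Path(local_dir)
    return [name for name in expected if not (root / name).is_file()]


missing = missing_lora_files("./Qwen-Image-Lightning")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All expected LoRA files are present.")
```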

Run 8-step Model

    # 8 steps, cfg 1.0
    python generate_with_diffusers.py \
        --prompt_list_file examples/prompt_list.txt \
        --out_dir test_lora_8_step_results \
        --lora_path Qwen-Image-Lightning/Qwen-Image-Lightning-8steps-V1.0.safetensors \
        --base_seed 42 --steps 8 --cfg 1.0

Run 4-step Model

    # 4 steps, cfg 1.0
    python generate_with_diffusers.py \
        --prompt_list_file examples/prompt_list.txt \
        --out_dir test_lora_4_step_results \
        --lora_path Qwen-Image-Lightning/Qwen-Image-Lightning-4steps-V1.0.safetensors \
        --base_seed 42 --steps 4 --cfg 1.0

Run Base Model

    # 50 steps, cfg 4.0
    python generate_with_diffusers.py \
        --prompt_list_file examples/prompt_list.txt \
        --out_dir test_base_results \
        --base_seed 42 --steps 50 --cfg 4.0
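The three runs differ only in output directory, LoRA path, step count, and cfg, so they can be assembled from one place when scripting the comparison. This is a hedged sketch: `build_command` and `RUNS` are hypothetical names, and only the flags that appear in the commands above are used.

```python
import shlex


def build_command(out_dir, steps, cfg, lora_path=None,
                  prompt_list_file="examples/prompt_list.txt", base_seed=42):
    """Build the argument list for generate_with_diffusers.py.

    --lora_path is omitted for the base model, as in the commands above.
    """
    cmd = [
        "python", "generate_with_diffusers.py",
        "--prompt_list_file", prompt_list_file,
        "--out_dir", out_dir,
    ]
    if lora_path is not None:
        cmd += ["--lora_path", lora_path]
    cmd += ["--base_seed", str(base_seed), "--steps", str(steps), "--cfg", str(cfg)]
    return cmd


LORA_DIR = "Qwen-Image-Lightning"
RUNS = {
    "8-step": build_command(
        "test_lora_8_step_results", steps=8, cfg=1.0,
        lora_path=f"{LORA_DIR}/Qwen-Image-Lightning-8steps-V1.0.safetensors"),
    "4-step": build_command(
        "test_lora_4_step_results", steps=4, cfg=1.0,
        lora_path=f"{LORA_DIR}/Qwen-Image-Lightning-4steps-V1.0.safetensors"),
    "base": build_command("test_base_results", steps=50, cfg=4.0),
}

for name, cmd in RUNS.items():
    # Print a shell-ready command line for inspection.
    print(f"[{name}] {shlex.join(cmd)}")
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root to execute the run. Keeping the seed and prompt file fixed across runs makes the outputs directly comparable.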

🎨 ComfyUI Workflow

The ComfyUI workflows are available in the workflows/ directory. They are based on the Qwen-Image ComfyUI tutorial and have been verified against the ComfyUI repository at commit ID 37d620a6b85f61b824363ed8170db373726ca45a.

Workflow Files

  • workflows/qwen-image-8steps.json - 8-step lightning workflow for Qwen-Image
  • workflows/qwen-image-4steps.json - 4-step lightning workflow for Qwen-Image

Usage

  1. Install ComfyUI following the official instructions
  2. Download the Qwen-Image base model and place it following the Qwen-Image ComfyUI tutorial (put the UNet, CLIP, and VAE files into the appropriate ComfyUI folders)
  3. For 8-step workflow:
    • Load workflows/qwen-image-8steps.json
    • Put Qwen-Image-Lightning-8steps-V1.0.safetensors into ComfyUI/models/loras/
    • Ensure KSampler steps = 8
  4. For 4-step workflow:
    • Load workflows/qwen-image-4steps.json
    • Put Qwen-Image-Lightning-4steps-V1.0.safetensors into ComfyUI/models/loras/
    • Ensure KSampler steps = 4
  5. Run the workflow to generate images
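The "put the LoRA file into ComfyUI/models/loras/" step above can be scripted. The sketch below is only an illustration: `install_lora` is a hypothetical helper, and the ComfyUI root path is an assumption that depends on where ComfyUI is installed.

```python
import shutil
from pathlib import Path


def install_lora(lora_file, comfyui_root):
    """Copy a LoRA .safetensors file into <comfyui_root>/models/loras/.

    Creates the loras directory if it does not exist and returns the
    path of the copied file.
    """
    loras_dir = Path(comfyui_root) / "models" / "loras"
    loras_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(lora_file, loras_dir))


# Example (paths are assumptions; adjust to your setup):
# install_lora(
#     "Qwen-Image-Lightning/Qwen-Image-Lightning-8steps-V1.0.safetensors",
#     "ComfyUI",
# )
```

ComfyUI may need a restart or a refresh of its model list before the new file appears in the LoRA loader node.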

License Agreement

The models in this repository are licensed under the Apache 2.0 License. We claim no rights over your generated contents, granting you the freedom to use them while ensuring that your usage complies with the provisions of this license. You are fully accountable for your use of the models, which must not involve sharing any content that violates applicable laws, causes harm to individuals or groups, disseminates personal information intended for harm, spreads misinformation, or targets vulnerable populations. For a complete list of restrictions and details regarding your rights, please refer to the full text of the license.

Acknowledgements

We built upon and reused code from the following projects: Qwen-Image, licensed under the Apache License 2.0.

The evaluation text prompts are from Qwen-Image, Qwen-Image Blog and Qwen-Image-Service.


About

https://github.com/ModelTC/Qwen-Image-Lightning/
