Z-Image-Turbo-Fun-Controlnet-Union

GitHub: https://github.com/aigc-apps/VideoX-Fun

Model Features

  • This ControlNet is applied to 6 blocks of the base model.
  • The model was trained from scratch for 10,000 steps on a dataset of 1 million high-quality images covering both general and human-centric content. Training was performed at 1328 resolution using BFloat16 precision, with a batch size of 64, a learning rate of 2e-5, and a text dropout ratio of 0.10.
  • It supports multiple control conditions, including Canny, HED, Depth, Pose, and MLSD, and can be used like a standard ControlNet; a sketch of preparing a Canny control image is shown after this list.
  • You can increase control_context_scale for stronger control and better detail preservation; the optimal range is 0.65 to 0.80. For better stability, we highly recommend using a detailed prompt.
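
A minimal sketch of preparing a Canny control image, assuming OpenCV, NumPy, and Pillow are installed. This is illustrative only and not code from this repository; the file paths and Canny thresholds are placeholders.

# Prepare a Canny edge map to use as the control condition (illustrative sketch).
import cv2
import numpy as np
from PIL import Image

image = cv2.imread("input.jpg")                      # placeholder input path
image = cv2.resize(image, (1328, 1328))              # the model was trained at 1328 resolution
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)                    # thresholds are placeholders, tune per image
control = np.stack([edges] * 3, axis=-1)             # replicate to 3 channels
Image.fromarray(control).save("canny_control.png")   # pass this image as the control condition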

TODO

  • Train on more data and for more steps.
  • Support inpaint mode.

Results

Control input / output examples for Pose, Canny, HED, and Depth conditions (images shown on the model page).

Inference

Go to the VideoX-Fun repository for more details.

Please clone the VideoX-Fun repository and create the required directories:

# Clone the code
git clone https://github.com/aigc-apps/VideoX-Fun.git

# Enter VideoX-Fun's directory
cd VideoX-Fun

# Create model directories
mkdir -p models/Diffusion_Transformer
mkdir -p models/Personalized_Model

Then download the base Z-Image-Turbo weights into models/Diffusion_Transformer and the Z-Image-Turbo-Fun-Controlnet-Union weights into models/Personalized_Model, so that the layout looks like this:

📦 models/
├── 📂 Diffusion_Transformer/
│   └── 📂 Z-Image-Turbo/
└── 📂 Personalized_Model/
    └── 📦 Z-Image-Turbo-Fun-Controlnet-Union.safetensors
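
One possible way to fetch the weights is with huggingface_hub; this is a sketch, and the repository IDs below are assumptions that should be replaced with the IDs shown on the actual model pages.

# Sketch: download the weights with huggingface_hub (pip install huggingface_hub).
# Both repo IDs are assumptions; replace them with the IDs from the model pages.
from huggingface_hub import snapshot_download, hf_hub_download

# Base Z-Image-Turbo weights
snapshot_download(
    repo_id="Tongyi-MAI/Z-Image-Turbo",                          # assumed repo ID
    local_dir="models/Diffusion_Transformer/Z-Image-Turbo",
)

# ControlNet weights
hf_hub_download(
    repo_id="alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union",    # assumed repo ID
    filename="Z-Image-Turbo-Fun-Controlnet-Union.safetensors",
    local_dir="models/Personalized_Model",
)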

Then run the file examples/z_image_fun/predict_t2i_control.py.
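
From the repository root, the script can be launched with:

python examples/z_image_fun/predict_t2i_control.py

Settings such as the prompt, the control image, the control type, and control_context_scale are presumably configured inside the script or via its arguments; check the script itself for the exact options.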