Learning Goals
a. What is ControlNet?
b. How do you connect and use ControlNet?
c. How do you adjust the coefficients in ControlNet?
d. How do you use multiple ControlNets?
e. What is a preprocessor?
f. How do you choose an appropriate preprocessor?
g. Preprocessor-ControlNet model compatibility chart
Workflow
Index | Workflow Name | Actionable links | Summary |
4.2.1 | ControlNet(SDXL) | | Using 1 ControlNet in SDXL |
4.2.2 | ControlNet(Multi, SDXL) | | Using 2 ControlNets in SDXL |
4.2.3 | ControlNet(SD1.5) | | Using 1 ControlNet in SD1.5 |
4.2.4 | ControlNet(Multi, SD1.5) | | Using 2 ControlNets in SD1.5 |
4.2.5 | ComparePreprocessor | | Compares the major preprocessors. |
4.2.6 | ComparePreprocessor(All) | | Compares a larger number of preprocessors. |
If you need more templates
a. What is ControlNet?
Before ControlNet, creators struggled to use generative AI in practical work because the results were largely random: generation is driven mainly by the prompt, so details such as pose, background, and composition cannot be steered toward a specific intent. A ControlNet model tames that randomness, making it possible to generate images that are far more closely aligned with the desired outcome.
An example of applying Depth to a landscape image to achieve effects like sunset, snow, and night
Referring back to the diagram from 4-1, ControlNet is connected into the conditioning stage.
ControlNet has various preprocessors, with Depth, Canny, Line, and Pose being the most commonly used. The previous two sentences might seem difficult, but don't be intimidated by the complex terms; get comfortable with them gradually.
b. How to Connect and Use ControlNet
The basic components of ControlNet are as follows:
1. Apply ControlNet (Advanced) node: this is the core component.
2. Load the image: load the image that you want to use as a condition.
3. Preprocess the image: use a preprocessor to convert the loaded image into a form suitable for conditioning, such as a depth map or line art.
4. Load the ControlNet model: load a ControlNet model that matches the preprocessed image. (Preprocessor-ControlNet model compatibility is listed at the bottom of this document.)
Now, you can connect these modules (node bundle) to use ControlNet.
The connection is as follows: originally the prompt conditioning was fed directly into KSampler, but now the Apply ControlNet node is inserted in between. Connect Positive to Positive and Negative to Negative. That's it. (CLIP, conditioning, the positive prompt, the negative prompt, and the positive output all carry the same kind of data, which is why the ControlNet conditioning connections are shown in the same yellow color.)
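If it helps to see the same wiring in text form, below is a minimal sketch of the graph in ComfyUI's API (JSON prompt) format, written as a Python dictionary. The node class names (CheckpointLoaderSimple, CLIPTextEncode, ControlNetLoader, ControlNetApplyAdvanced, KSampler, and so on) are ComfyUI built-ins, but the node IDs, file names, and prompt text are placeholders, so treat this as an illustration of the connections rather than a ready-to-run workflow.

```python
# Minimal sketch of a single-ControlNet graph in ComfyUI's API (JSON) prompt format.
# Node IDs, file names, and prompt text are placeholders. In the real workflow a
# preprocessor node sits between Load Image and Apply ControlNet; here, assume
# "cup_depth.png" is an already-preprocessed depth map.
prompt = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",   # positive prompt
          "inputs": {"clip": ["1", 1], "text": "one green mug, garden background"}},
    "3": {"class_type": "CLIPTextEncode",   # negative prompt
          "inputs": {"clip": ["1", 1], "text": "blurry, low quality"}},
    "4": {"class_type": "LoadImage",        # the condition image
          "inputs": {"image": "cup_depth.png"}},
    "5": {"class_type": "ControlNetLoader",
          "inputs": {"control_net_name": "control-lora-depth-rank256.safetensors"}},
    # Apply ControlNet (Advanced) sits between the prompts and KSampler:
    "6": {"class_type": "ControlNetApplyAdvanced",
          "inputs": {"positive": ["2", 0], "negative": ["3", 0],
                     "control_net": ["5", 0], "image": ["4", 0],
                     "strength": 0.6, "start_percent": 0.0, "end_percent": 1.0}},
    "7": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "8": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0],
                     "positive": ["6", 0],   # positive out of Apply ControlNet
                     "negative": ["6", 1],   # negative out of Apply ControlNet
                     "latent_image": ["7", 0],
                     "seed": 0, "steps": 20, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
    "9": {"class_type": "VAEDecode",
          "inputs": {"samples": ["8", 0], "vae": ["1", 2]}},
}
```

The only difference from a plain text-to-image graph is node 6: the positive and negative conditioning now pass through Apply ControlNet before reaching KSampler.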
An example of inserting a cup photo, inputting the prompt as "one green mug, garden background," and applying Depth.
c. How to Adjust the ControlNet Strength Value?
Set the strength between 0.4 and 0.6 as a starting point. You can set it to 1 as well, and setting it to 0 disables ControlNet entirely. However, forcing the result too strongly ("make it exactly as I specified!") tends to reduce image quality, so a lower value often gives better results. Try around 0.6 and adjust as needed.
(You don't need start percent and end percent yet, but briefly: setting start_percent to 0.4 and end_percent to 0.8, with KSampler at 20 steps, means the ControlNet effect is applied from 40% (step 8) to 80% (step 16) of the generation process. In other words, ControlNet is only applied between steps 8 and 16.)
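As a quick sanity check on that arithmetic, the active window is just the two percentages multiplied by the sampler's step count. This is a rough illustration only; ComfyUI schedules this internally, so the boundary steps may differ by one.

```python
def controlnet_step_window(steps: int, start_percent: float, end_percent: float):
    """Approximate sampler-step window in which ControlNet is applied."""
    return round(steps * start_percent), round(steps * end_percent)

# The example above: 20 steps, start_percent 0.4, end_percent 0.8.
print(controlnet_step_window(20, 0.4, 0.8))  # (8, 16) -> applied between steps 8 and 16
```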
d. How to use Multi-ControlNet?
It would be great if ControlNet were perfect, but like checkpoints, ControlNet is not always flawless. Sometimes, using just one may yield unsatisfactory results. In such cases, multiple ControlNets can be used to mitigate these shortcomings. The connection method is the same. It’s not difficult, so let’s take a look.
Connection Method: Instead of inserting just one ControlNet between the prompt and KSampler, insert two Apply ControlNet nodes and chain their positive and negative conditioning in series (see the sketch after the table below).
Coefficient Optimization: Start with (0.4, 0.4). Combined, this makes 0.8. As previously explained, it's better to start with lower values rather than higher ones like (1, 1).
ControlNet 1 | ControlNet 2 | Sum |
0.5 | 0.5 | 1 |
0.4 | 0.4 | 0.8 |
0.3 | 0.5 | 0.8 |
0.4 | 0.2 | 0.6 |
0.3 | 0.3 | 0.6 |
1 | 1 | 2 |
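In the API-format sketch from section b, chaining simply means the second Apply ControlNet node takes its positive/negative inputs from the first one's outputs, and KSampler then reads from the second. A minimal continuation of that sketch follows; node IDs and file names are again placeholders.

```python
# Chaining two Apply ControlNet (Advanced) nodes in series (sketch only; node IDs
# continue from the single-ControlNet example, file names are placeholders).
prompt["10"] = {"class_type": "ControlNetLoader",
                "inputs": {"control_net_name": "control-lora-canny-rank256.safetensors"}}
prompt["11"] = {"class_type": "LoadImage",          # second condition image
                "inputs": {"image": "cup_canny.png"}}
prompt["12"] = {"class_type": "ControlNetApplyAdvanced",
                "inputs": {"positive": ["6", 0],    # chained from ControlNet #1
                           "negative": ["6", 1],
                           "control_net": ["10", 0], "image": ["11", 0],
                           "strength": 0.4, "start_percent": 0.0, "end_percent": 1.0}}
# KSampler now reads its conditioning from ControlNet #2 instead of #1.
prompt["8"]["inputs"]["positive"] = ["12", 0]
prompt["8"]["inputs"]["negative"] = ["12", 1]
# Drop the first ControlNet's strength to 0.4 to match the (0.4, 0.4) starting point.
prompt["6"]["inputs"]["strength"] = 0.4
```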
(Example of Using Color Grading) For instance, using only depth might result in the loss of details like the window grilles in the background. To mitigate this drawback, you can use it in conjunction with Canny.
e. What is a preprocessor?
The term "preprocessor" refers to "preprocessing" in Korean and involves extracting only the "features" from the loaded image.
There are various types, but it's helpful to categorize them into roughly five groups: Canny, Line, Depth, Pose, Etc (Others).
Try running this workflow yourself to get a feel for it. Although this workflow is not used frequently, it's worth exploring at least once.
(Note that even within the same Line group there are exceptions, such as the SoftEdge series (e.g., PiDi) and different line types like MistoLine (AnyLine). These are nuances that come with experience.)
[1] Canny → Captures contour features as lines.
[2] Line → Captures lines with more detail.
[3] Depth → Interprets the image in 3D space. (Normalmap is similar to depth but is less commonly used.)
[4] Pose → Captures human poses. (It cannot handle animals or stylized characters, such as two-heads-tall anime proportions.)
[5] Others → MLSD is used for structures with many straight lines, such as architecture. Logos and QR codes use separately trained models.
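Most preprocessor nodes come from custom node packs (the ComparePreprocessor workflows above rely on them), but ComfyUI does ship a built-in Canny node. As a small illustration of where a preprocessor slots in, here is a continuation of the earlier API-format sketch; the threshold values are just illustrative defaults, and other preprocessors have their own class names depending on the node pack you install.

```python
# Insert a preprocessor between Load Image (node "4") and Apply ControlNet (node "6").
# "Canny" is ComfyUI's built-in edge preprocessor; depth, line, and pose preprocessors
# come from custom node packs and use their own class names.
prompt["13"] = {"class_type": "Canny",
                "inputs": {"image": ["4", 0],
                           "low_threshold": 0.4, "high_threshold": 0.8}}
# Apply ControlNet now receives the extracted edge map instead of the raw photo.
prompt["6"]["inputs"]["image"] = ["13", 0]
# Remember to load a matching model, e.g. the SDXL canny ControlNet from the table below.
prompt["5"]["inputs"]["control_net_name"] = "control-lora-canny-rank256.safetensors"
```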
f. How do you choose an appropriate preprocessor?
In practical scenarios, you often face the question of which preprocessor to apply.
In such cases, run this workflow to check which preprocessor best extracts and enhances the features of the image you want to use.
(It is also recommended to collect the preprocessors you often use and execute the workflow when needed. This depends on personal preference and experience. It would be great if a unified model integrating everything comes out soon.)
Then choose the ControlNet model that matches both your base model (SDXL or SD1.5) and the preprocessor. For example, SDXL + Canny → control-lora-canny-rank256.safetensors
g. Preprocessor-ControlNet Model Compatibility Table
SDXL
Preprocessor | ControlNet Model |
canny | control-lora-canny-rank256.safetensors |
sketch series (line, scribble, hed, softedge) | control-lora-sketch-rank256.safetensors |
line | mistoLine_fp16.safetensors |
scribble, hed, softedge | controlnet-sd-xl-1.0-softedge-dexined.safetensors |
depth | control-lora-depth-rank256.safetensors |
depth | depth-zoe-xl-v1.0-controlnet.safetensors |
pose | OpenPoseXL2.safetensors |
Explanation | diffusion_pytorch_model.safetensors |
SD1.5
Preprocessor | ControlNet Model |
canny | [1.5] control_v11p_sd15_canny_fp16.safetensors |
line | [1.5] control_v11p_sd15_lineart_fp16.safetensors |
line | [1.5] control_v11p_sd15s2_lineart_anime_fp16.safetensors |
scribble | [1.5] control_v11p_sd15_scribble_fp16.safetensors |
softedge, hed | [1.5] control_v11p_sd15_softedge_fp16.safetensors |
depthmap | [1.5] control_v11f1p_sd15_depth_fp16.safetensors |
normalmap | [1.5] control_v11p_sd15_normalbae_fp16.safetensors |
pose | [1.5] control_v11p_sd15_openpose_fp16.safetensors |
qrcode | [1.5] qrCodeMonster_v20.safetensors |
mlsd | [1.5] control_v11p_sd15_mlsd_fp16.safetensors |