CFG Scale

From Stable Diffusion Wiki
Revision as of 19:07, 27 August 2023 by StableTiger3 (talk | contribs)

What is the role of CFG Scale in Stable Diffusion?

In Stable Diffusion, the CFG Scale trades fidelity against quality: the two are inversely related. High fidelity means that the generated image closely matches the details and intent described in the text prompt. This includes not just the main subject but also the background, colors, positions, and any other descriptive elements mentioned. Increasing the CFG Scale value produces an output image that aligns more closely with the input prompt or image, but at the cost of diminished image quality. The denoising strength influences how much creativity the AI may inject into the output, while the CFG Scale governs how faithfully the result mirrors the input prompt. By tuning both settings together, you can reach the mix of creativity and precision that suits your stylization.
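
The page does not spell out how the scale is applied, so the following is only a minimal sketch of the usual classifier-free guidance combination, under stated assumptions. The function name `apply_cfg` and the three-element toy vectors are hypothetical stand-ins for the model's unconditional and prompt-conditioned noise predictions; a real pipeline does this on large tensors at every denoising step.

```python
def apply_cfg(uncond, cond, cfg_scale):
    """Combine two noise predictions with classifier-free guidance.

    uncond: prediction made without the prompt (toy list of floats)
    cond:   prediction made with the prompt (toy list of floats)
    The result starts at `uncond` and moves toward, or past, `cond`
    in proportion to `cfg_scale`.
    """
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy values, chosen only for illustration (not real model output).
uncond = [0.0, 1.0, -1.0]
cond = [1.0, 1.0, 0.0]

for scale in (1.0, 7.5, 20.0):
    print(scale, apply_cfg(uncond, cond, scale))
```

In this sketch a scale of 1 returns the prompt-conditioned prediction unchanged, while larger values push the result further in the direction of the prompt than the model itself predicted. That extrapolation is one plausible reading of why very high values follow the prompt more closely while image quality suffers.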

As a starting point, use a CFG Scale value in the range of 7 to 9. Raise the value if the generated image deviates from the prompt. Avoid the extreme values of 1 and 20; between them, the right setting depends on both the specific results you're aiming for and the complexity of the text prompt.


Here's a rudimentary chart to show the inverse relationship between fidelity and quality.

[Chart: Inverse Relationship Between Quality and Fidelity]