Canny: Difference between revisions

From Stable Diffusion Wiki
'''Stable Diffusion''' can already generate images from text descriptions. '''ControlNet''' is a newer tool that extends it by letting an additional input, such as an edge map, guide the result.
Canny Edge is a control type of the ControlNet extension. A Canny edge detector was used to obtain 3M edge-image-caption pairs from the internet. The model was trained for 600 GPU-hours on Nvidia A100 80G GPUs.
 
[[File:CannyEdgeDetection.png|center|thumb|470x470px|Figure 1: Controlling Stable Diffusion with a Canny edge map. The Canny edge map is the input, and the source image is not used when the images on the right are generated. The outputs are produced with the default prompt “a high-quality, detailed, and professional image”. This prompt is used in the paper as a default prompt that does not mention anything about the image contents or object names. Most of the figures in the paper are high-resolution images and are best viewed when zoomed in. arXiv:2302.05543v1 [cs.CV] 10 Feb 2023]]
Think of a neural network, like '''Stable Diffusion''', that turns text into images. '''ControlNet''' is a system for fine-tuning such a network so that it adapts to different tasks. It can learn from small amounts of data, trains quickly, and does not require large computers to run. It can be tailored to a variety of image-related tasks.
 
'''ControlNet''' is a neural network architecture designed to control pre-trained large diffusion models, enabling them to support additional input conditions and tasks. This end-to-end learning approach ensures robustness, even with small training datasets. Training a '''ControlNet''' is comparable in speed to fine-tuning a diffusion model, and it can be done on personal devices or scaled up if powerful computation clusters are available. This flexibility makes '''ControlNet''' an effective tool for augmenting large diffusion models like '''Stable Diffusion''', allowing for conditional inputs and facilitating diverse applications.

Latest revision as of 11:24, 22 August 2023

Canny Edge is a control type of the ControlNet extension. A Canny edge detector was used to obtain 3M edge-image-caption pairs from the internet. The model was trained for 600 GPU-hours on Nvidia A100 80G GPUs.

Figure 1: Controlling Stable Diffusion with a Canny edge map. The Canny edge map is the input, and the source image is not used when the images on the right are generated. The outputs are produced with the default prompt “a high-quality, detailed, and professional image”. This prompt is used in the paper as a default prompt that does not mention anything about the image contents or object names. Most of the figures in the paper are high-resolution images and are best viewed when zoomed in. arXiv:2302.05543v1 [cs.CV] 10 Feb 2023
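
To make the idea of an edge map more concrete, the following is a minimal pure-Python sketch of the classic Canny steps (Sobel gradients, non-maximum suppression, double-threshold hysteresis). It is an illustration only and not the code used by ControlNet or its preprocessor: the function names, the grayscale list-of-lists image format, and the threshold values are assumptions made for this example, and the usual Gaussian smoothing step is omitted for brevity.

```python
import math


def sobel_gradients(img):
    """Return (magnitude, angle in degrees within [0, 180)) for a 2D grayscale grid."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):          # border pixels are left at 0
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180
    return mag, ang


def non_max_suppress(mag, ang):
    """Keep a pixel only if it is a local maximum along its gradient direction."""
    h, w = len(mag), len(mag[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            a = ang[y][x]
            if a < 22.5 or a >= 157.5:
                dy, dx = 0, 1          # gradient roughly horizontal
            elif a < 67.5:
                dy, dx = 1, 1          # gradient along the main diagonal
            elif a < 112.5:
                dy, dx = 1, 0          # gradient roughly vertical
            else:
                dy, dx = 1, -1         # gradient along the anti-diagonal
            m = mag[y][x]
            # Strict comparison on one side breaks ties, so a plateau of
            # equal magnitudes yields a single-pixel-wide edge.
            if m > mag[y - dy][x - dx] and m >= mag[y + dy][x + dx]:
                out[y][x] = m
    return out


def hysteresis(nms, low, high):
    """Mark strong pixels (>= high), then grow into connected weak pixels (>= low)."""
    h, w = len(nms), len(nms[0])
    edges = [[0] * w for _ in range(h)]
    stack = [(y, x) for y in range(h) for x in range(w) if nms[y][x] >= high]
    for y, x in stack:
        edges[y][x] = 255
    while stack:
        y, x = stack.pop()
        for ny in range(max(0, y - 1), min(h, y + 2)):
            for nx in range(max(0, x - 1), min(w, x + 2)):
                if edges[ny][nx] == 0 and nms[ny][nx] >= low:
                    edges[ny][nx] = 255
                    stack.append((ny, nx))
    return edges


def canny(img, low, high):
    """Simplified Canny edge map; expects 0 < low <= high (no smoothing step)."""
    mag, ang = sobel_gradients(img)
    return hysteresis(non_max_suppress(mag, ang), low, high)


# Hypothetical 8x8 test image: black left half, white right half.
image = [[0] * 4 + [255] * 4 for _ in range(8)]
edges = canny(image, low=100, high=200)
```

For this test image the result is a one-pixel-wide vertical line at column index 3, covering rows 1 to 6 (the border rows are skipped because the Sobel window is not evaluated there). In practice an optimized library implementation would normally be used instead of a pure-Python loop; the sketch is only meant to show what kind of image serves as the control input.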